Following a good question in part 1 v2, @jeremy mentioned the use of skip connections in RNNs rather than CNNs. I did some quick poking around and found:
Which presents a nice study of several different ways of doing it, and:
Which provides another interesting alternative, with source code in TensorFlow. Hopefully I can find some time to dig into this more deeply, as it looks very powerful and a PyTorch port might be a worthwhile effort.
Anyway, I found it interesting and thought I’d share.