I just wanted to drop an idea here for a network architecture that could be included in FastAI. I hope this isn’t redundant (i.e., that it hasn’t already been implemented).
Recently, there have been some implementations of CNNs that (at least in the published research) seem to outperform RNN models on various tasks. These so-called Temporal Convolutional Networks (TCNs) have several advantages over RNNs: faster training, a longer effective memory, better parallelism, etc. Perhaps the most famous architecture of this kind is WaveNet (https://arxiv.org/pdf/1609.03499.pdf).
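To make the core idea concrete, here is a minimal sketch (in PyTorch, with names of my own choosing) of the building block these architectures share: a causal, dilated 1-D convolution. "Causal" means the output at time t depends only on inputs at times <= t, which is achieved by padding only on the left with (kernel_size - 1) * dilation zeros.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CausalConv1d(nn.Module):
    """Dilated 1-D convolution that never looks into the future.

    Illustrative sketch only; the class name is my own, not a library API.
    """
    def __init__(self, in_ch, out_ch, kernel_size, dilation=1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding only
        self.conv = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)

    def forward(self, x):                        # x: (batch, channels, time)
        x = F.pad(x, (self.pad, 0))              # pad the past, not the future
        return self.conv(x)

# Causality check: perturbing a future input must not change earlier outputs.
conv = CausalConv1d(1, 1, kernel_size=3, dilation=2)
nn.init.ones_(conv.conv.weight)                  # deterministic weights for the demo
x = torch.randn(1, 1, 10)
y1 = conv(x)
x2 = x.clone()
x2[0, 0, -1] += 100.0                            # perturb only the last time step
y2 = conv(x2)
```

Stacking such layers with exponentially increasing dilations is what gives TCNs their long memory at modest depth.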
I would say that, at least for someone at my level, it is not the easiest architecture to decipher, and certainly not the easiest to implement in code, which (quite apart from the performance) makes it a good candidate for FastAI.
There are a few implementations out there; the most popular PyTorch one seems to be this: https://github.com/locuslab/TCN/. There is also a higher-level API for Keras, which I think is quite amazing: it offers simple calls that generate different types of TCNs from a handful of input parameters (https://github.com/philipperemy/keras-tcn). It might be a good example to base a FastAI implementation on.
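For reference, a hedged sketch of the structure used in the locuslab repo: residual blocks of two causal dilated convolutions, with the dilation doubling at each level (1, 2, 4, ...). The class and variable names below are my own, not the repo's actual API.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemporalBlock(nn.Module):
    """One TCN residual block: two causal dilated convs plus a skip path."""
    def __init__(self, in_ch, out_ch, kernel_size, dilation):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation  # left padding keeps it causal
        self.conv1 = nn.Conv1d(in_ch, out_ch, kernel_size, dilation=dilation)
        self.conv2 = nn.Conv1d(out_ch, out_ch, kernel_size, dilation=dilation)
        # 1x1 conv matches channel counts on the residual connection
        self.downsample = nn.Conv1d(in_ch, out_ch, 1) if in_ch != out_ch else None

    def forward(self, x):
        out = F.relu(self.conv1(F.pad(x, (self.pad, 0))))
        out = F.relu(self.conv2(F.pad(out, (self.pad, 0))))
        res = x if self.downsample is None else self.downsample(x)
        return F.relu(out + res)

class TCN(nn.Module):
    """Stack of TemporalBlocks with dilation doubling at each level."""
    def __init__(self, in_ch, channels, kernel_size=3):
        super().__init__()
        layers = []
        for i, out_ch in enumerate(channels):
            layers.append(TemporalBlock(in_ch, out_ch, kernel_size, dilation=2 ** i))
            in_ch = out_ch
        self.network = nn.Sequential(*layers)

    def forward(self, x):                        # x: (batch, channels, time)
        return self.network(x)

tcn = TCN(in_ch=1, channels=[8, 8, 8])
y = tcn(torch.randn(2, 1, 50))                   # sequence length is preserved
```

A FastAI version could expose something similar to keras-tcn: one constructor taking the channel list, kernel size, and dilation schedule.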
I started working on a TCN with fastai (v0.7) a few months back (github repo). The goal was to train a TCN language model and fine-tune it for text classification, similar to Jeremy’s ULMFiT approach, which uses LSTMs.
However, I couldn’t get the perplexity low enough. I would like to revisit this project with fastai v1.
Did you make any progress on this project? I played with TCNs last year for stock prediction (without much success). Now I would like to experiment with audio classification using fastai2 DataLoaders and Learner.
The first step would be to implement the pmnist_test in a demo notebook using fastai2.
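As I understand it, the permuted-MNIST task flattens each 28x28 image into a 784-step sequence of single pixels, shuffles it with one fixed permutation, and classifies from the last time step. A rough, self-contained sketch of that setup (random tensors stand in for MNIST here, and all names are my own; with fastai2 you would wrap real data in DataLoaders and hand the model to a Learner):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PixelTCN(nn.Module):
    """Tiny causal conv stack that classifies a 784-step pixel sequence."""
    def __init__(self, n_classes=10, hidden=16, kernel_size=7, n_levels=4):
        super().__init__()
        convs, in_ch = [], 1
        for i in range(n_levels):
            d = 2 ** i                           # dilation doubles per level
            convs.append(nn.Conv1d(in_ch, hidden, kernel_size, dilation=d))
            in_ch = hidden
        self.convs = nn.ModuleList(convs)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):                        # x: (batch, 1, 784)
        for conv in self.convs:
            pad = (conv.kernel_size[0] - 1) * conv.dilation[0]
            x = F.relu(conv(F.pad(x, (pad, 0))))  # causal left padding
        return self.head(x[:, :, -1])            # classify from the last step

perm = torch.randperm(784)                       # one fixed pixel permutation
imgs = torch.rand(4, 1, 784)[:, :, perm]         # stand-in for an MNIST batch
logits = PixelTCN()(imgs)
```

From there, plugging the model into a fastai2 Learner with cross-entropy loss and accuracy should give a working demo notebook.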