Really excited to share that we’re about to release the final version of our NLP classification paper. Here’s a post @sebastianruder and I wrote about it, along with a new mini-site for all the work we hope to do with NLP and transfer learning: http://nlp.fast.ai/
The new version of the paper should appear on arXiv in the next day or two. I’ll post here when it’s up. In the meantime, any feedback on the new post and site above would be much appreciated.
I read your paper carefully. However, I don’t quite understand Figure 1: the paper describes a 3-layer AWD-LSTM, but Figure 1 shows what looks like a fully connected 3-layer network. Why isn’t it drawn as a 3-layer AWD-LSTM?
I’m using the method from your paper and I am getting very promising results. I’ll let you know as soon as I’m done!
Thanks in advance!
@sebastianruder created that figure, so let’s ask him.
Mostly for clarity. I was afraid that showing the entire unrolled LSTM would take attention away from the important bits. Besides, the main thing missing is the arrows that indicate the temporal dependence; that’s not a huge difference IMO.
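To make the point about the missing arrows concrete, here is a minimal sketch of a 3-layer recurrent stack. It is only an illustration, not the paper's model: it assumes a plain scalar tanh cell instead of an AWD-LSTM, and the names `step`, `run_stack`, `w_in` and `w_rec` are hypothetical. With `w_rec=0.0` each time step is just a feedforward pass through three layers, which is roughly what the figure draws; a non-zero `w_rec` adds the temporal arrows.

```python
import math


def step(x, h_prev, w_in, w_rec):
    """One step of a simplified recurrent layer (tanh cell, NOT a real LSTM)."""
    return math.tanh(w_in * x + w_rec * h_prev)


def run_stack(inputs, w_rec, n_layers=3, w_in=0.5):
    """Run n_layers stacked recurrent layers over a sequence of scalars.

    Returns the top-layer output at every time step.
    """
    hidden = [0.0] * n_layers  # one hidden state per layer, carried across time
    outputs = []
    for x in inputs:
        layer_input = x
        for layer in range(n_layers):
            # "Vertical" arrow: layer_input comes from the layer below.
            # "Temporal" arrow: hidden[layer] comes from the previous time step.
            hidden[layer] = step(layer_input, hidden[layer], w_in, w_rec)
            layer_input = hidden[layer]
        outputs.append(layer_input)
    return outputs


sequence = [1.0, 0.0, 1.0]
no_time = run_stack(sequence, w_rec=0.0)    # no temporal arrows
with_time = run_stack(sequence, w_rec=0.8)  # temporal arrows included

print(no_time)
print(with_time)
```

Without the recurrent weight, the first and third steps produce the same output because they see the same input. With it, the third output also reflects what came before, which is the dependence the omitted arrows would show.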