I’ve discovered that using get_language_model fixes my shape problem.
# Does not work
model = AWD_LSTM(len(dls.vocab), emb_sz=400, n_hid=1152, n_layers=2)
# Works!
model = get_language_model(AWD_LSTM, len(dls.vocab))
Looking at the source for get_language_model, it looks like it is doing some things under the hood that I wasn’t aware of.
Primarily, there is a ‘decoder’ step that I didn’t know about, which seems to explain the shape issue. I’ll read up on this after lunch at https://machinelearningmastery.com/encoder-decoder-recurrent-neural-network-models-neural-machine-translation/
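To make the shape difference concrete, here is a minimal plain-Python sketch. It is not fastai code: toy_encoder, toy_decoder, and the sizes are made up for illustration. It assumes the bare encoder emits one emb_sz-long vector per token, and that the decoder is a linear projection from emb_sz to the vocab size, which is the shape a loss over the vocab would expect.

```python
import random

def shape(x):
    """Return the nested-list shape of x as a tuple."""
    dims = []
    while isinstance(x, list):
        dims.append(len(x))
        x = x[0]
    return tuple(dims)

def toy_encoder(token_ids, emb_sz):
    """Hypothetical stand-in for the encoder: one emb_sz-long vector per token."""
    rng = random.Random(0)
    return [[[rng.uniform(-1, 1) for _ in range(emb_sz)] for _ in seq]
            for seq in token_ids]

def toy_decoder(hidden, weight):
    """Hypothetical stand-in for the decoder: project emb_sz -> vocab_sz."""
    return [[[sum(h * w for h, w in zip(vec, row)) for row in weight]
             for vec in seq]
            for seq in hidden]

vocab_sz, emb_sz = 10, 4        # toy sizes, assumed for illustration
batch = [[1, 2, 3], [4, 5, 6]]  # 2 sequences of 3 token ids each

# Decoder weight matrix with shape (vocab_sz, emb_sz)
rng = random.Random(1)
weight = [[rng.uniform(-1, 1) for _ in range(emb_sz)] for _ in range(vocab_sz)]

hidden = toy_encoder(batch, emb_sz)   # encoder only: last dim is emb_sz
logits = toy_decoder(hidden, weight)  # with decoder: last dim is vocab_sz

print(shape(hidden))  # (2, 3, 4)
print(shape(logits))  # (2, 3, 10)
```

If that assumption holds, the bare AWD_LSTM output ends in emb_sz while the get_language_model output ends in len(dls.vocab), which would account for the mismatch.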