Expected input batch_size (40) to match target batch_size (3600)

mulholio · March 5, 2021, 12:36pm

I’ve discovered that using get_language_model fixes my shape problem.

# Does not work: this builds only the AWD_LSTM encoder, with nothing on top
model = AWD_LSTM(len(dls.vocab), emb_sz=400, n_hid=1152, n_layers=2)

# Works!
model = get_language_model(AWD_LSTM, len(dls.vocab))

Looking at the source for get_language_model, it looks like it does some things under the hood that I wasn’t aware of.

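Primarily, it looks like there is a ‘decoder’ step that I didn’t know about, which sits on top of the AWD_LSTM and turns its output into scores over the vocabulary. Without it, the model’s output doesn’t have the shape the loss expects, which seems to explain the shape issue. I’ll read up on this after lunch over at https://machinelearningmastery.com/encoder-decoder-recurrent-neural-network-models-neural-machine-translation/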
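
To convince myself, here is a rough sketch of the encoder vs. encoder + decoder shapes in plain PyTorch. This is not fastai’s actual code: TinyEncoder is a made-up stand-in for AWD_LSTM, the single nn.Linear is a stand-in for whatever decoder get_language_model adds, and all the sizes are invented for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# All sizes are made up for illustration; they are not from my actual run.
vocab_sz, emb_sz, bs, seq_len = 100, 16, 4, 5

class TinyEncoder(nn.Module):
    """Hypothetical stand-in for an encoder like AWD_LSTM: token ids -> hidden states."""
    def __init__(self, vocab_sz, emb_sz):
        super().__init__()
        self.emb = nn.Embedding(vocab_sz, emb_sz)
        self.rnn = nn.LSTM(emb_sz, emb_sz, batch_first=True)

    def forward(self, x):
        out, _ = self.rnn(self.emb(x))
        return out  # one hidden vector per token: (bs, seq_len, emb_sz)

torch.manual_seed(0)
x = torch.randint(0, vocab_sz, (bs, seq_len))  # input token ids
y = torch.randint(0, vocab_sz, (bs, seq_len))  # next-token targets

encoder = TinyEncoder(vocab_sz, emb_sz)
decoder = nn.Linear(emb_sz, vocab_sz)  # assumed decoder: hidden state -> vocab scores

hidden = encoder(x)       # encoder alone: last dim is emb_sz, not vocab_sz
logits = decoder(hidden)  # encoder + decoder: last dim is vocab_sz

print(hidden.shape)  # torch.Size([4, 5, 16])
print(logits.shape)  # torch.Size([4, 5, 100])

# Flatten batch and sequence dims so each token is one row for the loss.
loss = F.cross_entropy(logits.reshape(-1, vocab_sz), y.reshape(-1))
print(loss.item())
```

With only the encoder, the last dimension is the embedding size rather than the vocab size, so the loss has no per-word scores to compare against the targets. I haven’t checked that this reproduces the exact 40 vs 3600 numbers from the error, only that the shapes can’t line up without the decoder.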