ULMFIT - Spanish

imaginary · December 23, 2018, 8:24pm

Results:

LSTM language model: 4 epochs, validation loss 3.140521, accuracy 0.376913, perplexity 23.1038
QRNN language model: 7 epochs, validation loss 3.193912, accuracy 0.367543, perplexity 24.2884
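
For anyone who wants to check the perplexity numbers: perplexity is just the exponential of the validation loss, assuming that loss is the mean per-token cross-entropy in nats (which is what PyTorch's cross-entropy gives you). Below is a minimal sketch; the `perplexity` helper is my own name, not a fastai function. Note that the figures posted above match exp(3.14) and exp(3.19), i.e. the losses rounded to two decimals, so the unrounded losses come out slightly higher.

```python
import math

def perplexity(val_loss):
    """Perplexity for a mean cross-entropy loss measured in nats."""
    return math.exp(val_loss)

# Validation losses as reported in the results above.
results = {"LSTM": 3.140521, "QRNN": 3.193912}

for name, loss in results.items():
    print(f"{name}: val loss {loss:.6f} -> perplexity {perplexity(loss):.2f}")
```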

Pre-trained models can be found here along with the itos file: https://drive.google.com/open?id=1CZftqrMg-MRH9yXV7FRBv6J_NOtBiK-2

I decided to train the LM on fastai v1 myself. I ended up using Google Cloud and taking advantage of their 300 USD credit, which let me set up a V100 instance and just train there. QRNNs took about 30 minutes per epoch; LSTMs took around 1 hour per epoch. I used a wiki dump and generated a 100M training set with a 30k vocab. All this to say there's definitely room for improvement, and anyone could go ahead and improve these results.
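
To make the "30k vocab" part concrete, here is a rough sketch of how a frequency-capped vocabulary (the itos list) can be built from a token stream. This is not the exact fastai v1 code I ran; `build_itos`, `numericalize`, and the two special tokens shown are illustrative assumptions, and the real tokenizer adds more special tokens than this.

```python
from collections import Counter

def build_itos(tokens, max_vocab=30000, min_freq=1, specials=("xxunk", "xxpad")):
    """Return an index-to-string list capped at max_vocab entries.

    Special tokens come first, then the most frequent tokens in
    descending order of count (hypothetical helper, not fastai's API).
    """
    counts = Counter(tokens)
    itos = list(specials)
    for tok, freq in counts.most_common():
        if len(itos) >= max_vocab:
            break
        if freq >= min_freq and tok not in specials:
            itos.append(tok)
    return itos

def numericalize(tokens, itos):
    """Map tokens to ids; anything outside the vocab becomes xxunk."""
    stoi = {tok: i for i, tok in enumerate(itos)}
    unk = stoi["xxunk"]
    return [stoi.get(tok, unk) for tok in tokens]

# Tiny example with a vocab cap of 4 (2 specials + 2 real tokens).
toks = "el gato y el perro".split()
itos = build_itos(toks, max_vocab=4)
ids = numericalize(["el", "perro"], itos)
```

A smaller cap means more tokens fall back to the unknown id, which is one of the knobs someone could tune when trying to improve on these results.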

Shoutout to @sgugger for guiding me along the way and fixing a bug just in time for me to train.

If someone could do some baseline testing with this LM, that’d be sweet.