Language Model Zoo 🦍


(Piotr Czapla) #350

That is usually achieved by fine-tuning an LM pretrained on Wikitext. Do you have a large enough corpus (100M+ tokens) to train an LM from scratch?

The old code won’t work with the new fastai; the library has changed a lot. If you want to start from scratch, try https://github.com/n-waves/ulmfit-multilingual
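
For the fine-tuning route, a minimal sketch with fastai v1’s text API might look like the following (the data path, file name, column name, and hyperparameters are placeholders, not from this thread):

```python
from fastai.text import TextLMDataBunch, language_model_learner, AWD_LSTM

# Build an LM data bunch from a hypothetical corpus.csv with a 'text' column.
data_lm = TextLMDataBunch.from_csv('data', 'corpus.csv', text_cols='text')

# pretrained=True loads the Wikitext-103 weights for the AWD-LSTM.
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3, pretrained=True)

learn.fit_one_cycle(1, 1e-2)   # train the new embedding/head layers first
learn.unfreeze()
learn.fit_one_cycle(3, 1e-3)   # then fine-tune the whole model
```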


(Janne) #351

What kind of total training times have people seen when training a full LM on Wikipedia data?

For example, how many days of training would 450,000 articles take on a 1080 Ti or a Tesla P100?


#352

Hey. My dataset is a mixture of French and English, and I have a classification problem. Can you give me some advice on using ULMFiT? Should I train a new LM on a mixed French and English Wikipedia corpus? Thanks