Multilingual ULMFiT

piotr.czapla · November 8, 2018, 12:14am

I’ve merged @sebastianruder’s changes with the master branch and created a new branch, ulmfit_multilingual, in my fastai fork.
While adapting the code to recent changes, I noticed that fastai.text has some type warnings and inconsistencies, so I think it is still a work in progress. That means we need a way to smoothly pull changes from master and to propose PRs with our fixes to the master branch of fastai. I see two options:

1. Start development of ULMFiT in a fork of the master branch and occasionally create pull requests with fixes and our changes to ULMFiT.
2. Extract the ULMFiT code to a separate repo.

Initially, I was for option 1, but after some thought, option 2 sounds like the better alternative, as we will have quite a few changes to ULMFiT:

new tests for the ULMFiT scripts, so we can quickly check that they still work after each refactoring of fastai
scripts to run ULMFiT on XNLI
language-specific scripts to experiment with and test on different classification datasets
notebooks to test BERT against ULMFiT on text classification, etc.

So I’m opting for option 2. What do you think, @sgugger, @sebastianruder, @jeremy?