Backwards LM for fastai v1

The new language model provided by `download_wt103` (i.e. this one) only has a forward direction. Will a backwards direction also be provided in the future?
If not, is it possible to adapt the old LM for use with v1?

It will be added in the future (right now we’re focusing on the beginning of the course). The previous one can’t be transferred directly, as we didn’t use the same vocabulary, number of tokens, or tokenization process.

I believe I managed to copy the weights from the previous version (i.e. the .h5 files) into the new expected .pth format.
The only thing that worries me is that the decoder bias seems to be missing from the .h5 files. Not that it matters much, since we’re going to be fine-tuning it anyway.
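Roughly, that conversion boils down to renaming the old state-dict keys to the new layout and zero-filling the missing decoder bias. A minimal sketch, with made-up parameter names (the real fastai layer names may differ, and in practice the values are torch tensors loaded via h5py/torch rather than plain lists):

```python
def convert_state_dict(old_state, key_map, decoder_weight_key, bias_key):
    """Rename keys from the old .h5 layout to the new .pth layout.

    `old_state` maps old parameter names to weight arrays (plain lists
    here for illustration; torch tensors in practice). A decoder bias
    missing from the old checkpoint is zero-initialized.
    """
    new_state = {key_map.get(k, k): v for k, v in old_state.items()}
    if bias_key not in new_state:
        # The .h5 files lack the decoder bias, so start it at zero;
        # fine-tuning will learn a sensible value anyway.
        vocab_size = len(new_state[decoder_weight_key])
        new_state[bias_key] = [0.0] * vocab_size
    return new_state

# Hypothetical mapping from old (v0.7-style) to new (v1-style) names.
KEY_MAP = {
    "0.encoder.weight": "0.encoder.weight",
    "1.decoder.weight": "1.decoder.weight",
}

old = {
    "0.encoder.weight": [[0.1], [0.2]],
    "1.decoder.weight": [[0.1], [0.2]],
}
new = convert_state_dict(old, KEY_MAP, "1.decoder.weight", "1.decoder.bias")
```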
Would you be interested in hosting the new versions at ? I’m not sure what the rights on the original files are, so I don’t know if I am allowed to distribute them myself.

We changed the way we tokenize a bit, so the model probably wouldn’t perform as well. Plus it lacks the decoder bias, as you noticed.
Feel free to share your result while waiting for us to release the backward model, but I’d prefer that we host only the new backward model (I promise I’ll train it soon!)

Awesome, thank you in advance :slight_smile:

Just wondering how soon? :slight_smile:

@sgugger Could you please point me to the exact procedure used to train the forward v1 language model? I could train the backwards version and share it, but I’d like to be consistent.

In any case, I think it would be useful to know how to reproduce the forward model.

I am assuming it was produced by - but which code version, and with what settings at each step?

Or perhaps the refactor branch of ulmfit-multilingual is currently a better choice, since it has recently seen more development?
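For anyone else following along: the backwards model is just the same architecture trained on token sequences reversed in time, so it predicts each token from the context to its right. A tiny illustration of how the training pairs differ (the helper and tokens here are my own, not fastai code):

```python
def lm_pairs(tokens, backwards=False):
    """Build (input, target) next-token pairs for language modeling.

    With backwards=True the sequence is reversed first, so the model
    learns to predict each token from its right-hand context.
    """
    seq = list(reversed(tokens)) if backwards else list(tokens)
    return list(zip(seq[:-1], seq[1:]))

tokens = ["xxbos", "the", "cat", "sat"]
fwd = lm_pairs(tokens)                  # left-to-right prediction
bwd = lm_pairs(tokens, backwards=True)  # right-to-left prediction
```

The key point is that the vocabulary and tokenization are identical to the forward model; only the order the model reads the corpus in changes, which is why the same training procedure and hyperparameters should carry over.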