Saving and loading a language model with TransformerXL

UPDATE: It was probably a bug in v1.0.45 (which has been retracted from PyPi). I’m working with v1.0.46 and it seems to be working great.

Hey,

Has anyone tried saving a TransformerXL language model and then loading it again (I’m talking about the Learner, not the encoder)?

I’ve trained a model and got nice predictions from it. Then I tried saving and loading the model in two different ways, but I’m getting junk predictions both ways. It seems as though the state of the model has not been loaded properly. I’m using v1.0.45.

First option I’ve tried, using export/load_learner:

data_lm = TextLMDataBunch.from_df(...)
data_lm.save('data_lm')
learn = language_model_learner(data_lm, TransformerXL)
learn.validate(data_lm.valid_dl)  # validation error ~= 1.9
learn.predict("seed text")  # gives plausible results
learn.export('lm.pkl')

### another process 
learn = load_learner(path, 'lm.pkl')
learn.predict("seed text")  # gives random stuff
# maybe the random output is ok? check the validation error
data_lm = load_data(path, 'data_lm')
learn.validate(data_lm.valid_dl)  # validation error ~= 5.6

Second option, using the deprecated load:

data_lm.save('data_lm')
learn.save('lm')

### another process 
data_lm = load_data(path, 'data_lm')
learn = language_model_learner(data_lm, TransformerXL)
learn.load('lm')
learn.validate(data_lm.valid_dl)  # validation error ~= 5.6

In both cases, the model doesn’t seem to load properly, and I can’t tell whether the problem is in saving or in loading :frowning:


Yes, the architecture is new, so there were bugs in 1.0.45. Hopefully it will be good to go now :wink:
