Loading saved language model

(Kam) #1

Is there any way to load a pretrained language model, similar to ConvLearner.pretrained? Whenever I restart training, md = LanguageModelData(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10) always takes a while. I’ve tried dumping the entire object with pickle, but that doesn’t seem to work.

(Rob H) #2

Have you tried save() and load()?

I’ve been able to do save_encoder() and load_encoder() as in the lesson4 imdb notebook, although I haven’t been successful saving and running the whole model via save() and load(). This is the error I get:

While copying the parameter named 0.encoder.weight, whose dimensions in the model are torch.Size([49346, 200]) and whose dimensions in the checkpoint are torch.Size([49173, 200]), ...

I guess the first dimension is the number of words (the vocabulary size), and maybe if I used the same dataset for training/validation each time it’d be the same, but I haven’t tried that yet.
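A toy illustration of that guess, in plain Python rather than fastai: if the vocabulary is built with a minimum-frequency cutoff, two slightly different corpora can produce vocabularies of different sizes, and the embedding matrix has one row per vocabulary entry. The build_vocab helper and the sample sentences are made up for this sketch; they are not part of the library.

```python
from collections import Counter

def build_vocab(tokens, min_freq):
    """Map each token seen at least min_freq times to an integer index."""
    counts = Counter(tokens)
    # Index 0 is reserved for unknown words in this sketch.
    vocab = {"<unk>": 0}
    for token, count in sorted(counts.items()):
        if count >= min_freq:
            vocab[token] = len(vocab)
    return vocab

# Two hypothetical corpora that differ by a single word.
run_a = "the cat sat on the mat the cat".split()
run_b = "the cat sat on the mat the dog".split()

vocab_a = build_vocab(run_a, min_freq=2)
vocab_b = build_vocab(run_b, min_freq=2)

# The embedding matrix would need len(vocab) rows, so these two runs
# would not be able to share a checkpoint.
print(len(vocab_a), len(vocab_b))  # prints: 3 2
```

If this is what is happening, keeping the same data (and therefore the same vocabulary) between the run that saves and the run that loads should make the two sizes agree.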

(Kam) #3

So I’ve been able to load it successfully, but the issue is that I’m lazy and don’t like waiting the minute or two for md = LanguageModelData(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)

Afterwards I do a learner.load('em_size_500_bs_32_cycle_5_11_25') and everything works. I was just trying to see if there was a faster way to load the language model.

(Sam) #4


Have you found a way to save the datamodel?

I am running the notebook on a local machine and it takes more than 30 minutes to build the datamodel md.

(Kam) #5

So if you read Rob’s comment it tells you how to save and load the built data models. I was referring to loading the model in memory which takes ~1-2 min. What you want is to call the save function on the model.

(Sam) #6

Maybe I did not articulate it well. It’s a two-part question: can I save md to the hard drive, and can I read it back into memory?

Imagine I started running the notebook cell by cell and reached the cell where the LanguageModelData is built.

md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)

This process takes 30-40 minutes on my local machine. Imagine I want to stop the kernel now. Can I save the object md to the hard drive?

Later I may start the notebook again and run all the necessary imports. Can I then load md into memory with something like

md = load_from_pickle(blah blah)??
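For reference, the general "build once, then reload from disk" pattern being asked about can be sketched with the standard library alone. This is only a sketch: load_or_build and slow_build are hypothetical names, and whether the md object itself can be pickled is not established here (the first post reports that dumping the entire object did not work), so the stand-in below caches a plain dictionary instead.

```python
import os
import pickle
import tempfile

def load_or_build(cache_path, build_fn):
    """Return the cached object if cache_path exists, else build and cache it."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    obj = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)
    return obj

def slow_build():
    # Hypothetical stand-in for the expensive tokenizing / vocabulary step.
    return {"itos": ["<unk>", "the", "cat"], "ids": [1, 2, 1, 0]}

# A temporary directory keeps the demonstration from leaving files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "md_cache.pkl")
    first = load_or_build(path, slow_build)   # builds and writes the cache
    second = load_or_build(path, slow_build)  # reads the cache, skips the build
```

The same helper would only work for md if every attribute of that object is picklable; otherwise the pieces that are (for example the vocabulary or the numericalized text) would have to be cached separately.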