Loading saved language model

cheeseblubber · November 26, 2017, 5:53pm

Is there any way to load a pretrained language model, similar to ConvLearner.pretrained? Whenever I restart training, md = LanguageModelData(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10) always takes a while. I’ve tried dumping the entire object with pickle, but that doesn’t seem to work.

rob · November 26, 2017, 6:21pm

Have you tried save() and load()?

I’ve been able to do save_encoder() and load_encoder() as in the lesson4 imdb notebook, although I haven’t been successful saving and running the whole model via save() and load(). This is the error I get:

While copying the parameter named 0.encoder.weight, whose dimensions in the model are torch.Size([49346, 200]) and whose dimensions in the checkpoint are torch.Size([49173, 200]), ...

I guess the first dimension is number of words, and maybe if I used the same dataset for training/validation each time it’d be the same, but I haven’t tried that yet.
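
A minimal sketch of that guess, assuming the vocabulary is rebuilt from the data on every run. The build_vocab helper and the toy corpora are hypothetical stand-ins, not fastai code; the point is only that different text gives a different vocabulary size, and that reusing a saved vocabulary keeps the embedding's first dimension fixed:

```python
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def build_vocab(tokens, min_freq=1):
    """Return the sorted list of tokens that occur at least min_freq times."""
    counts = Counter(tokens)
    return sorted(tok for tok, n in counts.items() if n >= min_freq)


# Two slightly different corpora, standing in for two separate runs.
run_a = "the cat sat on the mat the end".split()
run_b = "the cat sat on the mat the dog barked".split()

vocab_a = build_vocab(run_a)
vocab_b = build_vocab(run_b)

# An embedding matrix has one row per vocabulary entry, so the row counts differ.
print(len(vocab_a), len(vocab_b))

# Persisting the vocabulary from the first run keeps the row count stable.
vocab_path = Path(tempfile.mkdtemp()) / "vocab.pkl"
with open(vocab_path, "wb") as f:
    pickle.dump(vocab_a, f)
with open(vocab_path, "rb") as f:
    restored_vocab = pickle.load(f)
```

If the checkpoint and the model are built from the same saved vocabulary, the size mismatch in the error above should not arise, though I have not verified this against fastai itself.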

cheeseblubber · November 26, 2017, 6:40pm

So I’ve been able to load it successfully, but the issue is that I’m lazy and don’t like waiting the minute or two that md = LanguageModelData(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10) takes.

So afterwards I do a learner.load('em_size_500_bs_32_cycle_5_11_25') and everything works. I was just trying to see if there was a faster way to load the language model.

sam2 · March 13, 2018, 4:47pm

@cheeseblubber,

Have you found a way to save the datamodel?

I am running the notebook on a local machine and it takes more than 30 minutes to build the data model md.

cheeseblubber · March 14, 2018, 1:56am

So if you read Rob’s comment it tells you how to save and load the built data models. I was referring to loading the model in memory which takes ~1-2 min. What you want is to call the save function on the model.

sam2 · March 14, 2018, 12:33pm

@cheeseblubber,
Maybe I did not articulate it well. It’s a two-part question: can I save md to the hard drive, and can I read it back into memory?

Imagine I started running the notebook cell by cell and reached the cell where the LanguageModelData is built.

md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)

This process takes 30-40 minutes on my local machine. Imagine I want to stop the kernel now. Can I save the object md to the hard drive?

Later I may start the notebook again and run all the necessary imports. Can I then load md back into memory with something like

md = load_from_pickle(blah blah)??
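
To make the question concrete, here is a minimal sketch of the caching pattern I have in mind. Since pickling the whole object reportedly does not work (see the first post), this assumes the slow intermediate result can be expressed as plain Python data. The names load_or_build and expensive_numericalize are hypothetical, not part of fastai:

```python
import pickle
import tempfile
from pathlib import Path


def load_or_build(cache_path, build_fn):
    """Load a pickled result if it exists, otherwise build it and cache it."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    result = build_fn()
    with open(cache_path, "wb") as f:
        pickle.dump(result, f)
    return result


calls = []


def expensive_numericalize():
    # Hypothetical stand-in for the slow tokenize/numericalize step.
    calls.append(1)
    return {"itos": ["<unk>", "the", "cat"], "ids": [1, 2, 1, 0]}


cache_file = Path(tempfile.mkdtemp()) / "lm_data.pkl"
first = load_or_build(cache_file, expensive_numericalize)   # builds and saves
second = load_or_build(cache_file, expensive_numericalize)  # loads from disk
```

The second call never runs the expensive function; whether md itself can be rebuilt cheaply from such cached data is exactly what I am asking.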

abhirammv · March 25, 2018, 2:15am

Did anyone find a solution to this? I’m struggling

neves · May 3, 2018, 12:15am

I have exactly the same problem. I’m building a language model on my local machine for some data I gathered, and it took more than 5 hours!

I see that the TEXT variable is pickled:

pickle.dump(TEXT, open(f'{PATH}models/TEXT.pkl','wb'))

but it is useless without the complete model. Tips for speeding up the process also would be nice. I’ve already reduced the parameters to:

bs=32; bptt=40

It is taking so much time. Maybe it is using virtual memory.

cheeseblubber · May 3, 2018, 12:43am

Which language model are you using? The one in nlp.py is the old version and does a lot of computation at initialization; see:

github.com/fastai/fastai/blob/63d990a2cffbf51dbc5af4c8ccbb4af92898e9d1/fastai/nlp.py#L120


class LanguageModelLoader():
    def __init__(self, ds, bs, bptt, backwards=False):
        self.bs,self.bptt,self.backwards = bs,bptt,backwards
        text = sum([o.text for o in ds], [])
        fld = ds.fields['text']
        nums = fld.numericalize([text],device=None if torch.cuda.is_available() else -1)
        self.data = self.batchify(nums)
        self.i,self.iter = 0,0
        self.n = len(self.data)

    def __iter__(self):
        self.i,self.iter = 0,0

Whereas:

github.com/fastai/fastai/blob/7d0a033cecb3bc01ffdcfd3a4299465e11039c95/fastai/text.py#L155


class LanguageModelLoader():
    """ Returns a language model iterator that iterates through batches that are of length N(bptt,5)
    The first batch returned is always bptt+25; the max possible width.  This is done because of they way that pytorch
    allocates cuda memory in order to prevent multiple buffers from being created as the batch width grows.
    """
    def __init__(self, nums, bs, bptt, backwards=False):
        self.bs,self.bptt,self.backwards = bs,bptt,backwards
        self.data = self.batchify(nums)
        self.i,self.iter = 0,0
        self.n = len(self.data)

    def __iter__(self):
        self.i,self.iter = 0,0
        while self.i < self.n-1 and self.iter<len(self):
            if self.i == 0:
                seq_len = self.bptt + 5 * 5

does not. Try using the new language model classes in fastai.text rather than the ones in fastai.nlp. The new loader is handed data that has already been numericalized, so it does no processing at initialization and should be much faster.
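
A rough sketch of the difference, not the fastai implementation: the expensive work (flattening and numericalizing) happens once up front, and the loader's constructor only slices the resulting ids. flatten_once and PrecomputedLoader are hypothetical names. One thing I am sure of is that sum(list_of_lists, []), as used in the older snippet, copies the growing list at every step, whereas itertools.chain.from_iterable makes a single pass; whether that is the main bottleneck in your case is an assumption.

```python
from itertools import chain


def flatten_once(docs):
    """Concatenate per-document token-id lists in a single linear pass."""
    return list(chain.from_iterable(docs))


class PrecomputedLoader:
    """Hypothetical loader that receives ids numericalized beforehand."""

    def __init__(self, nums, bs):
        self.bs = bs
        n_rows = len(nums) // bs  # drop the remainder that does not fill a row
        self.rows = [nums[i * n_rows:(i + 1) * n_rows] for i in range(bs)]

    def __len__(self):
        return len(self.rows[0]) if self.rows else 0


docs = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]
nums = flatten_once(docs)          # done once, could be saved to disk
loader = PrecomputedLoader(nums, bs=2)  # cheap: only slicing
```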

neves · May 3, 2018, 1:49am

My:

LanguageModelLoader.__module__

really is:

'fastai.nlp'

So after the main import commands, I did:

from fastai.text import LanguageModelLoader

I’m running it now. I’ll report when it finishes.

BTW, I’m using my own data. The dataset is 4.5 times the imdb dataset. The language is Portuguese. I don’t know the impact of this.

Now I also reduced bptt to 20.

sam2 · May 3, 2018, 12:24pm

I did a quick search of the repo for "from fastai.text import" and found this nugget in dl2/imdb.ipynb:

"At Fast.ai we have introduced a new module called fastai.text which replaces the torchtext library that was used in our 2018 dl1 course. The fastai.text module also supersedes the fastai.nlp library but retains many of the key functions."

enod · July 11, 2018, 8:08am

Have you found a way so far?

ranih · September 11, 2018, 5:01pm

Hey, has anyone found a way to do that?

borowis · October 25, 2018, 8:02am

There’s a lesson in Part 2 on how to use fastai.text. See the lecture notes, for example: https://medium.com/@hiromi_suenaga/deep-learning-2-part-2-lesson-10-422d87c3340c