Part 2 Lesson 10 wiki

(Kevin Bird) #297

If anybody else is hitting the annoying Can't find model 'en' issue, here is a link to last year's thread where it was fixed: Lesson 4 - OSError: Can’t find model ‘en’. It showed up for me when I ran the get_all line in the imdb notebook.

(Arvind Nagaraj) #298

After the input and the first embedding layer, it’s all tensors…English text, french text, stock market data, video frames…it’s all just real numbers.
RNN doesn’t discriminate.

(Arvind Nagaraj) #299

And the en refers to the small English model provided by spaCy 2.0.

spaCy also has medium and large models, and its documentation describes each of the English models in detail.

(Phani Srikanth) #300

Each epoch for the language model takes around 17 minutes on the p3.2xlarge AWS instance (Volta V100 GPU).

Does it really take this long? Could anyone confirm?

(Asif Imran) #301

Nice! Do you have a favorite paper/ref to read up more on this?

(blake west) #303

In the video, Jeremy says, “as per usual, we do a single epoch on the last layer”, which he explains is the embedding weights, and he does this because they’re the thing that will be most wrong.
But in the code, just a few cells before that epoch run, he calls learner.unfreeze(), which unfreezes all the layers. I’m left to assume there’s some missing code like learner.freeze(-1) or something right before he starts training the language model, no? Any insights here?

(Arvind Nagaraj) #304

Models documentation:
Tokenizer documentation:
Language processing pipelines:

(Gerardo Garcia) #305

Do you mind to share?

(Arvind Nagaraj) #306

I think so too…it should have been learner.freeze_to(-1) instead of unfreeze.
But going by the learning rate that lr_find found, it seems like it won’t impact the weights much even if you unfreeze completely and train the whole LM.

Let me look at it carefully again…

(blake west) #307

Yeah it takes forever. While I’m playing around with it, I just did

trn_dl = LanguageModelLoader(trn_tokens[:len(trn_tokens)//10], bs, bptt)
val_dl = LanguageModelLoader(val_tokens[:len(val_tokens)//10], bs, bptt)
md = LanguageModelData(PATH, 1, vs, trn_dl, val_dl, bs=bs, bptt=bptt)

Notice, I just took the first tenth of trn and val. With the default Paperspace fastai setup, it still took 7 minutes to get through 1 epoch. Ouch! Might need to upgrade machines…
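That subsetting trick is handy for quick iteration in general. A minimal stdlib sketch of the same idea (the helper name is my own, not from the notebook):

```python
def take_fraction(tokens, denom=10):
    """Return the first 1/denom of a token sequence, e.g. for
    prototyping a language model on a fraction of the corpus."""
    return tokens[:len(tokens) // denom]

# e.g. a 100-token corpus -> the first 10 tokens
subset = take_fraction(list(range(100)))
```

You'd then build the LanguageModelLoader from the subset instead of the full token array, exactly as in the cells above.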

(vibhor sood) #309

Can you please also add English subtitles to the lesson 10 video… thanks

(Phani Srikanth) #310

We sure may have to!

(Xu Fei) #311

~36 min/epoch on a 1070 in an Ubuntu box; it is slow indeed.

(Arnav) #314

@vibhorsood Terminate the other terminals and remove ‘unsup’ from CLASSES; it’ll reduce your training set from 100k to 50k reviews, but it’ll take care of the memory errors.

(vibhor sood) #315

I am getting the following error on this line:

tok_trn, trn_labels = get_all(df_trn, 1)

(Arnav) #316

@vibhorsood You don’t have the spacy english model installed.
!python -m spacy download en should do it.
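If you want to check up front whether the model is actually installed, here is a small stdlib sketch (the helper name is my own; en_core_web_sm is the pip-installable package that spaCy 2.x's en shortcut points at, and this checks the package, not the shortcut link itself):

```python
import importlib.util

def spacy_model_installed(package="en_core_web_sm"):
    """Return True if the given spaCy model package is importable.
    Checks the pip package, not spaCy's 'en' shortcut link."""
    return importlib.util.find_spec(package) is not None
```

If this returns False, running the download command above (possibly followed by restarting the notebook kernel) should fix the OSError.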

(rachana) #317

@rachel this got more than 3 likes… Just want to know what you think.

(Nikhil B ) #318

That’s a good way to prototype. Whole thing takes ~21 min on 1080ti


While running the classifier is anyone getting this weird error?

After investigating on my side, it’s because the DataLoader spits out a target of size bs x 1, so the target variable here has that size, while pytorch really insists on getting something of size bs. I seem to have fixed it by changing the loss to

def custom_loss(inputs,target):
    return F.cross_entropy(inputs, target.view(-1))

but I wonder if there isn’t a more serious bug here.
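For anyone curious why the shape matters: index-based cross-entropy wants a flat sequence of class indices, shape (bs,), not (bs, 1). A toy pure-Python illustration (no PyTorch; all names here are mine):

```python
import math

def toy_cross_entropy(logits, targets):
    """Mean negative log-likelihood. targets must be a FLAT sequence
    of class indices, mirroring what F.cross_entropy expects."""
    total = 0.0
    for row, t in zip(logits, targets):
        # softmax over the row, then -log of the true class's probability
        exps = [math.exp(x) for x in row]
        total += -math.log(exps[t] / sum(exps))
    return total / len(targets)

# The DataLoader hands back targets shaped (bs, 1):
nested = [[2], [0]]
# Flattening to shape (bs,) is the analogue of target.view(-1):
flat = [row[0] for row in nested]
loss = toy_cross_entropy([[0.1, 0.2, 3.0], [2.0, 0.1, 0.1]], flat)
```

With the nested (bs, 1) targets, the indexing step fails outright, which is the same class of complaint pytorch raises about the extra dimension.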

(Nikhil B ) #320

I haven’t reached that point yet, but this kind of error seems to have been resolved elsewhere by passing a 1D tensor …