Running IMDB notebook under 10 minutes

(Anna Bethke) #12

I tried that technique but seem to have the same issue. I’m running the notebook ‘as is’, except that I set md.nt = 34933 and m3.nt = 34933, but the dimension mismatch is still present. Is there a way to use the same method of getting the text in both places? (E.g., currently the first model uses the downloaded local IMDB files while the second gets them from an online location.)


(Francisco Ingham) #13

Can you be more specific? Do you mean later accessing the cloud from paperspace? I really don’t know how to access a browser after sshing into the paperspace machine.


(Benedikt Brandt) #14

Do you have a Dropbox account? If so, simply upload the zip folder with the pre-trained models to Dropbox and create a shareable link. After sshing into Paperspace, use wget with the shareable link to download the zip folder. (You can use any cloud provider you like, as long as it gives you a direct-access link.)


(Benedikt Brandt) #15

Unfortunately I am still getting an error after setting md.nt=34945:

KeyError                            Traceback (most recent call last)
<ipython-input-29-69866d51d4ab> in <module>()
----> 1 learner.load('imdb_adam3_c1_cl10_cyc_0')

~/fastai_course_playground/dl1/fastai/ in load(self, name)
     62     def get_model_path(self, name): return os.path.join(self.models_path,name)+'.h5'
     63     def save(self, name): save_model(self.model, self.get_model_path(name))
---> 64     def load(self, name): load_model(self.model, self.get_model_path(name))
     66     def set_data(self, data): self.data_ = data

~/fastai_course_playground/dl1/fastai/ in load_model(m, p)
     24 def children(m): return m if isinstance(m, (list, tuple)) else list(m.children())
     25 def save_model(m, p):, p)
---> 26 def load_model(m, p): m.load_state_dict(torch.load(p, map_location=lambda storage, loc: storage))
     28 def load_pre(pre, f, fn):

~/anaconda3/envs/fastai/lib/python3.6/site-packages/torch/nn/modules/ in load_state_dict(self, state_dict, strict)
    488             elif strict:
    489                 raise KeyError('unexpected key "{}" in state_dict'
--> 490                                .format(name))
    491         if strict:
    492             missing = set(own_state.keys()) - set(state_dict.keys())

KeyError: 'unexpected key "1.decoder.bias" in state_dict'

Does anyone have any idea what’s going wrong?
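One way to see what load_state_dict is complaining about is to diff the checkpoint’s keys against the model’s own keys before loading. A sketch with plain sets standing in for the real state_dicts (the key names are taken from the traceback above; whether dropping the extra key is actually safe depends on the model):

```python
# Compare the checkpoint's keys against the model's keys to find the
# mismatch reported by load_state_dict. Sets stand in for real state_dicts.
checkpoint_keys = {'0.encoder.weight', '1.decoder.weight', '1.decoder.bias'}
model_keys = {'0.encoder.weight', '1.decoder.weight'}

unexpected = checkpoint_keys - model_keys   # keys only the checkpoint has
missing = model_keys - checkpoint_keys      # keys only the model has
print(unexpected)  # {'1.decoder.bias'}

# With a real checkpoint you could then drop the unexpected entries before a
# strict load -- at the cost of silently ignoring those weights:
# filtered = {k: v for k, v in state_dict.items() if k in model_keys}
```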



I use Google Colab instead of Paperspace. My solution is to first download the file to my PC, then upload it to Google Drive. Hope this information is helpful.


(Vaibhav ) #17

You can use the CurlWget extension for Chrome. It generates the corresponding wget command after you initiate a download in the browser.


(Vaibhav ) #18

Have you figured it out yet? I am having the same error.


(Arnav) #20

@jeremy Why would the vocab be different if the language model is created on the same dataset? I tried creating the model with data from as well as the official Stanford website, since my guess was that the link in the notebook was modified later, but I am getting the same result.
@vikbehal If you could share the torchtext field file, it would help out a lot!


(Jeremy Howard (Admin)) #21

I don’t really know - some different kind of preprocessing, I guess…
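To make that concrete, here is a toy illustration (with assumed tokenization rules, not the actual fastai/spacy pipeline) of how two preprocessing schemes applied to the same text yield different vocab sizes - which then changes the [vocab_size, emb_dim] shape of the embedding layer and breaks checkpoint loading:

```python
text = "It's a great movie. It's not a bad movie."

# Scheme 1: naive whitespace split -- punctuation stays glued to words
naive = text.lower().split()

# Scheme 2: split off clitics and punctuation (roughly what spacy-style
# tokenizers do)
spacy_like = text.lower().replace("'s", " 's").replace(".", " .").split()

# Same corpus, different vocab sizes
print(len(set(naive)), len(set(spacy_like)))  # 6 8
```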


(MTAU) #22

About to start this exercise. A little put off by the unresolved issues flagged here.

Has anyone resolved this problem or should I start from scratch?


(Abhishek Mishra) #23

Hi Folks,

What is the difference between learner.load_cycle('adam3_10', 2) and learner.load('adam3_10')?
The parameters are just for reference.
I am referring to three lines from the lesson4-imdb notebook, as below.


learner.load_cycle('adam3_10', 2) doesn’t seem to be related to the learner used in the language model.


(Sam Lloyd) #24

The difference is pretty much zilch. load_cycle just formats the string f'{name}_cyc_{number}' and then calls learner.load
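A paraphrase of the relevant methods (not the exact fastai source - details vary between library versions) showing that load_cycle is only a naming convention layered over load:

```python
import os

class LearnerSketch:
    """Paraphrase of the old fastai Learner's save/load naming logic."""
    def __init__(self, models_path='models'):
        self.models_path = models_path
        self.loaded = None  # records the last path passed to load()

    def get_model_path(self, name):
        return os.path.join(self.models_path, name) + '.h5'

    def load(self, name):
        # the real method calls load_model(self.model, self.get_model_path(name))
        self.loaded = self.get_model_path(name)

    def load_cycle(self, name, cycle):
        # just string formatting on top of load()
        self.load(f'{name}_cyc_{cycle}')

learner = LearnerSketch()
learner.load_cycle('adam3_10', 2)
print(learner.loaded)  # same as learner.load('adam3_10_cyc_2')
```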


(Karl) #25

Was the vocab size difference issue ever resolved? I just tried to use the weights uploaded here and found a pretty sizable vocab difference. I get md.nt = 37392 compared to md.nt = 34945.




I’m trying to reproduce the results from scratch for the lesson 4 notebook. Could you help with the following questions?

  • When I use the dataset from the notebook’s link, the train/all folder contains 75000 files, but the test/all folder contains 25000 files.
    If I download this dataset from - there are 25000 (train) + 25000 (test) files.
    Why does the dataset from the server contain 75000 files in the train directory? Is it another version of the dataset? The description in the notebook mentions 25000 files, with no mention of 75000.

  • If we use this dataset from the server for the first part of training, then when we run the following code:
    splits = torchtext.datasets.IMDB.splits(TEXT, IMDB_LABEL, 'data/')
    it seems to download the original version of the dataset (25000 + 25000 files).
    If that is true, then we partly train the model on the’s dataset, then do the final training on the original dataset, and predict on test data from the original dataset.
    Is this the intended training strategy? What is the reason for it?
    I’ve also checked that the state-of-the-art result of 94.1% from the research paper mentioned in the notebook is reached on the original data.
    Is it fair to compare that result with the result from lesson 4 if our dataset is 3x larger than the original? I didn’t inspect the content of the files, but in any case we don’t have the exact same set of training files.
    P.S. I’ve checked the test files from both datasets and they are identical, but our training process is confusing to me.

  • While creating the language model using this code:
    md = LanguageModelData.from_text_files(PATH, TEXT, **FILES, bs=bs, bptt=bptt, min_freq=10)
    we implicitly set the folder for saving the model to a subdirectory of PATH (for example when saving the model with learner.save_encoder).
    How can I save the model outside of the PATH directory?
    For example, on a Paperspace Gradient notebook with the template, all datasets are in a read-only directory, and when I tried to save the model I got an error that it is not possible. So I have to copy the whole dataset from the read-only directory to another directory, which is time- and cost-consuming. I was using a P100 GPU for training, so it was expensive to waste time preparing the dataset.
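On the last point, one possible workaround (a sketch with made-up paths, not a tested Gradient recipe) is to build a writable PATH out of symlinks into the read-only dataset instead of copying it, so that only the models/ subdirectory lives on writable storage:

```shell
# Made-up paths for illustration: $RO stands for the read-only dataset
# mount, $WORK for writable storage.
RO=/tmp/demo_readonly_aclImdb
WORK=/tmp/demo_work_aclImdb
mkdir -p "$RO/train" "$RO/test" "$WORK/models"

# Symlink the dataset folders into the writable PATH; models/ is a real,
# writable directory, so saving the model there can succeed.
ln -sfn "$RO/train" "$WORK/train"
ln -sfn "$RO/test"  "$WORK/test"
ls "$WORK"
```

In the notebook you would then point PATH at the writable directory rather than the dataset mount.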

Correct me please if I am wrong in any of these questions.


(William Collins) #27

I’m getting a copy error even though the sizes are an exact match. I’m very confused.


RuntimeError                              Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/ in load_state_dict(self, state_dict, strict)
    513                 try:
--> 514                     own_state[name].copy_(param)
    515                 except Exception:

RuntimeError: cuda runtime error (59) : device-side assert triggered at /pytorch/torch/lib/THC/generic/THCTensorCopy.c:20

During handling of the above exception, another exception occurred:

RuntimeError                              Traceback (most recent call last)
in ()
----> 1 learner.load(learner_finetune_path)

~/anaconda3/lib/python3.6/site-packages/fastai/ in load(self, name)
     95     def load(self, name):
---> 96         load_model(self.model, self.get_model_path(name))
     97         if hasattr(self, 'swa_model'): load_model(self.swa_model, self.get_model_path(name)[:-3]+'-swa.h5')

~/anaconda3/lib/python3.6/site-packages/fastai/ in load_model(m, p)
     25 def children(m): return m if isinstance(m, (list, tuple)) else list(m.children())
     26 def save_model(m, p):, p)
---> 27 def load_model(m, p): m.load_state_dict(torch.load(p, map_location=lambda storage, loc: storage))
     29 def load_pre(pre, f, fn):

~/anaconda3/lib/python3.6/site-packages/torch/nn/modules/ in load_state_dict(self, state_dict, strict)
    517                     'whose dimensions in the model are {} and '
    518                     'whose dimensions in the checkpoint are {}.'
--> 519                     .format(name, own_state[name].size(), param.size()))
    520             elif strict:
    521                 raise KeyError('unexpected key "{}" in state_dict'

RuntimeError: While copying the parameter named 0.encoder.weight, whose dimensions in the model are torch.Size([50004, 400]) and whose dimensions in the checkpoint are torch.Size([50004, 400]).


(Hans G) #28

These are the dimensions I got in the error as well. No luck, sigh.


(Karan Purohit) #29

Is there any one able to solve this error yet? @William.Collins @Interogativ @grez911



I am also getting this error. The worst part is that it was working before; when I try to run my notebooks again, I start getting this error. I suspect it has something to do with package updates - maybe PyTorch.


(Tom Halasz) #31

Just to report: I see the same discrepancy for md.nt (number of tokens), 37392 vs 34945.


(Karan Purohit) #32

Did you get any resolution for your error?