A Code-First Introduction to Natural Language Processing 2019

I am working with the lesson 7 notebooks on translation and natural text generation. I was wondering if anyone has had some success saving a trained model and using that for inference. I am running into the following errors/roadblocks and would like it if someone could point me in the right direction.

  1. Error loading saved model on CPU - Error message “AttributeError: Can’t get attribute 'seq2seq_loss'”, where seq2seq_loss is the loss function specified in the notebook I’m running. Note: I can load the saved model on a GPU successfully.
  2. Prediction error - After I use the add_test function to add new data to the learner, I am a little confused about how to proceed with the prediction. I used the pred_acts function from the notebook with the dataset type set to Test, but it returns this error: “TypeError: object of type ‘int’ has no len()”
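Regarding the first error: a Learner exported with a custom loss function stores only a pickled *reference* to that function by name, so the same name must be defined (or imported) in your namespace before you call load_learner on the CPU machine. The toy code below reproduces the mechanism with plain pickle; ToyLearner and the empty seq2seq_loss body are stand-ins, not fastai code:

```python
import pickle

def seq2seq_loss(out, targ):       # stands in for the notebook's loss
    return 0.0

class ToyLearner:                  # toy stand-in for a fastai Learner
    def __init__(self, loss_func):
        self.loss_func = loss_func

blob = pickle.dumps(ToyLearner(seq2seq_loss))

# Works while the name is still defined in this module...
learner = pickle.loads(blob)
print(learner.loss_func is seq2seq_loss)   # True

# ...but remove the definition and loading fails exactly like the report:
del seq2seq_loss
try:
    pickle.loads(blob)
except AttributeError as e:
    print(e)   # Can't get attribute 'seq2seq_loss' ...
```

So on the CPU box, paste (or import) the notebook's seq2seq_loss definition into the session first, then load the exported model.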

#Disclaimer - I’m a newbie, so I may not have tried some ‘obvious’ solutions. Thanks in advance

@rachel @jeremy

Finding learning rates for training the IMDB classifier. I am currently working through notebook 5-nn-imdb.ipynb. When Jeremy creates the classifier for the IMDB movie reviews, he uses rather complicated formulas for the learning rates.

I think he is using slice because we want the first layers to train with a smaller learning rate than the last layers.

The first learning rate can be found using lr_find(). But how would I find the lr for freeze_to(-2), freeze_to(-3), etc.?


You’d do the freeze first, then call lr_find().


You mean do:

find best learning rate
find best learning rate again
… ?

Yes, you do the standard learn.lr_find() and learn.recorder.plot(), and pick a learning rate from the plot.
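A sketch of that loop in fastai v1, assuming a text-classifier Learner named learn; lr stands for whatever value you read off each plot, and slice(lr/(2.6**4), lr) is the discriminative learning-rate spread used in the course notebooks:

```python
# Unfreeze progressively; re-run the LR finder at each stage.
learn.freeze_to(-2)                  # last two layer groups trainable
learn.lr_find()
learn.recorder.plot()                # pick lr where the loss is still falling
learn.fit_one_cycle(1, slice(lr / (2.6 ** 4), lr))

learn.freeze_to(-3)                  # one more layer group, same routine
learn.lr_find()
learn.recorder.plot()
learn.fit_one_cycle(1, slice(lr / (2.6 ** 4), lr))
```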


Generating a German language model, the way Jeremy does for the Turkish language. After watching the lecture videos, I was curious to build my own language model for German. I followed the steps but set the language to “de” when downloading the wiki articles.

Nevertheless, this line throws an error: MemoryError. (I watched the processes with top in the terminal, and the jupyter process never went over 50% memory usage.)
My system has 32 GB RAM. (GPU RAM was not used at this stage.)

The error occurs after the progress bar finishes. Any idea where to dig deeper to find the error?

I found in the NLP lectures that we use x[:, i] as input instead of x[i] when we want to predict the next number, even though the numbers are arranged in x[i] order. Why are we doing so? (lesson7-human-numbers.ipynb)
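The short answer is that the batch tensor holds one sequence per row, and the model is fed one time step at a time across the whole batch. A toy illustration in plain Python (the list-of-lists stands in for the batch tensor): x[i] is one entire sequence, while x[:, i] is the token at position i of every sequence in the batch.

```python
# A mini-batch of 3 token sequences, 4 time steps each.
# Row index -> which sequence in the batch.
# Col index -> which time step.
x = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

x_row = x[0]                      # x[i]: the whole first sequence
x_col = [seq[1] for seq in x]     # x[:, 1]: time step 1 across the batch

print(x_row)  # [1, 2, 3, 4]
print(x_col)  # [2, 6, 10]
```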

Hi @rachel,

I was going through the chapter on seq2seq translation.
I wanted to request that you upload the final ‘questions_easy.csv’ (which is used to create the final databunch) to the S3 datalake where fastai has hosted all the other datasets.

The problem is that creating the dataset uses a lot of resources. I have tried running it on Colab and a couple of other machines but wasn’t able to. It would be tremendously helpful if the ready-made dataset were hosted somewhere.

Thanks in advance.
Thank you for the wonderful course as well!



A post was split to a new topic: Remote NLP Study Group Saturdays at 8 AM PST, starting 12/14/2019

Hi, I downloaded a sentiment analysis dataset, a movie review dataset.
The structure of dataset is like this:
1st column: Movie ID
2nd column: movie review (text in english)
3rd column: ratings (decimal numbers from 0 to 5)

So to predict the rating, this has to be a tabular model, but for the model to interpret what the review means, a language model also has to be created and used with the tabular data.

How do I do that?


Hi @AjayStark, have a look at @quan.tran’s helpful post & github on mixed tabular & text, it worked well for me:


Hi, thank you. I’ll look into it :smiley:

Hi @AjayStark ,

If the task is just sentiment analysis, it can be done using only the text_classifier.
That example covers the case where you have both text as well as tabular data.

For sentiment analysis there are two possible ways you can achieve this:
(a) Classification:
You can treat each rating as a separate category, so the categories you will have are 0, 1, 2, 3, 4, 5, and you can train a simple classifier. Compare this to the example from Jeremy’s course: there were two classes there, pos and neg; you have 6 classes here.

(b) Regression:
Instead of treating the ratings as separate classes, you can treat them as a continuum. This is inspired by the fastai.collab module, so first let’s see what’s happening there. The concept of ratings can be adopted directly: if the ratings are between 0 and 5, we first normalize them to 0-1. The main difference is that instead of training with the CrossEntropy loss used for classification tasks, the MSE loss function is used. The same can be done here.
You can check out the implementation of the collab module here. Check out how the y_range is used for the normalization.

So you take a text_classifier module and normalize the ratings. One point to add: Jeremy mentions in his lectures, from practical experience, that when the ratings are between 0 and 5 you should normalize to a range of 0-5.5, because it’s a bit tricky for the model to predict values at the extremities. Then use the MSELoss to train the module.
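A sketch of the regression setup in fastai v1; df, 'review', and 'rating' are placeholders for your own DataFrame and column names, and labelling with FloatList is what switches the learner to regression with an MSE-style loss:

```python
from fastai.text import TextList, FloatList, text_classifier_learner, AWD_LSTM

# Label with FloatList so the target is one float per review and the
# default loss becomes MSE rather than cross-entropy.
data = (TextList.from_df(df, path, cols='review')
        .split_by_rand_pct(0.1)
        .label_from_df(cols='rating', label_cls=FloatList)
        .databunch(bs=64))

learn = text_classifier_learner(data, AWD_LSTM, drop_mult=0.5)
learn.fit_one_cycle(1, 1e-2)
```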

Hope this helps.

EDIT: I’ve never experimented with sentiment analysis, so I cannot say which approach will work better. In my opinion, having been through Jeremy’s course, I think MSE loss should work better. However, experiments should be relied upon rather than intuition.


:sunny: Notice of Serendipity If You Want To Learn NLP!:sunny:

Here’s a great opportunity if you plan to participate in the upcoming TWiML AI x Fastai Study Group for the course A Code-first Introduction to Natural Language Processing.

This Saturday, we will begin a three-week introduction to NLP based on Lesson 12 of Deep Learning From the Foundations. Though not a required prerequisite for the Fastai NLP course, these three sessions should provide a great introduction to and overview of NLP!

Time: 8:30 AM Pacific Time on Saturdays Nov 23rd, Nov 30th, and Dec 7th.
Place: In this Zoom chatroom.

Stay tuned on this channel for an announcement with the details of this Saturday’s meeting!

The TWiML AI x Fastai NLP Study Group will meet Saturdays, beginning Dec. 14th, at 8:00 AM, a week after completion of the 3-part mini-review.


Issue: Preemptible GCP instances

I’m a complete beginner to ML (and programming, sort of), and am doing this course.
I am using GCP free credits on a preemptible instance, as per the instructions in the fastai documentation.

When I try to train on the full IMDB set after loading the wikitext model, the instance keeps getting shut down at various stages of the training.

What happens to my model / data when I restart?
Is it possible to restart the training from where it left off? I have been through this cycle about four times, and I am getting Sisyphus nightmares.

Besides deleting this instance and starting a non-preemptible one, is there a safe way to do this?

Thank you



The Deep Learning From the Foundations Study Group
will meet Saturday Nov 23rd, at 8:30 AM Pacific Time

Topic: Lesson 12, Text Data Preprocessing

Suggested Homework/preparation:

  1. Watch the Lesson 12 video from 1:15:10 to 1:45:17

  2. Familiarize with the notebook 12_text.ipynb

Join the Zoom Meeting when it’s time
To join via phone
Dial US: +1 669 900 6833 or +1 646 876 9923
Meeting ID: 832 034 584

The current meetup schedule is here

Sign up at Sam Charrington’s TWiML & AI x fast.ai Slack Group to receive meetup announcements via email


Hi @zerochi,

If you’re talking about Colab then there’s no way to retain the instance.
But what you can do is connect the Colab notebook to your Drive and save databunches/models, which can then be reused.

The Stack Overflow answer is here

Essentially it’s this:

from google.colab import drive
drive.mount('/content/drive')

And just another tip: change the runtime instance to GPU. It’s not GPU by default.

Runtime -> Change runtime type

Second, if you somehow manage to crash the notebook due to excessive RAM consumption, Colab automatically suggests upgrading your RAM to 25GB.

Yes, Colab is made for short experiments; any long-running process that you leave on after closing the notebook will eventually be stopped.
There are a couple of ‘tricks’, such as a JS script that clicks the Connect button, which apparently helps keep it connected. But I can’t find it right now.



Trying to reproduce the NLP transfer learning from nn-imdb-more.ipynb on Google Colab.

After creating data_lm with my own data and checking the lengths of vocab.itos and vocab.stoi, I’m getting 18744 and 73037, which is OK.

But after data_lm.save('lm_databunch') and data_lm = load_data(path, 'lm_databunch', bs=bs), checking vocab.itos and vocab.stoi again now gives 18744 and 18742.

How is this possible? Or is this my mistake?
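One likely explanation, assuming fastai v1's Vocab: before saving, stoi also maps every rare token to xxunk's index, so it is much larger than itos; but only itos is serialized, and stoi is regenerated from it on load, so the extra entries vanish (and any duplicate tokens in itos collapse, which is why the rebuilt stoi can even be slightly smaller than itos). A toy illustration of that rebuild in plain Python:

```python
from collections import defaultdict

# Before saving: stoi also maps rare tokens to xxunk's id (0),
# so it is larger than itos.
itos = ['xxunk', 'the', 'movie', 'the']          # note the duplicate token
stoi = defaultdict(int, {'xxunk': 0, 'the': 1, 'movie': 2,
                         'rare_a': 0, 'rare_b': 0})
print(len(itos), len(stoi))                       # 4 5

# On load only itos is stored; stoi is regenerated from it, so the
# unk-mapped extras vanish and duplicates in itos collapse.
stoi = defaultdict(int, {tok: i for i, tok in enumerate(itos)})
print(len(itos), len(stoi))                       # 4 3
```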


Thank you. I am not running this on Colab but on Google Cloud hosting. On Google Cloud I created a preemptible compute engine, and I run Jupyter notebooks on it.

My current understanding is that the data is gone, because when training is interrupted, that’s what happens.



Hi @zerochi. Save your model every epoch (or whenever you like) during training. That way, at least, you don’t lose too much when your instance gets preempted/interrupted.
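A sketch of periodic checkpointing with fastai v1's SaveModelCallback, assuming a Learner named learn; the name 'checkpoint' is arbitrary, and files land in learn.path/models/:

```python
from fastai.callbacks import SaveModelCallback

# every='epoch' writes checkpoint_0.pth, checkpoint_1.pth, ... each epoch.
learn.fit_one_cycle(10, 1e-3,
                    callbacks=[SaveModelCallback(learn, every='epoch',
                                                 name='checkpoint')])

# After a preemption, rebuild the learner and resume from the last file:
learn.load('checkpoint_4')
```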
