Inconsistency in Text Classification with ULMFiT (fastai v1)

I am trying to reproduce the tutorial at https://docs.fast.ai/text.html on Google Colab, but my results are not close to the ones shown there.

My colab notebook: https://colab.research.google.com/drive/1140hSsTvyTY22nbHZG340v7ia5JIhWqD

I would appreciate any ideas on how to fix this.

I don’t have permission to view the notebook when I click on the link.

@howkhang Sorry about that. I have updated the link, so you should be able to view the notebook now.

You’re using the small sample URLs.IMDB_SAMPLE to fine-tune the language model, which would explain the classifier’s low accuracy.

Instead, you should be using URLs.IMDB, which is the full dataset.
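
For reference, here is a rough sketch of what that switch might look like. It assumes a fastai v1 release from roughly 1.0.50 onward (older releases named some of these calls differently, for example random_split_by_pct), and the helper names dataset_attr and build_lm_learner are made up for illustration. The full-dataset branch follows the data block calls used in the lesson3-imdb notebook.

```python
def dataset_attr(use_full: bool) -> str:
    """Name of the fastai URLs attribute to download (illustrative helper)."""
    return "IMDB" if use_full else "IMDB_SAMPLE"


def build_lm_learner(use_full: bool = True, bs: int = 48):
    """Build a language-model learner on IMDB or IMDB_SAMPLE (sketch)."""
    # Imported inside the function so dataset_attr works without fastai.
    from fastai.text import (URLs, untar_data, TextList, TextLMDataBunch,
                             language_model_learner, AWD_LSTM)

    path = untar_data(getattr(URLs, dataset_attr(use_full)))
    if use_full:
        # Full IMDB ships as folders: train/, test/ and unsup/ (unlabelled).
        data_lm = (TextList.from_folder(path)
                   .filter_by_folder(include=['train', 'test', 'unsup'])
                   .split_by_rand_pct(0.1)   # hold out 10% for validation
                   .label_for_lm()
                   .databunch(bs=bs))
    else:
        # The sample is a single texts.csv with about 1,000 reviews.
        data_lm = TextLMDataBunch.from_csv(path, 'texts.csv')
    return language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3)
```

Calling build_lm_learner() downloads the full dataset, and fine-tuning on it takes far longer than on the sample, so a GPU runtime is advisable.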

@howkhang Thanks for your response. The example at https://docs.fast.ai/text.html uses the same URLs.IMDB_SAMPLE data, and the accuracy reported there is much higher than what I am getting.

I made a copy of your notebook and ran it in Google Colab and similarly could not replicate the numbers in the docs page.

Rather than trying to replicate the numbers on the docs page, try training the classifier on the full IMDB set. I managed to get over 94% accuracy by following the lesson3-imdb notebook.
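
A minimal sketch of the classifier stage, loosely following the first two training steps of the lesson3-imdb notebook. It assumes fastai v1 (about 1.0.50 or later), that path points at the extracted full IMDB folder, that vocab is the vocabulary of the language-model data, and that a fine-tuned encoder was saved earlier under the name 'fine_tuned_enc' in the same models directory. The function names and the encoder name are illustrative, and the learning rates are the notebook's, not tuned values.

```python
def discriminative_lrs(max_lr: float, factor: float = 2.6, depth: int = 4):
    """Return (lowest, highest) learning rate for a slice(lo, hi) schedule."""
    return max_lr / factor ** depth, max_lr


def train_classifier(path, vocab, encoder_name: str = 'fine_tuned_enc',
                     bs: int = 48):
    """Train the head, then the last two layer groups (sketch)."""
    # Imported inside the function so the helper above works without fastai.
    from fastai.text import TextList, text_classifier_learner, AWD_LSTM

    # Reuse the language model's vocab so token ids match the encoder.
    data_clas = (TextList.from_folder(path, vocab=vocab)
                 .split_by_folder(valid='test')
                 .label_from_folder(classes=['neg', 'pos'])
                 .databunch(bs=bs))

    learn = text_classifier_learner(data_clas, AWD_LSTM, drop_mult=0.5)
    learn.load_encoder(encoder_name)      # assumed to exist already

    # Stage 1: only the classifier head is trainable.
    learn.fit_one_cycle(1, 2e-2, moms=(0.8, 0.7))

    # Stage 2: unfreeze one more layer group, lower rates for earlier layers.
    learn.freeze_to(-2)
    lo, hi = discriminative_lrs(1e-2)
    learn.fit_one_cycle(1, slice(lo, hi), moms=(0.8, 0.7))
    return learn
```

The notebook continues unfreezing further layer groups after these two stages; the sketch stops early to stay short.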
