Lesson 3: IMDB dataset with Colab

Is it just me, or does the IMDB dataset take very long to train even with a GPU? I ran just one epoch, fit_one_cycle(1, 1e-2, moms=(0.8, 0.7)), and it took 1 hour and 25 minutes with the GPU… is that normal in Colab? I saw that the times in the GitHub repository are much faster (it took only 3 minutes to train), so I am not sure what is happening.

Thanks very much!
amotz


Try:

!pip uninstall -y torch  # -y skips pip's confirmation prompt, which can hang a notebook cell
!curl -s https://course.fast.ai/setup/colab | bash

Also try using .to_fp16(); it should help speed up training. I average about 10-15 minutes/epoch with IMDB (depending on which GPU you get).
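For reference, this is roughly what it looks like in the lesson 3 notebook (a minimal sketch; the data_lm bunch and the hyperparameters are assumptions based on the lesson code, not requirements):

from fastai.text import *

# build the language-model learner as in lesson 3, then switch it to
# mixed precision with .to_fp16() to speed up training on the GPU
learn = language_model_learner(data_lm, AWD_LSTM, drop_mult=0.3).to_fp16()
learn.fit_one_cycle(1, 1e-2, moms=(0.8, 0.7))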


Hey! Thanks very much for the help! I will try that and see what happens :).

By the way, why do you recommend uninstalling torch? Isn't it important for fastai to work properly? I thought fastai is built on PyTorch.

Thanks very much for giving me a sense of what training times to expect. Did you work with Colab as well?

I can’t comment on the PyTorch issue, but I use Colab for all of my deep learning problems, and I can live with LM epoch times of 10-15 minutes or so.

It still runs very slowly for me. Oh well, maybe I will think about it more tomorrow :). Anyhow, thanks a lot for the help, and tomorrow I will try your trick with .to_fp16().

Well, today I tried all the tricks you suggested: first updating fastai and uninstalling torch, and then the .to_fp16() trick, and nothing seems to work yet :/.

I also tried changing the batch size from 48 to 35 to 16 to 8, and that actually seems to make fit_one_cycle take even longer! With a batch size of 48 it was about 1 hour and 25 minutes, and now at 8 it is 2:45… which I really don't understand, because I thought that decreasing the batch size should make the training time go down, not up. I am really suspicious that I might not be using the GPU, but I double and triple checked that: I clicked Runtime > Change runtime type, and I can clearly see that I am using a GPU.

I also tried playing with the momentums (moms) for a while, but that really seems to break things, so I bet I will need to give all of this some more thinking time…

If you have any new ideas that would be great, and anyway, thanks very much for the help!! :)
amotz

@amotz
Confirming the same issue with training the wiki language model in Colab. Wanted to check before giving up and moving to AWS.

Training the language model takes a long time.
The classifier takes 10-15 mins, that's right.

Are you sure you’ve changed the runtime to GPU?
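Besides the Runtime menu, you can confirm from a cell that PyTorch actually sees the GPU (a quick sketch using PyTorch's standard CUDA utilities):

import torch

# True means the runtime exposes a CUDA device to PyTorch
print(torch.cuda.is_available())
# the GPU Colab assigned you, e.g. 'Tesla T4' or 'Tesla K80'
print(torch.cuda.get_device_name(0))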

For me the problem still persists, and I don't think I am knowledgeable enough to understand what causes it yet. I searched the internet with no luck and tried a couple of other things, but it is still too slow (10-15 minutes would be long but OK; for me it takes around 1 hour and 30-40 minutes per epoch with the GitHub repository code). I might try IMDB on Kaggle some other time, but for now I have just continued with the next lectures…

I also checked whether I have a GPU with TensorFlow commands and in the runtime settings, and it clearly says I have a GPU… I am really a beginner, so I might be missing something obvious, but for now it's not working for me either :)

Mind creating a GitHub gist?
This should not be happening.
I use Colab regularly and have never encountered this.

I want to see the code.

Hmmm, OK, that would be nice of you :). I am really a beginner, so I hope it's not something totally elementary, but I did try my best for quite a long time to fix it. I will look at it tomorrow and run the code again to see if the problem still happens (I tried 3 times, so I am quite sure it will, unless someone fixed it in the last two weeks). If the problem is still there, I will upload it to a GitHub gist and notify you here so you can see it (I had never even heard of GitHub gists before you mentioned them, but from what I saw on YouTube it doesn't seem hard to use).

Thank you for your help, and I wish you the best :)

Finally I was able to do a complete walkthrough of the IMDB notebook entirely in Colab. Here's the GitHub gist.

@amotz

One key trick when using Colab:

  • Use the cell magic %%time to report how long a cell took to run (a quick example follows this list).
  • If a step takes a really long time, then in addition to saving the result on your Colab runtime instance, save it to your Drive.
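For instance, a minimal sketch of the %%time magic (learn here stands in for the IMDB learner from the notebook):

%%time
# prints CPU and wall time for the whole cell once it finishes
learn.fit_one_cycle(1, 1e-2, moms=(0.8, 0.7))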

Code snippet for mounting Drive and setting up the paths:

import shutil
from pathlib import Path

import fastai
from fastai.text import *  # provides Config, used below for the local data path

# make your Google Drive accessible
from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)
root_dir = "/content/gdrive/My Drive"
colab_data = f"{root_dir}/Colab Notebooks/data"

gdrive_imdb = Path(colab_data)/'imdb'
local_imdb = Config.data_path()/'imdb'

def copy(src: Path, dest: Path, is_directory: bool=False):
  # note: shutil.copytree requires that dest does not already exist
  if is_directory:
    shutil.copytree(str(src), str(dest))
  else:
    shutil.copyfile(str(src), str(dest))

Then, in addition to learn.save(), do the following:

# Copy the saved models from local to drive
copy(local_imdb/'models', gdrive_imdb/'models', is_directory=True)
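After a disconnect, copy things back the other way before loading (a minimal sketch; 'fine_tuned' is a hypothetical checkpoint name, and this assumes the learner's data lives under local_imdb with an existing models folder):

# restore a saved model from Drive to the fresh runtime, then load it
copy(gdrive_imdb/'models'/'fine_tuned.pth', local_imdb/'models'/'fine_tuned.pth')
learn.load('fine_tuned')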

You can apply the same approach to the DataBunch as well if it takes a long time to build.
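For example, with fastai v1's DataBunch save and load_data (a minimal sketch; data_lm and bs=48 are assumptions taken from the lesson notebook):

# save the processed DataBunch locally, then back it up to Drive
data_lm.save('data_lm.pkl')
copy(local_imdb/'data_lm.pkl', gdrive_imdb/'data_lm.pkl')

# after reconnecting, restore it instead of re-tokenizing from scratch
copy(gdrive_imdb/'data_lm.pkl', local_imdb/'data_lm.pkl')
data_lm = load_data(local_imdb, 'data_lm.pkl', bs=48)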

This was the issue that was preventing me from completing the walkthrough of the IMDB notebook: the Colab runtime would get disconnected, and since it connects to a new runtime when it reconnects, I was losing the saved models.

Hope this helps.

And thanks @chatuur, your response made me reevaluate whether I was missing something instead of giving up :). So thank you.


Well done msivanes :). Don't let people like me make you give up; I am really a beginner who has never worked in programming and is just doing things alone for fun. I will first look at your gist and try the things you said in the comment; then, only if the problem still persists, will I make a gist myself :).

Thank you for such a comprehensive explanation! :)