Yesterday I published a detailed write-up on how to set up Google Colab and have it sync with your Drive, both for development and for storing datasets.
When using the Drive desktop app for syncing, you can even write your scripts locally and immediately use them in your Colab notebooks. I find that especially useful for part 2 of the course. Anyway, here it is:
“Setting up Google Colab for DeepLearning experiments” by Oliver Müller
Hi, I went through lesson 1 and used my own dataset by importing images into Google Drive using Colab. I am curious to know if there is a way to upload and use the images directly from the local machine, without importing the images anywhere (Colab or Drive)? This is needed for security reasons. Thanks in advance!
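For context, the closest thing I've found is Colab's built-in upload helper, shown below. Note that the files still land on the Colab VM's disk (they bypass Drive, but not Colab itself), so check whether that satisfies your security constraints:

```python
from google.colab import files

# Opens a browser file picker; the chosen files are uploaded into the
# Colab VM's ephemeral disk (gone when the session ends, never on Drive)
uploaded = files.upload()   # dict: filename -> contents as bytes

for name, data in uploaded.items():
    with open(name, 'wb') as f:   # write into the runtime's working dir
        f.write(data)
    print(f'saved {name}, {len(data)} bytes')
```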
Do you need to manually download all the course files from GitHub and save them in your Drive in order to get the Lesson 0 notebook to work (specifically, the cat example)? The only way I’ve managed to get that line to work is by running the code to mount My Drive as the base_dir and then changing the path in the open() call to match where the image is stored in My Drive (because I downloaded the entire fastai-v3 course repo and saved it there).
However, I haven’t seen anyone else mention that they needed to clone the repo from GitHub and save it in My Drive (and it’s also not mentioned in the tutorial), so I’m afraid I’ve done something wrong. Would anyone be able to help me?
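For what it’s worth, the workaround I’d expect is cloning the repo straight into the Colab runtime instead of into My Drive; a sketch of what I mean (the repo URL and the image path are my guesses, adjust to what the notebook actually opens):

```python
# Clone the course repo into the Colab VM instead of My Drive
!git clone https://github.com/fastai/course-v3.git

# Hypothetical image path -- adjust to where the cat example
# actually lives inside the checkout
img = open('course-v3/nbs/dl1/images/cat_example.jpg', 'rb').read()
print(len(img), 'bytes')
```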
Yes, this is possible. First you download the Kaggle dataset to your local machine. From there you upload it to your Google Drive.
If you mount the Drive as described in my post, you can access any folder that is part of the Drive. Since your training data is part of the Drive, you just have to figure out the path. After you’ve mounted your Drive, you can open the panel on the left (in Colab) to browse your files in the file tree. Let me know if you need help setting this up.
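For reference, the mount itself is just two lines (the authorization prompt appears in the cell output; the exact mount point is whatever you pass in):

```python
from google.colab import drive

# Prompts for authorization, then makes your Drive available
# under /content/drive (your files live in 'My Drive' inside it)
drive.mount('/content/drive')

# List a folder to figure out the path to your training data
!ls "/content/drive/My Drive"
```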
My 2 cents: one thing I’ve found is that if you instead copy the download link when you download the dataset and wget it in Colab, it typically downloads much faster (since it uses Google’s servers); you can then move or copy the folder to your Google Drive with the mv or cp bash commands. If you want or need a visualization of this, I can post that later on today.
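Roughly like this (the URL is a placeholder, paste the actual direct-download link you copied):

```python
# Placeholder URL -- substitute the direct download link you copied
!wget -O dataset.zip "https://example.com/path/to/dataset.zip"
!unzip -q dataset.zip -d data

# Then copy it over to Drive if you want to keep it
# (assumes the Drive is mounted at /content/drive as shown above)
!cp -r data "/content/drive/My Drive/datasets/"
```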
You’re right, loading directly from the Drive is really slow. I’ve been working with univariate time series lately; they’re so small in size that I kind of forgot about that problem…
Copying the data from the Drive to the notebook VM (basically what you just described) works, though.
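I.e. something like this before training (the paths are from my setup, adjust to yours):

```python
# Copy once from the slow Drive mount onto the VM's fast local disk,
# then point the DataBunch at /content/data instead of the mount
!cp -r "/content/drive/My Drive/datasets/my_data" /content/data
```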
The snippet learn.fit_one_cycle(10, 1e-3, moms=(0.8,0.7)) below takes approximately 20 hours. Google Colab’s resource limit is set at 12 hours, which means the session is terminated before it’s finished. (Frustrating!)
Hey @mrfabulous1! I’d do checkpoints and model saves every x epochs, and save those checkpoints to your Google Drive. Otherwise, for language models, I usually go to Paperspace for a few hours…
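Concretely, something like this with fastai v1’s callbacks. A sketch, assuming `learn` is your learner and the Drive is mounted at /content/drive (the folder name is just an example):

```python
from pathlib import Path
from fastai.callbacks import SaveModelCallback, CSVLogger

# Point the learner at the mounted Drive so checkpoints
# survive the 12 h cutoff
learn.path = Path('/content/drive/My Drive/lm_checkpoints')

learn.fit_one_cycle(10, 1e-3, moms=(0.8, 0.7),
                    callbacks=[
                        # writes models/model_0.pth, model_1.pth, ... each epoch
                        SaveModelCallback(learn, every='epoch', name='model'),
                        # appends per-epoch losses/metrics to history.csv
                        CSVLogger(learn)])
```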
The output of the command above is shown below; the command saves a list of the epoch results to a CSV file, and a model is also saved to a file after every epoch.
How can I change the command above to stop at epoch 28, where the training loss becomes less than the validation loss?
I have tried using other values such as error_rate, but I get the following error and am not sure how to change the command to achieve the result I require.
/anaconda/envs/fastai_uvicorn_0_7_1/lib/python3.6/site-packages/fastai/callbacks/tracker.py:50: UserWarning: <class 'fastai.callbacks.tracker.SaveModelCallback'> conditioned on metric error_rate which is not available. Available metrics are: train_loss, valid_loss
  warn(f'{self.__class__} conditioned on metric {self.monitor} which is not available. Available metrics are: {", ".join(map(str, self.learn.recorder.names[1:-1]))}')
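The warning means SaveModelCallback can only monitor what the recorder actually tracks: train_loss, valid_loss, plus whatever you passed as metrics= when creating the learner. So either recreate the learner with metrics=error_rate, or use a small custom callback for your stop condition. A sketch against fastai v1 (`data` is assumed to be your DataBunch, and the callback class is my own, not part of the library):

```python
from fastai.vision import cnn_learner, models, error_rate
from fastai.basic_train import LearnerCallback

# Option 1: track error_rate at learner creation so callbacks can monitor it
learn = cnn_learner(data, models.resnet34, metrics=error_rate)

# Option 2: a custom callback that stops training the first time the
# (smoothed) training loss drops below the validation loss
class StopWhenTrainBelowValid(LearnerCallback):
    def on_epoch_end(self, smooth_loss, last_metrics, **kwargs):
        valid_loss = last_metrics[0]   # fastai v1 puts valid_loss first
        if valid_loss is not None and smooth_loss < valid_loss:
            return {'stop_training': True}

learn.fit_one_cycle(40, 1e-3, callbacks=[StopWhenTrainBelowValid(learn)])
```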
Hi,
Need some help with the lr_find function on Colab. I am trying to do lesson 2 of the fastai course: https://course.fast.ai/videos/?lesson=2
This involves picking up URLs of images from the internet and running a CNN to categorize them.
I picked 3 items: forks, ladles, and spoons (attached). I ran the image download, then stored and verified the images. This is all fine.
I ran fit_one_cycle on the CNN, which gave some numbers.
The next steps ask you to unfreeze the model and run a learning rate finder through it. Here is where I get stuck: instead of giving me some numbers and a graph, it gives me #na#.
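For reference, the sequence from the lesson is just:

```python
learn.unfreeze()        # make all layers trainable again
learn.lr_find()         # sweeps learning rates, stops once the loss diverges
learn.recorder.plot()   # the loss-vs-learning-rate graph
```

In fastai v1, lr_find only runs a partial epoch and skips validation, so the #na# entries in the epoch table are expected; the actual graph comes from learn.recorder.plot() afterwards.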