Lesson 1 In-Class Discussion ✅

Looking forward to the discussions on this later in the course, but if you’re interested, parts of this article might be useful: https://arxiv.org/pdf/1608.08614


My best guess is that different mini-batches were used when executing lr_find. Nevertheless, the first plot is still quite troubling to me. I guess what it says is that the model is already close to the best weights.

Would be interesting to maybe turn this into a computer vision problem - for example chord recognition from a live video recording!

I had a few hours to study and set up a working Google Cloud fastai v1 environment running the first lesson notebook.

My idea was to use this “learning time” to work on problems that affect people’s lives.

So I looked around for a dataset and came across this one (https://rdm.inesctec.pt/dataset/nis-2017-003/resource/df04ea95-36a7-49a8-9b70-605798460c35) after reading the paper behind it (https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0177544). Doing that, I ran into several questions, some of them prompted by things I learned from other inspiring classmates. Maybe together we can get a better understanding of how things work.

A - Can we use a different approach instead of transfer learning?
I mean, is there any benefit in defining a set of custom layers on our own, as we can do with Keras/TensorFlow? I find this interesting, for instance, to replicate network architectures described in papers or to explore different problems (such as classifying galaxies, x-rays, etc.)

B - How can we define and use custom normalisation criteria rather than the ones used on ImageNet?
Is there any best-practice and/or scientific advice on how and when to use such a custom approach?
For instance, I found a different set of criteria in this paper, and I got curious about it.

C - Where is the “resnet” weights file saved locally?
Can I define which folder such “weights” are saved in?
From reading other forum posts, it seems there is a default “~/.fastai/data/oxford-iiit-pet/images/models/” folder (https://forums.fast.ai/t/lesson-1-chat/27332/636)

D - Can we save our “trained” net weights and reuse them in other projects?
Where is this file saved instead?

E - Which “models” are available inside fastai other than ResNet?
E.g. model.ResNet34 and ResNet50 are already available. Which other pre-trained models are there?

F - How do we define the local file system path where data from the DataBatch object is stored?
Is it the “project folder” or the “fast.ai default” one?

G - How can we use AWS S3 (or the Google equivalent) to store files (e.g. datasets to be used for training) rather than the local file system? Would that make sense at all?

H - Can we store (and later retrieve) our “trained” model as an AWS S3 file?
If so, how can we do that?

I - Can we define a custom set of data to be used as train/validation/test, or does fast.ai not allow that?

J - Once we are happy with our model, how can we use it in real life?
I saw someone using the “eval” method, but I didn’t understand whether we can make the outcome available in real time (using “stored/saved” weights) or whether we need to run our model once again.



I have a few questions. I’d be very grateful if someone could help.

Is cuda92 a strict requirement for successfully running the first lesson?

I’ve installed both fastai and pytorch v1. Everything was going fine, but when training started I found out that it’s running only on the CPU.
I think it is because my Nvidia driver is 384.81 but the requirement for cuda92 is 396.26.

Last questions:
Do I need to install a new driver (>=396.26) and then install cuda92 to set everything up?
Can I change the Nvidia driver for one conda environment without disturbing the base system?

@insoluble You don’t need to install cuda92 separately. The PyTorch binaries ship with their own CUDA libraries. You just need to uninstall your Nvidia drivers and install drivers > 396.xx.

This post discussed the same issue. https://forums.fast.ai/t/setting-up-gpu-for-fastai-v3/27678/5?u=magnieet

Not sure you can install Nvidia drivers for a particular environment. But you can have a different CUDA version for each environment.

Is there a way to get the filenames for the images returned via interp.top_losses(9)?

That method returns the indexes of the top 9 worst predictions, but I don’t see how I can tie them back to the actual filenames in my file system. I’d love to be able to do this, since in my dataset I’m finding that many of the top losses are actually labeled incorrectly and I’d like to move them to the correct folders.

I ran the first lesson on cuda90 and torch 0.4 (I only had to specify the padding mode) in ImageDataBunch:

data = ImageDataBunch.from_csv(words_path,

not mentally ready to rebuild my PC :smile:


You can (could? the library moves fast) find them in data.valid_ds.x. In a current Kaggle comp I am plotting my filenames against images in a copy of ClassificationInterpretation.plot_top_losses, using self.data.valid_ds.x[idx], for exactly the purpose you describe: uncovering ground-truth label errors to make a better classifier.
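For anyone who wants the general idea without depending on a particular fastai version, here is a minimal sketch in plain Python. It is not the fastai API: `top_loss_files`, `example_paths` and `example_losses` are hypothetical names, and it assumes you already have one loss per validation item plus a parallel list of file paths (in fastai these would presumably come from the interpretation object and data.valid_ds.x).

```python
from pathlib import Path
from typing import List, Sequence, Tuple


def top_loss_files(losses: Sequence[float],
                   paths: Sequence[Path],
                   k: int = 9) -> List[Tuple[int, Path, float]]:
    """Return (index, path, loss) for the k items with the highest loss."""
    if len(losses) != len(paths):
        raise ValueError("losses and paths must have the same length")
    # Sort indices by loss, largest first, and keep the first k.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)[:k]
    return [(i, paths[i], losses[i]) for i in order]


# Hypothetical example data: three validation images and their losses.
example_paths = [Path("valid/cat/1.jpg"), Path("valid/dog/2.jpg"), Path("valid/cat/3.jpg")]
example_losses = [0.10, 2.50, 0.75]
worst = top_loss_files(example_losses, example_paths, k=2)
for idx, path, loss in worst:
    print(idx, path, loss)
```

Once you have the paths, moving a mislabeled file into the right class folder is an ordinary file move (e.g. with shutil.move), done after you have looked at the image yourself.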


(I am on GCP)
Is it ok to update pip:

jupyter@my-fastai-instance:~/tutorials/fastai$ pip list | grep torch
torch                              1.0.0.dev20181022
torchvision                        0.2.1            
You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

Or, is that something that can be adjusted in a global requirements.txt file?

Thanks much for the tip.

What’s funny is that .x doesn’t come up in IntelliSense, so I had no idea it was there. It works though, which is awesome :slight_smile:


… works great.


Re: learn.fit_one_cycle(3, max_lr=slice(1e-7,1e-5))

Is the slice(1e-7,1e-5) setting the min/max LR for one cycle … so that it goes from 1e-7 to 1e-5 and then back down to 1e-7 and then some?

Here you go:



I’m trying to figure out what the recommended way is to load a trained model purely for prediction. In my training notebook I can call:


which will save a my-model.pth file. How can I load this model? I was looking for a class-method on Learner that would instantiate it from the saved file. Does anyone have a recommendation here?

When loading in my own image data, I got the following error from PIL:

IOError: image file is truncated (1 bytes not processed)

I found an answer on SO that advised changing a PIL attribute:

from PIL import ImageFile

Doing this removed the error for me and I was able to train my learners without errors going forward (with decent results), but as I don’t really understand what I did, I’m not sure what else I may have changed by setting this value.

Has anyone else come across this error? Any suggestions on whether this is a viable solution or not? Any better alternatives? TIA!
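The attribute in that Stack Overflow answer is presumably ImageFile.LOAD_TRUNCATED_IMAGES = True (an assumption, since the post above does not show it). That flag makes PIL load whatever is there instead of raising, so the damaged images still end up in training. One alternative is to find the incomplete files first and decide what to do with them. Below is a rough sketch using only the standard library; `looks_truncated_jpeg` and `find_truncated` are hypothetical helper names, and the check is a heuristic that only covers JPEGs (a complete JPEG normally starts with the bytes FF D8 and ends with FF D9).

```python
from pathlib import Path
from typing import List

JPEG_START = b"\xff\xd8"  # start-of-image marker
JPEG_END = b"\xff\xd9"    # end-of-image marker


def looks_truncated_jpeg(path: Path) -> bool:
    """Heuristic: flag files that lack the JPEG start or end marker."""
    data = Path(path).read_bytes()
    if not data.startswith(JPEG_START):
        return True  # not a JPEG at all, or the header is damaged
    return not data.endswith(JPEG_END)


def find_truncated(folder: Path) -> List[Path]:
    """List .jpg/.jpeg files under `folder` that fail the heuristic."""
    return sorted(
        p for p in Path(folder).rglob("*")
        if p.is_file()
        and p.suffix.lower() in {".jpg", ".jpeg"}
        and looks_truncated_jpeg(p)
    )
```

Files with extra bytes after the end marker would be flagged too, so treat the result as a list of candidates to inspect, not as a verdict.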


@jeremy Hi, is there no precompute feature in fastai 1.0.x?

My data is quite large, my Learner seems to be getting stuck, and it’s taking 9-10 hours for one epoch. Is there an efficient way to train on only a sample of the dataset, or to train on one subset and then continue with another, without disturbing the proportion of classes?
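One way to keep class proportions intact is to sample the same fraction from each class separately (stratified sampling). A minimal sketch in plain Python, assuming the dataset can be listed as (filename, label) pairs; `stratified_sample` and the example data are hypothetical, and nothing here is fastai-specific.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def stratified_sample(items: Sequence[Tuple[str, Hashable]],
                      fraction: float,
                      seed: int = 0) -> List[Tuple[str, Hashable]]:
    """Keep roughly `fraction` of each class from (item, label) pairs."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    by_label: Dict[Hashable, list] = defaultdict(list)
    for item in items:
        by_label[item[1]].append(item)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample = []
    for group in by_label.values():
        # Keep at least one example so no class disappears from the subset.
        n_keep = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, n_keep))
    return sample


# Hypothetical dataset: 80 "cat" files and 20 "dog" files.
dataset = [(f"cat_{i}.jpg", "cat") for i in range(80)] + \
          [(f"dog_{i}.jpg", "dog") for i in range(20)]
subset = stratified_sample(dataset, fraction=0.1)
```

Calling it again with a different seed gives another subset with the same class balance, which is one way to continue training on a fresh sample later.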

A - I am not sure I understood your question. Anyway, when you use a pretrained model, fastai creates layers that customize that architecture for your specific problem, based on your DataBunch. You can also change these layers.
C - Exactly that. Can’t you find it?
D - Check the SaveModelCallback, which is in tracker.py.
E - All PyTorch models + the models defined in text/models, vision/models, etc.
I - Yes, you can, and there are several ways to do it. You could use a csv file to define what is train, valid or test (from_csv), or you can use from_lists.
J - Yes, you can.
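Question B (custom normalisation) is not covered in the replies above. One common approach is to compute the per-channel mean and standard deviation from your own training images and use those in place of the ImageNet statistics. A minimal sketch in plain Python, assuming each image is given as a list of (r, g, b) pixels scaled to [0, 1]; `channel_stats` and `normalize` are hypothetical helpers, not fastai functions.

```python
from math import sqrt
from typing import Iterable, List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def channel_stats(images: Iterable[Sequence[Pixel]]) -> Tuple[List[float], List[float]]:
    """Per-channel mean and (population) std over all pixels of all images."""
    n = 0
    total = [0.0, 0.0, 0.0]
    total_sq = [0.0, 0.0, 0.0]
    for image in images:
        for pixel in image:
            n += 1
            for c in range(3):
                total[c] += pixel[c]
                total_sq[c] += pixel[c] ** 2
    if n == 0:
        raise ValueError("no pixels given")
    mean = [t / n for t in total]
    # Var = E[x^2] - E[x]^2; clamp at 0 to absorb floating-point error.
    std = [sqrt(max(0.0, sq / n - m * m)) for sq, m in zip(total_sq, mean)]
    return mean, std


def normalize(pixel: Pixel, mean: Sequence[float], std: Sequence[float]) -> Pixel:
    """Subtract the mean and divide by the std, channel by channel."""
    return tuple((pixel[c] - mean[c]) / std[c] for c in range(3))


# Hypothetical one-image dataset with two pixels.
tiny_images = [[(0.0, 0.2, 1.0), (1.0, 0.4, 0.0)]]
mean, std = channel_stats(tiny_images)
```

In practice you would do this with array operations over a sample of the training set; the loop form is only meant to show what is being computed.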

Are you using all your gpu RAM?

fit one cycle is a training strategy that helps you achieve better results in fewer epochs. The idea is to start from a very small lr and increase it linearly, batch by batch, up to the lr you specified in fit. Then it decreases linearly again. There is more than one discussion about this in the forum, and there is also a very good notebook from @sgugger on GitHub. You should check it out.
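To make the shape of that schedule concrete, here is a small sketch of a linear up-then-down learning rate over the training steps. It only illustrates the description above and is not fastai's implementation: `one_cycle_lr`, `start_div` (how far below the maximum the cycle starts) and `pct_up` (the share of steps spent warming up) are assumptions chosen for the example.

```python
def one_cycle_lr(step: int, total_steps: int, max_lr: float,
                 start_div: float = 25.0, pct_up: float = 0.5) -> float:
    """Learning rate at `step` (0-based) of a linear up-then-down cycle.

    start_div and pct_up are illustrative assumptions, not fastai defaults.
    """
    if total_steps < 2 or not 0 <= step < total_steps:
        raise ValueError("need total_steps >= 2 and 0 <= step < total_steps")
    low_lr = max_lr / start_div
    peak = max(1, int((total_steps - 1) * pct_up))  # step index of the peak
    if step <= peak:
        # Warm-up: interpolate from low_lr up to max_lr.
        return low_lr + (max_lr - low_lr) * step / peak
    # Cool-down: interpolate from max_lr back down to low_lr.
    return max_lr - (max_lr - low_lr) * (step - peak) / (total_steps - 1 - peak)


# Eleven training steps with a maximum learning rate of 1e-3.
schedule = [one_cycle_lr(s, total_steps=11, max_lr=1e-3) for s in range(11)]
```

This covers only how the rate changes over time. As far as I understand, passing slice(1e-7, 1e-5) as max_lr does something different: it spreads different maximum rates across the layer groups, so it is not the min and max of the cycle.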