Wiki: Lesson 1

I have questions about cell 15 of the Lesson 1 notebook, the one about precomputed activations.

# Uncomment the below if you need to reset your precomputed activations
# shutil.rmtree(f'{PATH}tmp', ignore_errors=True)

  1. Why does learn.fit(0.01, 2) in the next cell execute faster with precomputed activations, and slower if I first execute shutil.rmtree(f'{PATH}tmp', ignore_errors=True)? Isn't learn.fit running on the same data in either case (i.e. the data from dogscats/train)?

  2. If I delete the precomputed activations, how do I get them back? Do I have to do a fresh download of the “tmp” folder from the Paperspace script?

  3. Just to clarify the use of precomputed activations - our network has already learned weights from a large image dataset, and we are going to fine-tune them using our “train” folder. Correct?

Thanks a whole bunch
Vikas


My models folder is blank. Did anyone else have the same issue?

I am assuming it should not be blank, because this is the “pretrained” model we start with in Lesson 1, yes?

Hi,

First, let us explain the idea of precomputed activations.
Let’s go back to the network; consider resnet50.

[resnet50 architecture diagram]

When fine-tuning a model we train only the last layer; the previous ones (the frozen layers) remain unchanged. So all computations on the frozen layers will always give the same result: the output of image 1 (of my training set) passing from layer C1 to layer FC7 will never change, yet we would recompute it many times for the same image. An epoch is one pass over all images of the training set, so training for 10 epochs means doing the same computation 10 times. This is a waste, so we take a shortcut: we precompute the result of every training image through the fixed layers, here from C1 to FC7, and save them. These are the precomputed activations. Training is then fast because we reuse the saved outputs of the fixed layers and only compute from FC7 to FC8 for each image.
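
Here is a minimal PyTorch sketch of this idea (not the actual fastai internals; the random tensor just stands in for a batch of training images):

import torch
from torchvision import models

resnet = models.resnet50(pretrained=True)
body = torch.nn.Sequential(*list(resnet.children())[:-1])  # C1 .. pooling, kept frozen
for p in body.parameters():
    p.requires_grad = False
body.eval()

images = torch.randn(8, 3, 224, 224)        # stand-in for the training images
with torch.no_grad():                       # computed once, reused every epoch
    precomputed = body(images).flatten(1)   # the precomputed activations, shape (8, 2048)

head = torch.nn.Linear(2048, 2)             # FC8: the only layer we actually train
logits = head(precomputed)                  # each epoch now only runs this cheap step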

So for question 1:
The answer follows directly: after you delete the tmp folder, the next fit takes much more time because it has to apply all the fixed layers to every training image again. The runs after that will be fast once more.

Question 2:
No, you don’t have to download anything. Recreate the learner with precompute=True (note that precompute is an argument of the learner constructor, not of fit) and the activations will be regenerated automatically the next time you fit:
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 2)
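
Putting it together with the reset line from cell 15, a full reset-and-recompute cycle would look like this (assuming PATH, arch and data are defined as earlier in the notebook):

import shutil

# remove the cached activations, as in cell 15 of the notebook
shutil.rmtree(f'{PATH}tmp', ignore_errors=True)

# rebuilding the learner with precompute=True regenerates the tmp folder;
# the first fit is slow again, the following ones are fast
learn = ConvLearner.pretrained(arch, data, precompute=True)
learn.fit(0.01, 2)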

Question 3:
Precomputed activations and pretrained weights are two different things.
Precomputed activations speed up computations when training a network, while pretrained weights are like the level of knowledge acquired by the network after some training on some data.

Hopefully, it helps.


I have tried distinguishing between flying birds and flying planes. It worked perfectly: 98% accuracy :)

Where is your models folder located? If possible, show us the full path.

Yes absolutely. And thank you for responding.

It is located in /data/dogscats/models

The Lecture Notes suggested that the ResNet model would be downloaded, and I did not see that when I ran the Lesson 1 Jupyter notebook (hence the confusion). Search for “Let’s run the Model!” in the Lecture Notes and you will see what I mean.

The confusion also stems from the fact that the Lesson 1 notebook says we use a “pretrained model”, but I don’t see it anywhere on the disk to be able to use it.

I see. Let me see if I can rephrase it (please confirm if I am understanding this right).

Let’s say that layers C1-FC7 are represented by the equation y = wx (x is the input image, w is the weights, and y is the result, a.k.a. the activation).

If we did not save precomputed activations before, we “precompute” y, i.e. we compute y the first time an image x passes through the network. But every time after that, we do not recompute y; we use the cached values saved in the tmp folder.
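
In code, that caching idea might look something like this minimal sketch (a single Linear layer stands in for the frozen C1-FC7 block, and the in-memory dict is just illustrative - fastai actually stores the arrays on disk in the tmp folder):

import torch

frozen = torch.nn.Linear(10, 5)      # stands in for the fixed C1..FC7 block (y = wx)
for p in frozen.parameters():
    p.requires_grad = False

cache = {}                           # image id -> precomputed activation y

def activation(img_id, x):
    if img_id not in cache:          # first time: compute y and store it
        with torch.no_grad():
            cache[img_id] = frozen(x)
    return cache[img_id]             # afterwards: cheap lookup, no recomputation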

And last but not least, the weights w.
Our Jupyter notebook uses a pretrained model, i.e. w learned from some other datasets. As we train it on our own dataset, the w’s get updated, but the starting point was the pretrained model we began with?

Is that accurate?
(Thank you very much for investing your time and giving a detailed reply)

Hello Rachel,

Thank you for the summary post. It helps me keep my thoughts organized. I have one quick question.

In the Lesson 1 video, Jeremy asks us to download images of our choice from Google and try the image classification. My worry is that if we download, say, 100 images (and use 50 to train, 20 for validation, and 20 to test), wouldn’t the model overfit?

Perhaps I am missing something - what is the purpose of trying our own data?
Is there a number of images you recommend?

Thanks
Vikas

Could you share the number of samples you used for train/validation/test?

For the training set I have:
Birds: 409 images
Planes: 322 images

For the validation set I have:
Birds: 146 images
Planes: 80 images

I have grabbed images from flickr.com


In the lecture, Jeremy Howard gives an example about cricket and baseball with only 10 images, but personally I think you should try with more images.

The purpose of training on our own data is to see what kinds of images this pretrained model can be used for. Besides that, I found it more interesting to train my own model and see which images the model labeled incorrectly. Also, with our own images we need to fine-tune some hyperparameters.


They did not ask for any verification. I just followed this: https://github.com/reshamas/fastai_deeplearn_part1/blob/master/tools/paperspace.md

And you can use the FASTAI15 credit code for Paperspace. For me it took about 5-7 minutes.

Thank you

When you say “Train” your own model - how do we do this? Is there a setting to “not use the pretrained model”?

Does anybody know where the data folder is in https://github.com/fastai/fastai/tree/master/courses/dl1?
I am not able to run the examples.

I’ve been trying to run lesson1-rxt50.ipynb on my own hardware: an i7-6700K box with 16GB of memory and a GTX 1070 graphics card, running Fedora Linux 27. I have installed CUDA 9, the latest version of cuDNN, and PyTorch from git. To fit the network inside the 8GB of GPU memory I had to reduce the batch size. When running the network, the first few steps work fine, and I can see the improvement of the error rate. But while the network is running, my CPU (not GPU) memory is constantly increasing. In the last learning step, when the whole network is unfrozen and we train with three different learning rates, the virtual memory of the process reaches more than 50GB, and eventually the system freezes and needs to be rebooted.

Does anyone know whether this is “normal” (and I should just buy more RAM), a configuration issue, or a memory leak in PyTorch?

Sorry for the confusion. I meant training on our own data. So far I do not know of any fastai library documentation.

Hi @vikasbahirwani,
sorry for the delay.
Yes, you got the point about precomputed activations.
Concerning the pretrained weights w, your understanding is also correct.


This models folder is where the models you have saved are located, using:
learn.save('my_model_name')
So if you don’t run the above instruction, you won’t see anything there.
And when you call:
learn.load('my_model_name')
the load method looks for your ‘my_model_name’ model in the “models” folder.

Pretrained models are located elsewhere.
This is a nice default that comes with the fastai library: every model trained on a specific dataset is saved by default in a subfolder of that dataset's data folder, named models.
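
As a quick sketch of that default layout (the .h5 extension is how the fastai-0.7-era library saved weights; treat the exact path as illustrative):

learn.save('my_model_name')   # writes e.g. data/dogscats/models/my_model_name.h5
learn.load('my_model_name')   # reads the same file back from the models folder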

It’s a great exercise that allows you to go through all the steps: finding data, cleaning it, organizing it, and finally training on it.
Concerning the number of images, you must experiment, but always start small and increase.
About overfitting: you figure out whether the model is overfitting, underfitting, or doing well by looking at the training and validation losses.
It’s also a good idea to experiment with different architectures: resnext50, densenet, …
