Why are we doing this - learn.fit_one_cycle(1)?

saarahasad · July 1, 2019, 6:21pm

What is the purpose of learn.fit_one_cycle(1)?

Could someone make the following explanation clearer?

So it’s very unlikely that we can make the layer 1 features better. It’s very unlikely that the definition of a diagonal line will ever change for dog or cat breeds, or for any kind of image for that matter, versus the ImageNet data the model was originally trained on.

But the last layers, such as the 5th one here, are the ones we would like to change, for example to match the faces of the dogs in our dataset.

So intuitively you can understand that different layers of a convolutional neural net represent different levels of semantic complexity.

So this is why our attempt at fine-tuning this network didn’t work as we expected.

By default, it trains all the layers at the same speed. So it updates the filters that detect diagonal lines or circles just as much as it updates the ones that capture the specific details of a particular dog or cat breed. So we have to change that.

kushaj · July 1, 2019, 8:08pm

learn.fit_one_cycle(1) runs a training loop over your dataset. The 1 here is the number of epochs, i.e. how many full passes to make over the training data. fit_one_cycle refers to the method we are using to train the model: more specifically, it follows the 1cycle policy, where the learning rate is first ramped up and then brought back down over the course of training.
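To make the "one cycle" idea concrete, here is a rough sketch of what such a learning rate schedule can look like. This is not fastai's actual implementation: the function name `one_cycle_lr` and the default values for `max_lr`, `pct_start` and `div_factor` are assumptions picked for illustration, and the real method also varies momentum, which is left out here.

```python
import math

def one_cycle_lr(step, total_steps, max_lr=1e-3, pct_start=0.3, div_factor=25.0):
    """Learning rate at `step` for a simplified one-cycle schedule (illustrative only)."""
    warmup_steps = max(1, int(total_steps * pct_start))
    start_lr = max_lr / div_factor
    if step < warmup_steps:
        # Phase 1: cosine ramp from start_lr up to max_lr.
        t = step / warmup_steps
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * t)) / 2
    # Phase 2: cosine decay from max_lr down towards zero.
    t = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max_lr * (1 + math.cos(math.pi * t)) / 2

# One epoch made of 100 mini-batches: one learning rate per batch.
schedule = [one_cycle_lr(s, 100) for s in range(100)]
peak_step = schedule.index(max(schedule))
```

So even with a single epoch, the learning rate is not constant: it starts low, peaks partway through the batches, and ends very low.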

Now consider a pretrained model. Say you have 5 CNN layers, followed by some Linear layers at the end that classify the image as dog or cat.

CNNs work by learning simpler things first and then more complex things. So the initial layers (the ones close to your input image) learn simple things, and the layers at the end of your model learn complex things.

What are some simpler things? Horizontal lines, edges and diagonals are some examples. These are the things that are learned in the first CNN layer of your model. Now, no matter how much you train your model, a diagonal will still be a diagonal, so there is not much left to learn there. The diagonal stays the same whether I am identifying a dog, a car or anything else.
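As a small illustration of why a diagonal detector does not depend on the task, here is a sketch with a hand-written 3x3 filter. The kernel values and the helper `correlate2d` are made up for this example (they are not weights from any real pretrained model), but the behaviour is the same idea as a first-layer filter: it fires on diagonals and ignores other lines, whatever the image is of.

```python
def correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding) and return the response map."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(kh) for b in range(kw)))
        out.append(row)
    return out

# Hand-written diagonal detector (illustrative, not learned weights).
diagonal_kernel = [[ 2, -1, -1],
                   [-1,  2, -1],
                   [-1, -1,  2]]

# Two tiny 5x5 "images": one diagonal line, one horizontal line.
diagonal_line = [[1 if r == c else 0 for c in range(5)] for r in range(5)]
horizontal_line = [[1 if r == 2 else 0 for c in range(5)] for r in range(5)]

# Strongest response of the filter anywhere in each image.
diag_response = max(max(row) for row in correlate2d(diagonal_line, diagonal_kernel))
horiz_response = max(max(row) for row in correlate2d(horizontal_line, diagonal_kernel))
```

Here `diag_response` is 6 and `horiz_response` is 0, and nothing about that changes if the diagonal belongs to a dog's ear or a car's windshield.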

So it’s very unlikely that we can make the layer 1 features better. It’s very unlikely that the definition of a diagonal line will ever change for dog or cat breeds, or for any kind of image for that matter, versus the ImageNet data the model was originally trained on.

Now, as we move deeper into the model, the CNN layers learn more complex features, which are also task-specific. So if you are identifying dogs, these layers learn things like dog faces, and if you are identifying cars, they learn what cars look like. As you can see, these are the layers that need to be updated to fit our dataset.
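One way to act on this is to give the early layers a much smaller learning rate than the later ones, so the task-specific layers change the most. Below is a minimal sketch of that idea. The helper `spread_learning_rates`, the three layer group names and the range 1e-6 to 1e-4 are all assumptions for illustration, not fastai's own code.

```python
def spread_learning_rates(lowest, highest, n_groups):
    """Geometrically spaced learning rates, one per layer group, early to late."""
    if n_groups == 1:
        return [highest]
    ratio = (highest / lowest) ** (1 / (n_groups - 1))
    return [lowest * ratio ** i for i in range(n_groups)]

# Hypothetical split of a pretrained model into groups, from input to output.
layer_groups = ["early conv layers", "later conv layers", "linear head"]
lrs = spread_learning_rates(1e-6, 1e-4, len(layer_groups))

for name, lr in zip(layer_groups, lrs):
    # Early layers get tiny updates, the head gets the largest ones.
    print(f"{name}: lr = {lr:.0e}")
```

In fastai this is similar in spirit to unfreezing the model and passing a range such as slice(1e-6, 1e-4) as the learning rate to fit_one_cycle, instead of a single number.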