Training a model from scratch: CIFAR 10

Yes, you’re correct @radek, we’re also using SGDR. To rephrase my question: how many epochs does it take to reach the same precision using SGDR, but starting with 32x32?

I agree with you that it’s easy to test this with the current notebook, but there are so many things to do and so little time :slight_smile:
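For anyone following along, SGDR (SGD with warm restarts) anneals the learning rate along a cosine curve and resets it at the start of each cycle. Here is a minimal sketch of that schedule; the function name, its parameters and the example values are illustrative assumptions, not the notebook's code or the fastai API:

```python
import math

def sgdr_lr(iteration, cycle_len, lr_max, lr_min=0.0, cycle_mult=1):
    """Learning rate at `iteration` under cosine annealing with warm restarts.

    cycle_len  -- length of the first cycle, in iterations
    cycle_mult -- integer factor by which each later cycle grows (>= 1)
    """
    cycle_start, length = 0, cycle_len
    # Walk forward to the cycle that contains this iteration.
    while iteration >= cycle_start + length:
        cycle_start += length
        length *= cycle_mult
    progress = (iteration - cycle_start) / length  # in [0, 1)
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

# Hypothetical settings: first cycle of 4 iterations, each later cycle twice as long.
schedule = [sgdr_lr(i, cycle_len=4, lr_max=0.1, cycle_mult=2) for i in range(12)]
```

The rate starts at lr_max, decays towards lr_min within a cycle, and jumps back to lr_max when the next cycle begins.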

Absolutely, I can relate to that :slight_smile:

I am not even sure the same results can be achieved without the resizing, but it would be really cool to see the comparison!

I modified the code in models/cifar10/ so that we are able to pass in the number of classes we want.

Previous: def __init__(self, block, num_blocks, num_classes=10)
Now: def __init__(self, block, num_blocks, num_classes)

Previous: def SENet18(): return SENet(PreActBlock, [2,2,2,2])
Now: def SENet18(num_classes=10): return SENet(PreActBlock, [2,2,2,2], num_classes)

It works fine for me after that for the iceberg challenge.


Absolutely yes! I was also surprised at how much overfitting this training seemed to handle.

Did you use the pre-trained version? I tried this same change but it only works when using the model from scratch.

It didn’t work for me either when I used the pretrained version. It seems that the weights were saved for a specific number of classes (10 vs 2).
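That mismatch is what you would expect if the checkpoint's final layer has 10 outputs while the new model has 2. One common workaround is to copy only the weights whose name and shape match, and leave the new classifier layer at its fresh initialisation. A framework-agnostic sketch, using plain dicts of NumPy arrays as stand-ins for state dicts; the layer names and shapes are hypothetical, not taken from the SENet code:

```python
import numpy as np

def copy_matching_weights(pretrained, target):
    """Copy arrays from `pretrained` into `target` where name and shape agree.

    Returns the list of target keys that were left untouched.
    """
    skipped = []
    for name, value in list(target.items()):
        source = pretrained.get(name)
        if source is not None and source.shape == value.shape:
            target[name] = source.copy()
        else:
            skipped.append(name)
    return skipped

# Hypothetical weights: a 10-class checkpoint and a freshly initialised 2-class model.
pretrained = {
    "conv1.weight": np.ones((64, 3, 3, 3)),
    "linear.weight": np.ones((10, 512)),
    "linear.bias": np.ones(10),
}
target = {
    "conv1.weight": np.zeros((64, 3, 3, 3)),
    "linear.weight": np.zeros((2, 512)),
    "linear.bias": np.zeros(2),
}

skipped = copy_matching_weights(pretrained, target)
```

In PyTorch the analogous step would be filtering the checkpoint's state dict before calling load_state_dict; the skipped layer then has to be trained on the new task.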

Yeah, interestingly, when I used the pre-trained version without changing num_classes, I do get 10 predictions per image, but the first two of those predictions (corresponding to classes 0 and 1) seem to be accurate. Even though it’s a bit unorthodox, I wonder if we could just grab those first 2 predictions and use them on binary problems like iceberg.
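A sketch of that idea, assuming the model returns 10 scores per image (logits or log-probabilities) and that columns 0 and 1 are the two classes of interest. The scores below are made up for illustration; whether the first two CIFAR-10 outputs really transfer to iceberg is not verified here:

```python
import numpy as np

def first_two_class_probs(scores):
    """Keep the first two class scores and renormalise them with a softmax."""
    two = scores[:, :2]
    two = two - two.max(axis=1, keepdims=True)  # stabilise the exponentials
    exp = np.exp(two)
    return exp / exp.sum(axis=1, keepdims=True)

# Made-up scores for two images, 10 classes each.
scores = np.array([
    [2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
])
probs = first_two_class_probs(scores)  # shape (2, 2), each row sums to 1
```

Renormalising matters for a log loss metric, since the two kept columns will not sum to 1 on their own.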


I’m going to try to improve this workflow. Although @jamesrequa’s workaround is a nice hack! :slight_smile:


Got a 0.241 log loss using ResNet-152.

What I find is that the overall loss is decreasing (0.5 -> 0.28) on increasing the parameters of


I noticed that last night
I’m going to try that tonight.

But this hack sounds promising because I already have it implemented :slight_smile:


Did you solve the CUDA runtime error? I have the same error.

How do you load the intermediate model?

First you would need to have saved the model, e.g. with learn.save('model').

Then you can load that same model back in with the saved weights and pick up training where you left off.

What is stats = (np.array([ 0.4914 , 0.48216, 0.44653]), np.array([ 0.24703, 0.24349, 0.26159])) and what does tfms_from_stats do? I googled a little, but couldn’t figure it out.

I see, thanks!

I thought you could interrupt the model while training, kill the instance, go to sleep :slight_smile: and then reload some temporary model from a temporary file.

When you use a pretrained net for classification, you should normalize your data with the same mean values that were computed on its training set (including at prediction time). Often, the means of the images in imagenet are published and you can just use them to normalize your data, which is exactly what is happening here.


(In this case - the averages are of the CIFAR10 data, which I just calculated using numpy, before I trained the CIFAR10 model.)
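A minimal sketch of how per-channel statistics like these can be computed with numpy and then applied. The random array is only a stand-in for the CIFAR10 training images (assumed shape N x 32 x 32 x 3, scaled to [0, 1]), so it will not reproduce the numbers quoted above, and normalize is a hypothetical helper, not what tfms_from_stats does internally:

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((100, 32, 32, 3))  # placeholder for the training images

# Per-channel mean and standard deviation over all images and pixels.
mean = images.mean(axis=(0, 1, 2))
std = images.std(axis=(0, 1, 2))
stats = (mean, std)

def normalize(batch, stats):
    """Subtract the channel means and divide by the channel standard deviations."""
    mean, std = stats
    return (batch - mean) / std

normalized = normalize(images, stats)
```

The same stats tuple should be reused for validation and test images, rather than recomputed on them.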


Ah. My bad. CIFAR 10 !

Thanks, so is it similar to the preprocess_input step in keras?

You’re right. preprocess_input in Keras also does scaling in its series of steps.