Tricks in deep learning

Jeremy mentions in one of the videos that it is a good idea to train on low-resolution images and then use those weights to train on a higher resolution in order to achieve better results. I tried this on a couple of applications and it works really well. I think after a point, it’s usually these tricks that take our models from good to great (and win us Kaggle competitions).

I would like to create this topic for people to share these kinds of little tricks they’ve come across (maybe at work) or learned from a book. It will be really helpful for people like me who are just beginning their journey in deep learning.


I think this is the point of this thread here:


Hey @ilovescience
I did not mean all the tricks that Jeremy mentions, but tricks learned elsewhere, maybe through a Kaggle competition or from a colleague, which together would make a good corpus of tricks. The thread you mentioned is not entirely focused on tricks or tweaks.


Ah ok, if such a list of tricks is compiled, it would definitely be helpful! Sorry about the confusion…

This seems like a great place to start – Sanyam published a nice little blog post on the “Bag of Tricks” paper for image classification 🙂


“Train images on a low resolution and then use those weights to train them on a higher resolution in order to achieve better results.”

Is there any notebook that describes how to do this in fastai?

@Emmarof The simplest way to do this is to gradually increase the size you pass in when you create your ImageDataBunch object. Each time, create your learner, call learn.load(mysavedmodel) to pick up the previous weights, and be sure to save your model after each step. Then continue training. Does this make sense?
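A minimal sketch of that loop. The fastai-specific calls (ImageDataBunch, cnn_learner, learn.load/learn.save) are left as comments since the exact API depends on your fastai version — only the checkpoint bookkeeping actually runs here:

```python
# Progressive resizing: train small, reload weights, train larger.
sizes = [64, 128, 256]   # smallest resolution first
previous = None          # name of the last saved checkpoint

for size in sizes:
    # data = ImageDataBunch.from_folder(path, size=size)
    # learn = cnn_learner(data, models.resnet34, metrics=accuracy)
    # if previous is not None:
    #     learn.load(previous)        # warm-start from the smaller size
    # learn.fit_one_cycle(4)
    previous = f"stage-{size}"
    # learn.save(previous)

print(previous)  # stage-256
```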


Hi Dipam

In order to do this effectively, shouldn’t we do the following:

  1. train with the layers frozen on the small-image data

  2. unfreeze the layers and find the appropriate learning rates

  3. then train the full architecture

  4. save this model, then switch to the large-image data (data = data_large)

  5. unfreeze the layers again

  6. find the appropriate learning rate

  7. train the model on the full architecture
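The steps above can be sketched in fastai v1 style. The calls (cnn_learner, unfreeze, lr_find, fit_one_cycle) are shown as comments because the exact API may differ across versions — this is illustrative, not the poster’s actual code; the `stages` list just records the order:

```python
# Each stage: (optionally) freeze, pick a learning rate, train, save.
stages = []

# learn = cnn_learner(data_small, models.resnet34)  # frozen by default
# learn.fit_one_cycle(4)                            # 1. train frozen, small images
stages.append("frozen-small")

# learn.unfreeze()                                  # 2. unfreeze
# learn.lr_find()                                   #    find learning rates
# learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))  # 3. train full architecture
# learn.save('small')                               # 4. save
stages.append("unfrozen-small")

# learn.data = data_large                           #    switch to larger images
# learn.unfreeze()                                  # 5. unfreeze again
# learn.lr_find()                                   # 6. find learning rate
# learn.fit_one_cycle(4, max_lr=slice(1e-5, 1e-3))  # 7. train full architecture
stages.append("unfrozen-large")

print(stages)
```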

Best Regards

Yes, you’re right. I did not want to train for too many epochs, and I was just experimenting to see whether progressive image resizing works better than using a static size, hence I did what I did. You can try different approaches and see which one gives better results. Cheers


Also, I would like to mention that in the kernel I created two data bunches, which is not required. Do the data loading, splitting and labeling only once. For transforms and size, write a helper function that varies the image size and returns data bunches of different sizes. That way you can train for a few epochs at 64, then 128, then 256 and so on. You can also plot the results after each size. Another thing to remember: halve the batch size when you double the image size, to make sure you don’t run out of memory.
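A sketch of that size/batch-size schedule, as a plain helper (the name `resize_schedule` is made up here — in practice you would feed each pair into whatever function builds your data bunch):

```python
# Yield (image_size, batch_size) pairs for progressive resizing,
# halving the batch size each time the image size doubles so that
# GPU memory use stays roughly constant.
def resize_schedule(start_size=64, start_bs=64, steps=3):
    size, bs = start_size, start_bs
    for _ in range(steps):
        yield size, max(bs, 1)   # never let the batch size reach zero
        size, bs = size * 2, bs // 2

print(list(resize_schedule()))  # [(64, 64), (128, 32), (256, 16)]
```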


A simple trick that works with both traditional models and deep learning models: train multiple models and average their predictions. If you don’t have the resources to train multiple models, you can save a checkpoint after every epoch and average the predictions from those checkpoints.
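The averaging step itself is just an element-wise mean over the per-model probability arrays. A toy sketch with made-up numbers (each array is shaped `(n_samples, n_classes)`, one per model or per saved checkpoint):

```python
import numpy as np

# Predicted class probabilities from three models (or three
# end-of-epoch checkpoints of the same model), 2 samples x 2 classes.
preds = [
    np.array([[0.9, 0.1], [0.4, 0.6]]),   # model / checkpoint 1
    np.array([[0.7, 0.3], [0.2, 0.8]]),   # model / checkpoint 2
    np.array([[0.8, 0.2], [0.3, 0.7]]),   # model / checkpoint 3
]
avg = np.mean(preds, axis=0)   # element-wise average of the probabilities
labels = avg.argmax(axis=1)    # final predicted class per sample
print(labels)                  # [0 1]
```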
