PyTorch training loop vs fastai Learner

So I’ve been trying a vanilla PyTorch training loop, but with one-cycle learning and the LR finder.
Even so, the fastai Learner does much better on the same dataset while using the same two methods. What other major things are happening in the fastai Learner that make it work so well?

I think you will need to go through the fastai source code, compare it with your own training-loop code, and see what differences you spot. For details and citations, a lot of the underlying innovations and tricks are described in this paper.



What kind of problem are you working on?

There are various defaults in fastai that lead to improved performance. Here are some things to check (somewhat specific to pretrained image models):

  • The optimizer used (fastai uses AdamW)
  • The hyperparameters used (fastai has different defaults for betas, weight decay, etc.)
  • Custom model modifications by fastai (e.g., fastai adds a custom head when fine-tuning pretrained image models)
  • One-cycle actually starts decreasing the LR 25% of the way into training (the pct_start default). Other defaults to be aware of include the starting and ending LRs for one-cycle.
  • One-cycle schedules the momentum as well

There are probably a lot of things I am missing, but these are the main ones I can think of (and see in the code).
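If it helps, the defaults above can be approximated in plain PyTorch. This is a sketch based on my reading of the fastai source (the betas, eps, pct_start, div factors, and momentum range are my understanding of its defaults; double-check against your fastai version):

```python
import torch
from torch import nn

model = nn.Linear(10, 1)

# fastai-style optimizer: AdamW with betas=(0.9, 0.99), eps=1e-5, wd=0.01
# (my reading of fastai's Adam defaults, not the PyTorch ones).
opt = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.99), eps=1e-5, weight_decay=0.01
)

steps = 100
sched = torch.optim.lr_scheduler.OneCycleLR(
    opt,
    max_lr=1e-3,
    total_steps=steps,
    pct_start=0.25,        # fastai's fit_one_cycle default, vs PyTorch's 0.3
    div_factor=25.0,       # start at max_lr / 25, as in fastai
    final_div_factor=1e5,  # fastai anneals much further at the end
    cycle_momentum=True,   # momentum scheduled opposite to the LR
    base_momentum=0.85,    # fastai moms=(0.95, 0.85, 0.95)
    max_momentum=0.95,
)
```

With an Adam-family optimizer, PyTorch's OneCycleLR cycles betas[0] in place of SGD momentum, which matches the momentum scheduling bullet above.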


This is essentially what I’m trying to do: pass an image through an encoder, then take the encoding and pass it through a decoder for reconstruction.
I don’t use anything pretrained.
I need the image encodings so I can use them to find similar images. The notebook here does it in one sweep; I’m trying to do progressive resizing.
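For context, the similarity lookup I have in mind is roughly this (a toy sketch; the encoder and data are placeholders for my real model and dataset):

```python
import torch
from torch import nn
import torch.nn.functional as F

# Placeholder encoder standing in for the trained one; the point is only
# the similarity search over the encodings.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))

images = torch.randn(100, 3, 32, 32)  # fake dataset
with torch.no_grad():
    emb = F.normalize(encoder(images), dim=1)  # unit-norm encodings

query = emb[0]
sims = emb @ query             # cosine similarity of every image to the query
top5 = sims.topk(5).indices    # indices of the 5 most similar images
```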

Yeah, I guess so. I just wanted to see if there are any other big ideas in there, besides the LR finder and one-cycle, that are making big differences.