Hello, I’m just working through Lesson 1, so my understanding may well be off.
As I understand it, using the learn.lr_find() method to find the learning rate means using fixed “parameters” to train “hyperparameters”. If that’s true, it seems like a chicken-and-egg problem to me.
I think the core tasks for a Deep Learning practitioner solving a “Dogs vs. Cats” problem are:
1) Pick a model. In this case, the resnet34 model is chosen.
2) Pick the other hyperparameters (i.e. the “learning rate” and the number of “epochs”).
2.a) The example ran
    learn = ConvLearner.pretrained(arch, data, precompute=True)
    lrf = learn.lr_find()
which effectively runs the model once in its initial state (i.e. it hasn’t been trained on this data yet; all parameters are at their starting values).
2.b) At some point, the lr_find() method stops, once the loss stops decreasing, and the learning rate found there is the chosen one.
3) Train the parameters. This is achieved by running the model a few more times with the chosen learning rate.
4) A few more steps to improve the model, basically repeating steps 2) and 3) for different layers.
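To make step 2 concrete for myself, here is a minimal sketch of what an “LR range test” like lr_find() does: grow the learning rate exponentially, take one training step per candidate rate, and record the loss. The one-parameter toy loss (w - 3)^2 and the “divide the best LR by 10” heuristic are my own stand-ins for illustration, not the actual fastai internals, which train the real network over mini-batches.

```python
def lr_range_test(lr_start=1e-5, lr_end=10.0, steps=100):
    """Run one SGD step per candidate learning rate on a toy model,
    growing the LR exponentially, and record the loss after each step."""
    mult = (lr_end / lr_start) ** (1.0 / (steps - 1))
    w = 0.0                       # fresh, "un-learned" parameter
    lr = lr_start
    history = []                  # (lr, loss) pairs, like lr_find's plot
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)    # gradient of the toy loss (w - 3)^2
        w -= lr * grad            # one SGD step at this LR
        history.append((lr, (w - 3.0) ** 2))
        lr *= mult
    # Stand-in heuristic: suggest an LR well below the loss minimum,
    # i.e. still on the steeply decreasing part of the curve.
    best_lr = min(history, key=lambda p: p[1])[0]
    return history, best_lr / 10.0

history, suggested = lr_range_test()
```

Note that the candidate learning rates with small values barely change the loss, while too-large values make it blow up; the useful suggestion sits in between, which is why the loss-vs-LR plot matters.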
Is my understanding correct? If so, is this a common strategy:
a) “fix the parameters” and train the “hyperparameters”,
b) “fix the hyperparameters” and then train the “parameters”?
That is, it’s never a waterfall process; it’s kind of an iterative, interleaving process?