(Lesson 2) Question about 'easy steps to train a world-class image classifier'

In the course notebook lesson1.ipynb (covered in the lecture 2 video), there is a cell that looks like the following:

Review: easy steps to train a world-class image classifier

  1. precompute=True
  2. Use lr_find() to find highest learning rate where loss is still clearly improving
  3. Train last layer from precomputed activations for 1-2 epochs
  4. Train last layer with data augmentation (i.e. precompute=False) for 2-3 epochs with cycle_len=1
  5. Unfreeze all layers
  6. Set earlier layers to 3x-10x lower learning rate than next higher layer
  7. Use lr_find() again
  8. Train full network with cycle_mult=2 until over-fitting

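For context, here is a minimal sketch of how I understand steps 1-4 mapping onto fastai (v0.7) code, following lesson1.ipynb. `PATH`, `sz`, and `lr = 0.01` are just illustrative values I picked; in practice `lr` would be read off the `lr_find()` plot:

```python
# Sketch of steps 1-4, following lesson1.ipynb (fastai v0.7).
from fastai.imports import *
from fastai.transforms import *
from fastai.conv_learner import *
from fastai.dataset import *

PATH = "data/dogscats/"   # illustrative dataset path
sz = 224
arch = resnet34

tfms = tfms_from_model(arch, sz, aug_tfms=transforms_side_on, max_zoom=1.1)
data = ImageClassifierData.from_paths(PATH, tfms=tfms)

# Step 1: start with precomputed activations
learn = ConvLearner.pretrained(arch, data, precompute=True)

# Step 2: find the highest learning rate where loss is still clearly improving
learn.lr_find()
learn.sched.plot()
lr = 0.01  # illustrative value read off the plot

# Step 3: train the last layer from precomputed activations for 1-2 epochs
learn.fit(lr, 1)

# Step 4: enable data augmentation (precompute=False), train with cycle_len=1
learn.precompute = False
learn.fit(lr, 3, cycle_len=1)
```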
I find it a bit weird that you would set differential learning rates (step 6) before running lr_find() again (step 7). Wouldn't you want to find the learning rate first, and only then set the differential learning rates, so that the earlier layers get rates 3x-10x lower than the newly found rate?
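Here is my reading of steps 5-8, again with illustrative values, where the ordering I'm asking about is visible: `lrs` is defined before `lr_find()` runs again (if I remember right, the course notebooks pass something like `lrs/1000` as the starting rates):

```python
# Sketch of steps 5-8 (illustrative values); note the differential
# learning rates are defined *before* lr_find() is run again.
learn.unfreeze()  # step 5: unfreeze all layers

# Step 6: earlier layer groups get 3x-10x lower rates than the next group
lrs = np.array([lr/9, lr/3, lr])

# Step 7: lr_find() again, with the already-chosen array as starting rates
learn.lr_find(lrs / 1000)
learn.sched.plot()

# Step 8: train the full network with cycle_mult=2 until over-fitting
learn.fit(lrs, 3, cycle_len=1, cycle_mult=2)
```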