When we first unfreeze the model and train for one cycle with **learning rate = 0.003**, we see that the result gets worse. We then load the saved model and run the **Learning Rate Finder**, from which we can estimate that **1e-6** is a good learning rate. Note: the Learning Rate Finder was run on the **frozen model**, so I feel that the optimal learning rate we get from the **Finder** should be the **optimal learning rate of the LAST layers** (the only ones that were trainable at the time).

However, in the following cells, we see that the unfrozen model is trained with a slice of LRs varying from **1e-6 to 1e-4**, where **1e-6** is the learning rate for the **first layers** and **1e-4** is the learning rate for the **last layers**. In order to train the unfrozen model, shouldn't we have run the Finder on the unfrozen model itself, and then used the new optimal learning rate?
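To make the question concrete, here is a small sketch of how a `slice(lr_min, lr_max)` is typically spread across layer groups for discriminative learning rates. This is an illustration of the idea (log-spaced rates from early to late groups), not fastai's exact internal code; the function name and the choice of 3 layer groups are assumptions for the example.

```python
import numpy as np

def discriminative_lrs(lr_min, lr_max, n_groups):
    """Illustrative per-group learning rates: log-spaced from the
    earliest layer group (lr_min) to the last layer group (lr_max)."""
    return np.geomspace(lr_min, lr_max, n_groups)

# With slice(1e-6, 1e-4) and 3 layer groups, the early layers get
# the smallest rate and the head gets the largest.
lrs = discriminative_lrs(1e-6, 1e-4, 3)
print(lrs)  # → [1.e-06 1.e-05 1.e-04]
```

Under this scheme, the value suggested by the Finder on the frozen model would correspond to the *last* group's rate, which is part of why the mismatch in the notebook is confusing.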