The following papers by Leslie N. Smith are covered in this notebook:
- A disciplined approach to neural network hyper-parameters: Part 1 – learning rate, batch size, momentum, and weight decay. paper
- Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates. paper
- Exploring loss function topology with cyclical learning rates. paper
- Cyclical Learning Rates for Training Neural Networks. paper
This notebook covers all of these topics, presenting the theory alongside fastai implementations where relevant.
Table of Contents:
- Summary of hyper-parameters
- Hyper-params not discussed
- Things to remember
- Underfitting vs Overfitting
- Deep Dive into Underfitting and Overfitting
- Underfitting
- Overfitting
- Choosing Learning Rate
- Cyclic Learning Rate (CLR) and Learning Rate Test
- ResNet-56
- Cyclic Learning Rate
- Difference from Original paper
- One-cycle policy summary
- Learning rate finder test
- Introducing Super-Convergence
- Linear Interpolation Tests
- How was it found in the first place?
- Coding Linear Interpolation
- Explanation behind Super-Convergence
- Choosing Momentum
- Some good values of momentum to test
- Choosing Weight Decay
- How to set the value
- Train a final classifier model with the above parameter values
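As a taste of the cyclical learning rate idea covered above, here is a minimal, self-contained sketch of the triangular CLR schedule from Smith's "Cyclical Learning Rates for Training Neural Networks" paper. It is a plain-Python illustration, not the fastai implementation used in the notebook; the function and parameter names are my own.

```python
import math

def triangular_clr(iteration, step_size, base_lr, max_lr):
    """Triangular cyclical learning rate (Smith, CLR paper).

    One full cycle lasts 2 * step_size iterations: the learning rate
    rises linearly from base_lr to max_lr over step_size iterations,
    then falls linearly back to base_lr.
    """
    # Which cycle we are in (1-indexed).
    cycle = math.floor(1 + iteration / (2 * step_size))
    # Distance from the peak of the current cycle, scaled to [0, 1].
    x = abs(iteration / step_size - 2 * cycle + 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)

# Example: with step_size=100, the LR peaks at iteration 100
# and returns to base_lr at iteration 200.
print(triangular_clr(0, 100, 1e-3, 1e-1))    # start of cycle: base_lr
print(triangular_clr(100, 100, 1e-3, 1e-1))  # mid-cycle: max_lr
print(triangular_clr(200, 100, 1e-3, 1e-1))  # end of cycle: base_lr
```

In fastai the same idea is exposed through the learning-rate finder and `fit_one_cycle`, which the notebook walks through in detail.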
Cyclic momentum and cyclic weight decay remain to be covered, but since most of the material from the papers is already here, I decided to share the notebook now.
Notebook link: Reproducing Leslie N. Smith's papers using fastai