The following papers by Leslie N. Smith are covered in this notebook:

- A disciplined approach to neural network hyper-parameters: Part 1 -- learning rate, batch size, momentum, and weight decay. paper
- Super-Convergence: Very Fast Training of Neural Networks Using Large Learning Rates. paper
- Exploring loss function topology with cyclical learning rates. paper
- Cyclical Learning Rates for Training Neural Networks. paper

This notebook covers all of these topics, with theory as well as the fastai implementations of the relevant techniques.

Table of Contents:

- Summary of hyper-parameters
- Hyper-params not discussed
- Things to remember
- Underfitting vs Overfitting
- Deep Dive into Underfitting and Overfitting
- Underfitting
- Overfitting

- Choosing Learning Rate
- Cyclic Learning Rate (CLR) and Learning Rate Test
- ResNet-56
- Cyclic Learning Rate
- Difference from Original paper
- One-cycle policy summary
- Learning rate finder test

- Introducing Super-Convergence
- Testing Linear Interpolation
- How was it found in the first place?
- Coding Linear Interpolation

- Explanation behind Super-Convergence
- Choosing Momentum
- Some good values of momentum to test

- Choosing Weight Decay
- How to set the value

- Train a final classifier model with above param values

Cyclic momentum and weight decay are still left to cover, but since most of the material from the papers is already covered, I decided to share the notebook now.

Notebook link: Reproducing Leslie N. Smith's papers using fastai