I was trying to learn about mixed precision training in more detail. It turned out that to utilize its full potential, the parameters have to be chosen very carefully. For example, Jeremy repeatedly chose the input and output dimensions of convolution layers as multiples of 8 during Lecture 7 of Part 1.
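To make the idea concrete, here is a minimal sketch of a mixed precision training step in PyTorch using `torch.autocast` and `GradScaler`. The network and tensor sizes are hypothetical, chosen only to illustrate the multiples-of-8 rule for channel dimensions; it falls back to bfloat16 on CPU when no GPU is present.

```python
import torch
import torch.nn as nn

# Channel counts are multiples of 8 so the underlying matrix
# dimensions line up with Tensor Core tile sizes (sizes here are
# illustrative, not a recommendation).
model = nn.Sequential(
    nn.Conv2d(8, 64, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
)

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)

# GradScaler applies loss scaling to avoid fp16 gradient underflow;
# it is disabled (a no-op) on CPU.
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

x = torch.randn(4, 8, 16, 16, device=device)
target = torch.randn(4, 128, 16, 16, device=device)

for _ in range(2):
    opt.zero_grad()
    # autocast runs eligible ops in half precision while keeping
    # master weights in float32.
    amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16
    with torch.autocast(device_type=device, dtype=amp_dtype):
        loss = nn.functional.mse_loss(model(x), target)
    scaler.scale(loss).backward()
    scaler.step(opt)
    scaler.update()
```

Note that the model parameters stay in float32 throughout; only the forward-pass computations inside the `autocast` context are downcast.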
Some resources provided by NVIDIA that helped a lot in understanding mixed precision training:
- Mixed Precision Training of Deep Neural Networks
- Taking Advantage of Mixed Precision to Accelerate Training Using PyTorch
- Tensor Core Performance: The Ultimate Guide
- Automatic Mixed Precision in PyTorch
- Automated Mixed-Precision Tools for TensorFlow Training
- MXNet Models Accelerated with NVIDIA Tensor Cores
- Training with Mixed Precision - User Guide
PS:
The slides for the videos can be accessed directly, but NVIDIA Developer Program membership may be required to watch the videos themselves.