Part 2 should cover model debugging

I did all the fastai courses and loved them. I am still a newbie in deep learning, but I have a lot of experience as a software engineer, and one thing that I think could be really beneficial to any deep learning practitioner is model debugging. I think Jeremy should cover some of it in a future part of the course. There are two parts of model debugging that I think could be useful.

First, being able to detect common model learning problems and potentially fix them. Why is my model not improving? Why is the loss going to infinity? Should I use gradient clipping or not? Should I decrease momentum or tune something else? What intuition or statistics could students use to fix their models themselves? I started using the excellent fastai tensorboard integration by @jsa169 (thanks for your contribution!) to try to see what was going on under the hood of my model.
There’s a lot of information there and it would actually be very useful to be able to leverage that to identify common problems and hopefully eventually fix them. But to do that you need to be able to interpret this information and this would be really interesting to learn.
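To make this concrete, here is a rough sketch of the kind of check I have in mind. It is plain Python, not fastai or tensorboard code: `global_grad_norm`, `diagnose_step`, and the two thresholds are names and values I made up for illustration, and in a real model the gradients would come from the model's parameters rather than from hand-written lists.

```python
import math


def global_grad_norm(grads):
    """L2 norm over all gradient values, given one flat list per layer."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))


def diagnose_step(loss, grads, explode_at=1e3, vanish_at=1e-7):
    """Return a short label describing one training step.

    The thresholds are arbitrary placeholders, not recommended values.
    """
    # A NaN or infinite loss means training has already diverged.
    if math.isnan(loss) or math.isinf(loss):
        return "loss diverged"
    norm = global_grad_norm(grads)
    if math.isnan(norm) or math.isinf(norm):
        return "gradients diverged"
    # A very large norm is the situation gradient clipping targets.
    if norm > explode_at:
        return "gradients exploding"
    # A near-zero norm suggests the model has stopped receiving a signal.
    if norm < vanish_at:
        return "gradients vanishing"
    return "ok"


# Hypothetical per-layer gradients for a single step.
print(diagnose_step(0.5, [[0.1, -0.2], [0.05]]))
print(diagnose_step(1.0, [[5000.0]]))
```

Logging a label like this every few steps would not fix anything by itself, but it might tell a student which question to ask next (learning rate too high, clipping needed, dead layers).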

The second part is being able to leverage the debugger as a learning tool, to understand what is going on in your model by executing it line by line and inspecting variables. pdb in a Jupyter notebook is just painful compared to the experience of debugging inside an editor like VS Code. Jupyter notebooks are the best in general, but when your model is not working, no amount of staring at the screen will fix your problem. Being able to stop execution at breakpoints and inspect variables tremendously accelerated my learning when I first started coding. I feel like deep learning teaching would benefit a lot from exposing students to a real debugger, like the debugging experience in VS Code for example. It sure helped me a lot in my deep learning journey.
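As a small illustration of what I mean by stopping inside the model, here is a minimal sketch using Python's built-in `breakpoint()` (available since Python 3.7). The `forward` function and its `debug` flag are a toy I made up for this post, not anything from fastai. In an editor you would normally click in the gutter to set the breakpoint instead of editing the code.

```python
def forward(x, weights, bias, debug=False):
    """Toy one-layer forward pass on plain lists (hypothetical example)."""
    # Linear step: one dot product plus bias per output unit.
    pre_activation = [
        sum(w * xi for w, xi in zip(row, x)) + b
        for row, b in zip(weights, bias)
    ]
    if debug:
        # Pauses here and opens the debugger (pdb by default), so you can
        # inspect x, weights, and pre_activation before the ReLU is applied.
        breakpoint()
    # ReLU activation.
    return [max(0.0, v) for v in pre_activation]


# Runs straight through; pass debug=True to stop before the activation.
print(forward([1.0, 2.0], [[1.0, 1.0], [-1.0, -1.0]], [0.0, 0.0]))
```

Once execution is paused, printing `pre_activation` or stepping to the next line shows exactly which values get zeroed out, which is the kind of thing that is hard to see by reading the code alone.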


You should probably post this under the Part 2 topic if you have access to it.

There’s a Part 2 topic for 2018, but not one for 2019 yet?

Then you do not have access to it:

Part 2 is already happening right now