Do yourself a favor and learn PDB

Just spent a day tracking down the root cause of “RuntimeError: running_mean should contain 3 elements not 1024” when I used learn.predict_dl.

This problem has been mentioned on other threads, but the only solutions I’ve found were to try a different method. Instead of doing that, I was determined to figure out what had 1024 elements and why it should have been 3.

The cause turned out to be that I forgot to unfreeze my learner. An image whose first axis has length 3 was being passed to the first unfrozen module (a BatchNorm1d), which expected an input whose first axis has length 1024.

Anyhow, I buckled down and learned pdb.

Jeremy has mentioned this a couple of times in the middle of a few posts, but it’s really worth its own thread in order to call out how important it is to understand the structure of the code. It might not be as easy as a full IDE debugger, but this is really all we have outside of print statements.

Specifically, when you get a RuntimeError, type the following in a new cell:

import pdb
pdb.pm()

This will start an interactive session at the point where the exception occurred, where you can enter the following commands:

'l' - will list the code surrounding your current position in the stack
'u' - will move you up the stack
'd' - will move you down the stack
'p [some variable]' - will print the value of any variable in that scope
'q' - will quit the debugger (note: 'c' continues execution rather than quitting)
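If you want to see what those commands do without an interactive prompt, you can script a post-mortem session: pdb.Pdb accepts stdin/stdout streams (it inherits them from cmd.Cmd), and interaction() on a traceback is roughly what pdb.pm() does under the hood. A minimal sketch with a deliberately failing function:

```python
import io
import pdb
import sys

def inner(x):
    return 1 / x  # raises ZeroDivisionError when x == 0

def outer():
    x = 0
    return inner(x)

# Trigger the exception and grab its traceback.
try:
    outer()
except ZeroDivisionError:
    tb = sys.exc_info()[2]

# Script the session: 'u' moves up to outer's frame, 'p x' prints
# its local x, 'q' quits the debugger.
commands = io.StringIO("u\np x\nq\n")
output = io.StringIO()
debugger = pdb.Pdb(stdin=commands, stdout=output)
debugger.reset()
debugger.interaction(None, tb)

print(output.getvalue())
```

In a notebook you would just run pdb.pm() and type the commands at the prompt; the scripted version is only useful for seeing (or testing) what each command prints.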

In my case I hit an exception embedded deep in some unknown module’s forward method. I moved up the stack until I reached the main Sequential module and then printed the current module the sequential was iterating through when it crashed, as well as the summary of the Sequential and the self._modules it was iterating through.

Turns out the module dictionary it was iterating through was only a small subset of the full model summary, which clued me in that my data was being sent straight to the final fully connected layer instead of to the initial CNN. Why would it skip those layers? Well, because they were frozen.

I also cloned the fast.ai repo locally on my desktop and followed along in the code with PyCharm. A good IDE like PyCharm is another great tool.

Also, pdb has a lot more features than what I listed above, like setting breakpoints, stepping through code, etc., but just being able to examine variables up and down the stack is a really easy way to get started.

Anyway, hope this convinces people to take another look at pdb. Although it’s not as nice as a full IDE debugger, it’s really not that hard to use, and it has the potential to give you a much deeper understanding of the codebase.


Couldn’t agree more. I love pdb and ipdb. You can invoke it from the command line as well via python -m ipdb script.py

Also, if you run your code in IPython you can use %debug to get into ipdb at the point where the exception happened :)


this!

Actually, I started to use ipdb instead of pdb because of the syntax highlighting, and stumbled on %debug as a replacement for pdb.pm().

I’m using JupyterLab and have a scratch pad running off the same kernel as the notebook I’m working in, with this and various debugging commands. Pretty convenient.

The one thing that bothers me is how the ipdb> prompts accumulate at the bottom of the output. I take it that’s because the ipdb output is considered widget output, while the actual stack traces are print statements, and Jupyter doesn’t know how to interleave them properly.

Anyone found a solution to this? I ran into this before when I wanted to print labels to plots and all the plots were loaded after all the print statements.

Don’t know of a solution, but on a related note regarding the awkwardness of using pdb and friends in a Jupyter notebook (e.g. the prompt disappearing from view and having to scroll around to find it), I found that using ‘jupyter console’ from the command line with the --existing argument provides an incomplete-but-possibly-better-in-some-respects experience.

very very interesting…

Great insight. Thanks