Just spent a day tracking down the root cause of “RuntimeError: running_mean should contain 3 elements not 1024” when I used learn.predict_dl.
This problem has been mentioned in other threads, but the only solution I found was to try a different method. Instead of doing that, I was determined to figure out what had 1024 elements and why it should have 3.
The root cause turned out to be that I forgot to unfreeze my learner. It was passing an image whose first axis is of length 3 to the first unfrozen module (a BatchNorm1d), which expects the input’s first axis to be of length 1024.
Anyhow, I buckled down and learned pdb.
Jeremy has mentioned this a couple of times in the middle of a few posts, but it is really worth its own thread to call out how important it is for understanding the structure of the code. It might not be as easy as a full IDE debugger, but it is really all we have outside of print statements.
Specifically, when you get a runtime error, type the following in a new cell:
import pdb
pdb.pm()
This will start an interactive session at the point where the exception occurred, where you can enter the following commands:
‘l’ - list the code surrounding your current position in the stack
‘u’ - move up the stack (toward the caller)
‘d’ - move down the stack (back toward where the exception was raised)
‘p [some variable]’ - print the value of any variable at that scope
‘q’ - quit the debugger (‘c’, short for continue, also ends a post-mortem session)
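To make those commands concrete, here is a minimal sketch of a post-mortem session that is scripted rather than typed, so the whole transcript can be seen in one go. The functions `inner`, `outer`, and `scripted_post_mortem` are made-up names for illustration, not fast.ai code. The helper repeats the steps `pdb.post_mortem` performs internally (create a `Pdb`, `reset()`, then `interaction(None, tb)`), which is an assumption about CPython's implementation, not a documented recipe.

```python
import io
import pdb
import sys


def inner(values):
    # Hypothetical failure deep in the call stack.
    return values["missing"]


def outer():
    stage = "head"  # a local that only exists one frame up
    batch = {"x": 1}
    return inner(batch)


def scripted_post_mortem(tb, commands):
    """Feed pdb commands from a list instead of the keyboard; return the output."""
    out = io.StringIO()
    script = io.StringIO("\n".join(commands) + "\n")
    debugger = pdb.Pdb(stdin=script, stdout=out, readrc=False)
    debugger.reset()
    debugger.interaction(None, tb)  # same call pdb.post_mortem makes
    return out.getvalue()


try:
    outer()
except KeyError:
    tb = sys.exc_info()[2]
    # Start in inner(), print its argument, move up to outer(), print its local.
    transcript = scripted_post_mortem(tb, ["p values", "u", "p stage", "q"])
    print(transcript)
```

In a notebook you would skip the helper entirely and just run `pdb.pm()` in a new cell after the exception, then type the same commands by hand.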
In my case I hit an exception embedded deep in some unknown module’s forward method. I moved up the stack until I reached the main Sequential module and then printed the current module the sequential was iterating through when it crashed, as well as the summary of the Sequential and the self._modules it was iterating through.
It turned out the module dictionary it was iterating through was only a small subset of the full model summary, which clued me in that my data was being sent straight to the final fully connected layer instead of to the initial CNN. Why would it skip those layers? Because they were frozen.
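The same "move up until you reach the Sequential, then look at what it was iterating over" search can also be done without an interactive session. The sketch below uses toy stand-in classes (`ToyLayer` and `ToySequential` are hypothetical and only mimic the `_modules` dictionary; they are not PyTorch or fast.ai classes) and walks the traceback with `traceback.walk_tb` to find every frame whose `self` owns a `_modules` dict.

```python
import sys
import traceback


class ToyLayer:
    """Stand-in for a layer that insists on a fixed input length."""

    def __init__(self, expected):
        self.expected = expected

    def forward(self, x):
        if len(x) != self.expected:
            raise RuntimeError(
                f"should contain {len(x)} elements not {self.expected}"
            )
        return x


class ToySequential:
    """Stand-in container that runs its modules in order."""

    def __init__(self, modules):
        self._modules = dict(modules)

    def forward(self, x):
        for name, module in self._modules.items():
            x = module.forward(x)
        return x


def frames_with_modules(tb):
    """Return (function, line, module names, current module) for container frames."""
    found = []
    for frame, lineno in traceback.walk_tb(tb):
        owner = frame.f_locals.get("self")
        if hasattr(owner, "_modules"):
            found.append(
                (
                    frame.f_code.co_name,
                    lineno,
                    list(owner._modules),
                    frame.f_locals.get("name"),
                )
            )
    return found


model = ToySequential({"bn": ToyLayer(1024), "fc": ToyLayer(1024)})
try:
    model.forward([0.0, 0.0, 0.0])  # 3 elements fed to a layer expecting 1024
except RuntimeError:
    report = frames_with_modules(sys.exc_info()[2])
    print(report)
```

Comparing the module names in the report against the full model summary is the check that exposed the skipped layers in my case.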
I also cloned the fast.ai repo locally on my desktop and was following along in the code with PyCharm. Using a good IDE like PyCharm is another great tool.
Also, pdb has a lot more features than what I listed above, like setting breakpoints and stepping through code, but just being able to examine variables up and down the stack is a really easy way to get started.
Anyway, I hope this convinces people to take another look at pdb. Although it’s not as nice as a full IDE debugger, it’s really not that hard to use, and it has the potential to give you a much deeper understanding of the codebase.