Suggested workflow for diagnosing non-error-producing performance issues in Jupyter? (Specific issue in Chapter 4 of the book)

EDIT: I’ll leave this up in case anyone has best practices to recommend, but after some experimentation I determined that my specific problem was with my Jupyter Notebook installation (most likely one or more extensions I had enabled) and not with my Anaconda environment or anything else. Clean Anaconda environments showed the same slowdown, but the same command sequence ran fine in IPython, and everything works properly when I open the notebook in Jupyter Lab. So one possible fix for Jupyter performance issues is to disable your extensions, or simply switch to Jupyter Lab.
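
In case it helps anyone narrow down which extension is responsible, here is a minimal sketch that lists the classic Notebook extensions currently marked as enabled. It assumes the front-end config lives at ~/.jupyter/nbconfig/notebook.json and stores extensions under a "load_extensions" key; that is the usual layout, but your install may differ (running `jupyter --paths` shows the config directories in use). The function name and default path are my own, not part of any Jupyter API.

```python
import json
from pathlib import Path

# Assumed default location of the classic Notebook front-end config.
# Your install may keep it elsewhere; check the output of `jupyter --paths`.
DEFAULT_NBCONFIG = Path.home() / ".jupyter" / "nbconfig" / "notebook.json"


def enabled_nbextensions(config_path=DEFAULT_NBCONFIG):
    """Return the sorted names of nbextensions marked enabled in config_path."""
    path = Path(config_path)
    if not path.exists():
        # No config file means no per-user extensions were recorded here.
        return []
    with path.open(encoding="utf-8") as f:
        config = json.load(f)
    # Assumption: extensions are stored as {"name/main": true_or_false}.
    load = config.get("load_extensions", {})
    return sorted(name for name, on in load.items() if on)
```

Calling enabled_nbextensions() and disabling the listed extensions one at a time (or half at a time) should show which one triggers the delay.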

Original Post
I’m having an odd issue in the clean Jupyter notebook for Chapter 4. If I run cells 1-14 (i.e. up to, but not including, the cell len(stacked_threes.shape)), they execute correctly and almost instantly. But the next cell (len(stacked_threes.shape)) takes several minutes to show its result, even though my Jupyter nbextension reports an execution time of 3ms. If I change the cell to something else, e.g. type(stacked_threes) or even 1+1, the same thing happens: 2-3ms reported execution time, but the result takes several minutes to appear.

I am running it locally, and during this period I see unusually high Python CPU utilization (6% on an AMD 3700x) and normal, minimal GPU utilization (3% on an NVIDIA RTX 2070 Super). While I’m certainly very curious whether anyone has an idea of what might be going wrong in my particular case (is there some sort of GIL-blocking happening?), I realize that the problem may be a quirk of my local install (an Anaconda environment with Python 3.8.5 and only the base packages plus fastai-v2 installed).
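
One rough way to check whether the time is really being spent in the kernel is to measure it from inside the cell itself. This is only a sketch: timed is a helper name I made up, and the shape tuple is a stand-in for stacked_threes.shape so the snippet runs on its own. If the printed time is a few milliseconds but the output still takes minutes to appear, the delay is probably outside the Python code being run (for example in the notebook front end).

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, seconds elapsed inside the kernel)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


# Hypothetical stand-in for stacked_threes.shape from the notebook.
shape = (6131, 28, 28)

result, seconds = timed(len, shape)
print(f"result={result}, kernel-side time={seconds * 1000:.3f} ms")
```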

So, my more general question: can anyone recommend steps or resources on diagnosing performance issues in Jupyter that do not actually throw errors?

Thanks in advance…