Notebook and notebook exports mismatch caused headaches

exynos7 · July 26, 2019, 5:23am

I haven’t seen anything mentioned about this anywhere. Maybe I missed the memo. I’m sure someone will tell me if this is common knowledge. Anyways, this cost me a good bit of time, but that probably says more about my debugging skills than it does about the nature of the problem itself. I hope this will help someone else.

So, for anyone wondering: The notebook export scripts, exp/nb_*.py, contain minor changes not reflected in the notebooks themselves. Meaning, if you run notebook2script.py at the end of the notebook (like I evidently do), you will overwrite necessary code changes. So, if you are getting weird errors when you run your code, this might be the cause. Just git clone the repo again and you should be good to go.

The changes that led to my discovery of this were in nb_05b.py, and nb_05.py, but I’m guessing there are more. In 05b, there was a line added to the beginning of the Runner __init__ method: self.in_train = False. In 05, there was a function added: cos_1cycle_anneal. I checked the latest version on github, and both of these still exist.
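
A minimal sketch of one way to spot this kind of mismatch, assuming that export cells begin with an `#export` comment and that the exported script is plain Python. The helper names (`exported_source`, `top_level_defs`, `compare_exports`, `load_and_compare`) are made up for illustration and are not part of the course repo. It compares top-level functions and classes by their parsed syntax tree, so whitespace and comments are ignored:

```python
import ast
import json
import re
from pathlib import Path

# Assumption: an export cell starts with a line such as "#export" or "# export".
EXPORT_MARKER = re.compile(r"^\s*#\s*export\b")


def exported_source(notebook: dict) -> str:
    """Join the source of every code cell whose first line is an export marker."""
    chunks = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        src = cell.get("source", [])
        text = "".join(src) if isinstance(src, list) else src
        lines = text.splitlines()
        if lines and EXPORT_MARKER.match(lines[0]):
            # Drop the marker line and any IPython magics so ast can parse the rest.
            body = [ln for ln in lines[1:] if not ln.lstrip().startswith(("%", "!"))]
            chunks.append("\n".join(body))
    return "\n\n".join(chunks)


def top_level_defs(source: str) -> dict:
    """Map each top-level function/class name to a dump of its syntax tree."""
    kinds = (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)
    return {node.name: ast.dump(node)
            for node in ast.parse(source).body if isinstance(node, kinds)}


def compare_exports(notebook: dict, script_source: str):
    """Return (missing, changed): names only in the script, and names whose bodies differ."""
    nb_defs = top_level_defs(exported_source(notebook))
    py_defs = top_level_defs(script_source)
    missing = sorted(set(py_defs) - set(nb_defs))
    changed = sorted(name for name in set(py_defs) & set(nb_defs)
                     if py_defs[name] != nb_defs[name])
    return missing, changed


def load_and_compare(nb_path, script_path):
    """Convenience wrapper that reads both files from disk."""
    notebook = json.loads(Path(nb_path).read_text(encoding="utf-8"))
    return compare_exports(notebook, Path(script_path).read_text(encoding="utf-8"))
```

Running `load_and_compare` on a notebook and its matching exp/nb_*.py before running notebook2script.py would list what is about to be lost. Against the files described above, one would expect `cos_1cycle_anneal` to show up as missing for 05 and `Runner` as changed for 05b, though this sketch has not been run against the repo.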

I’m sorta curious as to why these changes weren’t simply added to the notebooks themselves. It would have made my last hour or so a lot more productive. In my ignorance of any reason/purpose for the mismatch, I advocate that the notebooks and their export scripts be made identical.

To cure my ignorance: does anyone know why this is?

jimlou · August 3, 2019, 6:06am

Thank you. I got the same problems when I ran the notebook.

Vinod · September 19, 2019, 11:25pm

Thank you @exynos7, this was helpful and saved me time!

g13e · November 10, 2019, 1:26pm

I ran into the same issue and realized the changes to the notebook sit on a branch that was never merged into the main notebook. @jeremy should the changes be merged?

e.g. these are the changes to the 05_anneal notebook that add the cos_1cycle_anneal(start, high, end) function

github.com

fastai/course-v3/blob/1b860b5c981bd4b778f4635ea864ba483e138eb0/nbs/dl2/05_anneal.ipynb

g13e · November 13, 2019, 8:24pm

I made a pull request with a minimal change (just adding the cos_1cycle_anneal function)
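
For readers who have not seen it, the function has the signature quoted earlier in the thread, `cos_1cycle_anneal(start, high, end)`. The body below is only a sketch of what a one-cycle cosine schedule with that signature could look like. `sched_cos` and its formula are assumptions made for this example, not code copied from the repo or the pull request:

```python
import math
from functools import partial


def sched_cos(start, end, pos):
    """Cosine interpolation: returns start at pos=0 and end at pos=1 (assumed helper)."""
    return start + (1 + math.cos(math.pi * (1 - pos))) * (end - start) / 2


def cos_1cycle_anneal(start, high, end):
    """Two cosine phases: warm up from start to high, then anneal from high to end."""
    return [partial(sched_cos, start, high), partial(sched_cos, high, end)]


# Each returned item is a function of pos in [0, 1] for its own phase.
warmup, anneal = cos_1cycle_anneal(0.1, 1.0, 0.01)
```

Here `warmup(0.0)` gives the starting value, `warmup(1.0)` and `anneal(0.0)` both give the peak, and `anneal(1.0)` gives the final value.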

lawrence · November 16, 2019, 12:48am

Thanks @exynos7 and @g13e for this. I ran into the same thing in Lesson 12 when the missing function broke the notebook and I couldn’t find it anywhere. I should have realized that there might be differences between the notebooks and the .py files!

And in case anyone else is also on Azure, I used the default fastai installation, and I just realized that I may have TWO git repositories:

~/fastai/, which has the fastai code, and
~/notebooks/fastai/course-v3/, which has the notebooks as well as the subdirectories with the .py files in them.
So when I cd to ~/fastai/ and run git pull, I’ve apparently been updating the code repository but not the notebook repository. I am guessing that the solution is to cd to the notebook directory and do git pull there, too.
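
The two pulls described above could be scripted roughly as follows. This is a sketch that assumes the two directory paths from this post. `pull_command` and `pull_all` are hypothetical helper names, and with `dry_run=True` the commands are only shown, not run:

```python
import subprocess
from pathlib import Path

# Assumption: the two checkouts live where this post says they do.
REPOS = [
    Path.home() / "fastai",
    Path.home() / "notebooks" / "fastai" / "course-v3",
]


def pull_command(repo):
    """Build a git pull for the given checkout; --ff-only refuses to create a merge."""
    return ["git", "-C", str(repo), "pull", "--ff-only"]


def pull_all(repos, dry_run=True):
    """Return a dict mapping each repo path to the command (dry run) or to git's output."""
    results = {}
    for repo in repos:
        cmd = pull_command(repo)
        if dry_run:
            results[str(repo)] = " ".join(cmd)
            continue
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[str(repo)] = proc.stdout.strip() or proc.stderr.strip()
    return results
```

Calling `pull_all(REPOS, dry_run=False)` would update both checkouts. If the notebook checkout has local edits that conflict with upstream, git will stop and report it instead of overwriting them.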

One thing I’ve wanted to do is to add ‘help’ annotations to the export cells to remind myself of what functions do, so that I’ll see the help when I use Jupyter Lab to search on the function definition in the notebook. But if I re-load the .py files from a git repository, I guess I’ll have to merge my changes, which seems painful. Suggestions for a better workflow would be welcome!

alohia · November 26, 2019, 6:00am

The ParamScheduler class in 09_optimizers is also different from the exports. Make sure to use the one in the exports.

g13e · November 26, 2019, 9:01am

@alohia my pull request was merged by Jeremy, so at least the cos_1cycle_anneal problem should be solved now!
Maybe you should consider creating a pull request for the ParamScheduler? I found it a useful exercise in itself for getting familiar with the process.

alohia · November 26, 2019, 2:22pm

Yes @g13e, I found a few other mismatches as well. Planning to compile all of them and then send a pull request.