Examples/tabular.ipynb: 'split_idx' is not defined

Hi. I’m posting here prior to submitting an Issue on GitHub.
Earlier today I cloned the latest version of the repository and have been running the examples.

I’m getting an undefined-variable error in cell 10 of examples/tabular.ipynb, which reads:

data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                           .split_by_idx(split_idx)#(list(range(800,1000)))
                           .label_from_df(cols=dep_var)
                           .add_test(test)
                           .databunch())

This produces the error:

NameError: name 'split_idx' is not defined

Ctrl-F shows that split_idx appears nowhere else in the notebook. It’s odd to me that others have apparently been able to run this code ok, even with an undefined variable.

Can anyone suggest what to do to run this code?

You should use the list(range(800,1000)) that is commented out on that same line, in place of split_idx.
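As a rough sketch of what that change amounts to: either pass the list directly to split_by_idx, or define split_idx in an earlier cell so the existing line works unchanged. The snippet below uses plain Python only (no fastai) to show which rows the index list selects. The value n_rows = 1000 is a hypothetical dataset size picked for illustration, not something taken from the notebook.

```python
# Define the missing variable using the value that is commented out in the cell.
split_idx = list(range(800, 1000))  # rows 800..999 become the validation set

# Illustration only: splitting by index sends the listed rows to validation
# and every other row to training.
n_rows = 1000  # hypothetical dataset size, assumed for this sketch
valid_set = set(split_idx)
train_idx = [i for i in range(n_rows) if i not in valid_set]

print(len(train_idx), len(split_idx))  # prints: 800 200

# With split_idx defined in an earlier cell, the notebook's
# .split_by_idx(split_idx) line no longer raises NameError.
```

Defining the variable rather than editing the call keeps the notebook cell itself untouched, which may be handy if you want to try other validation ranges later.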

Thanks. That runs.
So… should that be un-commented in the official version of the notebook? Or some kind of note added for new users, so that they know they need to do this?

It is already that way in the course notebooks; see here: https://github.com/fastai/course-v3/blob/master/nbs/dl1/lesson4-tabular.ipynb (I’d generally run the examples from that repository, since it is the one the course uses.)

Ok thanks. I’ll submit a PR for this particular notebook example so that people don’t end up in my boat.

PR is here: https://github.com/fastai/fastai/pull/2478
Checks passed!
Update: And accepted, merged into repo!
