Developer chat

This is a chat thread for fastai v1 developers. Use it like a Slack or IRC channel. You can keep it open in a separate tab or window and see comments in real time, or check it from time to time to see what’s going on. If you’re working on fastai v1 and get stuck and need help, or want to discuss possible ways to solve a design issue, etc., feel free to add a reply here.

Using these Discourse forums for real-time chat works much better if you know the keyboard shortcuts. Hit shift-/ to bring up the list of shortcuts. The most important ones to know are: shift-R to reply and ctrl-enter to send. If you’re discussing a line of code or a particular commit, please include a link to it. If you’re discussing some chart or image, paste it into your post.

It would be very helpful if people contributing code on a regular basis could try to use this chat thread to mention:

  • What you’re going to start working on, if you’re starting on a new feature/bug/issue
  • When you commit code, give us some context about what you did and why, and how it might impact other devs.

See the following post for an example.

13 Likes

Discussion of changes in this commit - note that many include a link to the relevant line of the commit:

  • Added a little section to the 001a tutorial showing how to use CUDA. I’m planning to update the mnist_sample notebook to use that too
  • Added some simple helpers like conv2d. I’m not sure they’re being used everywhere they could be yet. Will look into this
  • Added ifnone, which is simply return b if a is None else a, a very common pattern (see the sketch after this list). Haven’t used it everywhere we could yet
  • Added functions like children to grab a list of children of a model, the number of children, etc.
  • Made some changes so that model init is only applied to layers that aren’t frozen. That can be handy for running init on a pre-trained model
  • Added convenient hook classes and callbacks for registering and automatically removing PyTorch hooks
  • Added TTA
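
For reference, ifnone is small enough to show in full; here is a sketch (only ifnone itself comes from the post - the fit signature and the default value in the usage example below are made up for illustration):

    def ifnone(a, b):
        "Return `b` if `a` is None, otherwise `a`."
        return b if a is None else a

    # illustrative use: defaulting an optional argument without an explicit if-block
    def fit(epochs=None):
        epochs = ifnone(epochs, 1)
        ...
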
2 Likes

Here’s a discussion of a rather lovely Unet refactoring, so it’s now <1 screen of code!

  • Added __getattr__ to DatasetTfm and DeviceDataLoader, which passes unknown attributes through to the composed dataset/dataloader (see the sketch after this list)
  • Simplified UnetBlock so the channels are easier to read
  • Changed DynamicUnet so it now builds on init, not on first forward, and uses nn.Sequential
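
The __getattr__ delegation mentioned in the first bullet is the standard composition trick; a minimal, hypothetical sketch (only the __getattr__ part reflects the post - the rest of the class body is illustrative, not the actual fastai code):

    class DatasetTfm:
        "Wraps a dataset and applies a transform; unknown attributes fall through to the wrapped dataset."
        def __init__(self, ds, tfm=None):
            self.ds, self.tfm = ds, tfm

        def __len__(self):
            return len(self.ds)

        def __getitem__(self, idx):
            x, y = self.ds[idx]
            return (self.tfm(x) if self.tfm else x), y

        def __getattr__(self, k):
            # only called when normal attribute lookup fails,
            # so e.g. `.classes` is answered by the wrapped dataset
            return getattr(self.ds, k)
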
2 Likes

  • This commit adds two notebooks:
    • 009a is copied from previous lesson3_rossmann and contains all the necessary steps to feature-engineer the data
    • 009 contains a first attempt at creating a TabularDataset that is consistent with the way the other APIs (ImageDataset and TextDataset) are built in the library.
  • This one is a wrapper that looks like the previous training phase API, to make it easy to experiment with various schedules.
2 Likes

@sgugger there are a lot of interesting ideas for tabular data in the new TensorFlow TFX stuff:

Possibly too late and too different for us to consider these ideas for v1, but might be a source of inspiration…

2 Likes

At @jeremy’s request I have been working on a new build tool which copies a fully completed notebook with outputs to dev_nb/run/ so that it can be shown to users.

It’s difficult to come up with perfect logic for programmatically validating that a notebook is complete enough to be shown to users, so at the moment it uses two checks:

  1. Check that the last non-empty code cell has outputs - but it’s possible that a cell was run yet produced no outputs, so it’s not a solid check on its own.
  2. Check that the execution_count numbers are contiguous, i.e. if, after completing the run of the notebook, you went back up and re-ran some cells, it’ll reject that notebook, as chances are it won’t be “perfect”.

Bottom line: run the notebook from beginning to end without any errors and it’ll accept it.
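
Roughly, the two checks above could be expressed like this (a sketch using nbformat, not the actual tool):

    import nbformat

    def looks_complete(path):
        "Heuristic check that a notebook was run top-to-bottom without re-running cells."
        nb = nbformat.read(path, as_version=4)
        code_cells = [c for c in nb.cells if c.cell_type == 'code' and c.source.strip()]
        if not code_cells:
            return False
        # check 1: the last non-empty code cell should have produced some output
        if not code_cells[-1].outputs:
            return False
        # check 2: execution counts should be the contiguous sequence 1..N,
        # which rules out cells re-run out of order after the full run
        counts = [c.execution_count for c in code_cells]
        return counts == list(range(1, len(counts) + 1))
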

It also pushes a disclaimer cell to the very top of the notebook indicating that it is not to be modified/PRed/bug-reported, and that the source notebook should be used instead.

It’s still a work in progress but give it a try. You can run it on all notebooks:

tools/sync-outputs-version -v 

or on specific notebooks:

tools/sync-outputs-version -v dev_nb/001b_fit.ipynb ...

Please use -v (verbose mode) for now; I haven’t quite decided how verbose we want it to be without -v.

There are copious notes of me thinking aloud in the code, so the logic should be quite clear.

Feedback is welcome.

P.S. Obviously, run it only if you make significant changes to the dev_nb notebooks; otherwise don’t waste your time syncing the two versions.

1 Like

Just to clarify: anything in this directory does not have its outputs stripped automatically. I’m not sure I remembered to post anything about that when I added it!

I created a Readme file explaining that.

1 Like

Pushed a few things here.

Commit to fix the way the vocabulary is hashed inside the TextDataset (it was a random hash; it’s now a deterministic one).
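
(For context: Python’s built-in hash() is salted per interpreter run, so a deterministic hash has to be built from something stable. A sketch of the idea - the itos name and the choice of md5 are assumptions, not the actual fastai code:)

    import hashlib

    def vocab_hash(itos):
        "Deterministic hash of a vocabulary (list of tokens); stable across interpreter runs."
        return hashlib.md5('\n'.join(itos).encode('utf-8')).hexdigest()
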

Commit to refactor the way the layer_groups are split in RNNs. It required changing the split method of the learner in notebook 004a a bit.

In this commit a big refactor of TabularDataset.

Then in this commit:

  • a small bug fix in 007 to correct the order of the arguments in fit_one_cycle,
  • then the tabular model - it’s not working as well as it should yet, but it’s a first draft.

Fixed the problem in this commit, then cleaned things up a bit in this one.

tools/sync-outputs-version is now good to go. Please let me know whether it works on Windows.

It can now execute notebooks from the CLI, besides checking/copying successful ones. See the top of the script for examples, or run it with -h.

Any suggestions for a better name for this tool? Currently its name is not intuitive at all, but at the moment I’m not feeling creative, so nothing comes to mind.

We use it to copy notebooks successfully executed in Jupyter to dev_nb/run/, and optionally to execute them from the CLI.

Thank you.

copy-to-run? render-notebook? run-notebook?

Thank you for the suggestions, Jeremy.

Frankly, I find the run directory to be unintuitive to start with.

Perhaps a better name would be snapshot? After all, we are taking a snapshot of the notebook’s outputs.

Then the script could be take-snapshot? take-nb-snapshot?

Yes, snapshot is much better. I’ll rename it now. And I’ll call your script take-snapshot. Very nice! :)

1 Like

@313V is visiting us this week! :)

Lots in this commit, including:

  • DataBunch now has a path attribute, which is copied by default to Learner, and is where stuff like models will be saved. There’s also a new data_from_imagefolder function that creates a DataBunch for you
  • You can now create a transform with is_random=False to have it skip any randomization
  • Used this feature to create ‘semi-random TTA’, which does 8 TTA images, one for each corner of the image, for each of flip and non-flip. These are combined with whatever augmentation you have for lighting, affine, etc. This approach gives dogs v cats results up to 99.7% accuracy with rn34 224 px! (Previously around 99.3-99.4%.)
  • You can call DataBunch.holdout(is_test) to get either the test set or the validation set. Most prediction methods now take an is_test param
  • loss_batch now moves losses and metrics to the CPU
  • Learner now saves models inside path/‘models’
  • get_transforms now has reasonable defaults for side-on photos
  • Added Learner.pred_batch for one batch and Learner.get_preds for a full set of predictions
  • show_image_batch now has an optional denorm function argument
  • Added a PyTorch fbeta function (see the sketch after this list)
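
For reference, a PyTorch F-beta metric might be sketched along these lines (a simplified multi-label version; the actual fastai function may differ):

    import torch

    def fbeta(preds, targs, beta=2, thresh=0.5, eps=1e-9):
        "F-beta score for multi-label predictions, averaged over samples."
        preds = (preds > thresh).float()
        targs = targs.float()
        tp   = (preds * targs).sum(dim=1)
        prec = tp / (preds.sum(dim=1) + eps)
        rec  = tp / (targs.sum(dim=1) + eps)
        b2   = beta ** 2
        return ((1 + b2) * prec * rec / (b2 * prec + rec + eps)).mean()
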

Awesome! Thank you!

I made some more improvements, including an important change: the execution no longer overwrites the original .ipynb, so it doesn’t interfere with git or with notebooks open in Jupyter. Everything happens in a tmp file.
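
The idea, roughly (a sketch using nbformat/nbconvert; the actual script’s implementation differs):

    import tempfile
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    def execute_to_tmp(src_path):
        "Execute a notebook and write the result to a temp file, leaving the original untouched."
        nb = nbformat.read(src_path, as_version=4)
        ExecutePreprocessor(timeout=600).preprocess(nb, {'metadata': {'path': '.'}})
        with tempfile.NamedTemporaryFile('w', suffix='.ipynb', delete=False) as f:
            nbformat.write(nb, f)
        return f.name
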

If you have your notebooks’ data set up, and lots of resources, you can now run:

tools/take-snapshot -v -e dev_nb/0*ipynb

and then make a big fat commit with many snapshots that aren’t under git yet.

Also, I disabled the option to execute all notebooks by default:

$ tools/take-snapshot -e 
When using -e/--execute, pass the *.ipynb files to execute explicitly

reasoning that it would take too many resources, and that it’s perhaps better to specify the files to run explicitly. Nothing stops you from passing dev_nb/0*ipynb, though. But of course, if you believe it should work unimpeded, let me know and I’ll remove that sanity check.