Planning to get rid of `precompute` in fastai v1. Comments welcome

jeremy · September 6, 2018, 4:44am

In the upcoming v1 of the fastai lib I’m considering removing the precompute parameter for ConvLearner. My reasoning is:

I never find I use it in practice
Data augmentation doesn’t work, so it leads to more overfitting
Training the head rarely takes more than a couple of epochs, so it saves little if any time
It’s extra data to store (quite a lot, sometimes)
Makes the codebase quite a bit more complex

(Regardless of whether it stays in the lib or not, I certainly wouldn’t keep it as part of lesson 1, since it’s been very confusing for students to understand the difference between freeze and precompute!)
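
For readers unsure about the distinction, here is a minimal sketch in plain PyTorch. It does not use the fastai API, and `body`, `head`, the tensor sizes, and the added noise (a stand-in for augmentation) are all made up for illustration. Freezing stops weight updates in the body but still runs it on every batch, so augmented inputs pass through it. Precomputing runs the body once and trains the head on the cached outputs, so later augmentation has nothing to act on.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy stand-ins for a pretrained body and a new head (hypothetical sizes).
body = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
head = nn.Linear(4, 2)
x = torch.randn(16, 8)

# Count how often the body runs a forward pass.
calls = {"n": 0}
def count_forward(module, inputs, output):
    calls["n"] += 1
body.register_forward_hook(count_forward)

# Freeze: body weights get no gradients, but the body still runs each step.
for p in body.parameters():
    p.requires_grad = False

for _ in range(3):                           # three passes over the data
    noisy = x + 0.1 * torch.randn_like(x)    # stand-in for augmentation
    out = head(body(noisy))                  # body sees the augmented input
frozen_calls = calls["n"]

# Precompute: run the body once, cache the activations, reuse the cache.
calls["n"] = 0
with torch.no_grad():
    cached = body(x)
for _ in range(3):
    out = head(cached)                       # body is not called again
precompute_calls = calls["n"]

print(frozen_calls, precompute_calls)
```

The two counters show where the time saving and the loss of augmentation both come from: the frozen body runs once per pass, the precomputed one runs once in total.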

Comments welcome.

brtknr · September 6, 2018, 5:53am

Would the default behaviour then be to precompute on the non-augmented dataset?

What do you mean data augmentation doesn’t work? Do you mean that its contribution to improvements is not significant?

sjdlloyd · September 6, 2018, 5:55am

I’m just trying to think of any situations I’ve used it…

The only cases I can think of for it are:

experimenting on things like DeViSE, and it could be useful for training other large custom heads, e.g. U-Net? (not that I’ve tried this)
It’s also quite cool to show a model training really quickly when running a demo, although a bit smoke and mirrors.

On the whole though, I’d agree it’s extra unneeded clutter, and it causes confusion.

radek · September 6, 2018, 6:01am

I have not found it useful and from what I recall people were finding it quite confusing vs freezing.

binalpatel · September 7, 2018, 1:46am

I think getting rid of it is a good idea, especially if it makes the code base simpler and easier to understand.

I was one of those students when watching Part 1 trying to figure out the difference between freezing and precompute, and aside from that lesson I’ve never actually used it in practice (and haven’t been the worse for it).

TheShadow29 · September 7, 2018, 3:08am

I agree. Even I was confused by this. However, I found the code quite helpful. It helped me write code for saving intermediate features in bcolz and then loading them back.

If the code base becomes simpler, it’s best to remove it, imo. From experience, I have never really found it very useful except for sanity-checking whether the model is working.

jeremy · September 7, 2018, 5:29am

Yeah, I think it would be nice to provide some easy way to do this, but perhaps we can provide a class or functions that do just that, rather than integrating it into Learner.

(If anyone wants to try to come up with an API for that - feel free!)

TheShadow29 · September 7, 2018, 5:23pm

Here is the code I wrote for saving intermediate features: FAI-notes/notebooks/Using-Forward-Hook-To-Save-Features.ipynb at master · TheShadow29/FAI-notes · GitHub. Specifically, the class FeatureSaver does the saving and loading of the saved activations. It takes as input a learner, a template for naming files, and the model data.

The current notebook is definitely not the best API, but it gets things done. Happy to discuss how to integrate this into a nice-looking function or class. Backward hooks can also be integrated in the same way, I believe, but that might be a little more involved.

TheShadow29 · September 9, 2018, 10:18pm

After giving it some thought, I propose the following API for saving features.

Have this inside the learner itself, as a function called save_features_of_module_index. It takes an integer index or a list of integers identifying the layer after which the values should be saved: an integer refers to m[index], and a list refers to m[index[0]][index[1]]. The list form is for when the model m has sequential blocks inside sequential blocks. Internally this uses SaveFeatures, similar to the one used in lesson 7 (CAM). The rest of the process is the same as I described in the notebook.
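
The indexing idea can be sketched in plain PyTorch as follows. This is an illustration under assumptions, not the notebook's code or a fastai API: the `SaveFeatures` class here is written from scratch in the spirit of the one mentioned above, `module_at` is a hypothetical helper name, and the toy model is made up.

```python
import torch
import torch.nn as nn

class SaveFeatures:
    """Stores the output of one module each time it runs a forward pass."""
    def __init__(self, module):
        self.features = None
        self.hook = module.register_forward_hook(self.hook_fn)

    def hook_fn(self, module, inputs, output):
        self.features = output.detach()

    def remove(self):
        self.hook.remove()

def module_at(model, index):
    """Hypothetical helper: 2 -> model[2]; [1, 0] -> model[1][0]."""
    if isinstance(index, int):
        index = [index]
    module = model
    for i in index:
        module = module[i]
    return module

# Toy model with a sequential block nested inside a sequential block.
m = nn.Sequential(
    nn.Linear(8, 6),
    nn.Sequential(nn.Linear(6, 4), nn.ReLU()),
    nn.Linear(4, 2),
)

saver = SaveFeatures(module_at(m, [1, 0]))   # capture the output of m[1][0]
out = m(torch.randn(5, 8))
saver.remove()                               # detach the hook when done

print(saver.features.shape)
```

After the forward pass, `saver.features` holds the activations produced by `m[1][0]`, which could then be written to disk in whatever format is preferred.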

I am not sure if this is the correct thread for this, but I think learn.freeze_to should also have a sister function, learn.freeze_to_index, which uses the index in a similar way.
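
One possible reading of that suggestion, sketched in plain PyTorch: freeze every module that comes before the indexed one and leave the rest trainable. The function below is hypothetical (fastai has no `freeze_to_index` as far as this thread shows), and the "freeze everything before the index" semantics are an assumption.

```python
import torch.nn as nn

def freeze_to_index(model, index):
    """Hypothetical helper: freeze everything before model[i] (or model[i][j])."""
    if isinstance(index, int):
        index = [index]
    # Start from a fully trainable model.
    for p in model.parameters():
        p.requires_grad = True
    block = model
    for i in index:
        # Freeze the siblings that come before position i at this nesting level.
        for child in list(block.children())[:i]:
            for p in child.parameters():
                p.requires_grad = False
        block = block[i]

# Toy model with a nested sequential block (made up for illustration).
m = nn.Sequential(
    nn.Linear(8, 6),
    nn.Sequential(nn.Linear(6, 4), nn.Linear(4, 4)),
    nn.Linear(4, 2),
)

freeze_to_index(m, [1, 1])   # m[0] and m[1][0] frozen; m[1][1] and m[2] trainable
```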

Open to more suggestions.

radek · September 10, 2018, 10:27am

One of the things I have been hoping for in v1 is greater modularity. Having a lot of things in Learner is not very useful when you want to figure out what it does or make it do something in a slightly different way. We end up with multiline function signatures and a lot of if conditions.

Here I feel that saving the activations is not helpful to newcomers, is confusing, and has nearly no applications (unless we want to predict on a subset of Imagenet classes). And people with some experience should be able to figure out how to do this and many other related things if they are given the correct, basic building blocks. Something like this:

learn = <instantiate learner with no head, shuffle data = False>
acts = learn.predict(...)
np.save('path_to_file.npy', acts)  # np.save appends .npy when the name lacks it

And then we can do this:

acts = np.load('path_to_file.npy')  # np.load needs the full file name
learn = <instantiate learner using the np array as data and create only the head part>
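
As a rough, runnable version of the two steps above, here is a sketch in plain PyTorch and NumPy rather than fastai. The toy `body`, `head`, random data, and temporary file path are all assumptions made for illustration.

```python
import os
import tempfile

import numpy as np
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Toy stand-ins (hypothetical): a headless "pretrained" body and a small dataset.
body = nn.Sequential(nn.Linear(8, 4), nn.ReLU()).eval()
x, y = torch.randn(32, 8), torch.randint(0, 2, (32,))

# Step 1: run the body once over the data in a fixed order, then save to disk.
loader = DataLoader(TensorDataset(x), batch_size=8, shuffle=False)
with torch.no_grad():
    acts = torch.cat([body(xb) for (xb,) in loader]).numpy()
path = os.path.join(tempfile.mkdtemp(), "acts.npy")
np.save(path, acts)

# Step 2: load the activations back and train only a head on them.
acts = torch.from_numpy(np.load(path))
head = nn.Linear(4, 2)
opt = torch.optim.SGD(head.parameters(), lr=0.1)
head_loader = DataLoader(TensorDataset(acts, y), batch_size=8, shuffle=True)
for epoch in range(2):
    for ab, yb in head_loader:
        loss = nn.functional.cross_entropy(head(ab), yb)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Keeping `shuffle=False` in step 1 is what lets the saved activations line up with the labels `y` when they are paired again in step 2.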

Anyhow - sorry if this comment misses the point. I only started getting into v1 and am having a great time.

jeremy · September 10, 2018, 1:24pm

Right - Learner is now tiny in v1 and everything is done with callbacks.

TheShadow29 · September 10, 2018, 4:06pm

Point noted. @radek your formulation looks way cleaner. @jeremy I didn’t notice that everything is done with callbacks. I will look into that more now.