Learn.get_preds() memory inefficiency - quick fix

Much nicer than my hacky fix - I was trying for minimal changes as I don’t yet grok the framework.

However I’m pretty sure my fix to fastai/vision/widgets.py is safe - can you confirm?

Happy for you to put it all into one PR at once if you’re happy with it.

Ideally we’d split this into two PRs: one for the Interp notebook and .py, and one for the widgets notebook and .py. Jeremy likes incremental PRs :slight_smile: I haven’t checked yet whether the ImageCleaner can be adjusted at all; I should be able to review that in the next day or two.

Also, I’ve only tested this on images. We need to test it on all supported applications to make sure it works in every scenario.

This does break on tabular, so we need to rethink a few things

You mean the new Interpretation class you’ve built, or the small-fixes approach I tried?

I’ve only ever done the video walk-throughs on Tabular, so can’t possibly claim any expertise there - would not be at all surprised if my fixes broke :wink:

The one I built. It should do something similar to what you did, so I’ll need to sit down and figure out why it isn’t quite working.

Ok. If you have a simple test case that breaks yours, send it over and I’ll see if it breaks mine as well.

For the testing I’m using this notebook: https://github.com/fastai/fastai/blob/master/nbs/examples/app_examples.ipynb

And simply doing:

dls = dls
learn = application_learner()
interp = Interpretation.from_learner(learn)
interp[0] # or interp.plot_top_losses(1)

Ok, my fixes above seem to work for everything for which an Interpretation class is implemented (e.g. plot_top_losses is not implemented for segmentation or image regression). Tabular definitely works, though.


I should confirm that for testing Tabular and Text, I used
interp = ClassificationInterpretation.from_learner(learn)
interp.most_confused()

@muellerzr How is your improved Interpretation class looking?
I made a PR for ImageClassifierCleaner as you suggested to keep them separate, and it is a fix that shouldn’t have any deeper consequences. I’ll leave the Interpretation class fixes to you as you’re clearly doing more major surgery.


Hello @muellerzr .
The memory inefficiency still exists in the fastai version released three days ago (2.5.0). When I try to predict a dataset with 500K samples using a tabular model, the 128 GB of system RAM fills up after 74% of the prediction and the computer freezes. The prediction works with 200K samples. I noticed that the Interpretation class in the current version of fastai is not using the __getitem__() strategy proposed here…
When I copy the Interpretation class proposed here into the library and run interp = ClassificationInterpretation.from_learner(learn), I get the following error:
“TypeError: __init__() missing 1 required positional argument: ‘losses’”
I hope you can help address this issue…
BR
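For context, the __getitem__() strategy mentioned above amounts to computing each item’s loss lazily, on demand, instead of materializing decoded predictions for the whole dataset up front. Here is a framework-free toy of that idea (class and names are illustrative only, not the fastai API):

```python
class LazyInterp:
    """Toy stand-in for a lazy Interpretation: each item's loss is
    computed on demand via __getitem__, so nothing is stored for the
    full dataset ahead of time."""

    def __init__(self, preds, targets, loss_fn):
        self.preds, self.targets, self.loss_fn = preds, targets, loss_fn

    def __getitem__(self, idx):
        # Only this one item's loss is ever computed and held here.
        return self.loss_fn(self.preds[idx], self.targets[idx])


preds   = [0.9, 0.2, 0.7]   # toy model outputs
targets = [1.0, 0.0, 1.0]
interp  = LazyInterp(preds, targets, lambda p, t: abs(p - t))
print(round(interp[0], 2))  # -> 0.1
```

The key point is that indexing triggers the computation, so peak memory stays proportional to one item rather than to the dataset size.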

Yes, it is not in fastai, as the method proposed does not work in every situation.

Re your losses error, please post the versions of fastai and fastcore you tried with, as I was not able to recreate it with:

from fastai.vision.all import *

set_seed(99, True)
path = untar_data(URLs.PETS)/'images'
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2,
    label_func=lambda x: x[0].isupper(), item_tfms=Resize(224))

learn = cnn_learner(dls, resnet34, metrics=error_rate)
interp = ClassificationInterpretation.from_learner(learn)

Appreciate your quick feedback. Here are the versions used:
fastai: 2.5.0
fastcore: 1.3.25
BR

Try upgrading your fastcore version, i.e. pip install fastcore -U

I upgraded to fastcore version 1.3.26. The memory still gets full.
BR

What I’m trying to say is that there isn’t a memory fix currently. You can try making a test_dl with fewer items and running it via the dl param.
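In fastai terms that suggestion would look roughly like dl = learn.dls.test_dl(items[:20_000]) followed by learn.get_preds(dl=dl) (an untested sketch; the slice size is arbitrary). The underlying pattern, shown framework-free with made-up names:

```python
def get_preds_subset(predict, items, n):
    """Run `predict` over only the first n items, so peak memory is
    proportional to n rather than to len(items)."""
    return [predict(x) for x in items[:n]]


items = list(range(500_000))  # stand-in for a 500K-row dataset
preds = get_preds_subset(lambda x: x * 2, items, 3)
print(preds)  # -> [0, 2, 4]
```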


ok, got it.
Thank you for your time. :grinning:

Try my slightly hacky fixes documented at Learn.get_preds() memory inefficiency - quick fix - #19 by hushitz

They passed all the tests I did at the time, but @muellerzr was worried about their generality so I never did a formal PR for them. So there’s a good chance these will work for whatever your application is, and if they don’t work, please let us know!


@hushitz, appreciate your suggestion. I ran it; however, the memory inefficiency still occurs.

Actually, a couple of weeks ago, I achieved some improvement after trying the code proposed here. Initially, I could execute the prediction on a dataset of 200K samples. After trying both pieces of code, one at a time while installing different versions of the fastai library, I managed to run prediction on 500K samples. The issue is that the environment was accidentally deleted, and unfortunately I neither backed it up nor noted which code worked. Please let me know if you need additional information for further investigation.
BR
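For anyone hitting the same wall, one workaround along the lines of what’s being discussed is to run prediction in fixed-size chunks and concatenate the results, so peak memory is bounded by the chunk size rather than the dataset size. A hedged, framework-free sketch (chunk_preds, predict, and the chunk size are all made up here, not fastai API):

```python
def chunk_preds(predict, items, chunk_size):
    """Predict in fixed-size chunks so only one chunk's worth of
    intermediate state is alive at a time; results are concatenated."""
    out = []
    for start in range(0, len(items), chunk_size):
        out.extend(predict(items[start:start + chunk_size]))
    return out


items = list(range(10))
preds = chunk_preds(lambda xs: [x + 1 for x in xs], items, chunk_size=4)
print(preds)  # -> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

In fastai this would translate to building several smaller test_dls and calling learn.get_preds(dl=...) on each, then concatenating, though that variant is untested here.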

@fastai_geek maybe this might help you … had a similar issue previously