How to use Interpretation API in case of multilabeled dataset

balnazzar · January 11, 2019, 5:33pm

Hi!

ClassificationInterpretation doesn’t seem to work for multilabeled datasets (in particular plot_top_losses()).
Note also that the reference notebook for multilabeled image classification (planet) does not make use of ClassificationInterpretation.

Should we use the interpretation api differently as we deal with a multilabeled dataset?

Thanks.

balnazzar · January 22, 2019, 5:16pm

Folks, since some of you liked this post, I imagine you’d need an interpretation tool for multilabeled contexts.

I wrote plot_multi_top_losses() and Sylvain, after a due review, kindly added it to the library.
The documentation is here: https://docs.fast.ai/vision.learner.html#ClassificationInterpretation.plot_multi_top_losses

Thanks!

raimanu-ds · February 13, 2019, 4:33pm

That’s really cool ! Thanks a lot

I tried to make a PR (but couldn’t find how to do so) because I was wondering if it was possible to change the argument’s name figsz to figsize as in the original plot_top_losses() function.

Thanks

balnazzar · February 14, 2019, 1:11pm

Yes, apologies for having named that param in a different way w.r.t. the rest of the library

I’ll do it ASAP.

raimanu-ds · February 15, 2019, 4:35pm

Hello ,

When I call plot_multi_top_losses() and view the images, it displays only one label (category), as illustrated in the screenshot below.
At first, I thought it was because I used a high threshold (0.5). I lowered it to 0.1, and the plot function still only displayed 1 label.

Since you worked on such a function and problems before, I was hoping you could help me.

More info about my dataset and steps I did:

Each image in my training set has 2 labels:

the name of a simpson character (homer / marge)
and its expression (happy / angry).

I created the databunch object, set the accuracy thresh to 0.1 and trained a resnet50 model, no errors popped up.

Here is the result:
I got a 70% accuracy, which would be too high if it predicted only 1 label for every picture in the validation set. What do you think ?
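
A minimal sketch of why a thresholded accuracy figure alone may not settle this question. It assumes the metric is computed element-wise over every (image, label) entry, which is how thresholded multi-label accuracy is commonly defined. The function name `thresholded_accuracy`, the label order and all the numbers are illustrative, not taken from the library or from the run above.

```python
def thresholded_accuracy(probs, targets, thresh=0.5):
    """Fraction of (sample, label) entries where (prob > thresh) matches the 0/1 target."""
    hits = 0
    total = 0
    for p_row, t_row in zip(probs, targets):
        for p, t in zip(p_row, t_row):
            hits += int((p > thresh) == bool(t))
            total += 1
    return hits / total

# Hypothetical label order: [angry, happy, homer, marge]
targets = [[0, 1, 1, 0],   # homer happy
           [1, 0, 0, 1]]   # marge angry

# In each row only ONE of the two true labels clears the threshold
probs = [[0.05, 0.90, 0.30, 0.10],
         [0.80, 0.10, 0.05, 0.40]]

score = thresholded_accuracy(probs, targets, thresh=0.5)
print(score)  # 0.75
```

Under that assumption, a model that finds only one of the two labels per image would still score 75%, so a figure around 70% does not by itself rule that case out.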

balnazzar · February 15, 2019, 4:42pm

OK, if I’m not misunderstanding you, you expected it to show something like “Predicted: --happy-- marge”.

Am I right?

raimanu-ds · February 15, 2019, 4:46pm

Yup, exactly !

But I don’t know if this is a multi-label classification problem.

Or if I should use 2 classifiers, one to recognize the characters, another to recognize their expression, then ensemble the 2 models.

balnazzar · February 15, 2019, 7:48pm

Ok. You get a multi-label classification task when your data points can have more than one label (each), and this does actually happen to be the case at least once.

Now, you have a domain like this (if I’m not making mistakes): each scene is a data point, and each of your data points can in fact have more than one label. Indeed, it can simultaneously belong to up to two classes taken from a set of 4, that is, the cartesian product [homer, marge] X [happy, angry] (once you throw the ordering away).

If the above setting describes your domain correctly, you did well to use plot_multi_top_losses().

Why doesn’t it display the predicted class correctly? It should display an element from the cartesian product above, and indeed it does so for the actual class. The reason lies here:

classes_ids=[k for k in enumerate(self.data.classes)]

Since it acquires the names of the predicted classes this way, it seems that the learner just predicts over two classes: homer and marge. It should predict over four classes:
homer - happy
homer - angry
marge - happy
marge - angry
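
The four combined classes listed above can be enumerated as a cartesian product. This is only a sketch: the variable names are illustrative, and the " - " separator simply mirrors the list above.

```python
from itertools import product

characters = ["homer", "marge"]
expressions = ["happy", "angry"]

# One combined class per (character, expression) pair
combined_classes = [f"{c} - {e}" for c, e in product(characters, expressions)]
print(combined_classes)
# ['homer - happy', 'homer - angry', 'marge - happy', 'marge - angry']

# enumerate() pairs each class name with its index,
# which is what the list comprehension quoted above collects
classes_ids = [k for k in enumerate(combined_classes)]
print(classes_ids[0])  # (0, 'homer - happy')
```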

It would be instructive to look at your data as a whole. Can you provide additional details about the dataset and, more importantly, its labeling?

raimanu-ds · February 16, 2019, 3:15pm

First of all, thanks for your help, I really appreciate it !

Here’s the information about the dataset.

Number of images:

training set: 580 images
validation set: 70 images

Both the training and validation images are contained in a single folder.

I’ve used a text file to identify the images of the validation set.

Labelling - I’ve tried to use the same labelling system as the one in the planet dataset:

a csv file, with 2 columns:
- column 1: contains the images’ names
- column 2: contains the labels

I’ve tried to rerun the training process and plotted the results: the model seems to have only predicted the expression.

Let me know if you need anything else. Thanks again.

balnazzar · February 16, 2019, 4:32pm

You are welcome. Please try a learn.predict() over an image of yours, like learn.predict(data.valid_ds[0][0]).

Let’s see what the model predicts, apart from the interpretation api.

Thanks!

raimanu-ds · February 16, 2019, 7:20pm

Here is the result of the predictions for 2 images:
It looks like it indeed tried to guess the character and the expression in each image which is pretty cool.

balnazzar · February 16, 2019, 7:30pm

We are almost there… I got an idea about the probable culprit.

I need another thing or two.

The first handful of csv rows (not the dataframe).
The output of print({interpreter_name}.data.classes)

Thanks!

raimanu-ds · February 16, 2019, 7:55pm

Here are the first few rows from the csv (copy/paste from csv):

img_name	tags
homer_happy_0	homer happy
homer_happy_1	homer happy
homer_happy_2	homer happy
homer_happy_3	homer happy
homer_happy_4	homer happy
homer_happy_5	homer happy

The output of print({interpreter_name}.data.classes)

I will catch up with you later.
Thanks again for your help !

balnazzar · February 16, 2019, 8:59pm

No problem.
It seems we have our answer. Note that it is not like I thought in the beginning, that is:

Rather, we have four distinct classes: homer, marge, angry, happy. Now it seems the images are labeled with exactly two labels each (neither fewer nor more than two). Given this fact, the learner learns to always predict two labels out of four, and indeed you get (see the examples you provided above) just two entries of the tensor which are above the prediction threshold.
plot_multi_top_losses(), in this case, just picks up the largest one. One modification I’ll certainly do will be to pick both the relevant entries, but note that in your case it would be better to reprocess the csv in order to generate labels like the ones I was speculating about above: that way you’ll be able to feed the model with scenes containing both homer and marge, getting a prediction like “homer-angry, marge-happy”.
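
A minimal sketch of the csv reprocessing suggested above, using only the standard library. It assumes the file is tab-separated as in the pasted rows (change `delimiter` otherwise). The helper name `combine` and the `marge_angry_0` row are made up for illustration.

```python
import csv
import io

# Rows in the shape pasted earlier in the thread; the marge row is hypothetical.
raw = (
    "img_name\ttags\n"
    "homer_happy_0\thomer happy\n"
    "marge_angry_0\tmarge angry\n"
)

rows = list(csv.DictReader(io.StringIO(raw), delimiter="\t"))

# Splitting the tags on spaces yields four independent classes
flat_classes = sorted({tag for r in rows for tag in r["tags"].split()})
print(flat_classes)  # ['angry', 'happy', 'homer', 'marge']

def combine(tags):
    """Turn 'homer happy' into the single combined label 'homer-happy'."""
    character, expression = tags.split()
    return f"{character}-{expression}"

reprocessed = [{"img_name": r["img_name"], "tags": combine(r["tags"])} for r in rows]
print(reprocessed[0]["tags"])  # homer-happy
```

With labels in this form, a scene containing both characters could carry a space-separated tag string such as "homer-angry marge-happy".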

raimanu-ds · February 17, 2019, 9:00am

You are right about the fact that each image has exactly 2 labels. I didn’t know it would affect the learner to predict 2 labels, that’s interesting.

I don’t quite understand the part where you said:

“One modification I’ll certainly do will be to pick both the relevant entries”

Could you clarify this part a bit ?

I will follow your advice and reprocess the csv, as when I fed the model an image with both marge and homer, it predicted: ‘homer; happy’.

From here, I may rework the dataset a little so that the goal will be recognizing several characters in 1 image, to have a better understanding of how/what the model learns (thanks to you, my understanding has already improved). Then I will add the expression (emotions) element.

Thank you so much for your help!

balnazzar · February 17, 2019, 5:24pm

Yes. Look at the tensor containing the probabilities: it has the outputs of four sigmoids (vs. the output of a softmax you get when you work with single-labeled data). Two of those four entries are near zero. The other two are much greater than zero. Every entry which stands above the threshold you established is a positive prediction.
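
A small sketch of the difference between independent sigmoids and a softmax, in plain Python. The logits and the label order are made-up values chosen to resemble the situation described (two entries near zero, two well above it).

```python
import math

def sigmoid(x):
    """Squash a single logit into (0, 1), independently of the other logits."""
    return 1.0 / (1.0 + math.exp(-x))

def softmax(xs):
    """Normalise all logits jointly so that the outputs sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits, in the order [angry, happy, homer, marge]
logits = [-4.0, 3.0, 2.5, -3.5]

sig = [sigmoid(x) for x in logits]
soft = softmax(logits)

# Sigmoid outputs are independent: two of them exceed 0.5 here
print([round(p, 3) for p in sig])
# Softmax outputs compete for a total mass of 1: only one exceeds 0.5 here
print([round(p, 3) for p in soft])
```

Because the sigmoid outputs do not have to sum to 1, several classes can be positive at once, which is what makes per-class thresholding meaningful in the multi-label case.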

I engineered the method plot_multi_top_losses() to pick the dominant prediction, the one closest to 1, since doing otherwise would have been a bit tricky and prone to potential errors.
One modification I should make would be to look at the threshold established by the user and label as a positive prediction every class whose probability is above that threshold (mimicking the behaviour of learn.predict()). Please appreciate, though, that you can run different training cycles with different thresholds, and this is the part which makes the task a bit tricky.
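
The two strategies described above (dominant prediction vs. every class above the threshold) can be sketched as follows. This is not the library's code: `dominant_label` and `labels_above_threshold` are hypothetical helpers, and the probabilities are invented.

```python
def dominant_label(probs, classes):
    """Return only the class whose probability is largest (closest to 1)."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    return classes[best]

def labels_above_threshold(probs, classes, thresh):
    """Return every class whose probability is strictly above the threshold."""
    return [c for p, c in zip(probs, classes) if p > thresh]

classes = ["angry", "happy", "homer", "marge"]
probs = [0.02, 0.95, 0.92, 0.03]  # made-up sigmoid outputs

print(dominant_label(probs, classes))                # happy
print(labels_above_threshold(probs, classes, 0.5))   # ['happy', 'homer']
print(labels_above_threshold(probs, classes, 0.01))  # all four classes
```

The last call shows why the choice of threshold matters: with a very low threshold every class is reported as positive, so the displayed labels depend on which threshold the interpretation step assumes.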

Great. In that setting, you will be able to predict over a scene which has both personas in it, an ability your current model doesn’t possess.

Thank you for having inspired several ideas about how to improve the method. I’ll work upon them!

armheb · December 10, 2019, 9:48am

Do you think it would be possible to add a heatmap (Grad-CAM) to multilabel interpretation?

balnazzar · December 10, 2019, 5:23pm

Yes, that’s something I should have added from the beginning. Now, with fastai v2 on its way to a final release, I think it would be wiser to invest our time developing add-ons for it!

armheb · December 10, 2019, 8:14pm

Sure. I’m currently working on a multilabel problem using fastai V2, and I’d be glad to help implement the heatmap for multilabel problems in V2. Currently my problem is in the backward part of Grad-CAM: I’m not sure how I should use the predictions in the backward pass. I’d appreciate your guidance.
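
For reference, the standard Grad-CAM formulation backpropagates one class score at a time, not the whole prediction vector. Here y^c is the score for class c, A^k is the k-th feature map of the chosen convolutional layer, and Z is the number of spatial positions in that map. Using the pre-sigmoid logit as y^c, and repeating the computation once per class above the threshold, is an assumption about how this would carry over to the multilabel case, not something confirmed in this thread.

```latex
\alpha_k^{c} = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^{c}}{\partial A^{k}_{ij}},
\qquad
L^{c}_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^{c} \, A^{k} \right)
```

Under that reading, each positive label would get its own heatmap L^c.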

s.s.o · December 10, 2019, 9:42pm

Did you check the course notebook lesson6-pets-more.ipynb? There are some example use cases for the single-label setting.