Doubts regarding multi-label classification

Thank you so much, this is super helpful.

@kodzaks I’ll also mention: if you want a good example of this, I repurposed PETS to do multi-label classification (as a way to get a model that can say “I don’t know”), which treats a single-label problem in a multi-label framework.

So, this is a label that means “not-all-or-any–other-labels”?

Yes. It works by simply not returning any label at all (see the donkey example at the end of the notebook).

Interesting! So you did not train your model on a “nothing” category where you just put some random stuff; instead, when it sees something that is “not-all-or-any–other-labels”, it returns it as “nothing”?

Yes! Exactly. The reason is that multi-label gives the model what’s called a sigmoid function on each output, so every label is judged against the threshold on its own. A basic example: in a multi-label problem we can have 1, 2, or 3 labels present at a given moment, which means there are times label 2 or label 3 may not be present. It follows that there are cases where no label reaches the threshold we set. This is different from regular classification, where we apply softmax and then argmax: we take all the raw outputs and scale them to 0-1 so that the probabilities sum to 100%, and we take the highest one as our answer. Here we instead look at each label’s probability individually and check whether it is above the threshold. For example, with a threshold of 0.85, a label shows up only if its probability is above .85.
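To make the difference concrete, here is a minimal plain-Python sketch (not the notebook’s code). The class names, the raw scores and the 0.85 threshold are all made up for illustration.

```python
import math

def softmax(logits):
    # Scale raw scores so they are positive and sum to 1.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    # Squash one raw score into (0, 1), independently of the others.
    return 1.0 / (1.0 + math.exp(-x))

def single_label(logits, vocab):
    # Regular classification: softmax + argmax always returns exactly one label.
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best]

def multi_label(logits, vocab, thresh=0.85):
    # Multi-label: keep every label whose own probability clears the threshold.
    return [name for name, x in zip(vocab, logits) if sigmoid(x) > thresh]

vocab = ["cat", "dog"]          # hypothetical classes
confident_dog = [-3.0, 4.0]     # made-up raw scores for a clear dog image
unknown_input = [-2.0, -1.5]    # made-up raw scores for, e.g., a donkey

print(single_label(confident_dog, vocab), multi_label(confident_dog, vocab))
print(single_label(unknown_input, vocab), multi_label(unknown_input, vocab))
```

With softmax + argmax the donkey-like input is still forced into one of the classes, while the thresholded sigmoid version returns an empty list, which is the “I don’t know” behaviour.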

Ok, but what about situations when we have both a dog and a cat in the same image, a single cat in another image, 2 dogs in a 3rd image and a donkey in a 4th? I think I am confused about what “multilabel” means. I thought it was for situations when several classes are present in the same image.

It is, but if used creatively we can repurpose our models to tell us when something may not be what we want our input to be (such as a user inputting a car to an animal classification model).

If we assume that this is, for instance, the binary classifier from lesson 1 (where it’s either cat or dog), the outputs for your four images should look like so:

  1. Dog and cat are present
  2. Cat is present but not dog
  3. Dog is present
  4. Nothing is present

Note that it does not count how many instances of something are there, just whether it is present. For some form of counting mechanism, you’d further want to bring in ideas from, say, object detection or image regression.
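The “presence, not count” point also shows up in how multi-label targets are usually encoded. Below is a small sketch, assuming a hypothetical two-class vocab and the four images from the question; `multi_hot` is an illustrative helper, not a library function.

```python
def multi_hot(labels, vocab):
    # 1.0 if the class appears at least once, else 0.0; duplicates collapse.
    present = set(labels)
    return [1.0 if name in present else 0.0 for name in vocab]

vocab = ["cat", "dog"]  # hypothetical classes
images = [
    ["dog", "cat"],  # 1. a dog and a cat
    ["cat"],         # 2. a single cat
    ["dog", "dog"],  # 3. two dogs
    [],              # 4. a donkey: none of our classes
]

targets = [multi_hot(labels, vocab) for labels in images]
for t in targets:
    print(t)
```

Image 3 gets the same target as an image with a single dog, and image 4 gets an all-zero target, so nothing in this encoding carries a count.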

On 3 though, if it were instead the PETS model from lesson 1 last year, one would presume that 3 would give breed 1 and breed 2 (like Labrador retriever, husky).

Thank you! I will need to think about it more, i.e. what is the best way to go about it. I have, let’s say, 5 classes (object names), and in any image there could be just one class present, or 3 or 5, or 2, in various combinations. Probably object detection will work better then? I need the results to be like this: this image has class 1, class 3 and class 5. But this image has class 1 only, etc., various combinations of classes (i.e. object names?).

I’d say multi-label is exactly what you need here. What I was describing above was the following situation:

I have a picture like so:

(We’ll use puppies because who doesn’t love puppies!)

Object detection (or regression) would be advisable in this situation if we wanted to know how many Rottweiler puppies were present (in this case 5). In a multi-label perspective like discussed earlier, our model would return “Rottweiler”.

Also, multi-label (like you’re describing) could be if we rewrite the scenario like so:


In this case, our model would return that there is a Rottweiler present, a husky, and a golden retriever, despite the fact there are a few more Rottweilers present. Bounding boxes would tell us where each dog is and what each dog’s class is; image regression would simply return a number. You could also then go down a rabbit hole with image regression, but if that’s of interest I’ll explain the whole concept on zoom and upload that :slight_smile: (and what we just discussed here).

Does this help? :slight_smile:

Yes, yes, all of it is of interest :slight_smile: and bounding boxes too, because, I think, ideally, it would be awesome to get as much info as possible from that image, how many dogs, what class each, etc.

Awesome! It’s an interesting idea certainly, I’ll do it on the zoom chat here in say 10 minutes or so, I’ll post about it on the study group thread too :slight_smile:

@kodzaks link to it if you (or anyone reading this now) need it:

Zoom

https://scikit-learn.org/stable/modules/multiclass.html

Video discussion: part 1 part 2

Thank you so much! It was super useful!

My pleasure :slight_smile:

Hi @muellerzr. I am trying to run this notebook you shared and I get the same error I posted here

Any suggestions on how I can resolve this?

I tested just now. Notebook seems fine to me. See my reply here. Thanks.

Yijin

Thanks. I did manage to find a more convoluted workaround. But I will take a look.

I guess the code is looking to see if the imdb_tok folder exists, and will create the folder and the pickle file only if the folder does not exist. So I have to remove the folder for it to create the pickle file.
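Roughly, the behaviour I am guessing at would look like the sketch below. This is not the library’s actual code: `build_cache` and the `cache.pkl` file name are hypothetical, and only the “skip everything if the folder exists” check mirrors what I described.

```python
import pickle
import tempfile
from pathlib import Path

def build_cache(out_dir, compute):
    """Create out_dir and a pickle inside it only if out_dir is missing."""
    out_dir = Path(out_dir)
    pkl = out_dir / "cache.pkl"  # hypothetical file name
    if out_dir.exists():
        # Folder already there: skip all work, even if the pickle is absent.
        return False
    out_dir.mkdir(parents=True)
    with open(pkl, "wb") as f:
        pickle.dump(compute(), f)
    return True

with tempfile.TemporaryDirectory() as tmp:
    tok_dir = Path(tmp) / "imdb_tok"
    tok_dir.mkdir()  # simulate a leftover folder without the pickle
    first_run_created = build_cache(tok_dir, lambda: {"the": 1})
    pickle_missing = not (tok_dir / "cache.pkl").exists()

    tok_dir.rmdir()  # remove the (empty) folder, as in the workaround
    second_run_created = build_cache(tok_dir, lambda: {"the": 1})
    pickle_present = (tok_dir / "cache.pkl").exists()

print(first_run_created, pickle_missing, second_run_created, pickle_present)
```

If the real check works like this, a folder left over from an interrupted run would explain the missing pickle file, and deleting the folder forces it to be rebuilt.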