Creating a multi-label classifier with the data block API

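from fastai.vision import *  # fastai v1

src = (ImageList.from_folder("images")
       .split_by_rand_pct(0.2)
       .label_from_folder())  # one label per image: the parent folder name
data = (src.transform(get_transforms(), size=299)
        .databunch(bs=bs//2)  # bs is assumed to be defined earlier
        .normalize(imagenet_stats))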

I think this should create a multi-label classifier, but predict() still gives me a single Category, so I believe I still have a single-label model (I’m not sure which attributes to inspect to tell).

Possibly I’m just trying to make it do something it’s not meant to do.

I’m using folders each containing single classes, but I want to train a multi-label classifier so that when I give it an image with two things in it, I get a prediction for both classes.

Your problem is this:

i’m using folders each containing single classes

With .label_from_folder(), each image gets exactly one label (the name of its parent folder), so the databunch has one label per image and the model treats this as a single-label classification problem. You’ll need to label your images in a different way, one that lets you provide a list of labels per image, so the model knows it has to predict everything it can see. The example here does it via an auxiliary dataframe.
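As a rough sketch of the auxiliary-table idea (not the linked example itself), the folder layout could be turned into a table with one row per image and a delimiter-joined tag string. The helper names build_label_rows and write_label_csv, the column names name and tags, the ";" delimiter, and the extra_labels mapping are all assumptions made up for illustration; folder names are assumed not to contain the delimiter.

```python
import csv
from pathlib import Path


def build_label_rows(root, extra_labels=None, delim=";"):
    """Return sorted (relative_path, tags) rows for files under root/<class>/.

    extra_labels is a hypothetical mapping from a relative image path to
    additional labels annotated by hand for images that show more than one class.
    """
    extra_labels = extra_labels or {}
    rows = []
    for img in sorted(Path(root).glob("*/*")):
        if not img.is_file():
            continue
        rel = img.relative_to(root).as_posix()
        # Start from the folder name, then append any hand-annotated labels.
        tags = [img.parent.name] + list(extra_labels.get(rel, []))
        rows.append((rel, delim.join(tags)))
    return rows


def write_label_csv(rows, csv_path):
    """Write the rows to a CSV with a header of name,tags."""
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["name", "tags"])
        writer.writerows(rows)


# With fastai v1 the table could then be used roughly like this (untested here):
#   df = pd.read_csv("labels.csv")
#   src = (ImageList.from_df(df, "images", cols="name")
#          .split_by_rand_pct(0.2)
#          .label_from_df(cols="tags", label_delim=";"))
```

Passing label_delim is what switches the labels to a multi-label type, even when most rows carry a single tag.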


Thank you! That was exactly what I needed. Essentially I did the same thing and still fed it one label per image, but once I gave it a delimiter it gave me a multi-label model.

OK, I definitely have a multi-label predictor now, but when I feed it a picture of a nice clown and a scary clown together I still get just one predicted label.


The probs I get for this image are approximately 4e-06 and 1. My single-label classifier was very sure this was a nice clown; the multi-label model says it’s a scary clown.

I’m getting my prediction the same way:

pred_class, pred_idx, probs = learner.predict(img)

What am I missing? Do I need to train it with some mixed images that have both together?
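One way to reason about those numbers is the minimal sketch below. It is not fastai's actual code. It assumes the multi-label head scores each class with an independent sigmoid and keeps every class whose probability reaches a threshold; as far as I know, fastai v1's predict on a multi-label learner takes a thresh keyword for this (0.5 by default), but treat that as an assumption. The function decode_multilabel and the logit values are hypothetical, chosen only to reproduce probabilities close to the reported 4e-06 and 1.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, classes, thresh=0.5):
    """Return (picked_classes, probs), keeping every class with prob >= thresh."""
    probs = [sigmoid(z) for z in logits]
    picked = [c for c, p in zip(classes, probs) if p >= thresh]
    return picked, probs


classes = ["nice_clown", "scary_clown"]

# Hypothetical logits giving roughly the reported probabilities.
picked, probs = decode_multilabel([-12.4, 9.0], classes)
print(picked, probs)

# If the model were confident about both classes, both would be returned.
print(decode_multilabel([2.0, 3.0], classes)[0])
```

With a probability near 4e-06 for one class, no reasonable threshold returns both labels, so the threshold is probably not the issue. If every training image carried exactly one label, the model may simply have learned that the classes never co-occur, which suggests that adding training images tagged with both labels is worth trying.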