Multi-output classifier

I am making a multi-output classifier (a classifier with two outputs, each one a softmax). Here is my code.

This is the dataloader, and it works fine:

from fastai.vision.all import *

df = pd.read_csv('dataset_2_column.csv')

label1 = df['label1'].unique()
label2 = df['label2'].unique()

getters = [
    ColReader('path'),
    ColReader('label1'),
    ColReader('label2')]

dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock(vocab=label1), CategoryBlock(vocab=label2)),
    getters=getters,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    item_tfms=[Resize(224, method='squish')],
    n_inp=1
).dataloaders(df, bs=8)

I just need to add two heads to resnet18, define a loss, and train:


class Multihead(Module):
  "A two-headed model given a `body` and `n` output features"
  def __init__(self, body:nn.Sequential, n:L):
    nf = num_features_model(nn.Sequential(*body.children()))
    self.body = body
    self.label1 = create_head(nf, n[0])
    self.label2 = create_head(nf, n[1])
  
  def forward(self, x):
    y = self.body(x)
    label1 = self.label1(y)
    label2 = self.label2(y)
    return [label1, label2]

net = Multihead(body, dls.c)  # body is the resnet18 body created earlier (not shown in the post)
learn = Learner(dls, net)

Now I get this error:

TypeError: forward() takes 2 positional arguments but 3 were given

It seems I just don't know how to make a proper custom two-head output and a proper loss function.
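
For context, what I had in mind for the loss is something like this minimal sketch (assuming the model returns [label1, label2] and the dataloaders yield (x, label1, label2)); I am not sure it is correct:

# Sum the cross-entropy of each head against its own target
def combined_loss(preds, y1, y2):
    return F.cross_entropy(preds[0], y1) + F.cross_entropy(preds[1], y2)

learn = Learner(dls, net, loss_func=combined_loss)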

Dropping some resources related to this topic that Jeremy has created in case they help. He explores creating a multi-head classifier in Live Coding 12 and you can see a bit more of his code in Live Coding 13.

In Live Coding 12 he defined a multi-head classifier as follows, for the Paddy Doctor Kaggle Competition, where the goal is to classify images by disease type (the second output he adds predicts the variety of the plant in the image). Here m is a convnext_tiny_in22k model that is modified inside the __init__ method:

class DiseaseAndTypeClassifier(nn.Module):
    # the constructor
    def __init__(self, m):
        super().__init__() # always call the superclass init to construct the object
        self.l1 = nn.Linear(512, 10, bias=False) # variety
        self.l2 = nn.Linear(512, 10, bias=False) # disease
        del(m[1][-1]) # delete the last layer of the model's head
        self.m = m # model
        
    def forward(self, x):
        x = self.m(x)
        x1 = self.l1(x) # variety output
        x2 = self.l2(x) # disease output
        return x1, x2

He also has a simpler built-in approach to multi-target classification in his notebook Multi-Target: Road to the Top, Part 4.

Thank you very, very much for guiding me. I will check them out.


@vbakshi
I want to ask you another question; maybe you can help me with this too:

I am making a vehicle make recognizer for 17 classes (BMW, Kia, …). After I finished the classifier I faced an issue: makes outside the 17 classes may get classified as one of the 17, and even with a high threshold on the softmax probability I still can't filter them out, as cars are highly similar.

So I am trying to make a classifier with two outputs: one for the 17 classes, and one for whether the car is from an "other" class or not (if I just add an "other" class to the classifier, it ruins the classifier's accuracy). With two outputs the network should learn both tasks, and after that I will also use a high threshold on the 17-class head to be sure I remove most of the "other" classes, roughly as in the sketch below.
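
At inference the thresholding I have in mind is roughly this (make_logits stands for the 17-class head's output; the 0.9 cutoff is just a placeholder value):

probs = F.softmax(make_logits, dim=1)
conf, idx = probs.max(dim=1)
# keep the predicted make only when confident enough, else mark as unknown (-1)
pred = torch.where(conf >= 0.9, idx, torch.full_like(idx, -1))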

Do you have any other ideas that may be better than this?

i.e., how can I detect any car outside the 17 classes and have it classified as unknown or none?

Also, in the built-in approach of the multi-target classification notebook, the output is a single softmax over both targets combined, so I can't get a probability for each target, although the notebook mentions that I can get the two probabilities. Am I missing something?

Here is a tutorial from Zach Mueller (a former fastai student) who walks through how to use MultiCategoryBlock, accuracy_multi (with a high threshold), and a loss function (with a low threshold during training and a high threshold during inference) in order to get your model to return an empty list when it predicts an unknown class.
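
The key pieces look roughly like this (a sketch of the idea rather than his exact code; it assumes dls was built with MultiCategoryBlock, and the threshold values are illustrative):

# Treat the problem as multi-label even though each image has one label,
# so the model is allowed to predict nothing when it is unsure.
loss_func = BCEWithLogitsLossFlat(thresh=0.2)    # lenient threshold for training
metrics = partial(accuracy_multi, thresh=0.95)   # strict threshold for reporting
learn = vision_learner(dls, resnet34, loss_func=loss_func, metrics=metrics)
learn.fine_tune(4)
learn.loss_func.thresh = 0.95  # raise before predicting: unsure images decode to []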

I have recreated his tutorial code in this Colab notebook and was able to reproduce his results. Although I don't fully understand why this works, it does seem to work.

The model is trained on the PETS dataset, which has cats and dogs. When I give the trained model an image of a bulldozer, it returns an empty list because it is not confident in its prediction (due to the high loss threshold).


I think if you apply the same approach to your car classification problem, you might get similar results.

I have tried Zach Mueller's approach before and didn't get good results, but since it worked for you I will give it another shot. The problem is that cars are very similar: one car may look like another with only a very slight difference, so it will be hard for the network to say that a car is not from the 17 classes. And the reverse also happens: two cars may both be BMWs, for example, and yet look quite different.

Also, with the built-in approach from the multi-target classification notebook, I got the same error as here: FastAI Errors: Vision learner with 2 outputs | Kaggle

In the multi-target notebook, the outputs seem to be the raw activations, since they do not add up to 1.0:

Here are the 20 outputs (the first 10 for the disease class and the second 10 for the variety class):

In order to classify the image, he takes the argmax of the first 10 predictions for disease and of the second 10 predictions for variety:


Here are the sums of the preds I'm getting, in total and for each target (note that they do not sum to 1):
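
If you want a proper probability for each target, you can softmax each slice of the activations separately; a minimal sketch, assuming preds has 20 columns with disease first:

probs_disease = F.softmax(preds[:, :10], dim=1)  # each row now sums to 1
probs_variety = F.softmax(preds[:, 10:], dim=1)
pred_disease = probs_disease.argmax(dim=1)
pred_variety = probs_variety.argmax(dim=1)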

I am also getting an error when running learn.show_results() in the multi-target notebook, but I haven't looked into why yet.

Can you share your code as a Colab or Kaggle notebook? It will be easier to troubleshoot with the actual data and code you are running.

First, thank you for your guidance, which led me to the solution.

I followed the notebook from multi-task-learning-with-pytorch-and-fastai and it works fine: two heads, each outputting probabilities that sum to 1.

I have only one problem now. I used resnet (and also convnext) as the pretrained model and then added two heads to it. I want to use beitv2_base_patch16_224 as the pretrained model and add two heads to it in the same way, but this didn't work, since beitv2_base has a different structure; I tried and failed to make it work:

class MultiTaskModel(nn.Module):
    def __init__(self, arch, ps=0.5):
        super(MultiTaskModel, self).__init__()
        self.encoder = create_body(arch, cut=-4)
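        # NB: create_head's first argument (384 here) must match the number of
        # features the cut encoder actually outputs, which depends on arch and cut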
        self.fc1 = create_head(384, 2, ps=ps)
        self.fc2 = create_head(384, 3, ps=ps)

    def forward(self, x):
        features = self.encoder(x)
        class1 = self.fc1(features[-1])  # Assuming the last layer's output is used
        class2 = self.fc2(features[-1])  # Assuming the last layer's output is used

        return [class1, class2]


model = resnet18(pretrained=True)

#model = timm.create_model("beitv2_base_patch16_224", pretrained=True) # to try it 

# Create the body of the model
#body = create_body(model, pretrained=True)

model = MultiTaskModel(model, ps=0.25)

Do you have any guidance?

Awesome, glad I could help! Can you provide more detail on what you mean by you “failed to make it work”?

I was able to run the following code without error on Kaggle:

import timm
from fastai.vision.all import *
from fastcore.all import *

class MultiTaskModel(nn.Module):
    def __init__(self, arch, ps=0.5):
        super(MultiTaskModel, self).__init__()
        self.encoder = create_body(arch, cut=-4)
        self.fc1 = create_head(384, 2, ps=ps)
        self.fc2 = create_head(384, 3, ps=ps)

    def forward(self, x):
        features = self.encoder(x)
        class1 = self.fc1(features[-1])  # Assuming the last layer's output is used
        class2 = self.fc2(features[-1])  # Assuming the last layer's output is used

        return [class1, class2]

model = timm.create_model("beitv2_base_patch16_224", pretrained=True)
MultiTaskModel(model, ps=0.25)

The resulting model now shows the fc1 and fc2 layers.

Yes, the part you mentioned works, but when I get to the training part:

learn = Learner(dls, model, loss_func=loss_func, metrics=metrics)

learn.fine_tune(10)

I get this error:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Input In [10], in <cell line: 3>()
      1 #learn.fit_one_cycle(0,lr=1e-2)
      2 lr = 0.001
----> 3 learn.fine_tune(0, lr)

File ~/.local/lib/python3.8/site-packages/fastai/callback/schedule.py:165, in fine_tune(self, epochs, base_lr, freeze_epochs, lr_mult, pct_start, div, **kwargs)
    163 "Fine tune with `Learner.freeze` for `freeze_epochs`, then with `Learner.unfreeze` for `epochs`, using discriminative LR."
    164 self.freeze()
--> 165 self.fit_one_cycle(freeze_epochs, slice(base_lr), pct_start=0.99, **kwargs)
    166 base_lr /= 2
    167 self.unfreeze()

File ~/.local/lib/python3.8/site-packages/fastai/callback/schedule.py:119, in fit_one_cycle(self, n_epoch, lr_max, div, div_final, pct_start, wd, moms, cbs, reset_opt, start_epoch)
    116 lr_max = np.array([h['lr'] for h in self.opt.hypers])
    117 scheds = {'lr': combined_cos(pct_start, lr_max/div, lr_max, lr_max/div_final),
    118           'mom': combined_cos(pct_start, *(self.moms if moms is None else moms))}
--> 119 self.fit(n_epoch, cbs=ParamScheduler(scheds)+L(cbs), reset_opt=reset_opt, wd=wd, start_epoch=start_epoch)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:264, in Learner.fit(self, n_epoch, lr, wd, cbs, reset_opt, start_epoch)
    262 self.opt.set_hypers(lr=self.lr if lr is None else lr)
    263 self.n_epoch = n_epoch
--> 264 self._with_events(self._do_fit, 'fit', CancelFitException, self._end_cleanup)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:253, in Learner._do_fit(self)
    251 for epoch in range(self.n_epoch):
    252     self.epoch=epoch
--> 253     self._with_events(self._do_epoch, 'epoch', CancelEpochException)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:247, in Learner._do_epoch(self)
    246 def _do_epoch(self):
--> 247     self._do_epoch_train()
    248     self._do_epoch_validate()

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:239, in Learner._do_epoch_train(self)
    237 def _do_epoch_train(self):
    238     self.dl = self.dls.train
--> 239     self._with_events(self.all_batches, 'train', CancelTrainException)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:205, in Learner.all_batches(self)
    203 def all_batches(self):
    204     self.n_iter = len(self.dl)
--> 205     for o in enumerate(self.dl): self.one_batch(*o)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:235, in Learner.one_batch(self, i, b)
    233 b = self._set_device(b)
    234 self._split(b)
--> 235 self._with_events(self._do_one_batch, 'batch', CancelBatchException)

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:199, in Learner._with_events(self, f, event_type, ex, final)
    198 def _with_events(self, f, event_type, ex, final=noop):
--> 199     try: self(f'before_{event_type}');  f()
    200     except ex: self(f'after_cancel_{event_type}')
    201     self(f'after_{event_type}');  final()

File ~/.local/lib/python3.8/site-packages/fastai/learner.py:216, in Learner._do_one_batch(self)
    215 def _do_one_batch(self):
--> 216     self.pred = self.model(*self.xb)
    217     self('after_pred')
    218     if len(self.yb):

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

Input In [5], in MultiTaskModel.forward(self, x)
     22 def forward(self, x):
---> 23     features = self.encoder(x)
     24     class1 = self.fc1(features[-1])  # Assuming the last layer's output is used
     25     class2 = self.fc2(features[-1])  # Assuming the last layer's output is used

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/container.py:139, in Sequential.forward(self, input)
    137 def forward(self, input):
    138     for module in self:
--> 139         input = module(input)
    140     return input

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:1130, in Module._call_impl(self, *input, **kwargs)
   1126 # If we don't have any hooks, we want to skip the rest of the logic in
   1127 # this function, and just call forward.
   1128 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1129         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1130     return forward_call(*input, **kwargs)
   1131 # Do not call functions when jit is used
   1132 full_backward_hooks, non_full_backward_hooks = [], []

File ~/.local/lib/python3.8/site-packages/torch/nn/modules/module.py:201, in _forward_unimplemented(self, *input)
    190 def _forward_unimplemented(self, *input: Any) -> None:
    191     r"""Defines the computation performed at every call.
    192 
    193     Should be overridden by all subclasses.
   (...)
    199         registered hooks while the latter silently ignores them.
    200     """
--> 201     raise NotImplementedError(f"Module [{type(self).__name__}] is missing the required \"forward\" function")

NotImplementedError: Module [ModuleList] is missing the required "forward" function.

I found this PyTorch forum post which discusses a similar issue; it makes me think that perhaps the problem is that you are returning a list from your forward method.

So instead of this:

def forward(self, x):
    features = self.encoder(x)
    class1 = self.fc1(features[-1])  # Assuming the last layer's output is used
    class2 = self.fc2(features[-1])  # Assuming the last layer's output is used

    return [class1, class2]

What if you tried something like this:

def forward(self, x):
    features = self.encoder(x)
    class1 = self.fc1(features[-1])  # Assuming the last layer's output is used
    class2 = self.fc2(features[-1])  # Assuming the last layer's output is used
    output = torch.cat([class1, class2], dim=1)  # cat along the class dim, not the batch dim
    return output

This gives me almost the same error:

---------------------------------------------------------------------------
NotImplementedError                       Traceback (most recent call last)
Input In [11], in <cell line: 3>()
      1 #learn.fit_one_cycle(0,lr=1e-2)
      2 lr = 0.001
----> 3 learn.fine_tune(0, lr)

[... same stack as above, again failing at features = self.encoder(x) ...]

NotImplementedError: Module [ModuleList] is missing the required "forward" function

Hi Ahmed,

I looked through your code and implemented the same in a Kaggle notebook. It is working without errors, even the training part. I am sharing the link to the notebook here. Hope this helps.

Let me know if I am missing something in the notebook.


@vinayjose thanks for putting together a working example. I think one thing missing in your notebook is that you are not assigning the MultiTaskModel to the model variable, so model stays as the original pretrained beitv2_base_patch16_224.

If I assign it to model and try to train it, I get the same error as Ahmed.

Haven’t looked into debugging it yet. Will take a look this weekend.

Okay, I at least got it to train (changed to resnet34, changed the MultiTaskModel definition to match the original blog post, and ignored the metrics for now).
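
For reference, the working version looks roughly like this (a sketch following the blog post's pattern rather than the exact notebook code; num_features_model sizes the heads instead of the hard-coded 384):

class MultiTaskModel(nn.Module):
    def __init__(self, arch, ps=0.5):
        super().__init__()
        self.encoder = create_body(arch, cut=-2)   # drop resnet's pooling + fc
        nf = num_features_model(self.encoder)      # 512 for resnet34
        self.fc1 = create_head(nf, 2, ps=ps)
        self.fc2 = create_head(nf, 3, ps=ps)

    def forward(self, x):
        features = self.encoder(x)  # one (bs, nf, h, w) feature map, not a list
        return [self.fc1(features), self.fc2(features)]

model = MultiTaskModel(resnet34(pretrained=True), ps=0.25)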

The next thing to figure out is what breaks/changes when switching over to beitv2_base_patch16_224.
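
One idea worth trying (an untested sketch, not something verified in this thread): skip create_body for the ViT entirely and let timm return pooled features via num_classes=0, so no bare ModuleList is left on the forward path. The features come back already pooled, so simple linear heads stand in for create_head:

class BeitMultiTaskModel(nn.Module):  # hypothetical name
    def __init__(self, ps=0.5):
        super().__init__()
        # num_classes=0 tells timm to return pooled features from forward()
        self.encoder = timm.create_model(
            "beitv2_base_patch16_224", pretrained=True, num_classes=0)
        nf = self.encoder.num_features  # 768 for beitv2_base
        self.fc1 = nn.Sequential(nn.Dropout(ps), nn.Linear(nf, 2))
        self.fc2 = nn.Sequential(nn.Dropout(ps), nn.Linear(nf, 3))

    def forward(self, x):
        feats = self.encoder(x)  # shape (bs, 768), already pooled
        return [self.fc1(feats), self.fc2(feats)]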
