Simplest example to implement multi-task learning

[Moving this question from applications to part1-v2 here]

I have collected some image data for classification tasks A and B. I fine-tuned a pre-trained model (say ResNet or Inception-v3) on task A via transfer learning, but I am not happy with its 50% accuracy. I want to learn task B as well, so I plan to bring in the task B labels. However, transfer learning on the task B training data does not help either task A or task B, which is when I stumbled upon multi-task learning. I am quoting from Han Xiao’s blog post (link below) for a better description:

“One interesting observation here is that the tasks are highly related: knowing the labels of one task could help one to guess the labels of another task. So perhaps it would be a good idea to train all those tasks in the same neural network simultaneously, hoping that the commonality across the tasks will be exploited during the learning and improve the final performance.”

I have come across a few tutorials online:

Jonathan Goodwin’s MTL
Han Xiao’s MTL with Fashion-MNIST
Kajal Gupta’s MTL with Deep NNs
Ruder’s Multi-task learning

Am I missing any other good CNN-centric examples that would help me understand this?

What is the easiest way to get started? I strongly feel this can give us an edge in competitions such as Kaggle.


I’m also interested in this topic.
I have never tried it. My first thought is to construct a model with multiple outputs: e.g. take ResNet, then add one dense layer for task 1 and another dense layer for task 2 (it’s easy to do in Keras, see this for example; not sure how easy it will be to do in the fastai library). Then train the model on data from both tasks alternately: task 1 on one mini-batch, then task 2 on one mini-batch, then task 1 again. Switching between tasks frequently should prevent the model from forgetting one task while learning the other.
This is just speculation and needs to be tested.
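
To make the idea concrete, here is a minimal sketch in plain PyTorch (my choice for illustration, not something from the tutorials above). The names TwoHeadNet and fake_batch are made up, the trunk is a tiny linear layer standing in for a pre-trained ResNet, and the data is random, so it only demonstrates the mechanics of a shared body with one head per task and alternating mini-batches. It says nothing about whether accuracy actually improves.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TwoHeadNet(nn.Module):
    """Shared trunk with one linear head per task (hypothetical name)."""

    def __init__(self, in_features, hidden, n_classes_a, n_classes_b):
        super().__init__()
        # Assumption: a small linear layer stands in for a pre-trained backbone.
        self.trunk = nn.Sequential(nn.Linear(in_features, hidden), nn.ReLU())
        # One output layer per task, both fed by the same trunk features.
        self.heads = nn.ModuleDict({
            "a": nn.Linear(hidden, n_classes_a),
            "b": nn.Linear(hidden, n_classes_b),
        })

    def forward(self, x, task):
        # Only the head of the requested task is evaluated.
        return self.heads[task](self.trunk(x))


def fake_batch(n_classes, batch_size=16, in_features=32):
    """Random inputs and labels in place of a real DataLoader (hypothetical helper)."""
    x = torch.randn(batch_size, in_features)
    y = torch.randint(0, n_classes, (batch_size,))
    return x, y


n_classes = {"a": 3, "b": 5}
model = TwoHeadNet(in_features=32, hidden=64,
                   n_classes_a=n_classes["a"], n_classes_b=n_classes["b"])
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(20):
    # Alternate tasks on every mini-batch: a, b, a, b, ...
    task = "a" if step % 2 == 0 else "b"
    x, y = fake_batch(n_classes[task])
    optimizer.zero_grad()
    loss = loss_fn(model(x, task), y)
    loss.backward()   # gradients reach the shared trunk and the active head only
    optimizer.step()
```

With a real backbone, the trunk could be something like a torchvision ResNet whose final fc layer is replaced by nn.Identity(), with the head input size set to match its feature width. I have not tested that variant here.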


Yeah I’m planning to teach this either towards the end of part 1, or early part 2 - I haven’t written anything in fastai to support it yet, although I’m not sure it’ll need much if any additional code.

In Keras, lesson 7 shows how to do it, BTW.


Can we implement the multitasking model using fastai custom_head?