Multi-Task loss function


I have a question about a deep learning model I’m building. My goal is to use a CNN with multiple heads that each predict an attribute (single-label as well as multi-label) or regress a value.

I’m using a ResNet as a shared encoder.
Every head has its own loss function, but I want to optimize all heads jointly. Summing the individual losses is one way to do that, but it requires weighting the losses, since some will be much larger or smaller than others.
Hand-tuning these weights is really time-intensive once you have more than 5 heads, so I’m looking for an easier way to implement this multi-task loss function.
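For context, the weighted sum I mean is just this; a minimal sketch, where the head names, loss values, and weights are made-up illustration numbers, not from any real model:

```python
def weighted_multitask_loss(losses, weights):
    """Combine per-head losses with fixed, hand-tuned weights."""
    assert len(losses) == len(weights)
    return sum(w * l for w, l in zip(weights, losses))

# Example: three heads (single-label classification, multi-label, regression).
# Without weighting, the regression loss would dominate the sum.
losses = [2.3, 0.7, 110.0]
weights = [1.0, 1.0, 0.01]  # hand-tuned to bring the losses to a similar scale
total = weighted_multitask_loss(losses, weights)
print(total)  # 4.1
```

The pain point is exactly those `weights`: every new head adds another hyperparameter to tune by hand.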

Doing some research, I stumbled upon the following papers:

But I’m not sure how I would implement the first, and the second paper leaves out critical information (how to estimate the task-specific variance).
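For reference, one common way to learn task-specific variances (homoscedastic uncertainty weighting) is to learn a log-variance s_i = log σ_i² per head and minimize Σ exp(−s_i)·L_i + s_i, so each weight is trained rather than hand-tuned. A minimal numeric sketch of that formula, assuming in a real model the `log_vars` would be trainable parameters (e.g. `nn.Parameter` in PyTorch) optimized alongside the network weights:

```python
import math

def uncertainty_weighted_loss(losses, log_vars):
    """Uncertainty weighting: each task loss is scaled by the learned
    precision exp(-s_i), and the +s_i term keeps the variances from
    growing without bound (which would zero out every loss)."""
    return sum(math.exp(-s) * l + s for l, s in zip(losses, log_vars))

# With all s_i = 0 (sigma_i = 1) this reduces to a plain sum of the losses:
losses = [2.3, 0.7, 110.0]
print(uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0]))  # 113.0

# A large learned log-variance automatically down-weights a noisy head:
print(uncertainty_weighted_loss(losses, [0.0, 0.0, 5.0]))
```

The usual practical detail is that the exp(−s_i) parameterization is numerically safer than learning σ_i² directly, since the weight stays positive for any real s_i.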

Do you know of a way to handle this kind of multi-task loss?

Thanks, Maik

@MaikRos Did you find any solution? I am stuck on a similar problem: I am trying to do image classification and localization with the same model.