Hello,
I have a question about a deep learning model I’m building. My goal is to use a CNN with multiple heads that each predict an attribute (single-label as well as multi-label) or regress a value.
I’m using a ResNet as a shared encoder.
Every head has its own loss function, but I want to optimize them jointly. One way is to sum the individual losses, but that requires weighting them, since their magnitudes can differ considerably.
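To make the setup concrete, here is a minimal PyTorch sketch of what I mean by a hand-weighted sum of per-head losses. The tiny convolutional encoder is only a stand-in for the ResNet, and the head names (`cls`, `tags`, `reg`), sizes, and weight values are made up for illustration:

```python
import torch
import torch.nn as nn

class MultiHeadNet(nn.Module):
    """Shared encoder with one linear head per task (illustrative sizes)."""
    def __init__(self, feat_dim=16, n_classes=4, n_tags=3):
        super().__init__()
        # Stand-in for the ResNet encoder: conv -> global pooling -> flat features.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.heads = nn.ModuleDict({
            "cls": nn.Linear(feat_dim, n_classes),  # single-label attribute
            "tags": nn.Linear(feat_dim, n_tags),    # multi-label attribute
            "reg": nn.Linear(feat_dim, 1),          # regressed value
        })

    def forward(self, x):
        z = self.encoder(x)
        return {name: head(z) for name, head in self.heads.items()}

# One loss function per head.
criteria = {
    "cls": nn.CrossEntropyLoss(),
    "tags": nn.BCEWithLogitsLoss(),
    "reg": nn.MSELoss(),
}
# Hand-picked weights: these are the values I would like to avoid tuning.
weights = {"cls": 1.0, "tags": 0.5, "reg": 0.1}

def weighted_loss(outputs, targets):
    """Return the weighted sum of the per-head losses and the individual losses."""
    losses = {k: criteria[k](outputs[k], targets[k]) for k in criteria}
    total = sum(weights[k] * losses[k] for k in losses)
    return total, losses

torch.manual_seed(0)
model = MultiHeadNet()
x = torch.randn(8, 3, 32, 32)  # dummy image batch
targets = {
    "cls": torch.randint(0, 4, (8,)),
    "tags": torch.randint(0, 2, (8, 3)).float(),
    "reg": torch.randn(8, 1),
}
total, losses = weighted_loss(model(x), targets)
total.backward()  # one backward pass updates encoder and all heads jointly
```

With only three heads this is manageable, but every additional head adds another entry to `weights` that has to be tuned.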
Since hand-tuning these weights is really time-intensive if you have more than 5 heads, I’m looking for an easier way to implement this multi-task loss function.
Doing some research, I have stumbled upon the following papers:
But I’m not sure how I would implement the first, and the second paper leaves out critical information (estimating the task-specific variance).
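My rough understanding of "estimating a task-specific variance" is to give each task a learnable log-variance `s` and train it together with the network, weighting each loss by `exp(-s)` and adding `s` as a penalty. The sketch below is an assumption on my part and may not match the paper exactly (for example, it ignores any constant factors that depend on the loss type); the class name `LearnedLossWeights` and the task names are hypothetical:

```python
import torch
import torch.nn as nn

class LearnedLossWeights(nn.Module):
    """One learnable log-variance per task; total = sum(exp(-s_i) * L_i + s_i)."""
    def __init__(self, task_names):
        super().__init__()
        # Initialising s_i = 0 means every task starts with weight exp(0) = 1.
        self.log_vars = nn.ParameterDict(
            {name: nn.Parameter(torch.zeros(())) for name in task_names}
        )

    def forward(self, losses):
        total = 0.0
        for name, loss in losses.items():
            s = self.log_vars[name]
            # exp(-s) down-weights large losses; the + s term keeps s from growing unbounded.
            total = total + torch.exp(-s) * loss + s
        return total

# Toy check with fixed loss values instead of a real network.
weigher = LearnedLossWeights(["cls", "tags", "reg"])
losses = {
    "cls": torch.tensor(2.0),
    "tags": torch.tensor(0.5),
    "reg": torch.tensor(10.0),
}
opt = torch.optim.SGD(weigher.parameters(), lr=0.1)
for _ in range(500):
    opt.zero_grad()
    weigher(losses).backward()
    opt.step()

# For a fixed loss L, the minimum of exp(-s) * L + s is at s = log(L),
# so the learned weight exp(-s) approaches 1 / L.
learned_weights = {k: torch.exp(-v).item() for k, v in weigher.log_vars.items()}
```

In real training the optimizer would receive both the model parameters and `weigher.parameters()`, so the weights adapt as the individual losses change.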
Do you know of a way to handle this kind of multi-task loss?
Thanks, Maik