Hello!

I don’t know if I’m posting this in the right category, but here is my problem:

My inputs are 4-channel images: the first three channels are an aerial image, and the fourth channel is a raster of a generated, displaced polygon.

I want to predict both an affine transformation matrix and a segmentation mask.

I plugged a Spatial Transformer Network into a regular U-Net, so the model outputs both the mask and the affine transformation matrix.
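To make the two-output setup concrete, here is a stripped-down sketch in plain PyTorch. It is not my actual architecture: `TwoHeadNet` is a made-up name, the tiny encoder is only a stand-in for the U-Net, and the pooled linear layer is only a stand-in for the STN localisation branch. The one thing it is meant to show is a forward pass that returns a tuple `(mask_logits, theta)`, with `theta` initialised to the identity transform.

```python
import torch
import torch.nn as nn


class TwoHeadNet(nn.Module):
    """Toy stand-in for a U-Net with an STN-style branch (hypothetical, for illustration)."""

    def __init__(self, in_channels=4, mask_channels=3, width=16):
        super().__init__()
        # Shared feature extractor (a real U-Net would go here).
        self.encoder = nn.Sequential(
            nn.Conv2d(in_channels, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Head 1: per-pixel mask logits at the input resolution.
        self.mask_head = nn.Conv2d(width, mask_channels, kernel_size=1)
        # Head 2: six numbers per image, reshaped to a 2x3 affine matrix.
        self.theta_head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(width, 6),
        )
        # Start from the identity transform: zero weights, identity bias.
        nn.init.zeros_(self.theta_head[-1].weight)
        with torch.no_grad():
            self.theta_head[-1].bias.copy_(
                torch.tensor([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
            )

    def forward(self, x):
        feats = self.encoder(x)
        mask_logits = self.mask_head(feats)            # (N, 3, H, W)
        theta = self.theta_head(feats).view(-1, 2, 3)  # (N, 2, 3)
        return mask_logits, theta
```

So for a batch of shape `(N, 4, 256, 256)` the model returns a mask of shape `(N, 3, 256, 256)` and a matrix of shape `(N, 2, 3)`.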

I have already looked into a few ways to combine multiple losses in fastai so that both outputs are optimized.
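For context, the kind of combined loss I have in mind looks roughly like this in plain PyTorch. It is only a sketch under assumptions: `MaskAffineLoss` and `affine_weight` are names I made up, I am treating the mask as three independent binary channels (hence BCE with logits), and I am using a plain MSE on the six matrix entries for the affine part. Other choices for either term may well be better.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MaskAffineLoss(nn.Module):
    """Weighted sum of a segmentation loss and an affine-matrix loss (hypothetical sketch)."""

    def __init__(self, affine_weight=1.0):
        super().__init__()
        # Relative weight of the affine term; a placeholder value to tune.
        self.affine_weight = affine_weight

    def forward(self, preds, mask_target, theta_target):
        # The model is assumed to return a (mask_logits, theta) tuple.
        mask_logits, theta_pred = preds
        # Assumption: mask_target is a float tensor of 0/1 values, shape (N, 3, H, W).
        seg_loss = F.binary_cross_entropy_with_logits(mask_logits, mask_target)
        # Assumption: theta tensors have shape (N, 2, 3).
        affine_loss = F.mse_loss(theta_pred, theta_target)
        return seg_loss + self.affine_weight * affine_loss
```

The loss takes the prediction tuple plus the two targets as separate arguments, so it only works once each batch actually carries both targets.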

But I don’t know how to set up two labels/targets: one for the segmentation (of shape (256, 256, 3)) and one for the affine transformation (of shape (2, 3)). I’m guessing I should look into the data block API, but I don’t really know which direction to go in…
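Outside of fastai, this is roughly how I would express the two targets with a plain PyTorch `Dataset`. Again a sketch with assumptions: `MaskAffineDataset` is a hypothetical name, and I am assuming the images, masks and matrices are already in memory as channels-last tensors. What I am looking for is the data block equivalent of this.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class MaskAffineDataset(Dataset):
    """Yields (image, (mask, theta)) so each sample carries two targets (hypothetical sketch)."""

    def __init__(self, images, masks, thetas):
        # Assumed shapes per sample: image (H, W, 4), mask (H, W, 3), theta (2, 3).
        if not (len(images) == len(masks) == len(thetas)):
            raise ValueError("images, masks and thetas must have the same length")
        self.images = images
        self.masks = masks
        self.thetas = thetas

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Convert channels-last (H, W, C) to the channels-first (C, H, W) layout PyTorch expects.
        image = self.images[idx].permute(2, 0, 1).float()
        mask = self.masks[idx].permute(2, 0, 1).float()
        theta = self.thetas[idx].float()
        return image, (mask, theta)


# Usage with random placeholder data (not real aerial images).
images = torch.rand(5, 256, 256, 4)
masks = torch.randint(0, 2, (5, 256, 256, 3))
thetas = torch.rand(5, 2, 3)
loader = DataLoader(MaskAffineDataset(images, masks, thetas), batch_size=2)
x, (mask_batch, theta_batch) = next(iter(loader))
```

The default collate function batches the nested target pair element by element, so `mask_batch` has shape `(2, 3, 256, 256)` and `theta_batch` has shape `(2, 2, 3)`.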