Image Segmentation: Loading a third image for a loss function

Hi everyone,

I came across the boundary loss function, which uses a distance map computed from the ground truth. I have successfully ported the authors' PyTorch implementation to work with fastai, but it is by no means fast if the distance map is computed on the fly. In their implementation they use pre-computed maps, and it would surely be trivial to compute and save these maps as images so they could be loaded like the other images. I have been looking around for some hours now without finding a simple, practical way of loading an image, its ground truth, and a corresponding third image containing the distance map using the data block API, callbacks, etc.
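For reference, the distance map can be pre-computed offline with SciPy's Euclidean distance transform. Below is a minimal sketch; the function name and the sign convention (negative inside the object, positive outside) are my own assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def signed_distance_map(mask):
    """Signed Euclidean distance to the boundary of a binary ground-truth
    mask: negative inside the object, positive outside. (Sketch only; the
    boundary-loss paper's exact convention may differ.)"""
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():                       # empty mask: no boundary
        return np.zeros(mask.shape, dtype=np.float32)
    inside = distance_transform_edt(mask)    # distance to nearest background
    outside = distance_transform_edt(~mask)  # distance to nearest foreground
    return (outside - inside).astype(np.float32)
```

Running this once per ground-truth mask and saving the result (e.g. with `np.save`) avoids recomputing it every epoch.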

If you were to do this, how would you go about it?

Thank you in advance


I’m not an expert, but I would create a custom ImageTuple class (similar to what you can read in the custom data tutorial).

@sebbecht, did you find an elegant way to do this?
I’m currently looking to load a mask into a custom loss function.
My main objective is to do targeted image restoration using a mask to add weighting to certain pixels.

Concatenate mask and your weights as a single target tensor and split it in your custom loss function
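As a sketch of that idea, assuming PyTorch and a binary segmentation target (the function name `weighted_bce_loss` is mine, not from the thread):

```python
import torch
import torch.nn.functional as F

def weighted_bce_loss(pred, target):
    """`target` carries both the mask and the per-pixel weight map,
    stacked along the channel dimension; split them back out here."""
    mask, weights = target[:, :1], target[:, 1:]          # (N,1,H,W) each
    per_pixel = F.binary_cross_entropy_with_logits(
        pred, mask, reduction='none')
    return (per_pixel * weights).mean()

# Building the combined target from a mask and a weight map:
# target = torch.cat([mask, weight_map], dim=1)           # (N, 2, H, W)
```

The loader then only ever sees one target tensor, and the split happens entirely inside the loss.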

Yes, that’s exactly how I decided to solve it, although I did shelve the issue for a bit while working on my dataset. I think this is the simplest way of doing it with a fastai loader, but I might end up writing a custom training loop and loaders and applying transforms etc. manually anyway. Let me know how it goes!

Thanks for your replies guys, that’s a great solution!
I’m struggling however to figure out where exactly to concatenate the tensors…?

So I have:
src = ImageImageList.from_folder(path_scratched_imgs).split_by_rand_pct(0.1, seed=42)

Followed by:

get_target = lambda x: path_imgs/

data = (src
        .label_from_func(get_target)
        .transform(get_transforms(), tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats, do_y=True))

My understanding may be flawed, but I believe `src` and `data` just hold the paths to the images, not the tensors themselves; so would the concatenation occur where the image is actually loaded into memory, maybe in some get_item function?

Thanks for your help

I think you can load the mask in the get_target function. Instead of making it a lambda, use PIL/OpenCV to load the mask and return it.

Hi Warren, in my case it’s segmentation, which means I have a corresponding ground-truth mask on disk as a PNG. For every mask, I will load it in, calculate my second tensor (i.e. the weight map), concatenate it to the mask, and re-save it. That way it’s loaded in as normal, but in your loss function you split what was loaded. Make sense?
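A minimal sketch of that precompute-and-re-save step; `weight_fn` is a stand-in for whatever per-pixel weighting you actually use, and all names are hypothetical:

```python
import numpy as np
from PIL import Image

def pack_mask_and_weights(mask_path, weight_fn, out_path):
    """Load a ground-truth mask, compute its per-pixel weight map with
    weight_fn, stack the two as channels, and save them together so the
    loss function can split them back apart at train time."""
    mask = np.array(Image.open(mask_path), dtype=np.float32)
    weights = weight_fn(mask)                       # same (H, W) shape
    np.save(out_path, np.stack([mask, weights]))    # shape (2, H, W)
```

Saving as `.npy` rather than PNG sidesteps the fact that PNG channels are integer-valued, which matters if the weights are floats.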

Hi Warren, I recently had the same problem. Since I use some extensive augmentation, I needed the weights to rotate, stretch, and crop the same way as my input image and labels. I used a custom ItemBase class for my custom SegmentationItemLabelList with weights and labels, which are transformed separately with the same values. You can find my code here. Just let me know if you have any questions.

Thanks guys for all your help!

I actually managed to fix the issue I was having a few months back by loading the mask into the alpha channel of the PNG images. I then split the PNG into the RGB image and the mask in the loss function, as suggested by @sebbecht. I did have to alter some standard functions, however, so that the 4-channel images would work with the standard code.
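For anyone wanting to try the same trick, here is a rough sketch of the pack-and-split steps; the paths and function names are my own illustration, not the poster's actual code:

```python
import numpy as np
from PIL import Image

def pack_mask_into_alpha(rgb_path, mask_path, out_path):
    """Store a single-channel mask in the alpha channel of the RGB image,
    so a standard image loader picks up both from one file."""
    rgb = Image.open(rgb_path).convert('RGB')
    mask = Image.open(mask_path).convert('L')
    rgb.putalpha(mask)           # converts the image to RGBA in place
    rgb.save(out_path)           # needs a format with alpha, e.g. PNG

def split_alpha(batch):
    """In the loss function: split a (N, 4, H, W) batch back into the
    (N, 3, H, W) image and the (N, 1, H, W) mask."""
    return batch[:, :3], batch[:, 3:]
```

Note that mask values are clipped to 0–255 by the alpha channel, so float weights would need the `.npy` approach instead.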

I hope to clean up my own code and make a GitHub repo; then I’ll be able to share more precisely how I went about this. I probably would have been better off creating my own image classes and data loaders, but I was under pressure to get the MVP out.

Thanks again.