What is the best way to generate a training batch for a semantic segmentation project with multiple classes?

I have a VGG network that I trained with only two classes. I would like to add additional classes to my trained network.

I'm not sure this was the right way to prepare the data: I went through the labeled images and saved a separate image for each label (essentially one-hot encoding the training set). What would be the best way to generate a batch of those images + labels to feed into my network? I am also having a hard time visualizing whether I should stack them or train on one label at a time.

Thank you so much!

Here is an example of what some of my images look like:


@sgugger , tagging you if you don’t mind :slightly_smiling_face:

I’m not sure what you’re trying to achieve. Do you mean each pixel could belong to one of several classes? In that case you should prepare your target masks with pixel values of 0, 1, 2, …, c-1 (with c the number of classes), use tfm_y = TfmType.CLASS, and change your network so that instead of applying a sigmoid to each pixel at the end, you apply a softmax.
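To make the softmax suggestion concrete, here is a minimal NumPy sketch (not fastai code) of what the end of the network would compute: a softmax over the class axis for every pixel, scored against an integer mask with values 0 to c-1. The function names, the (c, H, W) layout, and the toy shapes are assumptions made for illustration only.

```python
import numpy as np

def softmax(logits, axis=0):
    # Subtract the per-pixel max for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def pixelwise_cross_entropy(logits, target):
    # logits: (c, H, W) raw scores from the last layer (assumed layout).
    # target: (H, W) integer mask with class indices in 0..c-1.
    probs = softmax(logits, axis=0)
    rows, cols = np.indices(target.shape)
    # Pick, for every pixel, the probability assigned to its true class.
    picked = probs[target, rows, cols]
    return -np.log(picked).mean()

# Toy example: 3 classes on a 2x2 image (hypothetical values).
c, h, w = 3, 2, 2
rng = np.random.default_rng(0)
logits = rng.normal(size=(c, h, w))
target = np.array([[0, 1], [2, 1]])

probs = softmax(logits)          # per-pixel class probabilities, sum to 1 over axis 0
loss = pixelwise_cross_entropy(logits, target)
pred = probs.argmax(axis=0)      # predicted class index per pixel, shape (H, W)
```

With a sigmoid, each class channel is scored independently; with a softmax, the classes compete for each pixel, which matches a mask where every pixel has exactly one label.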

Ah! Makes sense. Yeah, so I have an initial image where each pixel belongs to a class from 1 to 13. I picked out two or three specific classes and saved them in a new image as 255, with everything else as 0. I guess I might not need to do that then!
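For anyone landing here later, a rough sketch of what that could look like instead of saving one 255/0 image per class: remap the original labels straight into a single mask with contiguous indices, then stack images and masks into a batch. This is plain NumPy, not fastai; `remap_labels`, `make_batch`, the choice of classes 4, 7 and 13, and the idea of sending every other label to index 0 as background are all assumptions for illustration.

```python
import numpy as np

def remap_labels(mask, keep_classes):
    # Build a lookup table: original label -> new contiguous index.
    # Every label not listed in keep_classes maps to 0 (treated as background here).
    lut = np.zeros(256, dtype=np.int64)
    for new_idx, old_label in enumerate(keep_classes, start=1):
        lut[old_label] = new_idx
    # Integer-array indexing applies the table to every pixel at once.
    return lut[mask]

def make_batch(images, masks, keep_classes):
    # images: list of (H, W, 3) arrays; masks: list of (H, W) uint8 label arrays.
    x = np.stack(images)                                          # (N, H, W, 3)
    y = np.stack([remap_labels(m, keep_classes) for m in masks])  # (N, H, W)
    return x, y

# Toy data: two 2x2 RGB images with labels drawn from 1..13 (hypothetical values).
keep = [4, 7, 13]
images = [np.zeros((2, 2, 3), dtype=np.float32) for _ in range(2)]
masks = [
    np.array([[1, 4], [7, 13]], dtype=np.uint8),
    np.array([[13, 13], [2, 4]], dtype=np.uint8),
]

x, y = make_batch(images, masks, keep)
# y holds values in 0..len(keep), so the network would need len(keep) + 1 output channels.
```

So the answer to "stack or train one label at a time" would be: one mask per image with all classes in it, and the batch is just the stacked images plus the stacked masks.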