U-Net Help!

I’m working on an image segmentation problem where I want to segment 5 objects in an image. I’m using a U-Net architecture. My final layer looks like this:

conv_final = Conv2D(OUTPUT_MASK_CHANNELS, (1, 1))(up_conv_224)
conv_final = Activation('sigmoid')(conv_final)

model = Model(inputs, conv_final, name="ZF_UNET_224")

However, I get an error saying:

ValueError: Error when checking target: expected conv2d_24 to have shape (224, 224, 5) but got array with shape (224, 224, 3)

This is the generator that I’m using

image_generator = train_datagen.flow_from_directory(
    'data/train',  # this is the target directory
    target_size=(224, 224),  # all images will be resized to 224x224
    color_mode='rgb',
    seed=1)  # same seed as the mask generator so images and masks stay paired

This is a similar generator, for the masks:

mask_generator = mask_datagen.flow_from_directory(
    target_size=(224, 224),
    color_mode = 'rgb',
    seed = 1)

train_generator = zip(image_generator, mask_generator)

What can I do to fix this? Any help appreciated!

What is the value of the OUTPUT_MASK_CHANNELS constant?

Your snippet is missing code that we need to understand the problem fully. Could you post the full code or notebook somewhere?

Hi. Thanks for your response!

I need 5 classes in the final output. I have 8-band images, but as far as I can see the flow_from_directory generator in Keras only allows 1- or 3-band images.

Someone also suggested that I convert my image masks from RGB images to images where each pixel consists of a binary vector indicating which class the pixel belongs to. I’m not sure how to apply one-hot encoding to image masks.
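For the one-hot question, here is a minimal NumPy sketch of one way to do it, assuming each class is drawn in the mask with a single exact RGB colour. The colour table CLASS_COLORS and the helper name rgb_mask_to_one_hot are made up for illustration; swap in whatever colours your masks actually use.

```python
import numpy as np

# Hypothetical colour-to-class table (an assumption, not from the original post).
# The position in the list is the class index.
CLASS_COLORS = [
    (255, 0, 0),
    (0, 255, 0),
    (0, 0, 255),
    (255, 255, 0),
    (0, 0, 0),
]


def rgb_mask_to_one_hot(mask, class_colors=CLASS_COLORS):
    """Convert an (H, W, 3) RGB mask into an (H, W, n_classes) binary array."""
    mask = np.asarray(mask)
    one_hot = np.zeros(mask.shape[:2] + (len(class_colors),), dtype=np.float32)
    for class_index, color in enumerate(class_colors):
        # A pixel belongs to this class when all three channels match the colour.
        one_hot[..., class_index] = np.all(mask == np.array(color), axis=-1)
    return one_hot


# Toy mask: the top 100 rows are the first class, the rest is the last class.
mask = np.zeros((224, 224, 3), dtype=np.uint8)
mask[:100, :, :] = (255, 0, 0)

encoded = rgb_mask_to_one_hot(mask)
print(encoded.shape)  # (224, 224, 5)
```

The encoded array has the (224, 224, 5) shape that the error message says the final layer expects. Pixels whose colour is not in the table end up as all zeros, so it is worth checking the masks for stray colours first.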

Yeah, the SpaceNet challenge is an example of a dataset that uses 8-band multispectral imagery. Images with more than 3 channels are common in satellite imagery.

I know that you can probably cheat your way around it by using a 3D convolution and treating the bands as temporal data, but that’s probably not the proper way to solve this.

Check out this winning solution for the SpaceNet challenge. Their solution is an ensemble of three U-Net neural nets, implemented in Keras. The code can be found here. If I’m not mistaken, they handle the 8-channel issue in Keras through image pre-processing, so you should find an answer there.
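Separately from that solution (this is not their code), one way around the 1- or 3-band limit of flow_from_directory is to skip it and write a plain Python generator over NumPy arrays. A rough sketch, assuming the 8-band images and the 5-channel masks have already been loaded into arrays; the function name paired_batch_generator and the random toy data are only for illustration:

```python
import numpy as np


def paired_batch_generator(images, masks, batch_size=8, seed=1):
    """Yield (image_batch, mask_batch) pairs forever, reshuffling every epoch."""
    rng = np.random.RandomState(seed)
    n_samples = len(images)
    while True:
        order = rng.permutation(n_samples)
        for start in range(0, n_samples, batch_size):
            idx = order[start:start + batch_size]
            # The same indices select both arrays, so each image stays
            # paired with its own mask.
            yield images[idx], masks[idx]


# Toy stand-ins for real data: 6 images with 8 bands, 6 masks with 5 classes.
images = np.random.rand(6, 224, 224, 8).astype(np.float32)
masks = np.zeros((6, 224, 224, 5), dtype=np.float32)

gen = paired_batch_generator(images, masks, batch_size=4)
x, y = next(gen)
print(x.shape, y.shape)  # (4, 224, 224, 8) (4, 224, 224, 5)
```

A generator like this can be passed to fit_generator in place of the zipped Keras generators, provided the model’s input layer is declared with 8 channels. It holds everything in memory, so for a large dataset the indexing step would need to load files from disk instead.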