Unet_binary segmentation

When I change the size, I get ‘ValueError: Expected target size (4, 10404), got torch.Size([4, 10201])’ …


Hi all,

I’ve now been back through this code, and for the life of me can’t get it to work. Specifically, I tried to modify the Carvana notebook to work with fastai v1.0 (v1.0.31), and still get the error:

RuntimeError: CUDA error: device-side assert triggered

This is after redefining SegmentationLabelList:

class SegmentationLabelList(ImageItemList):
    def __init__(self, items:Iterator, classes:Collection=None, **kwargs):
        super().__init__(items, **kwargs)
        self.classes,self.loss_func,self.create_func = classes,CrossEntropyFlat(),partial(open_mask, div=True)
        self.c = len(self.classes)

    def new(self, items, classes=None, **kwargs):
        return self.__class__(items, ifnone(classes, self.classes), **kwargs)

and then calling SegmentationItemList with div=True:

src = (SegmentationItemList.from_folder(train_128_path, div=True)
       .label_from_func(get_y_fn, classes=codes, div=True))

as well as tried setting open_masks div to True:

src.train.y.create_func = partial(open_mask, div=True)
src.valid.y.create_func = partial(open_mask, div=True)

Can’t help but feel that I’m missing something obvious here - if anyone in the thread has it working on v1.0.31 please do let me know how you managed it! @sgugger, any hints would be most appreciated…

This is also still unsolved. Changing image sizes should not remove images from the training set. These segmentation bugs need fixing.

I also encountered this problem. Waiting for a solution.

@jyoti3, @quodatlas,

Does your code still work with the current version of fastai (1.0.32)?

data.show_batch breaks because it shows black and white masks

Learner.create_unet is replaced by unet_learner and I encounter an error

AttributeError: 'SegmentationItemList' object has no attribute 'c'

I haven’t tried Carvana with the latest version of fastai. Note that we can’t help you with just

RuntimeError: CUDA error: device-side assert triggered

This is a generic cuda error due to a bad-index problem and you need to try one forward pass on the CPU to get more details on it.
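To see what "try one forward pass on the CPU" means in practice, here is a minimal, self-contained sketch (plain PyTorch, illustrative tensors) of how an out-of-range target that silently triggers the device-side assert on GPU produces a readable error on CPU:

```python
import torch
import torch.nn.functional as F

# Toy example: logits for 3 classes, but one target index is 5.
# On the GPU this is the opaque "device-side assert triggered";
# on the CPU, cross_entropy raises a readable error naming the bad target.
logits = torch.randn(4, 3)             # batch of 4, 3 classes
targets = torch.tensor([0, 1, 2, 5])   # 5 >= n_classes -> invalid

try:
    F.cross_entropy(logits, targets)
except (RuntimeError, IndexError) as e:
    print("caught:", e)
```

With a fastai learner, the same idea is to move the model to the CPU, pull one batch from the dataloader, and call the loss function by hand.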
We also can’t help you if you don’t give us the version of fastai you’re working with, since there have been a lot of changes as we refined the data block API (which is stabilized now).

As for using a customized open function for the masks (if you want to set div=True), just change (or subclass) the open method of SegmentationLabelList.

For a hopefully helpful reference, I’ve updated my binary segmentation notebook (for mapping buildings from aerial/drone imagery) to work on fastai v1.0.33:


For those having issues changing SegmentationLabelList to open binary masks with div=True by default, this worked for me based on @sgugger’s suggestion:

class SegLabelListCustom(SegmentationLabelList):
    def open(self, fn): return open_mask(fn, div=True)
class SegItemListCustom(ImageItemList):
    _label_cls = SegLabelListCustom

src = (SegItemListCustom.from_folder(path_img)
        .label_from_func(get_y_fn, classes=codes))

In the notebook, I also add a custom loss function (combo of BCE and soft dice loss…not sure that my dice loss function is working entirely correctly yet so please let me know if you spot any bugs!) and make use of a slightly modified SaveModelCallback to auto-save and load weights from the best-resulting epoch.


I have the same problem. I ran it with CPU-based PyTorch and got this:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /opt/conda/conda-bld/pytorch-nightly-cpu_1544170178111/work/aten/src/THNN/generic/ClassNLLCriterion.c:93

Now what?

Sorry if this is a noob question but how do you implement this change? Do you go in the source code and update the library or just redefine the class in your own notebook?

You can just add the code block and run the cell in your own notebook. See the notebook link in my post for an example of this.

I am trying to implement the unet paper, but when concatenating the features from the contracting path with the upsampled features, I noticed that in the example in the paper, the features from the encoder are 64x64 and the upsampled features are 56x56.
My question is: how do you concatenate them? Do you pad the upsampled features to 64x64, or do you crop the features from the encoder to 56x56?

I think you are using ‘valid’ padding in the conv layers; use ‘same’ padding instead and the sizes will match.
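For reference, the original paper takes the other option: it keeps ‘valid’ padding and center-crops the encoder features before concatenation. A small PyTorch sketch of that crop-and-concat step (tensor shapes taken from the 64x64 / 56x56 example above; all names are illustrative):

```python
import torch

def center_crop(enc, target_hw):
    """Crop encoder features (paper-style 'valid' padding) to match
    the smaller upsampled decoder features before concatenation."""
    _, _, h, w = enc.shape
    th, tw = target_hw
    dh, dw = (h - th) // 2, (w - tw) // 2
    return enc[:, :, dh:dh + th, dw:dw + tw]

enc = torch.randn(1, 64, 64, 64)   # features from the contracting path
up  = torch.randn(1, 64, 56, 56)   # upsampled decoder features
merged = torch.cat([center_crop(enc, (56, 56)), up], dim=1)
print(merged.shape)  # torch.Size([1, 128, 56, 56])
```

With ‘same’ padding the crop becomes unnecessary, which is why most modern implementations (including fastai’s unet) take that route.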

This was such a helpful post. Thank you! I had to change the code a little to get it to work, because I think ImageItemList has been removed. Documenting it here for anyone else who finds this thread:

class SegLabelListCustom(SegmentationLabelList):
    def open(self, fn): return open_mask(fn, div=True)

class SegItemListCustom(SegmentationItemList):
    _label_cls = SegLabelListCustom


I have RGB label-images with 6 labels.

codes = array(['Impervious_surfaces', 'Building', 'Low_vegetation', 'Tree', 'Car', 'Clutter_background'])

Below is how they are encoded in the RGB channels.

colors_LU = {'Impervious_surfaces': array([255, 255, 255]),
             'Building': array([0, 0, 255]),
             'Low_vegetation': array([0, 255, 255]),
             'Tree': array([0, 255, 0]),
             'Car': array([255, 255, 0]),
             'Clutter_background': array([255, 0, 0])}

mask = open_mask(get_y_fn(img_f))
mask.show(figsize=(5,5), alpha =1)

When I inspect mask.data, I get:

tensor([[[255, 255, 255,  ..., 255, 255, 255],
         [255, 255, 255,  ..., 255, 255, 255],
         [255, 255, 255,  ..., 255, 255, 255],
         [ 29,  29,  29,  ..., 255, 255, 255],
         [ 29,  29,  29,  ..., 255, 255, 255],
         [ 29,  29,  29,  ..., 255, 255, 255]]])

So the values are not 0 to 5.
By examining the mask I figured out that there are indeed 6 distinct values inside, which seem to be grayscale representations of the RGB colors for each class.

{'Impervious_surfaces': 255,
 'Building': 29,
 'Low_vegetation': 178,
 'Tree': 149,
 'Car': 225,
 'Clutter_background': 76}

I am using the following to create the data source:

src = (SegmentationItemList.from_folder(path)
       .split_by_folder(train='train', valid='valid')
       .label_from_func(get_y_fn, classes=codes))

data = (src.transform(get_transforms(), tfm_y=True, size = size)

This creates a problem where the numbers in the mask are greater than the number of classes, so I was getting that CUDA error.

I got around that by creating dummy classes (0-255), but then I started running out of memory, even with the image size reduced to 200.

I do not know how to solve this problem. Dividing by 255 does not help me, since I have > 2 classes.

Any help would be much appreciated.



Not sure if I should start a new thread or post here. Please let me know.

I’m trying lesson 3 with road camera vehicle photos. The task here is creating a privacy mask hiding the inside of the vehicles.

I’m working on segmenting the windows; so far it seems promising.

My doubt is: when an image contains no windows, should I present it to the network with a fully zeroed mask? Or should I only use photos where a window is present?

As an aside, how accurate should I be when annotating data? Is there a need to be pixel-perfect?

Thank you.

Transform your RGB masks into grayscale masks with values from 0 to 5 (for 6 classes), using either Python or ImageJ. With the transformed masks it should work.
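A minimal sketch of the Python route, going directly from RGB to class indices (the colour-to-class mapping is taken from the colors_LU dict earlier in the thread; function and variable names are illustrative):

```python
import numpy as np

# RGB colour -> class index 0-5, matching the order of `codes`
colors = {(255, 255, 255): 0,  # Impervious_surfaces
          (0, 0, 255): 1,      # Building
          (0, 255, 255): 2,    # Low_vegetation
          (0, 255, 0): 3,      # Tree
          (255, 255, 0): 4,    # Car
          (255, 0, 0): 5}      # Clutter_background

def rgb_to_index(rgb):
    """Convert an (H, W, 3) uint8 RGB mask to an (H, W) index mask."""
    out = np.zeros(rgb.shape[:2], dtype=np.uint8)
    for colour, idx in colors.items():
        out[(rgb == colour).all(axis=-1)] = idx
    return out

mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0] = (0, 0, 255)          # one Building pixel
print(rgb_to_index(mask))         # class index 1 at (0, 0), 0 elsewhere
```

The resulting index image can be saved as a single-channel PNG and opened with the default open_mask (no div needed).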


Yes, you should. Otherwise your model will see windows everywhere. But the right percentage of empty images is hard to guess without experimentation.

Are these external road cameras? I think I would have approached it by segmenting non-window parts of vehicles to discover internal windows. If internal dash cams, I would detect the inside of the car like dashboards and passengers rather than the window.

Good luck!!

I wrote some functions to help people having problems with labels in their masks. See here:


Thank you for the great feedback. The images I am using are from the Potsdam satellite image set.

All the pixels are labeled as something.

In case it may be of some use, this is the code I used to convert the RGB labels:

files = get_image_files(path + '/original_RGB_lables')
for file in files:
    temp = Image.open(file)
    temp = temp.convert('L')                  # RGB label -> grayscale
    pixels = np.array(temp.getdata())         # flat pixel values
    reshape = int(pixels.shape[0] ** .5)      # assumes square images
    pixels = pixels.reshape(reshape, reshape)
    pixels = pixels.astype(int)
    # remap the grayscale values to class indices 0-5
    pixels = np.where(pixels==255, 0, pixels)
    pixels = np.where(pixels==225, 4, pixels)
    pixels = np.where(pixels==178, 2, pixels)
    pixels = np.where(pixels==149, 3, pixels)
    pixels = np.where(pixels==76, 5, pixels)
    pixels = np.where(pixels==29, 1, pixels)
    array = np.array(pixels, dtype=np.uint8)
    new_image = Image.fromarray(array)
    new_image = new_image.convert('L')
    new_image.save(str(path) + '/monochrome_lables' + '/' + str(file.name))

After this I still got:
cuda error: device-side assert triggered. Apparently, this happens when a number in the mask is >= num_classes.
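A quick way to catch this before training is to scan every mask file for out-of-range values. A small self-contained sketch (directory layout and function name are illustrative):

```python
import numpy as np
from PIL import Image
from pathlib import Path

def find_bad_masks(mask_dir, n_classes=6):
    """Return (filename, unique values) for masks containing
    values outside the valid range [0, n_classes)."""
    bad = []
    for fn in sorted(Path(mask_dir).glob('*.png')):
        vals = np.unique(np.array(Image.open(fn)))
        if vals.max() >= n_classes:
            bad.append((fn.name, vals))
    return bad

# Usage: print(find_bad_masks('monochrome_lables'))
```

Running something like this over the converted masks would have flagged the 4_12 label file immediately.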

After manually examining the labelled images, it turned out that one of them (label 4_12) had values that were not in the [0, 5] range. I think something is wrong with that RGB-labeled image. I removed it from the set and now it is training…

I’ll investigate later what is going on with the 4_12 image-label file, but in case someone plans to work with the same data set, pay attention to that.

One other thing I had to do was label the ‘background/clutter’ class as void and exclude it from target matching in the accuracy function. Otherwise it was running out of memory.
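An accuracy metric that ignores a void class can be written in the style of lesson 3’s acc_camvid. A hedged sketch, assuming the void class is index 5 (‘Clutter_background’ in the codes above):

```python
import torch

void_code = 5  # index of 'Clutter_background' in `codes` (assumption)

def acc_no_void(input, target):
    """Pixel accuracy over non-void pixels only.
    input:  (B, C, H, W) logits; target: (B, 1, H, W) class indices."""
    target = target.squeeze(1)
    mask = target != void_code                 # keep only non-void pixels
    return (input.argmax(dim=1)[mask] == target[mask]).float().mean()
```

This is then passed as `metrics=acc_no_void` when creating the learner.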

codes = array(['Impervious_surfaces', 'Building', 'Low_vegetation', 'Tree', 'Car', 'Clutter_background'])

thanks again


Thank you. After the RGB label conversion, you didn’t have to use “div=True”, right?