Using fastai for Segmentation, receiving a CUDA device-side assertion error

I am trying to solve this problem. Any tips on how I could get good segmentation masks?

  src = (MySegmentationItemList.from_folder(path_img)
       .random_split_by_pct(.2)
       .label_from_func(get_y_fn, classes=['background','solar_module']))
  data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))

[quote="sgugger, post:3, topic:30292"]
Note that any time you get a CUDA error, you have to restart your notebook
[/quote]

But when I use:

  mask = open_mask(get_y_fn(img_f), div=True)
  mask.show(figsize=(5,5), alpha=1)

the mask shows all black. No mask now, right?

My images are .jpg and my masks are .png. I tried mask.data[0][50][20:600] to inspect the data, and found that many values are 38 and the rest are 0.
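A quick way to see every distinct value a mask contains is np.unique; a minimal sketch, using a small in-memory array in place of a real mask file:

```python
import numpy as np

# Stand-in for a loaded mask: background is 0, the object is painted as 38.
mask = np.zeros((64, 64), dtype=np.uint8)
mask[16:48, 16:48] = 38

# np.unique lists every distinct pixel value, so stray labels show up at once.
print(np.unique(mask))  # -> [ 0 38]
```

If the only values are 0 and 38, the task is binary and the 38s need to be collapsed to 1 before training.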

If it’s a binary task and you have values of 38 and 0, you must divide your mask values by 38 or set every value greater than 0 to 1. For example, here’s some code that I use in a similar case:

def open_mk(fn:PathOrStr, div:bool=False, convert_mode:str='L', cls:type=ImageSegment,
        after_open:Callable=None)->Image:
    "Return `Image` object created from image in file `fn`."
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning) # EXIF warning from TiffPlugin
        x = PIL.Image.open(fn).convert(convert_mode)
    if after_open: x = after_open(x)
    x = pil2tensor(x,np.float32)
    x[x>0]=1  # MODIFIED: collapse every non-zero label to 1
    if div: x.div_(255)
    return cls(x)

class CustomSegmentationLabelList(SegmentationLabelList):
    def open(self,fn): return open_mk(fn)
    
class CustomSegmentationItemList(ImageList):
    _label_cls= CustomSegmentationLabelList

For all segmentation tasks, make sure that your labels always start at 0 and increase up to the number of classes minus 1. For 10 classes your labels should be 0, …, 9.
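That rule can be checked mechanically before training; a minimal sketch (the helper name is mine, not fastai's):

```python
import numpy as np

def check_mask(mask: np.ndarray, n_classes: int) -> None:
    """Raise if the mask contains any label outside [0, n_classes - 1]."""
    bad = np.setdiff1d(np.unique(mask), np.arange(n_classes))
    if bad.size:
        raise ValueError(f"out-of-range labels in mask: {bad.tolist()}")

check_mask(np.array([[0, 1], [1, 0]]), n_classes=2)  # passes silently
```

Running this over every mask file before building the DataBunch catches the "values of 38" problem long before the CUDA assertion does.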

It’s best to preprocess the masks and save them in the correct format on disk. ImageJ / Fiji works well for this task, and lets you also check that everything fits. You can do it with Python too, but transforming your data on the fly while loading it for training is too resource-intensive, so I would not do that.
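If you do preprocess in Python, something like this sketch remaps raw intensities to class indices and encodes the result as PNG (the 0/38 mapping is an assumption matching the values discussed above, and an in-memory buffer stands in for a real file path):

```python
import io

import numpy as np
from PIL import Image

# Hypothetical mapping from raw mask intensity to class index.
VALUE_TO_CLASS = {0: 0, 38: 1}

def remap_mask(raw: np.ndarray) -> np.ndarray:
    """Replace each raw intensity with its class index."""
    out = np.zeros_like(raw, dtype=np.uint8)
    for value, cls in VALUE_TO_CLASS.items():
        out[raw == value] = cls
    return out

raw = np.array([[0, 38], [38, 0]], dtype=np.uint8)
fixed = remap_mask(raw)

# PNG is lossless, so the exact class indices survive the round trip.
buf = io.BytesIO()
Image.fromarray(fixed).save(buf, format="PNG")
print(fixed.tolist())  # -> [[0, 1], [1, 0]]
```

Saving as JPEG instead would smear the label values with compression artifacts, which is exactly the kind of corruption that triggers the device-side assertion.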

@alex_zhang has the segmentation worked for you?

@harikrishnanrajeev Yes, I fixed it with pietro.latorre’s method on an updated fastai, and I also added a '0' class to the class list; then fit_one_cycle works. Maybe I needn’t generate the mask and can just load it from the COCO json file, but I haven’t tried that yet.


Does anybody know why the mask shows all black? Is this expected?


Change the displayed range to 0 … N (where N is the number of classes).
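Concretely, with class indices 0…3 the raw mask is almost indistinguishable from black on screen; stretching the indices across 0–255 purely for display makes them visible (a sketch, not fastai code):

```python
import numpy as np

n_classes = 4
mask = np.array([[0, 1], [2, 3]], dtype=np.uint8)  # toy class-index mask

# Spread the indices over the full 0-255 range so each class is visible.
# This scaled copy is for display only; never train on it.
display = (mask * (255 // (n_classes - 1))).astype(np.uint8)
print(display.tolist())  # -> [[0, 85], [170, 255]]
```

So an "all black" mask is usually fine: the labels are just tiny integers, not a missing mask.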

Hey, I am also facing the same problem. I had values of 76 and 0 when I printed the mask.data values. Have you found out how to solve it? If yes, please let me know.

Please have a look at the approach in this link

this works


Thanks, it works absolutely fine.


My brain was almost exploding because I could not find an error in my code and yet could not train a segmentation model. Just knowing that the accuracy metric no longer works for segmentation saved my life. Thank you.

It would be good to use Dice or even Dice + BCE. Thanks.

The problem is the masks are gone: show_batch and the predictions don’t display them. Did you manage to solve it?

Hello,
hope that’ll be useful for someone…

1) How I saw something was wrong in my code: open_mask('path to mask').data contained values outside [0, classes-1].

2) I had masks in RLE format, so I generated the mask files myself. But I used ImageSegment.save(filename) without the PNG format. Storing the mask as PNG resolved my problem.

(This code is from ImageSegment.save; I copied it into my own code to add 'png':)

  x = get_image(mask, shape)
  x = image2np(x.data).astype(np.uint8)
  PIL.Image.fromarray(x).save(LBL_PATH/file_name, 'png')

Thanks

Hi everyone,
I found this solution! Just patch the open_mask function with this partial before creating your SegmentationItemList, so you won’t need to extend it with a custom class :slight_smile:

  SegmentationItemList._label_cls.open = partial(open_mask, div=True)
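To see why the one-liner works: functools.partial bakes div=True into the function before it is attached to the class. A toy sketch with stand-in names (this LabelList and open_mask are illustrative, not fastai's):

```python
from functools import partial

def open_mask(fn, div=False):
    # Toy stand-in for fastai's open_mask: pretend every pixel is 255.
    value = 255
    return value / 255 if div else value

class LabelList:
    # Stand-in for SegmentationLabelList; open() loads one mask.
    open = staticmethod(open_mask)

# Bake div=True into the class-level open, so every call rescales the mask.
LabelList.open = staticmethod(partial(open_mask, div=True))
print(LabelList.open("some_mask.png"))  # -> 1.0
```

Every subsequent open() call now divides by 255 without any custom subclass.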


Hello Everyone,

I am also facing the same issue. I have tried the code on both CPU and GPU using colab, but no luck.
Link to notebook file using CPU: http://bit.ly/2I2ZOMX_TGS_CPU
Link to notebook file using GPU: http://bit.ly/2Vq0ROZ_TGS_GPU

In CPU notebook, I am getting the following error:

However, I have set div = True in open_mask and have also set num_workers = 0

In GPU notebook I am getting the following error:

I have taken the custom metric of IOU.
Fastai version: 1.0.60

Kindly have a look and suggest how I can fix the issue.

Hi @skhandelwal121, in your second screenshot the error happens while computing the loss. The reason may be that you have 255 in your labels; you should also pass the ignore_index=255 parameter to the loss function.
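What ignore_index does, sketched in plain NumPy (the function here is illustrative; PyTorch's CrossEntropyLoss implements this internally): pixels whose label equals the sentinel are simply excluded from the mean.

```python
import numpy as np

IGNORE_INDEX = 255  # sentinel for "do not score this pixel"

def nll_ignore(log_probs: np.ndarray, target: np.ndarray) -> float:
    """Mean negative log-likelihood over pixels, skipping IGNORE_INDEX.

    log_probs: (N, C) per-pixel class log-probabilities; target: (N,) labels.
    """
    idx = np.nonzero(target != IGNORE_INDEX)[0]
    return float(-log_probs[idx, target[idx]].mean())

log_probs = np.log(np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]))
target = np.array([0, 1, 255])  # the last pixel is ignored entirely
print(round(nll_ignore(log_probs, target), 4))  # -> 0.1643
```

Without the sentinel handling, label 255 would index past the class dimension, which is exactly the `t >= 0 && t < n_classes` assertion seen in this thread.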

Please, do you have a practical example of how you worked with:

SegmentationItemList._label_cls.open


I’m having the same problem. I believe there is some inconsistency in the way the data was constructed. I’m a beginner; I’ve tried all the suggestions here and some others. I’m trying to keep calm.

The error in my jupyter:

RuntimeError: CUDA error: device-side assert triggered

The error looking at the CPU:

C:/w/b/windows/pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:108: block: [0,0,0], thread: [29,0,0] Assertion t >= 0 && t < n_classes failed.

Regarding the mask:

[In] src_size, mask.data
[Out] (array([864, 864]),
tensor([[[163, 132, 132, …, 132, 132, 163],
[132, 90, 90, …, 90, 90, 132],
[132, 90, 90, …, 90, 90, 132],
…,
[132, 90, 90, …, 90, 90, 132],
[132, 90, 90, …, 90, 90, 132],
[163, 132, 132, …, 132, 132, 163]]])

I built my dataset myself. For the manual segmentation I used the PixelAnnotation tool, which was built for highway segmentation, so its segmentation palette is predetermined: the colors of the colored mask are fixed by the tool, and the corresponding gray mask is then built automatically.

The class signatures assigned by PixelAnnotation were as follows:

  Class          Pixel intensity
  Background     90
  Peca           128
  DefeitoGrave   76
  DefeitoBrando  177

The mask appears coherent, except for the values 132 (edges) and 163 (extremities). They seem to me to be a kind of signature of the edges and corners. Can anyone tell me whether I am right to think this, or whether it is a mistake I need to correct?

About classes:

[In] codes = np.loadtxt(path/'codes.txt', dtype=str); codes
[Out] array(['Background', 'Peca', 'DefeitoGrave', 'DefeitoBrando'], dtype='<U13')

So, if I do:

[In] name2id = {v:k for k,v in enumerate(codes)}
     print(name2id)
[Out] {'Background': 0, 'Peca': 1, 'DefeitoGrave': 2, 'DefeitoBrando': 3}

The classes represented in the script by 'codes' total 4 and are indexed from 0 to n-1, so they respect the condition mentioned in earlier comments; the non-background classes take indices 1 to 3.

So here is what intrigues me: my classes are indexed from 0 to 3, but on the mask those same classes are represented by 90, 128, 76 and 177. Shouldn’t they be the same? If so, any suggestions or material on how to correct it?

Could this problem be related to the image size? I assume not.
Is the mask represented by a single-channel image? That’s correct, right?
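One possible fix, assuming the intensity table above is complete: remap each PixelAnnotation intensity to its class index before training, folding the 132/163 edge signatures into Background (that fold is an assumption worth verifying against the tool's documentation):

```python
import numpy as np

# PixelAnnotation intensities -> class indices; 132 and 163 look like
# edge/extremity signatures, so they are folded into Background here.
INTENSITY_TO_CLASS = {90: 0, 128: 1, 76: 2, 177: 3, 132: 0, 163: 0}

def remap(raw: np.ndarray) -> np.ndarray:
    """Replace each raw intensity with its class index (0..3)."""
    out = np.zeros_like(raw, dtype=np.uint8)
    for intensity, cls in INTENSITY_TO_CLASS.items():
        out[raw == intensity] = cls
    return out

raw = np.array([[163, 132], [90, 76]], dtype=np.uint8)
print(remap(raw).tolist())  # -> [[0, 0], [0, 2]]
```

After remapping and saving as PNG, every mask value falls in [0, 3], which is what the loss function's assertion requires.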

Environmental information:
Windows 10
pytorch 1.6.0
cuda 10.2
fastai 1.0.61
GeForce GTX 1050 Ti