Dealing with CUDA device-side assert errors on Segmentation (and other Segmentation issues), some tips! :) - A Wiki

CUDA device-side assert errors come up a lot in segmentation, and they are often (though not always) caused by the pixel values in your mask not lining up with your class indices. For example, a binary mask with values [0, 255] will not work, because fastai expects [0, 1]. So how do we fix this? And how should we watch for it? Let’s look at a few tips and examples. (This is also a wiki: if you have run into other segmentation issues without obvious fixes and solved them, feel free to post your solutions here.)

Some go-to’s:

Ensure your masks are PNGs. JPEG compression is lossy, so the label values occasionally get messed up.

Reassigning pixel values to classes

If we want to adjust it without remaking our entire dataset at one time, we can change how our get_y works:

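from fastai.vision.all import *  # provides PILMask and np

def n_codes(fnames, is_partial=True):
  "Gather the codes from a list of `fnames`"
  vals = set()
  if is_partial:
    fnames = fnames[:10]  # only sample the first 10 masks
  for fname in fnames:
    msk = np.array(PILMask.create(fname))
    vals.update(int(val) for val in np.unique(msk))
  # Sort so the class indices are deterministic and ascending
  vals = sorted(vals)
  # Map class index -> original pixel value
  return {i: val for i, val in enumerate(vals)}

def get_msk(fn, pix2class):
  "Open the mask for image `fn` and remap its pixel values to class indices"
  fn = path/'GT_png'/f'{fn.stem}_mask.png'  # `path` is your dataset root
  msk = np.array(PILMask.create(fn))
  # Ascending order, so an index already written never matches a later pixel value
  for i, val in pix2class.items():
    msk[msk==val] = i
  return PILMask.create(msk)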

This is similar to our lambda function in the CamVid example. n_codes collects the pixel values that appear in the masks and numbers them from 0, and get_msk then swaps every raw pixel value in a mask for its class index, so values like [0, 255] become [0, 1].

To use it, simply make a new lambda function:
p2c = n_codes(fnames)  # fnames is your list of mask files
get_y = lambda o: get_msk(o, p2c)
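
To see the remapping on its own, here is a minimal NumPy-only sketch on a made-up 2x3 mask. `remap_mask` is a hypothetical helper, not part of fastai, and it uses a lookup table instead of the loop in get_msk; the pixel values [0, 128, 255] are assumptions chosen for illustration.

```python
import numpy as np

def remap_mask(msk, pixel_values):
    "Replace each raw pixel value in `msk` with its index in sorted `pixel_values`."
    pixel_values = np.asarray(sorted(pixel_values))
    # Lookup table: raw pixel value -> class index.
    # Values inside the table range that are not listed fall back to 0.
    lut = np.zeros(int(pixel_values.max()) + 1, dtype=np.uint8)
    lut[pixel_values] = np.arange(len(pixel_values), dtype=np.uint8)
    return lut[msk]

# Toy mask using raw values 0, 128 and 255 (hypothetical data)
toy = np.array([[0, 255, 255],
                [128, 0, 255]], dtype=np.uint8)
remapped = remap_mask(toy, [0, 128, 255])
print(remapped)
```

After the remap the mask only contains 0, 1 and 2, which is the 0-to-n ordering fastai expects. A raw value larger than the biggest listed pixel value would raise an IndexError here, which is arguably a useful early warning.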


This was such a common problem in fastai V1. Try opening a pull request to get this solved by default.


I was debating it, but this may not be the best way to fix this every time (or maybe it is). I’ll wait for more input on that from the devs :)

I’m hesitant because it doesn’t handle the case where, say, the second class isn’t present. So perhaps a better idea would be some kind of dictionary to adjust with… let me refactor it…

@bluesky314 I made an adjustment. It’s not quite as simple as we would like, because some of the images may not contain a particular class. I’ve adjusted the code above to include a second function which generates a pix2class (pixel-to-class) mapping for you. This should be done outside of the DataLoader: it is lazy, so it can’t check all the masks. We could add it onto the call to the batch, but again, I don’t think that’s how we’d want to do it.
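
To illustrate why the mapping has to come from the whole set of masks rather than one image at a time, here is a small sketch with made-up arrays standing in for loaded masks. `build_pixel_to_class` is a hypothetical name; it returns the same index-to-pixel-value dictionary shape as n_codes.

```python
import numpy as np

def build_pixel_to_class(masks):
    "Collect every pixel value seen across `masks` and number them 0..n-1."
    vals = set()
    for msk in masks:
        vals.update(int(v) for v in np.unique(msk))
    return {i: v for i, v in enumerate(sorted(vals))}

mask_a = np.array([[0, 255]], dtype=np.uint8)       # pixel value 128 is absent here
mask_b = np.array([[0, 128, 255]], dtype=np.uint8)  # all three values present

per_image = build_pixel_to_class([mask_a])
whole_set = build_pixel_to_class([mask_a, mask_b])
print(per_image)
print(whole_set)
```

Built from mask_a alone, pixel value 255 would land on class 1; built from both masks it lands on class 2. That inconsistency is the reason to compute the dictionary once, up front, over all (or a representative sample of) the masks.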

A check for whether the mask is a jpeg/jpg may be helpful. Because of compression, the labels get messed up.

How would you adjust/modify it if so?

I was thinking more of a user tip shown when you get the CUDA error: check that your images are PNGs, and check that the number of unique values in the mask equals the number of classes. Something like that?
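
A rough sketch of what such a check could look like, assuming the mask has already been loaded into a NumPy array (for example with np.array(PILMask.create(fname))). `check_mask` is a hypothetical helper, not a fastai function. It checks the largest pixel value against the class count rather than counting unique values, since a single mask may legitimately be missing a class.

```python
from pathlib import Path
import numpy as np

def check_mask(fname, msk, n_classes):
    "Return a list of human-readable problems found for one mask."
    problems = []
    # JPEG is lossy, so label values may have been altered on save
    if Path(fname).suffix.lower() in {'.jpg', '.jpeg'}:
        problems.append(f'{fname}: JPEG masks can have corrupted labels, use PNG')
    # Every pixel value must be a valid class index: 0 <= value < n_classes
    largest = int(np.unique(msk).max())
    if largest >= n_classes:
        problems.append(f'{fname}: pixel value {largest} is out of range for {n_classes} classes')
    return problems

good = np.array([[0, 1], [1, 0]], dtype=np.uint8)
bad = np.array([[0, 255], [255, 0]], dtype=np.uint8)
print(check_mask('img_mask.png', good, n_classes=2))
print(check_mask('img_mask.jpg', bad, n_classes=2))
```

The file names and arrays are made up. Running this over your masks before building the DataLoaders gives a readable message instead of the CUDA error.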


Gotcha, I like that :)

Edit: in case more people have found more workarounds (now or in the future), I’ve made it a wiki so we can centralize thoughts on the subject (instead of having a billion threads)


By the way, there was a bug in the code for n_codes: the mapping wasn’t being made properly. I’ve fixed it. There’s also an example of these tricks being used here:


Hi @muellerzr ,

I wasn’t able to find the code for get_pixel_to_class in your notebook or here. Can you please share it?
My understanding is that it takes each unique value of the mask and converts them to classes, i.e. 1, 2, 3, 4… Is that right?

Lemme re-look at that… just noticed that issue sorry! And yes, essentially we’re re-ordering it from 0-n how fastai expects it. (Don’t expect a fix in the next few hours, working on some other business right now.)

Thanks. No hurry, just wanted to understand the logic to see if I could implement on my own.

I think it’s legitimately just missing one step. IIRC, p2c should be vals in this example @imrandude. Can you try that? :) (i.e. the dictionary we made from n_codes)

Thanks for this guide, it just fixed my issue.

I think my current issue is related: I have a binary mask as a PNG where 0 is background and 255 is target. This leads to the error: IndexError: Target 255 is out of bounds. I suspect there is a workaround using Transforms, but I find myself digging through source code to try and find it. A PR to update this issue would be fantastic.


dls = SegmentationDataLoaders.from_label_func(
    ..., batch_tfms=[IntToFloatTensor(div_mask=255)]
)
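
For intuition about why dividing the mask helps, here is a NumPy-only sketch on a made-up binary mask. It mimics the effect of div_mask=255 with integer division and is not fastai's actual implementation. It also assumes the mask only ever contains 0 and 255.

```python
import numpy as np

n_classes = 2
mask = np.array([[0, 255], [255, 0]], dtype=np.uint8)

# Targets must satisfy 0 <= value < n_classes, and 255 does not,
# which is what "Target 255 is out of bounds" is complaining about.
print(mask.max() < n_classes)

# Integer division by 255 sends 0 -> 0 and 255 -> 1
scaled = mask // 255
print(scaled)
print(scaled.max() < n_classes)
```

Note that this only suits 0/255 binary masks: any in-between value such as 128 would be floored to 0, so multi-class masks are better handled with an explicit pixel-to-class mapping.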