Image segmentation on cells using the CamVid notebook: error on learn.fit_one_cycle

I have taken a cell nuclei dataset from a Kaggle competition and am attempting to use it with the CamVid Tiramisu notebook. I created a toy dataset to run on my laptop to make sure things work before running the full dataset on my GCP GPU. There are only two classes: ‘Cell’ and ‘Background’. I can get everything to run up to lr_find and learn.fit_one_cycle. The only thing I changed was getting the validation set by percentage rather than from a folder. But I get the following error for both commands:

RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed. at
c:\a\w\1\s\tmp_conda_3.6_090826\conda\conda-bld\pytorch_1550394668685\work\aten\src\thnn\generic/ClassNLLCriterion.c:93

I searched Stack Overflow, where answers suggest this error is associated with using the wrong loss function.

So, then I explicitly defined the loss function:

learn = unet_learner(data, models.resnet34, loss_func=nn.CrossEntropyLoss(),
                     metrics=metrics, wd=wd, bottle=True)

But, then I get the following error:

RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4 at
c:\a\w\1\s\tmp_conda_3.6_090826\conda\conda-bld\pytorch_1550394668685\work\aten\src\thnn\generic/SpatialClassNLLCriterion.c:59

Any thoughts?

Just as a side note, I also tried to use nn.BCELoss(), but got this error:

ValueError: Target and input must have the same number of elements. target nelement (131072) != input nelement (262144)
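
For reference, both errors look like shape mismatches rather than a wrong choice of loss as such: the fastai masks are (N, 1, 256, 256), while nn.CrossEntropyLoss wants (N, 256, 256) index targets, and nn.BCELoss wants a target the same shape as its input (2·1·256·256 = 131072 vs 2·2·256·256 = 262144 matches the counts above, assuming a batch of 2). A minimal sketch of the CrossEntropyLoss shape rule:

import torch
import torch.nn as nn

logits = torch.randn(2, 2, 256, 256)            # (N, C, H, W): batch of 2, 2 classes
mask = torch.randint(0, 2, (2, 1, 256, 256))    # fastai-style mask with a channel dim

loss_fn = nn.CrossEntropyLoss()
# loss_fn(logits, mask)                    # fails: 4D target, the error quoted above
loss = loss_fn(logits, mask.squeeze(1))    # OK: target squeezed to (N, H, W)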

The RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed suggests that the criterion function expects two classes, but a label greater than 1 was passed (classes being numbered 0 and 1).

I would take a look at the labels in the train and valid datasets and make sure they only contain 0s and 1s. Try this:

db.train_ds[0][1].data.max()
db.valid_ds[0][1].data.max()

This prints the maximum value of the dataset’s 0th example’s label; it should be 0 or 1.
If there are values other than 0 and 1, it would mean the labels are not loaded correctly.
It could be something else, of course.
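
If you want to check more than the first example, a quick loop over the whole dataset should work too (a rough sketch, assuming db is your DataBunch):

import torch

vals = set()
for i in range(len(db.train_ds)):
    vals |= set(torch.unique(db.train_ds[i][1].data).tolist())
print(vals)  # for binary segmentation this should be {0, 1}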

Thanks, yeah, I guess I am definitely doing something wrong. I created the DataBunch per:

data = (src.transform(get_transforms(), tfm_y=True)
    .databunch(bs=bs)
    .normalize(imagenet_stats))
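
(For completeness, src was built along the lines of the notebook, only split by percentage instead of by folder; roughly this, with the 0.2 being my choice, and split_by_rand_pct being called random_split_by_pct in older fastai v1 releases:)

src = (SegmentationItemList.from_folder(path_img)
       .split_by_rand_pct(0.2)
       .label_from_func(get_y_fn, classes=codes))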

And when I run:

data.show_batch(2, figsize=(10,7))

Everything looks OK, but when I do as you describe:

data.train_ds[0][1].data.max()
returns: tensor(255)

data.valid_ds[0][1].data.max()
returns: tensor(255)

The mask pixels are either 0 or 255 (background or cell). But I guess I am somehow creating labels from 0 to 255 (256 labels). Will need to try and figure that out. Thanks!

You’re welcome. I’ve actually been working on a segmentation problem myself and had the same issue :slight_smile:. Feel free to ask again if you need more help.

Also, you may find this conversation relevant: https://forums.fast.ai/t/unet-binary-segmentation/29833/8

Thanks, yes, I will re-read that thread. I’m trying to break out the data block API step by step right now, and when I create a label list, something does seem strange:

ll = sd.label_from_func(get_y_fn, classes=codes)

ll returns:

LabelLists;

Train: LabelList (16 items)
x: SegmentationItemList
Image (1, 256, 256),Image (1, 256, 256),Image (1, 256, 256),Image (1, 256, 256),Image (1, 256, 256)
y: SegmentationLabelList
ImageSegment (1, 256, 256),ImageSegment (1, 256, 256),ImageSegment (1, 256, 256),ImageSegment (1, 256, 256),ImageSegment (1, 256, 256)
Path: data\tiramisu\cell;

Valid: LabelList (3 items)
x: SegmentationItemList
Image (1, 256, 256),Image (1, 256, 256),Image (1, 256, 256)
y: SegmentationLabelList
ImageSegment (1, 256, 256),ImageSegment (1, 256, 256),ImageSegment (1, 256, 256)
Path: data\tiramisu\cell;

Test: None

I guess I would expect to see the categories (background, cell) listed, as opposed to the above?

I see in that thread that I should insert

.set_attr(mask_opener=partial(open_mask, div=True))

but I’m getting errors when I try to do that.

I saw where to put div=True, but now I’m getting another error :upside_down_face:

RuntimeError: invalid argument 3: only batches of spatial targets supported (3D tensors) but got targets of dimension: 4 at
c:\a\w\1\s\tmp_conda_3.6_090826\conda\conda-bld\pytorch_1550394668685\work\aten\src\thnn\generic/SpatialClassNLLCriterion.c:59

The LabelLists seem correct to me.
At what point do you get the error, and what is in data.train_ds[0][1].data?

Thanks Dusan, I actually had to use the custom class and put div=True, and then I got:

data.train_ds[0][1].data.max()
returns: tensor(1)
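
In case it helps anyone else, the custom class from that thread looks roughly like this (names as in the linked post):

from fastai.vision import SegmentationItemList, SegmentationLabelList, open_mask

class SegLabelListCustom(SegmentationLabelList):
    # open masks with div=True so pixel values 0/255 become class indices 0/1
    def open(self, fn): return open_mask(fn, div=True)

class SegItemListCustom(SegmentationItemList):
    _label_cls = SegLabelListCustom

and then src gets built from SegItemListCustom instead of SegmentationItemList.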

Hoorah :grinning: Then I had to set num_workers=0 to get things to run (still working with the toy dataset on my laptop).

I thought I was there when:

lr_find(learn)
learn.recorder.plot()

worked and gave me a nice graph.

But, on the last step:

learn.fit_one_cycle(1, slice(lr), pct_start=0.8)

It starts to run and shows blue progress bars, but at the end I get a red bar and the following error:

IndexError: only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got ImageSegment)

I have no idea what that is, or why I’m getting it at this point, after lr_find worked and after it shows blue bars for a while.

Trying to search on that now. Thanks for all the continued help!

Looks like it has to do with the metric:

def acc_camvid(input, target):
    target = target.squeeze(1)
    #mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()

This is causing it:

<ipython-input-244-8e77f302bf1e> in acc_camvid(input, target)
  5     target = target.squeeze(1)
  6     #mask = target != void_code
----> 7     return (input.argmax(dim=1)[mask]==target[mask]).float().mean()

I switched over from acc_camvid to dice, changed .float to .long, and got an answer :)

The dice metric is strange (0.000000), but per the class I know metrics do not affect training, so at least I’m getting an output :) Thanks again for sticking with me on this…
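
As I understand it, dice for binary segmentation is 2·|pred ∩ target| / (|pred| + |target|) over the foreground class, so 0.000000 early on can simply mean the model is predicting all background. A sketch of the idea (not fastai’s exact implementation):

def dice_sketch(input, targs, eps=1e-8):
    # input: (N, C, H, W) logits; targs: (N, 1, H, W) binary mask
    n = targs.shape[0]
    pred = input.argmax(dim=1).view(n, -1).float()
    targ = targs.view(n, -1).float()
    intersect = (pred * targ).sum(dim=1)
    total = pred.sum(dim=1) + targ.sum(dim=1)   # |pred| + |target|
    return ((2. * intersect + eps) / (total + eps)).mean()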

def acc_camvid(input, target):
    target = target.squeeze(1)
    #mask = target != void_code
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()

Well, you commented out the mask definition since you don’t need it for your dataset, but mask is still used in the return line as [mask]. It shouldn’t be there. And I suspect the error message

IndexError: only integers, slices (`:`), ellipsis (`...`), None and long or byte Variables are valid indices (got ImageSegment)

is a result of the fact that you have earlier used the mask variable to visualize some image’s mask (I reckon it would be an ImageSegment in that case).

You can verify this for yourself by inserting a set_trace() call inside the function. This will call the debugger at that point.

from IPython.core.debugger import set_trace

def acc_camvid(input, target):
    set_trace()              # you can print out mask here
    target = target.squeeze(1)
    return (input.argmax(dim=1)[mask]==target[mask]).float().mean()
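
With the stray [mask] removed, the metric for your two-class case reduces to plain pixel accuracy:

def acc_camvid(input, target):
    target = target.squeeze(1)
    return (input.argmax(dim=1) == target).float().mean()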

No worries, glad to be of use :slight_smile:

Thanks Dusan,

Really appreciate that; I just missed it. After doing as you described, it seemed to run again, but then I got an out-of-memory error on my laptop. So I put everything up on the GCP GPU, but I still got the out-of-memory error. I then switched back to metrics=dice and was able to run, and while I don’t completely understand the dice metric, anecdotally I seem to be getting good results when I predict on an image. So now I’m working through serving on a webpage: Flask is easy without images, but I’m trying to adapt the Starlette code on Render. Thanks again so much for looking at that. I thought after doing what you said it would run; not sure why I ran out of memory (I restarted the kernel a couple of times). It seems like dice is more suited for binary segmentation (maybe that has something to do with it). Thank you so much!


No problem. Here is the accuracy fn I used for binary segmentation on the Carvana Image Masking challenge:

def acc_carvana(input:Tensor, targs:Tensor)->Rank0Tensor:
    n = targs.shape[0]
    input = input.argmax(dim=1).view(n, -1)  # predicted class per pixel, flattened
    targs = targs.view(n, -1)                # ground-truth mask, flattened
    return (input == targs).float().mean()   # overall pixel accuracy
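
You’d pass it in like any other metric, e.g. reusing your learner call from above:

learn = unet_learner(data, models.resnet34, metrics=[acc_carvana], wd=wd, bottle=True)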

and here is the full related notebook, maybe it will help you (maybe not) :slight_smile:

Thanks Dusan!

I will give that a try (trying to figure out Gunicorn right now :)). Thank you so much for all the help!


This link is broken.
Can you reshare it? Thanks…

Here it is, Muhammad, but beware that the code there is from 2019, so fastai has definitely changed since then :).

Thanks, but again: 404… page not found…

Ah sorry, it was private for some reason. Should work now.