Error in learner pipeline for a segmentation task

(Mehdi) #1

Hi friends !

I’m trying to get some experience with fastai on various Kaggle challenges. I’m currently working on the TGS Segmentation challenge and I get an error I don’t quite know how to deal with.

Here’s my process (I’m using Colab).
Getting the files:

import glob

im_files = glob.glob('./images/*.png')
masks_files = glob.glob('./masks/*.png')

Creating a function for getting label reference:

def get_y_fn(x):
    # Map an image path to the mask file that shares its stem.
    return 'masks/' + x.stem + '.png'

Hyperparams and databunch:

src_size = 101
size = 96
bs = 32
classes = ['salt', 'sediments']

src = (SegmentationItemList.from_folder('./images/').split_by_rand_pct(valid_pct=0.1).label_from_func(get_y_fn, classes=classes))
data = src.transform(get_transforms(), size=size, tfm_y=True).databunch(bs=bs).normalize(imagenet_stats)

Creating the learner and searching for learning rate:

learn = unet_learner(data, models.resnet34, wd = 1e-2)
learn.lr_find()

When I run this last line, Colab returns this error:

Assertion `cur_target >= 0 && cur_target < n_classes' failed. at /pytorch/aten/src/THNN/generic/ClassNLLCriterion.c:92 

I’m not quite sure what’s wrong here. Could someone point me in the right direction ?
Thanks a lot !


(Dusan Drevicky) #2

Hi Mehdi,

Please take a look at this thread, where we discussed what may be the same issue. In short, I suspect the mask contains values other than the 0s and 1s the learner expects (perhaps 0s and 255s). If that’s the case, you will find the solution in that thread.
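
For anyone who cannot follow the link, here is a minimal sketch of the idea rather than the exact fix from that thread. It assumes the masks are 8-bit images that store salt as 255 and background as 0; the array `raw` and the helper `remap_binary_mask` are made up for illustration. With two classes, the loss only accepts targets 0 and 1, so a pixel value of 255 fails the `cur_target < n_classes` check.

```python
import numpy as np


def remap_binary_mask(mask):
    """Map any non-zero pixel (e.g. 255) to class index 1, keep 0 as 0."""
    mask = np.asarray(mask)
    return (mask > 0).astype(np.uint8)


# Hypothetical 2x3 mask as it might come out of an 8-bit PNG.
raw = np.array([[0, 255, 255],
                [0, 0, 255]], dtype=np.uint8)

# Inspect which values the mask really contains before training.
print("raw values:  ", np.unique(raw))

fixed = remap_binary_mask(raw)
print("fixed values:", np.unique(fixed))
```

Running `np.unique` on a few real masks is a quick way to confirm the diagnosis. In fastai v1, `open_mask` takes a `div` argument that divides pixel values by 255 when loading, so a label list that opens masks with `div=True` should have the same effect for 0/255 masks; treat that as a pointer to check against the fastai docs, not a tested recipe.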


(Mehdi) #3

Hi Dusan,

Seems like it’s exactly this. I’ll let you know when I solve this ! Thanks a lot !


(Mehdi) #4

Hey, Dusan !

I found a solution in this thread for fixing the mask problems. However, this instruction

learn = unet_learner(data, models.resnet34, loss_func = nn.BCELoss, metrics = [salt_acc],  wd = 1e-2).to_fp16()

returned an error saying that the bool value of a tensor was ambiguous.
In the end, I launched the model without specifying a loss function, simply adding a custom metric, and it seemed to work fine!

Would you happen to have an idea about this error ?

bool value of Tensor with more than one value is ambiguous

Thanks anyway for your help !


(Dusan Drevicky) #5

Hi Mehdi,

Glad you solved your problem (partially at least). I’m not sure what’s going on with the error, but a quick search suggests that the problem might indeed be with the loss function. Maybe that helps; sorry I can’t offer a deeper suggestion :) Have a nice day!
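
One plausible explanation, offered as a guess since it was not confirmed in this thread: `loss_func = nn.BCELoss` passes the class itself, not an instance. When the training loop calls `loss_func(pred, target)`, that runs the `BCELoss` constructor with the two tensors as its first arguments, and PyTorch then truth-tests a multi-element tensor. The sketch below reproduces this with plain PyTorch on made-up tensors (`pred` and `target` are hypothetical stand-ins for model output and mask); the exact wording of the message varies between PyTorch versions.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for a batch of predictions and binary targets.
pred = torch.rand(4, 1)                        # probabilities in [0, 1)
target = torch.randint(0, 2, (4, 1)).float()   # 0.0 or 1.0

# Passing the class: "calling" it builds a BCELoss with tensors as
# constructor arguments, which ends up truth-testing a tensor.
loss_cls = nn.BCELoss
raised = False
message = ""
try:
    loss_cls(pred, target)
except RuntimeError as err:
    raised = True
    message = str(err)
print("class call raised:", raised, "|", message)

# Passing an instance: calling it computes the loss as intended.
loss_fn = nn.BCELoss()
loss = loss_fn(pred, target)
print("instance call loss:", loss.item())
```

If this is the cause, writing `loss_func=nn.BCELoss()` would avoid that particular error. Note that `BCELoss` expects probabilities with the same shape as the target, so it may still not fit a two-class output without further changes.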
