Help needed with dataset preprocessing for segmentation task

Hi mates!
I am struggling with library behaviour that I did not expect.
I am trying to reuse the Lesson 3 code for another dataset.
Initially the dataset has a lot of classes per mask:

We need only 7 classes (face, nose, eyes …).
So I tried to clean up the label files with preprocessing that assigns label values from 0 to 6,
with 0 for background, like this:

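import numpy as np
from PIL import Image  # PIL's Image, needed for Image.fromarray

classes = [76, 37, 225, 178, 149, 29, 58]  # original grey values of the classes we keep
# 76 => background
# 37 => hair

def cls_cleanup(inp):
    "Remap the original grey values to 0..6; every other value becomes background (0)."
    src = np.array(inp)
    out = np.zeros_like(src)  # fresh array, so new labels never collide with original values
    for key, cls in enumerate(classes):
        out[src == cls] = key
    return Image.fromarray(out)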
for fld in
    f = Path(fld)
    o = out/
    if not os.path.exists(o):
    for file in
        if 'labels' not in str(file.parent):
            s_f = open_image(file)
        else:
            s_f = open_mask(file, after_open=cls_cleanup)
        o_f = crop_pad(s_f, size, 'reflection', 0.5, 0.5)
        … '.bmp', '.png'))

After this preprocessing, the newly loaded masks somehow
have the labels 0, 250, 251, 252, 253, 254, 255,
but we need 0, 1, 2, 3, 4, 5, 6.

Your help would be really appreciated; I have been fighting with this issue for almost a week :slightly_frowning_face:

P.S. It looks like these class indexes cause the following error when I try to get the learning rate:

RuntimeError                              Traceback (most recent call last)
~/.local/lib/python3.6/site-packages/fastai/ in fit(epochs, learn, callbacks, metrics)
     99                 xb, yb = cb_handler.on_batch_begin(xb, yb)
--> 100                 loss = loss_batch(learn.model, xb, yb, learn.loss_func, learn.opt, cb_handler)
    101                 if cb_handler.on_batch_end(loss): break

~/.local/lib/python3.6/site-packages/fastai/ in loss_batch(model, xb, yb, loss_func, opt, cb_handler)
     31     if opt is not None:
---> 32         loss,skip_bwd = cb_handler.on_backward_begin(loss)
     33         if not skip_bwd:                     loss.backward()

~/.local/lib/python3.6/site-packages/fastai/ in on_backward_begin(self, loss)
    288         "Handle gradient calculation on `loss`."
--> 289         self.smoothener.add_value(loss.detach().cpu())
    290         self.state_dict['last_loss'], self.state_dict['smooth_loss'] = loss, self.smoothener.smooth

RuntimeError: CUDA error: device-side assert triggered
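
A device-side assert raised during the loss computation is commonly a symptom of a target index outside the valid class range (here, anything other than 0..6), although the traceback alone does not prove that this is the cause in this run. A minimal sketch of a sanity check to run on the mask values before training, in plain Python; `out_of_range_labels` is a hypothetical helper, not part of fastai:

```python
def out_of_range_labels(mask_values, n_classes):
    """Return the sorted label values that are not valid class indices 0..n_classes-1."""
    return sorted({int(v) for v in mask_values if not 0 <= int(v) < n_classes})

# Values reported for the reloaded masks in this thread.
bad = out_of_range_labels([0, 250, 251, 252, 253, 254, 255], n_classes=7)
print(bad)  # [250, 251, 252, 253, 254, 255]

# Values expected after a correct cleanup.
ok = out_of_range_labels([0, 1, 2, 3, 4, 5, 6], n_classes=7)
print(ok)  # []
```

An empty list means every label is a usable class index; anything else is worth fixing before calling the learner.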

Try this:

Thank you Patrick,
I've tried to use that code.
Once the labels are loaded and processed, they look good: np.unique(mask) = [0, 1, 2, 3, 4, 5, 6].
Then we save them (preprocessing before the actual feed into the training pipeline).
Once we open a saved, processed mask, we get np.unique(mask) = [0, 250, 251, 252, 253, 254, 255].
So the preprocessing that should make the label files ready for the training pipeline fails for me :(
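
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the reported values are exactly what labels 0..6 become if each is multiplied by 255 and then stored in an unsigned 8-bit pixel, which wraps modulo 256. Since the values are correct before saving, the save step would be the likely place for such a rescaling, but which call does it has not been verified here. The arithmetic can be sketched in plain Python:

```python
# Illustration only: labels 0..6 scaled by 255 and wrapped into the uint8 range.
labels = list(range(7))
wrapped = [(k * 255) % 256 for k in labels]
print(sorted(wrapped))  # [0, 250, 251, 252, 253, 254, 255]

# If that is what happened, the mapping is reversible:
recovered = [(256 - v) % 256 for v in wrapped]
print(recovered)  # [0, 1, 2, 3, 4, 5, 6]
```

If this matches, saving the masks without any rescaling (or undoing the mapping on the saved files) should give back 0..6.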

Have you tried to open the transformed masks in e.g. ImageJ / Fiji?

See here if you do not know it:

It is a great tool for data quality control and image processing in general. You could first check whether the labels are loaded correctly there, and report back.
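
The same kind of check can also be scripted. The sketch below assumes NumPy and Pillow are installed and writes the mask directly with PIL, bypassing any library-specific save method; `roundtrip_labels` and the synthetic mask are hypothetical stand-ins for a real processed label file:

```python
import os
import tempfile

import numpy as np
from PIL import Image

def roundtrip_labels(mask, path):
    """Save a 2-D label mask as an 8-bit PNG and return the unique values read back."""
    Image.fromarray(mask.astype(np.uint8)).save(path)  # greyscale PNG, lossless, no rescaling
    with Image.open(path) as im:
        return np.unique(np.array(im))

# Synthetic 7-class mask standing in for a processed label file.
mask = np.arange(7, dtype=np.uint8).repeat(4).reshape(7, 4)
with tempfile.TemporaryDirectory() as tmp:
    values = roundtrip_labels(mask, os.path.join(tmp, "mask.png"))
print(values.tolist())  # [0, 1, 2, 3, 4, 5, 6]
```

If a mask written this way reloads with the expected labels while the pipeline's own output does not, the difference points at the save step rather than at the cleanup function.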