Instance segmentation

@muellerzr With the following code:

manual = DataBlock(blocks=(ImageBlock,MaskBlock(codes),BBoxBlockSegmentation, BBoxLblBlock),
                   get_items=partial(get_image_files,folders=[dataset1]),
                   getters=getters,
                   splitter=RandomSplitter(valid_pct=0.1,seed=2020),
                   item_tfms=Resize((size,size)),
                   batch_tfms=Normalize.from_stats(*imagenet_stats),
                   n_inp=1
                  )
manual.summary(path_images)
dls = manual.dataloaders(path_images,bs=bs)
dls.one_batch()

I get a different error:

Setting-up type transforms pipelines
Collecting items from ../datasets/Images
Found 621 items
2 datasets of sizes 559,62
Setting up Pipeline: <lambda> -> PILBase.create
Setting up Pipeline: get_mask -> PILBase.create
Setting up Pipeline: get_bbox -> TensorBBox.create
Setting up Pipeline: get_bbox_label -> MultiCategorize

Building one sample
  Pipeline: <lambda> -> PILBase.create
    starting from
      ../datasets/Images/manual/165.png
    applying <lambda> gives
      ../datasets/Images/manual/165.png
    applying PILBase.create gives
      PILImage mode=RGB size=1002x1004
  Pipeline: get_mask -> PILBase.create
    starting from
      ../datasets/Images/manual/165.png
    applying get_mask gives
      ../datasets/Labels/manual/165.png
    applying PILBase.create gives
      PILMask mode=L size=1002x1004
  Pipeline: get_bbox -> TensorBBox.create
    starting from
      ../datasets/Images/manual/165.png
    applying get_bbox gives
      [[425, 387, 641, 591]]
    applying TensorBBox.create gives
      TensorBBox of size 1x4
  Pipeline: get_bbox_label -> MultiCategorize
    starting from
      ../datasets/Images/manual/165.png
    applying get_bbox_label gives
      [Class1]
    applying MultiCategorize gives
      TensorMultiCategory([1])

Final sample: (PILImage mode=RGB size=1002x1004, PILMask mode=L size=1002x1004, TensorBBox([[425., 387., 641., 591.]]), TensorMultiCategory([1]))


Setting up after_item: Pipeline: AddMaskCodes -> BBoxLabeler -> PointScaler -> Resize -> ToTensor
Setting up before_batch: Pipeline: mybb_pad
Setting up after_batch: Pipeline: IntToFloatTensor -> Normalize
Could not do one pass in your dataloader, there is something wrong in it

Building one batch
Applying item_tfms to the first sample:
  Pipeline: AddMaskCodes -> BBoxLabeler -> PointScaler -> Resize -> ToTensor
    starting from
      (PILImage mode=RGB size=1002x1004, PILMask mode=L size=1002x1004, TensorBBox of size 1x4, TensorMultiCategory([1]))
    applying AddMaskCodes gives
      (PILImage mode=RGB size=1002x1004, PILMask mode=L size=1002x1004, TensorBBox of size 1x4, TensorMultiCategory([1]))
    applying BBoxLabeler gives
      (PILImage mode=RGB size=1002x1004, PILMask mode=L size=1002x1004, TensorBBox of size 1x4, TensorMultiCategory([1]))
    applying PointScaler gives
      (PILImage mode=RGB size=1002x1004, PILMask mode=L size=1002x1004, TensorBBox of size 1x4, TensorMultiCategory([1]))
    applying Resize gives
      (PILImage mode=RGB size=1002x1002, PILMask mode=L size=1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1]))
    applying ToTensor gives
      (TensorImage of size 3x1002x1002, TensorMask of size 1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1]))

Adding the next 3 samples

Applying before_batch to the list of samples
  Pipeline: mybb_pad
    starting from
      [(TensorImage of size 3x1002x1002, TensorMask of size 1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1])), (TensorImage of size 3x1002x1002, TensorMask of size 1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1])), (TensorImage of size 3x1002x1002, TensorMask of size 1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1])), (TensorImage of size 3x1002x1002, TensorMask of size 1002x1002, TensorBBox of size 1x4, TensorMultiCategory([1]))]
    applying mybb_pad failed.
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-22-e40df62d36e3> in <module>
      7                    n_inp=1
      8                   )
----> 9 manual.summary(path_images)
     10 dls = manual.dataloaders(path_images,bs=bs)
     11 dls.one_batch()

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/block.py in summary(self, source, bs, show_batch, **kwargs)
    171     if len([f for f in dls.train.before_batch.fs if f.name != 'noop'])!=0:
    172         print("\nApplying before_batch to the list of samples")
--> 173         s = _apply_pipeline(dls.train.before_batch, s)
    174     else: print("\nNo before_batch transform to apply")
    175 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    131         except Exception as e:
    132             print(f"    applying {name} failed.")
--> 133             raise e
    134     return x
    135 

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/data/block.py in _apply_pipeline(p, x)
    127         name = f.name
    128         try:
--> 129             x = f(x)
    130             if name != "noop": print(f"    applying {name} gives\n      {_short_repr(x)}")
    131         except Exception as e:

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastcore/transform.py in __call__(self, x, **kwargs)
     70     @property
     71     def name(self): return getattr(self, '_name', _get_name(self))
---> 72     def __call__(self, x, **kwargs): return self._call('encodes', x, **kwargs)
     73     def decode  (self, x, **kwargs): return self._call('decodes', x, **kwargs)
     74     def __repr__(self): return f'{self.name}: {self.encodes} {self.decodes}'

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastcore/transform.py in _call(self, fn, x, split_idx, **kwargs)
     80     def _call(self, fn, x, split_idx=None, **kwargs):
     81         if split_idx!=self.split_idx and self.split_idx is not None: return x
---> 82         return self._do_call(getattr(self, fn), x, **kwargs)
     83 
     84     def _do_call(self, f, x, **kwargs):

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastcore/transform.py in _do_call(self, f, x, **kwargs)
     84     def _do_call(self, f, x, **kwargs):
     85         if not _is_tuple(x):
---> 86             return x if f is None else retain_type(f(x, **kwargs), x, f.returns_none(x))
     87         res = tuple(self._do_call(f, x_, **kwargs) for x_ in x)
     88         return retain_type(res, x)

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastcore/dispatch.py in __call__(self, *args, **kwargs)
     96         if not f: return args[0]
     97         if self.inst is not None: f = MethodType(f, self.inst)
---> 98         return f(*args, **kwargs)
     99 
    100     def __get__(self, inst, owner):

<ipython-input-13-ef644dab5e4a> in mybb_pad(samples, pad_idx)
      2     "Function that collect `samples` of labelled bboxes and adds padding with `pad_idx`."
      3     if len(samples[0]) > 3:
----> 4       samples = [(s[0], *clip_remove_empty(*s[1:3])) for s in samples]
      5     else:
      6       samples = [(s[0], *clip_remove_empty(*s[1:])) for s in samples]

<ipython-input-13-ef644dab5e4a> in <listcomp>(.0)
      2     "Function that collect `samples` of labelled bboxes and adds padding with `pad_idx`."
      3     if len(samples[0]) > 3:
----> 4       samples = [(s[0], *clip_remove_empty(*s[1:3])) for s in samples]
      5     else:
      6       samples = [(s[0], *clip_remove_empty(*s[1:])) for s in samples]

~/anaconda3/envs/seg/lib/python3.7/site-packages/fastai2/vision/data.py in clip_remove_empty(bbox, label)
     26     bbox = torch.clamp(bbox, -1, 1)
     27     empty = ((bbox[...,2] - bbox[...,0])*(bbox[...,3] - bbox[...,1]) < 0.)
---> 28     return (bbox[~empty], label[~empty])
     29 
     30 # Cell

IndexError: The shape of the mask [1002] at index 0 does not match the shape of the indexed tensor [1, 4] at index 0
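Reading the traceback: with n_inp=1 each sample here is (image, mask, bbox, label), so s[1:3] passes (mask, bbox) to clip_remove_empty, which then tries to index the 1x4 TensorBBox with a mask-shaped boolean, producing exactly the IndexError above. A minimal sketch of a fix, assuming that 4-tuple layout and leaving the rest of mybb_pad unchanged:

def mybb_pad(samples, pad_idx=0):
    "Collect `samples` of labelled bboxes and add padding with `pad_idx`."
    if len(samples[0]) > 3:
        # (image, mask, bbox, label): keep the mask, clip only the bbox+label pair
        samples = [(s[0], s[1], *clip_remove_empty(*s[2:4])) for s in samples]
    else:
        samples = [(s[0], *clip_remove_empty(*s[1:])) for s in samples]
    # ... padding logic unchanged ...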

I’ll have to take a look at this later tonight and get back to you :slight_smile:

Nice!! :blush:

Thank you very much for your help.

I am going to take a deep look at the Mask R-CNN model that is available in torchvision. In training mode this model returns the losses instead of the predictions.

I am going to adapt the Mask R-CNN code from torchvision so that the model returns the predictions, letting us train as usual with fastai. We would just need to update the loss_func.
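For reference, a minimal sketch of the torchvision detection API this plan revolves around (the dummy image and target below are placeholders):

import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn

model = maskrcnn_resnet50_fpn(num_classes=2)

# One dummy image and one dummy target, in the format the model expects
images = [torch.rand(3, 256, 256)]
targets = [{"boxes": torch.tensor([[10., 10., 100., 100.]]),
            "labels": torch.tensor([1]),
            "masks": torch.zeros(1, 256, 256, dtype=torch.uint8)}]

model.train()
loss_dict = model(images, targets)  # dict of losses, e.g. loss_classifier, loss_mask, ...
loss = sum(loss_dict.values())      # single scalar to backprop

model.eval()
preds = model(images)               # list of dicts with boxes, labels, scores, masks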

@muellerzr did you find the solution?

To get the torchvision Mask R-CNN working with the Learner class, I think the proper way is to subclass Learner.

I have created a topic explaining the concerns and problems I am struggling with.

Did you have time to look at it? If not, no problem. Thank you for all your help. :smiley:

I am trying to create a new block for MaskRCNN. It’s working.

However, masks are not getting resized. You can take a look here.

UPDATE with my progress.

Getting near to a solution!

The dataloader is working:

class MaskRCNN(dict):

    @classmethod
    def create(cls, dictionary):
        # Shallow-copy the incoming dict of targets into our own type
        return cls(dict(dictionary))

    def show(self, ctx=None, **kwargs):
        # Only the masks are displayed; boxes and labels are kept but not drawn
        boxes = self["boxes"]
        labels = self["labels"]
        masks = self["masks"]
        return show_image(masks, ctx=ctx, **kwargs)

def MaskRCNNBlock(): 
    return TransformBlock(type_tfms=MaskRCNN.create, batch_tfms=IntToFloatTensor)

def get_bbox(o):
    # Derive a bounding box from the extremes of the non-zero mask pixels
    label_path = get_y_fn(o)
    mask = PILMask.create(label_path)
    pos = np.where(mask)
    xmin, xmax = np.min(pos[1]), np.max(pos[1])
    ymin, ymax = np.min(pos[0]), np.max(pos[0])
    return TensorBBox.create([xmin, ymin, xmax, ymax])

def get_bbox_label(o):
    # Binary problem: every box gets the single foreground class
    return TensorCategory([1])

def get_mask(o):
    label_path = get_y_fn(o)
    mask = PILMask.create(label_path)
    mask = image2tensor(mask)
    return TensorMask(mask)

def get_dict(o):
    return {"boxes": get_bbox(o), "labels": get_bbox_label(o), "masks": get_mask(o)}

getters = [lambda o: o, get_dict]

maskrcnnDataBlock = DataBlock(
    blocks=(ImageBlock, MaskRCNNBlock),
    get_items=partial(get_image_files, folders=[manual_name]),
    getters=getters,
    splitter=RandomSplitter(valid_pct=0.1, seed=2020),
    item_tfms=Resize((size, size)),
    batch_tfms=Normalize.from_stats(*imagenet_stats)
)
maskrcnnDataBlock.summary(path_images)
dls = maskrcnnDataBlock.dataloaders(path_images, bs=bs)

Testing if the data works with the model:

b = dls.one_batch()

from torchvision.models.detection.mask_rcnn import *
model=maskrcnn_resnet50_fpn(num_classes=2,min_size=1002,max_size=1002)
model.train()
model = model.to("cuda")

image, target = b
images = [aux for aux in image]  # the model expects a list of per-image tensors
targets = []
for i in range(len(target["masks"])):
    # one target dict per image in the batch
    targets.append({"boxes": target["boxes"][i], "labels": target["labels"][i], "masks": target["masks"][i]})
output = model(images, targets)
output
output

model.eval()
output=model(images)
output

This works. So I decided to create a subclass of Learner to make it compatible with the whole fastai library:

class Mask_RCNN_Learner(Learner):
    def __init__(self, dls, model, loss_func=None, opt_func=Adam, lr=defaults.lr, splitter=trainable_params, cbs=None,
                 metrics=None, path=None, model_dir='models', wd=None, wd_bn_bias=False, train_bn=True,
                 moms=(0.95,0.85,0.95)):
        super().__init__(dls, model, loss_func, opt_func, lr, splitter, cbs,
                 metrics, path, model_dir, wd, wd_bn_bias, train_bn,
                 moms)
      
    def all_batches(self):
        self.n_iter = len(self.dl)
        for o in enumerate(self.dl): self.one_batch(*o)

    def one_batch(self, i, b):
        self.iter = i
        try:
            self._split(b);                                  self('begin_batch')
            # self.xb/self.yb are tuples; unpack the batched tensors inside them
            images = [img for img in self.xb[0]]
            target = self.yb[0]
            targets = []
            for j in range(len(target["masks"])):
                targets.append({"boxes": target["boxes"][j], "labels": target["labels"][j], "masks": target["masks"][j]})
            # In training mode the torchvision model returns a dict of losses
            loss_dict = self.model(images, targets);         self('after_pred')
            if len(self.yb) == 0: return
            loss = sum(loss for loss in loss_dict.values())
            self.loss = loss;                                self('after_loss')
            if not self.training: return
            self.loss.backward();                            self('after_backward')
            self.opt.step();                                 self('after_step')
            self.opt.zero_grad()
        except CancelBatchException:                         self('after_cancel_batch')
        finally:                                             self('after_batch')

    def _do_begin_fit(self, n_epoch):
        self.n_epoch,self.loss = n_epoch,tensor(0.);         self('begin_fit')

    def _do_epoch_train(self):
        try:
            self.dl = self.dls.train;                        self('begin_train')
            self.all_batches()
        except CancelTrainException:                         self('after_cancel_train')
        finally:                                             self('after_train')

    def _do_epoch_validate(self, ds_idx=1, dl=None):
        if dl is None: dl = self.dls[ds_idx]
        try:
            self.dl = dl;                                    self('begin_validate')
            with torch.no_grad(): self.all_batches()
        except CancelValidException:                         self('after_cancel_validate')
        finally:                                             self('after_validate')                                              
    
    @log_args(but='cbs')
    def fit(self, n_epoch, lr=None, wd=None, cbs=None, reset_opt=False):
        with self.added_cbs(cbs):
            if reset_opt or not self.opt: self.create_opt()
            if wd is None: wd = self.wd
            if wd is not None: self.opt.set_hypers(wd=wd)
            self.opt.set_hypers(lr=self.lr if lr is None else lr)

            try:
                self._do_begin_fit(n_epoch)
                for epoch in range(n_epoch):
                    try:
                        self.epoch=epoch;          self('begin_epoch')
                        self._do_epoch_train()
                        self._do_epoch_validate()
                    except CancelEpochException:   self('after_cancel_epoch')
                    finally:                       self('after_epoch')

            except CancelFitException:             self('after_cancel_fit')
            finally:                               self('after_fit')  
                
    def validate(self, ds_idx=1, dl=None, cbs=None):
        if dl is None: dl = self.dls[ds_idx]
        with self.added_cbs(cbs), self.no_logging(), self.no_mbar():
            self(_before_epoch)
            self._do_epoch_validate(ds_idx, dl)
            self(_after_epoch)
        return getattr(self, 'final_record', None)

Just to highlight the changes:

            self._split(b);                                  self('begin_batch')
            # self.xb/self.yb are tuples; unpack the batched tensors inside them
            images = [img for img in self.xb[0]]
            target = self.yb[0]
            targets = []
            for j in range(len(target["masks"])):
                targets.append({"boxes": target["boxes"][j], "labels": target["labels"][j], "masks": target["masks"][j]})
            # The torchvision model returns a dict of losses in training mode
            loss_dict = self.model(images, targets);         self('after_pred')
            if len(self.yb) == 0: return
            loss = sum(loss for loss in loss_dict.values())

The learner construction:

from torchvision.models.detection.mask_rcnn import *
model=maskrcnn_resnet50_fpn(num_classes=2,min_size=1002,max_size=1002)
model.train()
model = model.to("cuda")
learn = Mask_RCNN_Learner(dls=dls, model=model,loss_func=nn.L1Loss(),
                wd=1e-1).to_fp16()
learn.fit_one_cycle(5, 1e-3)

Gives this error:

Traceback (most recent call last):
  File "/home/david/anaconda3/envs/seg/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/home/david/anaconda3/envs/seg/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <class '__main__.MaskRCNN'>: it's not the same object as __main__.MaskRCNN

(the same traceback is repeated by every dataloader worker)
So, that’s where I am stuck right now. If this works, I just need to figure out how to modify the metrics computation.

@WaterKnight
This error is related to pickling. A pickling error usually happens when pickle is confused about the class you are trying to dump (e.g. you import a class, change it, and then dump an instance of it, or your class has a conflicting name).

My best guess is that the error comes from a name conflict with your data class MaskRCNN.

I see you import the Mask R-CNN module from torchvision (i.e. from torchvision.models.detection.mask_rcnn import *). In this module there is also a class called MaskRCNN (see the source code), which conflicts with the name of your data class MaskRCNN.

Try changing the class name to something else and see if the error goes away.
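A minimal sketch of that rename (the new name MaskRCNNTarget is arbitrary, not anything fastai or torchvision defines):

# Rename the data class so it no longer shadows torchvision's MaskRCNN
class MaskRCNNTarget(dict):
    @classmethod
    def create(cls, dictionary):
        return cls(dict(dictionary))
    # show(...) unchanged from the MaskRCNN class above

def MaskRCNNBlock():
    return TransformBlock(type_tfms=MaskRCNNTarget.create, batch_tfms=IntToFloatTensor)

With the conflict gone, the dataloader workers can pickle the target class unambiguously.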

Thank you very much. It solved the issue.

Mask R-CNN is now working in fastai2! :smiley:

The remaining work is:

  • Adjust the input for the metrics
  • Solve issues with FP16: some of the model’s losses are NaN in FP16 but not in FP32.
  • Improve show_result

For the metrics I was given some intuition. However, I don’t understand it.

You need to write a function that passes your yb["masks"] to the fastai metric function you want to use.
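A minimal sketch of such a wrapper, assuming the targets are the dict built by get_dict above (the prediction side may need similar unpacking depending on what the model returns):

def masked_metric(metric):
    "Wrap a segmentation metric so it reads the masks out of the target dict."
    def _inner(pred, targ):
        targ = targ["masks"] if isinstance(targ, dict) else targ
        return metric(pred, targ)
    _inner.__name__ = metric.__name__
    return _inner

# e.g. metrics=[masked_metric(foreground_acc)] when building the Learner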

Ahhh, okay. Could you link me to the default function, please? So I can look at its code and override it.

I don’t know which functions I should redefine, or which line of code calls the metrics computation.

In that case all fastai metrics would work.


EDIT
@sgugger I am editing this post with a better explanation.

The Learner class can receive an array of metrics as a parameter. Somewhere in the fastai library there should be a function that passes the data (self.yb and self.pred) to each metric. I have been looking for this code inside learner.py and in the callbacks.

However, I didn’t manage to find it. Could you post the function here, or a link that points to it, please?

I discovered what you were referring to.

I see two options to solve this issue:

  • Create different metrics that get the mask from the dict. However, that means duplicating code in each class.
  • Override Recorder.after_batch at line 417 (for met in mets: met.accumulate(self.learn)). Then there is no code duplication (see the sketch below).
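A rough sketch of the second option, assuming Recorder exposes the _train_mets/_valid_mets used in that loop, and omitting the loss bookkeeping the real after_batch also does:

from copy import copy

class MaskRecorder(Recorder):
    def after_batch(self):
        "Accumulate each metric on a proxy learner whose yb holds just the masks."
        if len(self.yb) == 0: return
        proxy = copy(self.learn)                 # shallow copy, no deep_copy needed
        proxy.yb = (self.learn.yb[0]["masks"],)  # assumes yb is (target_dict,)
        mets = self._train_mets if self.training else self._valid_mets
        for met in mets: met.accumulate(proxy)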

Do you think it is better to pass a deep copy of self.learn, or another object whose pred and yb contain just the masks?

Which option looks better for you?

Thank you very much for the previous help! :smiley:

I don’t understand. Could you explain it a little more, please?

Hi @WaterKnight, do you have a GitHub repo of your work that is shareable? It seems very interesting.

Hey, I am training Mask R-CNN right now for binary segmentation. It is fully working with a variable batch size. The only thing not working is mixed precision, which is due to an error in PyTorch.

I’ll be sharing the repo in a month! This is my final degree project.

no worries – best of luck!

I hope that by then PyTorch will have solved the problem with torch.cuda.amp.autocast and Mask R-CNN.

In that repo you will find semantic segmentation models too!

I guess what I am interested to see is whether you figured out a way to use an error metric. Also, does lr_find work well for Mask R-CNN?

Yes, I managed to get Dice, Jaccard coefficient and other segmentation metrics. lr_find is working. Freezing and unfreezing too.

The only remaining issue is mixed precision.

Cool, so the code snippet you pasted above actually works? If we just change the name of the data class MaskRCNN, as mentioned in the comments, will it actually work, or are there other changes?
Thanks for your input :slight_smile: