Detectron2 anyone?

Hi all.

I’m currently trying to fool around with object detection and semantic segmentation.
Is anyone working with detectron2 in a fastai context?
They have a wealth of pretrained models, and since object detection is kinda lacking in fastai, maybe interfacing with this lib would be a good approach?



I haven’t tried it, but I agree it’s a good avenue to explore. Would love to hear any results from combining fastai v2 with it.


Over the last few days I tried to integrate fastai2 and detectron2, and I reached the conclusion that it’s not worth it.

Detectron2 already does things its own way: it has its own custom training loop with a minimal hook system, and it already has a “data API”. Either I would have to substitute almost all the functionality already present in detectron2 with fastai2, or barely use fastai2 at all. I wasn’t happy with either option (and trying to glue everything together was a very poor coding experience, arrrgh, so many hacks!!!).

The solution? Build everything from the ground up with fastai + torchvision!
Well, I say “from the ground up”, but actually only a few small adjustments need to be made. Torchvision already provides us with the models and the losses (and those have a lot of moving parts), and fastai provides everything else.

And as we all know, the fastai coding experience is very joyful, and torchvision is very solid as well. I got a Mask RCNN model up and running and training in only two days!

I’m now working on making the core abstractions I created more robust. There’s still a lot left to do; I’ll be posting more info here in the forums soon :grin:


@lgvaz, that sounds great! I had a similar experience initially and kinda gave up on it… :man_shrugging:

Any repo to share yet? Would be super interested to see some of your code :+1:

I am working on integrating MaskRCNN into fastai2 too.

I needed to make a Learner subclass and new DataBlocks.

I got it working this morning.

However, I am in the same state as you: I need to integrate the metrics and generalize the code.

We could work hand in hand. If you are interested, send me a message. I’ll be glad to work this out!!


@cwerner I’ll try to complete some basic functionality by tomorrow and share what I’ve got so far. Right now there is no point in sharing the repo because everything is still a mess :sweat_smile:

@WaterKnight That’s really cool!! It would be very interesting and insightful to compare our ideas and see the different approaches used to solve the same problem.

What was your strategy for calculating the validation loss? Did you switch the model to training mode?

It would be very nice to collaborate with you both yes :grin:


This is the latest thing I worked on, learn.predict:


Yes, in the lines before feeding the input to the model I switch the model to train mode. I also needed to transform my big dict (with several values per key) into separate dicts so I could feed them into the model.

I am also assigning the result dict with losses to pred :joy:.
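For reference, torchvision’s detection models return a dict of named losses in train mode and a list of per-image prediction dicts in eval mode, which is why the model has to be flipped to train mode to compute a validation loss at all. A toy stand-in for that dual behavior (the `ToyRCNN` class is hypothetical, illustration only, not torchvision code):

```python
class ToyRCNN:
    """Mimics the torchvision detection API: losses when training, preds when not."""
    def __init__(self): self.training = True
    def train(self): self.training = True; return self
    def eval(self): self.training = False; return self

    def __call__(self, images, targets=None):
        if self.training:
            # train mode needs targets and returns a dict of named losses
            assert targets is not None
            return {"loss_classifier": 0.5, "loss_mask": 0.3}
        # eval mode returns one prediction dict per input image
        return [{"boxes": [], "masks": []} for _ in images]

model = ToyRCNN()
images, targets = ["img1", "img2"], [{"boxes": []}, {"boxes": []}]
losses = model.train()(images, targets)  # dict of losses
preds = model.eval()(images)             # list of per-image predictions
```

So "validation loss" means calling the model in train mode on the validation batches, while actual predictions require a separate eval-mode pass.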

At FP16 the model doesn’t work; some losses get NaN values. In FP32 this is not happening.

In show_results I switch the model to eval mode and plot the original mask on the left and the predicted one on the right.

Your plot looks very nice. Could you share it?

I think that my DataBlock is not very good. Transforms like Resize, Normalize, Flip… don’t work.

Have you used the DataBlock API?

Do you have Telegram? If you want, you could send me your user by private message.


I was waiting for the lib to be more stable before sharing, but what the hell, here it is: MantisShrimp!! Here is the initial example I’m working on.

I’ll also use this opportunity to talk about some difficulties and design choices I’ve made over the last couple of days developing this.

The clearest, most obvious difference is that RCNN models accept a list of variable-size images instead of a single collated tensor. This breaks one of the core assumptions of fastai: batch transforms expect collated tensors. I think there is no easy way of solving this, and I imagine this is where you were having problems with your transforms as well.

Alright, alright, leaving that problem aside, the first step needed was to change the collate function: instead of trying to get a single tensor with bs in the first dimension, we can just use a list of tensors. For that we just need to change the collate function used in DataLoader.create_batch to something like this:

def _zip_collate(t): return Tuple(zip(*t))

Being a tuple also has the added property that all transforms will be dispatched.
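To illustrate what this zip-style collate does, here is a minimal, framework-free sketch with plain-Python stand-ins for the image tensors and target dicts (the strings and dicts below are placeholders, not real data):

```python
def zip_collate(batch):
    # Instead of stacking samples into one big tensor, transpose the list of
    # (image, target) pairs into one sequence per component.
    return tuple(zip(*batch))

# Two samples whose "images" have different sizes, plus their target dicts
samples = [("img_600x800", {"boxes": [[0, 0, 10, 10]]}),
           ("img_480x640", {"boxes": [[5, 5, 20, 20]]})]
images, targets = zip_collate(samples)
# images  -> ('img_600x800', 'img_480x640')
# targets -> ({'boxes': [[0, 0, 10, 10]]}, {'boxes': [[5, 5, 20, 20]]})
```

Nothing is padded or resized; each component of the batch stays a plain sequence of per-sample values, which is exactly the input shape the torchvision RCNN models expect.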

While this worked initially, little did I know that pure chaos lay before me. Long story short, Tuple was not working well with functions like tuplify, is_iter, L and many more. E.g. say we call tuplify on a tensor: what we get back is (tensor,). But if we call tuplify on a tuple, nothing changes! What I actually need is for the inner tuple to get wrapped.
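A simplified sketch of the tuplify semantics being described (this is not fastcore’s actual implementation, just the behavior that causes the problem):

```python
def tuplify(o):
    # fastcore-style behavior (simplified): a non-iterable value gets wrapped
    # in a tuple, but anything already tuple/list-like is returned unchanged.
    return o if isinstance(o, (list, tuple)) else (o,)

wrapped = tuplify(42)       # (42,)  -- a lone value gets wrapped
unchanged = tuplify((1, 2)) # (1, 2) -- the inner tuple is NOT wrapped again
```

The batch-as-tuple therefore passes through unwrapped, when what the pipeline actually needs downstream is an extra level of wrapping around it.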

The solution I found was to create a new type, which I called Bucket. With some clever tricks (that I don’t consider good practice) I made Bucket a type that behaves like a list but always gets wrapped by L, tuplify and the like. Transforms also get dispatched over all items in a Bucket, just like a Tuple.

The last thing I did was to patch some fastai2 functions like to_concat and nested_reorder so they know what to do when they encounter a Bucket.

After that, I created a callback that injects yb into the model and a callback that gets the loss from the model’s “predictions”.
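Since in train mode the model’s output already is a dict of named losses, the reduction that the second callback performs can be as simple as summing them (a sketch of the idea, not the actual MantisShrimp code; the function name is made up):

```python
def loss_from_preds(preds):
    # In train mode an RCNN model's "prediction" is a dict of named losses
    # (loss_classifier, loss_box_reg, loss_mask, ...); the training loss
    # is just their sum.
    return sum(preds.values())

total = loss_from_preds({"loss_classifier": 0.5,
                         "loss_box_reg": 0.2,
                         "loss_mask": 0.3})
```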

That is basically it so far. In the repo I have a list of issues for things I know need to be implemented (but there are still a bunch of things I don’t yet know need to be implemented lol).

@sgugger, @jeremy I would like to ask for your opinions here: does creating this Bucket class seem like the right way to go? I tried and tried to make it work with a simple List/Tuple, but I failed each time.

Bucket needs to be an iterable so it can be used with functions like map. But it also shouldn’t be considered an iterable, so that it gets wrapped by L/tuplify :woozy_face:

The hack I found for this is to make it iterable via the __getitem__ method without implementing __iter__. fastcore’s is_iter checks for the __iter__ method, so Bucket is not considered an iterable.
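This hack relies on Python’s legacy iteration protocol: an object that defines __getitem__ but not __iter__ can still be iterated (Python falls back to indexing from 0 until IndexError), while hasattr-style checks report it as non-iterable. A minimal sketch of the trick (simplified, not the actual MantisShrimp Bucket):

```python
class Bucket:
    "List-like container that iterates in practice but hides __iter__."
    def __init__(self, items): self.items = list(items)
    def __getitem__(self, i): return self.items[i]  # enables legacy iteration
    def __len__(self): return len(self.items)

b = Bucket([1, 2, 3])
list(b)                  # [1, 2, 3] -- iteration works via __getitem__
hasattr(b, "__iter__")   # False -- so is_iter-style checks say "not iterable"
```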

I don’t want to have to pad everything; these models accept variable-size inputs, so it should stay like this. But I’m not sure how this is going to play out with all the batch transforms…

While this solution is working, I don’t like it. It’s very fragile. The library is still at the point where I can throw everything away and start from scratch without being afraid, so any light you can shed is extremely welcome.


There may be some issues I’m not aware of, but since you’re rewriting create_batch and show_batch, why not put all the special behavior there?


Can you elaborate more on “put all the special behavior there”? Especially inside create_batch: what would it be returning in this case?

The problem I was having is that if create_batch returns a List/Tuple, the rest of the pipeline fails to work.

Another problem I’m currently facing (and this happens even with the Buckets) is that the type of my batch is (Bucket, Bucket), making type dispatch hard… I’m trying two solutions here: either dispatching based on the type of the first element of the Bucket, or dynamically creating custom Bucket types, like “ImageTensorBucket”. The second solution worked well, but since the types are created dynamically, pickling them does not work straight out of the box (and this is needed for dataloader multiprocessing).


I don’t understand that part: create_batch is the thing that returns the batch, so there is nothing else after it. That you would need a special type for handling batch transforms, that I get, but you shouldn’t have to monkey-patch other functions.

As for introducing custom new types, they should be exported in the module if you want pickling to work.
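Pickle serializes classes by reference (module plus qualified name), which is why a type created at runtime but never bound under that name at module level cannot cross the dataloader’s process boundary. A quick sketch of the failure and the fix:

```python
import pickle

# A class created dynamically whose name is not bound at module level:
T = type("NotExported", (list,), {})
try:
    pickle.dumps(T([1, 2]))
    pickled = True
except Exception:
    # pickle looks up "NotExported" in this module and can't find it
    pickled = False

# Binding the class under its own name at module scope fixes the lookup:
NotExported = T
roundtripped = pickle.loads(pickle.dumps(T([1, 2])))  # works now
```

So dynamically created Bucket subtypes can still work with multiprocessing, as long as each generated type is also assigned to a matching module-level name.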


You’re right, the training process works almost flawlessly if we simply return a Tuple. The only minor issue is that when Recorder tries to accumulate validation losses it calls find_bs on the tuple, but we can directly patch find_bs for that, so let’s forget about it.

The problems start to appear in the “predict” methods. I think it’s going to be more useful if I just give you the problem instead of trying to explain it, so here is the first one, which happens in learn.get_preds:

dl = self.dls.test_dl([item], num_workers=0)
inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)
TypeError                                 Traceback (most recent call last)
<ipython-input-20-4149556023ad> in <module>
      1 dl = self.dls.test_dl([item], num_workers=0)
----> 2 inp,preds,_,dec_preds = self.get_preds(dl=dl, with_input=True, with_decoded=True)

~/git/fastai2/fastai2/ in get_preds(self, ds_idx, dl, with_input, with_decoded, with_loss, act, inner, reorder, **kwargs)
    235                 res[pred_i] = act(res[pred_i])
    236                 if with_decoded: res.insert(pred_i+2, getattr(self.loss_func, 'decodes', noop)(res[pred_i]))
--> 237             if reorder and hasattr(dl, 'get_idxs'): res = nested_reorder(res, tensor(idxs).argsort())
    238             return tuple(res)

~/git/fastai2/fastai2/ in nested_reorder(t, idxs)
    613     "Reorder all tensors in `t` using `idxs`"
    614     if isinstance(t, (Tensor,L)): return t[idxs]
--> 615     elif is_listy(t): return type(t)(nested_reorder(t_, idxs) for t_ in t)
    616     if t is None: return t
    617     raise TypeError(f"Expected tensor, tuple, list or L but got {type(t)}")

~/git/fastai2/fastai2/ in <genexpr>(.0)
    613     "Reorder all tensors in `t` using `idxs`"
    614     if isinstance(t, (Tensor,L)): return t[idxs]
--> 615     elif is_listy(t): return type(t)(nested_reorder(t_, idxs) for t_ in t)
    616     if t is None: return t
    617     raise TypeError(f"Expected tensor, tuple, list or L but got {type(t)}")

~/git/mantisshrimp/mantisshrimp/data/ in nested_reorder2(t, idxs)
     83     if isinstance(t, Bucket):
     84         return t[idxs]
---> 85     return _old_nested_reorder(t, idxs)
     86 fastai2.torch_core.nested_reorder = nested_reorder2

~/git/fastai2/fastai2/ in nested_reorder(t, idxs)
    615     elif is_listy(t): return type(t)(nested_reorder(t_, idxs) for t_ in t)
    616     if t is None: return t
--> 617     raise TypeError(f"Expected tensor, tuple, list or L but got {type(t)}")
    619 # Cell

TypeError: Expected tensor, tuple, list or L but got <class 'dict'>

The main culprit here is is_listy.
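For context, the check that blows up only treats tuple/list-like values as traversable, so a dict of detection targets matches none of nested_reorder’s branches. Roughly (a simplified stand-in, not fastcore’s exact definition):

```python
def is_listy(x):
    # Simplified version of the check: dicts don't count as listy, so
    # nested_reorder falls through to its "unexpected type" TypeError.
    return isinstance(x, (list, tuple))

listy = is_listy((1, 2))                        # True  -> recursed into
not_listy = is_listy({"boxes": [], "masks": []})  # False -> TypeError raised
```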

After this problem is solved, a further one will be encountered inside learn.predict, here:

dec = self.dls.decode_batch(inp + tuplify(dec_preds))[0]

tuplify will ultimately use is_iter, so our decoded preds will not end up wrapped in a tuple, which is what we need.

This is what led me to this crazy Bucket solution, which required a lot of monkey-patching… I would love a simpler solution if you think it’s possible =)


The problem is that in that solution we’re creating types at runtime =/

But forget about this for now, it’s not important compared to the other problems.

My DataBlock is easier; it’s just a dict.

I am using it just for the target and keeping ImageBlock for the input.

It works pretty straightforwardly. I am not using transforms like resize and normalize; that is done in the first layer of torchvision.models.detection.maskrcnn_resnet50_fpn.

Can you share your implementation?

Does your data block allow for variable-sized images? (Without the need to collate them with the help of Resize)

It works for variable image sizes. All the work is done in a subclass of Learner.


Ow, I’m so sorry, I forgot to answer this.

I don’t have Telegram, but I really like the forum system we have here. If you like, we can create a thread just for discussing this project, with the added benefit of being open to anyone who wants to collaborate =)


In the previous answer there is a topic that was created for that purpose. I have been updating it with my advances.


So I believe this is your current datablock?

maskrccnnDataBlock = DataBlock(
    blocks=(ImageBlock, MaskRCNNBlock),

Does it work if you remove the resize from item_tfms?