Object detection using fast.ai+pytorch+Faster R-CNN


Hi everybody!

I have been working with the Tensorflow Object Detection API + Faster R-CNN to detect dead trees in large aerial/satellite images. The images are huge, so they are split up using a 600×600 moving window. The training dataset is not very large (2000 images), so I use transfer learning as described in the API docs to train only the last layers of the model, which works quite well.
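For reference, the moving-window split can be sketched roughly like this (a minimal sketch with a hypothetical `tile_image` helper, not my actual pipeline):

```python
import numpy as np

def tile_image(image, tile=600, stride=600):
    """Split a large (H, W, C) array into tile x tile windows.

    Edge windows are zero-padded so every tile has the same shape.
    Returns a list of ((y, x), tile_array) pairs.
    """
    h, w = image.shape[:2]
    tiles = []
    for y in range(0, h, stride):
        for x in range(0, w, stride):
            window = image[y:y + tile, x:x + tile]
            # pad edge windows up to the full tile size
            pad_h = tile - window.shape[0]
            pad_w = tile - window.shape[1]
            if pad_h or pad_w:
                window = np.pad(window, ((0, pad_h), (0, pad_w), (0, 0)))
            tiles.append(((y, x), window))
    return tiles

tiles = tile_image(np.zeros((1300, 1300, 3), dtype=np.uint8))
```

With a 1300×1300 input and a 600 stride this yields a 3×3 grid of 9 uniform tiles; an overlapping window would just use `stride < tile`.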

Since I started the #part1-v2 fast.ai course I have been wondering whether everything I am doing with tensorflow could be faster and easier using fast.ai+pytorch. The neat features of the fast.ai library, like the learning rate finder and Stochastic Gradient Descent with Restarts (SGDR), also make this approach very appealing! Unfortunately I haven't found any helpful info on this subject here or on the pytorch forum…

So my question to the people of this lovely forum is: has anybody already tried to do object detection with the fast.ai library using pretrained pytorch Faster R-CNN, R-FCN, or SSD models, or could you point me in a good direction for where to start? @jeremy, or is this going to be covered in #part2 by any chance?

I have found two interesting pytorch implementations, one of Faster R-CNN and one of SSD, that could be useful for this:

I am looking forward to your responses :grinning:



Hi, I saw you haven’t got any responses. Have you been able to implement this using fastai?


(Dien Hoa TRUONG) #3

You can find Object Detection model with fast.ai v1 in this thread: Object detection in fast.ai v1

Specifically, there are SSD and RetinaNet implementations.

Hope that helps


(Brian) #4

Recently torchvision was updated with pretrained models for faster r-cnn (and more):

I’ve been trying to get it to play nice with fastai but it seems to be incompatible, or am I missing something?

I got as far as creating a Learner (with its constructor), but the main problem seems to be that the loss function is baked into the model. In training mode the model expects both images and targets (model(images, targets)) so that the model itself can produce the loss(es).
fastai's Learner.fit() (or rather loss_batch()) only passes the images.

I see how some of the complexity was abstracted away like this, but still… thinking of compatibility, why did they decide to do this?
Is there some feature in fastai that I’ve missed that anticipates this sort of behavior?

I’ve been at it a few days now but so far haven’t been able to make it work. I guess to get it working one would need to:
a) clone and alter most of the torchvision detection code; or
b) create a custom learner?; or
c) hopefully something better that I haven’t discovered yet :wink:

Has anyone else had more luck with this?


(Christian Marzahl) #5

Hi (Moin),

I'm trying to achieve the same, but am facing the same challenges as you. Any progress on your side?

With kind regards,


(Brian) #6

Hi Christian,

I started with altering the torchvision detection code.
I changed GeneralizedRCNN's forward() to not calculate the losses but to only return a dict with the images and the features that come out of the backbone (while training). Then I "stole" rpn and roi_heads and added them to a custom loss function.

If I had created a custom DataSet (or pipeline? I haven't read up on that yet) that could supply the model with the images and targets simultaneously, I wouldn't have had to move those into the loss function, but somehow I was determined to use an ObjectItemList to create a DataBunch ;D

fit() ran, but the losses went to infinity or NaN. More specifically, I noticed the loss_rpn_box_reg losses started to contain infs or NaNs.

At first I figured it was because I wasn't supplying the target box coordinates in the right order, but now I suspect it's because I disabled the transforms in GeneralizedRCNN, assuming fastai would take care of them. I haven't looked closely at those transforms, but there might be some necessary normalization or resizing that I overlooked :sweat_smile:

I’ll give it another go soon. Have you made more progress?
