Object detection in fast.ai v1

You can just clone it and add Object-Detection-Metrics/lib/ to your sys.path. That’s what I did :smiley:
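
A minimal sketch of that setup, assuming the repository was cloned into the current working directory under the name Object-Detection-Metrics (the location is an assumption, so adjust it to wherever your clone lives):

```python
import sys
from pathlib import Path

# Assumed location of the cloned repository; change this to your own path.
metrics_lib = Path("Object-Detection-Metrics") / "lib"

# Make the modules inside lib/ importable by adding the folder to sys.path.
if str(metrics_lib) not in sys.path:
    sys.path.append(str(metrics_lib))
```

After this, the modules inside lib/ can be imported by name from a notebook cell.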

OK, so in the Pascal notebook we don’t have any metric defined?

What do you mean by the Pascal notebook? I found this in his Cocotiny notebook:

voc = PascalVOCMetric(anchors, size, [i for i in data.train_ds.y.classes[1:]])
learn = Learner(data, model, loss_func=crit, callback_fns=[ShowGraph, BBMetrics],

I was talking about this fastai notebook, which is still in the development phase.

Are they using the above metrics in this notebook, or any other metrics?

I haven’t run this notebook yet, but at the end I found mAP. I don’t know if they show any other metrics during training.


The model returns three things:

  1. [1, 24480, 3] -> [batch size, number of boxes, classes + background] -> classification results for each anchor
  2. [1, 24480, 4] -> [batch size, number of boxes, box-coordinate refinements] -> how the anchors should be resized to match the objects in the image
  3. Feature map shapes. You can ignore these to start with; they are mainly for debugging purposes.
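
To make the first two outputs more concrete, here is a rough sketch of how they could be read for a single image. It uses plain Python lists as a stand-in for the tensors (batch dimension dropped) and assumes that class index 0 is the background and that the scores are already probabilities. The function name and the toy numbers are hypothetical and not taken from the repo.

```python
from typing import List, Tuple

BACKGROUND = 0  # assumption: class index 0 is the background class


def confident_anchors(
    class_scores: List[List[float]],
    box_refinements: List[List[float]],
    threshold: float = 0.5,
) -> List[Tuple[int, int, float, List[float]]]:
    """Return (anchor index, class index, score, refinement) for confident anchors.

    class_scores has one row per anchor with one score per class (background included);
    box_refinements has one row of four values per anchor.
    """
    kept = []
    for idx, (scores, refinement) in enumerate(zip(class_scores, box_refinements)):
        # Pick the highest-scoring class for this anchor.
        best_class = max(range(len(scores)), key=scores.__getitem__)
        # Keep the anchor only if it is a confident non-background prediction.
        if best_class != BACKGROUND and scores[best_class] >= threshold:
            kept.append((idx, best_class, scores[best_class], refinement))
    return kept


# Toy stand-in for one image with 3 anchors and 2 classes + background.
scores = [[0.9, 0.05, 0.05], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]
refinements = [[0.0, 0.0, 0.0, 0.0], [0.1, -0.2, 0.05, 0.0], [0.0, 0.1, 0.0, 0.2]]
detections = confident_anchors(scores, refinements)
```

A real pipeline would go on to apply the kept refinements to their anchors and run non-maximum suppression; this sketch only covers the filtering step.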

For inference please take a look at the following colab notebook:

or this function in the repo:


Great !! Thank you so much

Thanks Christian
I wanted to ask a few more things related to metrics and bbox coordinates.

  1. I am using the RSNA Kaggle competition dataset, where the bboxes are provided in the order x, y, width, height.
    Do I have to swap x and y, and width and height, here? That is why I am doing the following:
    train_df['bbox']= train_df.loc[:,['y','x','height','width']].values.tolist() # x ,y,width ,height
    train_df['bbox']= train_df['bbox'].apply(lambda x:[[x[0],x[1],x[2]-x[0]+1,x[3]-x[1]+1]])
  2. I am not sure whether the bboxes are drawn correctly when I look at data.show_batch, so I wanted to check whether I am passing the coordinates to the API correctly.
  3. In the above problem we are doing binary classification (0 and 1), so will this metric work for it?
    What does this metric try to evaluate?
    PascalVOCMetric(anchors, size, [i for i in data.train_ds.y.classes[1:]])

Thanks in advance for your time
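
For reference, a small sketch of one way to do the conversion asked about in the first question. It assumes fastai v1's documented [top, left, bottom, right] convention for bounding boxes, and that x and y in the CSV are the top-left corner of the box; both are worth verifying by plotting a known image. The function name and the sample numbers are hypothetical.

```python
from typing import List


def xywh_to_tlbr(x: float, y: float, width: float, height: float) -> List[float]:
    """Convert an (x, y, width, height) box to [top, left, bottom, right]."""
    # Bottom and right are obtained by adding the height and width
    # to the top-left corner.
    return [y, x, y + height, x + width]


# Hypothetical box: top-left corner at x=100, y=50, 30 px wide and 20 px high.
box = xywh_to_tlbr(100, 50, 30, 20)
```

Under these assumptions the far corner comes from adding the width and height to the top-left corner rather than subtracting.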


Sorry, I have trouble understanding your questions.

1-2) Are the boxes drawn correctly in show_batch?

3) Yes, that would work to evaluate the mAP on the validation dataset during training for the one class you have.

With kind regards,

Looking at the picture, I had doubts. For that reason I wanted to take one image, draw the bbox on it myself using a rectangle patch, and compare that to show_batch.
The problem is that show_batch picks images at random every time.
Any alternative suggestion for cross-checking?
I tried the approach below, but I get an error message saying that show has no parameter y.
Before calling ImageBBox.create, I read the image using pydicom (since the image format is .dcm and it stores a numpy array in a field) and converted the numpy array to a PIL Image object before passing it to the method.

img = open_image(path/'train'/train_images[1])
bbox = ImageBBox.create(*img.size, train_lbl_bbox[1][0], [0, 1], classes=['person', 'horse'])
img.show(figsize=(6,4), y=bbox)

Thanks for your notebook @Bronzi88! I tried to recreate your success with the SVHN dataset, but I am stuck when it comes to the training part. As soon as I run learn.recorder.plot() or learn.fit_one_cycle(), it throws the following error, which I don’t know how to handle:

RuntimeError: The size of tensor a (3) must match the size of tensor b (4) at non-singleton dimension 3

Are my images the wrong size? I resized them to be 48x48. Are the anchors set wrong?

Hi (Moin),

The minimum supported image size is 256x256, at least for now.
But that would lead to a different error message :frowning:

Can you post the notebook or some code? Then I could try to replicate the error.

With kind regards,

Hi (Moin),

the following code should work.

img = open_image(coco/ 'train_sample' / images[0])
image_boxes = img2bbox[images[0]][0] #[[x1,y1,x2,y2], [x1,y1,x2,y2]] 
bbox = ImageBBox.create(*img.size, bboxes=image_boxes, labels=[0,0,0], classes=['TV'])
img.show(figsize=(6,4), y=bbox)

With kind regards,

I’m trying to adapt the RetinaNet notebook to a dataset that contains empty images with no objects. Does anyone know how I might go about adapting the current notebook to accommodate this?

Right now the notebook as is works fine when I use the subset of the data that contains images, so everything is good there. When I try to add in images without labels, I get errors. Specifically the create method of the ImageBBox class tries to index into an empty list.

Does anyone know a good format for representing empty images in bounding box format? I’ve considered adding some dummy coordinates like [0, 0, 0, 0] but I’m concerned that will have weird effects on training.

Hi (Moin),

If you use [0, 0, 0, 0] with the background class as the label, that should work just fine.

With kind regards,

Gave this a try. It causes some sort of numerical issue. When I create a dataloader, I get a ton of UserWarning: Tensor is int32: upgrading to int64; for better performance use int64 input

When I try to call data.show_batch(rows=3), the kernel dies. The issue repeats.

Maybe extending the bounding box for the background to the entire picture area helps to avoid this problem?


in case you don’t have any labels

def get_y_func(o):
    if str(o.id) in img2bbox: #Labels?
        return img2bbox[str(o.id)]
    else: #No labels?
        return [[[0,0,0,0]], ['background']]

works just fine for me.

This does work for dealing with empty images. You do end up with an extra class ('background' is listed twice in data.classes and data.c is 3), but monkey patching these seems to work.

The dead kernel thing is still an issue, which is weird. It seems like if the dataloader gets a batch of empty images, something goes wrong. I’m doing this on Windows 10 so this could be an OS specific issue. My first thought was maybe it’s a multiprocessing thing, so I set num_workers=1 but that didn’t solve it. I’m not really sure how to troubleshoot because things go from fine to dead kernel without any kind of error.

Might spin up an Ubuntu server and see if the issue persists there.


To deactivate multiprocessing, you have to set num_workers to zero, at least if I remember correctly.

'background' is listed twice. Okay, that is not what I intended. But the fastai method generate_classes adds ['background'] anyway and does not check whether it is already in the list of classes. Sorry about that.

But you can easily fix that by overriding the method generate_classes of the class ObjectCategoryProcessor.
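
The deduplication such an override would need can be sketched in plain Python. This is not the fastai implementation, only the idea of collecting the labels and adding the background entry exactly once; the function name and the sample labels are hypothetical.

```python
from typing import Iterable, List


def classes_with_single_background(
    labels_per_image: Iterable[Iterable[str]], background: str = "background"
) -> List[str]:
    """Collect unique labels and put exactly one background entry first."""
    unique = {label for labels in labels_per_image for label in labels}
    unique.discard(background)  # avoid adding the background class twice
    return [background] + sorted(unique)


# Hypothetical labels: two images with objects and one empty image.
labels = [["pneumonia"], ["background"], ["pneumonia", "pneumonia"]]
classes = classes_with_single_background(labels)
```

With labels like these, the class list has two entries instead of three, so the class count matches the single foreground class plus background.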

With kind regards,
