Object Detection - Open Images V5

baz · September 12, 2019, 6:01pm

I’m trying to create an object detection model based on Google’s Open Images V5 dataset.

I’m using the validation set.

Here is a link to the notebook that will download and process the data for you.

The bounding boxes, however, don’t seem to be in the correct places:

[image: download]

Does anyone know what I might be doing wrong here?

sgugger · September 12, 2019, 8:47pm

Why are you doing the scaling in this cell?

labels['XMin'] *= w
labels['XMax'] *= w
labels['YMin'] *= h
labels['YMax'] *= h

I think that’s where your problem comes from. fastai will automatically rescale the bounding boxes for you.

baz · September 12, 2019, 8:48pm

The XMin, XMax, YMin, YMax values are between 0 and 1, and in the examples I saw they needed to be in line with the actual image width and height.

sgugger · September 12, 2019, 8:51pm

Ah yes, if it’s 0 to 1 you need this. Have you tried doing it the other way round? I think y is first.

baz · September 12, 2019, 8:53pm

Doing what the other way round? The operations on the DF? Or swapping the order of the values in the labelling function?

bounds = boxes[['YMin', 'XMin', 'YMax', 'XMax']].values.tolist()
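For context, here is a minimal sketch of what a labelling helper built around that line could look like. It assumes the boxes are already in pixel coordinates and that the DataFrame uses the Open Images column names (ImageID, LabelName, XMin, XMax, YMin, YMax). The helper name `build_annotations` and the sample values are hypothetical, and the [top, left, bottom, right] order simply follows the "y is first" suggestion above.

```python
import pandas as pd

# Hypothetical, already pixel-scaled annotations, for illustration only.
boxes_df = pd.DataFrame({
    "ImageID": ["a", "a", "b"],
    "LabelName": ["fish", "ball", "fish"],
    "XMin": [10.0, 200.0, 5.0],
    "XMax": [110.0, 260.0, 50.0],
    "YMin": [20.0, 300.0, 15.0],
    "YMax": [220.0, 360.0, 90.0],
})

def build_annotations(df):
    """Map each ImageID to [boxes, labels], with boxes as [top, left, bottom, right]."""
    annotations = {}
    for image_id, group in df.groupby("ImageID"):
        # y comes first: YMin (top), XMin (left), YMax (bottom), XMax (right)
        bounds = group[["YMin", "XMin", "YMax", "XMax"]].values.tolist()
        labels = group["LabelName"].tolist()
        annotations[image_id] = [bounds, labels]
    return annotations

annotations = build_annotations(boxes_df)
```

A labelling function can then look up `annotations[image_id]` for each image.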

baz · September 12, 2019, 8:54pm

Just a note that the code works with the tiny COCO dataset you show in the documentation, but these images are 447x1024. Could that have an effect?

sgugger · September 12, 2019, 9:01pm

No, the code handles rectangular images. In your case we can see the xs are correct with the fish and the balls, but the ys are improperly scaled, so that’s where there is a problem.
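One way to check whether scaled boxes are plausible is to compare them against each image's actual size. This is only a sketch: the `sizes` table (ImageID, width, height) and the function name `find_out_of_bounds` are assumptions, not something from the notebook.

```python
import pandas as pd

def find_out_of_bounds(labels, sizes):
    """Return the rows whose pixel-space box falls outside its own image."""
    merged = labels.merge(sizes, on="ImageID", how="left")
    bad = (
        (merged["XMin"] < 0)
        | (merged["YMin"] < 0)
        | (merged["XMax"] > merged["width"])
        | (merged["YMax"] > merged["height"])
    )
    return merged[bad]

# Hypothetical data: both boxes were scaled with one global w=1024, h=683,
# but image "b" is only 447 pixels high.
labels = pd.DataFrame({
    "ImageID": ["a", "b"],
    "XMin": [0.25 * 1024, 0.25 * 1024],
    "XMax": [0.50 * 1024, 0.50 * 1024],
    "YMin": [0.50 * 683, 0.50 * 683],
    "YMax": [0.90 * 683, 0.90 * 683],
})
sizes = pd.DataFrame({
    "ImageID": ["a", "b"],
    "width": [1024, 1024],
    "height": [683, 447],
})

suspicious = find_out_of_bounds(labels, sizes)
```

Any rows that come back point at boxes scaled with the wrong width or height.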

baz · September 13, 2019, 9:31am

So from the documentation of the dataset

XMin, XMax, YMin, YMax: coordinates of the box, in normalized image coordinates. XMin is in [0,1], where 0 is the leftmost pixel, and 1 is the rightmost pixel in the image. Y coordinates go from the top pixel (0) to the bottom pixel (1).

So it seems they already provide the values we need for the top, left, bottom, right coordinate system.

baz · November 8, 2019, 10:43pm

The images aren’t all the same size, so I had to go through and scale the values for each box by the width and height of its own image.
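Roughly, the per-image scaling can be sketched like this. It assumes a hypothetical `sizes` table with one row per image (ImageID, width, height), for example collected by opening each file once; the function name `scale_boxes` and the sample values are made up for illustration.

```python
import pandas as pd

def scale_boxes(labels, sizes):
    """Convert normalized [0, 1] box coordinates to pixels, per image."""
    # Attach each image's own width and height to every one of its boxes.
    merged = labels.merge(sizes, on="ImageID", how="left", validate="many_to_one")
    for col in ("XMin", "XMax"):
        merged[col] = merged[col] * merged["width"]
    for col in ("YMin", "YMax"):
        merged[col] = merged[col] * merged["height"]
    return merged

# Hypothetical normalized labels for two images of different sizes.
labels = pd.DataFrame({
    "ImageID": ["a", "b"],
    "XMin": [0.25, 0.50],
    "XMax": [0.50, 0.75],
    "YMin": [0.00, 0.25],
    "YMax": [0.50, 1.00],
})
sizes = pd.DataFrame({
    "ImageID": ["a", "b"],
    "width": [1024, 640],
    "height": [447, 480],
})

scaled = scale_boxes(labels, sizes)
```

This replaces the single global `w` and `h` multiplication with a per-row one, so every box is scaled by the dimensions of the image it belongs to.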