Transform API for bounding boxes


(Arka Sadhu) #1

I see there are a bunch of useful transforms. However, I can’t find transforms for bounding box coordinates. I believe in the older version there was TfmType. Is there an equivalent of that?


#2

Transforms are automatically applied to bounding boxes, see the pascal notebook. It’s not done on the points though (I will work on this a bit later) but applied to a rectangle mask, so it’s not the most efficient.
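
To illustrate the rectangle-mask idea, here is a minimal pure-Python sketch. It is not the fastai implementation: the function names, the (top, left, bottom, right) pixel order, and the horizontal flip used as the example transform are all assumptions made for illustration. The box is drawn as a mask, the mask goes through the same pixel transform as the image, and the new box is read off the surviving pixels.

```python
def bbox_to_mask(bbox, height, width):
    """Rasterize a (top, left, bottom, right) pixel box into a 0/1 mask."""
    top, left, bottom, right = bbox
    return [[1 if top <= r < bottom and left <= c < right else 0
             for c in range(width)]
            for r in range(height)]


def flip_lr(mask):
    """Example pixel-level transform (hypothetical): horizontal flip."""
    return [row[::-1] for row in mask]


def mask_to_bbox(mask):
    """Tightest (top, left, bottom, right) box around the nonzero pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None  # the box was transformed out of the image
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return (rows[0], cols[0], rows[-1] + 1, cols[-1] + 1)


def transform_bbox_via_mask(bbox, height, width, tfm):
    """Apply a pixel transform to a box by transforming its mask."""
    return mask_to_bbox(tfm(bbox_to_mask(bbox, height, width)))


new_box = transform_bbox_via_mask((1, 2, 4, 5), height=6, width=8, tfm=flip_lr)
```

This touches every pixel of the mask to move four numbers, which is why transforming the corner points directly would be cheaper.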


(Arka Sadhu) #3

Can you give a link to the pascal notebook? I can only find the notebook from the previous version.


(Kevin Bird) #4

I’m pretty sure it is this one. It is in the fastai_docs repo.


(Arka Sadhu) #5

I have got the hang of the transform API. Thank you for pointing me in the right direction @sgugger @KevinB. However, I get an error when setting label=None because there is an x.clone() here: https://github.com/fastai/fastai/blob/efb9cd6387cd11f0ddc95f83bf548c3db3c69561/fastai/vision/image.py#L428
and it raises the error “Nonetype object has no method defined clone”.

I am guessing https://github.com/fastai/fastai/blob/efb9cd6387cd11f0ddc95f83bf548c3db3c69561/fastai/vision/image.py#L225 should be

if self.labels is not None:
    bbox.labels = self.labels.clone()

#6

Thanks for pointing that out and giving the fix. I just pushed a correction.
Note that bboxes are still moving a bit and that I’ll probably break them once or twice before the end of next week :wink:


(Arka Sadhu) #7

The current version seems fine to me though. I will update my code when the time comes as well.


#8

It’s doing the data augmentation in a very dumb way. I’m implementing data augmentation for points, and will then use it for ImageBBox when it’s ready.


(Arka Sadhu) #9

Ah, I misunderstood that as a change to the API for bounding boxes, which is why I said the current API seems good to me. Yep, data augmentation can definitely be done on points.


(Arka Sadhu) #10

I wanted to know the motivation behind the data attribute returning normalized bounding boxes here: https://github.com/fastai/fastai/blob/master/fastai/vision/image.py#L254

And is there any easy way to get the raw bounding box?

Edit: Now that I think about it, it makes a lot of sense to have it normalized. Ignore my question.


(Arka Sadhu) #11

Sorry for the repeated comments. I am wondering why there is no resize transform. There is a crop function, but with bounding boxes it becomes a problem if the box falls outside the cropped region. Is resize defined somewhere that I can’t find?


#12

Resize is done when you give a target size.
As for your previous question, the bbox coordinates are scaled from -1 to 1 because that’s what the coord transforms expect internally (because of PyTorch’s grid_sample function).
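
For anyone who wants the raw pixel box back, the scaling can be inverted by hand. The sketch below assumes a (top, left, bottom, right) order and a convention where pixel 0 maps to -1 and the full image size maps to 1. The helper names are made up for illustration and the exact convention in the library may differ, so check it against your version.

```python
def to_unit_range(bbox, height, width):
    """Scale a pixel (top, left, bottom, right) box to the [-1, 1] range."""
    top, left, bottom, right = bbox
    return (2 * top / height - 1, 2 * left / width - 1,
            2 * bottom / height - 1, 2 * right / width - 1)


def to_pixels(nbbox, height, width):
    """Invert to_unit_range: map a [-1, 1] box back to pixel coordinates."""
    top, left, bottom, right = nbbox
    return ((top + 1) * height / 2, (left + 1) * width / 2,
            (bottom + 1) * height / 2, (right + 1) * width / 2)


normalized = to_unit_range((25, 50, 75, 150), height=100, width=200)
restored = to_pixels(normalized, height=100, width=200)
```

Because the normalized box does not depend on the image size, it stays valid when the image is resized to a target size. Only the conversion back to pixels needs the new height and width.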