Transform api for bounding boxes

I see there are a bunch of useful transforms. However, I can’t find transforms for bounding box coordinates. I believe in the older version there was TfmType. Is there an equivalent of that?


Transforms are automatically applied to bounding boxes; see the pascal notebook. It's not done on the points themselves yet (I will work on this a bit later) but applied to a rectangle mask, so it's not the most efficient.
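For illustration, here is a rough sketch of the mask-based approach described above (hypothetical helper names, not the actual fastai code): rasterize the box into a binary mask, apply the same spatial transform to the mask as to the image, then read the new box off the mask's nonzero extent.

```python
import numpy as np

def bbox_to_mask(bbox, shape):
    """Rasterize (top, left, bottom, right) into a binary mask of `shape`."""
    top, left, bottom, right = bbox
    mask = np.zeros(shape, dtype=np.uint8)
    mask[top:bottom, left:right] = 1
    return mask

def mask_to_bbox(mask):
    """Recover the tightest bbox around the mask's nonzero pixels."""
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    top, bottom = np.where(rows)[0][[0, -1]]
    left, right = np.where(cols)[0][[0, -1]]
    return (int(top), int(left), int(bottom) + 1, int(right) + 1)

# Example: flipping the mask horizontally flips the bbox too.
mask = bbox_to_mask((2, 1, 5, 4), (8, 8))
flipped = mask[:, ::-1]
print(mask_to_bbox(flipped))  # (2, 4, 5, 7)
```

This is why it works for arbitrary transforms but is inefficient: every box costs a full image-sized mask plus a scan to recover the extents.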

Can you give a link to the pascal notebook? I can only find the notebook from the previous version.

I’m pretty sure it is this one. It is in the fastai_docs repo.


I have gotten the hang of the transform API. Thank you for pointing me in the right direction @sgugger @KevinB. However, I get an error when setting label=None because there is an x.clone() here: https://github.com/fastai/fastai/blob/efb9cd6387cd11f0ddc95f83bf548c3db3c69561/fastai/vision/image.py#L428
and it raises the error "'NoneType' object has no attribute 'clone'".

I am guessing https://github.com/fastai/fastai/blob/efb9cd6387cd11f0ddc95f83bf548c3db3c69561/fastai/vision/image.py#L225 should be

if self.labels is not None:
    bbox.labels = self.labels.clone()
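The failure mode and the guard can be reproduced standalone (toy stand-in classes, not the actual fastai code; an explicit `is not None` check also avoids the ambiguous truth value that a multi-element tensor would raise):

```python
class FakeTensor:
    """Minimal stand-in for a tensor that supports .clone()."""
    def __init__(self, data):
        self.data = data

    def clone(self):
        return FakeTensor(list(self.data))

class BBoxLike:
    """Toy ImageBBox-like object whose labels may legitimately be None."""
    def __init__(self, labels=None):
        self.labels = labels

    def clone(self):
        out = BBoxLike()
        # Guard before cloning: None.clone() would raise
        # AttributeError: 'NoneType' object has no attribute 'clone'
        if self.labels is not None:
            out.labels = self.labels.clone()
        return out

print(BBoxLike().clone().labels)                  # None (no crash)
print(BBoxLike(FakeTensor([1])).clone().labels.data)  # [1]
```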

Thanks for pointing this out and providing the fix. I just pushed a correction.
Note that bboxes are still moving a bit and that I’ll probably break them once or twice before the end of next week :wink:

The current version seems fine to me, though. I will update my code when the time comes as well.

It’s doing the data augmentation in a very dumb way. I’m implementing data augmentation for points, and will then use it for ImageBBox when it’s ready.
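One way point-based bbox augmentation could work (a sketch under my own assumptions, not the implementation being described): apply the affine transform to the box's four corners as points, then take the axis-aligned extent of the transformed corners.

```python
import numpy as np

def transform_bbox_by_points(bbox, affine):
    """Apply a 2x3 affine matrix to a bbox's four corners, then take
    the axis-aligned extent of the transformed points."""
    top, left, bottom, right = bbox
    corners = np.array([[left, top], [right, top],
                        [left, bottom], [right, bottom]], dtype=float)
    ones = np.ones((4, 1))
    moved = np.hstack([corners, ones]) @ affine.T  # shape (4, 2)
    xs, ys = moved[:, 0], moved[:, 1]
    return (ys.min(), xs.min(), ys.max(), xs.max())

# Example: translate by (+10 in x, +5 in y).
shift = np.array([[1, 0, 10], [0, 1, 5]], dtype=float)
print(transform_bbox_by_points((2, 1, 5, 4), shift))  # (7.0, 11.0, 10.0, 14.0)
```

Transforming four points is far cheaper than rasterizing and scanning a full mask, which is presumably why the mask approach is called "dumb" above.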

Ah, I misunderstood that as a plan to change the API for bounding boxes, which is why I said the current API seems good to me. Data augmentation can definitely be done on points.

I wanted to know the motivation behind the normalized bounding boxes returned by the data attribute here: https://github.com/fastai/fastai/blob/master/fastai/vision/image.py#L254

And is there any easy way to get the raw bounding box?

Edit: Now that I think about it, having it normalized makes a lot of sense. Ignore my question.

Sorry for the repeated comments. I am wondering why there is no resize transformation. There is a crop function, but when using bounding boxes, it becomes a problem if the bounding box is not inside the cropped part. Is resize defined somewhere and I just can't find it?

Resize is done when you give a target size.
As for your previous question, the bbox coordinates are scaled from -1 to 1 because that’s what the coord transforms expect internally (because of pytorch grid_sample function).
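The basic idea of that scaling can be sketched as follows (a simplified linear mapping; the exact convention fastai and grid_sample use may differ slightly, e.g. around pixel centers and `align_corners`):

```python
def to_unit_scale(coord, size):
    """Map a pixel coordinate in [0, size] to grid_sample's [-1, 1] range."""
    return coord / size * 2 - 1

def from_unit_scale(coord, size):
    """Inverse: map a [-1, 1] coordinate back to pixels."""
    return (coord + 1) / 2 * size

# On a 100-pixel axis: 0 -> -1.0, 50 -> 0.0, 100 -> 1.0
print([to_unit_scale(c, 100) for c in (0, 50, 100)])  # [-1.0, 0.0, 1.0]
print(from_unit_scale(-0.5, 100))                     # 25.0
```

Keeping coordinates in [-1, 1] means the same affine matrices can warp both the image grid and the bbox coordinates without knowing the image size.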
