Bounding boxes with data augmentation

Does anyone know if there is any way to easily use data augmentation with bounding boxes at the same time? Lesson 7 has an example of training the network with both the classes and bounding box coordinates. As you may remember, the bounding boxes had to be resized as the images were resized, otherwise they would show up in the wrong places.
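For reference, the resizing case can be sketched like this. It is a minimal illustration, not the lesson 7 code: the helper name `resize_bbox` and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for the example.

```python
def resize_bbox(bbox, old_size, new_size):
    """Scale a box given as (x_min, y_min, x_max, y_max).

    old_size and new_size are (width, height) tuples. The layout and
    the function name are assumptions for this sketch.
    """
    x_min, y_min, x_max, y_max = bbox
    scale_x = new_size[0] / old_size[0]  # horizontal scale factor
    scale_y = new_size[1] / old_size[1]  # vertical scale factor
    return (x_min * scale_x, y_min * scale_y,
            x_max * scale_x, y_max * scale_y)


# A 200x400 image shrunk to 100x100: x halves, y is divided by four.
print(resize_bbox((10, 20, 110, 220), (200, 400), (100, 100)))
# (5.0, 5.0, 55.0, 55.0)
```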

I am not sure if there is any way to easily do this for augmentation, unless you incorporate it into your own augmentation code.

Thanks, Christina

No easy way AFAIK. But I had to deal with a similar problem in lesson 14, for image segmentation, so in that lesson I show a framework you can use for any situation where your labels need to be modified with your data augmentation. Maybe you can try using that…

Thanks Jeremy… I will eventually get to lesson 14. :)

I also found that someone else has implemented an augmentation algorithm that will do this. Check this out, about three-quarters of the way down the page, in the section on landmarks:

Also, if you look at the pictures on that page, you will see that the green dots (landmarks) are moved to their correct spots by the augmentation algorithm. This looks to be extremely useful!

Good find!

If you change the Keras image data generator to return the affine transform matrix as well as the transformed image, you could then apply that to other points too (just matrix multiplication of the affine matrix with [x,y,1]).

If you did this with all four corners of your bounding box, you could take the max and min x and y values of the transformed corners to get the new axis-aligned box. (Opposite corners are enough for pure scaling and translation, but not for rotation or shear, where the other two corners can end up being the extremes.)
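Here is a minimal sketch of that idea in NumPy. It assumes you already have a 3x3 affine matrix that maps input (x, y) coordinates to output coordinates; if the matrix you pull out of the generator maps output back to input (as image interpolation routines often expect), you would need to invert it first. The function names are made up for the example and are not part of Keras.

```python
import numpy as np


def transform_points(matrix, points):
    """Apply a 3x3 affine matrix to an (N, 2) array of (x, y) points."""
    points = np.asarray(points, dtype=float)
    # Append a column of ones so each row is [x, y, 1].
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ np.asarray(matrix, dtype=float).T)[:, :2]


def transform_bbox(matrix, bbox):
    """Transform all four corners, then take the enclosing axis-aligned box."""
    x_min, y_min, x_max, y_max = bbox
    corners = [(x_min, y_min), (x_max, y_min), (x_max, y_max), (x_min, y_max)]
    moved = transform_points(matrix, corners)
    return (float(moved[:, 0].min()), float(moved[:, 1].min()),
            float(moved[:, 0].max()), float(moved[:, 1].max()))


# 90 degree rotation about the origin: (x, y) -> (-y, x).
rotate_90 = np.array([[0, -1, 0],
                      [1,  0, 0],
                      [0,  0, 1]])
print(transform_bbox(rotate_90, (1, 2, 3, 5)))
# (-5.0, 1.0, -2.0, 3.0)
```

Note that for rotations the resulting box is generally looser than the original, since it encloses the rotated rectangle.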

Only caveat is that IIRC Keras applies the transform about the image center, so you might have to subtract the midpoint from each point, apply the transform, and then add the midpoint back.
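One way to handle the centering is to fold it into the matrix itself: shift so the image midpoint sits at the origin, apply the transform, and shift back. The sketch below assumes the midpoint is simply `(width / 2, height / 2)`; the exact offset Keras uses may differ by half a pixel, so check it against your version. `center_transform` is a hypothetical helper name.

```python
import numpy as np


def center_transform(matrix, width, height):
    """Return a matrix that applies `matrix` about the image midpoint.

    Assumes the midpoint is (width / 2, height / 2); this is an
    assumption of the sketch, not a confirmed Keras detail.
    """
    cx, cy = width / 2.0, height / 2.0
    to_origin = np.array([[1, 0, -cx],
                          [0, 1, -cy],
                          [0, 0, 1]], dtype=float)
    back = np.array([[1, 0, cx],
                     [0, 1, cy],
                     [0, 0, 1]], dtype=float)
    # Rightmost matrix acts first: shift to origin, transform, shift back.
    return back @ np.asarray(matrix, dtype=float) @ to_origin


# 90 degree rotation about the center of a 100x100 image.
rotate_90 = np.array([[0, -1, 0],
                      [1,  0, 0],
                      [0,  0, 1]])
centered = center_transform(rotate_90, 100, 100)
print(centered @ np.array([60, 50, 1]))  # the point (60, 50) moves to (50, 60)
```

The combined matrix can then be applied to bounding box corners exactly like any other affine matrix.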

Here is a recent paper that shows Object Aware Random Erasing.
The idea is that random erasure of a section within an image is an effective augmentation technique, and can be applied within bounding boxes. I haven’t tried it myself.
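A rough sketch of the basic idea, in case it helps. This is not the paper's algorithm, just a simplified illustration: it picks one random rectangle inside a bounding box and fills it with uniform noise. The function name, the `max_fraction` parameter, the integer `(x_min, y_min, x_max, y_max)` box layout (with exclusive max), and the `uint8` image assumption are all choices made for the example.

```python
import numpy as np


def erase_within_bbox(image, bbox, rng, max_fraction=0.5):
    """Return a copy of a uint8 image with a random patch inside bbox erased.

    bbox is (x_min, y_min, x_max, y_max) in integer pixels, max exclusive.
    The patch is at most max_fraction of the box along each side.
    """
    x_min, y_min, x_max, y_max = bbox
    box_w, box_h = x_max - x_min, y_max - y_min
    # Patch size: at least 1 pixel, at most max_fraction of the box side.
    w = int(rng.integers(1, max(2, int(box_w * max_fraction) + 1)))
    h = int(rng.integers(1, max(2, int(box_h * max_fraction) + 1)))
    # Top-left corner chosen so the patch stays fully inside the box.
    x0 = int(rng.integers(x_min, x_max - w + 1))
    y0 = int(rng.integers(y_min, y_max - h + 1))
    out = image.copy()
    noise_shape = (h, w) + image.shape[2:]
    out[y0:y0 + h, x0:x0 + w] = rng.integers(0, 256, size=noise_shape,
                                             dtype=np.uint8)
    return out


rng = np.random.default_rng(0)
image = np.zeros((64, 64, 3), dtype=np.uint8)
augmented = erase_within_bbox(image, (10, 20, 40, 50), rng)
```

Since the erased patch stays inside the box, the bounding box labels themselves do not need to change.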
Some code is available at