What's the proper way to handle bounding box transformations?

nicochidt · July 29, 2019, 2:57pm

I’m trying to replicate the bounding box regressor from 2018’s lesson 8 but using fastai v1.

I’m not able to get the bounding boxes transformed.
I’m using a file with the format:

image_path, x0 y0 x1 y1
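
To make the format concrete, here is a minimal sketch of how one such line could be parsed. The helper name `parse_line` and the sample path are hypothetical and only serve as illustration; the coordinates are assumed to be pixel values in the original image.

```python
def parse_line(line):
    """Split 'image_path, x0 y0 x1 y1' into a path and a list of four floats."""
    # Split only on the first comma so the coordinates stay together.
    path, coords = line.split(",", 1)
    x0, y0, x1, y1 = (float(v) for v in coords.split())
    return path.strip(), [x0, y0, x1, y1]


# Hypothetical sample line in the format described above.
path, bb = parse_line("img/001.jpg, 35 72 153 172")
print(path, bb)
```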

What I’m trying:

import pandas as pd
from fastai.vision import *

# imgs is a list of (image_path, [x0, y0, x1, y1]) pairs
df = pd.DataFrame({'fn': [i[0] for i in imgs],
                   'bb': [' '.join(str(p) for p in i[1]) for i in imgs]},
                  columns=['fn', 'bb'])
df.to_csv('dataset.csv', index=False)

data = ImageList.from_csv(path='.', 
                         csv_name='dataset.csv', 
                         folder='.',
                         )
data = data.split_by_rand_pct()
data = data.split_none()
data = data.label_from_df(cols=['bb'], label_cls=FloatList)
data = data.transform(None, size=224, resize_method= ResizeMethod.SQUISH, tfm_y=True)

I’m using None as the first argument of transform since I only want to resize the images without cropping them.

If I show data I get the following:

ImageDataBunch;

Train: LabelList (1727 items)
x: ImageList
Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224),Image (3, 224, 224)
y: FloatList
[ 35.  72. 153. 172.],[344. 238. 416. 312.],[  0.  66. 121. 182.],[ 25.  59. 161. 197.],[ 16.  23. 483.  91.]
Path: .;

These are exactly the same bounding boxes as in the dataset, so none of them has been rescaled. Some even fall outside the 224x224 images (344 > 224, 238 > 224).

What I’ve tried so far:

1. Different values for tfm_y (True/TfmCoord/TfmPixel)
2. Getting the bounding box coordinates from a function
3. Getting the bounding box coordinates from different columns of the dataframe
4. Combinations of 1, 2 and 3, with and without label_cls=FloatList

What’s the right way to get the bounding boxes transformed properly?