Preparation of a dataset for image regression

I’m very new to fastai, and this is my first personal project using it.
My goal is to locate a particular symbol on a sheet of paper.
So I want to do image regression, feeding the model images paired with the coordinates of a point, or maybe better, a bounding box (something like lesson 6 and the BIWI dataset).

Right now I’m collecting images and creating a little tool to help me fill in the coordinates of the symbol for each image.

My concern is that my images are not all the same size or in the same orientation.
As I understand it, images should all be the same size, so they are resized and squared during training. But I don’t understand how the coordinates behave when that happens.
In my case cropping the image is not possible, but adding padding is.
If my original image is 800x600 and the symbol is at position (700, 500), then once the image is resized to 256x256, for example, the original coordinates will fall outside the image, right? I have the same concern if padding is added to the image.
Is this handled automagically by the library?
Or should I do something specific to better prepare my data?
Thank you for your insights.

Hi there!

I don’t believe coordinate transformations will be automatically handled by the library if you pad your data.

This sounds like a data handling issue to me. Here are a few ideas:

  • You could calculate the required <x,y> coordinates yourself based on the prediction result, the input image size and the padded image size used for training
  • You could also decide to always work with the padded image (is there any reason for you to revert to the original?)
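
To make the first idea concrete, here is a minimal sketch of the arithmetic in plain Python. It assumes the image is scaled to fit a square while keeping its aspect ratio and that the padding is centered; the function names (`letterbox_params`, `to_padded`, `to_original`) are made up for illustration and are not part of fastai. If your padding is applied differently, the offsets would need to change accordingly.

```python
def letterbox_params(orig_w, orig_h, target):
    """Return (scale, pad_x, pad_y) for fitting an image into a
    target x target square, keeping aspect ratio, with centered padding
    (an assumption; adapt if your padding is placed differently)."""
    scale = min(target / orig_w, target / orig_h)
    new_w = round(orig_w * scale)
    new_h = round(orig_h * scale)
    pad_x = (target - new_w) // 2  # padding added on the left
    pad_y = (target - new_h) // 2  # padding added on the top
    return scale, pad_x, pad_y


def to_padded(x, y, scale, pad_x, pad_y):
    """Map a point from original-image pixels to padded-image pixels."""
    return x * scale + pad_x, y * scale + pad_y


def to_original(x, y, scale, pad_x, pad_y):
    """Map a point (e.g. a prediction) from padded-image pixels
    back to original-image pixels."""
    return (x - pad_x) / scale, (y - pad_y) / scale


# The example from the question: 800x600 image, symbol at (700, 500),
# training size 256x256.
scale, pad_x, pad_y = letterbox_params(800, 600, 256)
x_pad, y_pad = to_padded(700, 500, scale, pad_x, pad_y)
print(x_pad, y_pad)  # roughly (224, 192), which is inside the 256x256 image
print(to_original(x_pad, y_pad, scale, pad_x, pad_y))  # roughly (700, 500) again
```

So the point stays inside the image as long as the label is transformed with the same scale and offset as the pixels, and the same three numbers let you convert a prediction back to the original sheet.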

Hey!
Thank you for your answer.
I have continued my research since my last message and came across a nice library called IceVision that seems to handle the problems related to object detection, including my concerns about resizing and padding.

I haven’t tried it yet but will soon.
Thanks again for your help.