Scaling Facial Landmarks to image size?

Hi everyone!

I’m trying to do facial landmark prediction using the Helen dataset. I’m having some trouble with the data pipeline, so I thought I’d ask for help here!
In the dataset, the images have different sizes and the landmark positions are given in pixel coordinates. Resizing the images to a common shape for the network therefore means the landmarks have to be rescaled as well.
Also, for stability, I think it is wiser to keep all targets between 0 and 1 and scale them back to pixel coordinates afterwards.
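
To make the intent concrete, here is a minimal NumPy sketch of the rescaling I have in mind. The helper names (normalize_landmarks, denormalize_landmarks) are made up for illustration, and I'm assuming points are stored as (x, y) and image sizes as (width, height); neither is guaranteed to match the actual annotation files.

```python
import numpy as np

def normalize_landmarks(points_xy, image_size_wh):
    """Map pixel coordinates (x, y) into the [0, 1] range.

    Assumes points_xy has shape (N, 2) and image_size_wh is (width, height).
    """
    points_xy = np.asarray(points_xy, dtype=np.float64)
    size = np.asarray(image_size_wh, dtype=np.float64)
    return points_xy / size  # broadcasts (2,) over (N, 2)

def denormalize_landmarks(points_norm, image_size_wh):
    """Map [0, 1] coordinates back to pixels for an image of the given size."""
    points_norm = np.asarray(points_norm, dtype=np.float64)
    size = np.asarray(image_size_wh, dtype=np.float64)
    return points_norm * size

# A landmark at (320, 120) in a 640x480 image...
norm = normalize_landmarks([[320.0, 120.0]], (640, 480))
# ...lands at the same relative position after resizing to 224x224.
resized = denormalize_landmarks(norm, (224, 224))
```

Because the normalized coordinates are relative to the image, they stay valid for any resize that stretches the whole image (no cropping or padding). That is only the idea, though; my real pipeline goes through fastai.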

So here’s what I’m doing:

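import numpy as np
import torch
from fastai.vision import *  # fastai v1: PointsItemList, imagenet_stats

def get_label(x):
    # Build the annotation path from the image file name
    points = np.loadtxt('annotation_simpler/' + x.name.replace('jpg', 'txt').replace('test', 'valid'), delimiter=',', skiprows=0)
    div = points[0]  # image size I added as the first row of the txt file
    p = np.zeros_like(points[1:])

    # Swap the two coordinate columns
    p[:, 0] = points[1:, 1]
    p[:, 1] = points[1:, 0]

    # Divide by the image size so the targets fall between 0 and 1
    out = torch.tensor(p / div).float()
    return out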

data = PointsItemList.from_folder('./fastai_dataset').split_by_folder(train = 'train', valid = 'valid').\
                    label_from_func(get_label).\
                    transform(size = 224).\
                    databunch(bs = 4, num_workers = 1).\
                    normalize(imagenet_stats)

However, when I call data.show_batch(), all the points end up in the upper-left corner. Inspecting the point tensors also shows that the coordinates have been modified somewhere in the fastai pipeline.
I’ve tried various approaches, such as:

  • Removing the scaling inside the get_label function: the landmarks are then misaligned with the face.
  • Setting tfm_y = True in the databunch creation: this raises an error about the shape of the labels.
  • Writing a custom item class that would correctly reconstruct the labels and predictions for display: so far I haven’t fully figured out how to do this.
  • Manually setting the scale boolean I found in PointsLabelList to True: this made no perceptible difference.

Does anyone have a solution for this setup? Maybe someone has already dealt with landmark regression in this situation?

Thanks a lot! (: