Flipping and image regression with keypoints


I want to augment my data with flipping, but I’ve realized this can cause problems when the keypoints have semantic meaning or a fixed order. For instance, look at this picture of a cat. The ear keypoints are always stored in the same order: [left ear, right ear]. After a horizontal flip, their positions in the image are swapped, but their order in the target is not. So when the model makes a prediction, it gets a big error even though the prediction is visually correct!

To make it even clearer:
I have two points for the ears: [(100, 150), (200, 150)]
Flipping the image horizontally around x=150 gives the target Y: [(200, 150), (100, 150)]
Predictions from the model: [(98, 149), (201, 151)]
This gives a huge error, since the model predicts the points left to right, but the target points are now in the opposite order.
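To put numbers on it, here is a rough sketch in plain NumPy (not fastai code). The helper name `hflip_points` is made up for illustration, and I'm assuming a plain mean-squared error over all coordinates:

```python
import numpy as np

def hflip_points(points, axis_x):
    """Mirror (x, y) points around the vertical line x = axis_x (hypothetical helper)."""
    pts = np.asarray(points, dtype=float).copy()
    pts[:, 0] = 2 * axis_x - pts[:, 0]  # only x changes in a horizontal flip
    return pts

ears = np.array([(100, 150), (200, 150)], dtype=float)   # [left ear, right ear]
flipped = hflip_points(ears, axis_x=150)                 # order in the array is unchanged
preds = np.array([(98, 149), (201, 151)], dtype=float)   # model predicts left to right

# MSE against the flipped target as-is (order no longer matches the image)
mse_flipped = np.mean((preds - flipped) ** 2)

# MSE after putting the target back into left-to-right order
mse_reordered = np.mean((preds - flipped[::-1]) ** 2)

print(mse_flipped, mse_reordered)
```

With these numbers the first error is in the thousands and the second is below 2, even though the predictions are identical in both cases.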

How is this best solved? Should I change my MSELoss to first sort the ground-truth Y points back into left-to-right order? Or should I handle it in an augmentation step, so that the points always end up in the same order?
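For the augmentation option, this is roughly what I have in mind, as a sketch only. It is plain NumPy, not fastai's API: `hflip_keypoints` and `swap_pairs` are names I made up, and I'm assuming the flip maps x to `image_width - x` (so a width of 300 mirrors around x=150, as in my example):

```python
import numpy as np

def hflip_keypoints(points, image_width, swap_pairs):
    """Horizontally flip keypoints and restore their left/right order.

    points      : array-like of shape (n_points, 2) holding (x, y)
    image_width : assumed width such that the flip maps x to image_width - x
    swap_pairs  : (i, j) index pairs of keypoints that trade roles when
                  mirrored, e.g. (left ear, right ear)
    """
    pts = np.asarray(points, dtype=float).copy()
    pts[:, 0] = image_width - pts[:, 0]       # mirror the x coordinates
    for i, j in swap_pairs:
        pts[[i, j]] = pts[[j, i]]             # swap rows so index 0 is the left one again
    return pts

ears = [(100, 150), (200, 150)]               # [left ear, right ear]
target = hflip_keypoints(ears, image_width=300, swap_pairs=[(0, 1)])

# An asymmetric case, to show that the rows really are swapped
tilted = hflip_keypoints([(90, 140), (210, 160)], image_width=300, swap_pairs=[(0, 1)])
print(target, tilted)
```

Compared with sorting inside the loss, this keeps the loss untouched and works for any labelled left/right pair, but it does need the list of pairs to be specified per dataset.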

I actually think this is the “bug” mentioned here, where flipping with TensorPoint makes the model only predict points in the middle: Flip with TensorPoint causes a bug · Issue #100 · fastai/fastai2 · GitHub
Of course it will: when the flips give it a huge error for either ordering, the model regresses towards the middle.

This might also be the issue @muellerzr is seeing in his GitHub issue here: Major bug regarding Augmentation and TensorPoint · Issue #2628 · fastai/fastai · GitHub