4-point regression on images to scan and warp documents

Hello! I’ve been taking the fast.ai course for a few weeks, and I’d like to apply it to a problem at my workplace: scanning a specific type of document that is a bit too complex for a simple OpenCV edge-detection program.

The idea is to use 4-point regression to predict the four corners of the rectangular document, so that it can be warped and cropped easily before running OCR on it.
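To make the warp step concrete, here is a minimal sketch of how four predicted corners could be turned into a perspective transform. The corner values, the output size, and the helper names `homography_from_corners` and `apply_homography` are made up for illustration, and the corner order (top-left, top-right, bottom-right, bottom-left) is an assumption.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps each src corner to its dst corner."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each point pair gives two linear equations, with H[2, 2] fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, points):
    """Map (N, 2) points through H, including the perspective divide."""
    pts = np.asarray(points, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical predicted corners: top-left, top-right, bottom-right, bottom-left.
corners = [(120, 80), (530, 110), (560, 700), (90, 660)]
out_w, out_h = 420, 594  # assumed output size, roughly A4 proportions
target = [(0, 0), (out_w - 1, 0), (out_w - 1, out_h - 1), (0, out_h - 1)]

H = homography_from_corners(corners, target)
mapped = apply_homography(H, corners)  # should land on the target rectangle
```

In practice, OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` would probably do both the matrix computation and the actual image resampling; the NumPy version above only shows the geometry.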

Although I have the dataset ready (many images of this type of document, each with the coordinates of its four corners in a consistent order), I have no idea how to use the fast.ai API to actually train the model. I looked at the example in lesson3-head-pose, but I could not figure out how to pass more than one point as the label for each image in the training set.
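For concreteness, this is roughly how I picture the labels: each image's four corners flattened into 8 numbers scaled to [-1, 1], which a regression head could predict directly. This is only a sketch of the encoding, not fast.ai code. The helper names `encode_corners` and `decode_corners` are hypothetical, and I am assuming (without having verified it) that a points label in fast.ai can hold a 4x2 tensor instead of a single point. The axis order fast.ai expects, (x, y) or (y, x), may also depend on the library version.

```python
import numpy as np

def encode_corners(corners, width, height):
    """Scale four (x, y) pixel corners to [-1, 1] and flatten them to 8 values."""
    pts = np.asarray(corners, dtype=np.float32)
    scale = np.array([width - 1, height - 1], dtype=np.float32)
    return (2.0 * pts / scale - 1.0).reshape(-1)

def decode_corners(target, width, height):
    """Invert encode_corners: 8 values in [-1, 1] back to a (4, 2) pixel array."""
    pts = np.asarray(target, dtype=np.float32).reshape(4, 2)
    scale = np.array([width - 1, height - 1], dtype=np.float32)
    return (pts + 1.0) / 2.0 * scale

# Hypothetical label for one 640x480 image, corners in a fixed order.
corners = [(120, 80), (530, 110), (560, 400), (90, 380)]
encoded = encode_corners(corners, width=640, height=480)
decoded = decode_corners(encoded, width=640, height=480)
```

Keeping the corner order identical across the whole dataset matters here, since the model has no other way to know which output pair corresponds to which corner.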

I also tried image segmentation, classifying each pixel as either “part of document” or “not part of document”. That worked reasonably well, but the resulting masks were not clean enough to find the four corners of the document with the edge and contour detection explained in this article: How to Build a Kick-Ass Mobile Document Scanner in Just 5 Minutes
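One simpler way to pull corners out of a segmentation mask, which is not the contour approach from the article, is to take the extreme mask pixels along the two diagonals. A minimal sketch follows; `corners_from_mask` is a hypothetical helper, the mask is synthetic, and the heuristic assumes the document is not rotated close to 45 degrees and that the mask has no stray blobs outside the document.

```python
import numpy as np

def corners_from_mask(mask):
    """Return corners as (x, y) rows: top-left, top-right, bottom-right, bottom-left."""
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        raise ValueError("mask contains no document pixels")
    s = xs + ys  # smallest at the top-left, largest at the bottom-right
    d = xs - ys  # largest at the top-right, smallest at the bottom-left
    order = (np.argmin(s), np.argmax(d), np.argmax(s), np.argmin(d))
    return np.array([[xs[i], ys[i]] for i in order])

# Synthetic 100x120 mask with an axis-aligned "document" region.
mask = np.zeros((100, 120), dtype=np.uint8)
mask[20:81, 30:101] = 1  # rows 20..80, columns 30..100

found = corners_from_mask(mask)
```

Because a single misclassified pixel far from the document would throw this off, it likely needs some cleanup first, such as keeping only the largest connected region of the mask.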

Does anybody have any ideas? If you can think of a better way to achieve this without using point regression, please share it. I would be very grateful.