Hi there,

I would like to solve an interesting, fundamental problem in sports analytics, but I am a bit lost at the moment. Hopefully somebody here can help evaluate my approach or point me in a different direction. I want to create a deep learning model that learns the homography of hockey video footage in order to map it to the rink model, something similar to this: https://www.youtube.com/watch?v=9pUjkdiS9HE …

I have had this problem in mind for months and have done quite a lot of research. I found that the traditional approach, using feature detection and RANSAC to find the homography between consecutive frames, kind of works, but I think this can be solved using deep learning as well.

The solution I have in mind so far is kind of similar to facial keypoint detection:

- Define a set of keypoints in the rink model, like `(kp1, kp2, kp3...)`, that are always in fixed positions (mostly corners and other points that help the model tell locations apart).
- Design a CNN that learns to detect these positions in the video footage, then find the top 4 points and do a `warpPerspective` to the rink model.
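To make the second step concrete, here is a rough sketch of how I imagine going from detected keypoints to a homography. Everything in it is an assumption on my side: the CNN output format (`name -> (x, y, confidence)`), reading "top 4" as the four most confident detections, and the helper names, which are made up for illustration. It uses plain Python so it runs without dependencies; in practice I would expect `cv2.getPerspectiveTransform` or `cv2.findHomography` to do this part before calling `warpPerspective`.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def homography_from_4_points(src, dst):
    """Return 3x3 H (with H[2][2] fixed to 1) mapping each src point to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), same pattern for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(A, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def project(H, point):
    """Apply homography H to a 2D point."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def top4_homography(detections, rink_keypoints):
    """detections: {name: (x, y, confidence)} from the (hypothetical) CNN.
    rink_keypoints: {name: (x, y)} fixed positions in the rink model."""
    best = sorted(detections, key=lambda k: detections[k][2], reverse=True)[:4]
    if len(best) < 4:
        raise ValueError("need at least four detected keypoints")
    src = [detections[k][:2] for k in best]   # image coordinates
    dst = [rink_keypoints[k] for k in best]   # rink-model coordinates
    return homography_from_4_points(src, dst)
```

One thing I am unsure about: if three of the four most confident points lie on a line, the system becomes degenerate, so some check on the selected points would probably be needed.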

I think the CNN will be basically similar to one for facial keypoint detection, but the difference is that I don't have the whole rink in view at once (compared with the whole face), only a part of the rink. Hopefully that won't be a problem for the network to learn. I hope that the model can also learn the spatial structure of those keypoints.

Do you think my approach above is feasible? I would really appreciate your help!

Many Thanks,

Anh