I would like to solve an interesting, fundamental problem in sports analytics, but I'm a bit lost at the moment. Hopefully somebody here can help evaluate my approach or point me in a different direction. I want to create a deep learning model that learns the homography of hockey video footage in order to map it to the rink model, something similar to this: https://www.youtube.com/watch?v=9pUjkdiS9HE …
I have had this problem in mind for months and have done quite a lot of research. I found that the traditional approach, using feature detection and RANSAC to find the homography between consecutive frames, kind of works, but I think this can be solved using deep learning as well.
The solution I have in mind so far is similar to facial keypoint detection:
- Define a set of keypoints in the rink model, like `(kp1, kp2, kp3...)`, that are always in fixed positions (mostly corners and other points that help the model tell locations apart).
- Design a CNN that learns to detect these positions in the video footage, then take the top 4 points and do a `warpPerspective` to the rink model.
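To make the second step more concrete, here is a rough NumPy sketch of what I imagine happening after the CNN has produced keypoint positions and confidence scores for one frame. Everything named here is an assumption for illustration only: the coordinates in `RINK_KEYPOINTS`, the confidence scores, and the function names are all made up. The homography is estimated with a plain direct linear transform; in practice I would expect OpenCV's `getPerspectiveTransform` or `findHomography` plus `warpPerspective` to do the same job.

```python
import numpy as np

# Hypothetical rink-model keypoints (x, y) in model units; fixed by definition.
RINK_KEYPOINTS = np.array([
    [0.0, 0.0], [200.0, 0.0], [200.0, 85.0], [0.0, 85.0], [69.0, 22.0],
])


def homography_from_points(src, dst):
    """Direct linear transform: 3x3 H with dst ~ H @ src, from >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # H is the right singular vector belonging to the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def apply_homography(points, h):
    """Map an (N, 2) array of points through H using homogeneous coordinates."""
    pts = np.hstack([points, np.ones((len(points), 1))])
    mapped = pts @ h.T
    return mapped[:, :2] / mapped[:, 2:3]


def top4_homography(frame_xy, confidence):
    """Frame-to-rink homography from the 4 most confident detected keypoints.

    Assumes frame_xy[i] is the detected position of RINK_KEYPOINTS[i] and that
    no three of the chosen points are collinear (otherwise H is not unique).
    """
    idx = np.argsort(confidence)[-4:]
    return homography_from_points(frame_xy[idx], RINK_KEYPOINTS[idx])
```

Usage would look roughly like `h = top4_homography(frame_xy, confidence)` followed by `apply_homography(player_positions, h)` to place detections on the rink. Taking only the top 4 points gives an exact fit with no redundancy, so one bad detection ruins the result; using all confident points with a robust estimator would be an alternative.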
I think the CNN will basically be similar to one for facial keypoint detection, but the difference is that I never see the whole rink at once (compared with the whole face), only a part of it. Hopefully that won't be a problem for the network to learn. I hope the model can also learn the spatial structure of those keypoints.
Do you think my approach above is feasible? I really appreciate your help!