Hello, I’m a new fastai user. I’m working through the Deep Learning for Coders with fastai & PyTorch book right now, but ultimately I’d like to create an AI model that can recognize math equations in an image, i.e. I take a photo of a page from a math textbook and the model finds all the math on the page. I’d like the model to generate MathML (MathML - Wikipedia) for each piece of math it identifies on the page.

Any guidance on how to use fastai to achieve this would be appreciated. I was thinking my problem might be an OCR-type problem (see "Building Custom Deep Learning Based OCR models" on nanonets.com), but the off-the-shelf OCR tools I’ve looked at are terrible at identifying math. Even x^2 isn’t recognized accurately, not to mention fractions, square roots, sigmas, etc.

I was also thinking that my problem might be solvable with segmentation (see the "Computer vision" page in the fastai docs). If I can train a model to recognize each math character, I would then need to find the bounding box of each character (not sure how to do that yet), and finally stitch the characters together into a math equation based on where their boxes sit relative to each other?
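To make the "stitching" step concrete, here is a rough sketch of what I have in mind, assuming the model already gives me a label and a bounding box per character. Everything in it is made up for illustration: `SymbolBox`, `boxes_to_mathml`, and the 0.25 threshold are my own hypothetical names and guesses, not anything from fastai. It only handles left-to-right symbols and simple superscripts, so fractions, roots and sigmas would need more layout rules.

```python
from dataclasses import dataclass


@dataclass
class SymbolBox:
    """One recognized character (hypothetical structure, not a fastai type)."""
    label: str  # predicted character, e.g. "x" or "2"
    x: float    # left edge of the bounding box
    y: float    # top edge (image coordinates, so y grows downward)
    w: float    # box width
    h: float    # box height


def _token(label):
    """Wrap a single character in the matching MathML token element."""
    if label.isdigit():
        return f"<mn>{label}</mn>"
    if label.isalpha():
        return f"<mi>{label}</mi>"
    return f"<mo>{label}</mo>"


def boxes_to_mathml(boxes):
    """Stitch character boxes into MathML using a crude layout heuristic."""
    ordered = sorted(boxes, key=lambda b: b.x)  # read left to right
    items = []  # each entry: (baseline box, its token, superscript tokens)
    for box in ordered:
        if items:
            base = items[-1][0]
            center_y = box.y + box.h / 2
            # Guess: a symbol whose vertical center sits above the top
            # quarter of the previous baseline symbol is a superscript.
            if center_y < base.y + 0.25 * base.h:
                items[-1][2].append(_token(box.label))
                continue
        items.append((box, _token(box.label), []))

    parts = []
    for _, tok, sups in items:
        if not sups:
            parts.append(tok)
        else:
            sup = sups[0] if len(sups) == 1 else "<mrow>" + "".join(sups) + "</mrow>"
            parts.append(f"<msup>{tok}{sup}</msup>")
    return "<math><mrow>" + "".join(parts) + "</mrow></math>"


# Hand-made boxes for "x^2 + 1" (coordinates invented for this example)
example = [
    SymbolBox("x", 0, 10, 10, 10),
    SymbolBox("2", 11, 4, 5, 6),
    SymbolBox("+", 18, 12, 6, 6),
    SymbolBox("1", 26, 10, 5, 10),
]
print(boxes_to_mathml(example))
```

The part I’m unsure about is whether hand-written rules like this scale to real textbook pages, or whether the layout itself should be learned by the model.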

Any suggestions or approaches that I could look at?