What model to choose for multiple handwritten digit recognition

IRailean · October 27, 2020, 8:36am

Hello,
The task is to detect digits in a grid in an image.
Example:
21000

What is the best model to use in this case?
I found that PP-OCR is doing very well, but I think this may be overkill for my task.

Thank you!

birosjh · October 29, 2020, 2:59am

If you can ensure that the grid is always in the same place each time, you could just use a library like OpenCV to slice the grid into individual images and then have one classifier that classifies the numbers.
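As a rough sketch of the slicing idea, assuming the grid fills the whole (already cropped) image and all cells are the same size: the helper name `grid_cell_boxes` and the example dimensions are made up for illustration, not taken from any library.

```python
def grid_cell_boxes(width, height, rows, cols):
    """Return (x0, y0, x1, y1) pixel boxes for each grid cell, row by row.

    Assumes the grid spans the full image and its cells are evenly sized.
    """
    boxes = []
    for r in range(rows):
        y0 = r * height // rows
        y1 = (r + 1) * height // rows
        for c in range(cols):
            x0 = c * width // cols
            x1 = (c + 1) * width // cols
            boxes.append((x0, y0, x1, y1))
    return boxes


# Hypothetical example: a 100x40 px image holding one row of 5 cells.
boxes = grid_cell_boxes(100, 40, rows=1, cols=5)
```

With an image loaded as a NumPy array (for example via `cv2.imread`), each cell could then be cropped as `img[y0:y1, x0:x1]` and passed to a single digit classifier.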

IRailean · October 29, 2020, 4:14pm

Hi @birosjh, thank you for your answer.

No, in test images the grid is not always present.
I tried the approach you mentioned, and also tried restoring the grid and applying different transformations, but the results were not good.

birosjh · October 30, 2020, 1:22am

@IRailean
Ah, so the position of the numbers can move around quite a bit then? If splitting them into individual images isn’t possible, then I think looking for pretrained models is a good idea. PP-OCR seems like it would be fine to use, but you might check whether it needs a GPU for inference. This project also has pretrained models and looks like it might work pretty well: https://github.com/githubharald/SimpleHTR

IRailean · November 4, 2020, 4:01pm

Some preprocessing + densenet121 worked very well (resnet34 is applicable too, but has 2-3% lower accuracy).

Find and remove horizontal and vertical lines, then remove noise using Gaussian kernels.
Join close contours.
Find the 4 biggest contours. Remove contours with small area/width/height (not all items have exactly 4 digits; some have fewer).
Extract the digits.
Classify each digit with the CNN.
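The contour-selection step above could look roughly like the sketch below. It works on bounding boxes given as `(x, y, w, h)` tuples, such as those `cv2.boundingRect` returns for each contour. The function name `select_digit_boxes` and all threshold values are assumptions for illustration, not the values used in the actual solution.

```python
def select_digit_boxes(boxes, max_digits=4, min_area=50, min_w=3, min_h=8):
    """Keep up to `max_digits` largest boxes, ordered left to right.

    `boxes` is a list of (x, y, w, h) tuples. The thresholds are
    placeholder values and would need tuning on real images.
    """
    def area(box):
        return box[2] * box[3]

    # Drop boxes that are too small to be a digit (noise, line remnants).
    kept = [b for b in boxes
            if b[2] >= min_w and b[3] >= min_h and area(b) >= min_area]
    # Keep the biggest ones; fewer than max_digits may remain.
    largest = sorted(kept, key=area, reverse=True)[:max_digits]
    # Restore reading order by x coordinate.
    return sorted(largest, key=lambda b: b[0])


# Hypothetical boxes: five digit-sized candidates plus two small specks.
candidates = [(50, 5, 12, 20), (10, 5, 12, 20), (30, 6, 2, 2),
              (70, 5, 11, 19), (90, 4, 13, 21), (0, 0, 4, 10),
              (110, 5, 12, 22)]
digits = select_digit_boxes(candidates)
```

Sorting by area first and by x position second keeps the largest candidates while still returning the digits in the order they are written.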

birosjh · November 6, 2020, 2:36am

Glad you found a way to tackle it. I hadn’t thought of finding the 4 biggest contours. Awesome idea!