Optical Character Recognition (OCR) for power meter readings

faib · May 28, 2019, 11:45am

Hi everyone,

I am working on an OCR model using fastai. In the first step I successfully extracted the meter readings from the image like so:

7500002498

Now I am stuck trying to run OCR on the extracted meter readings. I have already tried Pytesseract and the Google Cloud Vision API, with little success.

How should I proceed from here? I thought about training on the extracted images using MultiCategoryList for the labels (as sgugger described here).
However, supplying label_from_func with a string in which the characters are separated by a delimiter, and then passing label_delim=';' when I call label_from_func, discards the order of the labels and removes duplicates.

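from fastai.vision import *  # fastai v1

# get_label is my own function: it maps an image path to a string such as '7;5;0;...'
data = (ImageList.from_folder('data/train-extracted-224')
    .split_by_rand_pct()
    .label_from_func(get_label, label_delim=';', label_cls=MultiCategoryList)
    .databunch()
   )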

data.show_batch(3)
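
To illustrate why the order and the duplicates disappear: a multi-label target is essentially a multi-hot vector with one slot per class, so it only records whether a digit occurs at all. The following is a minimal plain-Python sketch of that idea, not fastai's actual implementation; multi_hot is a hypothetical helper and the reading string is just the example from above.

```python
def multi_hot(label, classes, delim=";"):
    """Encode a delimited label string as a multi-hot vector over `classes`."""
    present = set(label.split(delim))  # a set keeps neither order nor repeats
    return [1 if c in present else 0 for c in classes]

classes = [str(d) for d in range(10)]

reading = ";".join("7500002498")    # "7;5;0;0;0;0;2;4;9;8"
scrambled = ";".join("0000245789")  # same digits, different order

encoded_reading = multi_hot(reading, classes)
encoded_scrambled = multi_hot(scrambled, classes)

# Both readings collapse to the same target, so the model cannot learn the order.
print(encoded_reading == encoded_scrambled)
```

Two different readings that contain the same set of digits end up with identical targets, which is why a multi-label setup cannot represent a meter reading on its own.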

sgugger · May 28, 2019, 1:31pm

Yes, there is no proper class to do this yet. You should look at the code of MultiCategory and adapt it to your needs.
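
One possible direction for such an adaptation is to store the label as a fixed-length sequence of class indices, one per digit position, instead of a multi-hot vector. The sketch below is plain Python and does not use any fastai API; encode_reading, decode_reading, the padding class, and max_len=12 are all assumptions made for illustration.

```python
DIGITS = "0123456789"
PAD = len(DIGITS)  # extra class index meaning "no digit at this position"

def encode_reading(reading, max_len=12):
    """Map a digit string to a fixed-length list of class indices, order preserved."""
    if len(reading) > max_len:
        raise ValueError(f"reading longer than max_len={max_len}: {reading!r}")
    idxs = [DIGITS.index(ch) for ch in reading]  # raises ValueError on non-digits
    return idxs + [PAD] * (max_len - len(idxs))

def decode_reading(idxs):
    """Inverse of encode_reading: drop padding and join the digits."""
    return "".join(DIGITS[i] for i in idxs if i != PAD)

target = encode_reading("7500002498")
print(target)
print(decode_reading(target))
```

With a target like this, a model would need one classification output per position (each over the ten digits plus the padding class), and the loss would be computed position by position. Order and repeated digits are kept because every position has its own slot.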

haverstind · May 29, 2019, 8:18am

Did you extract the region manually or automatically? Either way, I’d be very interested in updates regarding your project.

faib · June 14, 2019, 7:06am

I drew bounding boxes around the power meter reading on ~200 images and then used Radek Osmulski’s notebook from the Humpback Whale Identification Competition to extract the bounding boxes.

haverstind · June 14, 2019, 7:20am

Thank you for the update. Did you adapt MultiCategoryList in the end? I’m still trying to understand how to modify it to support ordered labels and “duplicates”.

faib · June 14, 2019, 7:46am

I only used the approach above for the meter reading extraction.

I am currently trying to use RetinaNet to detect multiple objects, namely the individual digits (see here for a more detailed explanation).
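
Once a detector returns one box per digit, the reading can be assembled by sorting the boxes from left to right. The following is a rough sketch under several assumptions: the Detection tuple, its field names, and the 0.5 score threshold are hypothetical and not RetinaNet's real output format, the digits sit in a single horizontal row, and overlapping boxes have already been removed by non-maximum suppression.

```python
from typing import List, NamedTuple

class Detection(NamedTuple):
    digit: str    # predicted class, "0" to "9"
    x_min: float  # left edge of the bounding box, in pixels
    score: float  # detector confidence in [0, 1]

def assemble_reading(detections: List[Detection], min_score: float = 0.5) -> str:
    """Drop low-confidence boxes, then read the remaining digits left to right."""
    kept = [d for d in detections if d.score >= min_score]
    kept.sort(key=lambda d: d.x_min)
    return "".join(d.digit for d in kept)

# Hypothetical detections, deliberately out of order and with one weak box.
detections = [
    Detection("0", 60.0, 0.91),
    Detection("7", 10.0, 0.98),
    Detection("3", 47.0, 0.12),  # below the threshold, ignored
    Detection("5", 35.0, 0.95),
]
reading_from_boxes = assemble_reading(detections)
print(reading_from_boxes)
```

Because each digit is a separate detection, repeated digits and their order come for free from the box positions.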