Optical Character Recognition (OCR) for power meter readings

Hi everyone,

I am working on an OCR model using fastai. In the first step I successfully extracted the meter readings from the image like so:

7500002498

Now I am stuck trying to use OCR on the extracted meter readings. I already used Pytesseract and Google Cloud Vision API with little success.

How should I proceed from here? I thought about taking the extracted images and training a model on them, using MultiCategoryList for the labels (as sgugger described here).
However, supplying label_from_func with a string in which the digits are separated by a delimiter character, and then passing label_delim=';' when I call label_from_func, loses the order of the labels and removes duplicates.

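from fastai.vision import *  # fastai v1

# get_label (defined elsewhere) returns the digits as one ';'-separated string
data = (ImageList.from_folder('data/train-extracted-224')
        .split_by_rand_pct()
        .label_from_func(get_label, label_delim=';', label_cls=MultiCategoryList)
        .databunch()
       )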

data.show_batch(3)
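
To illustrate why order and duplicates disappear: a multi-label target only records which classes are present, not where or how often. The snippet below is a minimal sketch in plain Python that mimics such a multi-hot encoding. It is not fastai's actual implementation, and `multi_hot` is a hypothetical helper written only for this illustration.

```python
# Illustration only: mimics a multi-hot (multi-label) target, not fastai's code.
def multi_hot(label, classes, delim=";"):
    """Return a 0/1 vector marking which classes occur in the label string."""
    present = set(label.split(delim))  # a set keeps neither order nor repeats
    return [1 if c in present else 0 for c in classes]

classes = [str(d) for d in range(10)]

# Two different readings that contain the same set of digits.
a = multi_hot("7;5;0;0;0;0;2;4;9;8", classes)
b = multi_hot("8;9;4;2;0;5;7", classes)

print(a)       # one slot per digit class, 1 if that digit occurs anywhere
print(a == b)  # both readings collapse to the same target
```

Because both strings map to the same vector, a model trained on such targets cannot recover the sequence of digits.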

Yes, there is no proper class to do this yet. You should look at the code of MultiCategory and adapt it to your needs.
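
One possible direction for such an adaptation, sketched in plain Python rather than as a fastai class: encode each reading as a fixed-length list of class indices, one per digit position, so that order and repeated digits survive. The names `VOCAB`, `PAD`, `encode_reading`, and `decode_reading`, as well as the maximum length of 12, are assumptions made for this sketch and do not come from fastai.

```python
# Sketch of an order-preserving label encoding (hypothetical helpers, not fastai API).
VOCAB = "0123456789"
PAD = len(VOCAB)  # extra index used to pad short readings

def encode_reading(reading, max_len=12):
    """Map a digit string to one class index per position, padded to max_len."""
    if len(reading) > max_len:
        raise ValueError("reading is longer than max_len")
    idxs = [VOCAB.index(ch) for ch in reading]  # raises ValueError on non-digits
    return idxs + [PAD] * (max_len - len(idxs))

def decode_reading(idxs):
    """Inverse of encode_reading: drop padding and join the digits."""
    return "".join(VOCAB[i] for i in idxs if i != PAD)

target = encode_reading("7500002498")
print(target)
print(decode_reading(target))
```

With targets like this, a model would need one classification output per position (for example, one cross-entropy term per digit slot) instead of a single multi-label output.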

Did you extract the region manually or automatically? Anyway, I’d be very interested in updates regarding your project :slight_smile:


I drew bounding boxes around the power meter reading on ~200 images and then used Radek Osmulski’s notebook from the Humpback Whale Identification Competition to extract the bounding boxes :slight_smile:


Thank you for the update. Did you adapt the MultiCategoryList in the end? (because I’m still trying to understand how to modify it so you could have ordered labels + “duplicates”).

I only used the approach above for the meter reading extraction.

I am currently trying to use RetinaNet to detect multiple objects, namely the individual digits (see here for a more detailed explanation).
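
With a per-digit detector, the reading still has to be assembled from the individual detections. A minimal sketch of that step, assuming each detection is a dict with a left edge `x_min`, a predicted `label`, and a confidence `score` (a made-up format for illustration, not RetinaNet's actual output, with made-up values):

```python
# Sketch: turn per-digit detections into a reading by sorting left to right.
def assemble_reading(detections, min_score=0.5):
    """Join detected digits into one string, ordered by horizontal position."""
    kept = [d for d in detections if d["score"] >= min_score]  # drop weak boxes
    kept.sort(key=lambda d: d["x_min"])                        # left-to-right order
    return "".join(d["label"] for d in kept)

# Made-up detections in arbitrary order, including one low-confidence box.
detections = [
    {"x_min": 40.0, "label": "5", "score": 0.97},
    {"x_min": 12.0, "label": "7", "score": 0.99},
    {"x_min": 70.0, "label": "0", "score": 0.95},
    {"x_min": 71.0, "label": "8", "score": 0.20},
]

reading = assemble_reading(detections)
print(reading)
```

This assumes the digits sit on a single horizontal line and that overlapping boxes have already been resolved (for example by non-maximum suppression).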
