Reading Text from images using convolution


(satish) #1

I have finished lesson 2 of the MOOC and I am looking to build something that can read text from images. I believe CNNs are powerful enough to do this.

I would like to know if there are any pre-trained models that can do this.


(Karthik Kannan) #2

You will need to look at an attention OCR model for this. Google put out a pre-trained model that does just that.

https://github.com/tensorflow/models/tree/master/attention_ocr

Take a look at this. :slight_smile:


(arnaud) #3

Hi everyone :slight_smile:

Not to be too lazy, but… has anyone, by chance, already dug into the attention_ocr model released by TensorFlow that @karthik_k314 mentioned? It would be great to know before venturing in, because as it stands it doesn’t look easy to use, at least to me :slight_smile:

Thanks a lot,
A


(Alexandre Cadrin-Chênevert) #4

I think the attention OCR model is focused on street names. A more general-purpose OCR engine is Tesseract, an open-source project whose development Google has sponsored: https://github.com/tesseract-ocr


(Niyaz Puzhikkunnath) #5

Whether Tesseract will work for you depends on the type of image you are working with.

  • Scans of documents? Tesseract may work well.
  • Photographs of street signs? Tesseract will most probably fail.

See: https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality
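
One of the preprocessing steps usually suggested for Tesseract is binarization, i.e. turning the image into pure black and white before running OCR. As a rough illustration only, here is a minimal sketch of Otsu's thresholding in plain Python. It assumes the image is already an 8-bit grayscale list of rows, and the names `otsu_threshold` and `binarize` are made up for this example; they are not part of Tesseract or any library.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a flat list of 8-bit grayscale values.

    Pixels with a value greater than the returned threshold count as
    foreground, the rest as background.
    """
    # Build the intensity histogram.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0      # number of pixels at or below the candidate threshold
    sum_bg = 0    # sum of their intensities
    for t in range(256):
        w_bg += hist[t]
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor).
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Map a grayscale image (list of rows) to 0/255 using Otsu's threshold."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[255 if p > t else 0 for p in row] for row in image]


if __name__ == "__main__":
    tiny = [[10, 200], [205, 12]]
    print(binarize(tiny))
```

In practice you would load the image with an imaging library and probably use its built-in thresholding; the point is only that a cleaner, high-contrast input tends to give Tesseract a better chance, especially on scans.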


(Karthik Kannan) #6

If you do want to use the attention_ocr model on your own data and deploy it, you are better off with DeepDetect: https://deepdetect.com/

I’ve found it to be FAR better than TensorFlow Serving. It has made deploying models much easier for me.


(arnaud) #7

Hi all,

First, thanks for your answers.

@alexandrecc @niyazpk I’m indeed trying to detect text in pictures. They’re not street signs, but rather everyday products (e.g. a bottle) and things like that. So I’m not sure Tesseract is the solution…

@karthik_k314 I’ve looked through DeepDetect, and they have spectacularly easy-to-consume Docker images for localization and classification. I couldn’t find any ready-to-use text recognition model, though. Did you go through it?

Google’s Cloud Vision API for text detection works really well… I don’t know what they use :wink: