Help to perform OCR

abhigupta4981 · March 28, 2019, 8:34pm

I want to build a model that can perform OCR, i.e. read the text inside an image. Is there a dataset I can use, and what architecture should I use?

klemenka · March 29, 2019, 8:54am

I found this, which may be useful: 🔥 Latest Deep Learning OCR with Keras and Supervisely in 15 minutes | HackerNoon

and this GitHub repository: GitHub - emedvedev/attention-ocr: A Tensorflow model for text recognition (CNN + seq2seq with visual attention) available as a Python package and compatible with Google Cloud ML Engine.

You can also generate your own dataset by drawing text onto images, although I don’t know how well a model trained on such synthetic images will work on real text in real photos. For example, using PIL:

from PIL import Image, ImageDraw, ImageFont

# Open the background image and create a drawing context for it.
img = Image.open("sample_in.jpg")
draw = ImageDraw.Draw(img)
# ImageFont.truetype(<font-file>, <font-size>); the path must point to a .ttf file that exists on your system.
font = ImageFont.truetype("sans-serif.ttf", 16)
# draw.text((x, y), "Sample Text", fill=(r, g, b), font=font)
draw.text((0, 0), "Sample Text", fill=(255, 255, 255), font=font)
img.save("sample-out.jpg")
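
To turn the snippet above into an actual training set, you also need to store the label for each image. Below is a minimal sketch of one way to do that; it is only an illustration, not a tested recipe. The function names (`random_text`, `render_text`, `generate_dataset`), the image size, the random alphanumeric labels, and the `labels.csv` layout are all my own assumptions. It uses Pillow's built-in default font and a plain white background so it runs without any font file or input photo; for anything closer to real photos you would likely want varied fonts and backgrounds.

```python
import csv
import random
import string
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont


def random_text(rng, min_len=3, max_len=8):
    """Return a random alphanumeric string to use as the label (hypothetical helper)."""
    alphabet = string.ascii_letters + string.digits
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(alphabet) for _ in range(length))


def render_text(text, size=(160, 32)):
    """Draw dark text on a white grayscale canvas and return the image."""
    img = Image.new("L", size, color=255)
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()  # Pillow's built-in default font, no .ttf file needed
    draw.text((4, 8), text, fill=0, font=font)
    return img


def generate_dataset(out_dir, n_samples=10, seed=0):
    """Write n_samples PNG images plus a labels.csv with (filename, text) rows."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)  # seeded so the dataset is reproducible
    rows = []
    for i in range(n_samples):
        text = random_text(rng)
        name = f"{i:05d}.png"
        render_text(text).save(out / name)
        rows.append((name, text))
    with open(out / "labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["filename", "text"])
        writer.writerows(rows)
    return rows
```

Calling `generate_dataset("synthetic_ocr", n_samples=1000)` would write 1000 images and one `labels.csv` into a folder named `synthetic_ocr` (the folder name is just an example).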

abhigupta4981 · March 29, 2019, 9:12am

thank you