Can CTPN or CRNN be changed to recognize multiple words?


(bhaarat sharma) #1

I am playing around with CTPN (https://arxiv.org/abs/1609.03605) from this GitHub repo. It gives me bounding boxes for the text in images. However, each box can contain multiple words. For example, in the image below:

line

One of the detected text boxes is:

box_0

When this box is run through CRNN (https://arxiv.org/abs/1507.05717), I get the result "themistakeyou", with no spaces between the words.

Does anyone have experience with this?

  • Should I be modifying CTPN to detect single word boundaries or modifying CRNN to recognize multiple words?
  • Can CRNN be re-trained to recognize multiple words at the same time?

Would appreciate any help on this.


#2

As far as I know, you can use CTC as the loss function to train your model. This loss function lets you have multiple "labels" in a row as the target Y of your model.
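
To make that concrete, here is a minimal sketch of how a multi-word label could be encoded as a CTC target, assuming the space character is simply added to the alphabet as an ordinary class. The alphabet, the `encode` and `encode_batch` helpers, and the choice of index 0 for the CTC blank are all illustrative assumptions, not the code of any particular CRNN implementation.

```python
# Hypothetical alphabet: lowercase letters, digits, plus a space so that
# word boundaries become a regular class the network can predict.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 "

# Index 0 is reserved for the CTC blank (an assumption of this sketch),
# so real characters start at 1.
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(text):
    """Turn a (possibly multi-word) label into a list of class indices."""
    return [CHAR_TO_INDEX[ch] for ch in text.lower()]


def encode_batch(texts):
    """Encode several labels into one flat target list plus per-label
    lengths, a layout commonly fed to CTC loss implementations."""
    targets, lengths = [], []
    for text in texts:
        indices = encode(text)
        targets.extend(indices)
        lengths.append(len(indices))
    return targets, lengths


targets, lengths = encode_batch(["the mistake you", "made"])
print(lengths)  # [15, 4]
```

Whether a model trained this way actually learns to emit the space class depends on the training data containing multi-word examples; the sketch only shows that the target format itself does not rule them out.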


(bhaarat sharma) #3

The CRNN implementation I’m using does use CTC: https://github.com/meijieru/crnn.pytorch/blob/4850b736fc8186b98c864a0c95c032a887e7c537/utils.py#L10 but I’m unsure whether you’re suggesting that I retrain CRNN with data that contains multiple words, or that I tweak the parameters. If the former, is there a dataset I could use? Most folks use http://www.robots.ox.ac.uk/~vgg/data/text/ but it contains only single words.


(bhaarat sharma) #4

Seems like the author thinks CTC won’t work: https://github.com/meijieru/crnn.pytorch/issues/53#issuecomment-321514688


#5

In the project https://github.com/ypwhs/captcha_break the captcha labels contain multiple letters. However, I am not sure whether words should be kept together as well. CTC seems to ignore blanks within a region. Looking forward to learning from OCR experts.
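
One way to read the point about blanks: the CTC blank is a separator symbol that is dropped during decoding, and it is not the same thing as a literal space character. Below is a minimal sketch of greedy CTC decoding, assuming blank index 0 and a hypothetical alphabet in which the space is an ordinary class. The per-frame indices are made up for illustration and are not output from CRNN.

```python
BLANK = 0  # CTC blank index (assumed for this sketch)
ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # hypothetical; last entry is a space


def greedy_ctc_decode(indices):
    """Collapse consecutive repeats, then drop CTC blanks."""
    chars = []
    previous = BLANK
    for index in indices:
        if index != BLANK and index != previous:
            chars.append(ALPHABET[index - 1])  # classes start at 1
        previous = index
    return "".join(chars)


# Made-up per-frame predictions. 27 is the space class, 0 is the blank.
with_space = [20, 20, 0, 8, 5, 5, 0, 27, 0, 25, 15, 0, 21]
without_space = [20, 20, 0, 8, 5, 5, 0, 0, 0, 25, 15, 0, 21]

print(greedy_ctc_decode(with_space))     # the you
print(greedy_ctc_decode(without_space))  # theyou
```

In this sketch the words only stay separated when the network predicts the space class itself; a run of blanks between words is removed and the words are fused.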