omnipresent
(bhaarat sharma)
January 30, 2018, 1:29am
1
I am playing around with CTPN (https://arxiv.org/abs/1609.03605 ) from this GitHub Repo . It gives me text boxes in images, but each box can contain multiple words. For example, one of the boxes detected in my test image contains several words.
When this box is run through CRNN (https://arxiv.org/abs/1507.05717 ), I get the result themistakeyou, with no spaces between the words.
Does someone have any experience with this?
Should I be modifying CTPN to detect single word boundaries or modifying CRNN to recognize multiple words?
Can CRNN be re-trained to recognize multiple words at the same time?
Would appreciate any help on this.
As far as I know, you can use CTC as the loss function to train your model. With CTC, the target Y for each input can be a whole sequence of labels in a row rather than a single label.
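As a rough sketch of what that could look like for multi-word targets: the label string is turned into a sequence of class indices, and a space is just one more class. The alphabet and the `encode` helper below are hypothetical, and the sketch assumes index 0 is reserved for the CTC blank (the default for `torch.nn.CTCLoss`).

```python
# Hypothetical alphabet: digits and lowercase letters plus a trailing space,
# so that word boundaries become a class the model can predict.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz "
BLANK = 0  # assumed CTC blank index; real characters start at 1
CHAR_TO_INDEX = {ch: i + 1 for i, ch in enumerate(ALPHABET)}


def encode(texts):
    """Flatten a batch of strings into one target list plus per-string lengths."""
    targets, lengths = [], []
    for text in texts:
        text = text.lower()
        targets.extend(CHAR_TO_INDEX[ch] for ch in text)
        lengths.append(len(text))
    return targets, lengths


targets, lengths = encode(["the mistake you"])
print(lengths)  # [15]
```

The flattened targets and the lengths are the kind of inputs a CTC loss expects alongside the per-frame network outputs. The model would still need training data whose labels actually contain spaces.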
omnipresent
(bhaarat sharma)
January 30, 2018, 5:32am
3
The implementation of CRNN I’m using does use CTC: https://github.com/meijieru/crnn.pytorch/blob/4850b736fc8186b98c864a0c95c032a887e7c537/utils.py#L10 but I’m unsure whether you’re suggesting that I retrain CRNN with data that contains multiple words, or that I tweak the parameters. If the former, is there a dataset I could use? Most folks use http://www.robots.ox.ac.uk/~vgg/data/text/ but it contains only single words.
omnipresent
(bhaarat sharma)
January 30, 2018, 5:38am
4
In this project https://github.com/ypwhs/captcha_break , captcha labels contain multiple letters. However, I am not sure whether multiple words can be kept together in one label in the same way. CTC seems to ignore blanks within a region. Looking forward to learning from OCR experts.
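One thing worth separating here: the CTC blank is not the space character. The blank only marks "no character" between frames and is dropped during decoding, so a space between words can only appear in the output if the space is itself a class in the alphabet. A minimal sketch of greedy CTC decoding that shows the difference, assuming a hypothetical alphabet with a trailing space and blank index 0:

```python
# Hypothetical alphabet; the last character is a real space class.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz "
BLANK = 0  # assumed CTC blank index


def greedy_decode(frame_indices):
    """Collapse consecutive repeats, then drop CTC blanks."""
    chars = []
    previous = BLANK
    for index in frame_indices:
        if index != BLANK and index != previous:
            chars.append(ALPHABET[index - 1])
        previous = index
    return "".join(chars)


# Per-frame argmax indices: a, a, blank, space, blank, b
frames = [11, 11, 0, 37, 0, 12]
print(greedy_decode(frames))  # a b
```

If the alphabet has no space class, the same decoding can only ever produce run-together text such as themistakeyou.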
Did you try training on any other datasets…?
Refer to https://github.com/YCG09/chinese_ocr/ . The author has managed to read a whole page from a document.
Dear @omnipresent ,
I just came across your question today. How is it going? Can you now recognize multiple words?
I am implementing CRNN to do the same, and the accuracy is now up to 80%. I made some modifications to the model architecture in the CRNN repo.
Dear @munziliashali ,
Thank you for sharing the details.
To reach 80% accuracy, were you doing any pre-processing? How many epochs did you have to run?
@harikrishnanrajeev
Yes, I was. I needed to binarize all my text images with this repo before feeding them to the CRNN model.
I reached 80% accuracy around epoch 500.
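For readers who want a feel for what binarization does, here is a rough sketch using Otsu's global threshold. This is an illustrative assumption, not necessarily the method the repo above uses; the `otsu_threshold` and `binarize` helpers are hypothetical, and the image is assumed to be a grayscale list of rows with values from 0 to 255.

```python
def otsu_threshold(pixels):
    """Return the 0-255 threshold that maximizes between-class variance."""
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1
    total = len(pixels)
    sum_all = sum(i * count for i, count in enumerate(histogram))
    best_threshold, best_variance = 0, -1.0
    weight_bg, sum_bg = 0, 0
    for t in range(256):
        weight_bg += histogram[t]
        if weight_bg == 0:
            continue  # no pixels at or below t yet
        weight_fg = total - weight_bg
        if weight_fg == 0:
            break  # every pixel is already in the background class
        sum_bg += t * histogram[t]
        mean_bg = sum_bg / weight_bg
        mean_fg = (sum_all - sum_bg) / weight_fg
        variance = weight_bg * weight_fg * (mean_bg - mean_fg) ** 2
        if variance > best_variance:
            best_variance, best_threshold = variance, t
    return best_threshold


def binarize(image):
    """Map each pixel to 0 or 255 using a single Otsu threshold."""
    flat = [value for row in image for value in row]
    threshold = otsu_threshold(flat)
    return [[255 if value > threshold else 0 for value in row] for row in image]


# Tiny made-up example: dark pixels near 10, bright pixels near 200.
image = [[10, 12, 200], [11, 210, 205]]
print(binarize(image))  # [[0, 0, 255], [0, 255, 255]]
```

A single global threshold tends to struggle with uneven lighting, so adaptive methods may suit scanned or photographed text better.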
@munziliashali , for training the model, were you using your own dataset or a publicly available one? Thanks.
@harikrishnanrajeev , I used my own dataset.
Thanks @munziliashali , a few more questions:
a) Have you used any pre-trained models?
b) How did you generate the data for your custom dataset?