Text extraction from images

addamit · May 20, 2019, 1:24pm

Hi folks -
I am currently going through the 2019 Part 1 course. Many thanks to @jeremy and team for putting out such a high-quality course for everyone to learn from!

I completed lesson 3 on image segmentation with the CamVid dataset. While going through this lesson, I started thinking of a simple project using my own custom dataset: basically, I want to extract text from images.
I checked out the COCO dataset, but I am not sure whether that is what I should start with or whether I need to handcraft a dataset of my own.

I was thinking maybe I need example bounding boxes on images like so:

Also, has anyone tried extracting text from images? If so, what type of dataset did you use?

Any suggestions/pointers on how to go about doing this will be very helpful.
Many thanks
Amit

KarlH · May 20, 2019, 5:48pm

I know CTC (connectionist temporal classification) methods are used for text recognition.
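
For anyone new to CTC, here is a minimal sketch of the decoding side only (not training, and not taken from any particular library). It assumes the recognition network outputs one score vector per horizontal frame of the text crop and that class index 0 is the CTC "blank". The function name `ctc_greedy_decode` and the toy alphabet are made up for illustration.

```python
from itertools import groupby

BLANK = 0  # assumption: class index 0 is reserved for the CTC blank symbol


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Greedy (best-path) CTC decoding.

    frame_scores: list of per-frame score lists, one score per class.
    alphabet: string where alphabet[i] is the character for class i
              (the character at the blank index is never emitted).
    """
    # 1. Pick the highest-scoring class for every frame.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # 2. Collapse runs of the same class into a single symbol.
    collapsed = [cls for cls, _ in groupby(best_path)]
    # 3. Drop the blanks; what remains is the predicted text.
    return "".join(alphabet[cls] for cls in collapsed if cls != blank)


# Toy example: one-hot "scores" that follow the path a a - a b b -
alphabet = "-ab"
path = [1, 1, 0, 1, 2, 2, 0]
frames = [[1.0 if i == k else 0.0 for i in range(len(alphabet))] for k in path]
print(ctc_greedy_decode(frames, alphabet))  # prints "aab"
```

The blank between the two runs of `a` is what lets CTC output a doubled letter: without it, the repeats would collapse into a single `a`.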

blissweb · May 22, 2019, 10:52am

I think that approach would get you the ability to locate text sub images and their position which you could then pass through to a more dedicated text network. Very interesting problem to work on.

It would also be interesting to see if it could learn without the bounding boxes at all. Just start with one word placed at a random position on the image and use that word as the label. Then move on to multiline text. That assumes you don’t care about the location of the text.

addamit · May 23, 2019, 2:09am

Thanks for the linked article. This is a bit beyond my current experience with neural nets. Hopefully when I complete part 1 I might be able to take a stab at it.

addamit · May 23, 2019, 2:13am

Thanks @blissweb
I was thinking along the same lines: extract the bounding box of the text in the image, crop that bounding box out as a separate image, and feed the cropped sub-image to a text recognition module such as pytesseract to extract the text.
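
A rough sketch of that crop-then-recognize step, assuming the boxes are already available as `(left, top, right, bottom)` pixel tuples and the image is a Pillow-style object with `.size` and `.crop()`. The helper names `clamp_box` and `read_text_regions` and the `pad` parameter are made up for illustration; `pytesseract.image_to_string` is only imported if no other recognizer is passed in.

```python
def clamp_box(box, width, height, pad=0):
    """Grow a (left, top, right, bottom) box by `pad` pixels, clipped to the image."""
    left, top, right, bottom = box
    return (
        max(0, left - pad),
        max(0, top - pad),
        min(width, right + pad),
        min(height, bottom + pad),
    )


def read_text_regions(image, boxes, ocr=None, pad=2):
    """Crop each box out of `image` and run a recognizer on the crop.

    image: Pillow-style image (has .size == (width, height) and .crop(box)).
    boxes: iterable of (left, top, right, bottom) tuples from a detector.
    ocr:   callable taking a cropped image and returning a string.
           Defaults to pytesseract.image_to_string (imported lazily).
    """
    if ocr is None:
        import pytesseract  # needs the Tesseract binary installed as well

        ocr = pytesseract.image_to_string

    width, height = image.size
    results = []
    for box in boxes:
        region = clamp_box(box, width, height, pad)
        crop = image.crop(region)  # Pillow expects (left, upper, right, lower)
        results.append((region, ocr(crop).strip()))
    return results
```

With Pillow and pytesseract installed, usage would look something like `read_text_regions(Image.open("photo.jpg"), boxes)`, where `photo.jpg` is a placeholder file name. A few pixels of padding around each box often helps the recognizer, but the right amount is something to tune on your own data.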

phucnsp · May 25, 2019, 2:28pm

This is an OCR topic. Search for the keywords "detect text in the wild".

mr.ashutoshraj · January 22, 2020, 9:43am

You can use a dynamic U-Net to get an activation map of the text areas in the image, and you can train it on the ICDAR dataset. Let me know if you need any help.
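
Once a segmentation model gives you such an activation map, you still need boxes to crop. Below is a minimal pure-Python sketch of one way to do that, assuming the map is a 2D list of per-pixel text probabilities: threshold it and take the bounding box of each 4-connected blob. The function name `mask_to_boxes` and the `threshold` and `min_area` defaults are illustrative guesses, not values from the notebook in this thread.

```python
from collections import deque


def mask_to_boxes(prob_map, threshold=0.5, min_area=1):
    """Return (left, top, right, bottom) boxes for connected text regions.

    prob_map: 2D list (rows of floats), the per-pixel text probability.
    Boxes use exclusive right/bottom edges, like Pillow's crop().
    """
    height = len(prob_map)
    width = len(prob_map[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []

    for y in range(height):
        for x in range(width):
            if seen[y][x] or prob_map[y][x] < threshold:
                continue
            # Breadth-first flood fill over this blob, tracking its extent.
            queue = deque([(x, y)])
            seen[y][x] = True
            left = right = x
            top = bottom = y
            area = 0
            while queue:
                cx, cy = queue.popleft()
                area += 1
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (
                        0 <= nx < width
                        and 0 <= ny < height
                        and not seen[ny][nx]
                        and prob_map[ny][nx] >= threshold
                    ):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            if area >= min_area:  # skip tiny specks of noise
                boxes.append((left, top, right + 1, bottom + 1))
    return boxes
```

On real images you would probably do this with an array library for speed; the plain-list version is only meant to show the idea.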

addamit · January 24, 2020, 3:03am

Hi Ashutosh -
I am interested in taking a stab at it. Do you have any notebook/script with the u-net and ICDAR data that I can look at?

Thanks

mr.ashutoshraj · January 24, 2020, 5:47am

https://github.com/ashutoshraj/TextSegmentation/blob/master/detection.ipynb

There are still some errors in the mask; otherwise, the implementation will be the same. I will update you with the latest results.

bibsian · March 25, 2020, 8:50pm

@addamit Did you ever get any further with this? I’m curious about trying something similar in my project and wondering if you ever came across this: https://github.com/songdejia/EAST/blob/master/model.py. The paper is here (https://arxiv.org/pdf/1704.03155.pdf).

I’ve never built my own model but this seems like it could be an interesting thing to experiment with?
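
One piece that is easy to experiment with on its own is the post-processing. As far as I understand the EAST paper, the detector produces many overlapping candidate boxes that are then merged with a locality-aware variant of non-maximum suppression (NMS). The sketch below is plain, standard NMS on axis-aligned boxes, not the paper's locality-aware version and not code from the linked repo; the names `iou` and `nms` and the 0.5 threshold are illustrative.

```python
def iou(a, b):
    """Intersection over union of two (left, top, right, bottom) boxes."""
    inter_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # prints [0, 2]
```

Real scene-text detectors usually work with rotated boxes or quadrangles, so the overlap computation there is more involved than this axis-aligned simplification.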

kyle.h · April 14, 2020, 5:10pm

There’s a decent number of threads on this topic, most pointing to the FastAI 2018 Part 2 course on object detection. I’m also working through the examples in 2019 Part 1 - Course V3 and would love any FastAI v1 or PyTorch examples people may have.

MuhammadAli · February 17, 2021, 5:37pm

Thanks for sharing your notebook. I am unable to find the dataset. Can you help me out with this?

Thanks

mr.ashutoshraj · February 21, 2021, 6:56am

MuhammadAli · February 21, 2021, 6:10pm

Thanks, Ashut, but I am talking about a text segmentation dataset, not one for cars…

MuhammadAli · February 22, 2021, 5:30am

I am trying to implement SRGAN in fastai for text, but I am getting a completely black image as output.

I do not know what is going wrong. I am able to start training, but the output is always a black image.

claraashford21 · January 23, 2025, 3:48pm

If you are looking for a tool to extract text from images, I recommend checking out ImgOCR.com.

Our Tool offers:

Text Extraction from Images: Easily extract text from any image format.
Image Translator: Translate the extracted text into different languages.
PNG/JPG to Word Conversion: Convert image files like PNG or JPG into editable Word documents.
PDF to Word Conversion: Turn your PDFs into Word files while maintaining formatting.

It’s quick, accurate, and easy to use. Perfect for handling all your text extraction and conversion needs.