OCR with fastai?

Hi all,

Does anyone have any tips or starting points for performing OCR with fastai v2?

My use case is de-identifying DICOM medical images that have patient data burned into the pixels.

Cloud-based OCR options from AWS, Google, and Azure seem to perform very well; however, they require sending patient data to the cloud, which is not HIPAA compliant.

Therefore I’ve been tasked with creating a homemade OCR solution that can run on a machine we own, and I thought a deep learning model made with fastai might be a good candidate to solve this.

If anyone has pointers let me know :slight_smile:

Edit:
From looking at a couple articles,

Fastai Arabic Character Recognition
Devanagari Handwritten Classifier

It looks like I’d be able to make a classifier for identifying individual characters fairly easily. It could possibly be fine-tuned for my dataset (it’s all computer-rendered text, no handwriting, in the particular set of fonts used in DICOM).
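A minimal sketch of what such a character classifier might look like in fastai v2. Everything here is an assumption for illustration: the function name `build_char_learner`, a folder layout with one sub-folder of cropped character images per class, the 64-pixel resize, and the choice of `resnet18`. It also assumes a fastai version recent enough to provide `vision_learner` (older releases call it `cnn_learner`).

```python
from pathlib import Path


def build_char_learner(data_dir: Path, epochs: int = 3):
    """Fine-tune a pretrained CNN on cropped character images.

    Assumes (hypothetically) that data_dir has one sub-folder per character
    class, e.g. data_dir/A/*.png, data_dir/B/*.png, and so on.
    """
    # fastai is imported lazily so this file can be loaded without it installed.
    from fastai.vision.all import (
        ImageDataLoaders, Resize, accuracy, resnet18, vision_learner,
    )

    # Labels come from the parent folder name; 20% of images are held out.
    dls = ImageDataLoaders.from_folder(
        data_dir, valid_pct=0.2, seed=42, item_tfms=Resize(64),
    )
    learn = vision_learner(dls, resnet18, metrics=accuracy)
    learn.fine_tune(epochs)
    return learn
```

Calling `build_char_learner(Path("chars"))` would train on a hypothetical `chars/` folder. Since the text is rendered in a known set of fonts, the training crops could plausibly be generated synthetically rather than labelled by hand.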

I would think the other crucial problem to solve is how to extract text regions and split them into individual (as yet unidentified) characters, which would then be passed to the classifier for identification.

Any nice way to use fastai to extract the text regions/individual characters?
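One classical, non-fastai way to split an already-cropped text line into characters is a vertical projection profile: count the ink pixels in each column and cut wherever a column is empty. The sketch below assumes the line has already been binarized (1 = ink, 0 = background) and that the characters do not touch; the function name `split_columns` and the toy data are made up for illustration.

```python
from typing import List, Tuple


def split_columns(binary: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start, end) column spans, end exclusive, that contain ink."""
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection: number of ink pixels in each column.
    profile = [sum(row[x] for row in binary) for x in range(width)]

    spans = []
    start = None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                 # a run of ink columns begins
        elif count == 0 and start is not None:
            spans.append((start, x))  # an empty column ends the run
            start = None
    if start is not None:
        spans.append((start, width))  # run reaches the right edge
    return spans


# Toy 3x8 "text line" with three separate glyphs.
line = [
    [1, 1, 0, 0, 1, 0, 1, 1],
    [1, 0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 1, 1],
]
spans = split_columns(line)
print(spans)  # [(0, 2), (4, 5), (6, 8)]
```

Each span could then be cropped out and passed to the character classifier. This is likely to break down on the irregular backgrounds found in scans, where binarization itself is the hard part.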

Have you tried using a standard OCR library? I guess if the text is not handwritten, you should get good results with e.g. Tesseract.

I built and use the following Docker image for text extraction / OCR.

Hi Florian,

I have tried it. It works pretty well if the data is consistent (all text the same size and font) but poorly in other circumstances. For instance, text is sometimes overlaid on irregular backgrounds (the contents of a CT scan or X-ray), and it doesn’t get good results there. Strangely enough, the cloud services handle that well.

I noticed the other thing you linked, Textract, I will check that out.

Still, I want to give fastai a go for the learning experience. I found this similar thread interesting; it looks like I would need to use segmentation to extract the characters. Does anyone know how to do this? Does it require some ground-truth data, like in the CamVid example in the lessons?

OCR with fastai sounds like an interesting project. I don’t have much experience with it myself, but have you looked into Smart Engines? They offer OCR software that can be deployed on-premise and is HIPAA compliant. It might be worth checking out as an alternative to building a homemade solution from scratch. However, if you do decide to proceed with fastai, it seems like you’re on the right track with creating a character classifier. As for extracting text regions and splitting them into characters, you might want to look into using techniques like contour detection and bounding boxes to identify regions of text. Best of luck with your project!
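To make the bounding-box idea concrete, here is a rough pure-Python sketch. It uses connected-component labelling (a flood fill over 4-connected ink pixels) as a simple stand-in for contour detection; in practice a library routine such as OpenCV's `cv2.findContours` followed by `cv2.boundingRect` would be the more usual route. The function name `component_boxes` and the toy image are made up, and the input is assumed to be already binarized (1 = ink, 0 = background).

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), all inclusive


def component_boxes(binary: List[List[int]]) -> List[Box]:
    """Return one bounding box per 4-connected blob of ink pixels."""
    height = len(binary)
    width = len(binary[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []

    for y in range(height):
        for x in range(width):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


# Toy image with two separate blobs.
image = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
boxes = component_boxes(image)
print(boxes)  # [(1, 1, 2, 2), (2, 5, 3, 5)]
```

Boxes that sit close together on the same baseline could then be merged into words or lines, and each crop sent to whatever recognizer is used downstream.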

I was looking for a good OCR software. I think I will try it. Thanks for your help.

I haven’t worked with fastai for OCR specifically, but I’ve tackled a similar challenge of text extraction from images for identity verification purposes. From my experience, using advanced OCR technology like what’s offered by ID Analyzer can be incredibly effective. Their ID Verification API uses cutting-edge computer vision and AI to scan and accurately extract data from various identity documents. This technology impressively handles a wide range of document conditions and languages, showing exceptional accuracy rates, like 99.8% for English and even 98.5% for complex scripts like Chinese. For your DICOM images, the key might be developing a custom solution that can detect and extract text regions before feeding them into an OCR system. Considering the sensitivity of medical images, a solution that ensures data privacy and compliance, similar to the precision and reliability offered by Identity Verification technologies, could be crucial.