I am working on extracting text from digital documents (PDFs) using OCR. The major issue I am facing is image resolution: the PDFs are mainly invoices coming from different customers and suppliers, each of whom generates them in their own way.
When the PDFs are converted to images, the resolution of the resulting images varies. I would like to use a deep learning based approach to fix the image resolution. As of now we resize all images from the PDF conversion to 300 dpi, but I still feel this is not the best approach and was wondering whether this can be learned. So I started browsing ideas around super-resolution, but could not find the best approach for this.
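To make the current baseline concrete, here is a minimal sketch of the arithmetic behind normalizing to 300 dpi. It is not our actual pipeline: the helper names `target_pixels` and `rescale_size` are made up for illustration, and it assumes the page size is known in PDF points (1 point = 1/72 inch) or that the source dpi of an already rendered image is known.

```python
# Sketch of the size arithmetic for normalizing page images to a fixed dpi.
# Helper names are hypothetical; only the Python standard library is used.

POINTS_PER_INCH = 72  # PDF user-space units: 1 point = 1/72 inch


def target_pixels(width_pt, height_pt, dpi=300):
    """Pixel dimensions of a page of the given size (in points) rendered at `dpi`."""
    width_px = round(width_pt * dpi / POINTS_PER_INCH)
    height_px = round(height_pt * dpi / POINTS_PER_INCH)
    return width_px, height_px


def rescale_size(width_px, height_px, source_dpi, target_dpi=300):
    """New pixel dimensions when resampling an image from source_dpi to target_dpi."""
    scale = target_dpi / source_dpi
    return round(width_px * scale), round(height_px * scale)


# US Letter page (612 x 792 points) rendered at 300 dpi
letter_at_300 = target_pixels(612, 792)

# An image that was rendered at 150 dpi, upscaled to the 300 dpi target
upscaled = rescale_size(1275, 1650, source_dpi=150)
```

Resampling like this only changes the pixel count. It does not recover detail that was lost when the invoice was originally generated or scanned, which is why I am asking about learned alternatives.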
If possible, please point me to some papers or an approach that I should take to solve this problem.
Many thanks in advance.