Extracting data from a form for new dataset

I’m looking for some tips on creating a new dataset.

We have forms with 5 images on each, and I want to find a tool or a way to capture each of those images and put them in a folder. For example, box 5 will always be a picture of the same thing, so I just want to put a box around it, say “capture the data in this box”, and save it as a JPG. Is there a good tool to do this, or a tutorial where somebody has done something similar?

I’ve tried googling the issue, but most of what I’m finding assumes you have the images already split out.

If all the forms are similar, I would approach it with OpenCV; try googling your problem with ‘opencv’ or ‘cv2’ in the query. The basic approach would be to separate photo from non-photo regions, then find contours to get bounding boxes. You may need further processing. For example, this comes up: https://stackoverflow.com/questions/51498510/how-extract-pictures-from-an-big-image-in-python

Here is an example of the form

So what I’m looking for is a way to extract the hand-drawn images. Do you think that finding the contours would be doable for this? I will research that option. Thanks for the response!

Initially I thought just extracting the data using bounding boxes would work, but unfortunately, the scans are all slightly skewed so I don’t get a great image of just the hand-drawn images.

In case anybody else has similar issues, here is an article describing a project similar to what I wanted to do: http://www.aishack.in/tutorials/sudoku-grabber-opencv-plot/ (The code is in C++, but the concept should still hold true.)

If your forms are all like this, it is very simple. Get the contours; get the bounding box of each contour; compare each bounding box to the matching reference (side ratios, angle, x/y position, etc.); then grab the area inside the box as your image, rotating by the found angle if you wish. Something like this code shows the location of the 12 boxes above. Obviously it’s just a start for you to adapt; cv2 is a great detection tool for targets like this.

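import cv2
import numpy as np

# `im` is assumed to be the scanned form, already loaded as an image
imgray = cv2.cvtColor(np.array(im), cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(imgray, 127, 255, cv2.THRESH_BINARY)  # binarize
# findContours returns 3 values in OpenCV 3 and 2 in OpenCV 4; [-2:] handles both
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2:]
for cnt in contours:  # for each contour
    # convert contour to a rotated bounding box
    rect = cv2.minAreaRect(cnt)
    a, b = rect[1]  # side lengths (width, height) of the rotated box
    if 200 < a < 300 and 200 < b < 300:  # known target sizes, in pixels
        # add other checks here if you like, e.g. aspect ratio, x/y position filtering, angle
        print(rect)  # ((center_x, center_y), (width, height), angle)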