Making a Dataset from a single, large image file

Hello everyone!

I trained a classifier to detect power grid elements in aerial imagery and want to use it for object localization on larger images.
The classifier is trained on 150x150 pixel images (down-scaled to 45x45) and the large images are of size 5000x5000 pixels. I want to go over the whole image with the classifier and use it to find power grid elements.
Since it is inefficient to run this task on the CPU, I want to create a Dataset/DataBlock/?? for lots of large images and use classifier.predict() or classifier.get_preds() to evaluate the image parts on the GPU.
I am struggling to choose the correct fastai data structure and procedure to split the large image into a set of 150x150 pixel images without saving them to disk (which I want to avoid, since it slows down the process and should not be necessary).

Currently my idea is to do it more or less by hand:
predictions = []
img_size = (5000, 5000, 3)
detector_size = 150
step_size = 100

for x in range(0, img_size[0], step_size):
    for y in range(0, img_size[1], step_size):
        img = full_img[x:x + detector_size, y:y + detector_size]
        # plt.imshow(img)
        # plt.show()
        prob = element_detector.predict(img)
        predictions.append((x, y, prob))
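One way to avoid calling predict() once per tile is to collect all tiles into a single array first and then hand them to the classifier in batches (e.g. via a test DataLoader and get_preds). A minimal NumPy sketch of the tiling step — the function name extract_tiles and the random test image are my own illustration, not fastai API:

```python
import numpy as np

def extract_tiles(full_img, tile_size=150, step_size=100):
    """Slide a square window over the image and collect tiles.

    Only positions where a full tile fits are used, so every
    returned tile has shape (tile_size, tile_size, channels).
    """
    tiles, positions = [], []
    h, w = full_img.shape[:2]
    for x in range(0, h - tile_size + 1, step_size):
        for y in range(0, w - tile_size + 1, step_size):
            tiles.append(full_img[x:x + tile_size, y:y + tile_size])
            positions.append((x, y))
    return np.stack(tiles), positions

# Example on a small random image instead of the full 5000x5000 one
img = np.random.randint(0, 255, (500, 500, 3), dtype=np.uint8)
tiles, positions = extract_tiles(img, tile_size=150, step_size=100)
# window starts 0, 100, 200, 300 fit along each axis -> 16 tiles
```

The (x, y) positions are kept alongside the tiles so each prediction can later be mapped back to its location in the large image.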

The item transforms for training the network were:

ITEM_TRANSFORMS = [
    Resize(CROP_SIZE, method='crop'),
    Resize(RESCALED_SIZE, method='squish')
]

with CROP_SIZE = 150 and RESCALED_SIZE = 45.

I am looking forward to seeing how this can be done with fastai. Thank you very much for reading this, any help is very welcome! 🙂
And yes, I will go over lots of large images later, but they are not known in advance.

Amadeus


I solved it like this.

Helper functions for loading the image data by splitting the large image into (overlapping) tiles:

import math

import numpy as np
from PIL import Image
from fastai.vision.all import *

STEP_SIZE = 50
TILE_SIZE = 200


def get_tiles_from_large_image(image_path):
    image = tensor(Image.open(image_path))
    number_of_tiles = [math.floor((image.shape[0] - TILE_SIZE) / STEP_SIZE),
                       math.floor((image.shape[1] - TILE_SIZE) / STEP_SIZE)]
    return [(([image, i, j]),)
            for i in range(0, number_of_tiles[0])
            for j in range(0, number_of_tiles[1])]


def get_tile(image_and_location):
    image, i, j = image_and_location
    selected_slice = get_slice(i, j)
    return image[selected_slice]


def get_slice(i, j):
    pos = np.array([i, j]) * STEP_SIZE
    return (slice(pos[0], pos[0] + TILE_SIZE),
            slice(pos[1], pos[1] + TILE_SIZE))

datablock = DataBlock(blocks=ImageBlock,
                      get_items=get_tiles_from_large_image,
                      get_x=get_tile)
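As a quick sanity check of the slicing math above (reproduced standalone here, with a NumPy array standing in for the fastai tensor):

```python
import numpy as np

STEP_SIZE = 50
TILE_SIZE = 200

def get_slice(i, j):
    # Tile (i, j) starts at (i, j) * STEP_SIZE and spans TILE_SIZE pixels
    pos = np.array([i, j]) * STEP_SIZE
    return (slice(pos[0], pos[0] + TILE_SIZE),
            slice(pos[1], pos[1] + TILE_SIZE))

img = np.zeros((1000, 1000, 3), dtype=np.uint8)
tile = img[get_slice(3, 4)]
# covers rows 150..349 and columns 200..399 -> always a 200x200 window
```

With the DataBlock in place, batched GPU inference can then be driven with something along the lines of `dls = datablock.dataloaders(image_path)` followed by `learner.get_preds(dl=dls.train)` — the exact call depends on how your learner and source path are set up.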