How to use image pixel arrays in datablocks

Just wanted to share a useful trick.

Let’s say you are working with image data stored in pandas DataFrames or NumPy arrays, i.e., you have the pixel values in arrays rather than in image files. How do you then use them in a DataBlock?

Here is one way to do just that:

train_x.shape, train_y.shape
# gives: ((27455, 28, 28), (27455,))

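from fastai.vision.all import *

def np_image_datablock(images, labels):
    # The items are plain integer indices; get_x / get_y look up the arrays.
    dblock = DataBlock(
            blocks    = (ImageBlock, CategoryBlock),
            get_items = lambda idx: idx,       # the source is already the index list
            get_x     = lambda i: images[i],   # pixel array for item i
            get_y     = lambda i: labels[i]    # label for item i
    )

    # One index per image
    dls = dblock.dataloaders(list(range(images.shape[0])))

    return dls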

dls = np_image_datablock(train_x, train_y)
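
In case your pixels start out flattened (one row of 784 values per image, which is how such data often looks in a DataFrame), here is a minimal sketch of getting them into the (N, 28, 28) shape used above. The helper name `flat_to_images`, the 28x28 size, and the random stand-in data are only assumptions for illustration. The cast to uint8 is there because that is the usual dtype for 8-bit grayscale pixels.

```python
import numpy as np

def flat_to_images(flat, height=28, width=28):
    """Reshape (n, height*width) pixel rows into (n, height, width) uint8 images."""
    flat = np.asarray(flat)
    if flat.ndim != 2 or flat.shape[1] != height * width:
        raise ValueError(f"expected shape (n, {height * width}), got {flat.shape}")
    # Clip to the valid 8-bit range, cast to uint8, then restore the 2D layout
    return np.clip(flat, 0, 255).astype(np.uint8).reshape(-1, height, width)

# Hypothetical stand-in data: 5 flattened 28x28 images
rng = np.random.default_rng(0)
flat_pixels = rng.integers(0, 256, size=(5, 784))

images = flat_to_images(flat_pixels)
print(images.shape, images.dtype)
```

With a DataFrame you would probably pass something like `df.to_numpy()` for the pixel columns, with the label column taken out first.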

Hope it helps!
