Is there a way to efficiently extract all PILImages from a fastai.data.core.Datasets object?

I was trying to rewrite my code from fastai1 to fastai2 due to GPU incompatibility, and I ran into a problem with extracting images:

fastai1 (extract all images from train_ds):
data.train_ds.x

In fastai2, I have 100,000 images as input, and my list comprehension approach doesn't work (the process was killed every time I tried to run it):
[x[0] for x in data.train_ds]

Is there a better way to extract all the images in fastai2 from train_ds?

Thanks in advance!!!

Hi there,

I believe what you are looking for is dls.train.xs.

But why do you want to get the images like this?

You must already have the raw files somewhere, since you loaded them into data from there.
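As a side note on why the list comprehension gets killed: it is probably a memory problem, since it tries to hold all 100,000 decoded images at once. A minimal sketch of the alternative, assuming the dataset yields (x, y) tuples as in the question. It is plain Python with no fastai calls; `iter_inputs` and `toy_ds` are hypothetical names used only for illustration.

```python
from typing import Any, Iterable, Iterator, Tuple


def iter_inputs(dataset: Iterable[Tuple[Any, Any]]) -> Iterator[Any]:
    """Yield only the input (first element) of each (x, y) item, one at a time."""
    for x, _ in dataset:
        yield x


# Toy stand-in for data.train_ds: strings play the role of images (assumption).
toy_ds = [(f"img_{i}", i % 2) for i in range(5)]

# Each item is consumed and then dropped, so only one "image" is alive at a time,
# instead of a list holding every decoded image.
total_chars = sum(len(x) for x in iter_inputs(toy_ds))
```

With a real dataset, the body of the loop would do whatever per-image work is needed (inspect, modify, save) instead of building a list.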

I wanted to create a custom dataset based on the fastai Datasets class that performs some changes to the images and labels. The data object I created is similar to the one

import os
from fastai.vision.all import *

# `path` and `df` are defined as in the vision tutorial
data = DataBlock(blocks=(ImageBlock, MultiCategoryBlock),
                 splitter=ColSplitter('is_valid'),
                 get_x=ColReader('fname', pref=str(path/'train') + os.path.sep),
                 get_y=ColReader('labels', label_delim=' '),
                 item_tfms=Resize(460),
                 batch_tfms=aug_transforms(size=224)).dataloaders(df)

from the vision tutorial: Computer vision | fastai.

I think dls.train.xs only works for tabular data with TabularPandas (please correct me if I am wrong), but I used DataBlock, whose result doesn't have '.train.xs' as an attribute (sorry for not mentioning this in the question).

Haven’t been on the forum for a while… Did you find a solution? My bad about the .train.xs being tied to tabular data.

Yes! The reason I wanted to extract all the images was to make custom changes to the dataset. Since the way I structured my code (creating a custom dataset) lets me use __getitem__(self, i), I just extract each image there using img[i][0].
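For anyone landing here later, this is roughly the shape of that approach, as a minimal sketch rather than my exact code. It assumes the underlying dataset is indexable and returns (x, y) tuples; `ModifiedDataset`, `x_fn`, `y_fn` and the toy data are hypothetical names for illustration, and nothing here is a fastai API.

```python
from typing import Any, Callable, Sequence, Tuple


class ModifiedDataset:
    """Wrap an indexable dataset of (x, y) items and change each item on access."""

    def __init__(self, base: Sequence[Tuple[Any, Any]],
                 x_fn: Callable[[Any], Any],
                 y_fn: Callable[[Any], Any]) -> None:
        self.base = base
        self.x_fn = x_fn  # change applied to the image
        self.y_fn = y_fn  # change applied to the label(s)

    def __len__(self) -> int:
        return len(self.base)

    def __getitem__(self, i: int) -> Tuple[Any, Any]:
        item = self.base[i]
        # item[0] is the image, item[1] the label; only this one item is loaded.
        return self.x_fn(item[0]), self.y_fn(item[1])


# Toy stand-in for the real dataset: strings instead of images (assumption).
toy_base = [("img_a", ["cat"]), ("img_b", ["person", "dog"])]
wrapped = ModifiedDataset(toy_base, x_fn=str.upper, y_fn=sorted)
second = wrapped[1]
```

Because the changes happen inside __getitem__, the images are never all in memory at the same time, which avoids the problem the list comprehension had.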

Glad you found a way :muscle: