Accessing Images and Labels from the 'data' Object

data = ImageList.from_folder().random_split_by_pct().label_from_folder().transform().databunch()

After running the code above to gather images and labels from their respective sources and store them in the 'data' object, everything (training, validation and test data) is kept in that one variable. My question is: is there a way to extract, say, the training images and their labels separately from the 'data' object?

The reason I am asking is that training images and labels generally come in various formats, and the fastai data block API is an easy one-stop solution for gathering them from various sources. So the plan is to use the data block API to systematically assemble (train data - labels), (val data - labels) and (test data - labels), and then experiment with algorithms in plain PyTorch, which requires access to each of these pairs separately.

I may be wrong on the attribute names, but the general concept is:

  1. Use data.train_dl to access the training dataloader.
  2. Use data.valid_dl to access the validation dataloader.
  3. Then use the dataset attribute of a dataloader (e.g. data.train_dl.dataset) to access the underlying dataset.
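
A minimal sketch of the three steps above. The helper name split_loaders_and_datasets and the fake_data stand-in are made up for illustration so the snippet runs without fastai installed; with a real fastai v1 DataBunch you would pass your own 'data' object instead, assuming it exposes train_dl and valid_dl as described.

```python
from types import SimpleNamespace


def split_loaders_and_datasets(data):
    """Collect the train/valid dataloaders and their datasets from a DataBunch-like object."""
    # Steps 1 and 2: grab the dataloaders.
    loaders = {"train": data.train_dl, "valid": data.valid_dl}
    # Step 3: each dataloader exposes its underlying dataset via `.dataset`.
    datasets = {name: dl.dataset for name, dl in loaders.items()}
    return loaders, datasets


# Hypothetical stand-in for the fastai `data` object (assumption, not fastai itself).
fake_data = SimpleNamespace(
    train_dl=SimpleNamespace(dataset=["train_item_0", "train_item_1"]),
    valid_dl=SimpleNamespace(dataset=["valid_item_0"]),
)

loaders, datasets = split_loaders_and_datasets(fake_data)
print(len(datasets["train"]), len(datasets["valid"]))
```

Note that the datasets hold fastai items (e.g. Image objects), not batched tensors; the dataloaders are what produce tensors.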

Thank you for your response.

After extracting the images and targets from the data (i.e. by using data.train_dl.x and data.train_dl.y), I get an error when I send this data through my CNN. The error states:

TypeError: conv2d(): argument 'input' (position 1) must be Tensor, not Image

So I suppose I have to convert the image to a Tensor? I thought get_transforms() handled the conversion into a tensor, but I checked the type of the images and it is 'fastai.vision.data.ImageList', not a tensor. How do I convert an ImageList into a tensor?
I have attached screenshots.

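data.train_dl is a dataloader, not a collection of tensors. Use next(iter(data.train_dl)) to fetch one batch from it, or loop over it to go through all batches; the batches it yields are what you feed to the model.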
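
A rough sketch of the next(iter(dl)) pattern. The helper first_batch and the fake_dl list are hypothetical stand-ins so the snippet runs on its own; with fastai you would call it on data.train_dl, where each batch is expected to be an (inputs, targets) pair of tensors.

```python
def first_batch(dl):
    """Fetch the first (inputs, targets) batch from any iterable dataloader."""
    xb, yb = next(iter(dl))
    return xb, yb


# Hypothetical stand-in dataloader: a list of (inputs, targets) batches.
fake_dl = [([0.1, 0.2], [0, 1]), ([0.3, 0.4], [1, 0])]

xb, yb = first_batch(fake_dl)
print(xb, yb)

# With the fastai `data` object from the question (assumed usage, not run here):
#   xb, yb = first_batch(data.train_dl)
#   out = model(xb)   # `model` is your own CNN
```

Calling next(iter(dl)) builds a fresh iterator each time, so it is handy for inspecting a single batch; for training, a plain "for xb, yb in dl:" loop goes through every batch once.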
