Create DataBunch from PyTorch dataloader


(Weikun Wen) #1

I tried to create a DataBunch from a PyTorch dataloader, but it failed.
I need to build a network for age and gender estimation, so my dataset has to return the image together with the age and gender info.
Fastai version: 1.0.33. Below are my code and a screenshot of the error:


#2

This is unrelated to fastai: you can’t pass PIL Images directly through a PyTorch dataloader.


(Weikun Wen) #3

This is the custom dataset template from PyTorch.
If no transform is applied, it returns the PIL image, but normally transforms.ToTensor() is applied.

from torch.utils.data.dataset import Dataset
from torchvision import transforms  # provides ToTensor(), Compose(), etc.

class MyCustomDataset(Dataset):
    def __init__(self, samples, transforms=None):
        # `samples` is a placeholder for your own data index,
        # e.g. a list of (image, label) pairs.
        self.samples = samples
        self.transforms = transforms

    def __getitem__(self, index):
        # Read one item; replace this with your own file or image loading.
        img, label = self.samples[index]
        # If transforms were given, apply them in the order they were composed.
        if self.transforms is not None:
            img = self.transforms(img)
        return (img, label)

    def __len__(self):
        # Number of items (images) in the dataset.
        return len(self.samples)

#4

Yes, and since you don’t apply it here, you are passing PIL Images directly to a PyTorch dataloader, which again isn’t possible.
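
To make the point above concrete, here is a minimal sketch of what happens when a dataset hands PIL images to a PyTorch DataLoader, and how converting to a tensor first avoids the failure. `TinyImageDataset` and `first_batch` are hypothetical names invented for this illustration, the images are blank 8x8 placeholders, and the manual NumPy conversion only approximates what `transforms.ToTensor()` does.

```python
import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


class TinyImageDataset(Dataset):
    """Hypothetical dataset: blank 8x8 RGB images with an integer label."""

    def __init__(self, to_tensor=False, n=4):
        self.to_tensor = to_tensor
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, index):
        img = Image.new("RGB", (8, 8))
        if self.to_tensor:
            # HWC uint8 -> CHW float in [0, 1], roughly what ToTensor() does.
            img = torch.from_numpy(np.array(img)).permute(2, 0, 1).float() / 255.0
        return img, index


def first_batch(dataset):
    # The default collate function must stack the items into one batch.
    return next(iter(DataLoader(dataset, batch_size=2)))


# Without a tensor conversion, collating PIL images raises a TypeError.
try:
    first_batch(TinyImageDataset(to_tensor=False))
    pil_error = None
except TypeError as err:
    pil_error = err

# With the conversion, the batch is an ordinary (images, labels) pair.
x, y = first_batch(TinyImageDataset(to_tensor=True))
print(type(pil_error).__name__, tuple(x.shape), y.tolist())
```

The failure comes from the DataLoader's default collate step, so it shows up on the first iteration rather than when the DataLoader is constructed.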


(Weikun Wen) #5

I changed my code:

  1. Use a transform when creating the dataloader:
transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.ToTensor(),
])
dataset = ImdbWikiDataset(transform=transform)
dataloader = DataLoader(dataset, batch_size=8, shuffle=True, num_workers=1)

The dataloader works fine: [screenshot]

  2. Create the DataBunch from the PyTorch dataloader:
tfms_train, tfms_val = get_transforms()
test_db = DataBunch(dataloader, dataloader, tfms=tfms_train)
test_db.one_batch()

The error message says: AttributeError: 'list' object has no attribute 'pixel'

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-7-15cf372150e2> in <module>()
----> 1 test_db.one_batch()


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/basic_data.py in one_batch(self, ds_type, detach, denorm)
    132         w = self.num_workers
    133         self.num_workers = 0
--> 134         try:     x,y = next(iter(dl))
    135         finally: self.num_workers = w
    136         if detach: x,y = to_detach(x),to_detach(y)


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/basic_data.py in __iter__(self)
     68         for b in self.dl:
     69             y = b[1][0] if is_listy(b[1]) else b[1]
---> 70             if not self.skip_size1 or y.size(0) != 1: yield self.proc_batch(b)
     71 
     72     @classmethod


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/basic_data.py in proc_batch(self, b)
     60         "Proces batch `b` of `TensorImage`."
     61         b = to_device(b, self.device)
---> 62         for f in listify(self.tfms): b = f(b)
     63         return b
     64 


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/vision/image.py in __call__(self, x, *args, **kwargs)
    495     def __call__(self, x:Image, *args, **kwargs)->Image:
    496         "Randomly execute our tfm on `x`."
--> 497         return self.tfm(x, *args, **{**self.resolved, **kwargs}) if self.do_run else x
    498 
    499 def _resolve_tfms(tfms:TfmList):


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/vision/image.py in __call__(self, p, is_random, *args, **kwargs)
    442     def __call__(self, *args:Any, p:float=1., is_random:bool=True, **kwargs:Any)->Image:
    443         "Calc now if `args` passed; else create a transform called prob `p` if `random`."
--> 444         if args: return self.calc(*args, **kwargs)
    445         else: return RandTransform(self, kwargs=kwargs, is_random=is_random, p=p)
    446 


~/anaconda3/envs/pytorch_v1/lib/python3.6/site-packages/fastai/vision/image.py in calc(self, x, *args, **kwargs)
    447     def calc(self, x:Image, *args:Any, **kwargs:Any)->Image:
    448         "Apply to image `x`, wrapping it if necessary."
--> 449         if self._wrap: return getattr(x, self._wrap)(self.func, *args, **kwargs)
    450         else:          return self.func(x, *args, **kwargs)
    451 


AttributeError: 'list' object has no attribute 'pixel'
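
Reading the traceback from the bottom up: `proc_batch` applies every entry of `tfms` to the whole batch `b`, while the transform's `calc` calls `getattr(x, self._wrap)`, which appears to expect a single fastai image with a `pixel` method. A plain-Python sketch of that dispatch reproduces the same message. `FakeImage` and `FakeTransform` are made-up stand-ins for illustration, not fastai classes.

```python
class FakeImage:
    """Stand-in for a single image object that exposes a `pixel` method."""

    def __init__(self, data):
        self.data = data

    def pixel(self, func, *args, **kwargs):
        # Apply `func` to the underlying data and return the image itself.
        self.data = func(self.data, *args, **kwargs)
        return self


class FakeTransform:
    """Mimics the dispatch visible in the traceback's `calc` frame."""

    def __init__(self, func, wrap="pixel"):
        self.func = func
        self._wrap = wrap

    def calc(self, x, *args, **kwargs):
        if self._wrap:
            return getattr(x, self._wrap)(self.func, *args, **kwargs)
        return self.func(x, *args, **kwargs)


double = FakeTransform(lambda values: [v * 2 for v in values])

# Applied to one image-like object, the transform works.
ok = double.calc(FakeImage([1, 2, 3])).data

# Applied to a batch, which is a plain list [x, y], the lookup fails.
batch = [[1, 2, 3], [0, 1]]
try:
    double.calc(batch)
    batch_error = None
except AttributeError as err:
    batch_error = str(err)

print(ok)           # [2, 4, 6]
print(batch_error)  # 'list' object has no attribute 'pixel'
```

If this reading is right, the per-image transforms returned by get_transforms() are not suited to the `tfms` argument of DataBunch, which is applied to whole batches.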

#6

Yes, since you’re not using fastai datasets, you can’t expect the fastai functions to work properly, as they rely on different behaviors.
You can put your DataBunch in a Learner object with your custom model and use fastai to train it, but all the helper functions around your data need fastai datasets.
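
For the age and gender case from the original question, one possible shape for such a custom dataset is sketched below using only PyTorch. The traceback earlier in the thread shows the batch being unpacked as `x, y` and `y` being allowed to be a list, so the sketch returns one input and a list of two targets. `AgeGenderDataset` and its random data are hypothetical, and this is not a documented fastai recipe.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class AgeGenderDataset(Dataset):
    """Hypothetical dataset: random 'images' with made-up age and gender labels."""

    def __init__(self, n=6, size=8):
        self.images = torch.rand(n, 3, size, size)  # stand-in for decoded images
        self.ages = torch.linspace(20.0, 45.0, n)   # regression target (float)
        self.genders = torch.arange(n) % 2          # class target (0 or 1)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # One input tensor and a list holding the two targets.
        return self.images[index], [self.ages[index], self.genders[index]]


loader = DataLoader(AgeGenderDataset(), batch_size=2, shuffle=False)
x, y = next(iter(loader))
age, gender = y  # the default collate keeps the targets as a list of two tensors

print(tuple(x.shape), tuple(age.shape), gender.tolist())
```

A custom model trained on such batches would need a loss function that accepts both targets, for example a regression loss on `age` plus a classification loss on `gender`.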


(Weikun Wen) #7

Thanks sgugger, I will switch to the fastai datasets.

The fastai docs say we can use a torch.utils.data.DataLoader or torch.utils.data.Dataset when constructing the DataBunch, but I can’t find anywhere that shows how…