Unable to create custom data generator for training UNet

sarvagya1991 · March 11, 2020, 7:46am

Hi,

I am trying to create my own data generator to train a UNet for image segmentation. The code is as follows:

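# Imports were not shown in the original post; these are assumed.
from fastai.vision import *  # DataBunch, unet_learner, models
import cv2
import numpy as np
import pandas as pd
import torch
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms

df = pd.read_csv('/path/to/csv/data.csv')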

X = list(df['input_img'])
y = list(df['mask_img'])

X_train, X_valid, y_train, y_valid = train_test_split(
     X, y, test_size=0.33, random_state=42)

class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, img):
        img = img.transpose((2, 0, 1))
        # return {'image': torch.from_numpy(img),
                # }
        return torch.from_numpy(img)

class NumbersDataset(Dataset):
    def __init__(self, inputs, labels, transform=None):
        classes = [0,1]
        self.X = inputs
        self.y = labels
        self.transform = transform
        self.c = 2

    def __len__(self):
        return len(self.X)

    def __getitem__(self, idx):
        img_train = cv2.imread(self.X[idx])
        img_mask = cv2.imread(self.y[idx])
        img_train = cv2.resize(img_train, (427,240), interpolation = cv2.INTER_LANCZOS4)
        img_mask = cv2.resize(img_mask, (427,240), interpolation = cv2.INTER_LANCZOS4)
        img_mask = cv2.cvtColor(img_mask, cv2.COLOR_BGR2GRAY)
        bin_mask = np.zeros_like(img_mask)
        bin_mask[(img_mask)>0]=1
        bin_mask = bin_mask.reshape(240, 427, 1)
        if self.transform:
            img_train = self.transform(img_train)
            bin_mask = self.transform(bin_mask)

        return img_train, bin_mask

if __name__ == '__main__':
    dataset_train = NumbersDataset(X_train, y_train, transforms.Compose([ToTensor()]))
    # dataset_train = NumbersDataset(X_train, y_train)
    dataloader_train = DataLoader(dataset_train, batch_size=4, shuffle=True)

    # dataset_valid = NumbersDataset(X_valid, y_valid)
    dataset_valid = NumbersDataset(X_valid, y_valid, transforms.Compose([ToTensor()]))
    dataloader_valid = DataLoader(dataset_valid, batch_size=4, shuffle=True)

    datas = DataBunch.create(train_ds = dataloader_train, valid_ds = dataloader_valid)
    # datas.show_batch()
    datas.c = 1
    learner = unet_learner(datas, models.resnet34)

The CSV file contains the location of the required images in the first column and the respective masks in the second one. I get the following error:

TypeError: new() argument after * must be an iterable, not builtin_function_or_method

What should I do?

sgugger · March 11, 2020, 1:20pm

As indicated here, please copy and paste the whole stack trace so that we can help.

sarvagya1991 · March 11, 2020, 2:43pm

Sorry, this is my first post, so I didn’t know.

Here’s the full stack trace:

Traceback (most recent call last):
  File "dataset_test.py", line 101, in <module>
    learner = unet_learner(datas, models.resnet34)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/learner.py", line 121, in unet_learner
    bottle=bottle), data.device)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/core.py", line 66, in _init
    old_init(self, *args,**kwargs)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/models/unet.py", line 43, in __init__
    sfs_szs = model_sizes(encoder, size=imsize)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 113, in model_sizes
    x = dummy_eval(m, size)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 108, in dummy_eval
    return m.eval()(dummy_batch(m, size))
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 104, in dummy_batch
    return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
TypeError: new() argument after * must be an iterable, not builtin_function_or_method

And when I just try to show a batch of data using datas.show_batch(), I get the following error:

Traceback (most recent call last):
  File "dataset_test.py", line 99, in <module>
    datas.show_batch()
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/basic_data.py", line 186, in show_batch
    x,y = self.one_batch(ds_type, True, True)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/basic_data.py", line 169, in one_batch
    try:     x,y = next(iter(dl))
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/basic_data.py", line 75, in __iter__
    for b in self.dl: yield self.proc_batch(b)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 345, in __next__
    data = self._next_data()
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 385, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
TypeError: 'DataLoader' object does not support indexing

What do you suggest I should do?

sgugger · March 11, 2020, 5:06pm

Oh, you passed dataloaders to DataBunch.create. It takes datasets.
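
To see why handing a loader to something that expects a dataset ends in an indexing error, here is a minimal sketch. ToyDataset and ToyLoader are hypothetical stand-ins written for this illustration, not the torch or fastai classes; the only assumption is that a loader builds each batch by indexing whatever it wraps, which is what the fetch.py frames in the second traceback show.

```python
class ToyDataset:
    """Stand-in for a map-style dataset: supports len() and integer indexing."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, idx):
        return self.items[idx]


class ToyLoader:
    """Stand-in for a data loader: iterable, but deliberately not indexable."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, rounding up.
        return -(-len(self.dataset) // self.batch_size)

    def __iter__(self):
        for start in range(0, len(self.dataset), self.batch_size):
            stop = min(start + self.batch_size, len(self.dataset))
            # Each batch is built by indexing the wrapped object item by item.
            yield [self.dataset[i] for i in range(start, stop)]


def first_batch(loader):
    return next(iter(loader))


dataset = ToyDataset(range(10))
loader = ToyLoader(dataset, batch_size=4)
good_batch = first_batch(loader)            # wrapping a dataset works

nested = ToyLoader(loader, batch_size=4)    # wrapping a loader does not
try:
    first_batch(nested)
    nested_error = None
except TypeError as exc:
    nested_error = exc                      # the loader cannot be indexed
```

In the posted script, the equivalent change would presumably be to pass dataset_train and dataset_valid (rather than dataloader_train and dataloader_valid) to DataBunch.create.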

jeremy · March 11, 2020, 7:22pm

Or pass the dataloaders directly to DataBunch() (i.e. __init__).

sarvagya1991 · March 11, 2020, 7:30pm

@jeremy, thank you. I also get an error when I use DataBunch instead of DataBunch.create. I don’t have access to my machine right now, but I will send the error soon. As Jeremy mentioned, I am passing my training and validation dataloaders to DataBunch().

sgugger · March 11, 2020, 7:35pm

Also, you posted in the fastai v2 category while using v1, which is a bit confusing. Let me move this post to the right place.

sarvagya1991 · March 13, 2020, 6:27am

Here is the error for

datas = DataBunch(train_dl = dataloader_train, valid_dl = dataloader_valid):

Traceback (most recent call last):
  File "dataset_test.py", line 100, in <module>
    datas.show_batch()
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/basic_data.py", line 188, in show_batch
    n_items = rows **2 if self.train_ds.x._square_show else rows
AttributeError: 'NumbersDataset' object has no attribute 'x'

When I run
datas.c = 1; learner = unet_learner(datas, models.resnet34)
instead of datas.show_batch(), I get the following error:

Traceback (most recent call last):
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/learner.py", line 117, in unet_learner
    try:    size = data.train_ds[0][0].size
AttributeError: 'dict' object has no attribute 'size'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "dataset_test.py", line 102, in <module>
    learner = unet_learner(datas, models.resnet34)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/learner.py", line 118, in unet_learner
    except: size = next(iter(data.train_dl))[0].shape[-2:]
AttributeError: 'dict' object has no attribute 'shape'

So I thought I would load the data as an array instead of a dictionary. I returned an array from the ToTensor class instead, and I got the following:

Traceback (most recent call last):
  File "dataset_test.py", line 102, in <module>
    learner = unet_learner(datas, models.resnet34)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/learner.py", line 121, in unet_learner
    bottle=bottle), data.device)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/core.py", line 66, in _init
    old_init(self, *args,**kwargs)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/vision/models/unet.py", line 43, in __init__
    sfs_szs = model_sizes(encoder, size=imsize)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 113, in model_sizes
    x = dummy_eval(m, size)
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 108, in dummy_eval
    return m.eval()(dummy_batch(m, size))
  File "/home/sarvagya/miniconda3/envs/gr/lib/python3.6/site-packages/fastai/callbacks/hooks.py", line 104, in dummy_batch
    return one_param(m).new(1, ch_in, *size).requires_grad_(False).uniform_(-1.,1.)
TypeError: new() argument after * must be an iterable, not builtin_function_or_method

According to the documentation, I am supposed to pass a DataLoader object for train_dl and valid_dl. Kindly let me know what I should do in this case.

sgugger · March 13, 2020, 5:02pm

fastai does not provide support for dictionaries in your batches, only tuples (input, target).

sarvagya1991 · March 13, 2020, 6:32pm

But a tuple doesn’t have a shape attribute, so even that would give me an error.
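
On the tuple question: the fallback line in the traceback reads next(iter(data.train_dl))[0].shape[-2:], so shape is looked up on the first element of the batch tuple, not on the tuple itself. The same pair of lines may also explain the new() error: data.train_ds[0][0].size is read without being called, and on a torch.Tensor size is a method rather than a tuple, so the try branch succeeds and the method object is what later gets unpacked. A minimal sketch of both points, using a hypothetical FakeTensor stand-in (not a torch or fastai class) so that it runs without either library:

```python
class FakeTensor:
    """Hypothetical stand-in: `shape` is an attribute, `size` is a method."""

    def __init__(self, *shape):
        self.shape = tuple(shape)

    def size(self):
        return self.shape


def make_dummy_shape(ch_in, *size):
    """Mimics a call of the form new(1, ch_in, *size)."""
    return (1, ch_in) + tuple(size)


# A batch shaped like (input, target): 4 images and 4 masks of 240x427.
batch = (FakeTensor(4, 3, 240, 427), FakeTensor(4, 1, 240, 427))

# The tuple has no `shape`, but its first element does.
spatial = batch[0].shape[-2:]
ok_shape = make_dummy_shape(3, *spatial)

# Reading `.size` without calling it yields a method object, so no
# AttributeError is raised here; the later `*size` unpacking is what fails.
size = batch[0].size
try:
    make_dummy_shape(3, *size)
    unpack_error = None
except TypeError as exc:
    unpack_error = exc
```

With this stand-in the message names "method" rather than "builtin_function_or_method", since FakeTensor.size is written in Python. If this reading is right, the error would go away when the dataset items expose size as a tuple-like property, or have no size attribute at all so that the shape fallback is used.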