RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0

Environment information

fastai version: 1.0.38
PyTorch version: 1.0.0
Is debug build: No
CUDA used to build PyTorch: 9.0.176

OS: Ubuntu 16.04.5 LTS
GCC version: (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609
CMake version: version 3.12.2

Python version: 3.6
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: Quadro M4000
Nvidia driver version: 410.48
cuDNN version: Could not collect

Versions of relevant libraries:
[pip] Could not collect
[conda] pytorch 1.0.0 py3.6_cuda9.0.176_cudnn7.4.1_1 pytorch
[conda] pytorch-nightly 1.0.0.dev20181115 py3.6_cuda9.0.176_cudnn7.1.2_0 pytorch
[conda] torchvision 0.2.1 py_2 pytorch
[conda] torchvision-nightly 0.2.1 py_0 fastai

Error

RuntimeError: Traceback (most recent call last):
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/fastai/torch_core.py", line 105, in data_collate
    return torch.utils.data.dataloader.default_collate(to_data(batch))
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/paperspace/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3146 and 3009 in dimension 2 at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/TH/generic/THTensorMoreMath.cpp:1333
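What the error means: `default_collate` ends in `torch.stack`, which requires every image tensor in the batch to have the same shape; batching only adds the batch dimension in front. A minimal stdlib-only sketch of that shape rule (the `collate_shapes` helper is hypothetical, not fastai or PyTorch code):

```python
# Sketch of the shape rule behind torch.stack / default_collate:
# every item in a batch must share the same shape, or stacking fails.
def collate_shapes(shapes):
    """Return the stacked batch shape, or raise like torch.stack does."""
    first = shapes[0]
    for dim in range(len(first)):
        for s in shapes[1:]:
            if s[dim] != first[dim]:
                raise RuntimeError(
                    f"Sizes of tensors must match except in dimension 0. "
                    f"Got {s[dim]} and {first[dim]} in dimension {dim}"
                )
    return (len(shapes),) + first

# Images resized to a common size stack fine:
print(collate_shapes([(3, 224, 224), (3, 224, 224)]))  # (2, 3, 224, 224)

# Raw images of different sizes reproduce the failure above:
try:
    collate_shapes([(3, 2106, 1433), (3, 3240, 1917)])
except RuntimeError as e:
    print(e)
```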

Reproduce

tfms = get_transforms()

data = ImageDataBunch.from_csv(
    path=DATA_PATH,
    folder='train',
    csv_labels=LABELS_PATH,
    sep=',',
    bs=bs,
    ds_tfms=tfms,
    num_workers=1,
).normalize(imagenet_stats)

print(len(data.train_ds))

Output: 1882

I get this warning:

/home/paperspace/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py:211: UserWarning: It's not possible to collate samples of your dataset together in a batch.
Shapes of the inputs/targets:
[[torch.Size([3, 2106, 1433]), torch.Size([3, 3240, 1917]), torch.Size([3, 2751, 1713]), torch.Size([3, 1556, 1188]), torch.Size([3, 2984, 1811]), torch.Size([3, 2705, 1693]), torch.Size([3, 2905, 1778]), torch.Size([3, 2977, 1808]), torch.Size([3, 3533, 2036]), torch.Size([3, 3652, 2084]), torch.Size([3, 3103, 1860]), torch.Size([3, 3063, 1843]),
lr = 1e-2
learn = create_cnn(
    data=data, 
    arch=models.resnet34, 
    pretrained=True, 
    metrics=[metrics.error_rate, metrics.accuracy]
)

learn.fit_one_cycle(1, lr)

My hunch is that there’s something wrong with how I am using the ImageDataBunch, since I can successfully get the course3 lesson1-pets notebook working.


You need to pass a size argument so that your images are all resized to it. The warning you got when creating the DataBunch told you that your images couldn’t be put together in a batch because they’re not of the same size.
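Concretely, that means adding `size` to the `from_csv` call from the question, so fastai resizes every image to a common size before collating. A sketch against the fastai 1.0.x API; `DATA_PATH`, `LABELS_PATH`, and `bs` are the asker's own variables:

```python
# Same call as in the question, plus size=224 so all images are
# resized to 224x224 before batching (fastai 1.0.x API).
data = ImageDataBunch.from_csv(
    path=DATA_PATH,
    folder='train',
    csv_labels=LABELS_PATH,
    sep=',',
    bs=bs,
    ds_tfms=get_transforms(),
    size=224,            # resize every image to a common size
    num_workers=1,
).normalize(imagenet_stats)
```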


I get the same warning when I create my ImageDataBunch instance with a size keyword argument, like so:

data = ImageDataBunch.from_folder(
    '../data/processed/images_sample',
    size=224,
    valid_pct=0.1,
    tfms=get_transforms(),
)

UserWarning: It’s not possible to collate samples of your dataset together in a batch.

I would assume that passing the size=224 keyword argument would resize each image to 224 x 224 and thus create equal-sized tensors?

I’m using fastai version 1.0.39

Be careful, get_transforms() goes in ds_tfms, not tfms.
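This also explains why the wrong keyword fails silently instead of raising a TypeError: the fastai factory methods forward extra keyword arguments down a chain of `**kwargs`, so an unrecognized name like `tfms` (or a misspelling like `ds_tmfs`) is quietly ignored and the transforms are never applied. A stdlib-only sketch of the pattern (the `from_folder` stand-in is hypothetical, not fastai code):

```python
# Sketch of why a misspelled keyword goes unnoticed: functions that
# accept **kwargs swallow unknown names instead of raising TypeError.
def from_folder(path, ds_tfms=None, size=None, **kwargs):
    """Stand-in for a fastai factory method that forwards **kwargs."""
    if kwargs:
        # fastai 1.0.x forwards these further down; nothing validates them.
        print(f"silently ignoring: {sorted(kwargs)}")
    return {"path": path, "ds_tfms": ds_tfms, "size": size}

# Correct keyword: transforms are picked up.
ok = from_folder("data", ds_tfms=["crop", "flip"], size=224)

# Wrong keyword: no error, but no transforms either,
# so images keep their original, mismatched sizes.
oops = from_folder("data", tfms=["crop", "flip"], size=224)
assert oops["ds_tfms"] is None
```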

Interesting, I didn’t know get_transforms() goes in ds_tfms. This fixed the issue for me.

I was having that error until I did the following:

data = ImageDataBunch.from_folder(mypath, valid_pct=0.2, size=224, ds_tfms=get_transforms()).normalize(imagenet_stats)

This fixed my similar problem! :zap:

Posting here for others to find:

# my images in folder, creating ImageDataBunch
data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                                  ds_tmfs=get_transforms(),
                                  size=224, num_workers=4).normalize(imagenet_stats)

then data.show_batch() produced this error:

RuntimeError: Traceback (most recent call last):
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/lib/python3.6/site-packages/fastai/torch_core.py", line 99, in data_collate
    return torch.utils.data.dataloader.default_collate(to_data(batch))
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/opt/conda/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 280 and 312 in dimension 2 at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/TH/generic/THTensorMoreMath.cpp:1333

Solution :raised_hands: ds_tmfs was misspelled (it should be ds_tfms), so the differently sized images were never being resized to 224:

data = ImageDataBunch.from_folder(path, train='.', valid_pct=0.2,
                      🤦‍♂️    -->   ds_tfms=get_transforms(),
                                  size=224, num_workers=4).normalize(imagenet_stats)

Classic

Thanks