Multiclassification errors

Apologies for the lengthy post. I am getting errors on a multi-label classification problem. I am using the lesson2-planet notebook as a template for a different dataset.

Help would be much appreciated!

labels:
image

image files:
image

Creating the ImageDataBunch works:

data = ImageDataBunch.from_csv(path, folder='train_f', sep=' ', csv_labels='labels_train_f.csv', valid_pct=0.2, suffix='.jpg', size=224, bs=64)

But data.normalize() raises an error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-33-4a4ed0ab3d33> in <module>()
----> 1 data.normalize()

~/.anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in normalize(self, stats)
    342         "Add normalize transform using `stats` (defaults to `DataBunch.batch_stats`)"
    343         if getattr(self,'norm',False): raise Exception('Can not call normalize twice')
--> 344         if stats is None: self.stats = self.batch_stats()
    345         else:             self.stats = stats
    346         self.norm,self.denorm = normalize_funcs(*self.stats)

~/.anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in batch_stats(self, funcs)
    336         "Grab a batch of data and call reduction function `func` per channel"
    337         funcs = ifnone(funcs, [torch.mean,torch.std])
--> 338         x = self.valid_dl.one_batch()[0].cpu()
    339         return [func(channel_view(x), 1) for func in funcs]
    340 

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in one_batch(self)
     79         self.num_workers = 0
     80         it = iter(self)
---> 81         try:     return next(it)
     82         finally: self.num_workers = w
     83 

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
     72     def __iter__(self):
     73         "Process and returns items from `DataLoader`."
---> 74         for b in self.dl: yield self.proc_batch(b)
     75 
     76     def one_batch(self)->Collection[Tensor]:

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    613         if self.num_workers == 0:  # same-process loading
    614             indices = next(self.sample_iter)  # may raise StopIteration
--> 615             batch = self.collate_fn([self.dataset[i] for i in indices])
    616             if self.pin_memory:
    617                 batch = pin_memory_batch(batch)

~/.anaconda3/lib/python3.7/site-packages/fastai/torch_core.py in data_collate(batch)
     89 def data_collate(batch:ItemsList)->Tensor:
     90     "Convert `batch` items to tensor data."
---> 91     return torch.utils.data.dataloader.default_collate(to_data(batch))
     92 
     93 def requires_grad(m:nn.Module, b:Optional[bool]=None)->Optional[bool]:

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in default_collate(batch)
    230     elif isinstance(batch[0], container_abcs.Sequence):
    231         transposed = zip(*batch)
--> 232         return [default_collate(samples) for samples in transposed]
    233 
    234     raise TypeError((error_msg.format(type(batch[0]))))

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in <listcomp>(.0)
    230     elif isinstance(batch[0], container_abcs.Sequence):
    231         transposed = zip(*batch)
--> 232         return [default_collate(samples) for samples in transposed]
    233 
    234     raise TypeError((error_msg.format(type(batch[0]))))

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in default_collate(batch)
    207             storage = batch[0].storage()._new_shared(numel)
    208             out = batch[0].new(storage)
--> 209         return torch.stack(batch, 0, out=out)
    210     elif elem_type.__module__ == 'numpy' and elem_type.__name__ != 'str_' \
    211             and elem_type.__name__ != 'string_':

RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 768 and 774 in dimension 2 at /opt/conda/conda-bld/pytorch-nightly_1540036376816/work/aten/src/TH/generic/THTensorMoreMath.cpp:1317

data.show_batch works:
image

There are 5304 classes.

Using the data block API also raises an error:

np.random.seed()
data = (ImageFileList.from_folder(path)            
        .label_from_csv('labels_train_f.csv', sep=' ', folder='train_f', suffix='.jpg')  
        .random_split_by_pct(0.2)
        .datasets(ImageMultiDataset)  
        .transform(size=224)             
        .databunch()
        .normalize())
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-35-7bb5b6947e80> in <module>()
      3         .label_from_csv('labels_train_f.csv', sep=' ', folder='train_f', suffix='.jpg')
      4         .random_split_by_pct(0.2)
----> 5         .datasets(ImageMultiDataset)
      6         .transform(size=224)
      7         .databunch()

~/.anaconda3/lib/python3.7/site-packages/fastai/data_block.py in datasets(self, dataset_cls, **kwargs)
    145         if hasattr(dss[0], 'classes'): kwg_cls = dss[0].classes
    146         if kwg_cls is not None: kwargs['classes'] = kwg_cls
--> 147         dss.append(dataset_cls(*self.valid.items.T, **kwargs))
    148         cls = getattr(dataset_cls, '__splits_class__', SplitDatasets)
    149         return cls(self.path, *dss)

~/.anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in __init__(self, fns, labels, classes)
    110         super().__init__(classes)
    111         self.x = np.array(fns)
--> 112         self.y = [np.array([self.class2idx[o] for o in l], dtype=np.int64) for l in labels]
    113         self.loss_func = F.binary_cross_entropy_with_logits
    114 

~/.anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in <listcomp>(.0)
    110         super().__init__(classes)
    111         self.x = np.array(fns)
--> 112         self.y = [np.array([self.class2idx[o] for o in l], dtype=np.int64) for l in labels]
    113         self.loss_func = F.binary_cross_entropy_with_logits
    114 

~/.anaconda3/lib/python3.7/site-packages/fastai/vision/data.py in <listcomp>(.0)
    110         super().__init__(classes)
    111         self.x = np.array(fns)
--> 112         self.y = [np.array([self.class2idx[o] for o in l], dtype=np.int64) for l in labels]
    113         self.loss_func = F.binary_cross_entropy_with_logits
    114 

KeyError: '/m/06x1h8'
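
A note on the KeyError above: the traceback shows the validation dataset being built with the classes taken from the first dataset (dss[0].classes), so a label that only occurs in validation rows would have no entry in class2idx. With a random 20% split and thousands of classes, that seems likely to happen. Below is a minimal sketch of that situation in plain Python. It is not fastai's code, and every label except '/m/06x1h8' is made up for illustration.

```python
# Hypothetical labels; only '/m/06x1h8' comes from the traceback above.
train_labels = [["/m/aaa", "/m/bbb"], ["/m/bbb"]]
valid_labels = [["/m/aaa", "/m/06x1h8"]]


def build_class2idx(label_lists):
    """Map every distinct label to an integer index."""
    classes = sorted({label for labels in label_lists for label in labels})
    return {label: idx for idx, label in enumerate(classes)}


def encode(label_lists, class2idx):
    """Turn each list of label strings into a list of class indices."""
    return [[class2idx[label] for label in labels] for labels in label_lists]


# Classes built from the training split only: the validation-only label is missing.
train_only = build_class2idx(train_labels)
try:
    encode(valid_labels, train_only)
    missing = None
except KeyError as err:
    missing = err.args[0]

# Classes built from every label before splitting: encoding succeeds.
all_classes = build_class2idx(train_labels + valid_labels)
encoded_valid = encode(valid_labels, all_classes)
```

If this is what is going on, one way around it would be to make sure the class list is built from the full label file rather than from one split.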

I can run create_cnn:

f_score = partial(fbeta, thresh=0.2)
learn = create_cnn(data, models.resnet34, metrics=[accuracy_thresh, f_score], pretrained=False)

But learn.fit_one_cycle raises an error:

learn.fit_one_cycle(5, lr)
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-38-9f9404aafd82> in <module>()
----> 1 learn.fit_one_cycle(5, lr)

~/.anaconda3/lib/python3.7/site-packages/fastai/train.py in fit_one_cycle(learn, cyc_len, max_lr, moms, div_factor, pct_start, wd, callbacks, **kwargs)
     20     callbacks.append(OneCycleScheduler(learn, max_lr, moms=moms, div_factor=div_factor,
     21                                         pct_start=pct_start, **kwargs))
---> 22     learn.fit(cyc_len, max_lr, wd=wd, callbacks=callbacks)
     23 
     24 def lr_find(learn:Learner, start_lr:Floats=1e-7, end_lr:Floats=10, num_it:int=100, stop_div:bool=True, **kwargs:Any):

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(self, epochs, lr, wd, callbacks)
    160         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    161         fit(epochs, self.model, self.loss_func, opt=self.opt, data=self.data, metrics=self.metrics,
--> 162             callbacks=self.callbacks+callbacks)
    163 
    164     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)
     96 

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_train.py in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     80             cb_handler.on_epoch_begin()
     81 
---> 82             for xb,yb in progress_bar(data.train_dl, parent=pbar):
     83                 xb, yb = cb_handler.on_batch_begin(xb, yb)
     84                 loss = loss_batch(model, xb, yb, loss_func, opt, cb_handler)

~/.anaconda3/lib/python3.7/site-packages/fastprogress/fastprogress.py in __iter__(self)
     63         self.update(0)
     64         try:
---> 65             for i,o in enumerate(self._gen):
     66                 yield o
     67                 if self.auto_update: self.update(i+1)

~/.anaconda3/lib/python3.7/site-packages/fastai/basic_data.py in __iter__(self)
     72     def __iter__(self):
     73         "Process and returns items from `DataLoader`."
---> 74         for b in self.dl: yield self.proc_batch(b)
     75 
     76     def one_batch(self)->Collection[Tensor]:

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in __next__(self)
    635                 self.reorder_dict[idx] = batch
    636                 continue
--> 637             return self._process_next_batch(batch)
    638 
    639     next = __next__  # Python 2 compatibility

~/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
    656         self._put_indices()
    657         if isinstance(batch, ExceptionWrapper):
--> 658             raise batch.exc_type(batch.exc_msg)
    659         return batch
    660 

RuntimeError: Traceback (most recent call last):
  File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/fastai/torch_core.py", line 91, in data_collate
    return torch.utils.data.dataloader.default_collate(to_data(batch))
  File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 232, in default_collate
    return [default_collate(samples) for samples in transposed]
  File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 232, in <listcomp>
    return [default_collate(samples) for samples in transposed]
  File "/home/nbuser/.anaconda3/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 209, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 683 and 518 in dimension 2 at /opt/conda/conda-bld/pytorch-nightly_1540036376816/work/aten/src/TH/generic/THTensorMoreMath.cpp:1317

@miwojc You did not include the transforms…
Try adding:

tfms = get_transforms(....) 

and replace

.transform(size=224) 

by

.transform(tfms, size=224) 

If I recall correctly, the current examples only work with square images, and the transforms make them square…
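
To illustrate the point about sizes: a batch is built by stacking its items into one tensor, and the RuntimeError in the question says the items had different sizes (768 and 774). Here is a toy sketch in plain Python that uses nested lists instead of tensors and a hand-written nearest-neighbour resize. It is only meant to show the constraint, not what fastai or PyTorch do internally, and all the names in it are made up.

```python
from typing import List, Tuple

Image = List[List[int]]  # a single-channel image as rows of pixel values


def shape(img: Image) -> Tuple[int, int]:
    """Return (height, width) of a nested-list image."""
    return (len(img), len(img[0]))


def collate(batch: List[Image]) -> List[Image]:
    """Mimic the stacking constraint: every item must have the same shape."""
    shapes = {shape(img) for img in batch}
    if len(shapes) != 1:
        raise ValueError(f"Sizes must match, got {sorted(shapes)}")
    return list(batch)


def resize_nearest(img: Image, size: int) -> Image:
    """Nearest-neighbour resize to size x size (squishes, does not crop)."""
    height, width = shape(img)
    return [[img[r * height // size][c * width // size] for c in range(size)]
            for r in range(size)]


def make_image(height: int, width: int) -> Image:
    """Build a dummy image with arbitrary pixel values."""
    return [[(r + c) % 256 for c in range(width)] for r in range(height)]


tall = make_image(6, 4)
wide = make_image(4, 7)

# Different shapes cannot be collated into one batch.
try:
    collate([tall, wide])
    raw_batch_ok = True
except ValueError:
    raw_batch_ok = False

# After forcing both to the same square size, collation works.
batch = collate([resize_nearest(img, 3) for img in (tall, wide)])
```

Whatever does the resizing, the point is the same: every image has to come out at one common shape before the batch is assembled.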

Thank you @gsg!
I intentionally left out transforms as I only wanted to resize images to 224. But I will try adding transforms back.
Thanks!

I see your point now that I look at the images from data.show_batch. They are clearly not square, so the resize did not happen.
Thanks for pointing that out. I had not noticed it before.
No wonder PyTorch was complaining about mismatched sizes…

Glad it helped.
I recall there was some work ongoing to support rectangular images, but I am not sure whether it is ready yet…

I had a similar error while running lesson2-download.ipynb on my own dataset. I looked closely at what I had typed and found a typo.

While creating ImageDataBunch.from_folder, I had passed df_tfms=get_transforms() instead of ds_tfms=get_transforms(). Because I did not use the right argument name, the transformations never happened.
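
For anyone wondering why a misspelled argument gives no error: a function that accepts extra keyword arguments can swallow the typo without complaint. The sketch below uses made-up stand-in functions (from_folder_sketch and strict_from_folder_sketch are hypothetical, not fastai's API) to show the effect, assuming the real loader behaves in a similar way.

```python
def from_folder_sketch(path, ds_tfms=None, size=None, **kwargs):
    """Hypothetical stand-in for a loader that tolerates unknown keyword arguments."""
    # Anything in kwargs that this function does not know about is simply dropped.
    return {"path": path, "tfms": ds_tfms, "size": size}


def strict_from_folder_sketch(path, ds_tfms=None, size=None, **kwargs):
    """Same stand-in, but it refuses keyword arguments it does not recognise."""
    if kwargs:
        raise TypeError(f"unexpected keyword arguments: {sorted(kwargs)}")
    return from_folder_sketch(path, ds_tfms=ds_tfms, size=size)


transforms = ["flip", "rotate"]  # placeholder for the result of get_transforms()

# Misspelled keyword: accepted silently, so no transforms are configured.
typo = from_folder_sketch("data", df_tfms=transforms, size=224)

# Correct keyword: the transforms are picked up.
correct = from_folder_sketch("data", ds_tfms=transforms, size=224)

# A stricter signature would surface the typo immediately.
try:
    strict_from_folder_sketch("data", df_tfms=transforms, size=224)
    typo_caught = False
except TypeError:
    typo_caught = True
```

So when a size error like this shows up, it is worth checking the spelling of every keyword argument before anything else.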

Thanks for posting this. I’m new to this course, and ran into a problem when loading the oxford-flowers dataset via DataFrame. Adding the tfms and setting the size to 224 got me past the error. Thanks again!