Error in Lesson 3 Camvid with Colab

I’m getting this error when I try to run the exact same lesson 3 camvid notebook in Colab.

The first part: when I run this cell, the following warning appears:

 data = (src.transform(get_transforms(), size=size, tfm_y=True)
            .databunch(bs=bs)
            .normalize(imagenet_stats))

You can deactivate this warning by passing `no_check=True`.
/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py:224: UserWarning: There seems to be something wrong with your dataset, can't access any element of self.train_ds.
Tried: 343,297,143,412,495...
  warn(warn_msg)

Then, when I run the next cell, `data.show_batch(2, figsize=(10,7))`, the following error appears:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-39-4672a3bbd9d5> in <module>()
----> 1 data.show_batch(2, figsize=(10,7))

/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py in show_batch(self, rows, ds_type, **kwargs)
    157     def show_batch(self, rows:int=5, ds_type:DatasetType=DatasetType.Train, **kwargs)->None:
    158         "Show a batch of data in `ds_type` on a few `rows`."
--> 159         x,y = self.one_batch(ds_type, True, True)
    160         if self.train_ds.x._square_show: rows = rows ** 2
    161         xs = [self.train_ds.x.reconstruct(grab_idx(x, i)) for i in range(rows)]

/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py in one_batch(self, ds_type, detach, denorm, cpu)
    140         w = self.num_workers
    141         self.num_workers = 0
--> 142         try:     x,y = next(iter(dl))
    143         finally: self.num_workers = w
    144         if detach: x,y = to_detach(x,cpu=cpu),to_detach(y,cpu=cpu)

/usr/local/lib/python3.6/dist-packages/fastai/basic_data.py in __iter__(self)
     69     def __iter__(self):
     70         "Process and returns items from `DataLoader`."
---> 71         for b in self.dl: yield self.proc_batch(b)
     72 
     73     @classmethod

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in __next__(self)
    635                 self.reorder_dict[idx] = batch
    636                 continue
--> 637             return self._process_next_batch(batch)
    638 
    639     next = __next__  # Python 2 compatibility

/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
    656         self._put_indices()
    657         if isinstance(batch, ExceptionWrapper):
--> 658             raise batch.exc_type(batch.exc_msg)
    659         return batch
    660 

TypeError: Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.6/dist-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/usr/local/lib/python3.6/dist-packages/fastai/data_block.py", line 566, in __getitem__
    x = x.apply_tfms(self.tfms, **self.tfmargs)
  File "/usr/local/lib/python3.6/dist-packages/fastai/vision/image.py", line 109, in apply_tfms
    x.resize(target)
  File "/usr/local/lib/python3.6/dist-packages/fastai/vision/image.py", line 185, in resize
    self.flow = _affine_grid(size)
  File "/usr/local/lib/python3.6/dist-packages/fastai/vision/image.py", line 528, in _affine_grid
    grid = FloatTensor(N, H, W, 2)
TypeError: new(): argument 'size' must be tuple of ints, but found element of type numpy.float64 at pos 2

It has to do with the size declaration; I was able to figure that out. It does not recognize that `//` is integer division for some reason. My only solution was getting rid of it, but full size uses so much GPU memory that Colab runs out on me!
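For what it’s worth, `//` does floor division just fine; the catch seems to be that on a numpy array the result’s elements are still numpy scalar types rather than plain Python ints, which is what the `FloatTensor` constructor rejects. A minimal sketch, with made-up shape values:

```python
import numpy as np

# Hypothetical source size, stored as a numpy array the way the notebook's
# src_size is (the actual CamVid values may differ)
src_size = np.array([720.0, 960.0])

size = src_size // 4
print(size)        # [180. 240.] -- floor division worked...
print(size.dtype)  # float64     -- ...but elements are numpy.float64, not int
```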


@muellerzr I’m having the same issue on an AWS EC2 instance, and didn’t understand how you fixed this. Can you explain a bit more what you did? Thanks

Had the same problem (version 1.0.41).
Solved it by changing
size = src_size//4
to:
size = tuple(src_size//4)


Thank you!!!

I am new to Python. What does the tuple function do in this case?

tuple is a built-in type in Python.

In this case, when you pass size to the dataset constructor, it expects a pair of integers: width and height. On my machine the previous version (1.0.39) worked fine with an array; now something changed, and passing a tuple of ints works for me.
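To make that concrete: the constructor wants plain Python ints, and `tuple()` alone keeps the numpy scalars, so each element needs an explicit cast as well. A small sketch, with made-up shape values:

```python
import numpy as np

src_size = np.array([720.0, 960.0])  # hypothetical mask shape

# tuple(src_size // 4) would still hold numpy.float64 elements;
# casting each one gives the (width, height) pair of plain ints expected here
size = tuple(int(s) for s in src_size // 4)
print(size)  # (180, 240)
```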


OK, so slight problem.

size = src_size//4
tuple(size)
bs = 8

When I run this, followed by creating the data with:

data = (src.transform(get_transforms(), size=size, tfm_y=True)
        .databunch(bs=bs)
        .normalize(imagenet_stats))

it will give me a warning message saying
There seems to be something wrong with your dataset, can’t access any element of self.train_ds.

Then, if I do a show_batch, it tells me:
TypeError: new(): argument 'size' must be tuple of ints, but found element of type numpy.float64 at pos 2

The same thing the OP posted.


The tuple trick didn’t work for me, but helped me find a solution!
I solved the issue by replacing:

size = src_size//4

with

size = (src_size//4).tolist()

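If it helps, `.tolist()` works because numpy converts its own scalar types to native Python ones on the way out. A quick sketch, assuming src_size is an integer numpy array (values made up):

```python
import numpy as np

src_size = np.array([720, 960])   # hypothetical mask shape, integer dtype

size = (src_size // 4).tolist()
print(size)           # [180, 240]
print(type(size[0]))  # <class 'int'> -- native Python int now
```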

This is better!
So we need a pair of ints here!

Everywhere you pass size, call .tolist() on it for it to work. When you create the data specifically:

data = (src.transform(get_transforms(), size=size.tolist(), tfm_y=True)
   .databunch(bs=bs)
   .normalize(imagenet_stats))

Considering this issue is fairly widespread, I wonder if we could get the main notebook fixed? (Unless it’s only an issue for the three of us right now.)

I suspect it’s not only us, because I also tried on Paperspace Gradient and had the same issue. My guess is that it’s probably better to add a correction in the data block API rather than in the notebook, so that we don’t need to add this every time we want to create a databunch… Btw, I think that in the past the data block API was compatible with numpy arrays as size.

I think we should get main notebook fixed as I am also having the same issue.

Playing with this notebook.
It works well, but when I try to add a test dataset, I have a problem.
For the test set I take the validation split (valid.txt), make a list of file names, and try to use it as the test dataset.
When I try to get predictions, I get an error.
With
pred_val = learn.get_preds(DatasetType.Valid)
everything is OK, but with
pred_test = learn.get_preds(DatasetType.Test)
I get this error:

Exception: Traceback (most recent call last):
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/opt/conda/lib/python3.7/site-packages/fastai/data_block.py", line 568, in __getitem__
    y = y.apply_tfms(self.tfms_y, **{**self.tfmargs_y, 'do_resolve':False})
  File "/opt/conda/lib/python3.7/site-packages/fastai/core.py", line 156, in apply_tfms
    if tfms: raise Exception('Not implemented')
Exception: Not implemented 

Tried versions 1.0.38 - 1.0.41.
In versions 1.0.41 and 1.0.40, same problem.

In versions 1.0.38 and 1.0.39 I can’t add a test dataset to src. When I try
src = src.add_test(testlist)
I get the error:
AttributeError: 'ImageSegment' object has no attribute 'obj'

Does anyone have experience with image segmentation?

I fixed the issue in master.


Thanks Sylvain!!

Thanks!

Thank you!

Sorry, I still have the same error on versions 1.0.42 and 1.0.43.dev.

Where should I report this?

I am receiving the same error when I use tfm_y=True.
How do I solve this error?