Different behaviour: Google Colab vs FloydHub

(sergio marchesini) #1

Hi, I am trying to build my own image classifier, following lesson one.
My dataset is a root folder containing one subfolder per label, and each subfolder contains n images.

I tried loading it into an ImageDataBunch like this:

data = ImageDataBunch.from_name_func('.', full_file_paths, label_func=smart_get_labels, ds_tfms=get_transforms(), size=224, bs=bs)

(The label function works correctly.)
In Google Colab this worked fine, and I could plot a few images just like in the lesson notebook.

I then tried the same on FloydHub, but I got a (data)bunch of errors like:
/usr/local/lib/python3.6/site-packages/fastai/basic_data.py:226: UserWarning: There seems to be something wrong with your dataset, can't access any element of self.train_ds.
Tried: 6488,436,46069,2298,17059...

If I try to plot the images at the indices the warning mentions, they display fine.

Also I tried:

And that gives me:
/usr/local/lib/python3.6/site-packages/fastprogress/fastprogress.py:95: UserWarning: Your generator is empty.

Is it possible that there is an older version of fastai on FloydHub? (I followed the getting started tutorial in the lesson pages.)
How do I check the version and, if necessary, update it?

TIA and sorry for newbieness :)
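
On the version question, a minimal sketch of one way to see what is installed in each environment. It uses only the standard library (`importlib.metadata`, which needs Python 3.8 or newer); `installed_versions` is a hypothetical helper name, not part of fastai. On the Python 3.6 shown in the paths above, printing `fastai.__version__` and `torch.__version__` after importing both should give the same information.

```python
from importlib import metadata


def installed_versions(packages=("fastai", "torch", "torchvision")):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both Colab and FloydHub and comparing the output should show whether the two environments differ. If they do, `pip install --upgrade fastai` is the usual way to update, assuming pip manages the environment.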


(sergio marchesini) #2

Just to give more context, here is the full stack trace:

 		You can deactivate this warning by passing `no_check=True`.
	/usr/local/lib/python3.6/site-packages/fastai/basic_data.py:226: UserWarning: There seems to be something wrong with your dataset, can't access any element of self.train_ds.
	Tried: 42349,36208,51596,33934,7632...
	ValueError                                Traceback (most recent call last)
	<ipython-input-13-ecaf6f2dc64a> in <module>()
	----> 1 data = ImageDataBunch.from_name_func(datapath, fungi,label_func=smart_get_labels,ds_tfms=get_transforms(), size=224, bs=bs).normalize()

	/usr/local/lib/python3.6/site-packages/fastai/vision/data.py in normalize(self, stats, do_x, do_y)
	    177         "Add normalize transform using `stats` (defaults to `DataBunch.batch_stats`)"
	    178         if getattr(self,'norm',False): raise Exception('Can not call normalize twice')
	--> 179         if stats is None: self.stats = self.batch_stats()
	    180         else:             self.stats = stats
	    181         self.norm,self.denorm = normalize_funcs(*self.stats, do_x=do_x, do_y=do_y)

	/usr/local/lib/python3.6/site-packages/fastai/vision/data.py in batch_stats(self, funcs)
	    171         "Grab a batch of data and call reduction function `func` per channel"
	    172         funcs = ifnone(funcs, [torch.mean,torch.std])
	--> 173         x = self.one_batch(ds_type=DatasetType.Valid, denorm=False)[0].cpu()
	    174         return [func(channel_view(x), 1) for func in funcs]

	/usr/local/lib/python3.6/site-packages/fastai/basic_data.py in one_batch(self, ds_type, detach, denorm, cpu)
	    140         w = self.num_workers
	    141         self.num_workers = 0
	--> 142         try:     x,y = next(iter(dl))
	    143         finally: self.num_workers = w
	    144         if detach: x,y = to_detach(x,cpu=cpu),to_detach(y,cpu=cpu)

	/usr/local/lib/python3.6/site-packages/fastai/basic_data.py in __iter__(self)
	     69     def __iter__(self):
	     70         "Process and returns items from `DataLoader`."
	---> 71         for b in self.dl: yield self.proc_batch(b)
	     73     @classmethod

	/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
	    334                 self.reorder_dict[idx] = batch
	    335                 continue
	--> 336             return self._process_next_batch(batch)
	    338     next = __next__  # Python 2 compatibility

	/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _process_next_batch(self, batch)
	    355         self._put_indices()
	    356         if isinstance(batch, ExceptionWrapper):
	--> 357             raise batch.exc_type(batch.exc_msg)
	    358         return batch

	ValueError: Traceback (most recent call last):
	  File "/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in _worker_loop
	    samples = collate_fn([dataset[i] for i in batch_indices])
	  File "/usr/local/lib/python3.6/site-packages/torch/utils/data/dataloader.py", line 106, in <listcomp>
	    samples = collate_fn([dataset[i] for i in batch_indices])
	  File "/usr/local/lib/python3.6/site-packages/fastai/data_block.py", line 567, in __getitem__
	    x = x.apply_tfms(self.tfms, **self.tfmargs)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 117, in apply_tfms
	    x = tfm(x, size=_get_crop_target(size,mult=mult), padding_mode=padding_mode)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 506, in __call__
	    return self.tfm(x, *args, **{**self.resolved, **kwargs}) if self.do_run else x
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 453, in __call__
	    if args: return self.calc(*args, **kwargs)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 458, in calc
	    if self._wrap: return getattr(x, self._wrap)(self.func, *args, **kwargs)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 167, in pixel
	    self.px = func(self.px, *args, **kwargs)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 140, in px
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 127, in refresh
	    self._px = _grid_sample(self._px, self.flow, **self.sample_kwargs)
	  File "/usr/local/lib/python3.6/site-packages/fastai/vision/image.py", line 523, in _grid_sample
	    return F.grid_sample(x[None], coords, mode=mode, padding_mode=padding_mode)[0]
	  File "/usr/local/lib/python3.6/site-packages/torch/nn/functional.py", line 2092, in grid_sample
	    raise ValueError("padding_mode needs to be 'zeros' or 'border', but got {}".format(padding_mode))
	ValueError: padding_mode needs to be 'zeros' or 'border', but got reflection
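
The last line of the traceback points to a version mismatch rather than a data problem: the transforms ask for `padding_mode='reflection'`, while the installed `torch.nn.functional.grid_sample` only accepts `'zeros'` or `'border'`. As far as I know, reflection support arrived with PyTorch 1.0, so an older torch build would fail in exactly this way. Below is a minimal sketch of a guard; `pick_padding_mode` and `parse_major_minor` are hypothetical helpers, not part of fastai, and the 1.0 cutoff is an assumption.

```python
import re


def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '0.4.1' or '1.0.0.dev20181116'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def pick_padding_mode(torch_version: str) -> str:
    """Return 'reflection' when torch is assumed to support it, else 'border'.

    Assumption: grid_sample accepts 'reflection' from PyTorch 1.0 onwards.
    """
    return "reflection" if parse_major_minor(torch_version) >= (1, 0) else "border"
```

With torch installed, the result of `pick_padding_mode(torch.__version__)` could be passed as `padding_mode=` to `ImageDataBunch.from_name_func`, which I believe forwards it to the transforms in fastai v1. Upgrading torch to the version fastai expects is probably the cleaner fix.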

(Zachary Mueller) #3

I'd move on to lesson two. Jeremy introduces a from_folder function there which, I believe, is closer to what you're hoping to achieve.


(sergio marchesini) #4

Thank you @muellerzr. I have watched lesson two and also looked inside the library code, but honestly I still don't understand why my code works on Google Colab and on a local conda install but not on FloydHub.
From my perspective, using from_name_func should be correct in my case (one folder per label, with the full array of file paths passed as the second argument).

I will check my floydhub setup first.
Thank you very much.


(sergio marchesini) #5

I can confirm there is a problem with the automatic setup on FloydHub. I followed the course instructions and cloned the standard fastai workspace, but if I simply run the lesson 1 notebook I get the same error when instantiating ImageDataBunch via from_name_re().

I think I will try the code on a different service.