ImageDataBunch Problem

Ezno · November 5, 2019, 1:28am

I am able to follow the examples from the first videos; however, when I try to build my own image recognition model I get stuck loading the data into an ImageDataBunch. I’ve searched the forums and tried a few other things, but I can’t work out why the issue is happening. If someone can help, that would be greatly appreciated.

Dataset: I am using the cats vs dogs dataset from this Kaggle competition: https://www.kaggle.com/c/dogs-vs-cats/data

My approach was to use ImageDataBunch.from_df.

import pandas as pd
import numpy as np
from fastai.vision import *

path_img = '/home/ubuntu/data/dog-vs-cats/data/train/'
fnames = get_image_files(path_img)
bs = 64

df = pd.DataFrame(data = {'name': fnames})
df['name'] = df['name'].astype(str)
df['label'] = df['name'].str[41:44]

This gives me a dataframe with the columns name (the full file path) and label (screenshot omitted).
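The fixed slice [41:44] only works while the directory prefix is exactly 41 characters long. As a minimal sketch, assuming the files are named like cat.2960.jpg and dog.7.jpg, the label can be taken from the file name instead. The helper name label_from_path is hypothetical and not part of fastai.

```python
from pathlib import Path


def label_from_path(file_path):
    """Return the part of the file name before the first dot (hypothetical helper)."""
    return Path(file_path).name.split(".")[0]


sample = "/home/ubuntu/data/dog-vs-cats/data/train/cat.2960.jpg"

# The label comes from the file name, so the directory length no longer matters.
print(label_from_path(sample))
print(label_from_path("train/dog.7.jpg"))
```

With a dataframe, the same function could be applied with df['name'].map(label_from_path).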

When I then try to load it into a DataBunch, I get this result:

data = ImageDataBunch.from_df(path_img, df, path_img, ds_tfms=get_transforms(), size=224, bs=bs).normalize(imagenet_stats)

You can deactivate this warning by passing `no_check=True`.
/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py:201: UserWarning: There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in [9013, 17837, 7231, 90, 1904, 12343, 4740, 2794, 3424, 5195, 15953, 19372, 7610, 11148, 946, 19141, 18028, 4529, 8890, 17871, 1583, 5938, 17926, 14521, 6517, 14450, 6038, 12066, 4524, 11484, 7937, 12945, 8262, 4435, 4550, 3989, 17425, 13755, 7796, 2187, 12598, 15165, 17847, 3169, 15377, 614, 14890, 7424, 16674, 9770, 13733, 1018, 5113, 4556, 9782, 14122, 5766, 2837, 2778, 7183, 7884, 12891, 7108, 13375]
  warn(f"There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in {idx}")

If I then try to see what is in “data”, I get an unusual error indicating that several paths are being concatenated together, so the file can’t be found (see the bottom of the trace below). I checked whether any paths were concatenated in the dataframe, and they were not. I also checked for missing paths and didn’t find any. I am not really sure what else to try; nothing I looked at seemed out of place.

data


---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    398                             if callable(meth):
    399                                 return meth(obj, self, cycle)
--> 400                         if cls is not object \
    401                                 and callable(cls.__dict__.get('__repr__')):
    402                             return _repr_pprint(obj, self, cycle)

~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    693 
    694 def _repr_pprint(obj, p, cycle):
--> 695     """A pprint that just redirects to the normal repr function."""
    696     # Find newlines and replace them with p.break_()
    697     output = repr(obj)

~/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py in __repr__(self)
     98 
     99     def __repr__(self)->str:
--> 100         return f'{self.__class__.__name__};\n\nTrain: {self.train_ds};\n\nValid: {self.valid_ds};\n\nTest: {self.test_ds}'
    101 
    102     @staticmethod

~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __repr__(self)
    492 
    493     def __repr__(self)->str:
--> 494         x = f'{self.x}' # force this to happen first
    495         return f'{self.__class__.__name__}\ny: {self.y}\nx: {x}'
    496     def predict(self, res):

~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __repr__(self)
     59         return self.items[i]
     60     def __repr__(self)->str:
---> 61         items = [self[i] for i in range(min(5,len(self.items)))]
     62         return f'{self.__class__.__name__} ({len(self.items)} items)\n{items}...\nPath: {self.path}'
     63 

~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in <listcomp>(.0)
     59         return self.items[i]
     60     def __repr__(self)->str:
---> 61         items = [self[i] for i in range(min(5,len(self.items)))]
     62         return f'{self.__class__.__name__} ({len(self.items)} items)\n{items}...\nPath: {self.path}'
     63 

~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __getitem__(self, idxs)
     92     def __getitem__(self,idxs:int)->Any:
     93         idxs = try_int(idxs)
---> 94         if isinstance(idxs, numbers.Integral): return self.get(idxs)
     95         else: return self.new(self.items[idxs], xtra=index_row(self.xtra, idxs))
     96 

~/anaconda3/lib/python3.6/site-packages/fastai/vision/data.py in get(self, i)
    264     def get(self, i):
    265         fn = super().get(i)
--> 266         res = self.open(fn)
    267         self.sizes[i] = res.size
    268         return res

~/anaconda3/lib/python3.6/site-packages/fastai/vision/data.py in open(self, fn)
    260     def open(self, fn):
    261         "Open image in `fn`, subclass and overwrite for custom behavior."
--> 262         return open_image(fn, convert_mode=self.convert_mode)
    263 
    264     def get(self, i):

~/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py in open_image(fn, div, convert_mode, cls)
    374     with warnings.catch_warnings():
    375         warnings.simplefilter("ignore", UserWarning) # EXIF warning from TiffPlugin
--> 376         x = PIL.Image.open(fn).convert(convert_mode)
    377     x = pil2tensor(x,np.float32)
    378     if div: x.div_(255)

~/anaconda3/lib/python3.6/site-packages/PIL/Image.py in open(fp, mode)
   2632     :param mode: Mode to use (will be determined from type if None)
   2633       See: :ref:`concept-modes`.
-> 2634     :returns: An image object.
   2635 
   2636     .. versionadded:: 1.1.6

FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/data/dog-vs-cats/data/train//home/ubuntu/data/dog-vs-cats/data/train///home/ubuntu/data/dog-vs-cats/data/train/cat.2960.jpg'

Any ideas for things I can try?

bwarner · November 5, 2019, 4:02am

It’s because you are passing the path to ImageDataBunch three times: once in your dataframe, and twice with path_img.

Ezno · November 6, 2019, 6:06pm

Thank you, bwarner, for helping a noob out. These are probably not very interesting problems to solve for more experienced people, but I am committed to learning this and really appreciate the help getting started.

That fixed the issue, but now I seem to be running into a new one. I tried this on a couple of datasets I am working with and get the same issue with all of them (the one above without the duplicate path, and another using .from_folder). I then went back to the course-v3 lesson1-pets Jupyter notebook to check that I could still run the course code, and found that I can’t.

I re-pulled the GitHub repository in case I had broken the notebook. I then found forum threads indicating that this issue is tied to fastai not being fully updated, so I tried updating everything. I am a bit lost on next steps, but here’s what I tried.

cd course-v3
git pull
conda update conda
conda install -c fastai fastai

I get the errors on this line

data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=224, bs=bs
                                  ).normalize(imagenet_stats)

You can deactivate this warning by passing `no_check=True`.
/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py:201: UserWarning: There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in [3389, 5725, 2011, 4842, 5112, 907, 1825, 536, 251, 2756, 4307, 3027, 4510, 3167, 3047, 2171, 3125, 4349, 2398, 1170, 780, 125, 4297, 3924, 5019, 4067, 2893, 2461, 1102, 715, 5212, 3150, 1215, 3128, 418, 1197, 2701, 451, 5028, 4205, 4697, 1620, 4744, 3177, 88, 4952, 556, 5667, 4233, 3627, 1852, 257, 2679, 4671, 4776, 146, 3947, 993, 2824, 1768, 280, 1838, 5709, 3755]
  warn(f"There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in {idx}")

The following line produces a trace that may be helpful: it indicates that the “torch” module is missing an attribute. Since I don’t import torch myself, I am guessing it’s a component of fastai.

data.show_batch(rows=3, figsize=(7,6))

/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/nn/functional.py:2693: UserWarning: Default grid_sample and affine_grid behavior will be changed to align_corners=False from 1.4.0. See the documentation of grid_sample for details.
  warnings.warn("Default grid_sample and affine_grid behavior will be changed "
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-11-66824b983385> in <module>
----> 1 data.show_batch(rows=3, figsize=(7,6))

~/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py in show_batch(self, rows, ds_type, **kwargs)
    158     def show_batch(self, rows:int=5, ds_type:DatasetType=DatasetType.Train, **kwargs)->None:
    159         "Show a batch of data in `ds_type` on a few `rows`."
--> 160         x,y = self.one_batch(ds_type, True, True)
    161         if self.train_ds.x._square_show: rows = rows ** 2
    162         xs = [self.train_ds.x.reconstruct(grab_idx(x, i)) for i in range(rows)]

~/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py in one_batch(self, ds_type, detach, denorm, cpu)
    141         w = self.num_workers
    142         self.num_workers = 0
--> 143         try:     x,y = next(iter(dl))
    144         finally: self.num_workers = w
    145         if detach: x,y = to_detach(x,cpu=cpu),to_detach(y,cpu=cpu)

~/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py in __iter__(self)
     68     def __iter__(self):
     69         "Process and returns items from `DataLoader`."
---> 70         for b in self.dl:
     71             #y = b[1][0] if is_listy(b[1]) else b[1] # XXX: Why is this line here?
     72             yield self.proc_batch(b)

~/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py in __next__(self)
    817             else:
    818                 del self._task_info[idx]
--> 819                 return self._process_data(data)
    820 
    821     next = __next__  # Python 2 compatibility

~/anaconda3/lib/python3.6/site-packages/torch/utils/data/dataloader.py in _process_data(self, data)
    844         self._try_put_index()
    845         if isinstance(data, ExceptionWrapper):
--> 846             data.reraise()
    847         return data
    848 

~/anaconda3/lib/python3.6/site-packages/torch/_utils.py in reraise(self)
    383             # (https://bugs.python.org/issue2651), so we work around it.
    384             msg = KeyErrorMessage(msg)
--> 385         raise self.exc_type(msg)

AttributeError: Caught AttributeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/data_block.py", line 526, in __getitem__
    x = x.apply_tfms(self.tfms, **self.tfmargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py", line 113, in apply_tfms
    else: x = tfm(x)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py", line 499, in __call__
    return self.tfm(x, *args, **{**self.resolved, **kwargs}) if self.do_run else x
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py", line 446, in __call__
    if args: return self.calc(*args, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py", line 451, in calc
    if self._wrap: return getattr(x, self._wrap)(self.func, *args, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py", line 167, in coord
    self.flow = func(self.flow, *args, **kwargs)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/transform.py", line 238, in _symmetric_warp
    return _do_perspective_warp(c, targ_pts, invert)
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/transform.py", line 225, in _do_perspective_warp
    return _apply_perspective(c, _find_coeffs(_orig_pts, targ_pts))
  File "/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/vision/transform.py", line 206, in _find_coeffs
    return torch.gesv(B,A)[0][:,0]
AttributeError: module 'torch' has no attribute 'gesv'

Ezno · November 7, 2019, 12:45am

Hmm, this is an environment problem. I don’t know what’s different or how to fix it yet, but the code works locally and fails when run on a p2.xlarge EC2 instance. I’ll update this if I figure out how to troubleshoot or fix it.

bwarner · November 7, 2019, 3:29am

Yeah, it looks like an environment issue. I’d start by double checking the pytorch version on AWS and go from there.
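One way to do that check is to run the same small script in both environments and compare the output. This is only a sketch: the helper names module_version and module_has are hypothetical, and it assumes the packages expose a __version__ attribute. As far as I know, torch.gesv was deprecated in favor of torch.solve and later removed, so a missing gesv would suggest the installed fastai is older than the installed PyTorch expects.

```python
import importlib
import sys


def module_version(name):
    """Return the module's __version__, or None if it can't be imported."""
    try:
        module = importlib.import_module(name)
    except Exception:  # a broken install can raise more than ImportError
        return None
    return getattr(module, "__version__", None)


def module_has(name, attribute):
    """Return True if the module imports and exposes `attribute`."""
    try:
        module = importlib.import_module(name)
    except Exception:
        return False
    return hasattr(module, attribute)


print("python", sys.version.split()[0])
for name in ("torch", "fastai"):
    print(name, module_version(name))

# The traceback above fails on torch.gesv, so check for it directly.
print("torch.gesv available:", module_has("torch", "gesv"))
```

If the two machines print different fastai or torch versions, that difference is the first thing to look at.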

Ezno · November 7, 2019, 3:53am

Yeah, PyTorch and fastai were on the same version in both places. The only difference was that the EC2 instance was on Python 3.6.9, whereas locally it was on 3.7.5. I tried getting conda to update to 3.7, but there were issues there. After a couple of hours of that, I decided to try out some different environments and circle back to this issue at a later date. For where I am at, I’d rather avoid this problem for now and focus on the machine learning aspects.

When setting up the environment I followed this guide: https://course.fast.ai/start_aws.html. Is there anyone I should notify that it may not be working, or somewhere I should file a ticket?