I am able to follow and run the example from the first videos, but when I try to build my own image recognition model I get stuck loading the data into an ImageDataBunch. I've searched the forums and tried a few other things, but I can't work out why the issue is happening. If someone can help, that would be greatly appreciated.
Dataset: I am using the Dogs vs. Cats dataset from this Kaggle competition: https://www.kaggle.com/c/dogs-vs-cats/data
My approach was to use ImageDataBunch.from_df.
import pandas as pd
import numpy as np
from fastai.vision import *
path_img = '/home/ubuntu/data/dog-vs-cats/data/train/'
fnames = get_image_files(path_img)
bs = 64
df = pd.DataFrame(data = {'name': fnames})
df['name'] = df['name'].astype(str)
df['label'] = df['name'].str[41:44]
This gives me a dataframe like this:
When I then try to load it into a DataBunch, I get this result:
data = ImageDataBunch.from_df(path_img, df, path_img, ds_tfms=get_transforms(), size=224, bs=bs).normalize(imagenet_stats)
You can deactivate this warning by passing `no_check=True`.
/home/ubuntu/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py:201: UserWarning: There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in [9013, 17837, 7231, 90, 1904, 12343, 4740, 2794, 3424, 5195, 15953, 19372, 7610, 11148, 946, 19141, 18028, 4529, 8890, 17871, 1583, 5938, 17926, 14521, 6517, 14450, 6038, 12066, 4524, 11484, 7937, 12945, 8262, 4435, 4550, 3989, 17425, 13755, 7796, 2187, 12598, 15165, 17847, 3169, 15377, 614, 14890, 7424, 16674, 9770, 13733, 1018, 5113, 4556, 9782, 14122, 5766, 2837, 2778, 7183, 7884, 12891, 7108, 13375]
warn(f"There seems to be something wrong with your dataset, can't access self.train_ds[i] for all i in {idx}")
If I then try to see what is in `data`, I get an unusual error indicating that several paths are somehow being concatenated together, so the files can't be found (see the bottom of the error trace below). I checked whether any paths were concatenated together in the dataframe, and they were not. I also checked for missing paths and didn't find any. I am not sure what else to try; nothing I looked at seemed out of place.
data
---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
~/anaconda3/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
700 type_pprinters=self.type_printers,
701 deferred_pprinters=self.deferred_printers)
--> 702 printer.pretty(obj)
703 printer.flush()
704 return stream.getvalue()
~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
398 if callable(meth):
399 return meth(obj, self, cycle)
--> 400 if cls is not object \
401 and callable(cls.__dict__.get('__repr__')):
402 return _repr_pprint(obj, self, cycle)
~/anaconda3/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
693
694 def _repr_pprint(obj, p, cycle):
--> 695 """A pprint that just redirects to the normal repr function."""
696 # Find newlines and replace them with p.break_()
697 output = repr(obj)
~/anaconda3/lib/python3.6/site-packages/fastai/basic_data.py in __repr__(self)
98
99 def __repr__(self)->str:
--> 100 return f'{self.__class__.__name__};\n\nTrain: {self.train_ds};\n\nValid: {self.valid_ds};\n\nTest: {self.test_ds}'
101
102 @staticmethod
~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __repr__(self)
492
493 def __repr__(self)->str:
--> 494 x = f'{self.x}' # force this to happen first
495 return f'{self.__class__.__name__}\ny: {self.y}\nx: {x}'
496 def predict(self, res):
~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __repr__(self)
59 return self.items[i]
60 def __repr__(self)->str:
---> 61 items = [self[i] for i in range(min(5,len(self.items)))]
62 return f'{self.__class__.__name__} ({len(self.items)} items)\n{items}...\nPath: {self.path}'
63
~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in <listcomp>(.0)
59 return self.items[i]
60 def __repr__(self)->str:
---> 61 items = [self[i] for i in range(min(5,len(self.items)))]
62 return f'{self.__class__.__name__} ({len(self.items)} items)\n{items}...\nPath: {self.path}'
63
~/anaconda3/lib/python3.6/site-packages/fastai/data_block.py in __getitem__(self, idxs)
92 def __getitem__(self,idxs:int)->Any:
93 idxs = try_int(idxs)
---> 94 if isinstance(idxs, numbers.Integral): return self.get(idxs)
95 else: return self.new(self.items[idxs], xtra=index_row(self.xtra, idxs))
96
~/anaconda3/lib/python3.6/site-packages/fastai/vision/data.py in get(self, i)
264 def get(self, i):
265 fn = super().get(i)
--> 266 res = self.open(fn)
267 self.sizes[i] = res.size
268 return res
~/anaconda3/lib/python3.6/site-packages/fastai/vision/data.py in open(self, fn)
260 def open(self, fn):
261 "Open image in `fn`, subclass and overwrite for custom behavior."
--> 262 return open_image(fn, convert_mode=self.convert_mode)
263
264 def get(self, i):
~/anaconda3/lib/python3.6/site-packages/fastai/vision/image.py in open_image(fn, div, convert_mode, cls)
374 with warnings.catch_warnings():
375 warnings.simplefilter("ignore", UserWarning) # EXIF warning from TiffPlugin
--> 376 x = PIL.Image.open(fn).convert(convert_mode)
377 x = pil2tensor(x,np.float32)
378 if div: x.div_(255)
~/anaconda3/lib/python3.6/site-packages/PIL/Image.py in open(fp, mode)
2632 :param mode: Mode to use (will be determined from type if None)
2633 See: :ref:`concept-modes`.
-> 2634 :returns: An image object.
2635
2636 .. versionadded:: 1.1.6
FileNotFoundError: [Errno 2] No such file or directory: '/home/ubuntu/data/dog-vs-cats/data/train//home/ubuntu/data/dog-vs-cats/data/train///home/ubuntu/data/dog-vs-cats/data/train/cat.2960.jpg'
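One observation that may help narrow this down: the path in the error is exactly what you get by concatenating the three values from the call above (the base path, the folder argument, and the `name` column entry) as plain strings. The sketch below only illustrates that under stated assumptions. The helper `concat_like_error` is hypothetical and is not fastai code; it assumes the base path has its trailing slash dropped and the other two pieces are appended as raw strings.

```python
from pathlib import PurePosixPath

def concat_like_error(path, name, folder=None):
    """Hypothetical helper: glue the pieces together as plain strings.

    Assumption (not taken from fastai's source): the base path is
    normalised, which drops its trailing slash, and the folder and file
    name are appended unchanged with a '/' between each piece.
    """
    base = str(PurePosixPath(path))
    if folder is not None:
        base = f"{base}/{folder}"
    return f"{base}/{name}"

path_img = "/home/ubuntu/data/dog-vs-cats/data/train/"
full_name = path_img + "cat.2960.jpg"  # what the 'name' column currently holds

# Full path used as base path, as folder, and inside the file name:
broken = concat_like_error(path_img, full_name, folder=path_img)

# Bare file name in the column and no second copy of the folder:
bare_name = PurePosixPath(full_name).name
fixed = concat_like_error(path_img, bare_name)

# The label can be read from the bare name instead of a fixed character offset.
label = bare_name.split(".")[0]

print(broken)
print(fixed)
print(label)
```

If that assumption holds, the dataframe itself would look fine (as it does here) and the duplication would only appear once the pieces are joined. Storing only the bare file name in the `name` column, and not passing the same directory twice, would then be one thing to try.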
Any ideas for things I can try?