x,y = next(iter(data.val_dl)) returns "No such file or directory" error


#1

Hi,
I am trying to adapt the code from the planet notebook to another dataset that I downloaded from Kaggle. The planet notebook runs fine. However, when I run x,y = next(iter(data.val_dl)) on my own dataset, it keeps returning "No such file or directory", even though almost everything is set up the same way as in the planet competition. I have looked for answers but still can't work out what is wrong. Thank you for any ideas or suggestions.

Here's the error message:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
<ipython-input-165-58f4f6117e2c> in <module>()
----> 1 x,y = next(iter(data.val_dl))

~/fastai/courses/dl1/fastai/dataloader.py in __iter__(self)
 86                 # avoid py3.6 issue where queue is infinite and can result in memory exhaustion
 87                 for c in chunk_iter(iter(self.batch_sampler), self.num_workers*10):
---> 88                     for batch in e.map(self.get_batch, c):
 89                         yield get_tensor(batch, self.pin_memory, self.half)
 90 

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in result_iterator()
584                     # Careful not to keep a reference to the popped future
585                     if timeout is None:
--> 586                         yield fs.pop().result()
587                     else:
588                         yield fs.pop().result(end_time - time.time())

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in result(self, timeout)
423                 raise CancelledError()
424             elif self._state == FINISHED:
--> 425                 return self.__get_result()
426 
427             self._condition.wait(timeout)

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/_base.py in __get_result(self)
382     def __get_result(self):
383         if self._exception:
--> 384             raise self._exception
385         else:
386             return self._result

~/src/anaconda3/envs/fastai/lib/python3.6/concurrent/futures/thread.py in run(self)
 54 
 55         try:
---> 56             result = self.fn(*self.args, **self.kwargs)
 57         except BaseException as exc:
 58             self.future.set_exception(exc)

~/fastai/courses/dl1/fastai/dataloader.py in get_batch(self, indices)
 73 
 74     def get_batch(self, indices):
---> 75         res = self.np_collate([self.dataset[i] for i in indices])
 76         if self.transpose:   res[0] = res[0].T
 77         if self.transpose_y: res[1] = res[1].T

~/fastai/courses/dl1/fastai/dataloader.py in <listcomp>(.0)
 73 
 74     def get_batch(self, indices):
---> 75         res = self.np_collate([self.dataset[i] for i in indices])
 76         if self.transpose:   res[0] = res[0].T
 77         if self.transpose_y: res[1] = res[1].T

~/fastai/courses/dl1/fastai/dataset.py in __getitem__(self, idx)
166             xs,ys = zip(*[self.get1item(i) for i in range(*idx.indices(self.n))])
167             return np.stack(xs),ys
--> 168         return self.get1item(idx)
169 
170     def __len__(self): return self.n

~/fastai/courses/dl1/fastai/dataset.py in get1item(self, idx)
159 
160     def get1item(self, idx):
--> 161         x,y = self.get_x(idx),self.get_y(idx)
162         return self.get(self.transform, x, y)
163 

~/fastai/courses/dl1/fastai/dataset.py in get_x(self, i)
243         super().__init__(transform)
244     def get_sz(self): return self.transform.sz
--> 245     def get_x(self, i): return open_image(os.path.join(self.path, self.fnames[i]))
246     def get_n(self): return len(self.fnames)
247 

~/fastai/courses/dl1/fastai/dataset.py in open_image(fn)
219     flags = cv2.IMREAD_UNCHANGED+cv2.IMREAD_ANYDEPTH+cv2.IMREAD_ANYCOLOR
220     if not os.path.exists(fn) and not str(fn).startswith("http"):
--> 221         raise OSError('No such file or directory: {}'.format(fn))
222     elif os.path.isdir(fn) and not str(fn).startswith("http"):
223         raise OSError('Is a directory: {}'.format(fn))

OSError: No such file or directory: /home/ubuntu/fastai/sample/train/10

(Poonam Ligade) #2

Can you check whether this works:

ls /home/ubuntu/fastai/sample/train/10

It looks like that particular directory is not there.


#3

Hi PoonamV,
Thank you for your reply.
You are right that there's no file named 10 in my train directory. I have my training image filenames and labels in the train_label.csv, and used

def get_data(sz):
    tfms = tfms_from_model(f_model, sz, aug_tfms=transforms_side_on, max_zoom=1.05)
    return ImageClassifierData.from_csv(PATH, 'train', f'{PATH}trn_label2.csv', tfms=tfms, val_idxs=val_idxs, test_name='test')

to read them, as we did in the planet notebook.
So I don't understand why it returns this error.

I passed f'{PATH}trn_label2.csv' as the label file because otherwise data = get_data(256) raises the same "no such file" error.


(Poonam Ligade) #4

Can you post the first few lines of your trn_label2.csv so we can verify the column names and the path strings?
The path should be the first column, followed by the label.
If the files have a suffix such as .jpg, you have to specify it:

data = ImageClassifierData.from_csv(PATH, folder='train', csv_fname=f'{PATH}/labels.csv', suffix='.jpg',
                                    test_name='test', tfms=tfms, bs=batch_size, num_workers=4)

The train folder should contain all the image files.
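A quick way to test both conditions before building the data object is to read the first column of the CSV and check that each entry resolves to a file in the image folder. This is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper, not part of fastai, and it assumes the CSV has a single header row.

```python
import csv
import os

def missing_files(csv_path, folder, suffix=""):
    """Return first-column entries of csv_path that are not files in folder."""
    missing = []
    with open(csv_path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            if not row:
                continue  # ignore blank lines
            name = row[0] + suffix
            if not os.path.isfile(os.path.join(folder, name)):
                missing.append(name)
    return missing
```

An empty list means every name in the first column points at a real file. Entries such as 10 or 327 in the result would suggest that the first column does not hold filenames at all.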
Hope this helps


#5

The trn_label2.csv looks like this:
filename label
id_1_labels_[95, 66, 137, 70, 20].jpg 95 66 137 70 20
id_2_labels_[36, 66, 44, 214, 105, 133].jpg 36 66 44 214 105 133
id_3_labels_[170, 66, 97, 153, 105, 138].jpg 170 66 97 153 105 138

Do you mean that in the filename column I should put the exact path, like /home/ubuntu/fastai/sample/train/id_1_labels_[95, 66, 137, 70, 20].jpg?

Yes, my training and validation images are in the train folder.


(Martin) #6

I would also check what filename it is trying to open when it raises the OSError.

~/fastai/courses/dl1/fastai/dataset.py in get_x(self, i)
243         super().__init__(transform)
244     def get_sz(self): return self.transform.sz
--> 245     def get_x(self, i): return open_image(os.path.join(self.path, self.fnames[i]))
246     def get_n(self): return len(self.fnames)
247 

Based on what the error says, I would temporarily edit ~/fastai/courses/dl1/fastai/dataset.py so that get_x looks like this:

def get_x(self, i):
    print(os.path.join(self.path, self.fnames[i])) # FOR TEST ONLY!!! (remove after)
    return open_image(os.path.join(self.path, self.fnames[i]))

That way when you run it, you should see the path to what it is trying to open printed out before the error message. Check if that is the correct path to your image.

Make sure you remove it after! :slight_smile:
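If you would rather not touch the library file, the same information can probably be collected from outside. The traceback shows the dataloader holding a `dataset` attribute, and the dataset joining `self.path` with `self.fnames[i]`, so a sketch along these lines should work. `unresolved_paths` is a hypothetical helper, and reaching the dataset through `data.val_dl.dataset` is an assumption based on that traceback.

```python
import os

def unresolved_paths(path, fnames, limit=5):
    """Return up to `limit` joined paths that do not exist on disk."""
    bad = []
    for name in fnames:
        full = os.path.join(path, str(name))
        if not os.path.exists(full):
            bad.append(full)
            if len(bad) >= limit:
                break  # a handful of examples is enough to spot the pattern
    return bad

# Assumed usage in the notebook (attribute names taken from the traceback):
# ds = data.val_dl.dataset
# print(unresolved_paths(ds.path, ds.fnames))
```

The returned paths are the same strings get_x would try to open, so they can be compared directly with the contents of the train folder.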

If that doesn't help you figure it out, perhaps you could provide the link to the dataset on Kaggle.


#7

Hi Hadus,
I will try it now! Thank you.

The Kaggle competition is this one: iMaterialist Challenge (Fashion) at FGVC5

and I downloaded the training images with this kernel:
https://www.kaggle.com/sshekhar/download-image-progress-resume-multiprocessing


#8

I just ran it. It printed:
/home/ubuntu/fastai/sample/train/10
/home/ubuntu/fastai/sample/train/327
/home/ubuntu/fastai/sample/train/649
/home/ubuntu/fastai/sample/train/951

and I still get the same error:

~/fastai/courses/dl1/fastai/dataset.py in get_x(self, i)
    245     def get_x(self, i):
    246         print(os.path.join(self.path, self.fnames[i])) # FOR TEST ONLY!!! (remove after)
--> 247         return open_image(os.path.join(self.path, self.fnames[i]))
    248     def get_n(self): return len(self.fnames)
    249 

~/fastai/courses/dl1/fastai/dataset.py in open_image(fn)
    219     flags = cv2.IMREAD_UNCHANGED+cv2.IMREAD_ANYDEPTH+cv2.IMREAD_ANYCOLOR
    220     if not os.path.exists(fn) and not str(fn).startswith("http"):
--> 221         raise OSError('No such file or directory: {}'.format(fn))
    222     elif os.path.isdir(fn) and not str(fn).startswith("http"):
    223         raise OSError('Is a directory: {}'.format(fn))

`OSError: No such file or directory: /home/ubuntu/fastai/sample/train/10`

I also checked my CSV and train folder again; all the files are named like id_1_labels_[95, 66, 137, 70, 20].jpg


(Martin) #9

Where did you get the csv from? I didn’t find it on Kaggle nor do I think the image downloader script creates it.

I don’t think that dataset is meant to be loaded with ImageClassifierData.from_csv

Your images are in the right place but it isn’t trying to open the right filenames…


#10

Hi Hadus,
I made the CSV myself.
Thank you for your reply, I just figured out what went wrong :joy:
When I wrote the CSV with to_csv, I didn't set index=False, so the first column of the file is the DataFrame index (it shows up as an Unnamed: column when read back). from_csv then took that index column as the filenames.
Thank you again for all of your help!
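For anyone hitting the same thing, here is a small sketch of the difference, assuming pandas is installed. The file names and labels are made up for illustration, and `first_column` is a hypothetical helper that only looks at the raw CSV text.

```python
import io
import pandas as pd

# Hypothetical two-row label table, same layout as in this thread.
df = pd.DataFrame({
    "filename": ["id_1.jpg", "id_2.jpg"],
    "label": ["95 66", "36 44"],
})

def first_column(csv_text):
    """Return (header, first value) of the first column of raw CSV text."""
    lines = csv_text.splitlines()
    return lines[0].split(",")[0], lines[1].split(",")[0]

with_index = df.to_csv()                 # default: the row index is written first
without_index = df.to_csv(index=False)   # filenames stay in the first column

print(first_column(with_index))      # ('', '0')
print(first_column(without_index))   # ('filename', 'id_1.jpg')

# Reading the first version back shows the extra column's name.
print(list(pd.read_csv(io.StringIO(with_index)).columns))
# ['Unnamed: 0', 'filename', 'label']
```

With the default settings the first column holds row numbers, which matches the paths in the error (train/10, train/327 and so on). Writing the file with index=False puts the filename column first again.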


#11

@PoonamV thank you for your help and advice!


(Martin) #12

You are very welcome! :smile: