Not a Directory error in CIFAR10 exercise

(Prajjwal) #1

get_data(32,4) brings up an error
NotADirectoryError: [Errno 20] Not a directory: ‘data/cifar10/train/0_frog.png’
How do I fix this?

(Ramesh Sampath) #2

Can you be a bit more specific? Which notebook and location? Any screenshots?

Setup: Are you using a local system (git clone?), Paperspace, Crestle, or some other environment?

Try a few things:

  1. !pwd - to see what is the current working directory
  2. !ls or !dir to see what’s in your current working dir. Do you see a folder called data? Then do the same for subfolders.
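If you would rather run the same checks from a notebook cell without shell commands, here is a minimal Python sketch. The helper name `describe_dir` is made up for illustration, and the paths in the loop are only guesses based on the path in the error message, so adjust them to your own layout.

```python
import os

def describe_dir(path, limit=5):
    """Return (kind, sample): kind is 'missing', 'file' or 'dir';
    sample holds up to `limit` entry names when path is a directory."""
    if not os.path.exists(path):
        return "missing", []
    if not os.path.isdir(path):
        return "file", []
    return "dir", sorted(os.listdir(path))[:limit]

if __name__ == "__main__":
    # Same information as !pwd
    print("cwd:", os.getcwd())
    # Same idea as running !ls on the data folder and its subfolders
    # (assumed paths, taken from the error message above)
    for p in ("data", "data/cifar10", "data/cifar10/train"):
        print(p, describe_dir(p))
```

If `data/cifar10/train` lists `.png` files directly instead of one folder per class, that matches the situation in the error message.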

(Prajjwal) #4

I am in the dl1 folder. I downloaded and unzipped the dataset into the /data directory. I am not sure where the error lies. I ran the notebook on a CPU as well as on a GPU cluster, but the error persists.

(Prajjwal) #5

What could be the reason? Is it specific to fastai?

(Jason McGhee) #6

Seeing this as well. I solved it by making folders for classes in “train” and “test”.

Remember how we did cats and dogs in lesson 1?
train/cats and train/dogs

I wanted to see what classes we had:
cd train && find . | grep -o '[a-z]*\.png' | sort -u && cd ..

We have: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck
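The same class listing can be sketched in Python, assuming the files are named like `0_frog.png` (a number, an underscore, then the class name), as in the error message. The function names `class_from_filename` and `list_classes` are hypothetical helpers, not part of fastai.

```python
import os

def class_from_filename(fname):
    """'train/0_frog.png' -> 'frog' (text after the last underscore)."""
    stem = os.path.splitext(os.path.basename(fname))[0]
    return stem.rsplit("_", 1)[-1]

def list_classes(filenames):
    """Sorted unique class names among the .png files given."""
    return sorted({class_from_filename(f) for f in filenames
                   if f.endswith(".png")})
```

For example, `list_classes(os.listdir('train'))` should give the ten class names if the folder still holds the flat, unorganized files.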

I made new folders
mkdir train_ test_

I went into one of them to make our classes, created the fn to organize the files, and executed it:

  1. cd train_

  2. mkdir airplane automobile bird cat deer dog frog horse ship truck

  3. cd ..

  4. function copytrain { for arg in $@; do cp $(find train -name '*'$arg'.png') train_/$arg/; done; };

  5. copytrain $(ls train_ | grep -o "[a-z]*")

It took a few minutes to run. Lots of files.

Then repeat 1-5, but with test and test_ instead of train and train_

Now it all works. This is because the from_paths method expects one folder per class.

Make sure the new folders you created match the names you provide to from_paths val_name and trn_name.
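To double-check the result before calling from_paths, a small Python sketch like the following can count the files in each class folder and flag any images still sitting loose in the split folder. `summarize_split` is a hypothetical helper, and the `root` and `split` values in the usage note are assumptions to replace with your own PATH and folder names.

```python
import os

def summarize_split(root, split):
    """Return (per_class, loose) for root/split.

    per_class maps each subfolder name to its number of entries;
    loose lists files that are not inside any class folder."""
    split_dir = os.path.join(root, split)
    per_class, loose = {}, []
    for name in sorted(os.listdir(split_dir)):
        full = os.path.join(split_dir, name)
        if os.path.isdir(full):
            per_class[name] = len(os.listdir(full))
        else:
            loose.append(name)
    return per_class, loose
```

Usage might look like `summarize_split('data/cifar10', 'train_')`. A healthy layout should show ten class folders with non-zero counts and an empty `loose` list, for both the training and the validation folder.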

(Prajjwal) #8

Thanks, worked like a charm!


Oh wow. This didn’t come up when I searched for the same error at about the same time.

(Andrew Holmgren) #10

If someone wants to do this in Python (I went with Python since I’m on Windows), this is the code I used. It assumes your .py file or notebook file is located in the courses/dl1 directory:

import os
import glob
import shutil
classes = ('airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
cwd = os.getcwd()
train_path = cwd + '/data/cifar/train/'
# go through classes and make a directory for each one
for class_now in classes:
    path_now = train_path + class_now
    if not os.path.exists(path_now):
        os.makedirs(path_now)
# go through classes and match them with file names
# file names are e.g. '123_frog.png' so glob picks out all the e.g. frog files
for class_now in classes:
    identifier = train_path + '*' + class_now + '.png'
    class_files = glob.glob(identifier)
    file_destination = train_path + class_now
    # move all frog files to proper class directory
    for file_to_move in class_files:
        shutil.move(file_to_move, file_destination)

# do all the same but now for the test data
test_path = cwd + '/data/cifar/test/'
for class_now in classes:
    path_now = test_path + class_now
    if not os.path.exists(path_now):
        os.makedirs(path_now)
for class_now in classes:
    identifier = test_path + '*' + class_now + '.png'
    class_files = glob.glob(identifier)
    file_destination = test_path + class_now
    for file_to_move in class_files:
        shutil.move(file_to_move, file_destination)

(Alison Bird) #11

Thank you! This is so helpful

(Jason McGhee) #12

I’m also on Windows. I use Git Bash to be able to execute bash scripts.

(Aditya) #13

If you are on 64-bit Windows 10, search for Windows Subsystem for Linux.

(Andreas Daiminger) #14

Hi @jsonm
Thanks a lot. This was very helpful. But I ran into a different error after following your instructions.
This is the error I get when I run data = get_data(32,4):

ValueError                                Traceback (most recent call last)
<ipython-input-47-6a185ac353fc> in <module>()
----> 1 data = get_data(32,4)

<ipython-input-45-88c9e0487857> in get_data(sz, bs)
      1 def get_data(sz,bs):
      2     tfms = tfms_from_stats(stats, sz, aug_tfms=[RandomFlip()], pad=sz//8)
----> 3     return ImageClassifierData.from_paths(PATH, val_name='test', tfms=tfms, bs=bs)

~/fastai/courses/dl1/fastai/ in from_paths(cls, path, bs, tfms, trn_name, val_name, test_name, test_with_labels, num_workers)
    423             test = folder_source(path, test_name) if test_with_labels else read_dir(path, test_name)
    424         else: test = None
--> 425         datasets = cls.get_ds(FilesIndexArrayDataset, trn, val, tfms, path=path, test=test)
    426         return cls(path, datasets, bs, num_workers, classes=trn[2])

~/fastai/courses/dl1/fastai/ in get_ds(fn, trn, val, tfms, test, **kwargs)
    362         res = [
    363             fn(trn[0], trn[1], tfms[0], **kwargs), # train
--> 364             fn(val[0], val[1], tfms[1], **kwargs), # val
    365             fn(trn[0], trn[1], tfms[1], **kwargs), # fix
    366             fn(val[0], val[1], tfms[0], **kwargs)  # aug

~/fastai/courses/dl1/fastai/ in __init__(self, fnames, y, transform, path)
    259         self.y=y
    260         assert(len(fnames)==len(y))
--> 261         super().__init__(fnames, transform, path)
    262     def get_y(self, i): return self.y[i]
    263     def get_c(self):

~/fastai/courses/dl1/fastai/ in __init__(self, fnames, transform, path)
    235     def __init__(self, fnames, transform, path):
    236         self.path,self.fnames = path,fnames
--> 237         super().__init__(transform)
    238     def get_sz(self): return
    239     def get_x(self, i): return open_image(os.path.join(self.path, self.fnames[i]))

~/fastai/courses/dl1/fastai/ in __init__(self, transform)
    154         self.transform = transform
    155         self.n = self.get_n()
--> 156         self.c = self.get_c()
    157 = self.get_sz()

~/fastai/courses/dl1/fastai/ in get_c(self)
    266 class FilesIndexArrayDataset(FilesArrayDataset):
--> 267     def get_c(self): return int(self.y.max())+1

~/anaconda3/envs/fastai/lib/python3.6/site-packages/numpy/core/ in _amax(a, axis, out, keepdims)
     24 # small reductions
     25 def _amax(a, axis=None, out=None, keepdims=False):
---> 26     return umr_maximum(a, axis, None, out, keepdims)
     28 def _amin(a, axis=None, out=None, keepdims=False):

ValueError: zero-size array to reduction operation maximum which has no identity