Gradient/Paperspace can't load Datasets anymore

I was loading Imagenette from lesson 07 and got an error, then tried the small MNIST dataset (I thought maybe it was the size), but neither worked. The error is below. It looks to me like they changed something about how data is downloaded into their filesystem, i.e. the temporary vs. the persistent one, since as far as I understand a rename error like this is what happens in that case. Any help much appreciated.

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
/opt/conda/lib/python3.7/shutil.py in move(src, dst, copy_function)
    565     try:
--> 566         os.rename(src, real_dst)
    567     except OSError:

OSError: [Errno 18] Invalid cross-device link: '/tmp/tmpu8704f5v/imagenette2/imagenette2' -> '/storage/data/imagenette2'

During handling of the above exception, another exception occurred:

FileExistsError                           Traceback (most recent call last)
/tmp/ipykernel_47/957957985.py in <module>
----> 1 path = untar_data(URLs.IMAGENETTE)

/opt/conda/lib/python3.7/site-packages/fastai/data/external.py in untar_data(url, archive, data, c_key, force_download)
    122     "Download `url` to `fname` if `dest` doesn't exist, and extract to folder `dest`"
    123     d = FastDownload(fastai_cfg(), module=fastai.data, archive=archive, data=data, base='~/.fastai')
--> 124     return d.get(url, force=force_download, extract_key=c_key)

/opt/conda/lib/python3.7/site-packages/fastdownload/core.py in get(self, url, extract_key, force)
    120             if data.exists(): return data
    121         self.download(url, force=force)
--> 122         return self.extract(url, extract_key=extract_key, force=force)

/opt/conda/lib/python3.7/site-packages/fastdownload/core.py in extract(self, url, extract_key, force)
    112         dest = self.data_path(extract_key)
    113         dest.mkdir(exist_ok=True, parents=True)
--> 114         return untar_dir(arch, dest, rename=True, overwrite=force)
    115 
    116     def get(self, url, extract_key='data', force=False):

/opt/conda/lib/python3.7/site-packages/fastcore/xtras.py in untar_dir(fname, dest, rename, overwrite)
    231             else: return dest
    232         if rename: src = _unpack(fname, out)
--> 233         shutil.move(str(src), dest)
    234         return dest
    235 

/opt/conda/lib/python3.7/shutil.py in move(src, dst, copy_function)
    575                             " '%s'." % (src, dst))
    576             copytree(src, real_dst, copy_function=copy_function,
--> 577                      symlinks=True)
    578             rmtree(src)
    579         else:

/opt/conda/lib/python3.7/shutil.py in copytree(src, dst, symlinks, ignore, copy_function, ignore_dangling_symlinks)
    322         ignored_names = set()
    323 
--> 324     os.makedirs(dst)
    325     errors = []
    326     for name in names:

/opt/conda/lib/python3.7/os.py in makedirs(name, mode, exist_ok)
    221             return
    222     try:
--> 223         mkdir(name, mode)
    224     except OSError:
    225         # Cannot rely on checking for EEXIST, since the operating system

FileExistsError: [Errno 17] File exists: '/storage/data/imagenette2'
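
From the traceback it looks like os.rename can't move the extracted folder from /tmp (the temporary filesystem) to /storage (the persistent one) because they are different filesystems, so shutil.move falls back to copytree, and copytree then fails because /storage/data/imagenette2 already exists, presumably left over from an earlier partial extraction. A quick check of what is already sitting there (the path is copied from the traceback, so adjust it if yours differs):

from pathlib import Path

# path taken from the traceback above; change it if your setup differs
dest = Path('/storage/data/imagenette2')
print(dest.exists())
if dest.exists(): print(list(dest.iterdir())[:5])   # peek at whatever is left over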

I also got the same error while untarring the MNIST_Sample dataset.

I tried removing the existing dataset directory and it worked.
Please refer to the thread below.


Life saver, thanks!

It's essentially what the linked thread shows, but here's what I did:

I ran this in a notebook cell (in Paperspace):
!rm -rf /storage/data/imagenette2
and then this worked again:
path = untar_data(URLs.IMAGENETTE)
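
If you'd rather stay in Python instead of using a shell command, something like this should be equivalent; the path is the one from the traceback, so adjust it if your storage location differs:

import shutil
from fastai.data.external import untar_data, URLs

# delete the leftover (possibly partial) extraction that blocks the move
shutil.rmtree('/storage/data/imagenette2', ignore_errors=True)

path = untar_data(URLs.IMAGENETTE)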

I'm still a little confused, as I had obviously re-run the untar command many times before when restarting a notebook, and only now was there an issue. My guess from the traceback is that untar_data normally returns early when the extracted data already exists (the if data.exists(): return data check in fastdownload), so the error only appears when that check doesn't match what's actually on disk, e.g. when an interrupted download or extraction left a partial directory behind.
