I am working on running the WGAN notebook and getting the data prepared to do so. I’ve decided to start with the 20 percent subset of the data that @jeremy has created on Kaggle.
So far, I have encountered the following. The downloaded file is lsun_bedroom.zip. Unzipping it produces a single new file called sample.zip. Unzipping sample.zip creates a directory called data0, which contains a series of nested, numbered directories.
This means that to actually get to an image file I end up with a path that follows a structure similar to the following:
{Path}/data0/bedroom/lsun/0/0/0/<bedroom1.jpg>
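For anyone who would rather script the two unzip steps than do them by hand, here is a minimal sketch using the standard-library zipfile module. The helper name extract_nested and the dest argument are hypothetical, and the inner archive name sample.zip is taken from the description above; adjust both to whatever your download actually contains.

```python
import zipfile
from pathlib import Path

def extract_nested(outer_zip: Path, dest: Path, inner_name: str = 'sample.zip') -> Path:
    """Extract outer_zip into dest, then extract the inner archive it contains.

    inner_name is an assumption based on the thread (lsun_bedroom.zip -> sample.zip).
    """
    with zipfile.ZipFile(outer_zip) as outer:
        outer.extractall(dest)       # leaves dest/sample.zip on disk
    inner_zip = dest / inner_name
    with zipfile.ZipFile(inner_zip) as inner:
        inner.extractall(dest)       # leaves dest/data0/... on disk
    return dest
```

Something like extract_nested(Path('lsun_bedroom.zip'), Path('data/lsun')) would then leave the data0 tree under data/lsun, relative to wherever the script is run.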
I get the following error when I run cell 3:
PATH = Path('data/lsun/')
IMG_PATH = PATH/'bedroom'
CSV_PATH = PATH/'files.csv'
TMP_PATH = PATH/'tmp'
TMP_PATH.mkdir(exist_ok=True)
...
FileNotFoundError: [Errno 2] No such file or directory: 'data/lsun/tmp'
I get the same error message regardless of whether I explicitly create a ‘tmp’ directory or not.
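For reference, one common way to get this exact error (I can't be sure it is what happened here) is that the parent directory does not exist relative to the current working directory. Path.mkdir(exist_ok=True) only tolerates the target already existing; it still raises FileNotFoundError when a parent is missing, unless parents=True is passed. A minimal sketch in a scratch directory, with try_mkdir as a hypothetical helper:

```python
import tempfile
from pathlib import Path

def try_mkdir(base: Path) -> str:
    """Try to create base/data/lsun/tmp the same way the notebook cell does."""
    tmp_path = base / 'data' / 'lsun' / 'tmp'
    try:
        tmp_path.mkdir(exist_ok=True)   # requires base/data/lsun to exist already
        return 'created'
    except FileNotFoundError:
        return 'parent missing'

with tempfile.TemporaryDirectory() as scratch:
    base = Path(scratch)
    first = try_mkdir(base)                        # data/lsun not there yet
    (base / 'data' / 'lsun').mkdir(parents=True)   # create the parents explicitly
    second = try_mkdir(base)                       # now the call succeeds
    third = try_mkdir(base)                        # exist_ok=True: no error on repeat

print(first, second, third)
```

If this is the cause, creating a tmp directory somewhere else on the file system would not help, because the notebook is still looking for data/lsun under its own working directory.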
I got around the first error by explicitly creating the tmp directory via the terminal and commenting out the TMP_PATH.mkdir(exist_ok=True) line. I also changed files = PATH.glob('bedroom/**/*.jpg') to the one in the screenshot below. That led to a second error:
FileNotFoundError: [Errno 2] No such file or directory: 'data/lsun/files.csv'
I’m not sure exactly what the issue is, but I suspect it may have something to do with the permission settings of my Docker instance as they relate to the larger file system. I am using @Matthew’s Docker container via his setup, outlined here.
UPDATE–
Turns out it had nothing to do with permissions. I changed the working directory explicitly from ‘/code/fastai/courses/dl2’ to ‘/data/lsun’, which seems to have solved the issue.
From your latest update it appears that you don’t have the data directory as a subdirectory of the courses/dl2 directory. The paths in the notebook are relative paths, not absolute paths, so you need to have a file structure like /code/fastai/courses/dl2/data/lsun/ and then start Jupyter Notebook from within the dl2 directory. That’s what I do, and everything looks to be working fine for me with the WGAN notebook.
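To make the relative-path point concrete, here is a small sketch of where Path('data/lsun/') ends up pointing for a given starting directory. The locate helper is hypothetical; the directory names are the ones mentioned in this thread.

```python
from pathlib import Path

def locate(relative: Path, start: Path) -> Path:
    """Absolute location that `relative` refers to for a process running in `start`."""
    return start / relative

PATH = Path('data/lsun/')                        # relative, as in the notebook
notebook_dir = Path('/code/fastai/courses/dl2')  # where the notebook lives

print(locate(PATH, notebook_dir))                # /code/fastai/courses/dl2/data/lsun

# For the live session, check the actual working directory instead:
here = locate(PATH, Path.cwd())
print(here, here.exists())
```

If the last line prints False, the notebook will not find the data from the current working directory, whatever the permissions are.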
Also, the ** in files = PATH.glob('bedroom/**/*.jpg') should do a recursive search, so you don’t need to change that either.
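A quick way to convince yourself of the recursive behaviour is to try the same pattern on a throwaway directory tree. The file names below are made up for illustration; only the glob pattern comes from the notebook.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as scratch:
    root = Path(scratch)
    nested = root / 'bedroom' / '0' / '0' / '0'   # mimic the nested numbered dirs
    nested.mkdir(parents=True)
    (nested / 'a.jpg').touch()                    # deep inside the tree
    (root / 'bedroom' / 'b.jpg').touch()          # directly under bedroom/
    (nested / 'notes.txt').touch()                # wrong extension, should be skipped

    # '**' matches bedroom/ itself and every directory below it
    found = sorted(p.relative_to(root).as_posix()
                   for p in root.glob('bedroom/**/*.jpg'))

print(found)  # ['bedroom/0/0/0/a.jpg', 'bedroom/b.jpg']
```

Both the deeply nested file and the one directly under bedroom/ are picked up, so the depth of the numbered directories should not matter.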
I actually kept the file structure that I started with, so the data dir does not live inside of dl2. As I mentioned, the single line of code os.chdir(PATH) solved the issue.
Regarding the creation of files.csv, you are correct: files = PATH.glob('bedroom/**/*.jpg') does, in fact, work. Yesterday, I was trying everything I could think of to get things working.