WGAN Data Prep - Lesson 12

I am working on running the WGAN notebook and preparing the data for it. I’ve decided to start with the 20 percent subset of the data that @jeremy created on Kaggle.

So far, I have encountered the following. The downloaded file is lsun_bedroom.zip. Unzipping it produces a single new file called sample.zip. Unzipping sample.zip creates a directory called data0, which contains a series of nested numbered directories.

This means that to actually get to an image file I end up with a path that follows a structure similar to this:

{Path}/data0/bedroom/lsun/0/0/0/<bedroom1.jpg>

I get the following error when I run cell 3:

PATH = Path('data/lsun/')
IMG_PATH = PATH/'bedroom'
CSV_PATH = PATH/'files.csv'
TMP_PATH = PATH/'tmp'
TMP_PATH.mkdir(exist_ok=True)
...
FileNotFoundError: [Errno 2] No such file or directory: 'data/lsun/tmp'

I get the same error message regardless of whether I explicitly create a ‘tmp’ directory or not.

I got around the first error by explicitly creating the tmp directory via the terminal and commenting out the TMP_PATH.mkdir(exist_ok=True) line. I also changed the files = PATH.glob('bedroom/**/*.jpg') line to the one in the screenshot below.

The error I am getting now is:

FileNotFoundError: [Errno 2] No such file or directory: 'data/lsun/files.csv'

I’m not sure exactly what the issue is, but I suspect it may have something to do with the permission settings of my Docker instance as they relate to the larger file system. I am using @Matthew’s Docker container via his setup, outlined here.

UPDATE–

Turns out it had nothing to do with permissions. I changed the working directory explicitly from ‘/code/fastai/courses/dl2’ to ‘/data/lsun’, which seems to have solved the issue.
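For anyone hitting the same thing: this matches how pathlib treats relative paths. As far as I can tell, mkdir(exist_ok=True) only suppresses the error when the target already exists; it still raises FileNotFoundError if the parent directory cannot be found from the current working directory. Below is a minimal sketch of that behaviour. The temporary directory and the 'project' and 'elsewhere' folder names are made up for illustration and are not part of the notebook or of my setup.

```python
import os
import tempfile
from pathlib import Path


def try_mkdir(path):
    """Return True if mkdir succeeded, False if a parent directory was missing."""
    try:
        path.mkdir(exist_ok=True)
        return True
    except FileNotFoundError:
        return False


original_cwd = Path.cwd()
with tempfile.TemporaryDirectory() as root:
    root = Path(root)
    # Hypothetical layout: the data lives under <root>/project/data/lsun
    (root / 'project' / 'data' / 'lsun').mkdir(parents=True)
    (root / 'elsewhere').mkdir()

    tmp_path = Path('data/lsun') / 'tmp'  # relative path, as in the notebook

    os.chdir(root / 'elsewhere')
    failed_result = try_mkdir(tmp_path)   # data/lsun is not visible from here

    os.chdir(root / 'project')
    ok_result = try_mkdir(tmp_path)       # data/lsun exists relative to this cwd

    os.chdir(original_cwd)                # step out before the temp dir is removed

print(failed_result, ok_result)
```

Passing parents=True to mkdir would create the missing parents instead of raising, but in the wrong working directory that would just build an empty data/lsun tree, so fixing the working directory seems like the safer route.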

Still trying to understand the reason for the atypical file structure of the data (i.e. 'bedroom/**/*.jpg').

From your latest update it appears that you don’t have the data directory as a subdirectory of the courses/dl2 directory. The paths in the notebook are relative, not absolute, so you need a file structure like /code/fastai/courses/dl2/data/lsun/ and you need to start jupyter notebook from within the dl2 directory. That’s what I do, and everything works fine for me with the WGAN notebook.
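A quick way to check this from inside the notebook is to print where the relative path actually points. This is only a diagnostic sketch; the variable names cwd and resolved are mine, not the notebook's.

```python
from pathlib import Path

PATH = Path('data/lsun/')  # same relative path as in the notebook

cwd = Path.cwd()                   # directory the notebook kernel is running in
resolved = (cwd / PATH).resolve()  # where the relative PATH actually points

print('Working directory:', cwd)
print('PATH resolves to: ', resolved)
print('Exists:           ', resolved.exists())
```

If the last line prints False, the notebook is being run from a directory that does not contain data/lsun.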

Also, the ** in files = PATH.glob('bedroom/**/*.jpg') does a recursive search, so you don’t need to change that line either.
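To illustrate the recursive behaviour, here is a small sketch that builds a throwaway nested directory and globs it with the same pattern. The file names (a.jpg, b.jpg, top.jpg, notes.txt) and the temporary directory are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    PATH = Path(root)
    nested = PATH / 'bedroom' / '0' / '0' / '0'
    nested.mkdir(parents=True)

    # Made-up files at different depths, plus one non-jpg file
    (nested / 'a.jpg').touch()
    (PATH / 'bedroom' / '0' / 'b.jpg').touch()
    (PATH / 'bedroom' / 'top.jpg').touch()
    (nested / 'notes.txt').touch()

    # '**' matches zero or more directory levels, so every depth is searched
    found = sorted(
        p.relative_to(PATH).as_posix()
        for p in PATH.glob('bedroom/**/*.jpg')
    )

print(found)
```

All three .jpg files are picked up regardless of depth, and notes.txt is skipped.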

I actually kept the file structure that I started with, so the data dir does not live inside of dl2. As I mentioned, the single line os.chdir(PATH) solved the issue.

Regarding the creation of files.csv, you are correct: files = PATH.glob('bedroom/**/*.jpg') does, in fact, work. Yesterday, I was trying everything I could think of to get things working.
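For reference, here is a rough sketch of how the glob result can be turned into files.csv. The CSV layout (image path relative to IMG_PATH, followed by a dummy label of 0) is my assumption about what the notebook expects, and the temporary directory and file names are placeholders.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as root:
    PATH = Path(root)
    IMG_PATH = PATH / 'bedroom'
    CSV_PATH = PATH / 'files.csv'

    # Placeholder images in a nested folder
    nested = IMG_PATH / '0' / '0'
    nested.mkdir(parents=True)
    (nested / 'x.jpg').touch()
    (nested / 'y.jpg').touch()

    files = sorted(PATH.glob('bedroom/**/*.jpg'))

    # Assumed format: one line per image, "<path relative to IMG_PATH>,0"
    with CSV_PATH.open('w') as fo:
        for f in files:
            fo.write(f'{f.relative_to(IMG_PATH).as_posix()},0\n')

    csv_lines = CSV_PATH.read_text().splitlines()

print(csv_lines)
```

Note that CSV_PATH.open('w') raises the same FileNotFoundError as before if the directory behind PATH cannot be found from the current working directory.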

The Result…

image
