@jeremy has a nice workflow for moving files around in his different notebooks, but it rarely worked perfectly for me because my setup was slightly different, or I made a typo, or whatever. So I’d have to start over, and the bash commands weren’t easily repeatable, even with a bash script.
How are you dealing with this? I ended up writing this Python script, which takes the raw cats/dogs input we downloaded and moves it into the right directories for training/validation/sampling.
You can use it like this:
>>> import prep_data as prep
>>> prep.main('~/nbs/data/redux', '~/nbs/data/redux/clean')
Copied 11500 dogs files
Copied 1000 dogs files
Sampling dogs
Copied 100 dogs files
Copied 20 dogs files
Copied 11500 cats files
Copied 1000 cats files
Sampling cats
Copied 100 cats files
Copied 20 cats files
Copying test data
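In case it helps anyone who wants to roll their own, here is a minimal sketch of how a script like this could be structured. It is not the actual prep_data code, just an illustration under some assumptions: the raw download has `train/` and `test/` folders, training files are named Kaggle-style (`dog.123.jpg`, `cat.42.jpg`), and the output layout (`train/`, `valid/`, `sample/`, `test/unknown/`) and all function and parameter names are my own guesses. It copies rather than moves, so the raw data stays intact and you can rerun it after a mistake.

```python
import random
import shutil
from pathlib import Path

# Assumed mapping from output folder name to Kaggle-style filename prefix.
PREFIXES = {"dogs": "dog", "cats": "cat"}


def copy_files(files, dest, label):
    """Copy files into dest (created if missing) and print how many were copied."""
    dest.mkdir(parents=True, exist_ok=True)
    for f in files:
        shutil.copy2(f, dest / f.name)
    print(f"Copied {len(files)} {label} files")


def split_class(raw_train, out, label, n_valid, n_sample_train, n_sample_valid, seed):
    """Split one class into train/valid, plus a small sample of each."""
    files = sorted(raw_train.glob(f"{PREFIXES[label]}.*.jpg"))
    # Seeded shuffle so repeated runs produce the same split.
    random.Random(seed).shuffle(files)
    valid, train = files[:n_valid], files[n_valid:]
    copy_files(train, out / "train" / label, label)
    copy_files(valid, out / "valid" / label, label)
    print(f"Sampling {label}")
    copy_files(train[:n_sample_train], out / "sample" / "train" / label, label)
    copy_files(valid[:n_sample_valid], out / "sample" / "valid" / label, label)


def main(raw_dir, out_dir, n_valid=1000, n_sample_train=100,
         n_sample_valid=20, seed=0):
    """Build the directory layout under out_dir from the raw download in raw_dir."""
    raw = Path(raw_dir).expanduser()
    out = Path(out_dir).expanduser()
    for label in PREFIXES:
        split_class(raw / "train", out, label,
                    n_valid, n_sample_train, n_sample_valid, seed)
    print("Copying test data")
    # Assumption: test images go in a single dummy class folder.
    test_dest = out / "test" / "unknown"
    test_dest.mkdir(parents=True, exist_ok=True)
    for f in sorted((raw / "test").glob("*.jpg")):
        shutil.copy2(f, test_dest / f.name)
```

Because the shuffle is seeded and nothing in the raw folder is modified, you can delete the output directory and call `main` again to get the same split.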