When I was applying image classification to my own images, I couldn’t find a Python package to easily split folders into training, validation, and test sets. So I created one: https://github.com/jfilter/split-folders
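For anyone curious what this kind of split involves, here is a minimal standard-library sketch. It is not the code from the split-folders package; the function name `split_class_folders`, its parameters, and the `input/<class>/<file>` layout are assumptions made for illustration.

```python
import random
import shutil
from pathlib import Path


def split_class_folders(src, dst, ratio=(0.8, 0.1, 0.1), seed=1337):
    """Copy src/<class>/<file> into dst/{train,val,test}/<class>/<file>.

    Hypothetical helper: assumes one sub-folder per class inside `src`.
    """
    src, dst = Path(src), Path(dst)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    for class_dir in sorted(p for p in src.iterdir() if p.is_dir()):
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_train = int(len(files) * ratio[0])
        n_val = int(len(files) * ratio[1])
        parts = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # whatever is left over
        }
        for part, part_files in parts.items():
            out_dir = dst / part / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in part_files:
                shutil.copy2(f, out_dir / f.name)
```

Calling `split_class_folders('input', 'output')` would leave the originals untouched and write copies under `output/train`, `output/val`, and `output/test`. Shuffling per class keeps the class proportions roughly the same in every split.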
Interesting! I was trying to build something similar here, though my idea was to build a small toolkit with various helpers (visualizations, etc.) that can help during notebook work.
The project is a bit stale right now, though I would like to continue development soon =) I want to delete some old stuff, make the API stable, remove the implicit dependencies on Keras and the Kaggle file structure, and so on.
This looks nice, and I could use something like this right now.
In my code snippet database I have this useful script, from DeepLearning-Lec7Notes, for copying the images of the CIFAR dataset into class folders:
import os
import shutil

classes = ('plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

# create sub-folders for each class
OUTPATH = 'data/cifar10/'
for x in classes:
    os.makedirs(OUTPATH+'train/'+x, exist_ok=True)
    os.makedirs(OUTPATH+'val/'+x, exist_ok=True)

INPATH = 'data/cifar/'
filenames = os.listdir(INPATH+'train/')
counts = {x: 0 for x in classes}
print(len(filenames))

# copy files from cifar folder to cifar10 folder with sub-directories
valsz = len(filenames) / 10 * 0.2  # 20% of the files per class (10 classes)
for fl in filenames:
    for cl in classes:
        if cl in fl:  # the class name is part of the file name
            counts[cl] += 1  # increase count +1
            if counts[cl] < valsz:
                shutil.copy(INPATH+'train/'+fl, OUTPATH+'val/'+cl+'/'+fl)
            else:
                shutil.copy(INPATH+'train/'+fl, OUTPATH+'train/'+cl+'/'+fl)
    # 'car' images are named 'automobile', so handle them separately
    if 'automobile' in fl:
        counts['car'] += 1
        if counts['car'] < valsz:  # was counts[x], which pointed at the wrong class
            shutil.copy(INPATH+'train/'+fl, OUTPATH+'val/car/'+fl)
        else:
            shutil.copy(INPATH+'train/'+fl, OUTPATH+'train/car/'+fl)

# copy test set
filenames = os.listdir(INPATH+'test/')
os.makedirs(OUTPATH+'test/', exist_ok=True)
for fl in filenames:
    shutil.copy(INPATH+'test/'+fl, OUTPATH+'test/'+fl)
Maybe this would be useful for adding class-to-directory sorting functionality?
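A rough idea of what a reusable version might look like, as a sketch only. The function name `sort_into_class_dirs`, the `aliases` parameter, and the assumption that the label is the part of the file name after the last underscore (e.g. `123_cat.png`) are all mine, not something taken from split-folders or from the notes above.

```python
import random
import shutil
from pathlib import Path


def sort_into_class_dirs(src, dst, aliases=None, val_fraction=0.2, seed=0):
    """Copy flat files from src into dst/{train,val}/<class>/.

    Assumes file names look like '<id>_<label>.<ext>' (hypothetical layout).
    `aliases` maps a label found in a file name to the folder name to use.
    Returns the number of files found per class.
    """
    aliases = aliases or {}
    by_class = {}
    for f in sorted(Path(src).iterdir()):
        if f.is_file():
            label = f.stem.rsplit("_", 1)[-1]  # text after the last underscore
            by_class.setdefault(aliases.get(label, label), []).append(f)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    for cls, files in by_class.items():
        rng.shuffle(files)
        n_val = int(len(files) * val_fraction)
        for i, f in enumerate(files):
            part = "val" if i < n_val else "train"
            out_dir = Path(dst) / part / cls
            out_dir.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, out_dir / f.name)
    return {cls: len(files) for cls, files in by_class.items()}
```

For the CIFAR layout above, this could be called as `sort_into_class_dirs('data/cifar/train', 'data/cifar10', aliases={'automobile': 'car', 'airplane': 'plane'})`. Compared with the script, the validation files are picked at random per class rather than taken from the start of the directory listing.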
I may look into this, as I might need it for my current pet project.