I put together a flower classifier trained on the Oxford-Flower-102 dataset, which has 102 classes of flowers. I used the train/valid/test split that ships with the dataset. With that split, only 10 images per class are used for training (1020 in total), and the same number for validation.
The test set contains 6149 images.
After fine-tuning a ResNet50 model, without data augmentation, I got 92.26% accuracy on the test set. I think that’s a great result considering how little data was used for training and how many classes there are.
In case someone else wants to use the data, here is the code to split it into train/valid/test after downloading and arrange it in an ImageNet-style folder structure (run it in a Jupyter notebook so that the ! shell commands work):
import scipy.io as sio
import numpy as np
import pandas as pd
data_split = sio.loadmat('raw_data/setid.mat')
image_labels = sio.loadmat('raw_data/imagelabels.mat')['labels'][0]
trn_id = data_split['trnid'][0]
val_id = data_split['valid'][0]
tst_id = data_split['tstid'][0]
data={'filename': np.arange(1, len(image_labels)+1), 'label': image_labels}
split = []
for file_index in data['filename']:
    if file_index in trn_id:
        split.append('train')
    elif file_index in val_id:
        split.append('valid')
    elif file_index in tst_id:
        split.append('test')
data['filename'] = ['image_'+str(n).zfill(5)+'.jpg' for n in data['filename']]
data['split'] = split
# This dataframe has three columns: filename, label, and split (train/test/valid)
image_labels = pd.DataFrame(data)
# Execute once to create dir structure + copy the data from the jpg folder to the new labeled, split folders
# ! mkdir raw_data/train
# ! mkdir raw_data/test
# ! mkdir raw_data/valid
# for split in ['train', 'valid', 'test']:
#     for label in range(102):
#         label = str(label+1)
#         ! mkdir 'raw_data/'$split'/'$label
# # Copy images to their labeled folders
# for index, row in image_labels.iterrows():
#     #print(row['filename'])
#     fname = row['filename']
#     split = row['split']
#     label = row['label']
#     ! cp 'raw_data/jpg/'$fname 'raw_data/'$split'/'$label'/'
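If you'd rather not depend on the notebook shell commands (for example on Windows, or when running as a plain script), the same copy step can probably be done in pure Python. This is only a rough sketch, not what I actually ran: the helper name copy_into_split_folders is made up for illustration, and it assumes each record is a (filename, label, split) tuple, which is the column order of the dataframe built above.

```python
import shutil
from pathlib import Path


def copy_into_split_folders(records, src_dir, dst_root):
    """Copy each image into dst_root/<split>/<label>/.

    records: iterable of (filename, label, split) tuples (assumed order).
    Returns the number of files copied.
    """
    src_dir, dst_root = Path(src_dir), Path(dst_root)
    copied = 0
    for filename, label, split in records:
        target_dir = dst_root / str(split) / str(label)
        # Create the split/label folder on demand; safe to re-run.
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_dir / filename, target_dir / filename)
        copied += 1
    return copied


# Hypothetical usage with the dataframe from above:
# copy_into_split_folders(image_labels.itertuples(index=False),
#                         'raw_data/jpg', 'raw_data')
```

Unlike the mkdir loop, this only creates a label folder once an image lands in it. That should give the same layout here, since every class appears in all three splits.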