Hello all (@miwojc, @digitalspecialists)
Thank you for your patience!
My databunch is finally ready, and a learner based on resnet34 is now running the lr finder.
Here is what I had to do.
1. Move 1970 random images (out of 9850, i.e. approx. 20%) from the train directory to the valid directory.
2. Build a labels.csv with two columns: path + image name, and class name.
3. Create a codes file containing the unique class names from labels.csv.
4. Build the databunch using:
codes = np.loadtxt(path/'codes.txt', dtype=str)
data = (ImageFileList.from_folder(path)
        .label_from_df(df_fn_labels, fn_col=0, label_col=1)
        .split_by_folder()
        .datasets(ImageClassificationDataset, fns=fnames, labels=labels, classes=codes)
        .transform(get_transforms(), size=128)
        .databunch()
        .normalize(imagenet_stats))
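
For anyone wanting to reproduce the first three steps, here is a rough sketch using only the standard library. It is not the exact script I ran. The function name `make_split`, the `name_to_class` dict (image file name to class name), the flat `train/` layout, and the `name,label` header row in labels.csv are all assumptions for illustration, so adjust them to your own data.

```python
import csv
import random
import shutil
from pathlib import Path


def make_split(path, name_to_class, valid_frac=0.2, seed=42):
    """Move a random fraction of train images to valid, then write
    labels.csv and codes.txt under `path`.

    `name_to_class` is a hypothetical dict: image file name -> class name.
    Assumes all images start out directly inside path/train.
    """
    path = Path(path)
    train_dir, valid_dir = path / "train", path / "valid"
    valid_dir.mkdir(exist_ok=True)

    # Sort before sampling so the split is reproducible for a given seed.
    images = sorted(p for p in train_dir.iterdir() if p.is_file())
    n_valid = int(len(images) * valid_frac)
    for img in random.Random(seed).sample(images, n_valid):
        shutil.move(str(img), str(valid_dir / img.name))

    # One row per image: relative path (folder/file name), then class name.
    rows = []
    for folder in (train_dir, valid_dir):
        for img in sorted(folder.iterdir()):
            if img.name in name_to_class:
                rows.append((f"{folder.name}/{img.name}", name_to_class[img.name]))

    # The header row is an assumption; drop it if you read the file without one.
    with open(path / "labels.csv", "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(("name", "label"))
        writer.writerows(rows)

    # codes.txt: the unique class names, one per line.
    codes = sorted({cls for _, cls in rows})
    (path / "codes.txt").write_text("\n".join(codes) + "\n")
    return rows, codes
```

With 9850 images and `valid_frac=0.2` this moves `int(9850 * 0.2) = 1970` files, which matches the numbers above.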
To miwojc’s point, there is an opportunity for the library to automate this, and maybe even to clean things up (the same DataFrame, df_fn_labels, is feeding 2-3 objects).
I am sure the learner’s performance will be poor, because some single-image classes ended up in the validation set and are therefore never seen in training. But at least it’s a start.
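
One possible way around the single-image-class problem, which I have not tried yet, is to reserve one image per class for training before sampling the validation set. The sketch below is only an illustration of that idea: the function name `split_with_train_coverage` and the `name_to_class` dict (image file name to class name) are hypothetical, and it only decides which file names go where, without moving anything.

```python
import random
from collections import defaultdict


def split_with_train_coverage(name_to_class, valid_frac=0.2, seed=42):
    """Return (train_names, valid_names) such that every class keeps at
    least one image in training. `name_to_class` is a hypothetical dict:
    image file name -> class name.
    """
    by_class = defaultdict(list)
    for name in sorted(name_to_class):
        by_class[name_to_class[name]].append(name)

    rng = random.Random(seed)
    train, candidates = [], []
    for cls in sorted(by_class):
        names = by_class[cls]
        rng.shuffle(names)
        train.append(names[0])        # reserved: always stays in training
        candidates.extend(names[1:])  # the rest may be sampled for validation

    # Single-image classes contribute no candidates, so they can never
    # end up in validation only.
    n_valid = min(int(len(name_to_class) * valid_frac), len(candidates))
    valid = set(rng.sample(candidates, n_valid))
    train.extend(n for n in candidates if n not in valid)
    return sorted(train), sorted(valid)
```

The trade-off is that the validation set no longer contains any class that has just one image, so the validation score says nothing about those classes.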