I am currently fine-tuning resnet18 on ~100,000 images across ~7,000 classes, and suffice it to say it is taking a long time to get through an epoch (about 15 minutes each, though I am probably being too impatient). As I was staring down the progress bar, I started thinking that there is clearly a tradeoff between model size on one hand and accuracy and speed on the other.
My question: is there a "rule of thumb" for how many parameters a model should contain for a given training-set size? In my case I have 100,000 images at 600x600x3, which is about 108 billion pixel values in total.
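For reference, the pixel arithmetic works out like this (plain Python sanity check):

```python
# Pixel values per image at 600x600 with 3 channels.
pixels_per_image = 600 * 600 * 3       # 1,080,000 values per image
total_pixels = 100_000 * pixels_per_image
print(total_pixels)                    # 108,000,000,000 = 108 billion
```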
This paper talks about model efficiency but doesn't provide guidance on model size relative to training-set size.
Help me speed up my epochs, please!