Tradeoff between model size and model accuracy/speed

I am currently fine-tuning resnet18 on ~100,000 images across ~7,000 classes and, suffice it to say, it is taking a long time to get through an epoch (about 15 minutes each, but I am probably being too impatient :slight_smile:). As I was staring down the progress bar I started thinking that there is clearly a tradeoff between model size on the one hand and accuracy and training speed on the other.

My question is: is there a “rule of thumb” for how many parameters a model should contain for a given training set size? In my case I have 100,000 images at 600x600x3 each, which works out to about 108 billion pixels in total.
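For context, here is a quick way to put the two numbers side by side — parameter count vs. raw pixel count. A minimal sketch using torchvision's resnet18; the `num_classes=7000` head and the pixel arithmetic are just mirroring my setup, not a recommendation:

```python
import torch
from torchvision import models

# resnet18 with a 7000-way classification head (random init is fine for counting)
model = models.resnet18(num_classes=7000)

# Total trainable parameters
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

# Rough dataset size in pixels: 100,000 images at 600x600x3
n_pixels = 100_000 * 600 * 600 * 3

print(f"resnet18 parameters:  {n_params:,}")   # on the order of 10-15 million
print(f"training pixels:      {n_pixels:,}")   # 108 billion
print(f"pixels per parameter: {n_pixels / n_params:,.0f}")
```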

This paper talks about model efficiency but doesn’t provide guidance on model size relative to training set size.

Help me speed up my epochs please :smile:

Those are very large images to begin with! Do the images still make sense if you scale them down to, say, 128x128? Once you have trained on the small images, you could then return to the large ones.
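Something like the progressive-resizing sketch below, in plain PyTorch/torchvision (fastai has its own way of doing this, and the `"data/train"` path, sizes and batch sizes are placeholders for whatever your dataset actually looks like):

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

def make_loader(size, batch_size):
    # Smaller images -> larger batches and much faster epochs
    tfms = transforms.Compose([
        transforms.Resize((size, size)),
        transforms.ToTensor(),
    ])
    # "data/train" is a placeholder -- point it at your ImageFolder-style data
    ds = datasets.ImageFolder("data/train", transform=tfms)
    return DataLoader(ds, batch_size=batch_size, shuffle=True, num_workers=4)

model = models.resnet18(num_classes=7000)

# Stage 1: train on 128x128 images (fast epochs, coarse features)
small_loader = make_loader(size=128, batch_size=256)
# ... run your usual training loop here ...

# Stage 2: keep the same weights, fine-tune on larger images for the detail
large_loader = make_loader(size=320, batch_size=64)
# ... continue training, typically with a lower learning rate ...
```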


I was hoping to preserve the image size to retain the detail in the images (they are landmarks, buildings and dwellings), as I want the learner to recognise styles (e.g. facades, materials, layout, surroundings, etc.). But I think you are right. It doesn’t look like resnet18 has the capacity to memorise this many pixels (i.e. I can’t get it to overfit the training set).
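One cheap sanity check for capacity is to see whether the model can overfit a single fixed batch before worrying about the full training set. A minimal sketch, assuming a GPU and reusing the `small_loader` from the sketch above:

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(num_classes=7000).cuda()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Grab one fixed batch (images xb, labels yb) and keep reusing it
xb, yb = next(iter(small_loader))
xb, yb = xb.cuda(), yb.cuda()

# If the architecture has enough capacity, the loss on this single
# batch should drop close to zero within a few hundred steps
model.train()
for step in range(300):
    opt.zero_grad()
    loss = F.cross_entropy(model(xb), yb)
    loss.backward()
    opt.step()
    if step % 50 == 0:
        print(step, loss.item())
```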