AlexNet performs better than VGG16 and ResNets on medical (grayscale) images. Why?

Hi, I am trying to solve a binary classification problem on a dataset of 3540 grayscale medical images. I am using an 80:20 training:validation split, which means I have 708 images in the validation set. After trying the procedure from lesson 1 with a number of different architectures, I got the following accuracies for the different models:

  1. AlexNet (99.15%)
  2. ResNet-18 (99.01%)
  3. ResNet-34 (97.87%)
  4. VGG16_bn (98.72%)
  5. ResNet-50 (98.57%)
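For a sense of scale, the percentages above can be converted back into approximate error counts on the 708-image validation set. This is only a rough sketch: the helper name `approx_errors` is made up for illustration, and the rounding assumes each reported accuracy was computed on all 708 validation images.

```python
# Validation-set size from the 80:20 split of 3540 images.
total_images = 3540
val_images = round(total_images * 0.20)  # 708

# Accuracies (in percent) exactly as reported in the list above.
reported_accuracy = {
    "AlexNet": 99.15,
    "ResNet-18": 99.01,
    "ResNet-34": 97.87,
    "VGG16_bn": 98.72,
    "ResNet-50": 98.57,
}

def approx_errors(acc_percent, n_images):
    """Approximate number of misclassified images (hypothetical helper)."""
    return round(n_images * (1 - acc_percent / 100))

errors = {name: approx_errors(acc, val_images)
          for name, acc in reported_accuracy.items()}

for name, n_err in errors.items():
    print(f"{name}: about {n_err} of {val_images} images misclassified")

# Gap between the best model and ResNet-50, in images.
gap = errors["ResNet-50"] - errors["AlexNet"]
print(f"AlexNet vs ResNet-50: about {gap} images apart")
```

Under these assumptions, one validation image is worth roughly 0.14 percentage points, so the gap between AlexNet and ResNet-50 comes down to a handful of images.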

I found a similar post on ResearchGate where someone else obtained similar results, i.e. AlexNet performing better than ResNets on medical images ( Link :

I’d be grateful if somebody could explain why AlexNet outperforms the other models.

Also, is it fair to attribute ResNet-50’s performance to overfitting?


It might be because you have few training images. Since AlexNet has fewer parameters, it may be learning faster (if you are training all the architectures for the same number of epochs). To check for overfitting, look at the metric you tracked during training (accuracy, in your case) and see whether it starts decreasing after reaching a higher value (did it ever cross 98.57%?). If it does, that would be overfitting. If it never rises above a certain point and stays there, i.e. it never crossed 98.57% for ResNet-50, then it would be a case of underfitting due to too little data and lots of parameters. If it is underfitting, you might want to train it for a few more epochs and check the accuracy again.
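The check described above can be sketched in a few lines of Python. Everything here is illustrative: `diagnose` is a made-up helper, the `tolerance` threshold is an arbitrary assumption, and the two accuracy histories are invented numbers, not results from this thread.

```python
def diagnose(acc_history, tolerance=0.1):
    """Label a per-epoch accuracy curve (values in percent).

    Returns "dropped_after_peak" if accuracy peaked and then fell by more
    than `tolerance` points (the overfitting pattern described above),
    otherwise "plateau" (never fell back from its best value).
    """
    peak = max(acc_history)
    peak_epoch = acc_history.index(peak)   # first epoch reaching the peak
    final = acc_history[-1]
    peaked_early = peak_epoch < len(acc_history) - 1
    if peaked_early and (peak - final) > tolerance:
        return "dropped_after_peak"
    return "plateau"

# Invented example curves, for illustration only.
curve_a = [96.2, 98.1, 99.0, 98.8, 98.57]   # crossed 98.57, then fell back
curve_b = [95.0, 97.4, 98.3, 98.5, 98.57]   # never exceeded 98.57

print(diagnose(curve_a))
print(diagnose(curve_b))
```

With a curve like `curve_b`, training for a few more epochs is the cheap next experiment; with a curve like `curve_a`, stopping earlier or adding regularization would be the usual things to try.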

AlexNet has around 60M parameters, whereas ResNet-50 has around 23M.

Oh sorry, I didn’t realize that although AlexNet is a smaller network, it has many more fc (fully connected) layers :sweat_smile:
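To make the fc-layer point concrete, here is a back-of-the-envelope parameter count in plain Python. The layer shapes are an assumption: they follow the torchvision version of AlexNet with its 1000-class ImageNet head, which may differ from the exact model used in this thread (a binary classifier would have a much smaller final layer). The helper names are made up for illustration.

```python
def conv_params(c_in, c_out, k):
    """Weights plus biases of a k x k convolution."""
    return c_out * c_in * k * k + c_out

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# Assumed AlexNet layout (torchvision variant): (in_channels, out_channels, kernel)
alexnet_conv = [(3, 64, 11), (64, 192, 5), (192, 384, 3),
                (384, 256, 3), (256, 256, 3)]
# Assumed AlexNet classifier: three fully connected layers.
alexnet_fc = [(256 * 6 * 6, 4096), (4096, 4096), (4096, 1000)]

conv_total = sum(conv_params(*layer) for layer in alexnet_conv)
fc_total = sum(fc_params(*layer) for layer in alexnet_fc)
alexnet_total = conv_total + fc_total

# ResNet-50 ends in a single fc layer on 2048 pooled features.
resnet50_fc = fc_params(2048, 1000)

print(f"AlexNet conv layers: {conv_total:,}")
print(f"AlexNet fc layers:   {fc_total:,}")
print(f"AlexNet total:       {alexnet_total:,}")
print(f"fc share of AlexNet: {fc_total / alexnet_total:.1%}")
print(f"ResNet-50 fc layer:  {resnet50_fc:,}")
```

Under these assumptions, the three fc layers hold the large majority of AlexNet's roughly 60M parameters, while ResNet-50's single fc layer is only about 2M.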

Thanks! That helped.