Hi, I am trying to solve a binary classification problem on a dataset of 3540 grayscale medical images. I am using an 80:20 training : validation split, which means I have 708 images in the validation set. After trying the procedure from lesson 1 with a number of different architectures, I found the following order of accuracy for the different models -
It might be because you have a small number of training images. AlexNet is a much shallower network than ResNet-50, so it may be learning faster (if you are training all the architectures for the same number of epochs). To check for overfitting, look at the metric you tracked during training (accuracy in your case) epoch by epoch on the validation set. If validation accuracy reaches a high value and then starts decreasing while training accuracy keeps improving, the model is overfitting (did it ever cross 98.57%?). If it never rises above a certain point and stays there, i.e. it never crossed 98.57% for ResNet-50, then it is more likely underfitting: a deep network that has not yet had enough data or epochs to fit. If it is underfitting, you might want to train it for a few more epochs and check the accuracy again.
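If it helps, here is a rough sketch of that check in plain Python. It assumes you have collected the validation accuracy after each epoch into a list; the function name `diagnose`, the `drop` threshold of 0.005, and the accuracy histories below are all made up for illustration and are not results from your models.

```python
def diagnose(val_acc, target, drop=0.005):
    """Rough heuristic over per-epoch validation accuracy.

    val_acc: list of validation accuracies, one per epoch (hypothetical input).
    target:  accuracy reached by the reference model (e.g. 0.9857).
    drop:    how far below its peak the final value must fall before we
             call it a decline (an arbitrary illustrative threshold).
    """
    best = max(val_acc)
    final = val_acc[-1]

    # Peaked earlier and ended clearly lower: a sign of overfitting.
    if final < best - drop:
        return "possible overfitting"

    # Never reached the reference accuracy: a sign of underfitting,
    # so training for more epochs is worth trying.
    if best < target:
        return "possible underfitting"

    return "no clear sign"


# Illustrative, made-up histories (not real measurements).
print(diagnose([0.90, 0.95, 0.99, 0.97, 0.96], target=0.9857))  # possible overfitting
print(diagnose([0.90, 0.93, 0.94, 0.94, 0.94], target=0.9857))  # possible underfitting
print(diagnose([0.95, 0.98, 0.99], target=0.9857))              # no clear sign
```

This only looks at validation accuracy, so treat it as a first pass. Comparing training and validation loss side by side gives a more reliable picture.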