Strange behaviour during CNN training

I’m trying to get better at using pure PyTorch and I’m doing so by taking part in a Kaggle competition.

I’m trying to train a CNN (ResNet18) with Adam (lr=0.0001), a CosineAnnealingLR scheduler, and a batch size of 64. Unfortunately, I’m not able to get decent scores. I’ve attached some plots below. My question is rather simple: does anybody notice anything in these plots? Maybe you can spot what I’m doing wrong simply by looking at them.

[Plots: training accuracy, training loss, validation accuracy, validation loss]
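
For reference, here is a minimal sketch of roughly what my training code looks like (NUM_CLASSES, NUM_EPOCHS and the dummy DataLoaders are placeholders for my competition-specific data code, not the real thing):

```python
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Placeholders for my competition-specific setup.
NUM_CLASSES = 10
NUM_EPOCHS = 30
device = "cuda" if torch.cuda.is_available() else "cpu"

# Dummy tensors standing in for the real competition data.
train_loader = DataLoader(
    TensorDataset(torch.randn(512, 3, 224, 224),
                  torch.randint(0, NUM_CLASSES, (512,))),
    batch_size=64, shuffle=True,
)
val_loader = DataLoader(
    TensorDataset(torch.randn(128, 3, 224, 224),
                  torch.randint(0, NUM_CLASSES, (128,))),
    batch_size=64,
)

# ResNet18 with the classifier head replaced for my number of classes.
model = resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = Adam(model.parameters(), lr=1e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=NUM_EPOCHS)

for epoch in range(NUM_EPOCHS):
    model.train()
    for images, labels in train_loader:  # DataLoader with batch_size=64
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # scheduler stepped once per epoch
```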

There is an obvious plateau after roughly 8 epochs (see the accuracy curves), but the worst part is the validation loss: it fluctuates a lot and I cannot explain why. Any ideas? Is it possible that the validation set is too small? (My per-epoch validation step is sketched after the list below, in case the problem is there.) Regarding the accuracy, should I:

  • Decrease the LR?
  • Use a larger model (I doubt it would help, since others are getting good scores with the same one)?
  • Drop the cosine annealing schedule?
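
And in case the fluctuation comes from how I evaluate, this is roughly my per-epoch validation step (continuing the sketch above, so model, criterion, val_loader and device are the same placeholders):

```python
# Per-epoch validation: average loss and accuracy over the whole validation set.
model.eval()
val_loss, correct, total = 0.0, 0, 0
with torch.no_grad():
    for images, labels in val_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        val_loss += criterion(outputs, labels).item() * labels.size(0)
        correct += (outputs.argmax(dim=1) == labels).sum().item()
        total += labels.size(0)
val_loss /= total
val_acc = correct / total
```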