I've noticed that in many cases, fine-tuning a ResNet works great as long as training doesn't take too many epochs.
However, when a case requires a lot of training, I tend to overfit the ResNet (training loss going down, validation loss going up). And I see that the architecture by default only has Dropout at the end of the model.
Should I insert Dropout layers throughout the model (something like the sketch below)? Or is there a better strategy to avoid overfitting a ResNet?
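To make the question concrete, this is roughly what I mean by "inserting Dropout throughout the model" (a hypothetical sketch with torchvision's resnet34; the stage names and the probability are just illustrative):

```python
import torch.nn as nn
from torchvision.models import resnet34

model = resnet34(pretrained=True)

# Hypothetical: append spatial dropout after each residual stage.
# 'layer1'..'layer4' are the four stages of torchvision ResNets.
for name in ['layer1', 'layer2', 'layer3', 'layer4']:
    stage = getattr(model, name)
    setattr(model, name, nn.Sequential(stage, nn.Dropout2d(p=0.1)))
```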
Thanks for the hints! In this particular case I can’t get more data, and augmentation possibilities are limited.
But indeed I should add regularization.
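For anyone curious, this is the kind of regularization I mean (a sketch; `ps` and `wd` are the head-dropout and weight-decay knobs of fastai's `cnn_learner`/`Learner`, but double-check the signature for your fastai version):

```python
from fastai.vision.all import *

# Sketch: crank up the dropout in the fastai head and add weight decay.
# dls is assumed to be an existing DataLoaders object.
learn = cnn_learner(dls, resnet34, metrics=accuracy,
                    ps=0.75,  # dropout probability used in the generated head
                    wd=1e-2)  # weight decay applied by the optimizer
```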
Interesting, I didn't know about DropBlock.
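For anyone else landing here: the idea of DropBlock (Ghiasi et al., 2018) is to drop contiguous regions of a feature map rather than individual activations, which regularizes convnets better than plain Dropout. A rough self-contained sketch (my own simplification, not the paper's exact implementation; it assumes an odd block_size so the mask keeps the input's spatial shape):

```python
import torch
import torch.nn.functional as F
from torch import nn

class DropBlock2d(nn.Module):
    """Simplified DropBlock: zero out block_size x block_size regions."""
    def __init__(self, drop_prob=0.1, block_size=7):
        super().__init__()
        self.drop_prob, self.block_size = drop_prob, block_size

    def forward(self, x):
        if not self.training or self.drop_prob == 0.:
            return x
        _, _, h, w = x.shape
        # Seed probability chosen so the expected fraction of dropped
        # units is roughly drop_prob (cf. the paper's gamma).
        gamma = (self.drop_prob / self.block_size ** 2
                 * h * w / ((h - self.block_size + 1) * (w - self.block_size + 1)))
        # Sample block centers, shared across channels.
        seeds = (torch.rand_like(x[:, :1]) < gamma).float()
        # Grow each seed into a block_size x block_size block.
        block_mask = F.max_pool2d(seeds, kernel_size=self.block_size,
                                  stride=1, padding=self.block_size // 2)
        keep = 1. - block_mask
        # Rescale so the expected activation magnitude is unchanged.
        return x * keep * keep.numel() / keep.sum().clamp(min=1.)
```

A module like this could be dropped into a ResNet stage the same way as the Dropout2d sketch above.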
rwightman has an amazing repository of image classification models (and other repos like EfficientDet, EfficientNets, etc.). He has implemented a LOT of features and data augmentations, and it plays nicely with fastai. The models have .as_sequential() methods.
Nice notebook! However, I think it's easier if you pass as_sequential=True at model creation; at least the gen-efficientnet-pytorch repo supports it. Then you can use the standard fastai cnn_learner and pass cut=None. If I remember correctly, it works well with MobileNetV2 and EfficientNets.
import geffnet
from functools import partial

# cnn_learner expects an architecture callable, so wrap the constructor
arch = partial(geffnet.mixnet_l, drop_rate=0.25, drop_connect_rate=0.2, as_sequential=True)
learn = cnn_learner(dls, arch, cut=None, ...)