Not sure if this is the right spot to ask this question.
I'm trying to make my model generalize better, and I've searched for examples and notebooks I could follow - nothing so far, just theory.
The plan is to:
1. Train until it overfits
2. Add data augmentation
3. Add dropout (this is why I found your post)
4. Add weight_decay
5. Test different optimizers, layer sizes, batch sizes, etc.
What I have for now is a saved model for two classes (happy/not_happy, with 2000 examples each from EmotioNet - images in the wild). After unfreezing it and training for many epochs, the results are:
| epoch | trn_loss | val_loss | accuracy |
|-------|----------|----------|----------|
| x     | 0.044441 | 0.107906 | 0.9625   |
And TTA gives 0.96375.
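(For anyone else reading: TTA, test-time augmentation, just averages the model's predictions over augmented copies of each test image. A minimal sketch, assuming a generic `model` and a horizontal flip as the only augmentation:)

```python
import torch

def tta_predict(model, x):
    """Average predictions over the original batch and a horizontally
    flipped copy - a minimal form of test-time augmentation."""
    model.eval()  # disable dropout/batch-norm updates for inference
    with torch.no_grad():
        return (model(x) + model(torch.flip(x, dims=[-1]))) / 2
```

Real TTA implementations usually average over more copies (random crops, small rotations), but the idea is the same.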
The validation loss doesn't go any lower, and when I apply the model to random face pictures collected in the lab, the results are nonsense.
So that is why I want to make the model generalize better.
My confusion comes from the fact that all of steps 2-5 are defined when initializing the learner:

```python
ConvLearner.pretrained(arch, data, precompute=False, xtra_fc=[1024, 512], ps=[0, 0, 0], opt_fn=optimizer)
```
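For concreteness, in plain PyTorch terms `xtra_fc` and `ps` describe the fully connected head (so `ps=[0, 0, 0]` effectively turns dropout off), while weight decay is an optimizer argument. A rough, hypothetical equivalent - the 512 input size stands in for the backbone's feature dimension, and the nonzero `ps` values are placeholders:

```python
import torch
import torch.nn as nn

# Hypothetical head matching xtra_fc=[1024, 512] but with dropout turned ON;
# with ps=[0, 0, 0] every nn.Dropout below would be a no-op
head = nn.Sequential(
    nn.Dropout(p=0.25), nn.Linear(512, 1024), nn.ReLU(),
    nn.Dropout(p=0.5),  nn.Linear(1024, 512), nn.ReLU(),
    nn.Dropout(p=0.5),  nn.Linear(512, 2),    # two classes: happy / not_happy
)

# Step 4: weight decay lives in the optimizer, not in the architecture
opt = torch.optim.SGD(head.parameters(), lr=1e-2, weight_decay=1e-4)

head.eval()                # dropout is only active in train mode
x = torch.randn(8, 512)    # a fake batch of backbone features
out = head(x)              # logits of shape (8, 2)
```

So changing steps 3-5 does mean re-creating the learner (or optimizer) with new arguments, which is why they all appear at initialization time.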
Do you have any insights, workflows, or suggestions on best practices for testing the model and making it generalize better?