@raspstephan are you referring to seeing that while using the fast.ai lib? If I’m remembering right, that funny effect happens because of dropout: the training score is computed with dropout (which knocks out a bunch of the network, thereby weakening it), while the validation score is not (I suppose since it’s supposed to mimic how you’d perform on real test data, where we typically turn off dropout). Jeremy covers this oddity in lecture (assuming I’m not mis-remembering), I’ll try to find a link.
5 Likes