Yeah, that looks normal. Note the scale: it’s only fluctuating in a pretty small range (and only the last couple of epochs fall in that range, so it’s preserved the final accuracy).
You’re missing the fact that f(x) may not be a deterministic function. There are various options to remove (or at least reduce) the non-determinism of PyTorch operations. However, I’d consider why you really want to do this. A lot of people seem to want to eliminate all non-determinism, but I’d tend to say that if your results vary a lot because of it, that’s something you should know about, not hide; it may indicate issues with your model. Here performance looks stable, so no issues.
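If you do want to go down that road, the usual knobs look something like this. This is just a minimal sketch (the helper name `seed_everything` is mine, and which flags matter depends on your PyTorch/CUDA versions):

```python
import os
import random

import numpy as np
import torch


def seed_everything(seed: int = 42):
    """Hypothetical helper: reduce (not fully remove) PyTorch non-determinism."""
    # Seed the Python, NumPy and PyTorch RNGs
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Force cuDNN to pick deterministic kernels (slower, disables autotuning)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

    # Error out on ops that have no deterministic implementation
    torch.use_deterministic_algorithms(True)

    # Required for deterministic cuBLAS matmuls on CUDA >= 10.2
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"
```

Even with all of that set you can still see tiny differences across hardware or library versions, which is part of why I wouldn’t chase it too hard.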
In terms of the sources of non-determinism, I’m not quite sure exactly how you generated those graphs (are you just running validate repeatedly?), so I can’t say exactly what will be in play. One thing is that, even when frozen, fastai will update batchnorm layers. I’m not sure if this is disabled given it looks like you’re running against the validation set. I suspect they will not update, as the model should be in eval mode, but I could be wrong. They shouldn’t update at inference time (I’d imagine because you may feed weird batches you don’t want the statistics updated from), but validation is a bit different. Most other major sources of randomness should have been eliminated: augmentation should be off (unless you added specific validation set augmentations) and the validation set shouldn’t be shuffled.
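If you want to check whether the batchnorm running stats are actually being touched, a quick sanity check along these lines should settle it (a plain PyTorch sketch, not fastai-specific):

```python
import torch
import torch.nn as nn

bn = nn.BatchNorm2d(3)
x = torch.randn(8, 3, 16, 16)

# In train mode the running statistics are updated by each forward pass
bn.train()
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # False: stats changed

# In eval mode they are left alone (the stored stats are used instead)
bn.eval()
before = bn.running_mean.clone()
bn(x)
print(torch.equal(before, bn.running_mean))  # True: stats unchanged
```

You could apply the same before/after comparison to your actual model’s batchnorm buffers around a call to validate to see whether they move.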
But there are still some inherently random parts of PyTorch operations. They’re generally minimal, but enough to lead to slight fluctuations. One thing that no setting in PyTorch will eliminate is differences due to floating point evaluation order. Strictly speaking, floating point addition is not associative (i.e. (a+b)+c does not always equal a+(b+c)), so the order in which a sum is evaluated changes the result. The differences are only at a very low level, tiny not big ones, but an NN can easily amplify such small differences (so still fairly small, yet big enough to create those sorts of fluctuations). These can’t be eliminated for various operations without severely impacting performance (and complicating code).
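A quick illustration of the ordering point (pure Python floats, but the same applies to float32 tensor reductions on the GPU):

```python
a, b, c = 1e16, -1e16, 1.0

print((a + b) + c)  # 1.0
print(a + (b + c))  # 0.0 -- c is absorbed when added to -1e16 first
```

Parallel reductions (e.g. CUDA kernels that accumulate with atomic adds) sum in whatever order the threads happen to finish, so repeated runs can land on slightly different values.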
So I’m not quite sure where the variation is coming from, but it looks normal and I wouldn’t try to eliminate it (though some do, and a search on PyTorch non-determinism will show you things to minimise it if you really want).