Can anyone explain the significance?
In the code below, the line after train()/eval() computes the per-batch losses over the training set; the next line computes their mean. When the model is in eval mode, the mean loss is about 5x greater. modelR contains many nn.BatchNorm3d layers.
This divergence between the train- and eval-mode losses appeared as training progressed and kept growing. For most of training, the eval loss had actually been smaller than the training loss.
What I am hoping for is an intuitive explanation of what is going on. Thanks!!!
modelR.train()
trainedLosses = [lossfn(mtarget.cuda(),modelR(mbatch.cuda())).item() for (mbatch,mtarget) in training_generator_mem()]
mean(trainedLosses)
0.0002706261747435848
modelR.eval()
trainedLosses = [lossfn(mtarget.cuda(),modelR(mbatch.cuda())).item() for (mbatch,mtarget) in training_generator_mem()]
mean(trainedLosses)
0.001313385098102218
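For context, the relevant mechanism is that BatchNorm normalizes each batch with that batch's own mean and variance in train() mode, while updating running averages of those statistics; in eval() mode it normalizes with the running averages instead. If the running estimates lag behind the statistics the network actually sees during training, the eval-mode activations (and hence the loss) shift. A toy single-feature sketch of this behavior (plain Python, illustrative only, not modelR's code):

```python
import math

def batchnorm(x, state, training, momentum=0.1, eps=1e-5):
    """One-feature batch norm; `state` holds the running statistics."""
    if training:
        # Train mode: normalize with the batch's own statistics
        # and nudge the running estimates toward them.
        m = sum(x) / len(x)
        v = sum((xi - m) ** 2 for xi in x) / len(x)
        state["mean"] += momentum * (m - state["mean"])
        state["var"] += momentum * (v - state["var"])
    else:
        # Eval mode: normalize with the running estimates instead.
        m, v = state["mean"], state["var"]
    return [(xi - m) / math.sqrt(v + eps) for xi in x]

state = {"mean": 0.0, "var": 1.0}   # freshly initialized running stats
batch = [4.0, 5.0, 6.0, 5.0]        # activations with mean ~5

train_out = batchnorm(batch, state, training=True)
eval_out = batchnorm(batch, state, training=False)
```

Here `train_out` is centered by the batch's own mean, but `eval_out` is shifted because the running mean (updated once with momentum 0.1) still sits near zero. The same lag, compounded across many stacked BatchNorm3d layers, is one way a train/eval loss gap like yours can open up and grow.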