I was digging through the code and this confused me: why are we calling .eval() on the models right after creating them?
Don’t we want them in training mode?
I believe this is because dummy data is passed through the model when it is initialized, to determine the input/output shapes of the layers, and calling .eval() prevents any updates to the batch norm statistics during this process.
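To illustrate the point about batch norm statistics, here is a minimal PyTorch sketch. The tiny model and the dummy batch are made up for the example and are not the library's actual code. It checks that a forward pass in eval mode leaves `running_mean` untouched, while the same pass in training mode changes it.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy model: one conv layer followed by batch norm.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.BatchNorm2d(8),
)
bn = model[1]
dummy = torch.randn(2, 3, 16, 16)  # made-up dummy batch used only to probe shapes

# Eval mode: the forward pass reveals the output shape but leaves the stats alone.
model.eval()
before = bn.running_mean.clone()
with torch.no_grad():
    out = model(dummy)
unchanged_in_eval = torch.equal(bn.running_mean, before)

# Training mode: the same forward pass updates the running statistics.
model.train()
with torch.no_grad():
    model(dummy)
changed_in_train = not torch.equal(bn.running_mean, before)

print(tuple(out.shape), unchanged_in_eval, changed_in_train)
```

Note that `torch.no_grad()` does not stop the update in training mode. Only the module's mode controls whether the running statistics change.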
Hello
That seems like a plausible reason. However, I can’t see the code that switches it back to training mode. Or does this happen automatically during the training loop?
I am assuming that training in eval mode would mean the batch norm running statistics are never updated, which is undesirable behavior.
I believe that switch happens automatically in the training loop: eval mode also has to be turned on whenever the validation set is evaluated at the end of each epoch, so the loop has to put the model back into training mode before training resumes.
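A rough sketch of what such a loop could look like in plain PyTorch is below. The `fit` function, the toy model, and the random data are all hypothetical, and the real training loop may be organized differently. The idea is that each epoch calls `model.train()` before the training batches and `model.eval()` before validation, so a model created in eval mode still trains normally.

```python
import torch
import torch.nn as nn

def fit(model, train_batches, valid_batches, epochs=1, lr=0.1):
    """Hypothetical minimal loop that toggles train/eval mode every epoch."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    modes = []  # record model.training at each phase, for inspection only
    for _ in range(epochs):
        model.train()  # batch norm uses batch stats and updates running stats
        modes.append(model.training)
        for xb, yb in train_batches:
            loss = loss_fn(model(xb), yb)
            loss.backward()
            opt.step()
            opt.zero_grad()
        model.eval()  # batch norm uses the stored running stats, no updates
        modes.append(model.training)
        with torch.no_grad():
            for xb, yb in valid_batches:
                loss_fn(model(xb), yb)
    return modes

torch.manual_seed(0)
# The model starts in eval mode, as in the situation discussed above.
model = nn.Sequential(nn.Linear(4, 4), nn.BatchNorm1d(4), nn.Linear(4, 1)).eval()
data = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(2)]  # made-up batches
modes = fit(model, data, data, epochs=2)
print(modes)  # [True, False, True, False]
```

With a loop like this, the initial `.eval()` call only affects what happens before training starts, since the first thing each epoch does is switch back to training mode.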