Can anyone kindly give an overview of why two runs of the same model return different accuracies or losses when run at different times?
And is there a way to get the same results across different runs?
Reproducibility is one of the biggest problems in machine learning. No two runs give exactly the same result. There is a lot of randomness in the learning process, such as splitting the train/test data, choosing which samples go into each batch, and so on. This is expected. As long as the two runs produce similar values rather than identical ones, we should be good.
I agree with what @nareshr8 wrote. On top of that, random weight initialization and dropout are additional sources of randomness in the training procedure of neural networks.
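To make those sources of randomness concrete, here is a minimal sketch of pinning the RNGs a training pipeline typically draws from. This is not fastai's own API; `seed_everything` is a hypothetical helper, and the torch lines are commented out so the snippet runs without PyTorch installed:

```python
import os
import random

import numpy as np

def seed_everything(seed: int = 42) -> None:
    """Seed the Python, NumPy (and, if available, PyTorch) RNGs."""
    random.seed(seed)
    np.random.seed(seed)
    os.environ["PYTHONHASHSEED"] = str(seed)
    # import torch
    # torch.manual_seed(seed)
    # torch.cuda.manual_seed_all(seed)

# Two "runs" with the same seed now draw identical random splits:
seed_everything(0)
split_a = np.random.permutation(10)
seed_everything(0)
split_b = np.random.permutation(10)
print((split_a == split_b).all())  # True
```

Without the reseeding step in the middle, `split_a` and `split_b` would almost certainly differ, which is exactly what happens between two ordinary training runs.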
@jimmiemunyi I also recommend having a look at this thread: [Solved] Reproducibility: Where is the randomness coming in?
thanks @nareshr8 and @stefan-ai.
Thanks for the thread suggestion too
In V1, calling `random_seed(0, use_cuda=False)` before `tabular_learner` and also before `fit_one_cycle` was helpful in reproducing results.
Do you get different results at inference time, or during training?
If it is during training, that is normal, for the reasons given in the replies above; but if it happens at inference time, it should not.
I’d suggest looking here: [Solved] Reproducibility: Where is the randomness coming in? - #28 by harikrishnanrajeev
but what I found sufficient to reproduce results is just the line
`set_seed(42)`
before creating the dataloaders.
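Here is a toy sketch of why the seed has to come before the dataloaders are built: the shuffle order for each epoch is drawn from the global RNG state, so seeding at the same point yields the same batches. `make_batches` is a made-up stand-in, not fastai code:

```python
import random

def make_batches(items, batch_size):
    """Toy 'dataloader': shuffle the data, then yield fixed-size batches."""
    order = items[:]
    random.shuffle(order)          # consumes the global RNG here
    return [order[i:i + batch_size] for i in range(0, len(order), batch_size)]

data = list(range(8))

random.seed(42)                    # analogous to set_seed(42) before the dataloaders
run1 = make_batches(data, 2)

random.seed(42)                    # reseed -> identical shuffle, identical batches
run2 = make_batches(data, 2)

print(run1 == run2)  # True
```

If anything consumed the RNG between seeding and building the dataloader, the shuffle order, and hence the batches, would change.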
Here is my theory as to why you need this. If anyone can confirm or correct this please do:
is this right?
if so, are we handicapping the model training because it's shuffling the training set the same way for each epoch?
@daveramseymusic your understanding is perfectly correct. As for your question: in a production setting, different seeds for `set_seed()` are tried to find better minima, and then an ensembling technique is applied over the best-performing models.
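A hedged sketch of that seed-ensembling idea, with all names made up for illustration: `train_model` stands in for a full training run per seed, and the ensemble simply averages the per-model predictions:

```python
import random

def train_model(seed):
    """Stand-in for a real training run; returns a toy 'model' (a bias term)."""
    random.seed(seed)
    return random.uniform(-1, 1)   # pretend this is the trained model

def predict(model, x):
    """Toy prediction: input shifted by the model's bias."""
    return x + model

# Train one model per seed, then ensemble by averaging predictions.
seeds = [0, 1, 2, 42]
models = [train_model(s) for s in seeds]

x = 10.0
ensemble_pred = sum(predict(m, x) for m in models) / len(models)
print(ensemble_pred)
```

In practice you would keep only the best-performing seeds (by validation score) before averaging, rather than all of them.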