Hello guys,
to reproduce my training runs and to better understand how certain parameter changes (e.g. learning rate, wd, pcl_start, slicing) affect the training process and the metric, it is important to set all the random seeds.
However, I tried:
1.
import torch
torch.manual_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
2. set_seed(42)
3. np.random.seed(42)
Nothing seems to help, as I get varying results for:
May I ask, are you setting your random seed in every cell that you want to reproduce the results of?
Until recently, I did not know that you have to set the seed in every cell, and thought that setting it once at the top of the notebook was sufficient. I was wrong: the generator's state advances with every random call, so re-running a cell without reseeding gives a different result. Now, any time I do a train/validate/test split or anything else that is random, I set the seed in that specific cell.
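To make the per-cell idea concrete, here is a minimal sketch that bundles the seeding calls from the first post into one helper that can be called at the top of any cell. The name seed_everything is made up for illustration (it is not the set_seed mentioned above), and the torch import is guarded only so the sketch also runs without PyTorch installed. This does not promise fully deterministic training; things like data loader workers or non-deterministic GPU kernels can still introduce variation.

```python
import random

import numpy as np

try:
    import torch  # optional here: only needed for the PyTorch part
except ImportError:
    torch = None


def seed_everything(seed: int = 42) -> None:
    """Hypothetical helper: reseed the RNGs a notebook cell may rely on."""
    random.seed(seed)        # Python's built-in RNG
    np.random.seed(seed)     # NumPy's global RNG
    if torch is not None:
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored without a GPU
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


seed_everything(42)
first = np.random.rand(3)
second = np.random.rand(3)   # differs from `first`: the RNG state has advanced

seed_everything(42)          # reseed, as you would at the top of a cell
again = np.random.rand(3)    # same values as `first`
```

Calling the helper once at the top of the notebook only fixes the starting state; calling it at the start of each cell you want to reproduce is what makes that cell give the same numbers on every re-run.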
Hi AMusic, sure. I’m very much a beginner, so don’t judge too harshly!
Here is a notebook where I experimented with a few different ways of filling in missing data and adding new features to the Titanic dataset. Each time I tried something new, I wanted to test the results on the exact same train/validate/test set.
It’s a huge notebook, so just look for the cells with np.random.seed(333). You will see that every time I re-split the data, it is the same.
I’m going to split it up into several notebooks as soon as you are done looking at it. I won’t push the changes until you tell me you are finished browsing!
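For anyone who does not want to dig through the notebook, the np.random.seed(333) pattern looks roughly like the sketch below. It is not copied from the notebook: split_indices, the split fractions and the row count of 100 are assumptions chosen for illustration.

```python
import numpy as np


def split_indices(n_rows, valid_frac=0.2, test_frac=0.2, seed=333):
    """Hypothetical helper: return (train, valid, test) row indices.

    Reseeding inside the function means every call starts from the
    same RNG state, so the shuffle (and therefore the split) repeats.
    """
    np.random.seed(seed)
    shuffled = np.random.permutation(n_rows)
    n_test = int(n_rows * test_frac)
    n_valid = int(n_rows * valid_frac)
    test_idx = shuffled[:n_test]
    valid_idx = shuffled[n_test:n_test + n_valid]
    train_idx = shuffled[n_test + n_valid:]
    return train_idx, valid_idx, test_idx


# Two separate calls, e.g. in two different cells, give identical splits.
train_a, valid_a, test_a = split_indices(100)
train_b, valid_b, test_b = split_indices(100)
```

With the indices fixed, each new feature or imputation idea can be evaluated on exactly the same rows.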