[Solved] Reproducibility: Where is the randomness coming in?

Pomo · November 29, 2018, 11:21pm

I would like to be confident of measures between different settings and models. However, even when initializing all the seeds I know about, training loss varies between identical runs.

def random_seed(seed_value, use_cuda):  #gleaned from multiple forum posts
    np.random.seed(seed_value) # cpu vars
    torch.manual_seed(seed_value) # cpu  vars
    random.seed(seed_value) # Python
    if use_cuda: torch.cuda.manual_seed_all(seed_value) # gpu 


random_seed(42,True)
data = ImageDataBunch.from_csv(csv_labels=LABELS, suffix='.tif', path=TRAIN, ds_tfms=None, bs=BATCH_SIZE, size=96).normalize(imagenet_stats)

data.show_batch(rows=2, figsize=(96,96))

learn = create_cnn(data, arch, metrics=error_rate)  #resnet34
lr = 1e-2
learn.fit_one_cycle(1, max_lr=lr)

The training losses for three runs are: .168973, .169944, .167258

Images displayed by show_batch appear to be in the same order and look identical (to the eye).

So what’s going on? It seems that if all the seeds are initialized, the results should be equal. A 1% variation over a single epoch is enough to affect my confidence in comparing various settings.

Could there be randomness in the GPU calculations? Or something related to CPU cores?

Thanks for any insight and advice.

fastai 1.0.30
Nvidia 1070

Pomo · November 30, 2018, 1:41am

P.S. Setting num_workers=1 in ImageDataBunch.from_csv() does not help.

stephenjohnson · November 30, 2018, 4:35am

Weights and biases in PyTorch aren’t set randomly. https://discuss.pytorch.org/t/how-are-layer-weights-and-biases-initialized-by-default/13073

stephenjohnson · November 30, 2018, 4:38am

You should be able to save your model after being created and before any training and then re-use that saved/untrained model on subsequent trainings

Pomo · December 1, 2018, 1:34am

Searching in these and in the PyTorch forums, it seem that many others have run into this issue of reproducibility. I gathered all their suggestions into the following code:

def random_seed(seed_value, use_cuda):
    np.random.seed(seed_value) # cpu vars
    torch.manual_seed(seed_value) # cpu  vars
    random.seed(seed_value) # Python
    if use_cuda: 
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value) # gpu vars
        torch.backends.cudnn.deterministic = True  #needed
        torch.backends.cudnn.benchmark = False

The good news is that the training code above now gives repeatable results. I did not test to know precisely which initializations are critical, but do know that torch.backends.cudnn.deterministic = True is necessary, and the num_workers does not matter. The not so good news is this reproducibility does not survive a kernel restart.

The best news is that it also gives repeatable results across kernel restarts iff num_workers=0 is passed to the data loader. This has something to do with each worker getting initialized with its own random seeds. Someone more patient than I could devise a worker_init_fn that provides both kernel restart repeatability and different seeds for each worker. But for now I am content with using num_workers=0.

To sum up - to get reproducible measures across runs and kernel restarts, use the above random_seed function and pass num_workers=0 when generating the DataBunch. Non-repeatabilty was leaking in through CudaNN and the data loader workers.

Pomo · December 1, 2018, 1:35am

Stephen - thanks for responding, and sorry that my issue was not clear. It’s to get a single deterministic measure when providing the same inputs, rather than a distribution of measures that varies by 1%. Even reloading the same initial model weights yields varying results if deterministic is not set to False and num_workers to zero.

Pomo · December 1, 2018, 3:11am

Oops! Now I see that this advice is already given in the fastai docs.

blissweb · May 22, 2019, 10:12am

Ha… at least this is a function rather than the list of code in the docs. How about FASTAI library having a callable function or setting somewhere to do this. I think its a must have when experimenting.

blissweb · May 22, 2019, 12:07pm

As an addendum, you also need to watch out for seeds used in generating your test and training split. Mine was non-reproducible even after the above due to my use of the pandas sample function to create my validation set:

df.sample(frac=0.3)

You can fix this using the seed there too:

df.sample(frac=0.3),random_state=42)

Now I finally have reproducible results. Yay!

rpcoelho · June 26, 2019, 8:06pm

Hi @Pomo, I saw your very helpful answer and implemented it but I’m still not getting reproducible results. I’m not sure if I’m setting num_workers in the right place but I’m loading the datasets already split so this can’t be causing the problems like @blissweb mentioned. Here is how I’m creating the databunch, setting the seeds and running the learner:

 def random_seed(seed_value):
    import random 
    random.seed(seed_value) # Python
    import numpy as np
    np.random.seed(seed_value) # cpu vars
    import torch
    torch.manual_seed(seed_value) # cpu  vars
    
    if torch.cuda.is_available(): 
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value) # gpu vars
        torch.backends.cudnn.deterministic = True  #needed
        torch.backends.cudnn.benchmark = False

random_seed(0)

dep_var = 'NumberOfSales'
df = train_df[cat_vars + cont_vars + [dep_var]].copy()

path="c:/Benchmarking/testBench.csv"
data = (TabularList.from_df(df, cat_names=cat_vars, cont_names=cont_vars, procs=procs,)
                .split_by_idx(valid_idx)
                .label_from_df(cols=dep_var, label_cls=FloatList, log=False)
                .add_test(TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars))
                .databunch(num_workers=0))

    #x=best.x   I'm using scikit opt to find the best parameters but then can't reproduce the results.
    x=[500, 500, 100, 0.0005, 0.4, 8]
    print(x)
    learn3 = tabular_learner(data, layers=[x[0],x[1],x[2]], ps=[0.09,0.5,0.5], emb_drop=0.04, 
                        y_range=y_range, metrics=mae)
    learn3.fit_one_cycle(1, x[3], wd=x[4], div_factor=x[5])

Pomo · June 26, 2019, 8:51pm

Hi Rodrigo,

More work has since been done on this question by myself and others. It looks like random seeds need to be set before creating the DataBunch and before the first fit() and maybe before creating the Learner. Please see this thread:

https://forums.fast.ai/t/lesson1-reproducible-results-setting-seed-not-working/37921
Also,
https://forums.fast.ai/t/help-debug-reproducable-results-solved/48839/2?u=pomo

But I am now quite out of touch with the current “SOTA” in fastai reproducibility. (I was using it to isolate the effects of hyperparameters.) It would be a service if you could combine these posts, do your own experiments, and summarize your conclusions here. I would certainly appreciate it!

rpcoelho · June 27, 2019, 12:37am

Ok, finally got it to work. So just detailing the instructions a bit more:

You have to run random_seed(0), before the first fit;
You have to run it before creating the databunch;
And you have to call it every time for each different time you call fit.

I was calling it before creating the databuch and assuming the seed would be set. So besides the code above, this solved it for me:

    random_seed(0) #Need to insert this line here again before calling fit
    x=[500, 500, 100, 0.0005, 0.4, 8]
    learn3 = tabular_learner(data, layers=[x[0],x[1],x[2]], ps=[0.09,0.5,0.5], emb_drop=0.04, 
                        y_range=y_range, metrics=mae)
    learn3.fit_one_cycle(1, x[3], wd=x[4], div_factor=x[5])

Pomo · June 27, 2019, 4:53am

Thanks! Your efforts will be helpful to me and others.

esingildinov · February 13, 2020, 3:48pm

To get reproducible results between kernel restarts, run script or jupyter with fixed PYTHONHASHSEED:

env PYTHONHASHSEED=42 python train.py
or
env PYTHONHASHSEED=42 jupyter notebook

Note that setting PYTHONHASHSEED inside notebook or train doesn’t help. Hope this helps!

barnacl · February 27, 2020, 6:11pm

i’m using this to seed as suggested here:

def random_seed(seed_value, use_cuda):
    np.random.seed(seed_value) # cpu vars
    torch.manual_seed(seed_value) # cpu  vars
    random.seed(seed_value) # Python
    if use_cuda: 
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value) # gpu vars
        torch.backends.cudnn.deterministic = True  #needed
        torch.backends.cudnn.benchmark = False

but i’m not able to reproduce the values. what am i missing ?
Thanks in advance.

muellerzr · February 27, 2020, 6:15pm

@barnacl make sure you also pass a seed into your RandomSplitter too, you may be missing one there because you’ve already split everything before you set it

barnacl · February 27, 2020, 7:50pm

oops! should have been more careful.
made that change but still missing something. here is the copy of your notebook @muellerzr with the changes i added in. https://colab.research.google.com/drive/1Ur6ftKvOjXgukHlmhiPa7AUcMZCK0hTQ
fastcore just got updated and is breaking somethings i thinks.
will report back

muellerzr · February 27, 2020, 7:51pm

@barnacl another issue could be your environment setup. If you look at the most recent notebooks I just do a pip install fastai2. No need for torch etc to be on specific versions

barnacl · February 27, 2020, 7:52pm

ah ok let me check that too. thank you

barnacl · February 27, 2020, 8:12pm

pinned fastcore to 0.1.12 (0.1.13 was complainging about as_item missing).
not able to get rid of randomness.