[Solved] Reproducibility: Where is the randomness coming in?

Ha… at least this is a function rather than the list of code in the docs. How about the fastai library having a callable function or setting somewhere to do this? I think it's a must-have when experimenting.

As an addendum, you also need to watch out for seeds used in generating your test and training split. Mine was non-reproducible even after the above due to my use of the pandas sample function to create my validation set:

df.sample(frac=0.3)

You can fix this using the seed there too:

df.sample(frac=0.3, random_state=42)
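
For example, a minimal sketch of a seeded split (illustrative, assuming the remainder of df becomes the training set):

# Seeded validation split: the same rows are selected on every run
valid_df = df.sample(frac=0.3, random_state=42)
train_df = df.drop(valid_df.index)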

Now I finally have reproducible results. Yay!

2 Likes

Hi @Pomo, I saw your very helpful answer and implemented it, but I’m still not getting reproducible results. I’m not sure if I’m setting num_workers in the right place, but I’m loading the datasets already split, so that can’t be causing the problems @blissweb mentioned. Here is how I’m creating the DataBunch, setting the seeds and running the learner:

import random
import numpy as np
import torch

def random_seed(seed_value):
    random.seed(seed_value)        # Python
    np.random.seed(seed_value)     # cpu vars
    torch.manual_seed(seed_value)  # cpu vars

    if torch.cuda.is_available():
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value)      # gpu vars
        torch.backends.cudnn.deterministic = True   # needed
        torch.backends.cudnn.benchmark = False

random_seed(0)

dep_var = 'NumberOfSales'
df = train_df[cat_vars + cont_vars + [dep_var]].copy()

path = "c:/Benchmarking/testBench.csv"
data = (TabularList.from_df(df, cat_names=cat_vars, cont_names=cont_vars, procs=procs)
                .split_by_idx(valid_idx)
                .label_from_df(cols=dep_var, label_cls=FloatList, log=False)
                .add_test(TabularList.from_df(test_df, path=path, cat_names=cat_vars, cont_names=cont_vars))
                .databunch(num_workers=0))

# x = best.x   # I'm using scikit opt to find the best parameters but then can't reproduce the results.
x = [500, 500, 100, 0.0005, 0.4, 8]
print(x)
learn3 = tabular_learner(data, layers=[x[0], x[1], x[2]], ps=[0.09, 0.5, 0.5], emb_drop=0.04,
                         y_range=y_range, metrics=mae)
learn3.fit_one_cycle(1, x[3], wd=x[4], div_factor=x[5])
3 Likes

Hi Rodrigo,

More work has since been done on this question by myself and others. It looks like random seeds need to be set before creating the DataBunch and before the first fit() and maybe before creating the Learner. Please see this thread:

https://forums.fast.ai/t/lesson1-reproducible-results-setting-seed-not-working/37921
Also,
https://forums.fast.ai/t/help-debug-reproducable-results-solved/48839/2?u=pomo

But I am now quite out of touch with the current “SOTA” in fastai reproducibility. (I was using it to isolate the effects of hyperparameters.) It would be a service if you could combine these posts, do your own experiments, and summarize your conclusions here. I would certainly appreciate it!

1 Like

Ok, finally got it to work. So just detailing the instructions a bit more:

  1. You have to run random_seed(0) before the first fit;
  2. You have to run it before creating the databunch;
  3. And you have to call it again before every call to fit.

I was calling it before creating the databunch and assuming the seed would stay set. So besides the code above, this solved it for me:

random_seed(0)  # Need to insert this line here again before calling fit
x = [500, 500, 100, 0.0005, 0.4, 8]
learn3 = tabular_learner(data, layers=[x[0], x[1], x[2]], ps=[0.09, 0.5, 0.5], emb_drop=0.04,
                         y_range=y_range, metrics=mae)
learn3.fit_one_cycle(1, x[3], wd=x[4], div_factor=x[5])
10 Likes

Thanks! Your efforts will be helpful to me and others.

To get reproducible results between kernel restarts, run your script or Jupyter with a fixed PYTHONHASHSEED:

env PYTHONHASHSEED=42 python train.py
or
env PYTHONHASHSEED=42 jupyter notebook

Note that setting PYTHONHASHSEED inside the notebook or training script doesn’t help. Hope this helps!
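
A quick way to confirm the variable was actually picked up (a small sketch; run it inside the notebook or script after launching as above):

import os

# PYTHONHASHSEED must be set before the interpreter starts; this only verifies it
print(os.environ.get('PYTHONHASHSEED'))  # expect '42', not None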

2 Likes

I’m using this to seed, as suggested here:

import random
import numpy as np
import torch

def random_seed(seed_value, use_cuda):
    np.random.seed(seed_value)     # cpu vars
    torch.manual_seed(seed_value)  # cpu vars
    random.seed(seed_value)        # Python
    if use_cuda:
        torch.cuda.manual_seed(seed_value)
        torch.cuda.manual_seed_all(seed_value)      # gpu vars
        torch.backends.cudnn.deterministic = True   # needed
        torch.backends.cudnn.benchmark = False

but I’m not able to reproduce the values. What am I missing?
Thanks in advance. :slight_smile:

@barnacl make sure you also pass a seed into your RandomSplitter; you may be missing one there because you’ve already split everything before you set the seed.
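
For example (a minimal sketch with illustrative block types and getters, not your actual pipeline):

from fastai.vision.all import *

# Seed the splitter itself so the train/valid membership is identical on every run
dblock = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label)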

1 Like

Oops! I should have been more careful.
I made that change but am still missing something. Here is the copy of your notebook, @muellerzr, with the changes I added in: https://colab.research.google.com/drive/1Ur6ftKvOjXgukHlmhiPa7AUcMZCK0hTQ
fastcore just got updated and is breaking some things, I think.
Will report back.

@barnacl another issue could be your environment setup. If you look at the most recent notebooks, I just do a pip install fastai2. No need for torch etc. to be on specific versions :slight_smile:

Ah ok, let me check that too. Thank you.

Pinned fastcore to 0.1.12 (0.1.13 was complaining about as_item missing).
Still not able to get rid of the randomness.

I grabbed your random function and tried it with the MNIST example from the walkthrough; still random :frowning:

Using only PyTorch (not fastai in this case, but no less amazing: https://github.com/qubvel/segmentation_models.pytorch),

I was having the same problem on Jupyter: results were reproducible on “run all cells” (without a restart) using all the seed/force-deterministic operations listed above,

but between kernel restarts the results were always different :frowning:

Note: all results (splits, augmentation, pre-training validation epoch) were equal until torch training started. During training something is affected that I could only fix by setting the aforementioned PYTHONHASHSEED env var prior to starting Jupyter.

So far, after doing this, I can fully reproduce results between restarts. Finally!
A really tricky issue and hard to detect. Probably a lot of people think they have reproducible results when they haven’t?.. (much like valid backups :slight_smile: )
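
One cheap way to check (a minimal sketch, not from this thread; model stands for whatever network you just trained) is to fingerprint the weights after training and compare the number across restarts:

import torch

def weight_fingerprint(model: torch.nn.Module) -> float:
    # Sum of all parameter values; identical runs should produce identical sums
    return sum(p.detach().double().sum().item() for p in model.parameters())

print(weight_fingerprint(model))  # compare this value across kernel restarts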

Next step: check container restarts :), host restarts, different VMs, and cloud providers… who knows? :slight_smile:

(Note: as mentioned by @esingildinov, PYTHONHASHSEED has to be set prior to the jupyter/kernel start; setting the env var inside the notebook doesn’t work. The same thing has been noted elsewhere.)

2 Likes

Any tips on how to do this with Colab?

Here is an example of reproducibility in fastai2:

from fastai.vision.all import *
def is_cat(x): return x[0].isupper()
path = untar_data(URLs.PETS)/'images'
set_seed(42,True)
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit(1)
set_seed(42,True)
dls = ImageDataLoaders.from_name_func(
    path, get_image_files(path), valid_pct=0.2, seed=42,
    label_func=is_cat, item_tfms=Resize(224))
learn = cnn_learner(dls, resnet34, metrics=error_rate)
learn.fit(1)

Please notice that you must set the seed before the DataLoaders are created, and recreate the DataLoaders when setting a new seed.

The DataLoader keeps an internal random number generator that is seeded with a random number at creation time. That seed is not updated by set_seed, which is why you have to recreate the DataLoader.
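
(As a side note, and this is an assumption I have not verified against the library internals: fastai’s DataLoader appears to keep that generator in an rng attribute, so reseeding it in place should also make the shuffle order deterministic, e.g.:

import random
# Untested sketch: reseed the existing DataLoaders' own generators in place
dls.train.rng = random.Random(42)
dls.valid.rng = random.Random(42)

Recreating the DataLoaders after set_seed, as in the example above, is the safer route.)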

5 Likes

Just wanted to say that this is one of the most important threads on the forum … and hopefully it will find its way into the library, and also get a dedicated place in the docs with respect to when the seed needs to be set.

4 Likes

So what is the definition here of “reproducible”?

I’ve followed the above steps, running my “set_seed” function before creating my dataloaders via dblock.dataloaders(df, num_workers=0), setting the seed in my RandomSplitter, and running that “set_seed” function before creating my Learner and before every call to fit_one_cycle, but the results are never identical.

And that is what I actually expect when training a NN.

So am I wrong? Are folks expecting and getting the exact same validation loss and metrics after each epoch every time they re-run their data loading and training loop code?

Run #1 (results screenshot)

Run #2 (results screenshot)

No kernel restart … no fit_cbs … just rebuilding the Learner and calling fit_one_cycle.

2 Likes

Running a tabular model, I was able to reproduce the exact same validation loss and metrics using random_seed(), placing random_seed(0, use_cuda=False) before both tabular_learner and fit_one_cycle.

Many thanks to @Pomo, @rpcoelho and others.

1 Like