[Solved] Reproducibility: Where is the randomness coming in?

Using only PyTorch (not fastai in this case, but the no less amazing https://github.com/qubvel/segmentation_models.pytorch),

I was having the same problem in Jupyter: I got reproducible results when re-running all cells (without a restart) using all the seed/force-deterministic operations listed above,
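For reference, this is roughly what I mean by the seed/force-deterministic setup — a minimal sketch, the `seed_everything` helper name and the value 42 are just my own convention:

```python
import random

import numpy as np
import torch


def seed_everything(seed: int = 42) -> None:
    # seed Python, numpy and torch (CPU + all visible GPUs)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    # force deterministic cuDNN kernels (slower, but reproducible)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False


seed_everything(42)
```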

but between kernel restarts the results were always different :frowning:

Note: all results (splits, augmentation, pre-training validation epoch) were equal until torch training started. During training something gets affected that I could only fix by setting the mentioned PYTHONHASHSEED env var prior to starting Jupyter.
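In case it helps, a quick sanity check I now put at the top of the notebook to confirm the env var was really set before the kernel launched (e.g. by starting with `PYTHONHASHSEED=0 jupyter notebook`; the value 0 is just an example):

```python
import os

# PYTHONHASHSEED must be set in the environment that launches Jupyter;
# setting os.environ["PYTHONHASHSEED"] inside the notebook is too late.
assert os.environ.get("PYTHONHASHSEED") is not None, \
    "PYTHONHASHSEED was not set before the kernel started"
```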

So far, after doing this, I can fully reproduce results between restarts. Finally!
Really tricky issue and hard to detect. Probably a lot of people think they have reproducible results when they haven't… (much like valid backups :slight_smile:)

Next step: check container restarts :), host restarts, different VMs, and cloud providers… who knows? :slight_smile:

(Note: as mentioned by @esingildinov, PYTHONHASHSEED has to be set prior to the Jupyter/kernel start; setting the env var inside the notebook doesn't work. The same thing is noted here: )
