Lesson 1 Reproducible Results - Setting seed not working

Another clue and partial solution.

The issue is that after setting all the random seeds before creating the DataBunch, training and validation losses are still inconsistent (non-reproducible), starting with the first fit(). The losses fall into a bimodal or trimodal pattern. All of these observations are with a kernel restart between runs, no transforms, and num_workers=0.

What I see is that setting these random seeds again before fit() yields consistent, reproducible loss results. However, setting them before create_cnn still yields the bimodal loss pattern.

Conclusions:

  • Setting random seeds before creating the DataBunch is needed to have a consistent Train/Validate split.

  • create_cnn, in this case with Resnet50, leaves the random seeds in an inconsistent state across runs (a way to check this is sketched after this list). It’s possible that both DataBunch creation and create_cnn leave inconsistencies.

  • Something in fit_one_cycle() then uses a random seed, perhaps to shuffle the images. Because the random seeds are inconsistent across runs, the losses are inconsistent.

  • You can get reproducible results by setting random seeds before creating the DataBunch (with num_workers=0) AND before the first fit_one_cycle().
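One way to check the second conclusion is to compare PyTorch’s RNG state before and after building the learner. This is only a minimal sketch, not the setup from my experiments: it assumes fastai v1 imports, uses the small MNIST_SAMPLE dataset and a Resnet18 purely for speed of illustration:

import torch
from fastai.vision import *   # fastai v1 style imports

path = untar_data(URLs.MNIST_SAMPLE)                 # small stand-in dataset
data = ImageDataBunch.from_folder(path, num_workers=0)

state_before = torch.get_rng_state()                 # snapshot PyTorch's CPU RNG state
learn = create_cnn(data, models.resnet18)            # small model, for illustration only
state_after = torch.get_rng_state()

# If this prints False, building the learner consumed/advanced the global RNG,
# so whatever runs next (e.g. the first fit_one_cycle) starts from a state
# that depends on what create_cnn did.
print(torch.equal(state_before, state_after))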

I should say these conclusions are tentative because 1) there are other possible explanations, such as a flaky GPU; and 2) I have not identified the source of the inconsistency. But I have spent a large number of hours getting to this point, and hope that a more competent developer will eventually investigate.

Here’s the function (originally by someone else) I use to reset every random seed I’ve ever seen mentioned:

import random
import numpy as np
import torch

def random_seed(seed_value, use_cuda):
    np.random.seed(seed_value)                   # NumPy RNG
    torch.manual_seed(seed_value)                # PyTorch CPU RNG
    random.seed(seed_value)                      # Python's built-in RNG
    if use_cuda:
        torch.cuda.manual_seed(seed_value)       # current GPU
        torch.cuda.manual_seed_all(seed_value)   # all GPUs
        torch.backends.cudnn.deterministic = True   # force deterministic cuDNN kernels
        torch.backends.cudnn.benchmark = False      # disable cuDNN autotuning
Remember to use num_workers=0 when creating the DataBunch.
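And here is a usage sketch of the sequence from the conclusions above, with the seeds reset both before creating the DataBunch and again before the first fit. It assumes the random_seed function above is defined and uses fastai v1 style calls; the dataset, seed value, and epoch count are placeholders for illustration, not the ones from my runs:

import torch
from fastai.vision import *   # fastai v1 style imports

use_cuda = torch.cuda.is_available()

random_seed(42, use_cuda)                        # 1) seed BEFORE creating the DataBunch
path = untar_data(URLs.MNIST_SAMPLE)             # placeholder dataset
data = ImageDataBunch.from_folder(path, num_workers=0)

learn = create_cnn(data, models.resnet50, metrics=error_rate)

random_seed(42, use_cuda)                        # 2) seed AGAIN before the first fit
learn.fit_one_cycle(1)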

I hope this information can help anyone else who needs reproducible training. For myself, when trying to squeeze out fractions of a percent for a Kaggle competition, it does not work to have variations of 5% across training runs. With that much variation, the small effects of changes to the model and parameters get lost in the noise.
