Hello all,
I am busy training an image classifier, and I noticed something strange about getting reproducible results between training runs (where each run starts from a freshly restarted notebook kernel).
fastai version: fastai==1.0.60
So in order to ensure reproducible results, I do all the necessary ‘seeding’:
import random
import numpy as np
import torch

seed = 42
# python RNG
random.seed(seed)
# pytorch RNGs
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)
# numpy RNG
np.random.seed(seed)
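One extra line that is not in my snippet above, but that I have seen recommended for fully repeatable runs (this is an assumption on my side, I have not verified that it matters here), is to also switch off the cuDNN autotuner:
# assumption: with benchmark=True cuDNN may pick different convolution
# algorithms from run to run, which could break bit-exact repeatability
torch.backends.cudnn.benchmark = False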
Set up the data:
data = ImageDataBunch.from_csv(TRAIN_PATH, folder='images', csv_labels='train.csv', size=224, bs=64, num_workers=4,
resize_method=ResizeMethod.SQUISH).normalize(imagenet_stats)
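To rule out the train/validation split itself as a source of variation, something like the following could dump the split of each run so two kernel restarts can be diffed (untested sketch; I believe valid_ds.x.items holds the validation file paths in fastai v1, and the output file name is just a placeholder):
# sketch: write out the validation file names for this run so the split
# can be compared against the next run after a kernel restart
valid_items = sorted(str(p) for p in data.valid_ds.x.items)
with open('valid_split_run1.txt', 'w') as f:
    f.write('\n'.join(valid_items))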
I then set up my learner and train:
kappa = KappaScore()
kappa.weights = 'quadratic'
learn = cnn_learner(data, models.resnet18, metrics=[accuracy, kappa])
learn.callback_fns.append(partial(SaveModelCallback, every='improvement', monitor='accuracy'))
learn.fit(5, lr=1e-4, wd=1e-4)
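To compare two runs by more than the printed metrics, one could also save the final weights after fit (again a sketch; the file name run1_final_weights.pth is just something I made up for illustration):
# sketch: persist the weights at the end of this run so the next run
# (after a kernel restart) can be compared tensor-by-tensor
torch.save(learn.model.state_dict(), 'run1_final_weights.pth')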
I then get the following results for run #1:
epoch train_loss valid_loss accuracy kappa_score time
0 0.583971 0.114860 0.966038 0.932593 00:08
1 0.268060 0.064782 0.983019 0.961805 00:07
2 0.149086 0.067273 0.977359 0.957736 00:07
3 0.083732 0.053412 0.981132 0.964634 00:07
4 0.052447 0.064336 0.977359 0.944663 00:07
and run #2:
epoch train_loss valid_loss accuracy kappa_score time
0 0.583971 0.114860 0.966038 0.932593 00:08
1 0.268060 0.064782 0.983019 0.961805 00:07
2 0.149086 0.067273 0.977359 0.957736 00:07
3 0.083732 0.053412 0.981132 0.964634 00:07
4 0.052447 0.064336 0.977359 0.944663 00:07
As one can see, the numbers are exactly the same. Initially I found this strange, because I found it hard to believe that the optimiser (Adam), doing gradient descent (and other fancy stuff), would follow the exact same path to the global (or local) minimum of the loss surface. I then gave it a bit more thought and more or less convinced myself that it is possible: the data split and every other source of ‘randomness’ are identical between runs, so the weight initialisation, the sequence of mini-batches and hence the loss surface are identical too.
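If the final weights of both runs were saved as sketched above, this "exact same path" idea could be checked directly (just a sketch, with the two file names assumed from the earlier snippet):
import torch

# sketch: check whether the two runs ended at literally the same weights
sd1 = torch.load('run1_final_weights.pth', map_location='cpu')
sd2 = torch.load('run2_final_weights.pth', map_location='cpu')
identical = all(torch.equal(sd1[k], sd2[k]) for k in sd1)
print('bitwise identical weights:', identical)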
I then did another test where I monitored the valid_loss instead of the accuracy, i.e.
learn.callback_fns.append(partial(SaveModelCallback, every='improvement', monitor='valid_loss'))
learn.fit(5, lr=1e-4, wd=1e-4)
This then produces the results for run #1:
epoch train_loss valid_loss accuracy kappa_score time
0 0.582052 0.114496 0.964151 0.926979 00:08
1 0.266717 0.064743 0.981132 0.960227 00:07
2 0.149264 0.062584 0.977359 0.949135 00:07
3 0.082239 0.050661 0.986792 0.973122 00:07
4 0.050845 0.065832 0.977359 0.948850 00:07
and results for run #2:
epoch train_loss valid_loss accuracy kappa_score time
0 0.584009 0.112865 0.956604 0.912691 00:08
1 0.266808 0.065932 0.979245 0.950349 00:07
2 0.149355 0.061175 0.984906 0.967462 00:07
3 0.082748 0.050527 0.981132 0.956022 00:07
4 0.052076 0.064512 0.979245 0.954532 00:07
As one can see above, the results are not exactly the same, but they are very similar (which makes sense to me). I did one last test where I did not include the SaveModelCallback at all, and again got results that were similar to, but not exactly the same as, the earlier run where I monitored the accuracy with the SaveModelCallback.
So at this point I am rather confused about why this is happening. If anyone could provide some insight into this, it would be much appreciated.
Thank you!