Hey all,
Just sharing an issue here with the random seed for the validation set split.
https://forums.fast.ai/t/lesson-1-pets-benchmarks/27681/55?u=jamesrequa
Please feel free to verify this on your end as well. Steps to reproduce:
- Set a random seed in the Jupyter notebook:
np.random.seed(2)
- Create an ImageDataBunch
data = ImageDataBunch.from_name_re(path_img, fnames, pat, ds_tfms=get_transforms(), size=299, bs=bs, valid_pct=0.2)
- Save train/val x for later
trn_x, val_x = data.train_ds.x, data.valid_ds.x
- Create a new ImageDataBunch by repeating step 2
- Check the train/val x again for this new data instance:
trn_x2, val_x2 = data.train_ds.x, data.valid_ds.x
- Compare with the first train/valid set and verify they are not the same (a consolidated repro sketch follows below).
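Here is a minimal repro sketch of the steps above in one place, assuming `path_img`, `fnames`, `pat`, and `bs` are already defined as in the lesson 1 notebook (fastai v1):

```python
# Minimal repro sketch, assuming path_img, fnames, pat, and bs are already
# defined as in the lesson 1 pets notebook (fastai v1).
import numpy as np
from fastai.vision import ImageDataBunch, get_transforms

np.random.seed(2)  # seed set once, in its own cell

data1 = ImageDataBunch.from_name_re(path_img, fnames, pat,
                                     ds_tfms=get_transforms(), size=299,
                                     bs=bs, valid_pct=0.2)
val_x1 = data1.valid_ds.x

# Recreate the DataBunch without re-seeding (e.g. after changing bs or size)
data2 = ImageDataBunch.from_name_re(path_img, fnames, pat,
                                     ds_tfms=get_transforms(), size=299,
                                     bs=bs, valid_pct=0.2)
val_x2 = data2.valid_ds.x

# The two validation splits differ, even though the seed was set above
print(set(val_x1.items) == set(val_x2.items))  # expected: False
```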
I have already implemented the code changes that fix this issue, so if you like I can submit a PR. I think this is important to fix right away: it can produce validation loss/error rate results that are not reliable, and it can happen in a very innocent way, e.g. if you just change the batch size or image size (as we saw, one student achieved a 1% error rate on the pets dataset for this reason).
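In the meantime, here is a sketch of a notebook-level workaround (this is not the PR fix itself, just one way to keep the split stable): re-seed immediately before every DataBunch creation, so the validation set stays identical even when only `bs` or `size` changes. The `make_data` helper is hypothetical; `path_img`, `fnames`, `pat` are assumed defined as above.

```python
# Workaround sketch (not the actual PR changes): re-seed right before each
# DataBunch is created so the train/valid split is drawn identically every time.
import numpy as np
from fastai.vision import ImageDataBunch, get_transforms

def make_data(size, bs, seed=2):
    np.random.seed(seed)  # seed just before the random split is drawn
    return ImageDataBunch.from_name_re(path_img, fnames, pat,
                                       ds_tfms=get_transforms(), size=size,
                                       bs=bs, valid_pct=0.2)

data_small = make_data(size=224, bs=64)
data_big = make_data(size=299, bs=32)

# Same validation items in both, so error rates remain comparable
assert set(data_small.valid_ds.x.items) == set(data_big.valid_ds.x.items)
```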