Saving valid_pct generated validation set for later

Is there a way to save a validation set generated by passing valid_pct to ImageDataBunch.from_df so that I’m using the same validation set each time? Or to use the last 20% of each class for the validation set rather than a random sub-sample?

I realized I was letting my model cheat by selecting a validation set containing examples it had previously trained on after I restarted my notebook.

Edit: it’s occurred to me that I could have set my random seed prior to running ImageDataBunch.from_df the first time to make it deterministic – but that doesn’t help me now that I’ve already spent a bunch of time training with that split, unfortunately.

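You can call data.save(). It saves your DataBunch object together with its train/validation split, so in fastai v1 you can reload it later with load_data and get the same split back instead of a new random one.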
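
For the second part of the question (using the last 20% of each class rather than a random sample), here is a minimal sketch in plain Python, not something fastai provides under this name. The helper `last_fraction_per_class` and the sample labels are hypothetical, made up for illustration. It assumes your rows are already in the order you care about and flags the last `valid_pct` of each class as validation.

```python
from collections import defaultdict


def last_fraction_per_class(labels, valid_pct=0.2):
    """Return one boolean per row: True if that row belongs to validation.

    Hypothetical helper, not part of fastai.
    """
    # Group row positions by class label, keeping the original row order.
    rows_by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        rows_by_class[label].append(idx)

    is_valid = [False] * len(labels)
    for rows in rows_by_class.values():
        # Round to the nearest count, but keep at least one validation
        # row per class (a class with a single row goes to validation).
        n_valid = max(1, round(len(rows) * valid_pct))
        for idx in rows[len(rows) - n_valid:]:
            is_valid[idx] = True
    return is_valid


# Made-up labels: 5 cats followed by 10 dogs.
labels = ["cat"] * 5 + ["dog"] * 10
flags = last_fraction_per_class(labels, valid_pct=0.2)
```

If your DataFrame has a label column (say `label`, an assumed name), you could store the result with `df["is_valid"] = last_fraction_per_class(df["label"].tolist())` and write the DataFrame to CSV, so the split survives a notebook restart. As far as I know, the fastai v1 data block API can then split on that column with `split_from_df(col="is_valid")`, but check that against the docs for your version.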
