Hi all - I was wondering, does the fastai toolkit randomly choose a different training, validation, and test set every time you train the neural net, or does it use the same images every time?
When constructing the `DataLoaders`, fastai uses a split function. IIRC, the default is `RandomSplitter`, which randomly divides your data into a train and a valid dataset in an 80/20 ratio. Each time you build the `DataLoaders`, the split is drawn again, so the two datasets will contain different items from one run to the next. You can avoid this by passing a seed to `RandomSplitter`: the train and valid datasets then stay the same, provided the underlying data does not change. There are various other splitters, depending on your task, which let you define the validation dataset explicitly, e.g., by an extra column in your dataframe (`ColSplitter`), by something encoded in the file name, or by folder structure (`GrandparentSplitter`).
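To make the effect of the seed concrete, here is a minimal sketch in plain Python. It is not fastai's implementation: `random_split` is a hypothetical stand-in that only mimics the idea of a seeded random 80/20 split over item indices. In fastai itself the corresponding call would be `RandomSplitter(valid_pct=0.2, seed=42)`.

```python
import random

def random_split(n_items, valid_pct=0.2, seed=None):
    """Hypothetical stand-in for a random splitter (not fastai code).

    Returns (train_idxs, valid_idxs) over the indices 0..n_items-1.
    """
    # seed=None gives a fresh random state on every call;
    # a fixed seed gives the same shuffle every time.
    rng = random.Random(seed)
    idxs = list(range(n_items))
    rng.shuffle(idxs)
    cut = int(valid_pct * n_items)
    # The first `cut` shuffled indices become the validation set.
    return idxs[cut:], idxs[:cut]

# Same seed: the two calls produce identical splits.
train_a, valid_a = random_split(100, seed=42)
train_b, valid_b = random_split(100, seed=42)
print(valid_a == valid_b)

# No seed: the split is redrawn on each call and will usually differ.
_, valid_c = random_split(100)
_, valid_d = random_split(100)
print(valid_c == valid_d)
```

The same reasoning applies when the data changes: a seed fixes the shuffle, not the items, so adding or removing files will still change which items end up in each set.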