06_multicat Constructing a DataBlock

After loading the CSV file into “df”, we define “dsets”:
dsets = dblock.datasets(df)
How do the training set and validation set get defined at this point? What makes that happen?
Thanks

If you have a question about what fastai is doing, the best approach is to check the source code. Looking through the code for the DataBlock class and its datasets function, we see that if no splitter was passed to DataBlock, it falls back to a RandomSplitter.
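To make that fallback concrete, here is a minimal pure-Python sketch of the pattern. It is not fastai's actual code: `random_splitter` and `make_splits` are hypothetical stand-ins, and the 5011-item list is just the total implied by the book's (4009, 1002) split.

```python
import random

def random_splitter(valid_pct=0.2, seed=None):
    """Hypothetical stand-in for a random splitter.

    Returns a function that maps a list of items to
    (train_indices, valid_indices).
    """
    def _inner(items):
        rng = random.Random(seed)
        idxs = list(range(len(items)))
        rng.shuffle(idxs)
        # The first `cut` shuffled indices become the validation set
        cut = int(valid_pct * len(items))
        return idxs[cut:], idxs[:cut]
    return _inner

def make_splits(items, splitter=None):
    # Fall back to a random split when the caller supplied no splitter
    return (splitter or random_splitter())(items)

items = list(range(5011))  # 4009 + 1002 items, as in the book's example
train_idxs, valid_idxs = make_splits(items)
print(len(train_idxs), len(valid_idxs))  # 4009 1002
```

Passing your own splitter (any function that takes the items and returns two index lists) overrides the random default, which is the same idea as passing `splitter=` to DataBlock.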

Note that RandomSplitter defaults to valid_pct=0.2, i.e. an 80% train / 20% valid split, which gives the (4009, 1002) split shown in the book. Hope this helps!
