I’ve been trying to figure out how to use RandomSplitter instead of sklearn’s train_test_split, but I can’t work out how to use it outside of a data block, etc. How can I use it to create a test set to hide from the model? I’m currently attempting this with the MovieLens movie reviews dataset. Any tips on where to look?
I would probably start here: fastai - Data transformations. There is also TrainTestSplitter, which is similar to the scikit-learn one. Try it out and let us know if you face any issues.
It actually is
sklearn.model_selection.train_test_split (just wraps the resulting splits in
So the answer might be: just use that, since it does exactly the task you want done. Unless there is a particular reason you don’t want to use it, in which case please share why, what your use case is, and what you are trying to achieve, so we can give you a more specific answer.
I was a little confused about how to use RandomSplitter() as an object, but figured it out using the code below (for posterity):
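from fastai.data.transforms import RandomSplitter

df = ratings  # the ratings DataFrame loaded earlier
splitter = RandomSplitter()  # returns a function that splits any sized collection
splits = splitter(df)  # two lists of row indices: (train, valid)
splits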
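To answer the original question about hiding a test set: one option is to split twice, first carving off the test rows and then splitting the remainder into training and validation. The sketch below illustrates the idea with the standard library only. `random_split_indices` is a hypothetical stand-in written for this example (it is not fastai's code), and the toy `ratings` list is made up; it only mimics the "return two lists of indices" behaviour of a splitter.

```python
import random

def random_split_indices(n_items, held_out_pct=0.2, seed=None):
    """Hypothetical helper: shuffle range(n_items) and cut it into two index lists."""
    rng = random.Random(seed)
    idx = list(range(n_items))
    rng.shuffle(idx)
    cut = int(held_out_pct * n_items)
    # The first `cut` shuffled indices are held out, the rest are kept.
    return idx[cut:], idx[:cut]

# Toy stand-in for the ratings table: (user, movie, rating) rows.
ratings = [(u, m, (u + m) % 5 + 1) for u in range(10) for m in range(10)]

# Step 1: carve off a test set that the model never sees.
rest_idx, test_idx = random_split_indices(len(ratings), held_out_pct=0.2, seed=42)
test_rows = [ratings[i] for i in test_idx]
rest_rows = [ratings[i] for i in rest_idx]

# Step 2: split what remains into training and validation sets.
train_idx, valid_idx = random_split_indices(len(rest_rows), held_out_pct=0.25, seed=42)
train_rows = [rest_rows[i] for i in train_idx]
valid_rows = [rest_rows[i] for i in valid_idx]

print(len(train_rows), len(valid_rows), len(test_rows))
```

With a pandas DataFrame the same index lists would be applied with `df.iloc[test_idx]` and so on. Fixing the seed keeps the held-out rows the same from run to run, which matters if the test set is supposed to stay hidden across experiments.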