I have completed the ML course up to Lesson 3, and have been trying a couple of competitions on Kaggle to learn and understand the code better. One of these has been the TFI Restaurant Prediction:
As I am only up to Lesson 3, I have extracted features from the date and converted the text data to categories. The training set is very small, so I am using the OOB score and RMSE on the training data as a starting point.
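For reference, this is a minimal sketch of how the OOB score and train RMSE can be computed with scikit-learn's `RandomForestRegressor`. The feature matrix `X` and target `y` here are synthetic stand-ins (the same 137-row size as the competition), not the actual TFI data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for the 137 training rows (hypothetical features/target)
rng = np.random.default_rng(42)
X = rng.normal(size=(137, 10))
y = X[:, 0] + rng.normal(scale=1.0, size=137)

# oob_score=True makes the forest track out-of-bag predictions during fit
model = RandomForestRegressor(n_estimators=100, oob_score=True, random_state=0)
model.fit(X, y)

print("OOB R^2:", model.oob_score_)                       # out-of-bag estimate
rmse = np.sqrt(mean_squared_error(y, model.predict(X)))   # train RMSE (optimistic)
print("Train RMSE:", rmse)
```

Note that the train RMSE is computed on data the forest has already seen, so it will be optimistic; the OOB score is the more honest of the two numbers.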
The problem I was facing is that both the OOB score and the RMSE fluctuated widely across runs. For example:
1st run: -0.04 (OOB)
2nd run: -0.15
3rd run: -0.01
I have had similar issues when trying the ML course techniques on the Titanic dataset.
I think the reason is at least partially in the dataset itself:
The training set only consists of 137 examples. Assuming you set aside 20% for the validation set, each val set is only about 27 examples, so getting a single one wrong has a huge influence on the score. This is not very statistically sound and could explain the large fluctuations between runs, since each run draws a different random selection of val examples. Maybe try your exact same notebook on a larger dataset and see if the problems persist…
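You can see this effect directly by scoring the same model on several different random splits of a 137-row dataset. This sketch uses synthetic data (not the TFI set) and also shows cross-validation, which averages over folds and gives a more stable estimate on small data:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split, cross_val_score

# Synthetic 137-row dataset, same size as the TFI training set
rng = np.random.default_rng(0)
X = rng.normal(size=(137, 10))
y = X[:, 0] + rng.normal(scale=1.0, size=137)

# Score the same model on 10 different random 80/20 splits:
# with only ~27 val rows, the RMSE varies noticeably between splits
rmses = []
for seed in range(10):
    X_tr, X_val, y_tr, y_val = train_test_split(
        X, y, test_size=0.2, random_state=seed)
    m = RandomForestRegressor(n_estimators=100, random_state=0).fit(X_tr, y_tr)
    rmses.append(np.sqrt(mean_squared_error(y_val, m.predict(X_val))))

print("val RMSE spread across splits:", min(rmses), "to", max(rmses))

# Averaging over folds smooths out the luck of any single split
cv = cross_val_score(
    RandomForestRegressor(n_estimators=100, random_state=0),
    X, y, cv=5, scoring="neg_root_mean_squared_error")
print("5-fold mean RMSE:", -cv.mean())
```

Cross-validation (or repeated splits like the loop above) won't remove the underlying noise from having so little data, but it does tell you how much of the fluctuation is just the split rather than the model.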