Kaggle gives us a Train set, which I split into a Train and a Test set, and a Validation set plus its solution (y), which is used to calculate the leaderboard.
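A minimal sketch of that split, assuming the data is already loaded into pandas DataFrames (the names `df_train`, `target`, `df_valid` and `y_valid` are illustrative, not the competition's actual column names):

```python
from sklearn.model_selection import train_test_split

# Hypothetical names: df_train is Kaggle's Train set, 'target' is the
# dependent variable; df_valid / y_valid would come from the separate
# Validation files that Kaggle scores the leaderboard on.
X = df_train.drop(columns=['target'])
y = df_train['target']

# Hold out part of Kaggle's Train set as our own Test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
```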
Jeremy makes the point that it is really important to make sure that performance on the Test set tracks performance on the Validation set. Otherwise, we end up tuning our RF to Test-set results and we'll have a nasty surprise at the end when facing the Validation set.
So, I built a couple of RF models and plotted R² on the Test set against R² on the Validation set. I did not get a linear relationship between the two. So, what to do now?
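To make the comparison concrete, here is a rough sketch of what building "a couple of RF models" and comparing the two scores could look like, assuming scikit-learn's `RandomForestRegressor` and the split from above (the hyperparameter values are just examples, not the ones I actually used):

```python
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestRegressor

# Try a handful of RF configurations and record R² on our Test set
# and on the Validation set that Kaggle uses for the leaderboard.
configs = [
    {'n_estimators': 20,  'min_samples_leaf': 1},
    {'n_estimators': 40,  'min_samples_leaf': 3},
    {'n_estimators': 80,  'min_samples_leaf': 5},
    {'n_estimators': 120, 'min_samples_leaf': 10},
]

test_r2, valid_r2 = [], []
for params in configs:
    m = RandomForestRegressor(n_jobs=-1, random_state=42, **params)
    m.fit(X_train, y_train)
    test_r2.append(m.score(X_test, y_test))      # R² on our Test set
    valid_r2.append(m.score(X_valid, y_valid))   # R² on the Validation set

# If the two sets behave consistently, these points should fall
# roughly on a straight line.
plt.scatter(test_r2, valid_r2)
plt.xlabel('R² on Test set')
plt.ylabel('R² on Validation set')
plt.show()
```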