Review of applying the first lesson to Kaggle's Sberbank Russian Housing Market data


Applying Lesson 1 of Machine Learning
When I applied a random forest model to the Sberbank data from Kaggle, the score was 88% on the entire training set. When I split the training data into two parts, i.e. validation data and training data, I got the following model scores:
Validation data size : RF score
25% --> 24% score
20% --> 30% score
10% --> 36% score
5% --> 42% score
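
For reference, the split experiment above can be sketched roughly as follows. This is a minimal sketch, not the code actually used: `split_holdout` and `r2_score` are hypothetical helpers, the row count of 1000 is a placeholder, and it rests on two assumptions that the post does not confirm: that the validation rows are held out from the end of the training data, and that the reported score is R² (the value that scikit-learn regressors return from `score`).

```python
def split_holdout(rows, valid_frac):
    """Hold out the last `valid_frac` share of rows as validation data.

    Assumption: rows are held out from the end, without shuffling.
    """
    n_valid = int(round(len(rows) * valid_frac))
    n_train = len(rows) - n_valid
    return rows[:n_train], rows[n_train:]


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Placeholder data standing in for the training rows.
rows = list(range(1000))

# The four validation fractions tried in the post.
for frac in (0.25, 0.20, 0.10, 0.05):
    train, valid = split_holdout(rows, frac)
    print(f"valid fraction {frac:.2f}: {len(train)} train rows, {len(valid)} valid rows")
```

In a real run, the model would be fit on `train` only and `r2_score` computed separately on `train` and on `valid`. A training score far above the validation score, as in the numbers above, usually points to overfitting.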
After that, I predicted on the test data; the resulting public leaderboard RMSLE score on Kaggle was 0.419.
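
For comparison with the scores above, RMSLE (root mean squared logarithmic error) is an error measure, so lower is better. The following is a minimal sketch of the standard formula, sqrt(mean((log(1 + pred) - log(1 + actual))^2)); the function name `rmsle` and the sample prices are illustrative, not taken from the competition data.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between actual and predicted values."""
    squared_log_errors = [
        (math.log1p(p) - math.log1p(t)) ** 2
        for t, p in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Illustrative prices only (not real competition data).
actual = [5_000_000.0, 7_500_000.0, 3_200_000.0]
predicted = [4_600_000.0, 9_000_000.0, 3_000_000.0]
print(rmsle(actual, predicted))
```

Because the error is computed on log values, it measures relative rather than absolute error, which is why it is often checked locally on the validation set before submitting.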
I am in the bottom 10%.
Please review this and give feedback.