The professor states in Lesson 2 that the function set_rf_samples and the OOB score are not compatible.
I am a bit confused.
I fail to understand why they are not compatible. As I understand it, the OOB score for a row in the sample is calculated using only the trees where this row was not used during training.
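To make sure I have the mechanics right, here is a minimal sketch of the OOB bookkeeping as I understand it. It is a toy under stated assumptions, not scikit-learn's or fastai's code: each "tree" is replaced by a stand-in that just predicts the mean of the target over its bootstrap rows, and the function name `oob_predictions` and the data are made up for illustration.

```python
import random

def oob_predictions(y, n_trees=50, seed=0):
    """Toy OOB bookkeeping (hypothetical helper, not a library function).

    Each 'tree' is a stand-in model that predicts the mean of y over the
    rows it was trained on. A row's OOB prediction is the average over
    only those trees that did NOT see the row during training.
    """
    rng = random.Random(seed)
    n = len(y)
    sums = [0.0] * n   # running sum of OOB predictions per row
    counts = [0] * n   # number of trees for which the row was out of bag
    for _ in range(n_trees):
        # Standard bootstrap: n draws with replacement from the n rows.
        in_bag = [rng.randrange(n) for _ in range(n)]
        in_bag_set = set(in_bag)
        pred = sum(y[i] for i in in_bag) / n  # stand-in for a fitted tree
        for i in range(n):
            if i not in in_bag_set:  # this tree never saw row i
                sums[i] += pred
                counts[i] += 1
    preds = [s / c if c else None for s, c in zip(sums, counts)]
    return preds, counts

y = [float(i % 7) for i in range(200)]  # made-up target values
preds, counts = oob_predictions(y)
```

The important part is that the set of rows each tree was trained on has to be known, so that a tree can be skipped for the rows it has already seen.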
With subsampling, each tree will get a different subset of rows out of the total dataset. If that's the case, then each row in the dataset will have been used in only one tree, and hence all of the other trees can be used to predict the dependent variable for this row.
Shouldn't the OOB score still be valid then?
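One way to check the subsampling assumption above would be to count, for a simulated forest, how many trees each row ends up in. This is only a sketch: I am assuming each tree independently draws its rows uniformly at random with replacement, which is my guess at the behaviour and not the actual set_rf_samples code. The helper name `trees_per_row` and all the sizes are made up.

```python
import random
from collections import Counter

def trees_per_row(n_rows, sample_size, n_trees, seed=0):
    """Count how many trees each row was sampled into (hypothetical helper).

    Assumption: every tree draws `sample_size` row indices uniformly at
    random, with replacement, independently of the other trees.
    """
    rng = random.Random(seed)
    seen = Counter()
    for _ in range(n_trees):
        # Distinct rows that this tree was trained on.
        rows = {rng.randrange(n_rows) for _ in range(sample_size)}
        seen.update(rows)
    return [seen[i] for i in range(n_rows)]

usage = trees_per_row(n_rows=1000, sample_size=100, n_trees=40)
histogram = Counter(usage)
for k in sorted(histogram):
    print(f"{histogram[k]} rows were used by {k} tree(s)")
```

Whether a row lands in zero, one, or several trees depends on the dataset size, the sample size, and the number of trees, so the counts are worth checking for the sizes actually used.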