Working with Random Forest From Scratch

jonathanl · April 26, 2019, 12:18am

I am having difficulty working with the code from the Random Forest From Scratch lesson.

First, there is a minor difference between the model as built up from the individual cells and the one in the “Putting it all Together” cell:

These differences should not affect the results, but I looked at them in the interest of debugging.

On to the question: when I try to use this code, I do not get intuitive results:

The scikit-learn tree looks OK, but the “rf from scratch” tree makes only one split, and that split is on variable 6, which is random noise. The sklearn tree correctly splits on variable 5, which is informative. Can anyone help me understand this?
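
For anyone who wants to reproduce the sklearn side of this comparison, here is a minimal sketch of the kind of check I mean. The synthetic data (7 columns, with column 5 informative and the rest noise), the helper name `root_split_feature`, and all parameter values are assumptions for illustration; this is not the lesson's data or code.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def root_split_feature(n_rows=1000, n_cols=7, informative=5, seed=0):
    """Fit a small sklearn tree on synthetic data and return the
    index of the feature used for the root split."""
    rng = np.random.default_rng(seed)
    # Every column is standard normal noise...
    X = rng.normal(size=(n_rows, n_cols))
    # ...but the target depends only on the `informative` column.
    y = 3.0 * X[:, informative] + 0.1 * rng.normal(size=n_rows)

    tree = DecisionTreeRegressor(max_depth=3, random_state=seed)
    tree.fit(X, y)
    # tree_.feature[0] holds the feature index used at the root node.
    return int(tree.tree_.feature[0])


root = root_split_feature()
print("sklearn root split is on variable", root)
```

Running the from-scratch tree on the same `X` and `y` and comparing its first split variable against this value should show whether the problem is in the split search itself or somewhere else.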