I am having difficulty working with the code from the Random Forest From Scratch lesson.
First, there is a minor difference between the model as built up cell by cell and the version in the “Putting it all Together” cell:
These differences should not affect the results, but I checked them anyway in the interest of debugging.
On to the question: when I try to use this code, I do not get intuitive results:
The scikit-learn tree looks OK, but the “rf from scratch” tree makes only one split, and that split is on variable 6, which is random noise. The scikit-learn tree correctly splits on variable 5, which is informative. Can anyone help me understand this?
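
To help narrow this down, here is a minimal sanity check of what a first split should look like. It is only a sketch, not the lesson's code: the synthetic data (column 5 informative, every other column noise), the helper names `make_data`, `sse`, and `best_split`, and the use of summed squared error as the split score are all assumptions made for illustration. The lesson's own scoring may differ.

```python
import random

def make_data(n_rows=200, n_cols=7, seed=0):
    """Hypothetical data: column 5 drives y; all other columns are noise."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(n_cols)] for _ in range(n_rows)]
    y = [10.0 * row[5] + rng.gauss(0, 0.1) for row in X]
    return X, y

def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y, min_leaf=5):
    """Try every column and every split point; return (score, column, threshold)."""
    best = (float("inf"), None, None)
    n = len(y)
    for col in range(len(X[0])):
        # Row indices sorted by this column's value.
        order = sorted(range(n), key=lambda i: X[i][col])
        for k in range(min_leaf, n - min_leaf + 1):
            lo, hi = X[order[k - 1]][col], X[order[k]][col]
            if lo == hi:
                continue  # cannot split between two equal values
            left = [y[i] for i in order[:k]]
            right = [y[i] for i in order[k:]]
            score = sse(left) + sse(right)
            if score < best[0]:
                best = (score, col, (lo + hi) / 2)
    return best

X, y = make_data()
score, col, threshold = best_split(X, y)
print("best column:", col, "threshold:", round(threshold, 3))
```

If a from-scratch tree chooses a noise column on data like this, the first things to check would be whether the best score is being reset for each variable and whether the search loop actually visits every column.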