The professor in lesson 4 states the following:
> We have a linear relationship between the number of leaf nodes and the size of the sample.
I have a tough time building intuition for this statement.
My understanding is that, as a decision tree is built, the number of leaf nodes can vary and need not equal the sample size. The reason: multiple rows in the training data could all fall into the same leaf and receive the same prediction (i.e., the value in a leaf node is the mean of the target values of the observations that land there). In such a case, how can there be a linear relationship between the number of leaf nodes and the sample size?
Or does the Random Forest keep splitting until there is a leaf node for each observation in the training set?
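For context, here is a small experiment I tried (a sketch, assuming scikit-learn's `DecisionTreeRegressor` with default settings, which grows each tree until every leaf is pure, as I believe the lesson's trees do). With a continuous target, rows rarely share an exact value, so the tree keeps splitting down to one observation per leaf and the leaf count tracks the sample size:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

for n in (100, 1000, 10000):
    # Hypothetical data: 5 random features, continuous random target,
    # so every row has a distinct target value.
    X = rng.normal(size=(n, 5))
    y = rng.normal(size=n)

    # Defaults: min_samples_leaf=1, no max_depth -> grow until leaves are pure.
    tree = DecisionTreeRegressor().fit(X, y)
    print(n, tree.get_n_leaves())  # leaf count grows with n
```

If many rows shared the same target value (or we set `min_samples_leaf` higher), leaves would hold several rows and the leaf count would be smaller, which seems to be exactly the case the question is about.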