I have a question about Random Forest: how smart is it about Features in the following scenario? Let me give an example.
1. We have Feature A that has a Gini Index of 0.45
2. We have Feature B that has a Gini Index of 0.45
So each of those Features, on its own, has a Gini Index of 0.45.
Now suppose I create a new Feature C that combines A and B, so that C carries exactly the information in A and B together, like this: Feature C (Feature A + Feature B)
3. Feature C actually has a Gini Index of 0.35, which is much better (lower)!
An example might be a Green Apple, like this:
Feature A (Green)
Feature B (Form is Round)
Feature C (Green & Form is Round)
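To make the setup concrete, here is a minimal sketch of how the Gini Index of a split could be computed for A, B and the combined C. The eight rows of data are made up purely for illustration (they do not reproduce the 0.45 and 0.35 values above), and the helper names `gini` and `split_gini` are my own, not from any library. It assumes binary Features and takes "Gini Index of a Feature" to mean the weighted Gini impurity of the two child nodes after splitting on that Feature.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_gini(feature, labels):
    """Weighted Gini impurity of the two children after splitting on a binary feature."""
    n = len(labels)
    left = [y for x, y in zip(feature, labels) if x]
    right = [y for x, y in zip(feature, labels) if not x]
    return len(left) / n * gini(left) + len(right) / n * gini(right)


# Made-up toy data: a fruit is a green apple only if it is green AND round.
green = [1, 1, 1, 1, 0, 0, 0, 0]            # Feature A
is_round = [1, 1, 0, 0, 1, 1, 0, 0]         # Feature B
green_and_round = [a & b for a, b in zip(green, is_round)]  # Feature C
is_green_apple = [1, 1, 0, 0, 0, 0, 0, 0]   # target label

gini_a = split_gini(green, is_green_apple)
gini_b = split_gini(is_round, is_green_apple)
gini_c = split_gini(green_and_round, is_green_apple)

print("Feature A:", gini_a)
print("Feature B:", gini_b)
print("Feature C:", gini_c)
```

In this toy case A and B give the same weighted impurity and C gives a lower one, which is the same pattern as in my scenario.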
My question is this:
Will the Random Forest pick this up and be just as efficient if I only use Feature A and B and not C? Or will the Random Forest benefit much more from having Feature C with its Gini Index of 0.35, meaning that Feature C would be used at the Root Node more efficiently than Feature A and B?
This is my big question. What can we say about this?