Hi @jeremy, during last Tuesday’s class you mentioned that “better instantiations of random forests generally add more randomness, not less”. This was in response to the question of whether we could/should try models that include all 2- and 3-way interactions of variables. I believe this also led to the point that a model based on random sampling outperforms a model that considers all combinations of levels and factors (including all factor interactions).
Can you please elaborate on why this is the case? Is it because basing a model on randomness allows it to generalize better? Intuitively, I get why this would be preferable when the alternative misses possible interactions, but is it also true when comparing against a model that is “exhaustive” with respect to the variables and interactions it considers? Thanks!