Partial Dependence discussion / questions

Thanks @kcturgutlu for helping my friends understand Spearman rank correlation. I’m having a hard time applying your definition to the dendrogram visualization (the dendrogram is charting these Spearman rank correlations, yes?).

As part of my data encoding, I’ve either one-hot encoded the labels or given them numbered labels. Is that what you mean by “continuous and discrete ordinal variables”?

If I look at the dendrogram, I see that it makes a series of splits and ends in pairs of labels that are likely correlated. Can you help me develop an intuition about what’s happening under the hood when we run our data df_keep through a Spearman rank correlation analysis?
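To make the question concrete, here is a minimal sketch of what I think is going on. It is my own toy reconstruction, not necessarily the course notebook's code: the small df_keep below is made up, the distance 1 - correlation and the "average" linkage are assumptions on my part, and the column names are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import spearmanr
from scipy.cluster import hierarchy as hc
from scipy.spatial.distance import squareform

# Made-up stand-in for df_keep (column names are hypothetical).
rng = np.random.default_rng(0)
x = rng.normal(size=200)
df_keep = pd.DataFrame({
    "a": x,
    "a_monotone": np.exp(x),    # nonlinear, but always moves in the same direction as "a"
    "b": rng.normal(size=200),  # unrelated to the other two
})

# Step 1: Spearman is Pearson correlation computed on the ranks of each column.
corr_from_ranks = df_keep.rank().corr(method="pearson").to_numpy()
corr, _ = spearmanr(df_keep)    # full column-by-column correlation matrix

# Step 2: turn correlation into a distance (assumption: 1 - corr),
# so strongly correlated columns sit close together.
dist = 1 - corr
np.fill_diagonal(dist, 0)
condensed = squareform(dist, checks=False)

# Step 3: hierarchical clustering repeatedly merges the two closest groups.
# The dendrogram is just a drawing of this merge order.
z = hc.linkage(condensed, method="average")
leaf_order = hc.dendrogram(z, labels=list(df_keep.columns), no_plot=True)["ivl"]
print(leaf_order)
```

If I have this right, "a" and "a_monotone" get merged first because their ranks are identical, even though their raw values are not linearly related, and "b" joins last.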

I believe this procedure helps me identify redundant categories, so that I can drop them from my data set and then re-train my Random Forest model on the reduced data, yes?
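Here is roughly how I picture that drop-and-retrain check, again as a sketch on made-up data. The column names, the helper oob_score, and the choice of scikit-learn's out-of-bag R² as the comparison metric are all my assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Made-up data: "size_twin" carries the same information as "size".
rng = np.random.default_rng(0)
n = 500
X = pd.DataFrame({"size": rng.normal(size=n), "age": rng.normal(size=n)})
X["size_twin"] = 2 * X["size"] + 1
y = 3 * X["size"] - X["age"] + rng.normal(scale=0.1, size=n)

def oob_score(features):
    """Fit a forest and return its out-of-bag R^2 (hypothetical helper)."""
    model = RandomForestRegressor(n_estimators=50, oob_score=True, random_state=0)
    model.fit(features, y)
    return model.oob_score_

baseline = oob_score(X)
reduced = oob_score(X.drop(columns=["size_twin"]))

# If the dropped column really was redundant, the two scores should be close.
print(f"all columns: {baseline:.3f}, without size_twin: {reduced:.3f}")
```

My understanding is that if the score barely moves after dropping a column, the column was redundant and the simpler data set is fine to keep.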

Asking for some friends :slight_smile: !

Thanks in advance!