Hierarchical clustering of features

In Lesson 4 (Interpreting Random Forests), we construct a dendrogram from distances based on the rank correlation between pairs of features.

However, the rank correlation was not computed on one-hot encoded features; each categorical variable is a single column of integer codes. The order of those codes may therefore be meaningless (by default it is alphabetical).

Doesn’t this cause a problem, since rank correlation only captures monotonic relationships?
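
To make the concern concrete, here is a minimal sketch on made-up data. The column names, the toy values, and the use of `scipy.stats.spearmanr` are my own assumptions for illustration, not taken from the lesson notebook. It compares the rank correlation obtained with alphabetical category codes against the one obtained with a meaningful category order.

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical data: a numeric feature fully determined by a category.
level = pd.Series(["low", "medium", "high"] * 20)
value = level.map({"low": 1.0, "medium": 2.0, "high": 3.0})

# Default encoding: categories sorted alphabetically
# (high=0, low=1, medium=2), which scrambles the natural order.
alpha_codes = pd.Categorical(level).codes

# Encoding with an explicit, meaningful order (low=0, medium=1, high=2).
ordered_codes = pd.Categorical(
    level, categories=["low", "medium", "high"], ordered=True
).codes

# Spearman rank correlation of each encoding with the numeric feature.
rho_alpha = spearmanr(alpha_codes, value)[0]
rho_ordered = spearmanr(ordered_codes, value)[0]

print(f"alphabetical codes: rho = {rho_alpha:.2f}")
print(f"ordered codes:      rho = {rho_ordered:.2f}")
```

On this toy data the ordered encoding gives a rank correlation of 1, while the alphabetical encoding gives -0.5, even though the category determines the numeric feature exactly in both cases. So the distance used for the dendrogram would depend on an arbitrary ordering.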
