I agree that a max_card of 1 is weird, and it took me a while to figure out what was going on: setting max_card to 1 I got 51 categorical variables, and setting it to 9000 I got 60. I then looked at some of the 51 from the first case and found that they all had more than one category.
If you look at the source code for the cont_cat_split(...) function (e.g. here), you can see where the trick is: a variable is considered continuous if it has integer values and more than max_card unique values, or if it has float values. Everything else is treated as categorical, whatever its cardinality. The 51 categorical variables are all string-valued, which is why they stay categorical even with max_card set to 1!
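
To make the rule concrete, here is a minimal sketch of that logic as I read it. It is not the library's actual code: the function name split_cont_cat_sketch and the toy DataFrame are made up for illustration, and it assumes only pandas is installed.

```python
import pandas as pd


def split_cont_cat_sketch(df, max_card=20):
    """Hypothetical re-implementation of the rule described above.

    A column is continuous if it is integer-typed with more than
    `max_card` unique values, or if it is float-typed. Anything else
    (strings included) is categorical, regardless of cardinality.
    """
    cont_names, cat_names = [], []
    for label in df.columns:
        col = df[label]
        is_int = pd.api.types.is_integer_dtype(col.dtype)
        is_float = pd.api.types.is_float_dtype(col.dtype)
        if (is_int and col.nunique() > max_card) or is_float:
            cont_names.append(label)
        else:
            cat_names.append(label)
    return cont_names, cat_names


# Toy data (made up): one string column, one integer column, one float column.
df = pd.DataFrame({
    "city": ["a", "b", "c", "a"],
    "rooms": [1, 2, 3, 2],
    "price": [1.5, 2.0, 3.25, 2.0],
})

low = split_cont_cat_sketch(df, max_card=1)
high = split_cont_cat_sketch(df, max_card=9000)
print(low)
print(high)
```

With max_card=1 the string column is still categorical even though it has three distinct values, while the integer column only becomes categorical once max_card is raised past its number of unique values. That matches the 51 versus 60 counts above: the extra 9 would be integer columns.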