Define cont_cat_split parameters?

I’m trying to understand the parameters in the following code from the 09_tabular notebook:

cont,cat = cont_cat_split(df, 1, dep_var=dep_var)

What does the “1” mean?

According to the docs, the 1 is the value passed positionally for the max_card argument.

I’m still trying to understand max_card.

The documentation states:

cont_cat_split

cont_cat_split(df, max_card=20, dep_var=None)

Helper function that returns column names of cont and cat variables from given df.

This function works by determining if a column is continuous or categorical based on the cardinality of its values. If it is above the max_card parameter (or a float datatype) then it will be added to the cont_names else cat_names. An example is below:
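To make the quoted rule concrete, here is a minimal pandas-only sketch of it. This is not the fastai source: split_by_cardinality and the toy DataFrame are hypothetical names made up for illustration. It also assumes that only integer columns are compared against max_card, so string columns always end up categorical, and it accepts only a single dep_var name.

```python
import pandas as pd

def split_by_cardinality(df, max_card=20, dep_var=None):
    """Hypothetical sketch of the rule quoted above (not the fastai source)."""
    cont_names, cat_names = [], []
    for col in df.columns:
        if col == dep_var:
            continue  # the dependent variable goes in neither list
        is_float = pd.api.types.is_float_dtype(df[col])
        is_int = pd.api.types.is_integer_dtype(df[col])
        # Floats are always continuous; integers (assumption) only when
        # they have more than max_card distinct values.
        if is_float or (is_int and df[col].nunique() > max_card):
            cont_names.append(col)
        else:
            cat_names.append(col)
    return cont_names, cat_names

# Toy data, invented for this example.
df = pd.DataFrame({
    "price": [1.5, 2.0, 3.25, 4.0],            # float
    "year": [2001, 2002, 2003, 2004],          # int, 4 distinct values
    "doors": [2, 4, 2, 4],                     # int, 2 distinct values
    "color": ["red", "blue", "red", "green"],  # strings
    "sold": [0, 1, 0, 1],                      # dependent variable
})

# The second positional argument is max_card, as in cont_cat_split(df, 1, ...).
cont_1, cat_1 = split_by_cardinality(df, 1, dep_var="sold")
cont_20, cat_20 = split_by_cardinality(df, max_card=20, dep_var="sold")

print(cont_1, cat_1)    # ['price', 'year', 'doors'] ['color']
print(cont_20, cat_20)  # ['price'] ['year', 'doors', 'color']
```

Under these assumptions, max_card is a threshold on the number of distinct values in a column, not a count of categorical variables. With 1, any integer column with more than one distinct value is treated as continuous. With the default of 20, integer columns with 20 or fewer distinct values are treated as categorical.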

Would max_card be the number of cat variables? In the documentation it defaults to 20. How would you interpret 20 vs. 1?

Sorry, I’m not really sure. It only comes up once in the fastbook and once in the docs, in the same place I linked to in my last reply.
