I'm trying to understand the parameters in the following code from the 09_tabular notebook:
cont,cat = cont_cat_split(df, 1, dep_var=dep_var)
What does the “1” mean?
The docs state that the 1 is the value being passed in for the max_card argument.
I’m still trying to understand max_card. The documentation states:
cont_cat_split(df, max_card=20, dep_var=None)
Helper function that returns column names of cont and cat variables from given df.
This function works by determining if a column is continuous or categorical based on the cardinality of its values. If it is above the max_card parameter (or a float datatype) then it will be added to the cont_names else cat_names. An example is below:
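To make the cardinality rule concrete, here is a minimal sketch in plain Python. It is not fastai's source: split_by_cardinality is a hypothetical stand-in, the table is modelled as a dict of lists rather than a DataFrame, and it assumes the cardinality check only applies to integer columns. The point it illustrates is that max_card is a threshold on the number of distinct values in a single column, not a count of categorical variables.

```python
def split_by_cardinality(columns, max_card=20, dep_var=None):
    """Hypothetical re-implementation of the documented rule (not fastai's code).

    `columns` maps column name -> list of values (a stand-in for a DataFrame).
    """
    cont_names, cat_names = [], []
    for name, values in columns.items():
        if name == dep_var:
            continue  # the dependent variable is never returned
        is_float = all(isinstance(v, float) for v in values)
        # Assumption: only integer columns are tested against max_card.
        is_int = all(isinstance(v, int) and not isinstance(v, bool) for v in values)
        n_unique = len(set(values))  # cardinality = number of distinct values
        if is_float or (is_int and n_unique > max_card):
            cont_names.append(name)
        else:
            cat_names.append(name)
    return cont_names, cat_names


# Made-up illustration data.
table = {
    "year":   [2001, 2002, 2003, 2004],   # integers, 4 distinct values
    "flag":   [0, 1, 0, 1],               # integers, 2 distinct values
    "price":  [1.5, 2.0, 2.5, 3.0],       # floats
    "state":  ["CA", "NY", "CA", "TX"],   # strings
    "target": [10.0, 20.0, 30.0, 40.0],   # dependent variable
}

low = split_by_cardinality(table, max_card=1, dep_var="target")
high = split_by_cardinality(table, max_card=20, dep_var="target")
print(low)   # (['year', 'flag', 'price'], ['state'])
print(high)  # (['price'], ['year', 'flag', 'state'])
```

Under these assumptions, passing 1 means any integer column with more than one distinct value is treated as continuous, whereas the default of 20 keeps integer columns with 20 or fewer distinct values categorical.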
Would max_card be the number of cat variables? In the documentation it is set to 20... how would you define 20 vs 1?
Sorry, I'm not really sure. It only comes up once in the fastbook and once in the docs, in the same place I linked to in my last reply.