I’ve just run the source code for the lesson3-rossman example to figure out what each line means. But when I read the source of “proc_df” (which resides in fastai/structured.py) and ran the dataframe through it, I was confused by the code below. I don’t know why we need the “pd.get_dummies” call here:
> if do_scale: mapper = scale_vars(df, mapper)
> for n,c in df.items(): numericalize(df, c, n, max_n_cat)
> res = [pd.get_dummies(df, dummy_na=True), y, na_dict]
The “scale_vars” function applies StandardScaler to all the numeric columns.
The “numericalize” function turns all the categorical columns into integer category codes (numeric types).
So after these two functions, all the columns in “df” have been turned into numeric columns. My question is: why do we need to call pd.get_dummies(df, dummy_na=True) to one-hot encode “df”, since “df” no longer has any categorical columns, which means “get_dummies” has no effect at all? Is this redundant code, or is there some scenario I haven’t figured out? Thanks
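
To make the question concrete, here is a minimal sketch of the two cases I can think of. It does not use fastai itself: “numericalize_sketch” is a hypothetical stand-in I wrote, and it assumes that the real “numericalize” only converts a categorical column to codes when “max_n_cat” is None or the column has more than “max_n_cat” distinct values. That assumption is my guess from the “max_n_cat” argument in the call above. The toy dataframe and its column names are made up.

```python
import pandas as pd

def numericalize_sketch(df, col, name, max_n_cat):
    """Hypothetical stand-in for fastai's numericalize (assumed behaviour).

    Replaces a categorical column with integer codes (+1, so that a missing
    value becomes 0) only when max_n_cat is None or the column has more
    than max_n_cat distinct values.
    """
    is_numeric = pd.api.types.is_numeric_dtype(col)
    if not is_numeric and (max_n_cat is None or col.nunique() > max_n_cat):
        df[name] = col.cat.codes + 1

# Toy data: one numeric column and two categorical columns.
df = pd.DataFrame({
    "sales": [1.0, 2.0, 3.0, 4.0],
    "store": pd.Categorical(["a", "b", "c", "d"]),      # 4 distinct values
    "promo": pd.Categorical(["yes", "no", "yes", "no"]),  # 2 distinct values
})

# Case 1: max_n_cat=None, so every categorical column becomes integer codes.
all_codes = df.copy()
for n, c in all_codes.items():
    numericalize_sketch(all_codes, c, n, max_n_cat=None)
dummies_all = pd.get_dummies(all_codes, dummy_na=True)

# Case 2: max_n_cat=3, so "promo" (2 values) is left as a categorical column.
mixed = df.copy()
for n, c in mixed.items():
    numericalize_sketch(mixed, c, n, max_n_cat=3)
dummies_mixed = pd.get_dummies(mixed, dummy_na=True)

print(list(dummies_all.columns))    # same columns as before get_dummies
print(list(dummies_mixed.columns))  # "promo" replaced by indicator columns
```

In case 1 every column is already numeric, so pd.get_dummies returns the same columns, which matches what I described. In case 2 the low-cardinality column is still categorical when it reaches pd.get_dummies, so it gets one-hot encoded. If the real “numericalize” works the way this sketch assumes, that would be a scenario where the call is not redundant.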