Hello, I just completed the first lesson and coded every single line myself. I have two questions about proc_df.
First, what does the nas parameter mean?
Second, I noticed that some columns in the original DataFrame are text, such as fiProductClassDesc. Does proc_df convert them to numerical columns, as it does with any categorical column?
nas is a dictionary with column names as keys and the respective medians as values. These are the medians that proc_df used to fill in missing values in numeric columns. You can easily print it out and check.
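To make that concrete, here is a rough pandas-only sketch of what I understand the missing-value step to do. It is not the library's actual code: the toy DataFrame, the column names a and b, and the _na suffix handling are my own assumptions for illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Toy data (made up): column "a" has one missing value, "b" has none.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [10.0, 20.0, 30.0]})

nas = {}
for name in list(df.columns):
    col = df[name]
    # Only numeric columns that actually contain missing values are recorded.
    if is_numeric_dtype(col) and col.isnull().any():
        filler = float(col.median())   # median ignores NaN by default
        df[name + "_na"] = col.isnull()  # flag rows that were missing
        df[name] = col.fillna(filler)    # fill the gaps with the median
        nas[name] = filler

print(nas)
```

Here nas ends up with a single entry for a, because b had nothing to fill. Keeping this dictionary around lets you fill the validation or test set with the same medians as the training set.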
You are right, proc_df replaces the values of categorical columns with their category codes. But you first have to set the column’s dtype to ‘category’, either manually or using train_cats().
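A minimal sketch of the manual route, using plain pandas only. The category values below are made up for illustration, and the + 1 shift is my assumption about how the library handles missing values, so check the source before relying on it.

```python
import pandas as pd

# Made-up values; the real fiProductClassDesc strings look different.
df = pd.DataFrame({"fiProductClassDesc": ["Backhoe", "Dozer", None, "Backhoe"]})

# Manually set the dtype to 'category' (train_cats() does this for string columns).
df["fiProductClassDesc"] = df["fiProductClassDesc"].astype("category")

# pandas assigns codes in sorted category order and uses -1 for missing values.
codes = df["fiProductClassDesc"].cat.codes

# Assumption: shifting by one makes missing values 0 and real categories start at 1.
shifted = codes + 1

print(list(df["fiProductClassDesc"].cat.categories))
print(codes.tolist())
print(shifted.tolist())
```

The numbers carry no meaning beyond identifying the category, which is fine for tree-based models like the random forest in the lesson.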
Also, this is all documented in the code, so you can just press Shift + Tab on the function name and take a look at it.