How to apply proc_df on test data?

Has anyone tried making predictions on the actual test data? I am having problems with the proc_df function. After I apply apply_cats to the data, proc_df raises "AttributeError: Can only use .cat accessor with a 'category' dtype". Please help.

I got this error with date columns when I didn't call:
add_datepart(train, 'Date')

If your data has any kind of date column, it's worth trying.
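For anyone who wants to see roughly what that step amounts to, here is a minimal pandas-only sketch. It is not the library's code: the toy data and the output column names (DateYear, DateMonth and so on) are made up for illustration. The idea is just that the raw date column is replaced by numeric parts, so no unprocessed date is left in the frame.

```python
import pandas as pd

# Toy data, invented for this example.
df = pd.DataFrame({"Date": ["2015-07-31", "2015-08-01"], "Sales": [10, 20]})

# Parse the strings into real datetimes first.
df["Date"] = pd.to_datetime(df["Date"])

# Expand the date into a few numeric parts (hypothetical column names).
for part in ("year", "month", "day", "dayofweek"):
    df["Date" + part.capitalize()] = getattr(df["Date"].dt, part)

# Drop the original date column so only numeric columns remain.
df = df.drop(columns="Date")

print(df.columns.tolist())
```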


I found what I was doing wrong: I was calling apply_cats after making all the other changes to df_raw. Make sure you make a copy of df_raw right after applying train_cats, so you still have the categorical version to pass to apply_cats.


It is usually best to create df_test (from test.csv) at the same time as you create df_train, and then process the two together.

Make sure that when you apply train_cats to df_train, you apply apply_cats to df_test right after. Don't run proc_df on df_train before applying apply_cats to df_test. Can you think why?

It is because once df_train has been through proc_df, its categorical features have been converted to numbers. apply_cats then has no categories left to copy over, so it will not do anything.
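Here is a small pandas-only sketch of the ordering issue. The three helper functions are hypothetical stand-ins that only imitate what I understand train_cats, apply_cats and the numeric-encoding step of proc_df to do; they are not the library's implementations, and the data is made up.

```python
import pandas as pd

def train_cats_sketch(df):
    # Turn every string column into an ordered categorical, in place.
    for name in df.columns:
        if pd.api.types.is_string_dtype(df[name]):
            df[name] = df[name].astype("category").cat.as_ordered()

def apply_cats_sketch(df, trn):
    # Re-encode df's columns using the categories found in trn.
    # Columns that are not categorical in trn are skipped silently.
    for name in df.columns:
        if name in trn.columns and trn[name].dtype.name == "category":
            df[name] = pd.Categorical(
                df[name], categories=trn[name].cat.categories, ordered=True
            )

def to_codes_sketch(df):
    # Return a copy with each categorical column replaced by integer codes.
    # Codes are shifted by 1 so unseen or missing values become 0.
    out = df.copy()
    for name in out.columns:
        if out[name].dtype.name == "category":
            out[name] = out[name].cat.codes + 1
    return out

df_train = pd.DataFrame({"state": ["CA", "NY", "TX", "NY"], "sales": [1, 2, 3, 4]})
df_test = pd.DataFrame({"state": ["TX", "CA", "WA"], "sales": [5, 6, 7]})

# Right order: categories are shared before anything is made numeric.
train_cats_sketch(df_train)
apply_cats_sketch(df_test, df_train)
train_num = to_codes_sketch(df_train)
test_num = to_codes_sketch(df_test)

# Wrong order: the training frame is already numeric, so nothing is copied.
df_test_late = pd.DataFrame({"state": ["TX", "CA", "WA"], "sales": [5, 6, 7]})
apply_cats_sketch(df_test_late, train_num)
late_is_categorical = df_test_late["state"].dtype.name == "category"
```

In the wrong-order case the test column is still plain strings, so any later code that reaches for its .cat accessor fails with the AttributeError from the original question.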