Rossmann example: add_datepart function in the fastai library


(Safouane Chergui) #1

Hello everyone!

In the Rossmann notebook (lesson 6), some of the columns produced by add_datepart are treated as categorical variables. What happens if some categories never appear in the training set (for example, if a particular day of the week is missing from it)?

I think treating something like day of week as a categorical variable makes sense, but we can never be quite sure that every day of the week appears in the training set, so the safer option seems to be treating these columns as continuous variables.

I’d like to know your take on this! Thanks!


(Kyle Nesgood) #2

IIRC, each categorical column gets an extra “missing” category to handle this exact situation. If you have a categorical column with 7 values, I believe you’ll see that the column in the data bunch actually has 8 categories. I’m on my phone and can’t find it in the docs ATM, but will look when I have a chance.
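
To make the idea concrete, here is a minimal sketch in plain Python. It is not fastai's actual code: `build_vocab`, `encode` and the sample data are hypothetical, and the assumption is simply that index 0 is reserved for missing or unseen values while the categories seen in training are numbered from 1.

```python
# Hypothetical sketch (not fastai's implementation): reserve index 0
# for values that are missing or were never seen in the training set.
UNKNOWN = 0


def build_vocab(train_values):
    """Map each category seen in training to an index starting at 1."""
    cats = sorted(set(train_values))
    return {cat: i for i, cat in enumerate(cats, start=1)}


def encode(values, vocab):
    """Encode values; anything not in the vocab falls back to UNKNOWN (0)."""
    return [vocab.get(v, UNKNOWN) for v in values]


# Made-up data: the training set happens to contain no Sundays (day 6).
train_dow = [0, 1, 2, 3, 4, 5, 0, 1]
test_dow = [4, 5, 6]

vocab = build_vocab(train_dow)

# One embedding row per seen category, plus one for the reserved slot.
n_embedding_rows = len(vocab) + 1

train_codes = encode(train_dow, vocab)
test_codes = encode(test_dow, vocab)

print(n_embedding_rows)  # 7
print(test_codes)        # [5, 6, 0]
```

With this scheme an unseen day does not break the lookup; it just lands in the reserved slot. Note that the embedding for that slot is only trained if some training rows actually fall into it, so a category that first shows up at test time still gets a fairly uninformed representation.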


(Safouane Chergui) #3

Oh great… Thanks! I’ll look into it.