I’d like to predict n variables, indexed on date, instead of one.
I have daily sales volumes for n products, plus multiple categorical and continuous variables: some created from the date (day of week, etc.), others taken from Google Trends, weather data, etc.
I’d like to build a model to predict all sales volumes in one go.
How can I do this using fastai tabular?
You can pass in a list of columns for dep_var and it will do exactly what you want.
Do they have to be in cont_vars as well?
No! You actually want to leave them out of cont_vars, since the cont and cat vars are your independent variables.
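To make the split between dependent and independent columns concrete, here is a minimal sketch assuming fastai v1 (the TabularList data block API). The column names, the layer sizes and the build_learner helper are all made up for illustration; they are not from anyone's notebook in this thread.

```python
# Hypothetical column names, for illustration only.
dep_var = ["sales_a", "sales_b", "sales_c"]          # the n targets
cat_names = ["day_of_week", "month", "is_holiday"]   # categorical inputs
cont_names = ["temperature", "trend_score"]          # continuous inputs

# The targets must not also appear among the inputs.
overlap = set(dep_var) & set(cat_names + cont_names)
assert not overlap, f"targets listed as inputs: {overlap}"


def build_learner(df, valid_idx):
    """Build a fastai v1 tabular learner with one output per target column."""
    # Imported here so the column setup above can be checked without fastai.
    from fastai.tabular import (Categorify, FillMissing, FloatList, Normalize,
                                TabularList, tabular_learner)

    data = (TabularList.from_df(df, path=".", cat_names=cat_names,
                                cont_names=cont_names,
                                procs=[FillMissing, Categorify, Normalize])
            .split_by_idx(valid_idx)
            # A list of columns plus FloatList gives a multi-output regression.
            .label_from_df(cols=dep_var, label_cls=FloatList)
            .databunch())
    # Layer sizes are arbitrary; the output size follows the number of targets.
    return tabular_learner(data, layers=[200, 100])
```

With a DataFrame that has these columns, build_learner(df, valid_idx) should give a learner whose predictions have one value per entry in dep_var.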
No problem! Your domain sounds very similar to the Rossmann problem back in part one; that notebook can get you started as well.
I did study Rossmann, but the prediction part is still hard for me. So far, I concentrated on data cleaning and building a model. Now, I must learn how to actually use the model.
@tomdraug did you get this working? The docs refer to dep_var as a str type, not list, for
Yes, it works. I get all variables at once.
self.data = (TabularList.from_df(self.df_train_valid, path='.',
@tomdraug Could you share your notebook? I’m working on a similar project and struggling to pull in multiple dependent variables.
I will send it to you next week.
Thanks @tomdraug, I followed your suggestion and it worked well.
Quick remark for newbies like me doing regression with multiple targets: I think it’s important to scale your dependent variables before training. Otherwise the largest one might dominate the loss, and you end up optimizing your NN for only one prediction.
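For anyone wondering what that scaling could look like, here is a rough sketch in plain Python. It standardizes each target column using training-set statistics and maps predictions back to the original units afterwards. The numbers and the helper names (fit_scaler, scale, unscale) are invented for the example, and standardization is only one possible choice of scaling.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Per-column mean and standard deviation from the training targets."""
    cols = list(zip(*rows))
    means = [mean(c) for c in cols]
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in cols]
    return means, stds


def scale(rows, means, stds):
    """Standardize every target column."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def unscale(rows, means, stds):
    """Map scaled values (e.g. model predictions) back to original units."""
    return [[v * s + m for v, m, s in zip(row, means, stds)] for row in rows]


# Made-up daily volumes for two products on very different scales.
train_targets = [[1000.0, 2.0], [1200.0, 4.0], [1400.0, 6.0]]
means, stds = fit_scaler(train_targets)
scaled = scale(train_targets, means, stds)     # train the model on these
restored = unscale(scaled, means, stds)        # apply to predictions later
```

After scaling, both products contribute on a comparable scale to an MSE-style loss. The statistics should come from the training rows only and be reused for validation and test data.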
I’m very interested in your work. Could you share your notebook? I would like to learn from it and try it on other datasets.