Tabular regression for multiple variables

I’d like to predict n variables, indexed on date, instead of one.

I have daily sales volumes for n products, plus multiple categorical and continuous variables: features created from the date (day of week, etc.) and values taken from Google Trends, weather, and so on.
I’d like to build a model to predict all sales volumes in one go.

How can I do this using fastai tabular?

You can pass in a list of columns for dep_var and it will do exactly what you want :slight_smile:


Do they have to be in cont_vars as well?

No! You actually want to leave them out of cont_vars as cont and cat vars are your independent variables :slight_smile:
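A minimal sketch of that split in plain Python. All column names here are hypothetical and only for illustration: the targets go into the dep_var list, and everything else is divided between the cat and cont lists.

```python
# Hypothetical column names, for illustration only.
all_columns = ["date_dow", "date_month", "trend_score", "temperature",
               "sales_a", "sales_b", "sales_c"]

dep_vars = ["sales_a", "sales_b", "sales_c"]  # targets: one column per product
cat_vars = ["date_dow", "date_month"]         # categorical inputs

# Continuous inputs are whatever remains after removing targets and categoricals.
cont_vars = [c for c in all_columns if c not in dep_vars and c not in cat_vars]

# Sanity check: no target may also be listed as an input.
overlap = set(dep_vars) & set(cat_vars + cont_vars)
assert not overlap, f"targets listed as inputs: {overlap}"

print(cont_vars)
```

Building cont_vars by exclusion like this makes it hard to leave a target among the inputs by accident.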


Cool thank you!!

No problem! Your domain sounds very similar to the Rossmann problem back in part one, so that notebook can get you started as well :slight_smile:


I did study Rossmann, but the prediction part is still hard for me. So far, I have concentrated on data cleaning and building a model. Now I must learn how to actually use the model.


@tomdraug did you get this working? The docs refer to dep_var as a str type, not list, for TabularDataBunch.from_df.


Hi @jc849
Yes, it works
I get all variables at once:
(TabularList.from_df(self.df_train_valid, path='.',
cat_names=self.cat_vars, cont_names=self.cont_vars)
.label_from_df(cols=self.dep_vars, label_cls=FloatList))

Thanks, that’s great!

@tomdraug Could you share your notebook? I’m working on a similar project and struggling to pull multiple dependent variables.

I will send you next week

Thanks @tomdraug I followed your suggestion and it worked well.

Quick remark for newbies like me doing regression with multiple targets: I think it’s important to scale your dependent variables before training; otherwise the largest one might dominate the loss and you end up optimizing your NN for only one prediction.
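A minimal sketch of that kind of scaling, in plain Python with no fastai involved. The helper names (`fit_scaler`, `scale`, `unscale`) and the sample numbers are made up for illustration. Each target column is standardized to mean 0 and standard deviation 1, and the same statistics are kept so predictions can be mapped back to the original units afterwards.

```python
from statistics import mean, pstdev

def fit_scaler(columns):
    """Return {name: (mean, std)} for each target column."""
    stats = {}
    for name, values in columns.items():
        mu = mean(values)
        sigma = pstdev(values)
        # Guard against constant columns to avoid dividing by zero.
        stats[name] = (mu, sigma if sigma > 0 else 1.0)
    return stats

def scale(columns, stats):
    """Standardize every column with the stored mean and std."""
    return {n: [(v - stats[n][0]) / stats[n][1] for v in vals]
            for n, vals in columns.items()}

def unscale(columns, stats):
    """Invert scale(): map standardized values back to original units."""
    return {n: [v * stats[n][1] + stats[n][0] for v in vals]
            for n, vals in columns.items()}

# Two hypothetical products whose sales differ by orders of magnitude.
targets = {
    "sales_a": [1000.0, 1200.0, 800.0, 1000.0],
    "sales_b": [1.0, 3.0, 2.0, 2.0],
}

stats = fit_scaler(targets)         # in practice, fit on the training rows only
scaled = scale(targets, stats)      # train the model on these values
restored = unscale(scaled, stats)   # apply the same inverse to model predictions
```

After scaling, both columns contribute on the same footing to a loss such as mean squared error, instead of `sales_a` swamping `sales_b`.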

Hi @tomdraug,

I’m very interested in your work. Could you share your notebook? I would like to learn from it and play with other datasets.