Tabular regression for multiple variables

(Tom) #1

Hi,
I’d like to predict n variables, indexed by date, instead of one.

I have daily sales volumes for n products, plus several categorical and continuous variables: some derived from the date (day of week, etc.), others taken from Google Trends, weather data, and so on.
I’d like to build a model that predicts all sales volumes in one go.

How can I do this with fastai tabular?

(Zachary Mueller) #2

You can pass a list of column names as dep_var, and the model will predict all of them at once :)
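
A list of targets assumes the table is in wide form: one row per date, one column per product. Here is a minimal sketch of getting there from long-form records using only the standard library. The field names (`date`, `product`, `volume`) and the helper `to_wide` are hypothetical, chosen for illustration; they are not part of fastai.

```python
from collections import defaultdict


def to_wide(records):
    """Pivot (date, product, volume) records into one row per date.

    Returns (rows, dep_vars): rows is a list of dicts sorted by date,
    dep_vars is the sorted list of product column names.
    """
    by_date = defaultdict(dict)
    products = set()
    for rec in records:
        by_date[rec["date"]][rec["product"]] = rec["volume"]
        products.add(rec["product"])

    dep_vars = sorted(products)
    rows = []
    for date in sorted(by_date):  # ISO date strings sort chronologically
        row = {"date": date}
        for product in dep_vars:
            # A missing (date, product) pair becomes None, not 0
            row[product] = by_date[date].get(product)
        rows.append(row)
    return rows, dep_vars


# Hypothetical sample data
records = [
    {"date": "2019-01-01", "product": "apples", "volume": 10.0},
    {"date": "2019-01-01", "product": "pears", "volume": 4.0},
    {"date": "2019-01-02", "product": "apples", "volume": 12.0},
]
rows, dep_vars = to_wide(records)
```

The resulting `dep_vars` list holds the column names that would be passed as the dependent variables. Missing values are left as None here so that the choice of how to fill them stays explicit.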

(Tom) #3

Thanks!
Do they have to be in cont_vars as well?

(Zachary Mueller) #4

No! You actually want to leave them out of cont_vars, since the cat and cont vars are your independent variables (the model inputs) :)
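
A quick way to catch a target that has slipped into the inputs is to check the three column lists against each other before building the data. This is a minimal sketch, assuming hypothetical column names; `check_column_roles` is an illustrative helper, not a fastai function.

```python
def check_column_roles(columns, cat_vars, cont_vars, dep_vars):
    """Verify each named column exists and has exactly one role.

    Raises ValueError on a missing column or on a column listed in
    more than one role. Returns the columns that have no role.
    """
    roles = {"cat_vars": cat_vars, "cont_vars": cont_vars, "dep_vars": dep_vars}
    seen = {}
    for role, names in roles.items():
        for name in names:
            if name not in columns:
                raise ValueError(f"{name!r} ({role}) is not a column in the table")
            if name in seen:
                raise ValueError(f"{name!r} is listed in both {seen[name]} and {role}")
            seen[name] = role
    return [c for c in columns if c not in seen]


# Hypothetical column layout: two inputs types, two sales targets
columns = ["date", "dayofweek", "temperature", "trend", "apples", "pears"]
unused = check_column_roles(
    columns,
    cat_vars=["dayofweek"],
    cont_vars=["temperature", "trend"],
    dep_vars=["apples", "pears"],
)
```

If a sales column such as `apples` were also listed in `cont_vars`, the check would raise instead of letting the model see its own target as an input.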

(Tom) #5

Cool thank you!!

(Zachary Mueller) #6

No problem! Your domain sounds very similar to the Rossmann problem back in part one; that notebook can get you started as well :)

(Tom) #7

I did study Rossmann, but the prediction part is still hard for me. So far I have concentrated on data cleaning and building a model. Now I need to learn how to actually use the model.

(jack) #8

@tomdraug did you get this working? The docs describe dep_var as a str, not a list, for TabularDataBunch.from_df.

(Tom) #9

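Hi @jc849,
Yes, it works. I get all variables at once:

    from fastai.tabular import *  # fastai v1

    # Excerpt from a class; self.dep_vars is a list of target column names
    self.data = (TabularList.from_df(self.df_train_valid, path='.',
                                     cat_names=self.cat_vars, cont_names=self.cont_vars,
                                     procs=self.procs)
                 # hold out the last VALID_SIZE rows as the validation set
                 .split_by_idx(range(len(self.df_train_valid) - self.VALID_SIZE, len(self.df_train_valid)))
                 # a list of columns with FloatList gives multi-target regression
                 .label_from_df(cols=self.dep_vars, label_cls=FloatList)
                 .add_test(TabularList.from_df(self.test_df))
                 .databunch(bs=64))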
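
With several dependent variables, each prediction should be a row holding one value per entry of the dep_vars list, in the same order. Here is a minimal sketch of attaching names to such rows, assuming the predictions have already been converted to plain nested lists (for example, from the tensor a learner returns). `label_predictions` is a hypothetical helper, not a fastai function, and the numbers are made up.

```python
def label_predictions(preds, dep_vars):
    """Turn rows of raw predictions into dicts keyed by target name.

    preds: list of rows, each with one value per dependent variable.
    dep_vars: target column names, in the order used for training.
    """
    labelled = []
    for i, row in enumerate(preds):
        if len(row) != len(dep_vars):
            raise ValueError(
                f"row {i} has {len(row)} values, expected {len(dep_vars)}"
            )
        labelled.append(dict(zip(dep_vars, row)))
    return labelled


# Hypothetical output for two test rows and two products
dep_vars = ["apples", "pears"]
preds = [[10.5, 3.9], [11.2, 4.4]]
labelled = label_predictions(preds, dep_vars)
```

The length check is there because a mismatch between the prediction width and the dep_vars list usually means the list used at prediction time differs from the one used for training.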

(jack) #10

Thanks, that’s great!
