I want to oversample the training data to deal with some class imbalance, so I need separate train and validation dataframes. I don’t believe Tabular has a databunch.from_dfs?
Is there a more efficient way than:
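# Keyword arguments throughout: positional arguments cannot follow path=''
train = (TabularList.from_df(train, path='', cat_names=cat_var, cont_names=cont_var, procs=procs)
         .split_none()
         .label_from_df(dep_var)
         .databunch())
# Reuse the processor fitted on the training data so both sets are encoded the same way
valid = (TabularList.from_df(valid, path='', cat_names=cat_var, cont_names=cont_var, procs=procs, processor=train.processor)
         .split_none()
         .label_from_df(dep_var)
         .databunch())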
train.valid_dl = valid.train_dl
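One alternative that might avoid building two databunches: oversample only the training dataframe, stack the validation dataframe underneath it, and keep track of which row positions belong to validation so a single pipeline can split on them. Below is a minimal pandas sketch of that idea. The helper name oversample_and_stack and the toy data are hypothetical, and the fastai part is left as a comment because it assumes the v1 data block API (split_by_idx) and the variable names from the snippet above.

```python
import pandas as pd

def oversample_and_stack(train_df, valid_df, dep_var, seed=0):
    """Oversample minority classes in train_df only, then append valid_df.

    Returns the combined frame and the row positions of the validation rows.
    (Hypothetical helper, not part of fastai.)
    """
    largest = train_df[dep_var].value_counts().max()
    parts = []
    for _, grp in train_df.groupby(dep_var):
        extra = largest - len(grp)
        if extra > 0:
            # Draw additional rows with replacement until the class matches the largest one
            grp = pd.concat([grp, grp.sample(n=extra, replace=True, random_state=seed)])
        parts.append(grp)
    train_over = pd.concat(parts)
    # ignore_index gives a clean 0..n-1 index, so positions and labels agree
    combined = pd.concat([train_over, valid_df], ignore_index=True)
    valid_idx = list(range(len(train_over), len(combined)))
    return combined, valid_idx

# Toy data: class 1 is heavily under-represented in the training frame
train_df = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0, 5.0], "y": [0, 0, 0, 0, 1]})
valid_df = pd.DataFrame({"x": [6.0, 7.0], "y": [0, 1]})
combined, valid_idx = oversample_and_stack(train_df, valid_df, "y")

# Assumed fastai v1 usage (not run here):
# data = (TabularList.from_df(combined, path='', cat_names=cat_var, cont_names=cont_var, procs=procs)
#         .split_by_idx(valid_idx)
#         .label_from_df(dep_var)
#         .databunch())
```

Because the validation rows are never resampled, the validation metrics still reflect the original class balance.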
Thanks!