Tabular Dataset with a numerical output ✅

brunosan · July 20, 2019, 5:31pm

I am working with tabular data where the prediction is numerical, not categorical.

I was getting both poor results and errors whenever I tried to use any metric other than accuracy. After much searching, I found this comment, which explains that you must pass label_cls=FloatList to label_from_df for the target to be treated as numerical.

Hence:

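from fastai.tabular import *  # fastai v1

# label_cls=FloatList makes the target a float (regression) rather than a category
data = (TabularList.from_df(df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
        .split_by_idx(list(range(0, last_index)))  # rows 0..last_index-1 form the validation set
        .label_from_df(cols=dep_var, label_cls=FloatList)
        .databunch())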
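To make the difference concrete, here is a rough sketch in plain Python. It is only an illustration, not fastai's actual code, and the target values are made up. As far as I understand, when no label_cls is given the label type is inferred from the data, so an integer-valued target can end up treated as a set of classes. The sketch shows what that categorical treatment does to the values compared with keeping them as floats:

```python
# Illustrative sketch only: plain Python, not fastai internals.
# `targets` is a hypothetical integer-valued regression target.
targets = [3, 7, 7, 12, 3]

# Categorical treatment: each distinct value becomes a class index,
# so the model would predict "which class" instead of "how much".
classes = sorted(set(targets))
class_idx = [classes.index(t) for t in targets]

# Numerical treatment: the values are kept as floats, which is what
# a regression loss and regression metrics expect.
float_targets = [float(t) for t in targets]

print(classes)        # [3, 7, 12]
print(class_idx)      # [0, 1, 1, 2, 0]
print(float_targets)  # [3.0, 7.0, 7.0, 12.0, 3.0]
```

With the categorical version, the distance between 7 and 12 is lost (they are just classes 1 and 2), which would fit the bad results and the errors with non-accuracy metrics described above.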

Posting this as a standalone topic, with the search terms I looked for, so others can find it.

muellerzr · July 20, 2019, 5:45pm

Yep, thanks for this! Another great example is the Rossmann notebook from lesson 6, which shows a regression problem from start to finish.