How can I treat the dependent variable of a TabularDataBunch as continuous?

I am trying to follow the tabular example for fastai v1, but I have run into a problem. The dependent variable in the example is categorical (true or false), while mine is continuous (sale prices). I can’t find anything in the docs on how to set the dependent variable type to continuous. When I load the data bunch as follows and apply the transformations, I get over 600 categories (one for each price), which is not what I want:

dep_var = 'SalePrice'
cat_names = ['MSSubClass', 'MSZoning', 'Street', 'Alley', 'LotShape', 'LandContour',
            'Utilities', 'LotConfig', 'LandSlope', 'Neighborhood', 'Condition1', 'Condition2',
            'BldgType', 'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd',
            'RoofStyle', 'RoofMatl', 'Exterior1st', 'Exterior2nd', 'MasVnrType', 'ExterQual', 'ExterCond',
            'Foundation', 'BsmtQual', 'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinType2',
            'Heating', 'HeatingQC', 'CentralAir', 'Electrical', 'KitchenQual', 'Functional', 'FireplaceQu',
            'GarageType', 'GarageYrBlt', 'GarageFinish', 'GarageQual', 'GarageCond', 'PavedDrive',
            'PoolQC', 'Fence', 'MiscFeature', 'MoSold', 'YrSold', 'SaleType', 'SaleCondition']
cont_names = ['LotFrontage', 'LotArea', 'MasVnrArea', 'BsmtFinSF1', 'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF',
              '1stFlrSF', '2ndFlrSF', 'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath',
              'FullBath', 'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'TotRmsAbvGrd', 'Fireplaces', 'GarageCars',
              'GarageArea', 'WoodDeckSF', 'OpenPorchSF', 'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'MiscVal']
procs = [FillMissing, Categorify, Normalize]

n_df = len(df)
p_valid = 0.2
n_valid = int(n_df * p_valid)

valid_idx = range(n_df-n_valid, n_df)

data = TabularDataBunch.from_df(
    path, df, dep_var, valid_idx=valid_idx, procs=procs, cat_names=cat_names, cont_names=cont_names)




CategoryList (1168 items)
[Category 181500, Category 223500, Category 140000, Category 250000, Category 143000]...
Path: data/house



The labels end up with 587 unique categories.
This results in an error during validation, because the validation set contains ‘categories’ (in fact prices) that are not present in the training set.
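
To make the failure concrete, here is a minimal sketch in plain Python (not fastai code; the prices are made up for illustration) of what happens when each distinct price is treated as a class. The class vocabulary is built from the training rows only, so a validation price that never occurred in training has no class index.

```python
# Hypothetical sale prices, invented for illustration only.
train_prices = [181500, 223500, 140000, 250000, 143000, 140000]
valid_prices = [140000, 307000, 129900]

# Treating prices as categories: one class per distinct training price.
classes = sorted(set(train_prices))
class_to_idx = {price: idx for idx, price in enumerate(classes)}

# Validation prices with no matching class cannot be encoded as labels.
unseen = [price for price in valid_prices if price not in class_to_idx]

print(len(classes), "classes from", len(train_prices), "training rows")
print("validation prices without a class:", unseen)
```

With real sale prices almost every row has its own value, which is why the number of classes grows with the dataset and why a continuous target avoids the problem altogether.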

As stated above, I can’t find any information on how to treat the dep var as continuous.

Does anyone have an idea?


So I found a way to treat the price as a float,
either by casting the column like so:

train_df['SalePrice'] = train_df['SalePrice'].astype('float')

or by labeling it with:

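data = (TabularList.from_df(train_df, path=path, cat_names=cat_names, cont_names=cont_names, procs=procs)
                   .split_by_idx(valid_idx)  # assumed: the data must be split before labeling; reuses valid_idx from above
                   .label_from_df(cols=dep_var, label_cls=FloatList)
                   .databunch())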

This however results in a different error:

RuntimeError                              Traceback (most recent call last)
<ipython-input-70-c6076a6ce3f3> in <module>
----> 1, 1e-2)

~/fastai/lib/python3.6/site-packages/fastai/ in fit(self, epochs, lr, wd, callbacks)
    161         callbacks = [cb(self) for cb in self.callback_fns] + listify(callbacks)
    162         fit(epochs, self.model, self.loss_func, opt=self.opt,, metrics=self.metrics,
--> 163             callbacks=self.callbacks+callbacks)
    165     def create_opt(self, lr:Floats, wd:Floats=0.)->None:

~/fastai/lib/python3.6/site-packages/fastai/ in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     92     except Exception as e:
     93         exception = e
---> 94         raise e
     95     finally: cb_handler.on_train_end(exception)

~/fastai/lib/python3.6/site-packages/fastai/ in fit(epochs, model, loss_func, opt, data, callbacks, metrics)
     87             if hasattr(data,'valid_dl') and data.valid_dl is not None and data.valid_ds is not None:
     88                 val_loss = validate(model, data.valid_dl, loss_func=loss_func,
---> 89                                        cb_handler=cb_handler, pbar=pbar)
     90             else: val_loss=None
     91             if cb_handler.on_epoch_end(val_loss): break

~/fastai/lib/python3.6/site-packages/fastai/ in validate(model, dl, loss_func, cb_handler, pbar, average, n_batch)
     52             if not is_listy(yb): yb = [yb]
     53             nums.append(yb[0].shape[0])
---> 54             if cb_handler and cb_handler.on_batch_end(val_losses[-1]): break
     55             if n_batch and (len(nums)>=n_batch): break
     56         nums = np.array(nums, dtype=np.float32)

~/fastai/lib/python3.6/site-packages/fastai/ in on_batch_end(self, loss)
    237         "Handle end of processing one batch with `loss`."
    238         self.state_dict['last_loss'] = loss
--> 239         stop = np.any(self('batch_end', not self.state_dict['train']))
    240         if self.state_dict['train']:
    241             self.state_dict['iteration'] += 1

~/fastai/lib/python3.6/site-packages/fastai/ in __call__(self, cb_name, call_mets, **kwargs)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    188         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]

~/fastai/lib/python3.6/site-packages/fastai/ in <listcomp>(.0)
    185     def __call__(self, cb_name, call_mets=True, **kwargs)->None:
    186         "Call through to all of the `CallbakHandler` functions."
--> 187         if call_mets: [getattr(met, f'on_{cb_name}')(**self.state_dict, **kwargs) for met in self.metrics]
    188         return [getattr(cb, f'on_{cb_name}')(**self.state_dict, **kwargs) for cb in self.callbacks]

~/fastai/lib/python3.6/site-packages/fastai/ in on_batch_end(self, last_output, last_target, **kwargs)
    272         if not is_listy(last_target): last_target=[last_target]
    273         self.count += last_target[0].size(0)
--> 274         self.val += last_target[0].size(0) * self.func(last_output, *last_target).detach().cpu()
    276     def on_epoch_end(self, **kwargs):

~/fastai/lib/python3.6/site-packages/fastai/ in accuracy(input, targs)
     37     input = input.argmax(dim=-1).view(n,-1)
     38     targs = targs.view(n,-1)
---> 39     return (input==targs).float().mean()
     41 def error_rate(input:Tensor, targs:Tensor)->Rank0Tensor:

RuntimeError: Expected object of scalar type Long but got scalar type Float for argument #2 'other'

Excuse me if these errors seem trivial, but as a beginner I find it really hard to figure this out alone.

Best regards,


OK, I found this issue, which deals with the same error.
Training now completes without any errors.

Now I have to tune the parameters to get decent results.
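
For anyone hitting the same traceback: its last frame shows `accuracy` taking the `argmax` of the model output, which yields integer class indices, and comparing them with the targets, which are now floats. Accuracy only makes sense for class labels, so a regression setup needs an error-based metric instead. The sketch below is plain Python rather than fastai code, uses made-up numbers and a hypothetical helper named `rmse`, and is not necessarily the fix described in the linked issue.

```python
import math

def rmse(preds, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(preds, targets)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up predictions and sale prices, for illustration only.
preds = [180000.0, 220000.0, 150000.0]
targets = [181500.0, 223500.0, 140000.0]

# Exact-match accuracy says almost nothing about continuous targets:
# a prediction that is off by a single dollar counts as a miss.
exact_matches = sum(p == t for p, t in zip(preds, targets)) / len(targets)

print("exact-match accuracy:", exact_matches)
print("RMSE:", round(rmse(preds, targets), 2))
```

In fastai v1 the metric is chosen through the `metrics` argument of the learner, so the equivalent change there would be to pass a regression metric instead of `accuracy`.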


Nice job figuring it out!

Is there a chance you can share a notebook with a tabular regression solution?
I tried to adapt my databunch along the lines of the Rossmann notebook but couldn’t get it right.

I understand that I need to change the loss function, but that didn’t work either.

It is strange to me that I can’t find a single notebook/kernel with a tabular regression solution.
Any help would be great.


@Vertigo42, you have probably figured it out by now, but in case others are still looking for this kind of information, I posted a write-up on forecasting with regression for tabular data. (If the link breaks at some point in the future, start from my GitHub page.)


How do I specify that the dependent variable is categorical? I am doing classification with two classes. If I keep the target variable as strings, it works fine, but if I convert the class names to 0 and 1 flags, the results are completely different.
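
Earlier in this thread, integer prices ended up as a `CategoryList`, while casting the column to float (or passing `label_cls=FloatList`) made the target continuous. That suggests the label type is inferred from the column's dtype unless it is set explicitly. The sketch below is a hypothetical plain-Python illustration of such a rule; the function `infer_label_kind` is invented here and is not fastai's implementation.

```python
def infer_label_kind(values):
    """Guess whether labels are continuous or categorical from their type.

    Hypothetical rule for illustration: a non-empty list of floats is
    continuous; anything else (ints, strings, bools) is categorical.
    """
    if values and all(isinstance(v, float) for v in values):
        return "continuous"
    return "categorical"

print(infer_label_kind(["yes", "no", "yes"]))  # string class names
print(infer_label_kind([1, 0, 1]))             # integer flags
print(infer_label_kind([1.0, 0.0, 1.0]))       # float flags
print(infer_label_kind([181500, 223500]))      # integer prices
```

Under that assumption, 0/1 integer flags would still be treated as classes, whereas 0.0/1.0 float flags would turn the task into regression, so it is worth checking the dtype of the converted column and the type of the resulting label list.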