Tabular: Specifying Classification Problem instead of Regression?

Hey guys, I'm trying to use the newest version of fastai tabular.

I'm getting a regression output, which is not what I want. My dependent variable is a column of integer values only (from 1 to 14), but for some reason the model has a single output, and it is a decimal. In the past, fastai would automatically treat this as a classification problem if the dependent variable column held integers. Am I missing something?

Do I have to specify something in the code to make it a classification problem?

For your reference, the following is my code (extracted):

from fastai.tabular.all import *

# dep_var has to be defined before cont_cat_split uses it
dep_var = 'place'  # place column is within [1,2,3,4,5,6,7,8,9,10,11,12,13,14]
cont_nn, cat_nn = cont_cat_split(train_valid, max_card=9000, dep_var=dep_var)
splits = (list(train_idx), list(valid_idx))

procs = [FillMissing, Categorify, Normalize]

to_nn = TabularPandas(train_valid, procs, cat_nn, cont_nn, splits=splits, y_names=dep_var)
dls = to_nn.dataloaders(64)
learn = tabular_learner(dls, layers=[500,250], metrics=accuracy, ps=[0.001,0.01], emb_drop=0.04)

# Predict on the test set with the target column removed
test_df = test.copy()
test_df.drop(['place'], axis=1, inplace=True)
dl = learn.dls.test_dl(test_df)
learn.get_preds(dl=dl)

Output of predictions:
(tensor([[5.2801],
         [3.8952],
         [5.9336],
         ...,
         [3.9908],
         [5.3693],
         [8.3504]]),
 None)

Pass y_block=CategoryBlock() to TabularPandas. (I need to check whether this is mentioned in the docs, as it is a common issue. Nice question :slight_smile:)

Oh yeah, this is mentioned on the tabular tutorial page; I just did not notice it. Thanks, Mueller, for the help!

Hey, I have the same issue. Can you point me to the ‘tabular tutorial page’ you are referring to, please?
I can’t seem to find it.

Thanks a lot. Actually, passing in y_block = CategoryBlock() worked. I’ll go through the docs :slight_smile: