Binary classification with the tabular learner

I had a few quick questions about the tabular learner. I have a dataset where the target is 0 or 1, but for some reason the tabular learner defines the loss as cross entropy rather than log loss or some other function that makes more sense for binary classification. It also seems to return predictions as logits for each class (again assuming multi-class output instead of a single probability for the positive class). This is the code:

data = TabularDataBunch.from_df('models', train_df, target, valid_idx=split[1].values, cont_names=predictors)
learn = tabular_learner(data, layers=[500,200], metrics=[accuracy1, roc()])

This is the loss it’s defining:

FlattenedLoss of CrossEntropyLoss()

Also, if I try to write a custom loss or metric, it passes inputs and targets of different shapes: one is of shape [bs, 2], the other [bs, 1]. They also seem to be logits rather than probabilities, so it's hard to define a threshold. Can you please advise on what I am doing wrong?
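For context, fastai treats a 0/1 target as a two-class problem, so the model emits two raw logits per row while the targets stay as class indices. Under that assumption, a custom metric can apply the softmax itself and threshold the positive-class probability. This is a minimal sketch; the function name and the 0.5 threshold are my own choices, not fastai API:

import torch

def accuracy_binary(preds, targs, thresh=0.5):
    # preds: raw logits of shape [bs, 2]; targs: class indices of shape [bs, 1] (or [bs]).
    # Softmax converts the logits to probabilities; column 1 is P(class == 1).
    prob_pos = torch.softmax(preds, dim=1)[:, 1]
    return ((prob_pos > thresh).long() == targs.view(-1).long()).float().mean()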


Did you end up getting any advice on this? I'm also looking to obtain predicted probabilities of the binary outcome and have been having trouble.
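For the probabilities at inference time: with the default cross-entropy loss, learn.get_preds() in fastai v1 should already return softmax probabilities with one column per class, so the positive-class probability is just column 1. A rough sketch, assuming the default two-class setup:

preds, targs = learn.get_preds()   # defaults to the validation set
prob_positive = preds[:, 1]        # P(target == 1) for each row
# If the values still look like raw logits rather than probabilities:
# prob_positive = torch.softmax(preds, dim=1)[:, 1]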

I have the same question, except for a multiclass tabular data problem.

It might be late, but I'll leave my answer here in case someone has the same problem:

You are not doing anything wrong. If you want to use a sigmoid activation and binary cross-entropy for a binary classification problem, you can try the following (assuming you have one hidden layer):

learn = tabular_learner(data, layers=[500], metrics=[accuracy1, roc()])
learn.model.layers.add_module('4', torch.nn.Sigmoid())
learn.loss_func = BCEFlat()

You may also have to change the number of outputs in the last layer to one; see the sketch below.
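Here is a rough sketch of that last step. It assumes the fastai v1 tabular head built with layers=[500] ends in Linear -> ReLU -> BatchNorm1d -> Linear, so the final Linear sits at index 3; check learn.model to confirm the index on your setup.

import torch.nn as nn

learn = tabular_learner(data, layers=[500])
# Swap the two-output classification head for a single output unit.
last = learn.model.layers[3]                      # the final Linear layer (verify the index with learn.model)
learn.model.layers[3] = nn.Linear(last.in_features, 1)
learn.model.layers.add_module('4', nn.Sigmoid())  # squash the single output to [0, 1]
learn.model.to(learn.data.device)                 # keep the new modules on the same device as the data
learn.loss_func = BCEFlat()                       # flattened BCELoss; converts the 0/1 targets to float

Note that metrics expecting two columns (like the stock accuracy) will no longer match the [bs, 1] output, so you would need a thresholded metric along the lines of the one sketched earlier in the thread.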

I used this code to try to change my loss function to BCE.

This is the error I get when running learn.lr_find():

"ValueError: Target and input must have the same number of elements. target nelement (64) != input nelement (128)"

I've attached a screenshot of the learner.