Binary classification with vision_learner

I’m currently trying to fine-tune a ResNet model on a binary classification dataset. However, when I look at the layers, fastai builds an output layer with two output features and uses cross-entropy loss. What I want instead is a single output unit, passed through a sigmoid and trained with MSE loss. How can I achieve that?

I’ve figured it out myself, here’s the solution I used:

  1. I used the DataBlock API to create a DataBlock with a regression target (0 or 1) and a custom labeling function that outputs 0 or 1.
  2. I used a custom accuracy metric (source code below).
  3. I manually appended a sigmoid layer to the model’s head.

Here’s the DataBlock I used:

block = DataBlock(
    blocks=(ImageBlock, RegressionBlock(n_out=1)),
    get_items=get_image_files,
    # I'm trying to predict skin cancer, so in my case the "positive" label is "Malignant".
    # Cast the comparison to float so the regression target has the right dtype.
    get_y=lambda path: float(parent_label(path) == "Malignant"),
    batch_tfms=aug_transforms()
)
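To make the labeling step concrete: `parent_label` simply takes the name of a file’s parent folder as its class. Here is a minimal stdlib sketch of the same labeling function, using hypothetical paths that assume an ImageFolder-style layout (the folder names and paths are illustrative, not from the original dataset):

```python
from pathlib import Path

def label_fn(path):
    # Mimics fastai's parent_label: the parent folder name is the class.
    # Returns a float (0.0 or 1.0) so it can serve as a regression target.
    return float(Path(path).parent.name == "Malignant")

malignant = label_fn("train/Malignant/0001.jpg")  # 1.0
benign = label_fn("train/Benign/0002.jpg")        # 0.0
```

Once the block is defined, the DataLoaders are built the usual way, e.g. `dls = block.dataloaders(path)`.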

Here’s my custom accuracy metric:

def binary_acc(y_true, y_pred):
    # A prediction counts as correct when it lands within 0.5 of its 0/1 target.
    dist = torch.abs(y_true - y_pred)
    return (dist < 0.5).float().mean()

binary_acc_metric = AccumMetric(binary_acc)

learner = vision_learner(dls, resnet18, metrics=[binary_acc_metric])
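As a quick sanity check, the metric can be exercised with plain tensors, no Learner required. This is a self-contained, equivalent formulation of the metric (a boolean mean rather than `torch.where`); the example tensors are made up for illustration:

```python
import torch

def binary_acc(y_true, y_pred):
    # A prediction counts as correct when it is within 0.5 of its 0/1 target.
    return (torch.abs(y_true - y_pred) < 0.5).float().mean()

preds = torch.tensor([0.9, 0.2, 0.6, 0.4])
targs = torch.tensor([1.0, 0.0, 0.0, 0.0])
acc = binary_acc(targs, preds)  # 3 of 4 predictions within 0.5 -> 0.75
```

Note that the metric is symmetric in its two arguments, so it doesn’t matter which order AccumMetric passes predictions and targets in.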

And here’s how I edited the model’s head:
learner.model[-1].append(nn.Sigmoid())
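This works because the final stage of the head that `vision_learner` builds is an `nn.Sequential`, so a layer can be appended to it. A minimal stand-in sketch in plain PyTorch (the real head is longer, and the 512-feature width is just an assumption for illustration):

```python
import torch
import torch.nn as nn

# Stand-in for the final nn.Sequential stage of the model's head;
# appending works the same way on the real head.
head = nn.Sequential(nn.Linear(512, 1))
head.append(nn.Sigmoid())  # squashes the single logit into (0, 1)

x = torch.randn(4, 512)
out = head(x)  # shape (4, 1), every value strictly between 0 and 1
```

With the sigmoid in place, the single output lines up with the 0/1 regression target, so MSE loss and the accuracy metric above both make sense.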

I hope this is helpful to anyone having the same problem.
