Problem With the Accuracy of My Custom Learner Class

Hello!

I’m implementing my own basic version of a Learner class, in a similar vein to the one in the fastai library. It’s one of the challenges provided in the fastbook.

There seems to be a problem with how the accuracy metric is calculated in my class.

The relevant portions of my class are below:

class Learner:
    """A simple attempt to model a learner. Works with PyTorch models."""

    def __init__(self, train_dataloader, valid_dataloader, model,
                 loss_function, metric_function, learning_rate=1.0):

        self._train_dataloader = train_dataloader
        self._valid_dataloader = valid_dataloader
        self._model = model
        self._loss_function = loss_function
        self._metric_function = metric_function
        self._learning_rate = learning_rate

        self._parameters = list(self._model.parameters())

        self._accuracy = None

    ...

    def _update_parameters(self):
        for parameter in self._parameters:
            parameter.data -= parameter.grad.data * self._learning_rate

        # Reset gradients so they do not accumulate across batches.
        for parameter in self._parameters:
            parameter.grad = None

    def _calculate_accuracy(self):
        accuracies = [self._metric_function(inputs, targets)
                      for inputs, targets in self._valid_dataloader]
        self._accuracy = round(torch.stack(accuracies).mean().item(), 4)

    def _output_accuracy(self, epoch):
        print(f"Epoch: {epoch}; Accuracy: {self._accuracy}")

    def train_model(self, epochs):
        for epoch in range(epochs):
            for x_batch, y_batch in self._train_dataloader:
                predictions = self._make_predictions(x_batch)
                loss = self._calculate_loss(predictions, y_batch)
                self._calculate_gradients(loss)
                self._update_parameters()
            self._calculate_accuracy()
            self._output_accuracy(epoch)

I use my Learner to train a digit classifier that can tell whether a digit is a 3 or a 7.

from fastai.vision.all import *

import learner

# Get data.
path = untar_data(URLs.MNIST_SAMPLE)
Path.BASE_PATH = path

# Put data into tensors.
train3_tensor = torch.stack([tensor(Image.open(image)) for image in
                             (path / 'train' / '3').ls()])
train7_tensor = torch.stack([tensor(Image.open(image)) for image in
                             (path / 'train' / '7').ls()])

valid3_tensor = torch.stack([tensor(Image.open(image)) for image in
                             (path / 'train' / '3').ls()])
valid7_tensor = torch.stack([tensor(Image.open(image)) for image in
                             (path / 'train' / '3').ls()])

# Normalize data.
train3_tensor = train3_tensor.float() / 255
train7_tensor = train7_tensor.float() / 255

valid3_tensor = valid3_tensor.float() / 255
valid7_tensor = valid7_tensor.float() / 255

# Create datasets.
train_x = torch.cat([train3_tensor, train7_tensor]).view(-1, 28 * 28)
train_y = tensor([1] * len(train3_tensor) + [0] * len(
    train7_tensor)).unsqueeze(1)
train_dataset = list(zip(train_x, train_y))

valid_x = torch.cat([valid3_tensor, valid7_tensor]).view(-1, 28 * 28)
valid_y = tensor([1] * len(valid3_tensor) + [0] * len(
    train7_tensor)).unsqueeze(1)
valid_dataset = list(zip(valid_x, valid_y))

# Create dataloaders.
train_dataloader = DataLoader(train_dataset, batch_size=64)
valid_dataloader = DataLoader(valid_dataset, batch_size=64)

# Create model.
linear_model = nn.Linear(28*28, 1)

# Define loss function.
def l1_norm(predictions, targets):
    predictions = predictions.sigmoid()
    return F.l1_loss(predictions, targets)

I define the metric function: it passes the model's outputs through the sigmoid function, thresholds them at 0.5, and counts a prediction as correct when the resulting class matches the target. This produces an accuracy for each batch, and the learner then averages the batch accuracies over the whole validation set in its _calculate_accuracy method.

# Define metric function.
def accuracy(inputs, targets):
    predictions = linear_model(inputs).sigmoid()
    correct_predictions = (predictions > 0.5) == targets
    return correct_predictions.float().mean()
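
As a quick illustration of the thresholding logic on a made-up batch (the numbers below are just for demonstration, not taken from the real data):

# Hypothetical sigmoid outputs and 0/1 targets for a batch of four.
predictions = tensor([[0.9], [0.2], [0.7], [0.4]])
targets = tensor([[1], [0], [0], [1]])

correct_predictions = (predictions > 0.5) == targets
print(correct_predictions.float().mean())  # tensor(0.5000): 2 of 4 correct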

Then I define the learner.

# Create learner.
learner = learner.Learner(
    train_dataloader,
    valid_dataloader,
    linear_model,
    l1_norm,
    accuracy,
)

Then I train the model.

learner.train_model(20)

And get the following output.

Epoch: 0; Accuracy: 0.5009
Epoch: 1; Accuracy: 0.5001
Epoch: 2; Accuracy: 0.4996
Epoch: 3; Accuracy: 0.4995
Epoch: 4; Accuracy: 0.4993
Epoch: 5; Accuracy: 0.4993
Epoch: 6; Accuracy: 0.4992
Epoch: 7; Accuracy: 0.4991
Epoch: 8; Accuracy: 0.4991
Epoch: 9; Accuracy: 0.4991
Epoch: 10; Accuracy: 0.4991
Epoch: 11; Accuracy: 0.4991
Epoch: 12; Accuracy: 0.4991
Epoch: 13; Accuracy: 0.4991
Epoch: 14; Accuracy: 0.4991
Epoch: 15; Accuracy: 0.4991
Epoch: 16; Accuracy: 0.4991
Epoch: 17; Accuracy: 0.4991
Epoch: 18; Accuracy: 0.4991
Epoch: 19; Accuracy: 0.4991

It does seem that the model has more or less converged, since I also printed the average training loss for each epoch (the small addition to train_model that produces it is sketched after the output below):

Epoch: 0; Accuracy: 0.5009
Average Loss: 0.25076560201135356
Epoch: 1; Accuracy: 0.5001
Average Loss: 0.013846351337164865
Epoch: 2; Accuracy: 0.4996
Average Loss: 0.01272650749265689
Epoch: 3; Accuracy: 0.4995
Average Loss: 0.012236580249740767
Epoch: 4; Accuracy: 0.4993
Average Loss: 0.01163589142683141
Epoch: 5; Accuracy: 0.4993
Average Loss: 0.011318306764167705
Epoch: 6; Accuracy: 0.4992
Average Loss: 0.011210138059825637
Epoch: 7; Accuracy: 0.4991
Average Loss: 0.011073978753461469
Epoch: 8; Accuracy: 0.4991
Average Loss: 0.010936616144913997
Epoch: 9; Accuracy: 0.4991
Average Loss: 0.0108073602966693
Epoch: 10; Accuracy: 0.4991
Average Loss: 0.010670954357019863
Epoch: 11; Accuracy: 0.4991
Average Loss: 0.010529087265934219
Epoch: 12; Accuracy: 0.4991
Average Loss: 0.010392412279412084
Epoch: 13; Accuracy: 0.4991
Average Loss: 0.010272278062197092
Epoch: 14; Accuracy: 0.4991
Average Loss: 0.010172289920671209
Epoch: 15; Accuracy: 0.4991
Average Loss: 0.010087506484102126
Epoch: 16; Accuracy: 0.4991
Average Loss: 0.010012260564198908
Epoch: 17; Accuracy: 0.4991
Average Loss: 0.009943498881840082
Epoch: 18; Accuracy: 0.4991
Average Loss: 0.009878894718829393
Epoch: 19; Accuracy: 0.4991
Average Loss: 0.009815361252436823
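
For reference, the average loss comes from a small addition to train_model, roughly this (typed from memory, so not verbatim):

    def train_model(self, epochs):
        for epoch in range(epochs):
            losses = []
            for x_batch, y_batch in self._train_dataloader:
                predictions = self._make_predictions(x_batch)
                loss = self._calculate_loss(predictions, y_batch)
                self._calculate_gradients(loss)
                self._update_parameters()
                losses.append(loss.item())  # track each batch's loss
            self._calculate_accuracy()
            self._output_accuracy(epoch)
            print(f"Average Loss: {sum(losses) / len(losses)}")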

The logic of my accuracy function seems fine to me, but clearly something is going wrong somewhere.
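
If it helps, this is the kind of spot check I can run on a single validation batch to compare the metric against the raw predictions (just a sketch, I have not run it yet):

# Pull one batch from the validation dataloader and inspect it by hand.
x_batch, y_batch = next(iter(valid_dataloader))
with torch.no_grad():
    probs = linear_model(x_batch).sigmoid()
print(probs[:5], y_batch[:5])      # eyeball a few predictions vs. targets
print(accuracy(x_batch, y_batch))  # should match what the learner reports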

I would really appreciate any insights and help!