Hello!
Here’s the rundown. I have included what I deem necessary for understanding the problem.
I have a linear model using PyTorch’s nn.Linear class:
from torch import nn

linear_model = nn.Linear(28*28, 1)
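For context, the model just maps a batch of flattened 28x28 inputs to one output per example; a quick sanity check with a made-up batch (the real batches come from train_dataloader, which I haven’t shown) looks like this:
import torch

x_dummy = torch.randn(64, 28*28)    # hypothetical batch of 64 flattened images
print(linear_model(x_dummy).shape)  # torch.Size([64, 1])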
I have a custom optimizer class:
class Optimizer:
    """A simple attempt to model an optimizer."""

    def __init__(self, parameters, learning_rate):
        """Initialize optimizer attributes."""
        self.parameters = list(parameters)
        self.learning_rate = learning_rate

    def step(self):
        """Update the parameters."""
        for parameter in self.parameters:
            parameter.data -= parameter.data.grad * self.learning_rate

    def zero_gradient(self):
        """Reset the gradient of each parameter."""
        for parameter in self.parameters:
            parameter.grad = None
I have instantiated this class as follows:
optimizer = Optimizer(linear_model.parameters(), 1)
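For what it’s worth, the optimizer does end up holding the model’s two parameter tensors (the weight and the bias):
print([p.shape for p in optimizer.parameters])
# [torch.Size([1, 784]), torch.Size([1])]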
The gradients of this model are calculated by the following function:
def calculate_gradients(x_batch, y_batch, model):
    predictions = model(x_batch)
    loss = l1_norm(predictions, y_batch)
    loss.backward()
l1_norm is defined as follows:
import torch

def l1_norm(predictions, targets):
    predictions = predictions.sigmoid()
    return torch.where(targets == 1, 1 - predictions, predictions).mean()
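To illustrate what this loss computes, here is a toy example with made-up values (the real predictions and targets come from the batches):
predictions = torch.tensor([2.0, -1.0])  # made-up raw model outputs
targets = torch.tensor([1, 0])           # made-up labels
# sigmoid(2.0) ≈ 0.881 and sigmoid(-1.0) ≈ 0.269, so the per-example losses
# are 1 - 0.881 ≈ 0.119 (target 1) and 0.269 (target 0)
print(l1_norm(predictions, targets))     # roughly tensor(0.1941)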
When I run the model, a TypeError is thrown on the very first epoch:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 train_model(linear_model, 20)

Input In [16], in train_model(model, epochs)
      1 def train_model(model, epochs):
      2     for epoch in range(epochs):
----> 3         train_epoch(model)
      4         print(valid_epoch(model), end=', ')

Input In [15], in train_epoch(model)
      2     for x_batch, y_batch in train_dataloader:
      3         calculate_gradients(x_batch, y_batch, model)
----> 4         optimizer.step()
      5         optimizer.zero_gradient()

Input In [9], in Optimizer.step(self)
     10         """Update the parameters."""
     11         for parameter in self.parameters:
---> 12             parameter.data -= parameter.data.grad * self.learning_rate

TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
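For completeness, here is the training loop the traceback refers to (reconstructed from the frames above; valid_epoch and train_dataloader are defined elsewhere in the notebook):
def train_model(model, epochs):
    for epoch in range(epochs):
        train_epoch(model)
        print(valid_epoch(model), end=', ')

def train_epoch(model):
    for x_batch, y_batch in train_dataloader:
        calculate_gradients(x_batch, y_batch, model)
        optimizer.step()
        optimizer.zero_gradient()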
What I’ve gathered is that, despite loss.backward() being run in calculate_gradients, parameter.data.grad remains None and hence cannot be multiplied with self.learning_rate.
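The failure shows up on the very first batch, so it can be reproduced without the full training loop (a minimal check, pulling one batch from the same train_dataloader):
x_batch, y_batch = next(iter(train_dataloader))
calculate_gradients(x_batch, y_batch, linear_model)
for parameter in linear_model.parameters():
    print(parameter.data.grad)  # None for both the weight and the bias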
Does anyone have an idea as to why loss.backward() is leaving parameter.data.grad as None? nn.Linear does make sure requires_grad is set to True for its parameters, so I’m not quite sure what’s going wrong.
I would really appreciate any help! Please let me know if anything is still unclear or if I should supply more information.