Duplicating Jeremy's Excel Solver in Python

Hi everyone,

I’m trying to duplicate the solver Jeremy uses in Part 1, Lesson 4 (1:08), but in Python instead of Excel. You can find details of the original lecture in @PoonamV’s lecture notes, about two-thirds of the way down the page:

I’d like to understand how this works at a more basic level, so I’m trying to translate it to Python.
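As I understand it, the spreadsheet is doing gradient descent on a low-rank factorisation: given a data matrix $Y$, find latent matrices $U$ and $V$ that minimise

$$L(U, V) = \operatorname{mean}\big((UV - Y)^2\big),$$

which is what the code below tries to reproduce.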

Code is largely stolen from the fastai lesson2-sgd notebook (course-v3/nbs/dl1/lesson2-sgd.ipynb in the fastai/course-v3 repo on GitHub).

from fastai.basics import tensor, nn
import numpy
import torch

def hypothesis(vertLatents, horiLatents):
    # Predicted block: the product of the two latent-factor matrices
    return numpy.dot(vertLatents, horiLatents)

def mse(y_hat, y):
    return ((y_hat - y) ** 2).mean()

def update():
    y_hat = tensor(hypothesis(vertLatents, horiLatents))
    y_hat.requires_grad = True
    loss = mse(y_hat, blockData)
    if t % 10 == 0: print(t, '-------------', loss)
    loss.backward()
    with torch.no_grad():
        a.sub_(lr * a.grad())
        a.grad.zero_()

vecLatents = 10   # number of latent factors
shape = (20, 14)  # shape of the data block being factorised

# Random target data and random initial latent matrices
blockData = tensor(numpy.random.random_sample(shape))
horiLatents = numpy.random.random_sample((vecLatents, shape[1]))
vertLatents = numpy.random.random_sample((shape[0], vecLatents))

a = nn.Parameter(tensor(hypothesis(vertLatents, horiLatents)))
print(a)
lr = 1e-1

for t in range(100):
    update()

This code fails, I think because I’m not calculating the gradient of a. I can see this when I print a: it doesn’t show a grad, even though a.requires_grad evaluates to True. The error is here:

Traceback (most recent call last):
  File "./TestOptimize.py", line 40, in <module>
    update()
  File "./TestOptimize.py", line 24, in update
    a.sub_(lr * a.grad())
TypeError: 'NoneType' object is not callable
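For what it’s worth, the missing grad is consistent with a tiny standalone check (b here is just an illustrative parameter, not from my script):

import torch

b = torch.nn.Parameter(torch.randn(3))
print(b.grad)    # None: nothing has back-propagated into b yet

loss = (b ** 2).mean()
loss.backward()
print(b.grad)    # now populated with dloss/db

So a.grad stays None unless backward() actually flows through a.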

Any ideas of what I’m doing wrong?

NB: I am on lesson 5, halfway through, so if this is spelled out in a subsequent lesson I’d be happy to be pointed that way.

TIA,

Rajan

OK, I was being a dummy: I typed a.grad() instead of a.grad (grad is an attribute, not a method).

Also, after some reading on PyTorch’s autograd, it occurs to me that loss.backward() can’t see that the tensor I defined as a was involved in the calculation of loss, so I’m going to rewrite things to make vertLatents and horiLatents visible to loss.backward().
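Concretely, here’s the kind of rewrite I have in mind — a minimal sketch in plain PyTorch (my own variable names, and I haven’t verified it matches Jeremy’s spreadsheet exactly). Both latent matrices are leaf tensors with requires_grad=True, and y_hat is built with torch ops instead of numpy.dot, so the graph connects loss back to them:

import torch

torch.manual_seed(0)

vecLatents = 10
shape = (20, 14)

blockData = torch.rand(shape)

# Leaf tensors: backward() will deposit gradients here
vertLatents = torch.rand(shape[0], vecLatents, requires_grad=True)
horiLatents = torch.rand(vecLatents, shape[1], requires_grad=True)

lr = 1e-1

for t in range(100):
    y_hat = vertLatents @ horiLatents        # torch op, stays on the autograd graph
    loss = ((y_hat - blockData) ** 2).mean()
    if t % 10 == 0: print(t, loss.item())
    loss.backward()
    with torch.no_grad():
        vertLatents.sub_(lr * vertLatents.grad)
        horiLatents.sub_(lr * horiLatents.grad)
        vertLatents.grad.zero_()
        horiLatents.grad.zero_()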

I’ll update shortly. If anyone has experience with this area of PyTorch, i.e. how .backward() feeds gradients into tensors, I’m happy to hear suggestions.

Solution is here: