Help with simple (Celsius to F) linear regression using PyTorch (gradient descent + autograd)

Hi all,

I am watching the Part 1 2018 videos and trying to get the hang of PyTorch.

I’m trying to model simple linear regression with the goal of converting Celsius to Fahrenheit.

However, my bias term seems to be converging extremely slowly, and if I increase the learning rate the gradients blow up. I’m not using a PyTorch optimizer, but I am using PyTorch’s autograd.

As per the Celsius-to-Fahrenheit formula, the bias should be 32. But over my 5000 epochs it only moves from the random start value, -0.5949, to -0.2762.
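My setup looks roughly like this (a simplified sketch, not the notebook verbatim; `c_to_f`, the seed, and the exact values are illustrative):

```python
import torch

torch.manual_seed(0)

def c_to_f(c):
    return c * 1.8 + 32  # ground-truth Celsius -> Fahrenheit

# Unnormalized inputs, roughly in [-500, 500]
x = torch.randint(-500, 500, (2000,)).float()
y = c_to_f(x)

a = torch.randn(1, requires_grad=True)  # weight, should learn 1.8
b = torch.randn(1, requires_grad=True)  # bias, should learn 32

lr = 1e-7  # anything much larger makes the gradients blow up
for epoch in range(5000):
    loss = ((a * x + b - y) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        a -= lr * a.grad
        b -= lr * b.grad
        a.grad.zero_()
        b.grad.zero_()
```

With inputs this large, the gradient with respect to `a` is scaled by `x` (hundreds), while the gradient with respect to `b` is not, so the tiny learning rate the weight requires leaves the bias crawling.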

Here’s the notebook:

I’m pretty sure I got some simple PyTorch concept wrong. Does anyone have any idea why the bias weight is updating so awfully slowly, and how to fix it? I would really appreciate any help! :slight_smile:

Thanks in advance!

There is nothing wrong with PyTorch. I think the problem comes from your data and your choice of learning rate.

I guess you should normalize your data. Your Xs are in the range -500 to 500, which can make your MSE loss super high!

I made some modifications to your code:

  • Redrew the Xs on a unit scale (standard normal instead of -500 to 500)

  • Changed the learning rate to 1e-3 to converge faster

Then I came up with the final parameters 1.7976 and 31.99, which I think are quite OK.

import numpy as np

x = np.array([[c] for c in np.random.randn(2000)], dtype=np.float32)  # unit-scale inputs
y = np.array([c_to_f(c) + np.random.randn() * 0.3 for c in x], dtype=np.float32)  # c_to_f as defined in the notebook
lr = 1e-3

This is just to show you how to learn the parameters. I haven’t tried normalizing your X and Y; you can work further on this and update your result in this post :smiley: . I am quite interested in how to do this.

You can find my code here.

Hope that helps,

[Update] You can normalize like this:

But I think the parameters obtained should then be transformed back to the ones before normalization.
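For the min-max case, this reversal can be done in closed form. If x' = (x - x_min) / (x_max - x_min) and the fit on normalized data gives y ≈ a'·x' + b', then substituting back gives a = a' / (x_max - x_min) and b = b' - a'·x_min / (x_max - x_min). A small sketch (function and variable names are mine):

```python
def denormalize_params(a_n, b_n, x_min, x_max):
    """Convert a slope/intercept fitted on min-max-normalized x
    back to the original x scale."""
    scale = x_max - x_min
    a = a_n / scale
    b = b_n - a_n * x_min / scale
    return a, b

# With x originally in [-500, 500], c_to_f becomes y = 1800*x' - 868
# in normalized coordinates, and reversing recovers the true parameters:
a, b = denormalize_params(1800.0, -868.0, -500.0, 500.0)  # → (1.8, 32.0)
```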


Yep. This seems to work beautifully!

a, b = (tensor([1.8001]), tensor([31.9986]))

Whereas the correct values are a = 1.8 and b = 32.

Now it converges within 50 epochs using just 1e-1 as the learning rate. This is more like what I was expecting. Great! I was feeling lost for a while. :slight_smile:

Thanks a LOT for the help! It would have taken me a while to spot it myself at this time of day. Or maybe I’d have woken up tomorrow morning realizing what the problem was… :slight_smile:

Something like this?

def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

x = np.array(np.random.randint(-500, high=500, size=2000), dtype=np.float32)
x = normalize(x)
y = np.array([c_to_f(v) for v in x], dtype=np.float32)

Seems to converge pretty nicely also.
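As a sanity check, training on that normalized x with plain autograd recovers the parameters directly, since y here is generated from the already-normalized inputs (a sketch; the seed and epoch count are arbitrary choices of mine):

```python
import numpy as np
import torch

np.random.seed(0)
torch.manual_seed(0)

def c_to_f(c):
    return c * 1.8 + 32

def normalize(x):
    return (x - x.min()) / (x.max() - x.min())

x = normalize(np.random.randint(-500, 500, size=2000).astype(np.float32))
y = c_to_f(x)  # targets computed from the normalized inputs

x_t, y_t = torch.from_numpy(x), torch.from_numpy(y)
a = torch.randn(1, requires_grad=True)
b = torch.randn(1, requires_grad=True)

lr = 1e-1  # stable now that inputs are in [0, 1]
for _ in range(2000):
    loss = ((a * x_t + b - y_t) ** 2).mean()
    loss.backward()
    with torch.no_grad():
        a -= lr * a.grad
        b -= lr * b.grad
        a.grad.zero_()
        b.grad.zero_()
```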

Yes, that’s it. Do you get the same parameter values after training, a = 1.8 and b = 32?

Yes! Of course, the rate of convergence depends on the number of epochs and the learning rate.