Hello everyone, I am working through lesson2-sgd.ipynb and I cannot make sense of the definition of nn.Parameter, even though I have spent time searching for it online. Can anyone explain it to me?

I really appreciate your help with my problem!

Hey @Cuong,

I actually had to dive into this myself a few months ago, as I was trying to pass a `torch.Tensor` to a function expecting a `torch.nn.Parameter`. In short, data structures in deep learning frameworks exist only to represent and manipulate the theoretical concepts behind them.

As for parameters, during optimization we use both their values and their gradients: you need a parameter's value for the forward pass and its gradient for backprop. In PyTorch, for a parameter `p`, these are accessible via `p.data` and `p.grad` respectively.

Here’s a short illustration:

```
import torch
# Let's define a simple data structure (1-dimensional float tensor)
# (note the floats: only floating-point tensors can require gradients)
t = torch.tensor([1., 2., 3.])
# This isn't enough for backprop, as it only stores the three values
# without any gradient tracking. Fortunately PyTorch has what we need
p = torch.nn.Parameter(t)
# Now notice the difference between
print(t)
# And
print(p)
# The values of the parameter are still accessible using:
print(p.data)
```

At this stage, without any forward or backward propagation, `p.grad` will be `None`, since no gradient has been computed yet.
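To make that concrete, here is a minimal sketch (my own toy example, not from the lesson) showing `p.grad` going from `None` to an actual tensor after one backward pass:

```
import torch

# p.grad stays None until a backward pass fills it in
p = torch.nn.Parameter(torch.tensor([1., 2., 3.]))
print(p.grad)          # None: no gradient has been computed yet

loss = (p ** 2).sum()  # a toy scalar "loss"
loss.backward()        # autograd now populates p.grad
print(p.grad)          # tensor([2., 4., 6.]) since d(p**2)/dp = 2*p
```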

Consider `torch.nn.Parameter` an augmented version of `torch.Tensor`, able to store both the value and the gradient of a parameter. For more details, I usually check the actual implementation since it’s open-source: https://pytorch.org/docs/stable/_modules/torch/nn/parameter.html

You can find the actual code on GitHub as well: https://github.com/pytorch/pytorch/blob/master/torch/nn/parameter.py, but be aware it might include changes that occurred after the latest release (and thus since the version of PyTorch you are currently using).
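A quick way to see the "augmented tensor" idea for yourself (a small check of my own, not from the lesson): `nn.Parameter` is literally a subclass of `torch.Tensor` that turns on gradient tracking by default.

```
import torch

# nn.Parameter is a Tensor subclass with requires_grad=True by default
t = torch.tensor([1., 2., 3.])
p = torch.nn.Parameter(t)
print(isinstance(p, torch.Tensor))  # True: every Parameter is a Tensor
print(p.requires_grad)              # True: gradients are tracked by default
print(t.requires_grad)              # False: the plain tensor is untracked
```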

Let me know if that helped, cheers!

Thank you very much!

@fgfm I ran into the same problem you went through. Specifically, it was when I was trying to define my own linear function (myLinear) from Lesson 5 of Part 1. Initially, I was trying to initialise the weight and bias tensors as plain torch.tensors. But that gave me an error, something like

`RuntimeError: Expected object of device type cuda but got device type cpu for argument #2 'mat2' in call to _th_mm`

Later on, it worked fine as soon as I initialized the parameters as nn.Parameters.

So from your understanding, is it that the GPU just doesn’t accept anything other than nn.Parameters?

What is nn.Parameters in essence?

Thanks

Hi @PalaashAgrawal,

My apologies, I haven’t been around for a while!

I would have to check your code more thoroughly, but from the error you got:

- the call to `_th_mm` points out that you’re trying to do a matrix multiplication
- the function was expecting a CUDA tensor, but was given a CPU tensor

My best guess is that, **in your function**, there is an in-place operation on a CUDA tensor where you’re trying to multiply it with a tensor that has not been moved to CUDA. I hope this helps!
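Here is a small sketch of that kind of mismatch (an assumed setup of my own, not your actual code): matmul needs both operands on the same device. On a CUDA machine, `a` ends up on the GPU while `b` stays on the CPU, reproducing the sort of error above; on a CPU-only machine both stay on the CPU and the multiplication succeeds.

```
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
a = torch.randn(2, 3).to(device)  # moved to the GPU when one is available
b = torch.randn(3, 2)             # left on the CPU by mistake
try:
    out = a @ b  # raises RuntimeError when a is on CUDA and b on the CPU
    print("same device, matmul ok:", out.shape)
except RuntimeError as err:
    print("device mismatch:", err)
```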

Regarding `nn.Parameter`, you could see it as a pair of tensors:

- its value is the first tensor (the `data` attribute)
- its gradient is the second tensor (the `grad` attribute)

It is simply a data structure used to keep the gradient of a tensor close at hand, which is very useful for autograd and optimization.
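As for why switching to nn.Parameter fixed your CUDA error: assigning an `nn.Parameter` as a module attribute registers it with the module, so calls like `model.to('cuda')` move it along with the model; a plain tensor attribute is not registered and stays behind on the CPU. A minimal sketch (the class and attribute names here are my own, not the lesson's):

```
import torch
from torch import nn

# A hypothetical myLinear: nn.Parameter attributes get registered with the
# module, so .to(device) moves them; a plain tensor attribute does not.
class MyLinear(nn.Module):
    def __init__(self, n_in, n_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_in, n_out))  # registered
        self.bias = nn.Parameter(torch.zeros(n_out))          # registered
        self.plain = torch.randn(n_in, n_out)                 # NOT registered

    def forward(self, x):
        return x @ self.weight + self.bias

model = MyLinear(3, 2)
print([name for name, _ in model.named_parameters()])  # ['weight', 'bias']
# model.to('cuda') would move weight and bias but leave `plain` on the CPU:
# exactly the kind of device mismatch behind that RuntimeError.
```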