Lesson 4 weight initialisation

shay1309 · October 28, 2020, 9:04am

Why do we do this instead of

weights = init_params((28*28))?
Both give us the same thing, but I heard Jeremy say something about wanting columns? Please bear with me, I am rather confused here. Both give us torch.Size([784]) when we check .shape on them.

vivekverma1019 · October 29, 2020, 4:43am

The std argument is 1.0 by default, so it will give the same result if you pass std=1.

shay1309 · October 29, 2020, 6:34am

Hi Vivek, so it doesn't matter if I don't include the 1, right? Also, why do we multiply by 1? In torch.randn(size)*std, where std = 1, Jeremy said std is the variance, but if we are multiplying by 1, what's the point?

Thanks in advance Mr Vivek

nn.Charles · October 29, 2020, 11:35am

Hi Shay, multiplying by 1.0 does nothing on its own, but since std is a parameter, you can also call the function with, say, std=2.0, and then it becomes useful. It just makes the function more general.
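
A minimal sketch of that point, assuming init_params is just the torch.randn(size)*std expression quoted above (the lesson's own version may do more, for example enable gradients). Note that std here is the standard deviation, so the variance is its square. Re-seeding before each call is only there to make the three calls draw the same random numbers so they can be compared.

```python
import torch

def init_params(size, std=1.0):
    # Draw from a standard normal (mean 0, std 1), then scale by std.
    return torch.randn(size) * std

torch.manual_seed(0)
w_default = init_params(28 * 28)         # std falls back to 1.0

torch.manual_seed(0)
w_explicit = init_params(28 * 28, 1.0)   # same call with std written out

torch.manual_seed(0)
w_wide = init_params(28 * 28, std=2.0)   # same draws, scaled by 2

print(torch.equal(w_default, w_explicit))    # identical values
print(torch.equal(w_wide, 2.0 * w_default))  # only the spread changed
print(w_default.std().item(), w_wide.std().item())
```

The last line should print sample standard deviations of roughly 1 and roughly 2, which is all the std argument controls.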

shay1309 · October 29, 2020, 11:58am

Hi, what is the value of std in this case? For background, I am doing lecture 4, classifying the MNIST 3s and 7s; on the fastai website we initialise the weights before the matrix multiplication.

nn.Charles · October 29, 2020, 11:59am

In that case, we use 1 to initialise the weights. It is only parametrised in case you want to reuse the function, which is good software engineering practice, but you can delete the std parameter without any problem.
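
On the earlier question about columns, here is a rough sketch. The original snippet is not shown in this thread, so the tuple call below is an assumption: init_params(28*28, 1) passes 1 as std and returns a 1-D tensor, while init_params((28*28, 1)) passes a size tuple and returns a single column. The batch xb is a hypothetical stand-in for flattened images.

```python
import torch

def init_params(size, std=1.0):
    # Same assumed definition as in the thread: randn scaled by std.
    return torch.randn(size) * std

flat = init_params(28 * 28)          # int size: 1-D tensor
column = init_params((28 * 28, 1))   # tuple size: 2-D tensor, one column

xb = torch.randn(5, 28 * 28)         # hypothetical batch of 5 flattened images

print(flat.shape)            # torch.Size([784])
print(column.shape)          # torch.Size([784, 1])
print((xb @ flat).shape)     # torch.Size([5])
print((xb @ column).shape)   # torch.Size([5, 1])
```

Both versions multiply fine against the batch; the column version simply keeps the result two-dimensional, with one prediction per row.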

shay1309 · October 29, 2020, 12:02pm

Noted, Mr Charles, thank you very much.