# Lecture 5 - Question about init_coeffs (weights): why divide the weights by the layer size?

Is it necessary to divide by sizes[i+1] in the line below

```
layers = [(torch.rand(sizes[i], sizes[i+1])-0.3)/sizes[i+1]*4 for i in range(n-1)]
```

If I'm not wrong, the numbers would still be centered around 0 even if you don't divide, just with a larger magnitude. Also, how did we arrive at the multiplication by 4 (the `/sizes[i+1]*4` part)?

```
def init_coeffs():
    hiddens = [10, 10]  # <-- set this to the size of each hidden layer you want
    sizes = [n_coeff] + hiddens + [1]
    n = len(sizes)
    layers = [(torch.rand(sizes[i], sizes[i+1])-0.3)/sizes[i+1]*4 for i in range(n-1)]
    consts = [(torch.rand(1)[0]-0.5)*0.1 for i in range(n-1)]
    for l in layers+consts: l.requires_grad_()
    return layers,consts
```
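One thing you can check empirically is what the division does to the size of a layer's outputs. A standalone sketch (the layer widths here are arbitrary, just for illustration); note also that by operator precedence `/sizes[i+1]*4` means `(w / sizes[i+1]) * 4`, not `w / (sizes[i+1] * 4)`:

```python
import torch

torch.manual_seed(0)

n_in, n_out = 200, 200       # hypothetical layer widths, for illustration only
x = torch.rand(1, n_in)      # fake activations in [0, 1)

w_raw = torch.rand(n_in, n_out) - 0.3                    # no scaling
w_scaled = (torch.rand(n_in, n_out) - 0.3) / n_out * 4   # as in init_coeffs

# Each output is a sum over n_in terms, so without scaling its typical
# magnitude grows with the layer width; the division keeps it modest.
print((x @ w_raw).abs().mean())     # large (grows with width)
print((x @ w_scaled).abs().mean())  # small, roughly width-independent
```

So the division keeps the magnitude of activations from blowing up as they pass through wider layers; why the constant is exactly 4 (rather than something principled like Xavier/Glorot init) isn't explained in the notebook, and my guess is it was found by trial and error.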

I guess it's for "normalization" purposes (the same idea as subtracting `0.3` from the random tensor), but I'm not entirely sure and would also be interested in the answer.


interested as well