Lesson 5
I shared the image below in our previous meetup. This is an updated version, along with annotated code and notes.
Building a minimal Neural Network (Logistic Regression with no hidden layer) from scratch
Let's walk through it step by step and see how each block from the image below maps to code.
Source: Natural Language Processing with PyTorch by Delip Rao et al.
- Predictions: `y_hat = model(x)`, where `model` is our own model (a sketch of what `model` and `loss_func` might look like follows this list).
- Loss function: `loss_func(y_hat, y)`. In addition to the regular loss we also add the weight-decay term `w2*wd`.
- Gradients: `parameter.sub_(learning_rate * gradient)` performs an in-place subtraction on a parameter of the product of the learning rate and its gradient. Since our model has multiple parameters (weights and biases), we loop through them using `model.parameters()`.
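The notes use `model` and `loss_func` without showing how they are defined. Below is a minimal sketch of what they could look like for a logistic-regression setup with no hidden layer; the 784-input / 10-class sizes and the fake minibatch are assumptions for illustration, not taken from the original notebook.

    import torch
    from torch import nn
    import torch.nn.functional as F

    # a single linear layer plus cross-entropy gives logistic regression
    # with no hidden layer; 784 inputs / 10 classes are assumed (MNIST-style)
    model = nn.Linear(784, 10)
    loss_func = F.cross_entropy

    x = torch.randn(64, 784)         # a fake minibatch of 64 flattened inputs
    y = torch.randint(0, 10, (64,))  # fake integer class labels
    y_hat = model(x)                 # predictions: raw scores of shape (64, 10)
    print(loss_func(y_hat, y))       # scalar loss tensor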
Extras:

- Weight Decay:
  - a) `w2`: using each parameter, we calculate the sum of squared weights: `for p in model.parameters(): w2 += (p**2).sum()`
  - b) `wd`: a constant (1e-5)
  - multiply `w2` and `wd`, and add the result to the regular `loss_func` loss (a note on PyTorch's built-in `weight_decay` option follows this list)
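Side note, not from the original notes: PyTorch's built-in optimizers expose the same idea through a `weight_decay` argument, which adds `weight_decay * p` to each parameter's gradient during the step; this is the gradient of an L2 penalty like `w2*wd` above, up to a constant factor. A small sketch, reusing the assumed model from above:

    from torch import nn, optim

    model = nn.Linear(784, 10)  # assumed logistic-regression model, as above

    # weight decay is applied inside the optimizer step, so there is no need
    # to add the w2*wd penalty to the loss by hand
    optimizer = optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-5)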
Combined

- We calculate the loss for each minibatch by calling `update(x, y, lr)` on it: `losses = [update(x,y,lr) for x,y in data.train_dl]` (a runnable sketch of this loop follows the function below).
- `.item()` turns the loss tensor into a plain Python number so that we can plot and inspect the losses visually.

Putting it all together, the full `update` function:
    def update(x, y, learning_rate):
        wd = 1e-5
        # prediction
        y_hat = model(x)
        w2 = 0.
        # sum of squared weights
        for p in model.parameters():
            w2 = w2 + (p**2).sum()
        # regular loss plus the weight-decay penalty
        loss = loss_func(y_hat, y) + w2*wd
        # compute gradients of the loss with respect to the model parameters
        loss.backward()
        # tell PyTorch not to record these operations for the next gradient calculation
        with torch.no_grad():
            for p in model.parameters():
                # gradient descent step on each parameter
                p.sub_(learning_rate * p.grad)
                p.grad.zero_()
        return loss.item()
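To show how the pieces fit together, here is a hedged sketch of running `update` over one pass of a dataloader and plotting the losses. The `data.train_dl` used above (presumably from the original notebook's data object) is replaced here by a plain PyTorch `DataLoader` over random tensors, purely for illustration:

    import torch
    from torch import nn
    import torch.nn.functional as F
    from torch.utils.data import DataLoader, TensorDataset
    import matplotlib.pyplot as plt

    model = nn.Linear(784, 10)   # assumed model, as above
    loss_func = F.cross_entropy

    # a stand-in for data.train_dl: random tensors batched by a plain DataLoader
    train_ds = TensorDataset(torch.randn(1024, 784), torch.randint(0, 10, (1024,)))
    train_dl = DataLoader(train_ds, batch_size=64, shuffle=True)

    lr = 0.1
    # one pass over the data; update() returns loss.item(), a plain Python float
    losses = [update(x, y, lr) for x, y in train_dl]

    plt.plot(losses)
    plt.xlabel("minibatch")
    plt.ylabel("loss")
    plt.show()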
Resources
- Many thanks to @cedric for his notes, Building a neural network from scratch.
- The PyTorch tutorials also have an in-depth article on this, What is torch.nn really? (a somewhat hidden gem), from Jeremy.
Feedback is welcome.