Lesson 5
I shared the image below in our previous meetup. This is an updated version, along with annotated code and notes.
Building a minimal Neural Network (Logistic Regression with no hidden layer) from scratch
Let’s walk through it step by step and see how we code each block from the image below.
Source: Natural Language Processing with PyTorch by Delip Rao et al.
- Predictions: y_hat = model(x). Here we are using our own model.
- Loss function: loss_func(y_hat, y). On top of the regular loss, we also add the weight-decay term w2*wd.
- Gradients: parameter.sub_(learning_rate * gradient) performs an in-place subtraction of learning_rate * gradient from each parameter. Since our model has multiple parameters (weights, biases), we loop through them using model.parameters().
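Before the full walkthrough, here is a minimal sketch of the "own model" piece: a single linear layer (no hidden layer) producing predictions and a loss. The 784 inputs and 10 classes are my assumption (MNIST-style); swap in your own dimensions.

```python
import torch
from torch import nn

# logistic regression = one linear layer, no hidden layer
model = nn.Linear(784, 10)
loss_func = nn.functional.cross_entropy

x = torch.randn(64, 784)          # a fake minibatch of 64 examples
y = torch.randint(0, 10, (64,))   # fake integer class labels

y_hat = model(x)                  # predictions, shape (64, 10)
loss = loss_func(y_hat, y)        # scalar loss tensor
```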
Extras:
- Weight Decay:
  - a) w2: using each parameter, we calculate the sum of squared weights: for p in model.parameters(): w2 += (p**2).sum()
  - b) wd: a constant (1e-5)
  - c) multiply w2 by wd and add the product to the regular loss_func
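The weight-decay pieces above can be sketched on their own (the nn.Linear dimensions are assumed, as before):

```python
import torch
from torch import nn

model = nn.Linear(784, 10)  # assumed dimensions
wd = 1e-5                   # b) the weight-decay constant

# a) w2: accumulate the sum of squared values over every parameter tensor
w2 = 0.
for p in model.parameters():
    w2 += (p ** 2).sum()

# c) the penalty that gets added to the regular loss
penalty = w2 * wd
```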
Combined:
- We calculate the loss for each minibatch by calling update(x, y, lr) on it: losses = [update(x,y,lr) for x,y in data.train_dl]
- .item() converts each loss tensor into a plain Python number so we can plot the losses and inspect them visually.
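A tiny sketch of what .item() does:

```python
import torch

loss = torch.tensor(0.6931)  # a scalar loss tensor
value = loss.item()          # a plain Python float
# a list of such floats can go straight into a plotting
# library, e.g. matplotlib's plt.plot(losses)
```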
def update(x, y, learning_rate):
  wd = 1e-5
  #prediction
  y_hat = model(x)
  w2 = 0.
  #sum of squared weights
  for p in model.parameters():
    w2 = w2 + (p**2).sum()
  # regular loss
  loss = loss_func(y_hat, y) + w2*wd
  # compute gradients for every parameter and store them in p.grad
  loss.backward()
  # instruct pytorch not to record these actions for the next gradient calculation
  with torch.no_grad():
    for p in model.parameters():
      # in-place SGD step: p -= learning_rate * p.grad
      p.sub_(learning_rate * p.grad)
      # reset gradients so they don't accumulate into the next minibatch
      p.grad.zero_()
  return loss.item()
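Putting it all together, here is a runnable sketch of the whole loop. The model, its dimensions, and the synthetic stand-in for data.train_dl are all my assumptions, not from the lesson; the update function is the one above.

```python
import torch
from torch import nn

torch.manual_seed(0)

# stand-ins for model, loss_func, and a DataLoader:
# 2-class logistic regression on 20 synthetic features
model = nn.Linear(20, 2)
loss_func = nn.functional.cross_entropy

x_all = torch.randn(256, 20)
y_all = (x_all[:, 0] > 0).long()            # labels from a simple rule
train_dl = [(x_all[i:i+32], y_all[i:i+32])  # minibatches of 32
            for i in range(0, 256, 32)]

def update(x, y, learning_rate):
  wd = 1e-5
  y_hat = model(x)
  w2 = 0.
  for p in model.parameters():
    w2 = w2 + (p**2).sum()
  loss = loss_func(y_hat, y) + w2*wd
  loss.backward()
  with torch.no_grad():
    for p in model.parameters():
      p.sub_(learning_rate * p.grad)
      p.grad.zero_()
  return loss.item()

losses = []
for epoch in range(20):
  losses += [update(x, y, 0.1) for x, y in train_dl]

# on this easy synthetic problem the loss should trend downward
print(losses[0], losses[-1])
```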
Resources
- Many thanks to @cedric for his notes, Building a neural network from scratch.
- The PyTorch tutorials also have Jeremy's in-depth article "What is torch.nn really?" (a somewhat hidden gem).
Feedback is welcome.
