Lesson 5: Neural Net from Scratch

We begin with val_indep and multiply each row of the feature data by coeffs.

In the next step (before Jeremy began with deep learning), instead of coeffs being a vector, coeffs was converted to a matrix; in the deep learning version, layer 1 is a matrix,

and the output is then passed through a ReLU: F.relu(indeps@l1)
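A minimal sketch of those two steps in PyTorch (the shapes here are made up; val_indep, coeffs, and l1 stand in for the lesson's actual variables):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes standing in for the lesson's data:
# val_indep holds n_rows examples with n_features independent variables each.
n_rows, n_features, n_hidden = 4, 3, 2
val_indep = torch.rand(n_rows, n_features)

# Step 1: coeffs as a vector -- one coefficient per feature,
# multiplied element-wise with each row and summed.
coeffs = torch.rand(n_features)
preds_linear = (val_indep * coeffs).sum(dim=1)   # shape: (n_rows,)

# Step 2: coeffs as a matrix (layer 1) -- one column per hidden activation,
# with a ReLU applied to the result.
l1 = torch.rand(n_features, n_hidden)
hidden = F.relu(val_indep @ l1)                  # shape: (n_rows, n_hidden)

print(preds_linear.shape, hidden.shape)
```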

What I need help with:

  1. I am not really sure what this matrix means when I try to visualise it in a neural network like the one below.


  2. If I change n_hidden to, say, 5, how does that change the neural network physically?
  • If it means that we will have 5 layers before the output layer, why do we pass through ReLU only at the end? Shouldn't we pass through a ReLU after each layer?
  • Would thinking in terms of a physical representation become a limitation when dealing with more advanced material?

Figured it out:

This matrix is the matrix of weights from the inputs to the hidden neurons.
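So n_hidden controls the *width* of the hidden layer (the number of neurons, i.e. the number of columns in the weight matrix), not the number of layers. A small sketch with made-up sizes:

```python
import torch

torch.manual_seed(0)
n_inputs = 3

# Each column of l1 holds the weights feeding ONE hidden neuron,
# so changing n_hidden adds neurons (columns), not layers.
for n_hidden in (2, 5):
    l1 = torch.rand(n_inputs, n_hidden)   # weights: inputs -> hidden neurons
    l2 = torch.rand(n_hidden, 1)          # weights: hidden neurons -> output
    print(n_hidden, l1.shape, l2.shape)
```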

image source: https://www.researchgate.net/figure/A-simple-neural-network-and-the-mapping-of-the-first-hidden-layer-onto-a-43-Weight_fig2_292077006


The hidden layer is a matrix followed by a ReLU.
So the network is M1 R1 M2 R2. If you only had M1 M2 R1, it would be equivalent to M3 R1 (where M3 = M1 x M2), so you need a ReLU after each matrix to stop the matrices from collapsing into a single multiplication. The ReLU breaks the chain of matrix multiplications. Also remember there are different activation options, from the standard ReLU (replace negatives with 0) and leaky ReLU (which retains some of the negative values) to hyperbolic functions such as sinh and tanh.
Regards Conwyn
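The collapse described above can be checked numerically. A small sketch with hand-picked matrices (chosen so that x @ M1 contains negative values, making the ReLU actually do something):

```python
import torch
import torch.nn.functional as F

x = torch.tensor([[1.0, -2.0], [3.0, 0.5]])
M1 = torch.tensor([[1.0, -1.0], [2.0, 1.0]])
M2 = torch.tensor([[0.5], [1.0]])

# Without a nonlinearity, two layers collapse into one:
# (x @ M1) @ M2 is the same as x @ M3 with M3 = M1 @ M2.
M3 = M1 @ M2
print(torch.allclose((x @ M1) @ M2, x @ M3))

# With a ReLU between the two matrices, the composition can no
# longer be written as a single matrix multiplication.
with_relu = F.relu(x @ M1) @ M2
print(torch.allclose(with_relu, x @ M3))
```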