Lesson 5: Neural Net from Scratch

We begin with val_indep and multiply each row of feature data by coeffs.
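A minimal sketch of that step, assuming val_indep is a tensor with one row per example and coeffs is a vector with one coefficient per feature (the sizes here are hypothetical stand-ins for the notebook's variables):

```python
import torch

# Hypothetical sizes for illustration: 10 validation rows, 12 features
val_indep = torch.rand(10, 12)   # one row per example, one column per feature
coeffs = torch.rand(12) - 0.5    # one coefficient per feature

# Multiply each row elementwise by coeffs, then sum over the feature
# axis to get one prediction per row
preds = (val_indep * coeffs).sum(axis=1)
print(preds.shape)               # torch.Size([10])
```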

In the next step (just before Jeremy moved on to deep learning), instead of coeffs being a vector, it was converted into a matrix, and in the deep learning version we have layer 1 as a matrix

[image]

and then pass the output through a ReLU, like F.relu(indeps@l1)
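Putting those two steps together, here is a rough sketch of the forward pass as I understand it, with made-up sizes and a layer-2 matrix l2 added to map the hidden activations down to one output (the exact initialisation in the notebook may differ):

```python
import torch
import torch.nn.functional as F

n_coeff, n_hidden = 12, 20            # hypothetical sizes
indeps = torch.rand(10, n_coeff)      # a batch of 10 rows of feature data

# Layer 1 is now a matrix: it maps n_coeff inputs to n_hidden activations
l1 = torch.rand(n_coeff, n_hidden) - 0.5
# Layer 2 maps the n_hidden activations down to a single output
l2 = torch.rand(n_hidden, 1) - 0.3

# Matrix multiply, pass through the ReLU, then apply the second layer
res = F.relu(indeps @ l1) @ l2
print(res.shape)                      # torch.Size([10, 1])
```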

What I need help with:

  1. I am not really sure what the matrix means when I try to visualise it in a neural network like the one below.

[image]

  2. If I change n_hidden to, say, 5, how does it change the neural network physically?
  • If this means that we will have 5 layers before the output layer, why do we pass through a ReLU only at the end? Shouldn't we pass through a ReLU after each layer?
  • Would thinking of a physical representation become a limitation when dealing with advanced stuff?

Figured it out:

This matrix is the matrix of weights from the inputs to the hidden-layer neurons.


[image]
image source: https://www.researchgate.net/figure/A-simple-neural-network-and-the-mapping-of-the-first-hidden-layer-onto-a-43-Weight_fig2_292077006
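So, as the figure suggests, n_hidden sets the width of the single hidden layer (the number of columns in the weight matrix, one per neuron), not the number of layers. A quick sketch with hypothetical sizes:

```python
import torch

n_inputs = 4                 # e.g. the 4 input features in the figure

# n_hidden = 3: one column of weights per hidden neuron
l1 = torch.rand(n_inputs, 3)
print(l1.shape)              # torch.Size([4, 3]) -> 3 neurons in one hidden layer

# n_hidden = 5: still a single hidden layer, just a wider one
l1 = torch.rand(n_inputs, 5)
print(l1.shape)              # torch.Size([4, 5]) -> 5 neurons in the same layer
```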


The hidden layer is a matrix followed by a ReLU.
So the sequence is M1 R1 M2 R2. If you only had M1 M2 R1, it would be equivalent to M3 R1 (with M3 = M1 × M2), so you need a ReLU after each matrix to stop consecutive matrices collapsing into one through multiplication. The ReLU breaks the chain of matrix multiplications. But please remember there are different activation options, from the standard ReLU (replace negatives with 0), to leaky ReLU (which retains some of the negative signal), to hyperbolic functions such as tanh.
Regards Conwyn