# Lesson 4 - What happens inside nn.Sequential?

Hi everyone! I’m on lesson 4 and I’ve been implementing everything from scratch following Jeremy’s suggestion in the “further research”.

I’d like to continue building the whole neural network from scratch (e.g. implement my own `nn.Sequential`), but I’m having trouble wrapping my head around how the outputs/activations of intermediate layers are treated, and how to keep track of gradients across multiple layers.

For example, let’s say I want to create the following simple neural net:

• Layer 1: 784 inputs, 30 outputs
• ReLU
• Layer 2: 30 inputs, 10 outputs (the labels)

Individually, each layer is quite simple:

• Layer 1 has 30 weights for each input pixel, right? i.e. a parameter set of weights (784, 30) and biases (30).
• ReLU is straightforward enough, as I just need max(0, x) on Layer 1’s outputs.
• Layer 2 is similar to Layer 1, just with a different shape, i.e. weights (30, 10) and biases (10).
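To make the shapes concrete, here’s a minimal sketch of how I picture those parameters and the forward pass (random initialization, variable names made up for illustration):

```python
import torch

# Parameters for the two layers described above (random init just for illustration)
weights1 = torch.randn(784, 30, requires_grad=True)
bias1 = torch.zeros(30, requires_grad=True)
weights2 = torch.randn(30, 10, requires_grad=True)
bias2 = torch.zeros(10, requires_grad=True)

def relu(x):
    return x.clamp_min(0.)  # elementwise max(0, x)

def forward(xb):
    a1 = xb @ weights1 + bias1    # (batch, 784) -> (batch, 30)
    a1 = relu(a1)                 # rectified intermediate activations
    return a1 @ weights2 + bias2  # (batch, 30) -> (batch, 10)
```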

After Layer 2, I know I’ll need to use cross-entropy loss on the final activations.

But now comes the part I don’t understand: what happens before Layer 2, in each layer in our forward and backwards steps? What do I need to do with the 30 output parameters from layer 1 before rectifying them? Is the loss function only measured at the last layer? If so, how do I treat the 30 intermediate activations? Do I need to do anything?

Another issue I have is that the gradients for Layer 1 seem to get “lost” after I rectify its outputs and pass them through to Layer 2, so calling `.backward()` on the loss doesn’t work properly: Layer 1’s parameters end up with `.grad` set to `None`, which leads to the following error when I try to access the underlying `.data` attribute.

```python
# how I am currently treating each layer:
def _forward(self, xb):
    self.preds = self.layer1.model(xb)
    self.preds = self._ReLU(self.preds)
    self.preds = self.layer2.model(xb)
```

When I try to optimize,

```
     42   def _optimize(self, *args): # my own optimizer based on lesson 4
     46   ...

AttributeError: 'NoneType' object has no attribute 'data'
```

What’s the correct way to do this?

Any help is appreciated! My goal is to implement a simple two-layer net from scratch, without the torch helpers like `nn.Sequential` and `nn.Linear`. I’m almost there! If anyone can help me understand the above, I’d be extremely grateful!

@rek
That’s great! Implementing your own `nn.Sequential` is a really good way to understand PyTorch and its internals. It’s actually not that difficult.
Are you running your `_optimize` function before passing some data through your model and calling `.backward()` on your loss?
Each parameter’s `.grad` will be `None` until `.backward()` has been run at least once.
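The ordering matters: forward pass, compute the loss, call `.backward()`, and only then step the optimizer. Here’s a minimal sketch of one training step from scratch (dummy data, made-up variable names):

```python
import torch
import torch.nn.functional as F

# Parameters for the 784 -> 30 -> 10 net (random init for illustration)
w1 = torch.randn(784, 30, requires_grad=True)
b1 = torch.zeros(30, requires_grad=True)
w2 = torch.randn(30, 10, requires_grad=True)
b2 = torch.zeros(10, requires_grad=True)
params = [w1, b1, w2, b2]
lr = 0.1

xb = torch.randn(64, 784)          # dummy batch of inputs
yb = torch.randint(0, 10, (64,))   # dummy labels

# 1. forward: each layer's output feeds the next; nothing special
#    is done with the 30 intermediate activations besides the ReLU
preds = ((xb @ w1 + b1).clamp_min(0.)) @ w2 + b2

# 2. loss is measured only on the final activations
loss = F.cross_entropy(preds, yb)

# 3. backward: now every parameter, including w1/b1, has a .grad
loss.backward()

# 4. only now can the optimizer use the gradients
with torch.no_grad():
    for p in params:
        p -= lr * p.grad   # SGD step
        p.grad.zero_()     # reset gradients for the next batch
```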