In the `04-how-does-a-neural-net-really-work.ipynb` notebook walkthrough in Lesson 3 of the series, the following was utterly confusing to me: I couldn't wrap my head around why simply calling `quad_mae` on `abc` set a `grad_fn` on the returned loss and magically associated it with `abc`, such that calling `loss.backward()` would update `abc.grad`. What???
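For context, here's a minimal sketch of the setup that confused me, with the notebook's definitions reproduced from memory (the `x`/`y` toy data below are my own stand-ins, not the notebook's actual values):

```python
import torch
from functools import partial

# definitions roughly as in the notebook
def quad(a, b, c, x): return a*x**2 + b*x + c
def mk_quad(a, b, c): return partial(quad, a, b, c)
def mae(preds, acts): return (torch.abs(preds - acts)).mean()

def quad_mae(params):
    f = mk_quad(*params)
    return mae(f(x), y)

# toy data standing in for the notebook's x and y (assumed, not the real values)
x = torch.linspace(-2, 2, steps=20)
y = 3*x**2 + 2*x + 1 + torch.randn(20)

abc = torch.tensor([1.1, 1.1, 1.1])
abc.requires_grad_()   # mark abc as a leaf tensor that should collect gradients

loss = quad_mae(abc)   # loss already carries a grad_fn when it comes back
loss.backward()        # walk the recorded graph back to abc
print(abc.grad)        # a 3-element tensor of gradients, one per coefficient
```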
So I performed the steps done inside the `quad_mae` function manually to see what was actually happening.
It turns out that doing `mk_quad(*abc)` actually unpacks the coefficients into single-valued tensors and attaches a `grad_fn` to each of them.
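You can see this directly: tuple-unpacking a tensor indexes it element by element, and each index into a tensor that requires grad records a select node (printed as `SelectBackward0` in recent PyTorch versions, `SelectBackward` in older ones). Continuing from the snippet above:

```python
a, b, c = abc            # *abc does exactly this kind of element-wise indexing
print(a)                 # tensor(1.1000, grad_fn=<SelectBackward0>)
print(a.grad_fn)         # the select node that routes gradients back into abc

quad_f = mk_quad(*abc)   # the partial now closes over three such 0-dim tensors
```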
Then doing `loss2 = mae(quad_f(x), y)` again attaches a `grad_fn`, one that PyTorch decides is appropriate for the operation, in this case `MeanBackward0`. I believe this was chosen because we call `.mean()` in the `mae` function. To test that, I tried changing `.mean()` to `.min()` in `mae`, and the `grad_fn` on `loss2` changed to `MinBackward1`.
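Here's a sketch of that experiment; `mae_min` is a name I made up so both variants can sit side by side instead of editing `mae` in place:

```python
loss2 = mae(quad_f(x), y)
print(loss2.grad_fn)      # <MeanBackward0 object at 0x...> -- from .mean()

# hypothetical variant: identical to mae, but reducing with .min() instead
def mae_min(preds, acts): return (torch.abs(preds - acts)).min()

loss3 = mae_min(quad_f(x), y)
print(loss3.grad_fn)      # <MinBackward1 object at 0x...> -- from .min()
```

In other words, `grad_fn` just names the backward node of whatever operation produced that tensor last, which is why swapping the reduction swaps the name.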
I'm a bit relieved to have finally figured this out.