Hello!
I know what gradients are, but I'm confused about how many gradients we calculate during gradient descent.
In the 04_mnist_basics chapter, we use a quadratic as our loss function:
f(x) = ax^2 + bx + c
Now the minimum loss for this function would be at the vertex of the quadratic (assuming a > 0 so it opens upward), since the gradient there is zero. The derivative of the quadratic above is:
f'(x) = 2ax + b
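(Just to check my own algebra: setting f'(x) = 0 gives x = -b / (2a), which is indeed the vertex.)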
So we would want to find the weights that drive that gradient to 0.
However, the lesson calculates multiple gradients, one each for a, b, and c, and that's what is confusing me.
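If I'm reading it right, the lesson does something roughly like this (my paraphrase from memory, so the names and numbers may differ from the actual notebook):

```python
import torch

# made-up data just for illustration (the lesson uses roller-coaster speeds)
t = torch.arange(0, 20).float()
targets = 0.75 * (t - 9.5) ** 2 + torch.randn(20)

# one tensor holding all three weights: a, b, c
params = torch.tensor([1.0, 1.0, 1.0], requires_grad=True)

def quad(t, params):
    a, b, c = params
    return a * t**2 + b * t + c

preds = quad(t, params)
loss = ((preds - targets) ** 2).mean()  # MSE between predictions and targets
loss.backward()

print(params.grad)  # three numbers: one gradient each for a, b, and c
```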
My understanding of a gradient is a single value that represents the slope of a function at a point, and that value is given by the corresponding derivative function evaluated at that point.
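For example, with a = 1, b = 3, c = 0, the slope at x = 2 would be f'(2) = 2*1*2 + 3 = 7, i.e. just one number.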
What do the gradients for a, b, and c represent? Each weight having its own gradient isn't making sense to me.
I would appreciate clarification on this! The relevant section in the lesson, if you want to have a look, is Stochastic Gradient Descent (SGD).