What exactly "learns" in a convolution?

Hi everyone,

I came across this course online and it has been extremely helpful to me in understanding neural networks. So thanks a lot to @jeremy and @rachel.

But I’m still trying to understand how learning happens in Convolutions. In Lesson 4 (or so), Jeremy explained Gradient Descent and I completely understood what goes on in a Dense Layer, but he said Gradient Descent also happens in Convolutions, and I haven’t figured out how it takes place.

Here is what I know so far:
In a Dense Layer,
a) Weights (w and b) are initialised
b) They are used to calculate wx+b on the input (x)
c) The result is compared against the actual value (y) and a loss function is used to calculate the error and keep track of how far away we are from the actual prediction.
d) SGD adds a little bit to a weight and recomputes the loss, using the change to estimate a derivative (hope that is what it’s called)
e) The original loss is subtracted from the nudged loss and the difference is divided by the little bit that was added; the result is multiplied by the learning rate and subtracted from the previous weight (rough sketch after this list).
The weights are being changed and that, I assume, is where the learning happens.
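
Here is a toy sketch of how I picture steps (a) to (e), with just one weight and one made-up data point (not real fast.ai code, just my mental model):

```python
# My mental model of one SGD step, with a single weight and bias
w, b = 0.5, 0.0              # a) initialise the weights
x, y = 3.0, 10.0             # one input and its actual value
lr, eps = 0.01, 1e-4         # learning rate and the "little bit"

pred = w * x + b             # b) calculate wx + b
loss = (pred - y) ** 2       # c) the loss tells us how far off we are

# d) add a little bit to w and see how the loss changes
nudged_loss = ((w + eps) * x + b - y) ** 2

# e) estimate the derivative and take a step
dw = (nudged_loss - loss) / eps
w = w - lr * dw
print(w)                     # the weight has moved a little towards a better value
```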

For convolutions,
a) A number of 3x3 filters are passed over the input, each of them detecting different features.
b) The results of the filter multiplications are passed through an activation function and passed on to another convolution or max-pooling layer, and so on until we get to the Dense layers (rough sketch of step (a) below).
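
In code, I picture step (a) roughly like this (a made-up toy example with one hand-written filter, not how the course actually builds it):

```python
import torch
import torch.nn.functional as F

img = torch.randn(1, 1, 6, 6)                    # a tiny 6x6 single-channel "image"
edge_filter = torch.tensor([[[[-1., 0., 1.],
                              [-1., 0., 1.],
                              [-1., 0., 1.]]]])  # one 3x3 filter that picks up vertical edges
feature_map = F.conv2d(img, edge_filter)         # slide the filter over the image
activated = torch.relu(feature_map)              # b) pass the result through an activation
print(activated.shape)                           # torch.Size([1, 1, 4, 4])
```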
My assumption is that
a) either the filters cannot be changed, because each one of them captures some information about the input; if they were constantly changed, they would no longer be able to capture the same information,
b) or the changes are not being overwritten, so the model keeps every single filter for every epoch you run, which I think is highly unlikely.

So please, how exactly does learning take place in a Convolution?

No, they’re just updated with SGD in the normal way (exactly like the dense version you describe).
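
If it helps, here’s a tiny PyTorch sketch (not from the course notebooks) showing that the 3x3 filters are just another tensor of weights that the optimizer nudges:

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 8, kernel_size=3)            # 8 learnable 3x3 filters, randomly initialised
opt = torch.optim.SGD(conv.parameters(), lr=0.1)

x = torch.randn(16, 1, 28, 28)                   # dummy batch of images
y = torch.randn(16, 8, 26, 26)                   # dummy target, just for illustration

loss = ((conv(x) - y) ** 2).mean()
loss.backward()                                  # gradient of the loss w.r.t. every filter value
opt.step()                                       # the filters move, exactly like dense weights
```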

BTW, the gradient is calculated analytically, not by adding a bit to the weights.
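
For example (a minimal sketch, not course code), autograd gives you the exact derivative in one backward pass, whereas adding a bit to the weight only approximates it:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)
x, y = torch.tensor(3.0), torch.tensor(10.0)

loss = (w * x - y) ** 2
loss.backward()
print(w.grad)                        # analytic: 2 * x * (w*x - y) = -24.0

eps = 1e-4                           # "adding a bit" only approximates the same number
with torch.no_grad():
    nudged = ((w + eps) * x - y) ** 2
    print((nudged - loss) / eps)     # roughly -24, but not exact
```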

Very good explanation of convs: