The output shapes of block1_conv1 and block1_conv2 are the same: (64, 224, 224).
Does this mean block1_conv2 has only ONE filter???
Because from what I heard in the lesson, one filter produces an output the same size as its input.

So how does it handle all the channels? Sum them up?
For 1 filter:
input: 64x224x224, output: 1x224x224

From the lesson, I understand that the 3x3 filter does not change the frame size: for each overlapping 3x3 area of the input it does a dot-product-like operation and gets 1 cell in the output.
But how does it handle multiple channels? Does it get 1 cell for each channel and then sum them up?

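Yep. It’s a weighted sum over all channels. Each filter has HxWxD weights (plus one bias), where HxW is the filter size (3x3) and D is the depth, which equals the number of input channels. So block1_conv2 does not have just one filter: it has 64 filters of size 3x3x64, and each one produces one of the 64 output channels.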
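
If it helps to see this as code, here is a rough NumPy sketch of a "same"-padded convolution with stride 1. It is not how Keras or any real library implements it; the function name conv2d_same and the 8x8 spatial size are made up for illustration (224x224 would work the same way, just slowly with these loops). The point is only that each filter multiplies a 3x3 patch across ALL input channels, sums everything into one number, and adds its bias.

```python
import numpy as np

def conv2d_same(x, weights, biases):
    """Naive stride-1 convolution with zero padding (hypothetical helper).

    x:       (C_in, H, W)          input feature maps
    weights: (C_out, C_in, kH, kW) one kH x kW x C_in filter per output channel
    biases:  (C_out,)              one bias per filter
    returns: (C_out, H, W)
    """
    c_out, c_in, kh, kw = weights.shape
    _, h, w = x.shape
    ph, pw = kh // 2, kw // 2
    # Zero-pad only the spatial dimensions so the output keeps H x W.
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    out = np.zeros((c_out, h, w))
    for f in range(c_out):
        for i in range(h):
            for j in range(w):
                patch = xp[:, i:i + kh, j:j + kw]  # shape (C_in, kH, kW)
                # Weighted sum over all channels and the 3x3 window at once.
                out[f, i, j] = np.sum(patch * weights[f]) + biases[f]
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((64, 8, 8))      # 64 input channels, small 8x8 frame
w = rng.standard_normal((64, 64, 3, 3))  # 64 filters, each 3x3x64
b = np.zeros(64)

y = conv2d_same(x, w, b)
one = conv2d_same(x, w[:1], b[:1])       # a single filter

print(y.shape)    # (64, 8, 8)
print(one.shape)  # (1, 8, 8)
print(w[0].size)  # 576 weights per filter (3 * 3 * 64)
```

So one filter takes 64 channels in and gives 1 channel out, and you get 64 output channels because the layer has 64 such filters.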