I’m attempting to implement a CNN from scratch to consolidate what I’ve been learning with the MOOC (I’m still at the beginning).

Is it correct to assume that when backpropagation happens, all weights are updated simultaneously? i.e. when the weights are updated, a weight's individual impact on the outputs is not taken into consideration when calculating the next weight's adjustment.

The alternative view would be that weights are adjusted from the last layer to the first: i.e. first the weights between the output layer and the hidden layer, and only afterwards the weights between the input layer and the hidden layer, taking into account the new outputs caused by the already-adjusted hidden weights.
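For what it's worth, the first view matches standard backprop: gradients for every layer are computed from the forward pass using the original, pre-update weights, and only then is everything updated in one step. Here is a minimal sketch with a hypothetical 2-layer network (sizes and squared-error loss are my own choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))          # input
y = np.array([[1.0]])                # target
W1 = rng.normal(size=(4, 3))         # input -> hidden
W2 = rng.normal(size=(1, 4))         # hidden -> output
lr = 0.1

# forward pass
h = np.tanh(W1 @ x)
out = W2 @ h

# backward pass: gradients flow from the last layer to the first,
# but BOTH gradients are computed from the original weights
d_out = out - y                      # dL/d_out for squared error
grad_W2 = d_out @ h.T
d_h = (W2.T @ d_out) * (1 - h**2)    # uses the ORIGINAL W2, not an updated one
grad_W1 = d_h @ x.T

# simultaneous update: W2's new value never influenced grad_W1
W2 -= lr * grad_W2
W1 -= lr * grad_W1
```

So the "layer by layer" part is true of the gradient *computation* (that is the chain rule at work), but the *updates* are applied together after all gradients are known.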

I found the two lectures on neural nets by MIT’s professor Patrick Winston very insightful and intuitive. I strongly suggest anyone interested in artificial intelligence watch the whole course. He is one of the best professors I have had the honor to learn from.

Just watched both recommended videos, and indeed they were a great help. Excellent professor and lectures. I’ll take some time to go through the rest of the course.

Yes. Think of it this way…
You are trying to update the weight of the edge that is contributing the most to the error. The more a weight contributes to the error (log loss), the more we penalize/change the weight of that edge.
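A tiny illustration of that intuition, assuming a single linear neuron with squared error (my own toy setup, not from the thread): the gradient of each weight scales with that edge's input, so the edge contributing more to the output (and hence to the error) gets the larger adjustment.

```python
import numpy as np

x = np.array([2.0, 0.1])     # two inputs: one strong, one weak
w = np.array([0.5, 0.5])
y_true = 0.0

y_pred = w @ x               # linear neuron
err = y_pred - y_true        # dL/dy_pred for squared error (up to a constant)
grads = err * x              # dL/dw_i = err * x_i

# the weight on the stronger input receives the larger update
print(grads)
```

Here `grads[0]` is much larger in magnitude than `grads[1]`, so gradient descent changes the weight of the edge that contributed more.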