As said in the part 1 of the deep learning course the earlier layer would require less changes, so vanishing gradient should not be a problem. is it correct?
As said in the part 1 of the deep learning course the earlier layer would require less changes, so vanishing gradient should not be a problem. is it correct?