[Lesson 5] Why doesn't the RMSprop term ever become negative?

(Matthew Mitchell) #1

In lesson 5 there are some spreadsheets that show the Adam optimiser working in Excel, along with its components, including RMSprop gradient descent. The basic formulas of RMSprop are given below.

v_t = \beta * v_{t-1} + (1-\beta) * g_t^2
\theta_t = \theta_{t-1} - \frac{\alpha * g_t}{\sqrt{v_t}}
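To make the recurrence concrete, here is a rough Python sketch of a single RMSprop step for one scalar parameter. It assumes the usual form of RMSprop, in which v_t is an exponential moving average of the squared gradient (the previous value v_{t-1} weighted by beta, starting from v_0 = 0). The function name `rmsprop_step`, the small constant `eps`, the default hyperparameters, and the toy objective are all illustrative assumptions and are not taken from the lesson spreadsheet.

```python
import math


def rmsprop_step(theta, grad, v_prev, alpha=0.01, beta=0.9, eps=1e-8):
    """One RMSprop update for a scalar parameter (illustrative sketch)."""
    # Exponential moving average of the squared gradient.
    v = beta * v_prev + (1 - beta) * grad ** 2
    # Scale the step by the root of the running average.
    # eps is an assumed safeguard against division by zero.
    theta = theta - alpha * grad / (math.sqrt(v) + eps)
    return theta, v


# Toy objective (assumption): f(theta) = theta^2, so the gradient is 2 * theta.
theta, v = 5.0, 0.0
history = []  # value of v after each step
for _ in range(100):
    grad = 2 * theta
    theta, v = rmsprop_step(theta, grad, v)
    history.append(v)
```

The list `history` holds the value of v after every step, so it can be printed or plotted to see how the term under the square root behaves over the run.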

My question is… what is there to stop the expression under the square root becoming negative?
