An overview of gradient descent optimization algorithms

Hi all, I’ve written a similar (hopefully more digestible) article on 10 Gradient Descent Optimisation Algorithms and compiled the algorithms into a cheat sheet. I hope you find it useful too. Let me know if you have any feedback!