Overview of Gradient Descent
What is Gradient Descent (GD)?
- It is an iterative optimization algorithm used to find the minimum of a function (the loss function, in the case of neural networks).
A Nice Analogy for Understanding GD:
- A person stuck on a mountain trying to get down, with minimal visibility due to fog (Source: Wikipedia).
Algorithm
- Repeat until convergence: update the weights in the direction opposite the gradient, w ← w − η∇L(w), where η is the learning rate and ∇L(w) is the gradient of the loss (Source: [1]).
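To make the update rule concrete, here is a minimal Python sketch of the basic loop. The function names, learning rate, and toy quadratic loss are illustrative choices of mine, not taken from [1].

```python
def gradient_descent(grad_fn, w0, lr=0.1, n_steps=100):
    """Repeatedly step against the gradient: w <- w - lr * grad_fn(w)."""
    w = w0
    for _ in range(n_steps):
        w = w - lr * grad_fn(w)
    return w

# Example: minimize f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
w_min = gradient_descent(lambda w: 2 * (w - 3), w0=0.0)
print(w_min)  # converges toward 3.0
```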
Variants of Gradient Descent
Source: [2]
- Stochastic Gradient Descent (SGD): weights are updated using one sample at a time, so the batch size is 1; for 100 samples, the weights are updated 100 times per epoch.
- Batch Gradient Descent: weights are updated using the whole dataset at once; for 100 samples, the weights are updated only once per epoch.
- Mini-Batch Gradient Descent: a middle ground combining the two above. The dataset is split into batches of a size of our choice, with samples chosen at random (see the sketch after this list).
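The three variants differ only in how many samples feed each weight update. Below is a hypothetical NumPy sketch on a toy linear model with squared loss; the data, model, learning rate, and the helper name sgd_epoch are all illustrative assumptions, not from [1] or [2].

```python
import numpy as np

def sgd_epoch(X, y, w, lr, batch_size):
    """One epoch of gradient descent on a linear model with squared loss.

    batch_size = 1       -> stochastic GD (100 updates for 100 samples)
    batch_size = len(X)  -> batch GD (1 update per epoch)
    anything in between  -> mini-batch GD
    """
    idx = np.random.permutation(len(X))  # shuffle so batches are random
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        Xb, yb = X[batch], y[batch]
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(batch)  # gradient of MSE
        w = w - lr * grad
    return w

# Toy data: y ≈ 2 * x with noise; 100 samples, 1 feature.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 1))
y = 2 * X[:, 0] + 0.1 * rng.normal(size=100)

w = np.zeros(1)
for _ in range(50):
    w = sgd_epoch(X, y, w, lr=0.05, batch_size=10)  # mini-batches of 10
print(w)  # approaches [2.0]
```

Changing only the batch_size argument switches between the three variants, which is the whole point of the comparison above.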
[1] https://medium.com/@divakar_239/stochastic-vs-batch-gradient-descent-8820568eada1
[2] https://suniljangirblog.wordpress.com/2018/12/13/variants-of-gradient-descent/
I hope this clarifies the different variants of gradient descent.