fast.ai Course Forums

Part 1, online study group

msivanes (Manikandan Sivanesan) February 18, 2020, 12:08am 128

Overview of Gradient Descent

What is Gradient Descent (GD)?

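It is an iterative optimization algorithm for finding a minimum of a function (in a neural network, the loss function). At each step it moves the parameters a small distance in the direction opposite to the gradient.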

A nice analogy for understanding GD:

A person is stuck on a mountain and trying to get down with minimal visibility due to fog, so they can only feel the slope where they stand and step in the steepest downhill direction (Source: Wikipedia).

Algorithm

Source: [1]
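
To make the update rule concrete: each step replaces the weight w with w - lr * gradient(w), where lr is the learning rate. Below is a minimal sketch in plain Python, not taken from the source. The function name gradient_descent, the toy loss f(w) = (w - 3)^2, and the values of lr and steps are all assumptions chosen for illustration.

```python
def gradient_descent(grad, w0, lr=0.1, steps=100):
    """Repeatedly step against the gradient, starting from w0.

    grad:  function returning the gradient of the loss at w (hypothetical helper)
    lr:    learning rate (step size), an illustrative value
    steps: number of updates to perform
    """
    w = w0
    for _ in range(steps):
        w = w - lr * grad(w)  # move downhill: opposite to the gradient
    return w


# Toy loss f(w) = (w - 3)^2 has gradient 2 * (w - 3) and its minimum at w = 3.
def toy_grad(w):
    return 2 * (w - 3)


w_min = gradient_descent(toy_grad, w0=0.0)
print(w_min)
```

With a learning rate that is too large the steps overshoot and can diverge, and with one that is too small convergence is slow, which is why lr is usually tuned.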

Variants of Gradient Descent

Source: [2]

Stochastic Gradient Descent: the weights are updated using one sample at a time, so batch_size is 1. For 100 samples, the weights are updated 100 times per epoch.
Batch Gradient Descent: the weights are updated using the whole dataset. For 100 samples, the weights are updated only once per epoch.
Mini-Batch Gradient Descent: a middle ground between the two. The dataset is split into randomly chosen batches of a size we pick, and the weights are updated once per batch.
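
The three variants differ only in the batch size, which a short sketch can show. This is an illustrative example in plain Python, not code from the cited sources. The helper run_epoch, the one-weight model y_hat = w * x, the synthetic data, and the learning rate are all assumptions.

```python
import random


def run_epoch(xs, ys, w, batch_size, lr=0.01, seed=0):
    """One pass over the data for the model y_hat = w * x with squared-error loss.

    Returns the updated weight and the number of weight updates performed.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rng.shuffle(indices)  # samples are assigned to batches at random
    updates = 0
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Gradient of the mean squared error over this batch with respect to w
        grad = sum(2 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
        w -= lr * grad
        updates += 1
    return w, updates


# Synthetic dataset of 100 samples generated from y = 2x (illustrative only)
xs = [i / 100 for i in range(100)]
ys = [2.0 * x for x in xs]

_, sgd_updates = run_epoch(xs, ys, w=0.0, batch_size=1)      # stochastic GD
_, batch_updates = run_epoch(xs, ys, w=0.0, batch_size=100)  # batch GD
_, mini_updates = run_epoch(xs, ys, w=0.0, batch_size=10)    # mini-batch GD

print(sgd_updates, batch_updates, mini_updates)
```

Only batch_size changes between the three calls: a value of 1 gives stochastic GD, the dataset size gives batch GD, and anything in between gives mini-batch GD.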

[1] https://medium.com/@divakar_239/stochastic-vs-batch-gradient-descent-8820568eada1
[2] https://suniljangirblog.wordpress.com/2018/12/13/variants-of-gradient-descent/

I hope this clarifies the different variants of gradient descent.