Part 1, online study group

Overview of Gradient Descent

What is Gradient Descent (GD)?

  • It is an iterative optimization algorithm for finding a minimum of a function (the loss function, in a neural network) by repeatedly stepping in the direction of the negative gradient.
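
To make the idea concrete, here is a minimal sketch of gradient descent on a toy one-dimensional function. The function f(x) = (x - 3)^2, the learning rate, the step count, and the names `gradient_descent` and `toy_grad` are all assumptions chosen for illustration; a real neural-network loss has many parameters, but the update rule has the same shape.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move a small step in the direction that decreases the function.
        x = x - learning_rate * grad(x)
    return x


# Toy loss (an assumption for illustration): f(x) = (x - 3)^2.
# Its gradient is 2 * (x - 3), and its minimum is at x = 3.
def toy_grad(x):
    return 2.0 * (x - 3.0)


x_min = gradient_descent(toy_grad, x0=0.0)
print(x_min)  # very close to 3.0
```

A learning rate that is too large can make the steps overshoot the minimum, while one that is too small makes progress slow, so this value usually has to be tuned.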

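A nice analogy for understanding GD:

  • A person is stuck on a mountain and trying to get down with minimal visibility due to fog (Source: Wikipedia). Unable to see the path, they feel the slope under their feet and step in the steepest downhill direction.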


Source: [1]

Variants of Gradient Descent

Source: [2]

  • Stochastic Gradient Descent: weights are updated using one sample at a time, so batch_size is 1. For 100 samples, the weights are updated 100 times per epoch.
  • Batch Gradient Descent: weights are updated using the whole dataset at once. For 100 samples, the weights are updated only once per epoch.
  • Mini-Batch Gradient Descent: a middle ground between the two above. The dataset is split into randomly chosen batches of a size we pick, and the weights are updated once per batch.
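
The three variants differ only in the batch size, which the following sketch tries to show by counting weight updates in one epoch. Everything here is an assumption for illustration: the one-parameter model y = w * x, the mean squared error loss, the toy data, the learning rate, and the name `train_one_epoch`.

```python
import random


def train_one_epoch(xs, ys, w, batch_size, learning_rate=0.01, seed=0):
    """Run one epoch of gradient descent for the model y = w * x.

    Returns the updated weight and the number of weight updates performed.
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rng.shuffle(indices)  # samples are assigned to batches at random
    updates = 0
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Gradient of the mean squared error over this batch with respect to w.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
        w -= learning_rate * grad
        updates += 1
    return w, updates


# Hypothetical toy data: 100 samples drawn from y = 2x.
xs = [i / 100.0 for i in range(100)]
ys = [2.0 * x for x in xs]

_, sgd_updates = train_one_epoch(xs, ys, w=0.0, batch_size=1)      # stochastic
_, batch_updates = train_one_epoch(xs, ys, w=0.0, batch_size=100)  # full batch
_, mini_updates = train_one_epoch(xs, ys, w=0.0, batch_size=20)    # mini-batch
print(sgd_updates, batch_updates, mini_updates)  # prints: 100 1 5
```

With 100 samples, a batch size of 1 gives 100 updates, a batch size of 100 gives 1 update, and a batch size of 20 gives 5 updates per epoch.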


I hope this clarifies the different variants of gradient descent.