Lesson 5 - Questions
These questions were compiled as part of the Part 1 online study group.
Audience: Beginner-Intermediate
If you have watched Lesson 5 only once or twice, test your understanding using the questions below. If you can answer each question in two or three sentences, you have a good grasp of the Lesson 5 concepts; otherwise, consider reviewing the lecture and notes again before moving on. Answers can be found in the meeting notes for Lesson 5.
- Why are ReLUs needed in neural networks (NN)?
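  A minimal NumPy sketch (matrix names are illustrative) of why the non-linearity matters: without one, two stacked affine layers collapse into a single affine layer, so extra depth adds nothing.

  ```python
  import numpy as np

  rng = np.random.default_rng(0)
  W1 = rng.normal(size=(4, 3))   # first layer's weight matrix (biases omitted for brevity)
  W2 = rng.normal(size=(2, 4))   # second layer's weight matrix
  x = rng.normal(size=3)

  # Without a non-linearity, two matrix multiplies collapse into one:
  assert np.allclose(W2 @ (W1 @ x), (W2 @ W1) @ x)

  # With a ReLU between them, the composition is no longer a single matrix multiply:
  relu = lambda z: np.maximum(0, z)
  out = W2 @ relu(W1 @ x)
  ```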
- Is an affine function a linear function?
- Does the bias-variance trade-off occur in deep learning as well?
- What is variance?
- Do too many parameters in a NN mean higher variance?
- Why is freezing needed for fine-tuning? What happens when we freeze?
- Why do we then unfreeze and train the entire model?
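  A hedged PyTorch sketch of what freezing does mechanically; the `body`/`head` split here is a stand-in for a pretrained backbone plus a newly added classifier head.

  ```python
  import torch.nn as nn

  body = nn.Sequential(nn.Linear(10, 8), nn.ReLU())  # stands in for pretrained layers
  head = nn.Linear(8, 2)                             # new, randomly initialised head
  model = nn.Sequential(body, head)

  # Freeze: stop gradients for the pretrained body, so only the head trains at first
  for p in body.parameters():
      p.requires_grad = False

  # Unfreeze: let the whole model train (usually with smaller LRs for early layers)
  for p in body.parameters():
      p.requires_grad = True
  ```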
- Can you explain how learning rates are applied to the layers in each of the cases below? (See the sketch after this list.)
  - `1e-3`
  - `slice(1e-3)`
  - `slice(1e-5, 1e-3)`
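  A rough pure-Python sketch of how fastai v1 is described as spreading these values across layer groups (three groups assumed; the helper name is hypothetical, not the library's API):

  ```python
  import numpy as np

  def lrs_per_group(lr, n_groups=3):
      if not isinstance(lr, slice):
          # 1e-3: the same LR for every layer group
          return [lr] * n_groups
      if lr.start is None:
          # slice(1e-3): last group gets 1e-3, earlier groups get 1e-3 / 3
          return [lr.stop / 3] * (n_groups - 1) + [lr.stop]
      # slice(1e-5, 1e-3): LRs spread multiplicatively from first to last group
      return list(np.geomspace(lr.start, lr.stop, n_groups))

  print(lrs_per_group(1e-3))               # [0.001, 0.001, 0.001]
  print(lrs_per_group(slice(1e-3)))        # [0.000333..., 0.000333..., 0.001]
  print(lrs_per_group(slice(1e-5, 1e-3)))  # [1e-05, 0.0001, 0.001]
  ```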
- Can you identify the three variants of gradient descent (GD)? How many training samples are used, and when are the weights updated, in each variant? Does stochastic gradient descent mean using mini-batches and updating the weights after each mini-batch? (See the sketch below.)
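  A NumPy sketch contrasting the three variants on a toy linear model with a mean-squared-error loss (data and sizes are made up):

  ```python
  import numpy as np

  rng = np.random.default_rng(0)
  X, y = rng.normal(size=(1000, 10)), rng.normal(size=1000)
  w, lr, bs = np.zeros(10), 0.1, 32

  def grad(Xb, yb, w):
      # gradient of MSE for a linear model y_hat = Xb @ w
      return 2 * Xb.T @ (Xb @ w - yb) / len(yb)

  # (Full-)batch GD: all 1000 samples, one weight update per epoch
  w -= lr * grad(X, y, w)

  # Stochastic GD in the strict sense: one sample per weight update
  i = rng.integers(len(y))
  w -= lr * grad(X[i:i + 1], y[i:i + 1], w)

  # Mini-batch GD: bs samples per update -- what "SGD" usually means in practice
  idx = rng.choice(len(y), size=bs, replace=False)
  w -= lr * grad(X[idx], y[idx], w)
  ```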
- How and when do you update the weights? Can you describe the sequence of operations in backpropagation? (A sketch follows.)
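  A minimal PyTorch loop body showing the canonical sequence (model, data, and hyperparameters are placeholders):

  ```python
  import torch
  import torch.nn.functional as F

  model = torch.nn.Linear(10, 1)
  opt = torch.optim.SGD(model.parameters(), lr=1e-3)
  x, y = torch.randn(32, 10), torch.randn(32, 1)

  pred = model(x)             # 1. forward pass: compute predictions
  loss = F.mse_loss(pred, y)  # 2. compute the loss
  loss.backward()             # 3. backpropagation: gradients of loss w.r.t. each weight
  opt.step()                  # 4. weight update: w -= lr * w.grad
  opt.zero_grad()             # 5. reset gradients before the next iteration
  ```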
- What is learning rate (LR) annealing? Why do we anneal the LR?
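  A small NumPy sketch of common annealing schedules (the constants are chosen only for illustration):

  ```python
  import numpy as np

  lr0, n_iters = 0.1, 100
  t = np.arange(n_iters)

  step_decay = lr0 * 0.5 ** (t // 30)                     # halve the LR every 30 iterations
  exp_decay = lr0 * np.exp(-0.03 * t)                     # smooth exponential decay
  cosine = 0.5 * lr0 * (1 + np.cos(np.pi * t / n_iters))  # cosine annealing toward ~0
  ```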
- Why do we apply the exponential in softmax?
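  A NumPy sketch of softmax: the exponential forces every output to be positive and amplifies the largest activation; subtracting the max is a numerical-stability detail, not part of the math.

  ```python
  import numpy as np

  def softmax(z):
      e = np.exp(z - z.max())  # exp: all positive, big activations dominate
      return e / e.sum()       # normalise so the outputs sum to 1

  print(softmax(np.array([1.0, 2.0, 3.0])))  # ~[0.09, 0.24, 0.67]
  ```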
- What is the difference between a loss function and a cost function?
- What is the difference between an epoch and an iteration?
- Why do we need a cyclical learning rate? And what happens to momentum during one cycle?
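  A simplified, linear sketch of one cycle (fastai's actual schedule uses cosine interpolation and a non-zero starting LR): the LR rises then falls, while momentum mirrors it.

  ```python
  def one_cycle(t, total, lr_max=1e-3, mom_min=0.85, mom_max=0.95):
      # first half: LR ramps up while momentum ramps down; second half: the reverse
      half = total / 2
      p = t / half if t <= half else (total - t) / half  # 0 -> 1 -> 0 over the cycle
      return lr_max * p, mom_max - (mom_max - mom_min) * p

  for t in (0, 25, 50, 75, 100):
      lr, mom = one_cycle(t, 100)
      print(f"t={t:3d}  lr={lr:.1e}  momentum={mom:.3f}")
  ```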
- What are entropy and softmax?
- When should you use cross-entropy instead of, say, root mean square error (RMSE)? (A sketch follows.)
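  A NumPy sketch of why cross-entropy suits classification: it reads off the probability assigned to the correct class and punishes confident mistakes without bound, whereas RMSE treats every output dimension alike (the 3-class example is made up):

  ```python
  import numpy as np

  def cross_entropy(probs, target):
      return -np.log(probs[target])  # -log of the probability given to the true class

  def rmse(pred, target):
      return np.sqrt(np.mean((pred - target) ** 2))

  probs = np.array([0.8, 0.1, 0.1])   # softmax output for a 3-class problem
  onehot = np.array([1.0, 0.0, 0.0])
  print(cross_entropy(probs, 0), rmse(probs, onehot))  # correct & confident: both small
  print(cross_entropy(probs, 1))  # confident & wrong: -log(0.1) ~ 2.3, grows without bound
  ```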