After Lesson 5, I followed Jeremy’s suggestions and tried a few different pieces of homework.
Writing my own linear class for the Lesson 5 SGD notebook:
- Per Jeremy’s suggestion, I decided to try out writing my own linear class (to use instead of the nn.Linear class)
- I’m still quite new to coding, so this was pretty difficult for me (it took a few hours, with lots of Googling). After completing it, I feel like I know a lot more about PyTorch, so it was definitely a valuable exercise
- The full code is available on GitHub (a rough sketch of the idea is shown below)
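For reference, a minimal version of such a class looks roughly like the sketch below. The class name `MyLinear` and the simple scaled-random initialisation are illustrative choices, not necessarily exactly what the notebook code uses:

```python
import torch
from torch import nn

class MyLinear(nn.Module):
    """A from-scratch stand-in for nn.Linear: y = x @ W.T + b."""
    def __init__(self, n_in, n_out):
        super().__init__()
        # Register the weights and bias as Parameters so the optimiser can update them
        self.weight = nn.Parameter(torch.randn(n_out, n_in) * (n_in ** -0.5))
        self.bias = nn.Parameter(torch.zeros(n_out))

    def forward(self, x):
        return x @ self.weight.t() + self.bias

layer = MyLinear(784, 10)
out = layer(torch.randn(64, 784))   # shape: (64, 10)
```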
Adding momentum into the Lesson 2 SGD notebook:
- I edited the SGD update function to incorporate momentum. For each step, I calculated the size of the step as `step = moms * last_step + (1-moms) * gradient`. This was then multiplied by the learning rate to get the adjustment to the weights. For the first epoch, I set momentum to zero (a previous step does not exist yet, so the formula has nothing to blend with). See the sketch after this list
- I found that with the momentum value set to 0.8, the model trained significantly faster than the previous version
- The code is available on GitHub
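Here is a self-contained sketch of that update with momentum. The synthetic data and the exact names (`a`, `last_step`) are just for illustration; the momentum formula is the one described above:

```python
import torch

def mse(y_hat, y):
    return ((y_hat - y) ** 2).mean()

# Synthetic linear data, similar in spirit to the Lesson 2 notebook
n = 100
x = torch.ones(n, 2)
x[:, 0].uniform_(-1., 1.)
y = x @ torch.tensor([3., 2.]) + torch.randn(n) * 0.1

a = torch.randn(2, requires_grad=True)   # the parameters being learned
last_step = torch.zeros_like(a)          # the previous update direction

def update(lr=0.1, moms=0.8):
    global last_step
    loss = mse(x @ a, y)
    loss.backward()
    with torch.no_grad():
        # step = moms * last_step + (1 - moms) * gradient
        step = moms * last_step + (1 - moms) * a.grad
        a.sub_(lr * step)                # adjust the weights by lr * step
        last_step = step.clone()         # remember this step for next time
        a.grad.zero_()
    return loss.item()

for epoch in range(100):
    # momentum is zero for the first epoch, since no previous step exists yet
    update(moms=0. if epoch == 0 else 0.8)
```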
Adding momentum into the update function in the Lesson 5 SGD notebook:
- Similarly to what I did for the Lesson 2 notebook, I edited the Lesson 5 update function to incorporate momentum. This was a bit harder than the Lesson 2 version, since this model contains multiple sets of parameters (a rough sketch of the approach is below)
- The code is available on GitHub
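A sketch of how the same idea extends to a model with several parameter tensors is shown below. The toy model and the batch of fake data are assumptions for illustration; the key point is keeping one `last_step` tensor per parameter:

```python
import torch
from torch import nn

# A toy model with several parameter tensors (two layers of weights and biases)
model = nn.Sequential(nn.Linear(784, 50), nn.ReLU(), nn.Linear(50, 10))
loss_func = nn.CrossEntropyLoss()

# One "last step" tensor per parameter, so momentum is tracked for each separately
last_steps = [torch.zeros_like(p) for p in model.parameters()]

def update(x, y, lr=0.1, moms=0.8):
    loss = loss_func(model(x), y)
    loss.backward()
    with torch.no_grad():
        for p, last in zip(model.parameters(), last_steps):
            step = moms * last + (1 - moms) * p.grad
            p.sub_(lr * step)        # adjust this parameter by lr * step
            last.copy_(step)         # remember the step for the next update
            p.grad.zero_()
    return loss.item()

# A batch of fake MNIST-shaped data; momentum is zero on the very first call
x, y = torch.randn(64, 784), torch.randint(0, 10, (64,))
update(x, y, moms=0.)
update(x, y)
```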
Going back to previous models and trying to improve them:
- During Lesson 5, we learnt about some different options you can use when building models (e.g. weight decay, momentum)
- I went back to one of my previous models and experimented with the weight decay value (increasing it well above its default of 1e-2), and was able to improve the accuracy significantly (a minimal illustration of where weight decay plugs in is below)
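In plain PyTorch, weight decay is the `weight_decay` argument to the optimiser (fastai exposes the same idea as the `wd` argument on the learner). The value of 0.1 below is just an example of "a lot more than 1e-2", not necessarily the value I ended up with:

```python
import torch
from torch import nn, optim

model = nn.Linear(10, 1)

# weight_decay penalises large weights; 0.1 is deliberately much larger than
# the usual default of 1e-2 (the best value is found by experimenting)
opt = optim.SGD(model.parameters(), lr=0.1, momentum=0.8, weight_decay=0.1)

x, y = torch.randn(32, 10), torch.randn(32, 1)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
opt.step()
opt.zero_grad()
```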