Please use this thread to discuss lesson 8. Since this is Part 2, feel free to ask more advanced or slightly tangential questions - although if your question is not related to the lesson much at all, please use a different topic.
Thread for general chit chat (we won’t be monitoring this).
Note that this is a forum wiki thread, so you all can edit this post to add/change/organize info to help make it better! To edit, click on the little edit icon at the bottom of this post.
Lesson resources
- Edited lesson video
- Slides
- Course notebooks
- Excel spreadsheets (today’s is called `broadcasting.xlsx`). There’s also a Google Sheet version thanks to @Moody
- Ensure your fastai lib is up to date
- You’ll also need to: `conda install nbconvert`
- You’ll also need to: `conda install fire -c conda-forge`
- Notes thread
Errata
Sometimes, occasionally, shockingly, Jeremy makes mistakes. It is rumored that these mistakes were made in this lesson:
- Jeremy claimed that these are equivalent:

```python
for i in range(ar):
    c[i] = (a[i].unsqueeze(-1)*b).sum(dim=0)
    c[i] = (a[i,None]*b).sum(dim=0)
```

But they’re not (noticed by @stas - thanks!). The second one isn’t indexing the second axis: `a[i,None]` has shape `(1, ac)` rather than `(ac, 1)`, so it broadcasts across the wrong dimension. It should be:

```python
for i in range(ar):
    c[i] = (a[i,:,None]*b).sum(dim=0)
```
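If you want to verify the shapes yourself, here’s a minimal sketch (the variable names follow the lesson’s matmul notebook, but the sizes are just an example):

```python
import torch

# a plays the role of the (ar, ac) matrix and b the (ac, bc) matrix from the notebook
a, b = torch.randn(3, 4), torch.randn(4, 5)

print(a[0].unsqueeze(-1).shape)  # torch.Size([4, 1]) - a column, broadcasts across b's columns
print(a[0, None].shape)          # torch.Size([1, 4]) - a row, broadcasts the wrong way
print(a[0, :, None].shape)       # torch.Size([4, 1]) - same as unsqueeze(-1)

# the corrected broadcast version agrees with an ordinary vector-matrix product
print(torch.allclose((a[0, :, None] * b).sum(dim=0), a[0] @ b))  # True
```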
Things mentioned in the lesson
- Jeremy’s blog posts about Swift: “fast.ai Embracing Swift for Deep Learning” and “High Performance Numeric Programming with Swift: Explorations and Reflections”
- Rachel’s post on starting to blog: Why you (yes, you) should blog
- Numpy docs on broadcasting
- Numpy docs on einsum (has lots of great examples)
- The Matrix Calculus You Need For Deep Learning by Terence Parr and Jeremy
- Detexify (for finding math symbols) and Wikipedia list of symbols
- The matrix multiplication song
- Thread by @pierreguillou seeking feedback on best practices for study groups
Papers
- Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification - the 2015 paper that introduced Kaiming initialization (and PReLU), and first surpassed human-level performance on ImageNet
- Understanding the difficulty of training deep feedforward neural networks - the paper that introduced Xavier initialization
- Fixup Initialization: Residual Learning Without Normalization - paper highlighting the importance of initialisation: with a careful init it trains a 10,000-layer network without any normalisation
Notes and other resources
- Use this Jupyter notebook to start running the “Deep Learning From the Foundations” notebooks on Colab
- Annotated notebooks for Lessons 8 - 12
- Lesson 8 notes by @Lankinen
- Lesson 8 notes by @wittmannf
- Lesson 8 notes by @gietema
- Lesson 8 notes by @Borz
- Lesson 8 notes by @timlee
Blog posts and tutorials
- Jake VanderPlas’ explanation of broadcasting
- Mathpix - turns images into LaTeX
- Tutorial by @jeffhale on "How to use `if __name__=='__main__'`"
- Khan Academy lesson on the chain rule
- Basic PyTorch Tensor Tutorial (Includes Jupyter Notebook)
- Xavier initialisation (why we divide by sqrt(M)) - a blog post that explains it nicely
- Xavier and Kaiming initialisation - a blog post that explains the two papers, and in particular the math, in detail (there’s also a short illustrative sketch after this list)
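On the “why divide by sqrt(M)” question, here’s a minimal sketch of the idea (not from the lesson notebooks; the layer sizes are arbitrary): scaling a linear layer’s weights by 1/sqrt(fan_in) keeps the standard deviation of the outputs close to that of the inputs, whereas unit-variance weights blow it up by roughly sqrt(fan_in).

```python
import torch

nin, nout = 512, 512
x = torch.randn(10_000, nin)                  # inputs with mean 0, std 1

w_naive  = torch.randn(nin, nout)             # std 1: output std grows by ~sqrt(nin) ≈ 22.6
w_xavier = torch.randn(nin, nout) / nin**0.5  # Xavier-style scaling: output std stays ~1

print((x @ w_naive).std(), (x @ w_xavier).std())
```

Kaiming initialization (section 2.2 of the Rectifiers paper, assigned below) adds an extra factor of sqrt(2) to compensate for ReLU zeroing out roughly half the activations.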
“Assigned” Homework
- Review the concepts from Course 1 (Lessons 1 - 7): Affine functions & non-linearities; Parameters & activations; Random initialization & transfer learning; SGD, Momentum, Adam; Convolutions; Batch-norm; Dropout; Data augmentation; Weight decay; Res/dense blocks; Image classification and regression; Embeddings; Continuous & categorical variables; Collaborative filtering; Language models; NLP classification; Segmentation; U-net; GANs
- Make sure you understand broadcasting (a short self-check sketch follows this list)
- Read section 2.2 in Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
- Try to replicate as much of the notebooks as you can without peeking; when you get stuck, peek at the lesson notebook, but then close it and try to do it yourself
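A quick self-check for the broadcasting item above (a minimal sketch; the tensors are arbitrary examples, not taken from the lesson notebooks):

```python
import torch

v = torch.tensor([10., 20., 30.])        # shape (3,)
m = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])          # shape (3, 3)

print(m + v)           # v is treated as a (1, 3) row and broadcast down the rows
print(m + v[:, None])  # v[:, None] has shape (3, 1) and is broadcast across the columns
```

If you can predict both outputs before running it, the broadcast-based matmul refactorings in the notebook should feel natural.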