Lesson resources
Papers
- mixup: Beyond Empirical Risk Minimization
- Rethinking the Inception Architecture for Computer Vision (label smoothing is in section 7; a sketch of both techniques follows this list)
- Bag of Tricks for Image Classification with Convolutional Neural Networks
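For quick reference, here is a minimal PyTorch sketch of the two techniques those papers introduce, mixup and label smoothing. The function names and the `alpha`/`eps` defaults below are illustrative choices, not the papers' exact code.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.4):
    # mixup: blend each example with another randomly chosen example,
    # using a mixing weight drawn from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample()
    idx = torch.randperm(x.size(0))
    # The loss gets blended the same way:
    # lam * ce(pred, y) + (1 - lam) * ce(pred, y[idx])
    return lam * x + (1 - lam) * x[idx], y, y[idx], lam

def label_smoothing_ce(pred, target, eps=0.1):
    # Label smoothing: put (1 - eps) of the target mass on the true class
    # and spread eps uniformly over all classes.
    log_probs = F.log_softmax(pred, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1).mean()
    return (1 - eps) * nll - eps * log_probs.mean()
```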
Could I get a refresher on what callbacks and hooks are?
This is the focus of the last lesson. The video is accessible in the general announcements thread.
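In one line, while you catch up: a hook is a function that PyTorch calls with a module's input and output during the forward (or backward) pass, and the lesson's callbacks apply the same idea to the training loop. A minimal sketch using PyTorch's `register_forward_hook` API:

```python
import torch
import torch.nn as nn

layer = nn.Linear(10, 10)

# PyTorch calls this with (module, input, output) on every forward pass;
# this is how the lesson records per-layer activation statistics.
handle = layer.register_forward_hook(
    lambda mod, inp, out: print(f"mean={out.mean():.3f} std={out.std():.3f}")
)

layer(torch.randn(32, 10))  # triggers the hook
handle.remove()             # always detach hooks when you're done
```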
How does LSUV work on test/val data? Do we still adjust the mean and std parameters on test data?
Is there a reason we use standard deviation instead of Mean Absolute Deviation?
So just to confirm, LSUV is something you run on all the layers once at the beginning, not during training? What if your batch size is small, could you overfit to that batch?
Is it possible to get a high-level overview of LSUV again?
I want to hear Jeremy pronounce Imagenette in a French accent!
It’s only used to initialize your network; then you train it as usual.
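To make that concrete, here is a minimal LSUV-style sketch (the names `lsuv_init`, `layers`, and `xb` are illustrative; the lesson's version uses its own hook class). It also answers the test/val question above: the statistics come from one training batch, once, before training starts; nothing is ever adjusted on test data.

```python
import torch

def lsuv_init(model, layers, xb, tol=1e-3, max_iters=10):
    # LSUV: run ONE training batch through the model and, layer by layer,
    # shift each layer's bias and rescale its weights until that layer's
    # output has ~zero mean and unit std. Runs once, before training.
    stats = {}
    def capture(mod, inp, out):
        stats['mean'], stats['std'] = out.mean().item(), out.std().item()

    for layer in layers:  # the conv/linear modules, in forward order
        handle = layer.register_forward_hook(capture)
        with torch.no_grad():
            for _ in range(max_iters):
                model(xb)  # forward pass fills `stats` via the hook
                if abs(stats['mean']) < tol and abs(stats['std'] - 1) < tol:
                    break
                layer.bias -= stats['mean']   # push the output mean toward 0
                layer.weight /= stats['std']  # push the output std toward 1
        handle.remove()
```

Because earlier layers are fixed before later ones, adjusting a layer doesn't invalidate the ones already done (that's the "layer-sequential" part). And since it's a one-off initialization rather than something fit during training, overfitting to the batch isn't a concern in the usual sense, though a larger batch gives more stable statistics.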
Wait for it
Even though it’s live, you can always go back in the video if you missed something.
But is it better than, say, Kaiming initialization?
Yes, it trains better, especially for deeper networks.
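Roughly, the practical difference: Kaiming sets the weights from a formula based only on the layer's shape and nonlinearity, while LSUV measures the actual activation statistics on a real batch and corrects them, so it adapts to whatever your architecture actually does.

```python
import torch.nn as nn

layer = nn.Linear(512, 512)
# Kaiming: purely formula-based, no data needed.
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')
```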
Interesting that the size of the images affects what works and what doesn't. What's the intuition behind that?
Does it mean that some of the operations don’t really “scale” with size?
Probably the resolution is too low to learn good filters for full-size images?
The network sees fewer (or more) details, so it can't get to the same results.
You got it.
Oh OK, so it's really the size of the objects, rather than the size of the images per se.
Ah, got it now. I thought the readjustment would be done after every epoch or every few iterations. Thanks!