Lesson resources
Papers
- mixup: Beyond Empirical Risk Minimization
- Rethinking the Inception Architecture for Computer Vision (label smoothing is in section 7; a sketch of both techniques follows this list)
- Bag of Tricks for Image Classification with Convolutional Neural Networks
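For quick reference, here is a minimal PyTorch sketch of the two techniques those papers introduce, mixup and label smoothing. The function names and the `alpha`/`eps` defaults below are illustrative choices, not the papers' exact code.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.4):
    # mixup: blend each example with another randomly chosen example,
    # using a mixing weight drawn from Beta(alpha, alpha).
    lam = torch.distributions.Beta(alpha, alpha).sample()
    idx = torch.randperm(x.size(0))
    # The loss gets blended the same way:
    # lam * ce(pred, y) + (1 - lam) * ce(pred, y[idx])
    return lam * x + (1 - lam) * x[idx], y, y[idx], lam

def label_smoothing_ce(pred, target, eps=0.1):
    # Label smoothing: put (1 - eps) of the target mass on the true class
    # and spread eps uniformly over all classes.
    log_probs = F.log_softmax(pred, dim=-1)
    nll = -log_probs.gather(-1, target.unsqueeze(-1)).squeeze(-1).mean()
    return (1 - eps) * nll - eps * log_probs.mean()
```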
Could I get a refresher on what callbacks and hooks are?
This is the focus of the last lesson. The video is accessible in the general announcements thread.
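In one line, while you catch up: a hook is a function that PyTorch calls with a module's input and output during the forward (or backward) pass, and the lesson's callbacks apply the same idea to the training loop. A minimal sketch using PyTorch's `register_forward_hook` API:

```python
import torch
import torch.nn as nn

layer = nn.Linear(10, 10)

# PyTorch calls this with (module, input, output) on every forward pass;
# this is how the lesson records per-layer activation statistics.
handle = layer.register_forward_hook(
    lambda mod, inp, out: print(f"mean={out.mean():.3f} std={out.std():.3f}")
)

layer(torch.randn(32, 10))  # triggers the hook
handle.remove()             # always detach hooks when you're done
```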
How does LSUV work on test/val data? Do we still adjust the mean and std parameters on test data?
Is there a reason we use standard deviation instead of Mean Absolute Deviation?
So just to confirm, LSUV is something you run on all the layers once at the beginning, not during training? What if your batch size is small, could you overfit to that batch?
Is it possible to get a high-level overview of LSUV again?
I want to hear Jeremy pronounce Imagenette in a French accent!
It’s only used to initialize your network; then you train it as usual.
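To make that concrete, here is a minimal LSUV-style sketch (the names `lsuv_init`, `layers`, and `xb` are illustrative; the lesson's version uses its own hook class). It also answers the test/val question above: the statistics come from one training batch, once, before training starts; nothing is ever adjusted on test data.

```python
import torch

def lsuv_init(model, layers, xb, tol=1e-3, max_iters=10):
    # LSUV: run ONE training batch through the model and, layer by layer,
    # shift each layer's bias and rescale its weights until that layer's
    # output has ~zero mean and unit std. Runs once, before training.
    stats = {}
    def capture(mod, inp, out):
        stats['mean'], stats['std'] = out.mean().item(), out.std().item()

    for layer in layers:  # the conv/linear modules, in forward order
        handle = layer.register_forward_hook(capture)
        with torch.no_grad():
            for _ in range(max_iters):
                model(xb)  # forward pass fills `stats` via the hook
                if abs(stats['mean']) < tol and abs(stats['std'] - 1) < tol:
                    break
                layer.bias -= stats['mean']   # push the output mean toward 0
                layer.weight /= stats['std']  # push the output std toward 1
        handle.remove()
```

Because earlier layers are fixed before later ones, adjusting a layer doesn't invalidate the ones already done (that's the "layer-sequential" part). And since it's a one-off initialization rather than something fit during training, overfitting to the batch isn't a concern in the usual sense, though a larger batch gives more stable statistics.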
Wait for it
Even though it’s live, you can always go back in the video if you missed something.
But is it better than, say, Kaiming initialization?
Yes, it trains better, especially for deeper networks.
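Roughly, the practical difference: Kaiming sets the weights from a formula based only on the layer's shape and nonlinearity, while LSUV measures the actual activation statistics on a real batch and corrects them, so it adapts to whatever your architecture actually does.

```python
import torch.nn as nn

layer = nn.Linear(512, 512)
# Kaiming: purely formula-based, no data needed.
nn.init.kaiming_normal_(layer.weight, nonlinearity='relu')
```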
Interesting that the size of the images affects what works and what doesn't. What's the intuition behind that?
Does it mean that some of the operations don’t really “scale” with size?
Probably the resolution is too low to learn good filters for full-size images?
The network sees fewer (or more) details, so it can't get to the same results.
You got it.
Oh OK, so it's really the size of the objects, rather than the size of the images per se.
Ah, got it now. I thought the readjustment would be done after every epoch or every few iterations. Thanks!