Is there a reason we use standard deviation instead of Mean Absolute Deviation?
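The question contrasts two measures of spread. As a quick illustrative sketch (not from the lesson itself): squaring the deviations makes the standard deviation much more sensitive to outliers than the mean absolute deviation.

```python
import numpy as np

x = np.array([1., 2., 3., 4., 100.])  # one large outlier

mean = x.mean()
std = np.sqrt(((x - mean) ** 2).mean())  # standard deviation: squares the deviations
mad = np.abs(x - mean).mean()            # mean absolute deviation: no squaring

# Squaring amplifies the outlier's contribution, so std comes out
# well above mad on this data (roughly 39 vs 31).
print(std, mad)
```

The std does have nicer mathematical properties (it's smooth, and variances of independent variables add), which is one common reason it's the default.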
So just to confirm, LSUV is something you run on all the layers once at the beginning, not during training? What if your batch size is small, could you overfit to that batch?
Is it possible to give a high-level overview of LSUV again?

I want to hear Jeremy pronounce Imagenette in a French accent!
It’s only the initialization of your network. Then, you train it.
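As a high-level sketch of the idea: LSUV runs one batch through the network before training, and for each layer rescales the weights until that layer's activations have unit standard deviation. A minimal toy version with NumPy (illustrative only, not the fastai or original-paper code; the layer sizes and tolerance are made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy fully connected net: Kaiming-style random init, ReLU activations.
sizes = [100, 100, 100, 100]
weights = [rng.normal(0, np.sqrt(2 / n_in), (n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward_to(x, layer_idx):
    """Forward pass through layers 0..layer_idx inclusive."""
    for w in weights[:layer_idx + 1]:
        x = np.maximum(x @ w, 0)  # ReLU
    return x

x = rng.normal(size=(512, sizes[0]))  # one batch, used only for the init

# LSUV: layer by layer, divide the weights by the measured activation
# std until the activations on this batch have std ~= 1.
for i in range(len(weights)):
    for _ in range(10):
        std = forward_to(x, i).std()
        if abs(std - 1) < 1e-3:
            break
        weights[i] /= std
```

After this loop every layer's activations start out with unit variance on that batch; then normal training proceeds with the rescaled weights. (The fastai variant also shifts the bias toward zero-mean activations.)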
Wait for it
Even though it’s live, you can always go back in time on the video if you missed something.
But is it better than, say, Kaiming?
Yes, it trains better, especially for a deeper network.
Interesting that the size of the images affects what works and what doesn’t. What’s the intuition behind that?
Does it mean that some of the operations don’t really “scale” with size?
Probably the resolution is too low to learn good filters for full-size images?
The network sees less (or more) details, so it can’t get to the same results.
you got it
Oh ok - so it’s really the size of the objects, rather than the size of the images per se
Ah! Got it now… I thought the readjustment would be done after every epoch or every few iterations… thanks!
No, it’s the size of the images. You have more pixels and more details in 512x512 images compared to 64x64.
LSUV doesn’t seem to be in fastai. Will it be?
Generalizing a bit: for the past few weeks I’ve often been confused about whether the code Jeremy writes in the notebooks is functionality that will be integrated into the fastai library, or whether the functions and classes are meant to be written and used by the user interactively, on the fly.
The accuracies on your leaderboard seem lower than the single-epoch results from last week?
Jeremy is teaching you how to do things by hand, because you’ll always need some technique that isn’t in fastai.
Some of what is in the notebooks will find its way into the library, but not necessarily all of it.