Lesson 11 discussion and wiki

No, it’s the size of the images. You have more pixels and more detail in 512x512 images compared to 64x64.

LSUV doesn’t seem to be in fastai. Will it be?

Generalizing a bit: for the past few weeks I’ve often been confused as to whether the code Jeremy writes in the notebooks is functionality that will be integrated into the fastai library, or whether the functions and classes are meant to be written and used by the user interactively, on the fly.

3 Likes

The accuracies in your leaderboard seem lower than the single-epoch results from last week?

Jeremy is teaching you how to do things by hand, because you’ll always need a technique that isn’t in fastai :wink:
Some of what is in the notebooks will find its way into the library, but not necessarily all of it.

4 Likes

Last week was MNIST; Imagenette is a harder dataset.

1 Like

Can we hear Sylvain say Imagenette in his corny American accent?

9 Likes

You won’t be able to resolve the same types of features at lower resolutions. Lines will become jagged, curves will become steps, finer features will get blurred into coarser ones, etc.

1 Like

So realistically, we should just default to LSUV for initialization.

If an image is 512x512 but you have 30x30 objects, surely it’s not a massive difference from 30x30 objects in a 60x60 image?

I realise I’m talking about tasks that detect objects, rather than the image classification we’re discussing now.

In LSUV: when you divide something by its std, its new std is by definition 1. Why do we need the loop that iterates to a certain tolerance?

1 Like

Certain patterns aren’t clearly visible in smaller images like 32x32 CIFAR-10.

A classifier might have a harder time finding a “fur” feature in such a small image. Instead the network will likely come to depend on other features.

Also consider that:
512x512 = 262,144 pixels
60x60 = 3,600 pixels

So you’re operating on an image with roughly 73 times fewer pixels.

Edit: That said, if you have the same size of object in both datasets, it would be identical. In ImageNet/CIFAR-10/MNIST, the object we’re trying to classify usually takes up most of the image.

2 Likes

Is there a performance hit from doing set(list(…))? Does it go through each item twice?
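A quick way to check (a minimal sketch; the extra list() call does mean one extra pass and a temporary copy, though both are O(n)):

```python
import timeit

data = list(range(100_000))  # stand-in for whatever iterable you have

# set(list(data)) first copies everything into a new list, then builds the set
t1 = timeit.timeit(lambda: set(list(data)), number=100)

# set(data) builds the set directly from the iterable in a single pass
t2 = timeit.timeit(lambda: set(data), number=100)

print(f'set(list(...)): {t1:.3f}s   set(...): {t2:.3f}s')
```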

1 Like

To add to that, I seem to recall some recent research showing that deep classifiers rely a lot on texture to classify things. Maybe someone remembers what it was and can share a link.

Yes, OK, agreed. I was talking about the size of the objects rather than the images - sorry if that wasn’t clear. You’re using an image classification example, so objects are obviously smaller when the image is smaller :slight_smile:

8 Likes

Eh, now that you mention it, it’s very likely each loop body is only executed once. Try printing a counter. I think the paper iterates because it goes through all the layers on each pass, so the results can interfere with each other.

Edit: I checked, and indeed everything is executed only once.
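For concreteness, here’s a minimal LSUV-style sketch (my own rough version, not fastai’s implementation, assuming an nn.Sequential model); the printed counter shows how many rescalings each layer actually needs:

```python
import torch
import torch.nn as nn

def lsuv_init(model, xb, tol=1e-3, max_iters=10):
    # Rescale each layer's weights until its output std on a real batch is ~1.
    captured = {}

    def hook(module, inp, out):
        captured['out'] = out

    for layer in model:
        if not isinstance(layer, (nn.Linear, nn.Conv2d)):
            continue
        h = layer.register_forward_hook(hook)
        for it in range(max_iters):
            model(xb)                           # full forward pass on real data
            std = captured['out'].std().item()
            if abs(std - 1.0) < tol:
                break
            with torch.no_grad():
                layer.weight.div_(std)          # rescale; after this, std is ~1
        print(f'{type(layer).__name__}: {it} rescaling(s), final std={std:.3f}')
        h.remove()
```

Note the hook here measures each layer’s raw output; hooking the following ReLU instead would normalize the post-nonlinearity activations.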

4 Likes

Is there a reason to prefer scandir over glob? Oh, he’s answering this right now. Still curious whether glob is as fast.
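A rough way to time both yourself (a minimal sketch; IIRC glob is itself built on top of os.scandir in recent CPython, so for a flat listing they should be close):

```python
import glob
import os
import timeit

path = '.'  # hypothetical: point this at a big image directory

# glob builds a full list of matching path strings
t_glob = timeit.timeit(lambda: glob.glob(os.path.join(path, '*')), number=100)

# os.scandir lazily yields DirEntry objects with cached type/stat info
t_scan = timeit.timeit(lambda: [e.path for e in os.scandir(path)], number=100)

print(f'glob: {t_glob:.4f}s   scandir: {t_scan:.4f}s')
```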

3 Likes

This is a bit off topic, and IIRC it doesn’t mention glob, but I recall a blog post by the main author of os.scandir detailing his motivation and experience writing it: https://benhoyt.com/writings/scandir/

6 Likes

What descriptive stats do you need to collect on the set of images before proceeding to ConvNets and DL? Do you have sample code for gathering these stats (image dimensions, channels, other things)?
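For what it’s worth, a minimal sketch for gathering such stats with PIL (the folder path, pattern, and helper name are placeholders):

```python
from collections import Counter
from pathlib import Path

from PIL import Image

def image_stats(folder, pattern='*.jpg'):
    # Tally image sizes and channel layouts; PIL reads these from the header
    # without fully decoding each file.
    sizes, modes = Counter(), Counter()
    for p in Path(folder).rglob(pattern):
        with Image.open(p) as im:
            sizes[im.size] += 1    # (width, height)
            modes[im.mode] += 1    # e.g. 'RGB', 'L' (grayscale), 'CMYK'
    return sizes, modes

sizes, modes = image_stats('path/to/images')  # hypothetical path
print(sizes.most_common(5))
print(modes)
```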

Another LSUV question:
aren’t we interested in std ≈ 1 for the activations after the nonlinearity? Which activations are we normalizing here?

1 Like