Lesson 11 discussion and wiki

No, it’s the size of the images. You have more pixels and more detail in 512x512 images compared to 64x64.

LSUV doesn’t seem to be in fastai. Will it be?

Generalizing a bit: for the past few weeks I’ve often been confused as to whether the code Jeremy writes in the notebooks is functionality that will be integrated into the fastai library, or whether the functions and classes are meant to be written and used by the user interactively, on the fly.

3 Likes

The accuracies in your leaderboard seem lower than the single-epoch results from last week?

Jeremy is teaching you how to do things by hand, because you’ll always need a technique that isn’t in fastai :wink:
Some of what is in the notebooks will find its way into the library, but not necessarily all of it.

4 Likes

Last week was MNIST; Imagenette is a harder dataset.

1 Like

Can we hear Sylvain say Imagenette in his corny American accent?

9 Likes

You won’t be able to resolve the same types of features at lower resolutions. Lines will become jagged, curves will become steps, finer features will get blurred into coarser ones, etc.

1 Like

So realistically, we should just default to LSUV for initialization.

If an image is 512x512 but you have 30x30 objects, surely it’s not a massive difference from 30x30 objects in a 60x60 image?

I realise I’m talking about tasks that detect objects, rather than the image classification we’re discussing now.

In LSUV: when you divide something by its std, its new std is by definition 1. Why do we need the loop that iterates to a certain tolerance?

1 Like

Certain patterns aren’t clearly visible in smaller images like 32x32 CIFAR-10.

A classifier might have a harder time finding a “fur” feature in such a small image. Instead the network will likely come to depend on other features.

Also consider that:
512x512 = 262,144 pixels
60x60 = 3,600 pixels

So you’re operating on an image with roughly 73 times fewer pixels.

Edit: That said, if you have the same size of object in both datasets, it would be identical. In ImageNet/CIFAR-10/MNIST, the object we’re trying to classify usually takes up most of the image.

2 Likes

Is there a performance hit from doing set(list(…))? Does it go through each item twice?
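A quick way to check (a minimal sketch; the extra list() call does mean one extra pass and a temporary copy, though both are O(n)):

```python
import timeit

data = list(range(100_000))  # stand-in for whatever iterable you have

# set(list(data)) first copies everything into a new list, then builds the set
t1 = timeit.timeit(lambda: set(list(data)), number=100)

# set(data) builds the set directly from the iterable in a single pass
t2 = timeit.timeit(lambda: set(data), number=100)

print(f'set(list(...)): {t1:.3f}s   set(...): {t2:.3f}s')
```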

1 Like

To add to that, I seem to recall some recent research showing that deep classifiers rely a lot on texture to classify things. Maybe someone remembers what it was and can share a link.

Yes, OK, agreed. I was talking about the size of the objects rather than the images - sorry if that wasn’t clear. You’re using an image classification example, so objects are obviously smaller when the image is smaller :slight_smile:

8 Likes

Eh, now that you mention it, it’s very likely each loop body is only executed once. Try printing a counter. I think the paper iterates because it goes through all the layers on each pass, so the results can interfere with each other.

Edit: I checked, and indeed everything is executed only once.
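For concreteness, here’s a minimal LSUV-style sketch (my own rough version, not fastai’s implementation, assuming an nn.Sequential model); the printed counter shows how many rescalings each layer actually needs:

```python
import torch
import torch.nn as nn

def lsuv_init(model, xb, tol=1e-3, max_iters=10):
    # Rescale each layer's weights until its output std on a real batch is ~1.
    captured = {}

    def hook(module, inp, out):
        captured['out'] = out

    for layer in model:
        if not isinstance(layer, (nn.Linear, nn.Conv2d)):
            continue
        h = layer.register_forward_hook(hook)
        for it in range(max_iters):
            model(xb)                           # full forward pass on real data
            std = captured['out'].std().item()
            if abs(std - 1.0) < tol:
                break
            with torch.no_grad():
                layer.weight.div_(std)          # rescale; after this, std is ~1
        print(f'{type(layer).__name__}: {it} rescaling(s), final std={std:.3f}')
        h.remove()
```

Note the hook here measures each layer’s raw output; hooking the following ReLU instead would normalize the post-nonlinearity activations.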

4 Likes

Is there a reason to prefer scandir over glob? Oh, he’s answering this right now. Still curious whether glob is as fast.
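A rough way to time both yourself (a minimal sketch; IIRC glob is itself built on top of os.scandir in recent CPython, so for a flat listing they should be close):

```python
import glob
import os
import timeit

path = '.'  # hypothetical: point this at a big image directory

# glob builds a full list of matching path strings
t_glob = timeit.timeit(lambda: glob.glob(os.path.join(path, '*')), number=100)

# os.scandir lazily yields DirEntry objects with cached type/stat info
t_scan = timeit.timeit(lambda: [e.path for e in os.scandir(path)], number=100)

print(f'glob: {t_glob:.4f}s   scandir: {t_scan:.4f}s')
```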

3 Likes

This is a bit off topic, and IIRC it doesn’t mention glob, but I recall a blog post by the main author of os.scandir detailing his motivation and experience writing it: https://benhoyt.com/writings/scandir/

6 Likes

What descriptive stats do you need to collect on the set of images before proceeding to ConvNets and DL? Do you have sample code for gathering these stats (image dimensions, channels, other things)?
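For what it’s worth, a minimal sketch for gathering such stats with PIL (the folder path, pattern, and helper name are placeholders):

```python
from collections import Counter
from pathlib import Path

from PIL import Image

def image_stats(folder, pattern='*.jpg'):
    # Tally image sizes and channel layouts; PIL reads these from the header
    # without fully decoding each file.
    sizes, modes = Counter(), Counter()
    for p in Path(folder).rglob(pattern):
        with Image.open(p) as im:
            sizes[im.size] += 1    # (width, height)
            modes[im.mode] += 1    # e.g. 'RGB', 'L' (grayscale), 'CMYK'
    return sizes, modes

sizes, modes = image_stats('path/to/images')  # hypothetical path
print(sizes.most_common(5))
print(modes)
```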

Another LSUV question:
aren’t we interested in std ≈ 1 for the activations after the nonlinearity? Which activations are we normalizing here?

1 Like