No, it’s the size of the images. You have more pixels and more detail in a 512x512 image than in a 64x64 one.
LSUV doesn’t seem to be in fastai. Will it be?
Generalizing a bit: for the past few weeks I’ve often been confused as to whether the code Jeremy is writing in the notebooks is functionality that will be integrated into the fastai library, or whether the functions and classes are meant to be written and used by the user interactively, on the fly.
The accuracies on your leaderboard seem lower than the single-epoch results from last week?
Jeremy is teaching you how to do things by hand, because you’ll always need a technique that isn’t in fastai
Some of what is in the notebooks will find its way into the library, but not necessarily all of it.
Last week was MNIST; this is a harder dataset.
Can we hear Sylvain say imagenette in his corny American accent?
You won’t be able to resolve the same types of features at lower resolutions. Lines will become jagged, curves will become steps, finer features get blurred into coarser ones…etc…
So realistically, we should just default to LSUV for initialization.
If an image is 512x512 but you have 30x30 objects, surely it’s not a massive difference from 30x30 objects in a 60x60 image?
I realise I’m talking about tasks that involve detecting objects, rather than the image classification we’re discussing now.
In LSUV: when you divide something by its std, its new std is by definition 1. Why do we need the loop iterating to a certain tolerance?
Certain patterns aren’t clearly visible in smaller images like the 32x32 Cifar-10:
A classifier might have a harder time finding a “fur” feature in that image because of how small it is. Instead the network will likely come to depend on other features.
Also consider that:
512x512 = 262,144
60x60 = 3,600
So you’re operating on an image with 72 times less information.
Edit: That said, if you have the same size of object in both datasets it would be identical. In ImageNet/Cifar10/MNIST, the object we’re trying to classify usually takes up most of the image.
Is there a performance hit from doing set(list(…))? Does it go through each item twice?
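A quick sketch to check this: set(list(xs)) first materializes an intermediate list and then builds the set from it, so the data is walked twice and stored twice, whereas set() accepts any iterable directly. The results are identical either way; the timing numbers below will vary by machine.

```python
import timeit

xs = range(100_000)

# Same result either way:
print(set(list(xs)) == set(xs))

# Rough timing: set(list(xs)) pays for the extra list pass
t_list = timeit.timeit(lambda: set(list(xs)), number=100)
t_set = timeit.timeit(lambda: set(xs), number=100)
print(f"set(list(xs)): {t_list:.3f}s, set(xs): {t_set:.3f}s")
```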
To add to that, I seem to recall that some research recently showed that deep classifiers are relying a lot on texture to classify things. Maybe someone can remember what this was and share a link.
Yes ok, agreed. I was talking about size of the objects, rather than the images - sorry if it wasn’t clear. You are taking an image classification example and so objects are obviously smaller when the image is smaller
Eh, now that you mention it, it’s very likely each line is only executed once. Try printing a counter. I think the loop is in the paper because you go through all the layers each time, so the results can interfere with each other.
Edit: I did check and everything is executed once only indeed.
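To make the discussion concrete, here is a minimal NumPy sketch of an LSUV-style init (not the fastai or the paper’s implementation): each layer’s weights are rescaled until the std of its output activations is within a tolerance of 1. With a ReLU the inner loop converges in a single step (ReLU is positively homogeneous, so dividing the weights by s divides the post-ReLU std by s exactly), which matches the observation above; the loop still matters when rescaling one layer shifts the statistics another layer sees.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(512, 100))  # a batch of dummy inputs
weights = [rng.normal(size=(100, 100)) for _ in range(4)]

def lsuv_init(x, weights, tol=1e-3, max_iters=10):
    for w in weights:  # go layer by layer, first to last
        for _ in range(max_iters):
            out = np.maximum(x @ w, 0)  # forward pass through linear + ReLU
            s = out.std()
            if abs(s - 1) < tol:
                break
            w /= s  # rescale the weights in place
        x = np.maximum(x @ w, 0)  # feed the normalized output forward
    return weights

lsuv_init(x, weights)
```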
Is there a reason not to use glob vs. scandir? Oh. He’s answering this right now. Still curious whether glob is as fast.
This is a bit off topic, and IIRC it doesn’t mention glob, but I recall a blog post by the main author of os.scandir detailing his motivation and experience of writing it: https://benhoyt.com/writings/scandir/
What descriptive stats do you need to collect on the set of images before proceeding to ConvNets and DL? Do you have sample code for gathering these stats (image dimensions, channels, other things?)?
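I don’t have a canonical list, but here is a minimal sketch of the kind of stats that are usually worth collecting: the distribution of image sizes, channel counts, and per-channel mean/std. Random arrays stand in below for images you would actually decode with e.g. PIL.Image.open; the shapes and filenames are made up.

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)
# Stand-ins for decoded images: (height, width, channels) uint8 arrays
images = [rng.integers(0, 256, size=(h, w, 3), dtype=np.uint8)
          for h, w in [(64, 64), (64, 64), (128, 96)]]

size_counts = Counter(im.shape[:2] for im in images)    # resolutions
channel_counts = Counter(im.shape[2] for im in images)  # e.g. all RGB?

# Per-channel mean/std over all pixels, scaled to [0, 1]
pixels = np.concatenate([im.reshape(-1, 3) for im in images]) / 255.0
print(size_counts)
print(channel_counts)
print(pixels.mean(axis=0), pixels.std(axis=0))
```

The per-channel mean/std are what you would later use to normalize the inputs.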
another LSUV question:
aren’t we interested in std ≈ 1 for the activations after the nonlinearity? Which activations are we normalizing here?