Lesson 1 - Non-beginner discussion

Of course! Also for anyone else interested (these are the settings I use for Walk with fastai2 as well!):

  • Base (Canvas) Resolution: 1920x1080
  • Output (Scaled) Resolution: 1920x1080
  • Downscale Filter: Bicubic (Sharpened scaling, 16 samples)
  • Common FPS Values: 30
2 Likes

Thank you! :slight_smile:

This is a great question, I am wondering “why” too :slight_smile:

I’ll run some experiments and check empirically.

2 Likes

While this isn’t L1-related directly per se, I’m going to be discussing some multi-label ideas that are applicable to Lesson 1, such as how to tell your model “I don’t know” in image classification. This will be done in the Zoom chat in ~10 minutes or so.

I wanted to post here as I don’t think there’s quite a forum where such an idea would fit well.
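
For anyone who can’t make the chat, here’s a minimal sketch of the multi-label idea (not code from the chat; `path` and the labelling function are placeholders):

```python
from fastai.vision.all import *

# With MultiCategoryBlock each class gets an independent sigmoid, so an
# image where no class clears the threshold is effectively the model
# saying "I don't know".
dblock = DataBlock(
    blocks=(ImageBlock, MultiCategoryBlock),
    get_items=get_image_files,
    get_y=lambda o: [parent_label(o)],  # wrap the single label in a list
    item_tfms=Resize(224))
dls = dblock.dataloaders(path)  # `path` assumed to point at your images
learn = cnn_learner(dls, resnet34,
                    metrics=partial(accuracy_multi, thresh=0.5))
```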

1 Like

I am coming! See you there :slight_smile:

1 Like

Just to add to what @muellerzr said:
"
You need to pass to this transform the mean and standard deviation that you want to use; fastai comes with the standard ImageNet mean and standard deviation already defined. (If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
From the book: https://github.com/fastai/fastbook/blob/master/07_sizing_and_tta.ipynb (Just under Normalization).
I have a question about this part:
"(If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
Why are we doing it on a single batch and not the whole data?
is this because we are doing it on the fly?
Is in’t the imagenet stats calculated on the whole data? or is that on a single batch too?

2 Likes

I think the reasoning here is that Jeremy and Sylvain looked into this (as is often done in the library, which is built on finding best practices), and they found that one batch was representative enough for image stats. The ImageNet stats are over the entire ImageNet dataset, yes. (One of them can chime in too if my assumption above is wrong :slight_smile: )

1 Like

Unless you use a very small batch size, one batch is enough to get an estimate of good stats. This assumes you are prototyping/experimenting on a small dataset. If you're working on a large one, it's on you to compute your stats properly and pass the right arguments to Normalize.
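
As a concrete sketch of the two options (using PETS purely as an illustrative dataset; the thread doesn't name one):

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/"images"
fnames = get_image_files(path)

# No stats passed: fastai computes mean/std from one batch at setup time.
dls_auto = ImageDataLoaders.from_name_re(
    path, fnames, pat=r"(.+)_\d+.jpg$",
    item_tfms=Resize(224), batch_tfms=Normalize())

# Stats passed explicitly (here the ImageNet ones, matching a pretrained model).
dls_inet = ImageDataLoaders.from_name_re(
    path, fnames, pat=r"(.+)_\d+.jpg$",
    item_tfms=Resize(224), batch_tfms=Normalize.from_stats(*imagenet_stats))
```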

3 Likes

What would be a good guideline for what counts as a "small" versus "large" dataset in this context? Hundreds of images vs. thousands? (Or 1,000s vs. 1M?)

I guess it depends on your number of classes more than the dataset size (if your classes are expected to be very different). For instance, with a batch of size 128 on ImageNet, you will miss most of the 1,000 classes, which probably means your stats will be bad. With something like Imagenette (10 classes), even if it were the same size as ImageNet, a single batch is likely to be close enough to the real stats that you don't care.
But that's just me ranting late on a Friday night, so don't trust what I say.

2 Likes

Dutifully noted :slight_smile: So if I'm understanding correctly: compare the context of your dataset vs. that of ImageNet (which we should do anyway when considering transfer learning). If the classes are radically different (such as steel parts or clouds), calculate your own dataset stats after an initial test to ensure your ideas work. However, if they're not too different (say, PETS), then one batch (of say 64) could be enough, because there are 37 classes and they're similar to what shows up in ImageNet. Am I understanding this properly?

ImageNet was just an example. If using a pretrained model, always use the stats from the pretraining, because that's what the model expects.

2 Likes

Ah, yes, my mistake. I'll blame the fever for that one :wink: If you can manage to represent most of the data and all of the classes in a single batch (be it 64 or 128), then don't worry about it. If you cannot in your first batch, then calculate. I believe that's more in line with what you're describing.

1 Like

Ah, I see, that makes sense. Thanks @sgugger @muellerzr!
Calculating the stats for each dataset and passing them in may be the approach I'll take. :slight_smile:
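
For reference, a rough sketch of doing that calculation yourself; this helper is hypothetical (not a fastai function), and the DataLoaders should be built without a Normalize transform first:

```python
from fastai.vision.all import *

def channel_stats(dls):
    "Hypothetical helper: channel-wise mean/std over the whole training set."
    total, total_sq, count = 0., 0., 0.
    for xb, _ in dls.train:
        count += xb.numel() / xb.shape[1]           # pixels per channel
        total += xb.sum(dim=[0, 2, 3])
        total_sq += (xb ** 2).sum(dim=[0, 2, 3])
    mean = total / count
    std = (total_sq / count - mean ** 2).sqrt()     # var = E[x^2] - E[x]^2
    return mean, std

# mean, std = channel_stats(dls)
# ...then pass Normalize.from_stats(mean, std) in your batch_tfms.
```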

So if we want to use a pretrained model that wasn't trained on ImageNet, we have to use the stats of the specific dataset used to train that initial model. Is there a way to know which dataset was used for each of the pretrained models available in the fastai library?

Also, is there a way to see all the different dataset stats available? I wanted to know this for other cases in general as well; for instance, can we see all the different image augmentations available for use in fastai (like Resize, Rotate, Flip, RandomResizedCrop, etc.) as a list in the documentation? That would be really helpful for hacking away and seeing what's what while learning…

See dev.fast.ai/vision.augment.html

Generally the pretrained models come from torchvision, with the exception of xresnet50. You'd want to look at the torchvision documentation for those.

For proof, we can see this inside vision/models/__init__.py
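
If you want to poke at it yourself, here's one quick way to list what that module re-exports (assuming a fastai v2 install):

```python
from fastai.vision import models

# torchvision architectures plus fastai's own xresnet variants:
print([m for m in dir(models) if not m.startswith("_")])
```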

Also to add to this a bit more @harish3110

fastai has 3 sets of stats built in: cifar_stats, imagenet_stats, and mnist_stats.
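
A quick way to inspect them (each is a (mean, std) pair of per-channel values; the ImageNet values in the comment are the standard ones):

```python
from fastai.vision.all import cifar_stats, imagenet_stats, mnist_stats

print(imagenet_stats)  # ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(cifar_stats)
print(mnist_stats)
```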

Also x2: it looks like all the torchvision pretrained models are trained on ImageNet:

https://pytorch.org/docs/stable/torchvision/models.html

Great question :slight_smile:

1 Like

Ahh! Thanks!! I'm actually more interested in and curious about the way you backtraced into that file and found them all! It gets way too hazy when I try that! :sweat_smile:

1 Like

So as a rule of thumb: other than unet and xresnet, everything can be used with ImageNet stats to begin with! :slight_smile:

1 Like

Sure, I can describe that! First: no, I don't have the locations of everything in the library memorized (just in case you thought that).

In general my steps are like so:

  1. Well, what application am I in? Is it vision-related specifically? Then look in the vision folder. Does it apply to all data (such as untar_data)? Then start in the data folder.
  2. From there, I'll read the top of the file and scan it quickly to see if it's mentioned. If not, I go look in the other files in that folder. The key is to think about this question: "If I were to put this in a module where the naming and ideas make sense relative to the library structure, where would I put me?"

This of course is only what I do when I'm on mobile; if I'm on my computer, I simply use '??'.
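
For anyone new to that trick, in a Jupyter/IPython session it looks like this:

```python
from fastai.vision.all import *

# `??` is IPython syntax: it shows an object's full source plus the
# file it is defined in -- which is exactly the backtracing above.
Normalize??
```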

Remember, a unet has an encoder, and that encoder is a ResNet, so you'd still want ImageNet stats here :wink:
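
In other words (a sketch; `dls` is assumed to be a segmentation DataLoaders, which the thread doesn't define):

```python
from fastai.vision.all import *

# unet_learner builds a U-Net whose encoder is an ImageNet-pretrained
# ResNet, so imagenet_stats are still the stats to normalize with.
learn = unet_learner(dls, resnet34, pretrained=True)
```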

1 Like