Lesson 1 - Non-beginner discussion

Of course! Also for anyone else interested (these are the settings I use for Walk with fastai2 as well!):

  • Base (Canvas) Resolution: 1920x1080
  • Output (Scaled) Resolution: 1920x1080
  • Downscale Filter: Bicubic (Sharpened scaling, 16 samples)
  • Common FPS Values: 30
2 Likes

Thank you! :slight_smile:

This is a great question, I am wondering “why” too :slight_smile:

I’ll run some experiments and check empirically.

2 Likes

While this isn’t L1-related directly per se, I’m going to be discussing some multi-label ideas that are applicable to Lesson 1, such as how to tell your model “I don’t know” in image classification. This will be done in the Zoom chat in ~10 minutes or so.

I wanted to post here as I don’t think there’s quite a forum where such an idea would fit well.
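
For anyone who can’t make the chat, here’s a minimal sketch of the multi-label idea (not code from the chat; `path` and the labelling function are placeholders):

```python
from fastai.vision.all import *

# With MultiCategoryBlock each class gets an independent sigmoid, so an
# image where no class clears the threshold is effectively the model
# saying "I don't know".
dblock = DataBlock(
    blocks=(ImageBlock, MultiCategoryBlock),
    get_items=get_image_files,
    get_y=lambda o: [parent_label(o)],  # wrap the single label in a list
    item_tfms=Resize(224))
dls = dblock.dataloaders(path)  # `path` assumed to point at your images
learn = cnn_learner(dls, resnet34,
                    metrics=partial(accuracy_multi, thresh=0.5))
```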

1 Like

I am coming! See you there :slight_smile:

1 Like

Just to add to what @muellerzr said:
"
You need to pass to this transform the mean and standard deviation that you want to use; fastai comes with the standard ImageNet mean and standard deviation already defined. (If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
From the book: https://github.com/fastai/fastbook/blob/master/07_sizing_and_tta.ipynb (Just under Normalization).
I have a question about this part:
"(If you do not pass any statistics to the Normalize transform, fastai will automatically calculate them from a single batch of your data.) "
Why are we doing it on a single batch and not the whole data?
is this because we are doing it on the fly?
Is in’t the imagenet stats calculated on the whole data? or is that on a single batch too?

2 Likes

I think the reasoning here is that Jeremy and Sylvain looked into this (as is often done in the library, which is built on finding best practices), and they found that one batch was representative enough for image stats. The ImageNet stats are over the entire ImageNet dataset, yes. (One of them can chime in too if my assumption above is wrong :slight_smile: )

1 Like

Unless you use a very small batch size, one batch is enough to get an estimate of good stats. This assumes you are prototyping/experimenting on a small dataset. If you're working on a large one, it's on you to compute your stats properly and pass the right arguments to Normalize.
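
As a concrete sketch of the two options (using PETS purely as an illustrative dataset; the thread doesn't name one):

```python
from fastai.vision.all import *

path = untar_data(URLs.PETS)/"images"
fnames = get_image_files(path)

# No stats passed: fastai computes mean/std from one batch at setup time.
dls_auto = ImageDataLoaders.from_name_re(
    path, fnames, pat=r"(.+)_\d+.jpg$",
    item_tfms=Resize(224), batch_tfms=Normalize())

# Stats passed explicitly (here the ImageNet ones, matching a pretrained model).
dls_inet = ImageDataLoaders.from_name_re(
    path, fnames, pat=r"(.+)_\d+.jpg$",
    item_tfms=Resize(224), batch_tfms=Normalize.from_stats(*imagenet_stats))
```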

3 Likes

What would be a good guideline for what counts as a "small" versus "large" dataset in this context? Hundreds of images vs. thousands? (Or 1,000s vs. 1M?)

I guess it depends on your number of classes more than the dataset size (if your classes are expected to be very different). For instance, with a batch of size 128 on ImageNet, you will miss most of the 1,000 classes, which probably means your stats will be bad. With something like Imagenette (10 classes), even if it were the same size as ImageNet, a single batch is likely to be close enough to the real stats that you don't care.
But that's just me ranting late on a Friday night, so don't trust what I say.

2 Likes

Dutifully noted :slight_smile: So if I'm understanding correctly: compare the context of your dataset vs. that of ImageNet (which we should do anyway when considering transfer learning). If the classes are radically different (such as steel parts or clouds), calculate your own dataset stats after an initial test to ensure your ideas work. However, if they're not too different (say, PETS), then one batch (of say 64) could be enough, because there are 37 classes and they're similar to what shows up in ImageNet. Am I understanding this properly?

ImageNet was just an example. If using a pretrained model, always use the stats from the pretraining, because that's what the model expects.

2 Likes

Ah, yes, my mistake. I'll blame the fever for that one :wink: If you can manage to represent most of the data and all of the classes in a single batch (be it 64 or 128), then don't worry about it. If you cannot in your first batch, then calculate. I believe that's more in line with what you're describing.

1 Like

Ah, I see, that makes sense. Thanks @sgugger @muellerzr!
Calculating the stats for each dataset and passing them in may be the approach I'll take. :slight_smile:
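
For reference, a rough sketch of doing that calculation yourself; this helper is hypothetical (not a fastai function), and the DataLoaders should be built without a Normalize transform first:

```python
from fastai.vision.all import *

def channel_stats(dls):
    "Hypothetical helper: channel-wise mean/std over the whole training set."
    total, total_sq, count = 0., 0., 0.
    for xb, _ in dls.train:
        count += xb.numel() / xb.shape[1]           # pixels per channel
        total += xb.sum(dim=[0, 2, 3])
        total_sq += (xb ** 2).sum(dim=[0, 2, 3])
    mean = total / count
    std = (total_sq / count - mean ** 2).sqrt()     # var = E[x^2] - E[x]^2
    return mean, std

# mean, std = channel_stats(dls)
# ...then pass Normalize.from_stats(mean, std) in your batch_tfms.
```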

So if we want to use a pretrained model that wasn't trained on ImageNet, we have to use the stats of the specific dataset used to train that initial model. Is there a way to know which dataset was used for each of the pretrained models available in the fastai library?

Also, is there a way to see all the different dataset stats available? I wanted to know this for other cases in general as well; for instance, can we see all the different image augmentations available for use in fastai (like Resize, Rotate, Flip, RandomResizedCrop, etc.) as a list in the documentation? That would be really helpful for hacking away and seeing what's what while learning…

See dev.fast.ai/vision.augment.html

Generally the pretrained models come from torchvision, with the exception of xresnet50. You'd want to look at the torchvision documentation for those.

For proof, we can see this inside vision/models/__init__.py
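
If you want to poke at it yourself, here's one quick way to list what that module re-exports (assuming a fastai v2 install):

```python
from fastai.vision import models

# torchvision architectures plus fastai's own xresnet variants:
print([m for m in dir(models) if not m.startswith("_")])
```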

Also to add to this a bit more @harish3110

fastai has 3 sets of stats built in: cifar_stats, imagenet_stats, and mnist_stats.
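
A quick way to inspect them (each is a (mean, std) pair of per-channel values; the ImageNet values in the comment are the standard ones):

```python
from fastai.vision.all import cifar_stats, imagenet_stats, mnist_stats

print(imagenet_stats)  # ([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(cifar_stats)
print(mnist_stats)
```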

Also x2: it looks like all the torchvision pretrained models are trained on ImageNet:

https://pytorch.org/docs/stable/torchvision/models.html

Great question :slight_smile:

1 Like

Ahh! Thanks!! I'm actually more interested in and curious about the way you backtraced into that file and found them all! It gets way too hazy when I try that! :sweat_smile:

1 Like

So as a rule of thumb: other than unet and xresnet, everything can be used with ImageNet stats to begin with! :slight_smile:

1 Like

Sure, I can describe that! First: no, I don't have the locations of everything in the library memorized (just in case you thought that).

In general my steps are like so:

  1. Well, what application am I in? Is it vision-related specifically? Then look in the vision folder. Does it apply to all data (such as untar_data)? Then start in the data folder.
  2. From there, I'll read the top of the file and scan it quickly to see if it's mentioned. If not, I go look in the other files in that folder. The key is to think about this question: "If I were to put this in a module where the naming and ideas make sense relative to the library structure, where would I put me?"

This of course is only what I do when I'm on mobile; if I'm on my computer, I simply use '??'.
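
For anyone new to that trick, in a Jupyter/IPython session it looks like this:

```python
from fastai.vision.all import *

# `??` is IPython syntax: it shows an object's full source plus the
# file it is defined in -- which is exactly the backtracing above.
Normalize??
```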

Remember, a unet has an encoder, and that encoder is a ResNet, so you'd still want ImageNet stats here :wink:
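
In other words (a sketch; `dls` is assumed to be a segmentation DataLoaders, which the thread doesn't define):

```python
from fastai.vision.all import *

# unet_learner builds a U-Net whose encoder is an ImageNet-pretrained
# ResNet, so imagenet_stats are still the stats to normalize with.
learn = unet_learner(dls, resnet34, pretrained=True)
```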

1 Like