Lesson 3 In-Class Discussion

How does Jeremy split the layers into groups, like hidden and fully connected? How do we know which layer is which?

@jeremy While we’re poking around in the guts of fast.ai, is there a way to get out the pytorch model?

If I understand correctly the pretrained models work either on 224x224 or 299x299 images. How does the library cope with different image input sizes?

How is this threshold decided? Is it a hyperparameter, or is it learnt?

Here are some models

It's a hyperparameter.

In logistic regression the threshold is 0.5, meaning it’s more probable than not.
I suppose it could be the same

Take a look at model_meta in conv_learner.py to see how Jeremy is splitting these. If you look at the nitty gritty, he splits the conv layers (almost in half, but not quite) and then has the fully connected/dense layers as another layer group.

*Note: the splitting is probably somewhat dependent on the model. Maybe @jeremy or @yinterian can share a little insight on how they determine exactly where to split. Maybe more thought goes into it than it looks.

Sometimes it is something simple like 0.5, but it's not always this straightforward. This was a common issue in the Planet competition, so much so that people even made “threshold finders” :slight_smile: You can check this kernel here for example. Basically you just test which threshold seems to give the best results.
https://www.kaggle.com/paulorzp/find-best-f2-score-threshold
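
To make "test which threshold gives the best results" concrete, here is a minimal sketch in plain Python. It is not the code from the linked kernel. The function names `fbeta` and `best_threshold`, the grid of 99 candidate thresholds, and the toy labels and probabilities are all made up for illustration, and it handles a single binary label rather than the multi-label Planet setup.

```python
def fbeta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels given as sequences of 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def best_threshold(y_true, probs, beta=2.0, steps=99):
    """Try evenly spaced thresholds in (0, 1) and keep the first best-scoring one."""
    best_t, best_score = 0.5, -1.0
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        preds = [1 if p > t else 0 for p in probs]
        score = fbeta(y_true, preds, beta)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Made-up validation labels and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
probs = [0.9, 0.6, 0.3, 0.4, 0.2, 0.1, 0.8, 0.35]
t, score = best_threshold(y_true, probs)
print(f"best threshold = {t:.2f}, F2 = {score:.3f}")
```

With F2, recall counts more than precision, so the best threshold on this toy data lands well below 0.5. The search should be run on validation data, not on the training set.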

Some models can be used with any size like resnet34.
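
One reason this can work, as far as I understand it, is adaptive pooling: the pooling layer before the fully connected part always produces a fixed-size output, whatever the spatial size of its input (the layer list later in this thread shows an AdaptiveConcatPool2d in the fastai model). The sketch below uses a tiny stand-in network, not resnet34 and not fastai code, with PyTorch's `nn.AdaptiveAvgPool2d` to show the idea.

```python
import torch
import torch.nn as nn

# A tiny stand-in network (NOT resnet34): conv body, adaptive pooling, linear head.
net = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),  # always outputs 8 x 1 x 1, whatever the input size
    nn.Flatten(),
    nn.Linear(8, 2),
)
net.eval()

# Random tensors standing in for 224x224 and 299x299 images.
with torch.no_grad():
    out_224 = net(torch.randn(1, 3, 224, 224))
    out_299 = net(torch.randn(1, 3, 299, 299))

print(out_224.shape, out_299.shape)
```

Both inputs go through the same linear layer without any shape error. A fixed-size pooling layer in the same position would tie the network to a single input size.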

You can change the threshold to maximize your intended evaluation metric.

I’m not looking for the structure of existing models, I’m looking for a way to export fast.ai models using a pytorch to X method. To do that I’d need to be able to access the model and weights though.

Do you want to save the weights (look for learn.save)? I am not understanding the question.

Can you elaborate on the freeze and unfreeze methods?
When I’m running learn.unfreeze(),
then we run a different sample with learn.unfreeze()

This is in lesson2-image-models.ipynb

Imagine I want to run the model using only pytorch (without fast.ai). I want to export the model and load it directly from pytorch using only pytorch commands. Weights are a part of that, but the structure of the architecture is important too.

You can read the fast.ai library and copy and paste what you need.

It looks like the model is there in pure pytorch form, I just need to figure out how to reference it correctly. I’ll keep playing around.

Assume you have a model created out of the resnet34 class.

model = torchvision.models.resnet34(pretrained=True) 

(You can create and use your own model too, as long as you inherit it from PyTorch’s nn.Module class. In fact, check resnet34’s PyTorch implementation here and you will understand.)

Now, as usual, you train your model. Once your model is ready, you can extract and load weights like this:

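Save:

import torch

my_model_state = model.state_dict()
# Save my_model_state to a file (the file name here is just an example)
torch.save(my_model_state, 'my_model_state.pth')

Load:

import torch
import torchvision

# Read my_model_state from the file (same example file name as above)
my_model_state = torch.load('my_model_state.pth')
model = torchvision.models.resnet34(pretrained=False)  # same architecture, no pretrained weights
model.load_state_dict(my_model_state)
model.eval() # Set it to eval mode for inference (vs. train mode, which is used during training)
# Your model is ready to use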

In the fastai library, the model can be accessed by:

learn.model

Given that @jeremy talked about the convolution, ReLU, max pooling and softmax layers today, we can actually investigate the layers in a fastai model.

You can print all the layers like so:

for c in learn.model.children():
    print(type(c))

If you created learn using precompute=False, you should see something like this:

<class 'torch.nn.modules.conv.Conv2d'>
<class 'torch.nn.modules.batchnorm.BatchNorm2d'>
<class 'torch.nn.modules.activation.ReLU'>
<class 'torch.nn.modules.pooling.MaxPool2d'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'fastai.layers.AdaptiveConcatPool2d'>
<class 'fastai.layers.Flatten'>
<class 'torch.nn.modules.batchnorm.BatchNorm1d'>
<class 'torch.nn.modules.dropout.Dropout'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.activation.ReLU'>
<class 'torch.nn.modules.batchnorm.BatchNorm1d'>
<class 'torch.nn.modules.dropout.Dropout'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.activation.LogSoftmax'>

If you created learn using precompute=True, you should see something like this:

<class 'torch.nn.modules.batchnorm.BatchNorm1d'> 2 2
<class 'torch.nn.modules.dropout.Dropout'> 0 0
<class 'torch.nn.modules.linear.Linear'> 2 2
<class 'torch.nn.modules.activation.ReLU'> 0 0
<class 'torch.nn.modules.batchnorm.BatchNorm1d'> 2 2
<class 'torch.nn.modules.dropout.Dropout'> 0 0
<class 'torch.nn.modules.linear.Linear'> 2 2
<class 'torch.nn.modules.activation.LogSoftmax'> 0 0

This suggests that when precompute=True, we make use of and train only the last fully connected layers (the activations of the previous layers are already precomputed and fed in).
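
Here is a conceptual sketch of that idea in plain PyTorch. It is not how fastai implements precompute. `body` and `head` are made-up stand-ins for the pretrained conv layers and the new fully connected layers, and the images and labels are random.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins: `body` plays the frozen pretrained layers, `head` the new fully connected layers.
body = nn.Sequential(nn.Conv2d(3, 4, 3, padding=1), nn.ReLU(),
                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
head = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2), nn.LogSoftmax(dim=1))

x = torch.randn(16, 3, 32, 32)   # made-up images
y = torch.randint(0, 2, (16,))   # made-up labels

# Run the body once and cache its activations; no gradients are needed for it.
body.eval()
with torch.no_grad():
    feats = body(x)

# Train only the head, feeding it the cached activations instead of the images.
opt = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.NLLLoss()
body_before = [p.clone() for p in body.parameters()]
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(head(feats), y)
    loss.backward()
    opt.step()

# The body's weights are untouched because it never took part in training.
body_unchanged = all(torch.equal(a, b) for a, b in zip(body_before, body.parameters()))
print(feats.shape, body_unchanged)
```

Because the body runs only once, every later epoch costs only the head's forward and backward pass. The trade-off is that the cached activations are fixed, so anything that changes the input images between epochs has no effect on them.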

Thanks @anandsaha, this is exactly what I was looking for.