Lesson 3 In-Class Discussion

Sree · November 14, 2017, 4:41am

How does Jeremy split the layers into groups like hidden and fully connected? How do we know which layer is which?

Even · November 14, 2017, 4:44am

@jeremy While we’re poking around in the guts of fast.ai, is there a way to get out the pytorch model?

stathis · November 14, 2017, 4:45am

If I understand correctly the pretrained models work either on 224x224 or 299x299 images. How does the library cope with different image input sizes?

anandsaha · November 14, 2017, 4:45am

How is this threshold decided? Is it a hyperparameter, or is it learned?

yinterian · November 14, 2017, 4:46am

Here are some models

ar_ai · November 14, 2017, 4:46am

It’s a hyperparameter.

stathis · November 14, 2017, 4:46am

In logistic regression the threshold is 0.5, meaning it’s more probable than not.
I suppose it could be the same here.

metachi · November 14, 2017, 4:46am

Take a look at model_meta in conv_learner.py to see how Jeremy is splitting these. If you look at the nitty gritty, he splits the conv layers (almost in half, but not quite) and then has the fully connected/dense layers as another layer group.

*Note: the splitting is probably somewhat dependent on the model. Maybe @jeremy or @yinterian can share a little insight on how they determine exactly where to split. Maybe more thought goes into it than it looks.

jamesrequa · November 14, 2017, 4:46am

Sometimes it is something simple like 0.5, but it’s not always this straightforward. This was a common issue in the Planet competition, so much so that people even made “threshold finders”; you can check this kernel here for an example. Basically, you just test which threshold seems to give the best results.
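To make "test which threshold gives the best results" concrete, here is a minimal sketch in plain Python. It is not the kernel mentioned above; the names f_beta and find_threshold, the toy labels, and the choice of F2 as the metric are all assumptions made for illustration.

```python
def f_beta(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels given as lists of 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def find_threshold(y_true, probs, metric=f_beta, steps=99):
    """Try evenly spaced thresholds in (0, 1); return (best_threshold, best_score)."""
    best_t, best_score = 0.5, -1.0
    for i in range(1, steps + 1):
        t = i / (steps + 1)
        # Predict the label whenever the probability exceeds the threshold.
        preds = [1 if p > t else 0 for p in probs]
        score = metric(y_true, preds)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Toy validation data (made up for illustration).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
probs = [0.9, 0.4, 0.35, 0.8, 0.1, 0.45, 0.3, 0.05]

best_t, best_score = find_threshold(y_true, probs)
baseline = f_beta(y_true, [1 if p > 0.5 else 0 for p in probs])
print(best_t, best_score, baseline)
```

The search should be run on validation predictions, not on the training set. For a multi-label problem, the same sweep could be repeated per label.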

yinterian · November 14, 2017, 4:47am

Some models can be used with any size like resnet34.

ar_ai · November 14, 2017, 4:48am

You can change the threshold to maximize your intended evaluation metric.

Even · November 14, 2017, 4:51am

I’m not looking for the structure of existing models, I’m looking for a way to export fast.ai models using a pytorch to X method. To do that I’d need to be able to access the model and weights though.

yinterian · November 14, 2017, 4:55am

Do you want to save the weights (look for learn.save)? I am not understanding the question.

gerardo · November 14, 2017, 4:57am

Can you elaborate on the freeze and unfreeze methods?
When I’m running the learn.unfreeze()
then we run a different sample with learn.unfreeze()

This is on the lesson2-image-models.ipynb

Even · November 14, 2017, 5:01am

Imagine I want to run the model using only pytorch (without fast.ai). I want to export the model and load it directly from pytorch using only pytorch commands. Weights are a part of that, but the structure of the architecture is important too.

yinterian · November 14, 2017, 5:04am

You can read the fast.ai library and copy and paste what you need.

Even · November 14, 2017, 5:06am

It looks like the model is there in pure pytorch form, I just need to figure out how to reference it correctly. I’ll keep playing around.

anandsaha · November 14, 2017, 5:10am

Assume you have a model created out of the resnet34 class.

model = torchvision.models.resnet34(pretrained=True)

(You can create and use your own model too, as long as it inherits from PyTorch’s nn.Module class. In fact, check resnet34’s PyTorch implementation here and you will understand.)

Now, as usual, you train your model. Once your model is ready, you can extract and load weights like this:

Save:

my_model_state = model.state_dict()
# Save my_model_state to a file

Load:

# Read my_model_state from file
model = torchvision.models.resnet34(pretrained=False) 
model.load_state_dict(my_model_state)
model.eval() # Set it to eval mode for inference (vs. train mode, which is used during training)
# Your model is ready to use
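
The "save to a file" and "read from file" comments above can be filled in with torch.save and torch.load. The following is a minimal sketch under some assumptions: a tiny nn.Sequential stands in for resnet34 so that nothing needs to be downloaded, and make_model and the file name weights.pth are hypothetical names chosen for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn


def make_model():
    # Tiny stand-in architecture; in the post above this would be resnet34.
    return nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))


model = make_model()

# Save: write the state dict (tensors keyed by parameter name) to disk.
path = os.path.join(tempfile.mkdtemp(), "weights.pth")  # hypothetical file name
torch.save(model.state_dict(), path)

# Load: rebuild the same architecture, then copy the saved weights into it.
restored = make_model()
restored.load_state_dict(torch.load(path))
restored.eval()  # inference mode

# Both models should now produce the same outputs for the same input.
x = torch.randn(3, 4)
with torch.no_grad():
    same = torch.allclose(model(x), restored(x))
print(same)
```

Note that the state dict holds only the weights, so the code that defines the architecture still has to be available when loading.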

–

anandsaha · November 14, 2017, 5:23am

In the fastai library, the model can be accessed by:

learn.model

Given that @jeremy talked about the convolution, relu, maxpool and softmax layers today, we can actually investigate the layers in a fastai model.

You can print all the layers like so:

for c in learn.model.children():
    print(type(c))

If you created learn using precompute=False, you should see something like this:

<class 'torch.nn.modules.conv.Conv2d'>
<class 'torch.nn.modules.batchnorm.BatchNorm2d'>
<class 'torch.nn.modules.activation.ReLU'>
<class 'torch.nn.modules.pooling.MaxPool2d'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'torch.nn.modules.container.Sequential'>
<class 'fastai.layers.AdaptiveConcatPool2d'>
<class 'fastai.layers.Flatten'>
<class 'torch.nn.modules.batchnorm.BatchNorm1d'>
<class 'torch.nn.modules.dropout.Dropout'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.activation.ReLU'>
<class 'torch.nn.modules.batchnorm.BatchNorm1d'>
<class 'torch.nn.modules.dropout.Dropout'>
<class 'torch.nn.modules.linear.Linear'>
<class 'torch.nn.modules.activation.LogSoftmax'>

If you created learn using precompute=True, you should see something like this:

<class 'torch.nn.modules.batchnorm.BatchNorm1d'> 2 2
<class 'torch.nn.modules.dropout.Dropout'> 0 0
<class 'torch.nn.modules.linear.Linear'> 2 2
<class 'torch.nn.modules.activation.ReLU'> 0 0
<class 'torch.nn.modules.batchnorm.BatchNorm1d'> 2 2
<class 'torch.nn.modules.dropout.Dropout'> 0 0
<class 'torch.nn.modules.linear.Linear'> 2 2
<class 'torch.nn.modules.activation.LogSoftmax'> 0 0

This suggests that when precompute=True, we use and train only the last fully connected layers (the activations of the previous layers are already precomputed and fed in).
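
The idea of precomputed activations can be sketched in plain PyTorch. This is not how fastai implements precompute; it is only an illustration of the concept, and the small body and head networks, the random data, and the training settings are all assumptions.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-ins: a frozen convolutional "body" and a small trainable "head".
body = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
head = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 2),
    nn.LogSoftmax(dim=1),
)

images = torch.randn(32, 3, 16, 16)  # fake images
labels = torch.randint(0, 2, (32,))  # fake labels

# Precompute: run the body once, without gradients, and cache its activations.
body.eval()
with torch.no_grad():
    features = body(images)

body_before = [p.clone() for p in body.parameters()]

# Train only the head, feeding it the cached activations instead of images.
opt = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.NLLLoss()
for _ in range(5):
    opt.zero_grad()
    loss = loss_fn(head(features), labels)
    loss.backward()
    opt.step()

# The body was never updated, so its weights are unchanged.
body_unchanged = all(
    torch.equal(a, b) for a, b in zip(body_before, body.parameters())
)
print(features.shape, body_unchanged)
```

Because the body runs only once, each epoch over the cached features is much cheaper than a full forward pass, which matches the shorter layer list printed for precompute=True.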

–

Even · November 14, 2017, 5:38am

Thanks @anandsaha, this is exactly what I was looking for.