Short answer:
If it’s a resnet18 created with vision_learner, you can cut it like this:
# cut_model comes from fastai.vision.learner (re-exported by fastai.vision.all)
new_head = cut_model(learner.model[-1], 2)
learner.model[-1] = new_head
If you now pass data through the network, feature vectors come out the other end.
Like this:
x, y = dls.one_batch()
x.shape # torch.Size([64, 3, 224, 224])
feature_vectors = learner.model(x)
feature_vectors.shape # torch.Size([64, 1024])
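If you need features for more than one batch, here is a minimal sketch (it assumes the same learner and dls as above, and that each batch is an (x, y) pair):

import torch  # already in scope if you use fastai's star imports

learner.model.eval()  # put BatchNorm/Dropout layers into inference mode
features = []
with torch.no_grad():  # no gradients needed for feature extraction
    for x, y in dls.valid:
        features.append(learner.model(x).cpu())
features = torch.cat(features)  # shape: (n_valid_items, 1024)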
Longer answer:
So you have something like this (it’s just an example based on your text):
learner = vision_learner(dls, resnet18, metrics=accuracy)
learner.fine_tune(2)
It doesn’t matter whether it’s a resnet18 or a fancier timm model; at the end of the day these are just PyTorch models, so they usually consist of a few nn.Sequential layers with even more layers sandwiched inside them.
A common convention is to call the beginning of the network the “body” and the end the “head”.
Here the outermost Sequential contains the “body”, a Sequential at index 0, and the “head”, another Sequential at index 1.
You can check the exact structure of your model in a notebook cell:
learner.model
This will print a very long list of layers, so I’ll just show the relevant parts for this resnet18:
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      ...
    )
    (5): Sequential(
      ...
    )
    (6): Sequential(
      ...
    )
    (7): Sequential(
      ...
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): fastai.layers.Flatten(full=False)
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
    (4): Linear(in_features=1024, out_features=512, bias=False)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5, inplace=False)
    (8): Linear(in_features=512, out_features=37, bias=False)
  )
)
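By the way, the indices in this printout line up with plain Python indexing, so you can grab the “head” on its own and poke at it in a cell:

head = learner.model[1]  # same as learner.model[-1]; the outer Sequential has 2 children
len(head)                # 9 inner layers, indexed 0..8
head[0]                  # AdaptiveConcatPool2d(...)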
So your problem was with cut_model(learner.model, -1): you lost the whole “head” (and it’s a complicated head :)). What you actually want is to drop the last 7 inner layers of the “head” and keep its first 2.
That’s why we use cut_model(learner.model[-1], 2): we give the “head” as the model to cut_model (it doesn’t care about models vs. submodules; any nn layer is a model in itself) and tell it to keep the first two layers of that model, which are the AdaptiveConcatPool2d and the fastai.layers.Flatten.
The AdaptiveConcatPool2d concatenates an average pool and a max pool over the backbone’s 512 output channels, which is why the feature vectors have 512 + 512 = 1024 dimensions.
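Under the hood, for an integer cut, cut_model is essentially a one-liner over the model’s top-level children; a sketch of the idea (mirroring, not quoting, the fastai source):

from torch import nn  # already in scope with fastai's star imports

# keep the first 2 top-level children of the head, drop the rest
new_head = nn.Sequential(*list(learner.model[-1].children())[:2])

So cut_model(learner.model[-1], 2) just wraps head[0] and head[1] in a fresh nn.Sequential.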