Short answer:
If it’s a resnet18 created with vision_learner, you can cut it like this:
# cut_model comes from fastai.vision.learner (re-exported by fastai.vision.all)
new_head = cut_model(learner.model[-1], 2)
learner.model[-1] = new_head
If you now pass data through the network, feature vectors come out the other end.
Like this:
x, y = dls.one_batch()
x.shape # torch.Size([64, 3, 224, 224])
feature_vectors = learner.model(x)
feature_vectors.shape # torch.Size([64, 1024])
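If you need features for more than one batch, here is a minimal sketch (it assumes the same learner and dls as above, and that each batch is an (x, y) pair):

import torch  # already in scope if you use fastai's star imports

learner.model.eval()  # put BatchNorm/Dropout layers into inference mode
features = []
with torch.no_grad():  # no gradients needed for feature extraction
    for x, y in dls.valid:
        features.append(learner.model(x).cpu())
features = torch.cat(features)  # shape: (n_valid_items, 1024)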
Longer answer:
So you have something like this (it’s just an example based on your text):
learner = vision_learner(dls, resnet18, metrics=accuracy)
learner.fine_tune(2)
It doesn’t matter whether it’s a resnet18 or a fancier timm model; at the end of the day these are just PyTorch models, so they usually consist of a few nn.Sequential layers with even more layers sandwiched inside them.
A common convention is to call the beginning of the network the “body” and the end the “head”.
Here the outermost Sequential contains the “body”, a Sequential at index 0, and the “head”, another Sequential at index 1.
You can check the exact structure of your model in a notebook cell:
learner.model
This will print a very long list of layers, so I’ll just show the relevant parts for this resnet18:
Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      ...
    )
    (5): Sequential(
      ...
    )
    (6): Sequential(
      ...
    )
    (7): Sequential(
      ...
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): fastai.layers.Flatten(full=False)
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
    (4): Linear(in_features=1024, out_features=512, bias=False)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5, inplace=False)
    (8): Linear(in_features=512, out_features=37, bias=False)
  )
)
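By the way, the indices in this printout line up with plain Python indexing, so you can grab the “head” on its own and poke at it in a cell:

head = learner.model[1]  # same as learner.model[-1]; the outer Sequential has 2 children
len(head)                # 9 inner layers, indexed 0..8
head[0]                  # AdaptiveConcatPool2d(...)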
So your problem was with cut_model(learner.model, -1): you lost the whole “head” (and it’s a complicated head :)). What you actually want is to drop the last 7 inner layers of the “head” and keep its first 2.
That’s why we use cut_model(learner.model[-1], 2): we give the “head” as the model to cut_model (it doesn’t care about models vs. submodules; any nn layer is a model in itself) and tell it to keep the first two layers of that model, which are the AdaptiveConcatPool2d and the fastai.layers.Flatten.
The AdaptiveConcatPool2d concatenates an average pool and a max pool over the backbone’s 512 output channels, which is why the feature vectors have 512 + 512 = 1024 dimensions.
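Under the hood, for an integer cut, cut_model is essentially a one-liner over the model’s top-level children; a sketch of the idea (mirroring, not quoting, the fastai source):

from torch import nn  # already in scope with fastai's star imports

# keep the first 2 top-level children of the head, drop the rest
new_head = nn.Sequential(*list(learner.model[-1].children())[:2])

So cut_model(learner.model[-1], 2) just wraps head[0] and head[1] in a fresh nn.Sequential.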