Modifying a pretrained ResNet - does it "just work"?

(Malcolm McLean) #1

Hello PyTorch experts. I understand that PyTorch is able to track and update weights and gradients automatically. However, because I do not understand exactly how it performs this magic, I am not confident that I won’t break it.

I would like to modify an existing ResNet model, and have tried to do so by imitating the code in fastai. Would someone please check my code?

The task:

  • create a pretrained model with create_cnn

  • the custom head created by fastai looks like this…
    (1): Sequential(
      (0): AdaptiveConcatPool2d(
        (ap): AdaptiveAvgPool2d(output_size=1)
        (mp): AdaptiveMaxPool2d(output_size=1)
      )
      (1): Flatten()
      (2): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (3): Dropout(p=0.25)
      (4): Linear(in_features=4096, out_features=512, bias=True)
      (5): ReLU(inplace)
      (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (7): Dropout(p=0.5)
      (8): Linear(in_features=512, out_features=4, bias=True)
    )

  • insert my own layer (module End) before layer 3, and replace layer 4 with a Linear layer that receives a different number of features. Place another module, Start, in front of everything else.

  • assign this new model to the existing learn object.

  • have everything just work. Clear?

The code:

import torch.nn as nn

# create_cnn comes from fastai; data, arch, accCancer, Start, and End are defined elsewhere
learn = create_cnn(data, arch, metrics=[accCancer])
RNmodel = learn.model
head = RNmodel[1]

# replacement Linear that accepts 4104 features instead of the original 4096
myLinear = nn.Linear(in_features=4104, out_features=512, bias=True)
# keep layers 0-2, insert End() plus a fresh Dropout, swap in myLinear, keep layers 5 onward
head = nn.Sequential(*list(head.children())[:3], End(), nn.Dropout(.25), myLinear, *list(head.children())[5:])

# prepend Start() to the pretrained body and attach the rebuilt head
model = nn.Sequential(Start(), RNmodel[0], head).cuda()
learn.model = model
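
The code above assumes Start and End are custom nn.Module subclasses; their actual bodies are not shown in this post. Purely as a hypothetical placeholder to make the snippet self-contained (the real End presumably produces the 4104 features the new Linear expects, e.g. by concatenating extra per-sample data), a minimal skeleton could look like this:

import torch
import torch.nn as nn

class Start(nn.Module):
    # placeholder: runs on the raw input batch before the pretrained body
    def forward(self, x):
        return x

class End(nn.Module):
    # placeholder: sits after BatchNorm1d(4096) and must output 4104 features
    # to match the Linear(in_features=4104, ...) that follows it
    def forward(self, x):
        extra = torch.zeros(x.size(0), 8, device=x.device)  # hypothetical 8 extra features
        return torch.cat([x, extra], dim=1)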

The resulting model head looks right and the model appears to train.

(2): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten()
    (2): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): End()
    (4): Dropout(p=0.25)
    (5): Linear(in_features=4104, out_features=512, bias=True)
    (6): ReLU(inplace)
    (7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (8): Dropout(p=0.5)
    (9): Linear(in_features=512, out_features=4, bias=True)
  )

But does it all automagically work right without my doing anything more? I am concerned about the mentions of “registering/initializing parameters” in the PyTorch docs.
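
For concreteness, the registration in question can be inspected directly: listing the model's named parameters shows exactly what the optimizer will see. A minimal check, assuming the code above has run:

# every parameter of Start(), End() and myLinear should be listed here,
# because nn.Sequential registers the modules assigned into it as submodules
for name, p in learn.model.named_parameters():
    print(name, tuple(p.shape), p.requires_grad)

# the trainable-parameter count should reflect the new Linear(4104, 512)
print(sum(p.numel() for p in learn.model.parameters() if p.requires_grad))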

Thanks for reading my long question!


(Malcolm McLean) #2

Coming back to answer this a month later, after tracing through many modified models.

Yes, it all just works…

  • grabbing a module (usually pre-trained) by its index.
  • listifying a module’s children and slicing the resulting list
  • throwing any of these into nn.Sequential
  • assigning a different layer to a model with indexing on the left, as in RNmodel[1][3] = myLinear
  • assigning to a Module’s named instance variables, as in myModule.lin2 = myLinear. Here you have to make sure that myModule.forward() still works.

PyTorch properly keeps track of parameters and applies optimizers correctly.
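
A minimal, self-contained sketch of these operations on a throwaway stand-in model (the names below are made up for illustration, not taken from the code above):

import torch
import torch.nn as nn

# stand-in "pretrained" model laid out like the fastai one above: (0) body, (1) head
model = nn.Sequential(
    nn.Sequential(nn.Conv2d(3, 8, 3), nn.ReLU(), nn.AdaptiveAvgPool2d(1)),    # (0) body
    nn.Sequential(nn.Flatten(), nn.Linear(8, 4), nn.ReLU(), nn.Linear(4, 2)),  # (1) head
)

head = model[1]                         # grab a module by its index
layers = list(head.children())          # listify its children
new_head = nn.Sequential(*layers[:1],   # slice, then throw the pieces into Sequential
                         nn.Dropout(0.25), nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
model[1] = new_head                     # assign with indexing on the left

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin1 = nn.Linear(10, 4)
        self.lin2 = nn.Linear(4, 2)
    def forward(self, x):
        return self.lin2(self.lin1(x))

m = MyModule()
m.lin2 = nn.Linear(4, 3)                # assign to a named instance variable;
                                        # forward() still works because the shapes agree

out = model(torch.randn(2, 3, 8, 8))    # the new head's parameters were registered automatically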

You can also change a layer’s non-learned settings on the fly during training, such as requires_grad flags and the dropout probability.
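
For example (a sketch, continuing with the stand-in model above): requires_grad can be flipped on a parameter tensor at any point, and nn.Dropout reads its p attribute on every forward pass, so changing it takes effect from the next batch.

# freeze the body mid-training ...
for p in model[0].parameters():
    p.requires_grad = False

# ... and unfreeze it again later
for p in model[0].parameters():
    p.requires_grad = True

# lower every Dropout layer's probability in place
for mod in model.modules():
    if isinstance(mod, nn.Dropout):
        mod.p = 0.1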

I had to test it to believe it.
