Hello PyTorch experts. I understand that PyTorch tracks and updates weights and gradients automatically. However, because I do not understand exactly how it does this magic, I am not confident that I won't break it.
I would like to modify an existing ResNet model, and have tried to do so by imitating the code in fastai. Would someone please check my code?
The task:
- Create a pretrained model with create_cnn.
- The custom head created by fastai looks like this:
(1): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): Dropout(p=0.25)
(4): Linear(in_features=4096, out_features=512, bias=True)
(5): ReLU(inplace)
(6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(7): Dropout(p=0.5)
(8): Linear(in_features=512, out_features=4, bias=True)
)
- Insert my own layer (module End) before layer 3, and replace layer 4 with a Linear layer that receives a different number of input features. Place another module, Start, in front of everything else.
- Assign this new model to the existing learn object.
- Have everything just work. Clear?
The code:
learn = create_cnn(data, arch, metrics=[accCancer])
RNmodel = learn.model
head = RNmodel[1]  # the fastai custom head shown above
myLinear = nn.Linear(in_features=4104, out_features=512, bias=True)
# Keep layers 0-2, insert End(), replace the old Dropout/Linear pair,
# then keep the remaining layers 5-8 unchanged
head = nn.Sequential(*list(head.children())[:3], End(), nn.Dropout(0.25),
                     myLinear, *list(head.children())[5:])
model = nn.Sequential(Start(), RNmodel[0], head).cuda()
learn.model = model
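To sanity-check the rebuild outside of fastai, I put together this minimal standalone sketch. The Start and End here are hypothetical stand-ins (my real End concatenates extra features, which I mimic by padding 4096 features up to 4104), so the numbers match my head but nothing else does:

```python
import torch
import torch.nn as nn

class Start(nn.Module):
    """Stand-in for my real Start module (identity here)."""
    def forward(self, x):
        return x

class End(nn.Module):
    """Stand-in for my real End module: appends 8 extra features, 4096 -> 4104."""
    def forward(self, x):
        return torch.cat([x, x[:, :8]], dim=1)

# A toy head with the same layer sequence as my rebuilt fastai head
head = nn.Sequential(
    nn.Flatten(),
    nn.BatchNorm1d(4096),
    End(),
    nn.Dropout(0.25),
    nn.Linear(4104, 512),
    nn.ReLU(inplace=True),
    nn.BatchNorm1d(512),
    nn.Dropout(0.5),
    nn.Linear(512, 4),
)
model = nn.Sequential(Start(), head)

# Every parameter of every child should show up automatically, because
# nn.Sequential registers its children at construction time.
names = [n for n, _ in model.named_parameters()]
print(len(names))  # 8: weight + bias for the two BatchNorm1d and two Linear layers
```

If the count of named parameters matches what I expect (only the BatchNorm and Linear layers contribute), I assume the registration worked; whether that is a sufficient check is part of my question.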
The resulting model head looks right, and the model appears to train:
(2): Sequential(
(0): AdaptiveConcatPool2d(
(ap): AdaptiveAvgPool2d(output_size=1)
(mp): AdaptiveMaxPool2d(output_size=1)
)
(1): Flatten()
(2): BatchNorm1d(4096, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(3): End()
(4): Dropout(p=0.25)
(5): Linear(in_features=4104, out_features=512, bias=True)
(6): ReLU(inplace)
(7): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(8): Dropout(p=0.5)
(9): Linear(in_features=512, out_features=4, bias=True)
)
But does it all automagically work right without my doing anything more? I am concerned about the mentions of “registering/initializing parameters” in the PyTorch docs.
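For what it's worth, my reading of the docs is that modules assigned into an nn.Sequential are registered as submodules, and that rebuilding a Sequential from list(head.children()) reuses the very same Parameter objects, so pretrained weights should carry over. Here is the toy check I ran to convince myself (a standalone example, not my real model), though I'm still unsure it covers everything fastai does with layer groups and the optimizer:

```python
import torch.nn as nn

old_head = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 2))
w_before = old_head[0].weight  # the actual Parameter object, not a copy

# Rebuild the Sequential around the same child modules, inserting a new layer
new_head = nn.Sequential(*list(old_head.children())[:2], nn.Dropout(0.1),
                         *list(old_head.children())[2:])

# The rebuilt container holds the very same Parameter objects, so any
# pretrained weights come along for free.
assert new_head[0].weight is w_before
print(sum(p.numel() for p in new_head.parameters()))  # 67 = (50+5) + (10+2)
```

So the tensors themselves seem to survive the surgery; what I can't tell is whether assigning learn.model afterwards keeps the learner's optimizer and layer groups consistent.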
Thanks for reading my long question!