I am working on the dog breeds playground competition on Kaggle. My first attempt was to use a pretrained ResNet model through PyTorch with just one linear layer on the end, as per this awesome post:
I modified the code to predict all 120 classes and got over 90% accuracy on my validation set. However, I can’t seem to make it any better! I have tried adding more layers to the end of both a pretrained ResNet and a pretrained VGG network, but those versions only reach about 40% accuracy. I have tried different learning rates, dropout rates, and numbers of epochs, but the larger heads just do not fit as well as the out-of-the-box version. My general approach to modifying the VGG and ResNet models provided by PyTorch is included below. Any ideas what to try next?
VGG with multiple trainable layers:

import torch.nn as nn
from torchvision import models

class Net(nn.Module):
    def __init__(self, original_model):
        super(Net, self).__init__()
        # Everything except the original classifier (the last child of VGG)
        self.pretrained = nn.Sequential(*list(original_model.children())[:-1])
        # New head, trained from scratch
        self.finetuned = nn.Sequential(
            nn.Linear(512 * 7 * 7, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 120),
            nn.Softmax()
        )

    def forward(self, x):
        x = self.pretrained(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, features)
        x = self.finetuned(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

pretrained = models.vgg16(pretrained=True)
# Freeze the pretrained weights
for param in pretrained.parameters():
    param.requires_grad = False
net = Net(pretrained)
ResNet with multiple trainable layers:

import torch.nn as nn
from torchvision import models

resnet = models.resnet152(pretrained=True)
# Freeze the pretrained weights
for param in resnet.parameters():
    param.requires_grad = False

num_features = resnet.fc.in_features
# New head, trained from scratch, replacing the original fc layer
fc_layers = nn.Sequential(
    nn.Linear(num_features, 4096),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.1),
    nn.Linear(4096, 120),
    nn.ReLU(inplace=True),
    nn.Dropout(p=0.1),
)
resnet.fc = fc_layers
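One difference from the single-layer version that may be worth checking: both heads above apply something after the final `nn.Linear` (a `Softmax` in the VGG head, a `ReLU` plus `Dropout` in the ResNet head). The loss function is not shown here, so this is an assumption, but if training uses `nn.CrossEntropyLoss`, that loss applies log-softmax internally and expects raw logits. A minimal sketch of a multi-layer head that ends in raw logits, with a hypothetical `make_head` helper and random tensors standing in for backbone features:

```python
import torch
import torch.nn as nn


def make_head(in_features: int, hidden: int = 4096,
              num_classes: int = 120, p_drop: float = 0.5) -> nn.Sequential:
    """Two-layer classifier head that outputs raw logits (hypothetical helper)."""
    return nn.Sequential(
        nn.Linear(in_features, hidden),
        nn.ReLU(inplace=True),
        nn.Dropout(p=p_drop),
        nn.Linear(hidden, num_classes),  # no Softmax/ReLU/Dropout after this layer
    )


# 2048 is resnet152's fc.in_features.
head = make_head(in_features=2048)

# Assumption: cross-entropy loss, which takes logits and integer class labels.
criterion = nn.CrossEntropyLoss()

# Random stand-ins for a batch of backbone features and labels.
features = torch.randn(4, 2048)
targets = torch.randint(0, 120, (4,))

outputs = head(features)
loss = criterion(outputs, targets)
loss.backward()  # gradients flow only into the head's parameters
```

With the ResNet model above, this would be attached as `resnet.fc = make_head(resnet.fc.in_features)`. Whether this closes the accuracy gap is untested here; it only removes one structural difference from the version that reached over 90%.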