So I’m creating a CNN from scratch. As I understand it, a fully connected layer means the last layer of the network has a kernel the same size as its input.
Jeremy said VGG has fully connected layers and that it’s kind of slow and heavy.
But ResNet doesn’t have fully connected layers. What does that mean? How do you write something without fully connected layers? I thought that last step was mandatory.
The problem with a fully-connected layer is that it always expects its input to be a vector of a fixed size. But a convolution layer (or pooling layer) doesn’t care about the size of the input.
So if the entire network is made up of conv / pooling layers, you can more easily use it on images of different sizes. That’s a big reason why almost no one uses large FC layers anymore.
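To make that concrete, here is a minimal sketch (layer sizes are arbitrary, just for illustration): the conv layer happily processes two different input sizes, while the Linear layer's weight matrix is tied to one specific flattened size and fails on anything else.

```python
import torch
import torch.nn as nn

# A conv layer is defined only by channel counts and kernel size,
# so it accepts any spatial size:
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
out_small = conv(torch.randn(1, 3, 32, 32))
out_large = conv(torch.randn(1, 3, 64, 64))
print(out_small.shape, out_large.shape)  # (1, 16, 32, 32) and (1, 16, 64, 64)

# A fully-connected layer bakes the input size into its weight matrix:
fc = nn.Linear(3 * 32 * 32, 10)  # only works for 32x32 inputs
try:
    fc(torch.randn(1, 3, 64, 64).flatten(1))
except RuntimeError as e:
    print("shape mismatch:", e)  # a 64x64 input no longer fits
```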
Who says ResNet doesn’t have fully connected layers? It has at least one; see here:
self.relu = nn.ReLU(inplace=True)
self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
self.layer1 = self._make_layer(block, 64, layers[0])
self.layer2 = self._make_layer(block, 128, layers[1], stride=2)
self.layer3 = self._make_layer(block, 256, layers[2], stride=2)
self.layer4 = self._make_layer(block, 512, layers[3], stride=2)
self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
self.fc = nn.Linear(512 * block.expansion, num_classes)

for m in self.modules():
    if isinstance(m, nn.Conv2d):
        nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
    elif isinstance(m, (nn.BatchNorm2d, nn.GroupNorm)):
        nn.init.constant_(m.weight, 1)
        nn.init.constant_(m.bias, 0)
If you use the fastai pretrained version with a custom head, it has two.
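Here is a minimal sketch of why that `avgpool` + `fc` tail still works on any input size (the `TinyNet` model and its layer sizes are hypothetical, not from torchvision): `AdaptiveAvgPool2d((1, 1))` squeezes whatever feature map comes out of the conv stack down to 1x1, so the single FC layer always sees the same vector length.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical ResNet-style tail: conv features, adaptive pool, one fc."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 128, 3, stride=2, padding=1),
            nn.ReLU(inplace=True),
        )
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)       # always (N, 128, 1, 1), whatever the input size
        x = torch.flatten(x, 1)   # (N, 128)
        return self.fc(x)

net = TinyNet()
print(net(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 10])
print(net(torch.randn(2, 3, 320, 320)).shape)  # torch.Size([2, 10])
```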
VGG has 3 FC layers at the end, but what makes it heavy and slow is that the middle one is huge (4096x4096) and the others have to lead up to and down from it, so there are millions of weights in those final layers. More modern architectures use much smaller FC layers, and often only one or two. (Yes, there are also nets without FC layers, but ResNet is not one of them.)
def __init__(self, features, num_classes=1000, init_weights=True):
    super().__init__()
    self.features = features
    self.avgpool = nn.AdaptiveAvgPool2d((7, 7))
    self.classifier = nn.Sequential(
        nn.Linear(512 * 7 * 7, 4096), nn.ReLU(True), nn.Dropout(),
        nn.Linear(4096, 4096), nn.ReLU(True), nn.Dropout(),
        nn.Linear(4096, num_classes),
    )

def forward(self, x):
    x = self.features(x)
    x = self.avgpool(x)
    x = torch.flatten(x, 1)
    return self.classifier(x)
(examples from the torchvision implementations used in fastai)
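You can check the "millions of weights" claim directly. A quick sketch, counting only the three Linear layers from the classifier above (the ReLU and Dropout layers add no parameters):

```python
import torch.nn as nn

# The three FC layers from the VGG classifier, stripped of the
# parameter-free activation and dropout layers:
classifier = nn.Sequential(
    nn.Linear(512 * 7 * 7, 4096),
    nn.Linear(4096, 4096),
    nn.Linear(4096, 1000),
)

for layer in classifier:
    print(tuple(layer.weight.shape), f"{layer.weight.numel():,} weights")
# (4096, 25088) 102,760,448 weights
# (4096, 4096) 16,777,216 weights
# (1000, 4096) 4,096,000 weights

total = sum(p.numel() for p in classifier.parameters())
print(f"total: {total:,}")  # total: 123,642,856
```

Roughly 124 million parameters in the FC head alone, and note that the first layer (leading up to the 4096-wide middle) is actually the biggest chunk.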
Note that in this case using a 1x1 conv layer is identical to using a fully-connected layer. So while this particular implementation of ResNet has one, others may use a 1x1 conv here.
(Key is the AdaptiveAvgPool2d layer that precedes it, which reduces the feature map to 1x1.)
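To make that equivalence concrete, here's a quick sketch (the variable names are just for illustration): copying a Linear layer's weights into a 1x1 conv produces the same output on the 1x1 feature map that AdaptiveAvgPool2d leaves behind.

```python
import torch
import torch.nn as nn

x = torch.randn(4, 512, 1, 1)  # feature map after AdaptiveAvgPool2d((1, 1))

fc = nn.Linear(512, 10)
conv1x1 = nn.Conv2d(512, 10, kernel_size=1)

# Reuse the fc weights in the 1x1 conv: same numbers, just reshaped.
with torch.no_grad():
    conv1x1.weight.copy_(fc.weight.view(10, 512, 1, 1))
    conv1x1.bias.copy_(fc.bias)

out_fc = fc(x.flatten(1))         # (4, 10)
out_conv = conv1x1(x).flatten(1)  # (4, 10)
print(torch.allclose(out_fc, out_conv, atol=1e-5))  # True
```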