In Jeremy's tutorial "What is torch.nn really?", there is an example towards the end where he creates a CNN for MNIST. In the nn.Conv2d layers he sets (in_channels, out_channels) to (1, 16), (16, 16), and (16, 10). I get that the last one has to be 10 because there are 10 classes and we want ‘probabilities’ for each class. But why go up to 16 first? How do you choose this value? And why not just go from 1 to 10, 10 to 10, and 10 to 10? Does this have to do with the kernel_size and stride?
All of the images are 28x28, so I can’t see any correlation between these values and 16 either.
import torch.nn as nn
import torch.nn.functional as F

class Mnist_CNN(nn.Module):
    def __init__(self):
        super().__init__()
        # Each layer is Conv2d(in_channels, out_channels, ...)
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, stride=2, padding=1)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=3, stride=2, padding=1)
        self.conv3 = nn.Conv2d(16, 10, kernel_size=3, stride=2, padding=1)

    def forward(self, xb):
        xb = xb.view(-1, 1, 28, 28)    # (N, 1, 28, 28)
        xb = F.relu(self.conv1(xb))    # (N, 16, 14, 14)
        xb = F.relu(self.conv2(xb))    # (N, 16, 7, 7)
        xb = F.relu(self.conv3(xb))    # (N, 10, 4, 4)
        xb = F.avg_pool2d(xb, 4)       # (N, 10, 1, 1)
        return xb.view(-1, xb.size(1)) # (N, 10)
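
One way to check whether the channel counts are tied to kernel_size, stride, or the 28x28 image size is to trace the shapes by hand. The sketch below is plain Python (no PyTorch needed) and assumes the output-size formula from the nn.Conv2d documentation with dilation 1. The helper names conv_out_size, conv_params, and describe are made up for this illustration and are not part of the tutorial.

```python
def conv_out_size(size, kernel_size=3, stride=2, padding=1):
    """Spatial output size of a convolution, assuming dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def conv_params(in_ch, out_ch, kernel_size=3):
    """Learnable parameters in one conv layer: a k*k filter for every
    (in, out) channel pair, plus one bias per output channel."""
    return out_ch * (in_ch * kernel_size * kernel_size + 1)


def describe(channels, size=28):
    """Return (spatial size after each conv, total parameter count)
    for a stack of 3x3, stride-2, padding-1 convolutions."""
    sizes, total = [], 0
    for in_ch, out_ch in channels:
        size = conv_out_size(size)   # depends only on k, stride, padding
        sizes.append(size)
        total += conv_params(in_ch, out_ch)  # depends on channel counts
    return sizes, total


tutorial = [(1, 16), (16, 16), (16, 10)]  # widths used in the tutorial
narrow = [(1, 10), (10, 10), (10, 10)]    # the alternative asked about

print(describe(tutorial))  # ([14, 7, 4], 3930)
print(describe(narrow))    # ([14, 7, 4], 1920)
```

Both stacks shrink the image 28 -> 14 -> 7 -> 4, so the final F.avg_pool2d(xb, 4) leaves a 1x1 map per channel either way. Under these assumptions, only the last out_channels is pinned to 10 by the number of classes; the 16 changes the parameter count but not the spatial sizes, which suggests it is a free width choice rather than something derived from kernel_size, stride, or the image size.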