Number of parameters in CONV layer

iamgianluca · January 31, 2020, 2:14pm

Hi everyone,

While working on my notes from courses I took in the recent past, I realized I never fully understood how to compute the number of parameters a CONV layer requires.

Let’s take as an example CONV1. I thought that layer had ((f \times f \times n_{c}) + 1) \times \textrm{# filters} = ((5 \times 5 \times 3) + 1) \times 8 = 608 parameters.

The n_{c} for each filter in the CONV1 layer must be 3 since it has to match the number of channels in the input image. The 1 in the formula above is the bias term, which I’m assuming is a single scalar per filter. Since the output activation is of size (28 \times 28 \times 8), we will need 8 filters in the CONV1 layer.
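To make the arithmetic easy to re-check, here is a minimal sketch of the formula above in plain Python. The function name `conv_layer_params` and its argument names are just illustrative choices, and it assumes square filters with one scalar bias per filter, as described in this post.

```python
def conv_layer_params(f: int, n_c: int, n_filters: int, bias: bool = True) -> int:
    """Count the learnable parameters of a CONV layer with square f x f filters.

    Assumes every filter spans all n_c input channels and, if bias is True,
    carries exactly one scalar bias.
    """
    weights_per_filter = f * f * n_c        # one weight per filter position and input channel
    bias_per_filter = 1 if bias else 0      # one scalar bias per filter
    return (weights_per_filter + bias_per_filter) * n_filters


# CONV1 from this post: 5x5 filters, 3 input channels, 8 filters
print(conv_layer_params(f=5, n_c=3, n_filters=8))  # 608
```

Note that the output size (28 \times 28) does not enter the count at all; only the filter size, the number of input channels, and the number of filters do.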

What is wrong with my computations?

Pomo · February 1, 2020, 2:39am

I don’t have a definitive answer for you, as I would agree with your computations. Some suggestions:

To see whether your calculations are right, create an nn.Conv2d layer and count its parameters.

Also, check out the groups parameter of nn.Conv2d. Maybe the type of convolution you are thinking of is different from Ng’s.

Or maybe he made a mistake. Please let us know what you figure out.
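On the groups suggestion, a rough sketch of how grouping changes the count may help. In PyTorch, the weight of nn.Conv2d has shape (out_channels, in_channels / groups, kH, kW), and both channel counts must be divisible by groups. The helper below is hypothetical (not part of any library) and just mirrors that shape rule in plain Python, assuming square kernels.

```python
def grouped_conv_params(f: int, n_c: int, n_filters: int,
                        groups: int = 1, bias: bool = True) -> int:
    """Parameter count of a grouped 2D convolution with square f x f kernels.

    Each filter only sees n_c // groups input channels, which mirrors the
    (out_channels, in_channels / groups, kH, kW) weight shape of nn.Conv2d.
    """
    if n_c % groups != 0 or n_filters % groups != 0:
        raise ValueError("groups must divide both n_c and n_filters")
    weights = n_filters * (n_c // groups) * f * f
    biases = n_filters if bias else 0
    return weights + biases


print(grouped_conv_params(5, 3, 8))             # groups=1, the ordinary case: 608
print(grouped_conv_params(5, 4, 8, groups=2))   # 8 * 2 * 25 + 8 = 408
```

With 3 input channels and 8 filters, the only value of groups that divides both is 1, so under this rule grouping cannot change the count for CONV1.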

iamgianluca · February 1, 2020, 4:18pm

Good point @Pomo!

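import numpy as np
import torch.nn as nn

# 3 input channels, 8 filters of size 5x5 (stride and padding do not affect the parameter count)
conv = nn.Conv2d(3, 8, kernel_size=5, stride=1, padding=1, bias=True)
print(conv.weight.shape)  # weight tensor is laid out as (filters, channels, height, width)

# Total parameters: all filter weights plus one bias per filter
print(np.prod(conv.weight.shape) + conv.bias.shape[0])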

This snippet confirms my result: 608 parameters. Maybe it was just a typo in Prof. Ng’s slides.

iamgianluca · February 1, 2020, 8:20pm

For completeness, I went back to Coursera and checked the material again. Prof. Ng added a page containing a list of corrections to the slide I posted yesterday, which wasn’t available when I took the course.