Issues with collecting activation statistics in the cuda_cnn_hooks_init notebook

I’ve noticed the following issues with the section “Hooks” in this notebook.

A) Statistics are being collected during the validation phase as well

class SequentialModel(nn.Module):
    def __init__(self, *layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)
        self.act_means = [[] for _ in layers]
        self.act_stds  = [[] for _ in layers]
        
    def __call__(self, x):
        for i,l in enumerate(self.layers):
            x = l(x)
            self.act_means[i].append(x.data.mean()) #for every x, train as well as valid
            self.act_stds [i].append(x.data.std ())
        return x
    
    def __iter__(self): return iter(self.layers)

However, shouldn’t these statistics be calculated only during the training phase?

Something like:

if self.training: #true after model.train() is called
    self.act_means[i].append(x.data.mean())
    self.act_stds [i].append(x.data.std ())

The graphs don’t change noticeably after the fix (the validation batches apparently don’t shift the statistics much), but I think the check should still be added for consistency.
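For reference, here is a minimal self-contained sketch of the proposed fix. To keep it runnable without torch, `TinyModule` mimics only the `train()`/`eval()` flag of `nn.Module`; in the notebook the class would subclass `nn.Module` as before, and the layers would be real modules rather than lambdas.

```python
class TinyModule:
    """Stand-in for nn.Module: just the training-mode flag."""
    def __init__(self):
        self.training = True           # nn.Module also defaults to training mode
    def train(self): self.training = True
    def eval(self):  self.training = False

class StatsModel(TinyModule):
    def __init__(self, *layers):
        super().__init__()
        self.layers = list(layers)     # stand-in for nn.ModuleList
        self.act_means = [[] for _ in layers]

    def __call__(self, x):
        for i, l in enumerate(self.layers):
            x = l(x)
            if self.training:          # record stats only during training
                self.act_means[i].append(x)
        return x

model = StatsModel(lambda x: x + 1, lambda x: x * 2)
model(0)             # training pass: stats recorded
model.eval()
model(0)             # validation pass: stats skipped
print([len(m) for m in model.act_means])  # -> [1, 1]
```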


B) There are 7 plotted lines (one per layer), so the following snippet (in all the relevant places) should be changed from:

for l in model.act_means: plt.plot(l)
plt.legend(range(6));

to

plt.legend(range(7)); #6 -> 7

or better

layer_names = ["conv8", "conv16", "conv32_1", "conv32_2", "pool", "flatten", "fcc"]
plt.legend(layer_names)
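An even more robust option might be to derive the labels from the layers themselves, so the legend can never drift out of sync with the layer count. A hedged sketch (dummy classes stand in for the notebook's real `nn` layers to keep it runnable here):

```python
# Dummy layer classes standing in for the notebook's nn modules.
class Conv2d:  pass
class AvgPool: pass
class Flatten: pass

layers = [Conv2d(), Conv2d(), Conv2d(), Conv2d(), AvgPool(), Flatten(), Conv2d()]

# One label per layer, with an index to disambiguate repeated classes.
labels = [f"{type(l).__name__}_{i}" for i, l in enumerate(layers)]
print(labels)

# In the notebook this would become:
#   for l in model.act_means: plt.plot(l)
#   plt.legend([f"{type(l).__name__}_{i}" for i, l in enumerate(model)])
```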

C) The statistics for the “pool” and the “flatten” layers will be identical (flattening is a pure reshape, so the mean and std of the activations are unchanged), and thus the two curves lie exactly on top of each other on the graph. A note clarifying this might save some confusion for readers looking at the graphs in detail.
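A tiny pure-Python illustration of why the two curves coincide (using `statistics` instead of torch just to keep it self-contained):

```python
from statistics import mean, pstdev

# Output of the pooling layer: shape (2, 2), as nested lists.
pooled = [[1.0, 2.0], [3.0, 4.0]]

# Flatten is a pure reshape: same numbers, different shape.
flat = [v for row in pooled for v in row]

pool_stats = (mean(v for row in pooled for v in row),
              pstdev([v for row in pooled for v in row]))
flat_stats = (mean(flat), pstdev(flat))
print(pool_stats == flat_stats)  # -> True
```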


If the team thinks the changes are worth doing, I’ll be happy to open a PR.

Thanks!


It would be better if we only show the conv layers - I did that in some parts of the notebook, but not all (only because I didn’t get around to it).

Yes.
