Trying to understand layers (visualizations)

amritv · April 13, 2020, 6:11am

Motivated by this recent paper An Overview of Early Vision in InceptionV1, I am trying to understand what each layer looks like (specifically the initial conv2d layers as per the paper).

I am using a trained model from this challenge: https://forums.fast.ai/t/fastgarden-a-new-imagenette-like-competition-just-for-fun/65909

Using this as the test image:
01_input

This is the snippet of the conv2d0 model summary
01_conv0

To get the conv2d0 layer:

conv0 = custom_hook.stored[0]
conv0.shape

output:
torch.Size([1, 32, 112, 112])
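For readers without the notebook, here is a minimal sketch of how such an activation could be captured in plain PyTorch. `StoreHook` and the toy `conv0` layer are hypothetical stand-ins (the thread does not show how `custom_hook` is defined); the layer is sized only so that a 224x224 input gives the shape reported above.

```python
import torch
import torch.nn as nn

class StoreHook:
    """Hypothetical helper: keeps the output of a module after each forward pass."""
    def __init__(self, module):
        self.stored = None
        self.handle = module.register_forward_hook(self._save)

    def _save(self, module, inputs, output):
        # Detach so the stored tensor does not keep the autograd graph alive.
        self.stored = output.detach()

    def remove(self):
        self.handle.remove()

# Toy first layer (assumption): 3 -> 32 channels, stride 2, so 224 -> 112.
conv0 = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1)
hook = StoreHook(conv0)

xb = torch.randn(1, 3, 224, 224)  # one fake RGB image
with torch.no_grad():
    conv0(xb)
hook.remove()

print(hook.stored.shape)  # torch.Size([1, 32, 112, 112])
```

The dimensions read as (batch, channels, height, width), so this layer produces 32 feature maps of 112x112 for the single input image.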

View the image:
show_image(custom_hook.stored[0][0][0])
01_img

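import matplotlib.pyplot as plt

# Plot the first 30 of the 32 feature maps of conv2d0 in a 3x10 grid
fig, axes = plt.subplots(3, 10, figsize=(15,12/3))
fig.subplots_adjust(hspace=0.1, wspace=0.1, left=0, right=1, top=1, bottom=0)
for i, ax in enumerate(axes.flat):
    # stored[0][0][i]: feature map i for the first image in the batch
    ax.imshow(custom_hook.stored[0][0][i].cpu(), cmap='Greys')
    ax.set_axis_off()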

conv2d0 batch:

conv2d1:

conv2d2:

conv2d3:

conv2d4:

Is this a correct representation of the layers? (And also shouldn’t the conv2d0 layer be less defined compared to conv2d5 layer?)

MicPie · April 13, 2020, 6:18am

It seems that you are still looking at the same input conv2d with the 32 output feature maps. So these feature maps will look very similar as they go over the same input data.
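To make the indexing concrete: given the shape reported earlier, `torch.Size([1, 32, 112, 112])`, the index after `stored[0]` selects the image in the batch and the next one selects a channel (feature map) of that same layer. A small sketch, with a random tensor standing in for `custom_hook.stored[0]`:

```python
import torch

# Stand-in for custom_hook.stored[0]: (batch, channels, height, width).
act = torch.randn(1, 32, 112, 112)

image_maps = act[0]    # all 32 feature maps for the single image
fmap_0 = act[0][0]     # feature map 0 of this layer
fmap_13 = act[0][13]   # feature map 13 of the SAME layer, not of layer 13

print(image_maps.shape)             # torch.Size([32, 112, 112])
print(fmap_0.shape, fmap_13.shape)  # both torch.Size([112, 112])
```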

I’m not sure what you mean by “more defined”, but the feature maps further down in the model should get more abstract and smaller in terms of w x h. Try having a look at the output of a subsequent conv2d.
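One way to look at subsequent layers is to hook every `Conv2d` and compare the stored shapes. The sketch below uses a toy three-layer stack (an assumption, not the model from this thread) in which each conv halves the width and height:

```python
import torch
import torch.nn as nn

# Toy stack (assumption): each stride-2 conv halves width and height.
model = nn.Sequential(
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
)

stored = []  # one activation tensor per conv layer, in forward order
handles = [
    m.register_forward_hook(lambda mod, inp, out: stored.append(out.detach()))
    for m in model.modules() if isinstance(m, nn.Conv2d)
]

with torch.no_grad():
    model(torch.randn(1, 3, 224, 224))
for h in handles:
    h.remove()  # always remove hooks once the activations are captured

shapes = [tuple(t.shape) for t in stored]
for i, s in enumerate(shapes):
    print(f"conv {i}: {s}")
# conv 0: (1, 32, 112, 112)
# conv 1: (1, 64, 56, 56)
# conv 2: (1, 128, 28, 28)
```

Here `stored[k]` is the output of the k-th conv layer, so comparing layers means changing the first index, while the channel index picks a feature map within one layer.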

amritv · April 13, 2020, 6:31am

Thanks for that explanation; yes, the feature maps are for one test image. Poor choice of words on my part. What I meant to say is: as you go deeper, doesn’t the model become more specific, and hence shouldn’t the visualizations become more specific?

I thought the initial layers look for edges and shapes and then the deeper layers look for more defined features, but these maps show the opposite. (Unless I’m looking at this all wrong.)

MicPie · April 13, 2020, 7:47am

…and for the first conv2d layer only, or did I misinterpret this?

The higher layers capture high level features but this does not mean that the feature maps will be easy to interpret (e.g., highlighting the input they are “looking at” or similar).

I tried something in that direction with the older fastai library. There you can see how the activations evolve from the beginning to the end of the model when one image is run through it:

Maybe this setup brings in another helpful view on this topic?

s.s.o · April 13, 2020, 8:05am

You can also check the link from the fastai2 examples.

amritv · April 13, 2020, 3:53pm

The images are for the first 5 conv2d layers (conv2d0 custom_hook.stored[0][0][0], conv2d1 custom_hook.stored[0][0][3], conv2d2 custom_hook.stored[0][0][6], conv2d3 custom_hook.stored[0][0][10], conv2d4 custom_hook.stored[0][0][13]).

Thanks for sharing your notebook, it’s a great help. I also see that in your visualizations the images for layer 80 are pixelated compared to layer 0 (rightly so, because the size of the feature maps is shrinking).

It is difficult for me to interpret what these images are showing and how these maps may help in seeing how the model makes a prediction. I guess heat maps would be a better way to interpret how the model came to its prediction?
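As a rough illustration of the heat map idea, here is a minimal Grad-CAM-style sketch. The toy `body` and `head` modules and the random input are assumptions made for the example; with a real model, the activations and gradients would come from hooks on the last convolutional block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy classifier (assumption): a small conv body followed by a linear head.
body = nn.Sequential(
    nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 4))

xb = torch.randn(1, 3, 64, 64)
acts = body(xb)           # (1, 32, 16, 16) activations of the last conv block
acts.retain_grad()        # keep the gradient of this non-leaf tensor
preds = head(acts)        # (1, 4) class scores

cls = preds.argmax(dim=1).item()
preds[0, cls].backward()  # gradient of the chosen class score w.r.t. acts

# Grad-CAM: weight each channel by its spatially averaged gradient, sum, ReLU.
weights = acts.grad.mean(dim=(2, 3), keepdim=True)                # (1, 32, 1, 1)
cam = F.relu((weights * acts.detach()).sum(dim=1, keepdim=True))  # (1, 1, 16, 16)

# Upsample to the input size so the map can be overlaid on the image.
heatmap = F.interpolate(cam, size=xb.shape[-2:], mode="bilinear",
                        align_corners=False)[0, 0]
print(heatmap.shape)  # torch.Size([64, 64])
```

The resulting map could then be drawn on top of the input with something like `plt.imshow(heatmap, alpha=0.6, cmap='magma')`. Unlike raw feature maps, it is tied to one predicted class, which tends to make it easier to read.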

amritv · April 13, 2020, 3:55pm

Thanks for sharing this, a lot of info in this notebook.

amritv · April 14, 2020, 4:52am

Here is an updated take (thanks to @MicPie and @s.s.o for their input). It views the feature maps as we convolve through the layers and, in turn, the activations within those feature maps. Notebook is here.

conv2d_0
conv01

conv2d_1
conv11

conv2d_2
conv22

conv2d_3
conv33

conv2d_4
conv44

conv2d_5
conv55

conv66

conv77

Albertotono · April 14, 2020, 6:26pm

This is a great topic.
I was implementing this with my own model as well (NOT with Mish, just the normal ResNet):

Sequential(
  (0): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace=True)
    (3): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (4): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (5): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (6): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (3): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (4): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (5): BasicBlock(
        (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (7): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (downsample): Sequential(
          (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
          (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        )
      )
      (1): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (2): BasicBlock(
        (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
  )
  (1): Sequential(
    (0): AdaptiveConcatPool2d(
      (ap): AdaptiveAvgPool2d(output_size=1)
      (mp): AdaptiveMaxPool2d(output_size=1)
    )
    (1): Flatten(full=False)
    (2): BatchNorm1d(1024, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (3): Dropout(p=0.25, inplace=False)
    (4): Linear(in_features=1024, out_features=512, bias=False)
    (5): ReLU(inplace=True)
    (6): BatchNorm1d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (7): Dropout(p=0.5, inplace=False)
    (8): Linear(in_features=512, out_features=4, bias=False)
  )
)

Any idea how to extract the heatmap?
Currently, I get this error:

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-37-9b38884f23e1> in <module>
----> 1 v() #First Conv2d layer

<ipython-input-36-6ca3928f607d> in v()
     12         return hook_a,hook_g
     13 
---> 14     hook_a,hook_g = hooked_backward()
     15     activations  = hook_a.stored[0].cpu()
     16     gradients = hook_g.stored[0][0].cpu()

<ipython-input-36-6ca3928f607d> in hooked_backward(cat)
      6 
      7     def hooked_backward(cat=y):
----> 8         with hook_output(m[5][0]) as hook_a:
      9             with hook_output(m[5][0], grad=True) as hook_g:
     10                 preds = m(xb)

/opt/conda/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in __getitem__(self, idx)
     68             return self.__class__(OrderedDict(list(self._modules.items())[idx]))
     69         else:
---> 70             return self._get_item_by_idx(self._modules.values(), idx)
     71 
     72     def __setitem__(self, idx, module):

/opt/conda/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in _get_item_by_idx(self, iterator, idx)
     59         idx = operator.index(idx)
     60         if not -size <= idx < size:
---> 61             raise IndexError('index {} is out of range'.format(idx))
     62         idx %= size
     63         return next(islice(iterator, idx, None))

IndexError: index 5 is out of range

The hook seems to be working fine

but not the v() to visualize the layers
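Judging from the printed model, the top-level `Sequential` has only two children (index 0 is the ResNet body, index 1 is the head), which would explain why `m[5]` is out of range. A minimal sketch with a toy model of the same layout, assuming `m` in the traceback is that top-level module:

```python
import torch.nn as nn

# Toy model (assumption) mirroring the printed layout: Sequential(body, head).
body = nn.Sequential(
    nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU(), nn.MaxPool2d(2),
    nn.Sequential(nn.Conv2d(8, 8, 3)),   # index 4
    nn.Sequential(nn.Conv2d(8, 16, 3)),  # index 5
)
head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 4))
m = nn.Sequential(body, head)

print(len(m))  # 2: only the body and the head are direct children

try:
    m[5]  # fails the same way as in the traceback
except IndexError as e:
    print("IndexError:", e)

target = m[0][5][0]  # go through the body first, then into group 5
print(target)
```

With this layout, the hook target would be `m[0][5][0]` rather than `m[5][0]`, or the body could be passed on its own (`m[0]`) to code that expects the flat ResNet.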

Albertotono · April 14, 2020, 7:01pm

Awesome, I changed the model to point to the Sequential part. Thank you so much for sharing this.

amritv · April 14, 2020, 7:09pm

Glad you got it working