Learn how to use GradCAM in non-standard models

I am following this great kaggle notebook from iafoss in the PANDA competition.

I am trying to generate heatmaps for the image predictions using a Hook but, since the input is a batch of concatenated tile images, I am not able to get the proper dimensions.

I am using the first part of the CAM notebook from fastai2 (I know the PANDA notebook is on fastai1, but this part seems to work for both) in the following fashion:

#get a batch of images from the validation DataLoader (x0 ends up holding the last batch)
for j, i in enumerate(data.dl(DatasetType.Valid)):
    x0 = i[0]

#recreate hook function
class Hook():
    def hook_func(self, m, i, o): self.stored = o.detach().clone()

#add hook at the end of `enc` which is equivalent to [0]
hook_output = Hook()
hook = learn.model.enc.register_forward_hook(hook_output.hook_func)

#eval. Here x0 is unpacked for proper prediction
with torch.no_grad(): output = learn.model.eval()(*x0)

Here is where the problems start. In the CAM notebook the hook output shape is [1, 512, 7, 7] (the activations of the last resnet34 convolutional layer). However, in the PANDA notebook I get torch.Size([240, 2048, 4, 4]). I understand that [2048, 4, 4] is the output of the last convolutional layer of the resnext50 used, but where does 240 come from?

The notebook specifies that, after the encoder, the shape should be x: bs*N x C x 4 x 4 with bs = 32 and N = 12 (number of tiles per image), which should give [384, 2048, 4, 4], right? (I now see that it is related to the number of samples in the validation set; using DatasetType.Train gives the expected dimensions.)
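The encoder processes every tile as an independent batch item, so the first dimension of the hooked activations is bs*N; since 240 = 20 × 12, this particular batch holds 20 slides. Below is a quick sanity check that groups the activations back per slide (a sketch, assuming N = 12 tiles per slide as in the notebook):

#group the hooked activations back per slide (assumes N = 12 tiles per slide)
acts = hook_output.stored                            # e.g. torch.Size([240, 2048, 4, 4])
N = 12
bs = acts.shape[0] // N                              # 240 // 12 = 20 slides in this batch
acts_per_slide = acts.view(bs, N, *acts.shape[1:])   # bs x N x 2048 x 4 x 4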

In any case, if I just take the first image and follow the CAM notebook to get an activation map, I get an error:

act = hook_output.stored[0]

act.shape
#torch.Size([2048, 4, 4])

learn.model.head[-1].weight.shape
#torch.Size([3, 512]). In my case I am using a dataset with 3 different categories

cam_map = torch.einsum('ck,kij->cij', learn.model.head[-1].weight, act)

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-117-5a494fa5e8d9> in <module>
----> 1 cam_map = torch.einsum('ck,kij->cij', learn.model.head[-1].weight, act)

~/anaconda3/envs/fastai1/lib/python3.7/site-packages/torch/functional.py in einsum(equation, *operands)
    199         # the old interface of passing the operands as one list argument
    200         operands = operands[0]
--> 201     return torch._C._VariableFunctions.einsum(equation, operands)
    202 
    203 

RuntimeError: size of dimension does not match previous size, operand 1, dim 0
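For reference, the 'ck,kij->cij' equation contracts over k, so the second dimension of the weight matrix has to equal the channel dimension of the activation, which is not the case here (a quick check with the shapes above):

#the contracted k dimension must agree between both operands
print(learn.model.head[-1].weight.shape[1], act.shape[0])   # 512 vs 2048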

Any idea why I am getting this error?
Thanks!

You can find a great discussion with iafoss about the issue in the comments of the kaggle notebook. His feedback led me to get the CAMs for the last layer of the model.
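For anyone finding this later, below is a minimal sketch of how Grad-CAM can be computed on the encoder output, reusing the Hook class from above. The HookBwd helper and the per-tile weighting are an illustrative adaptation of the fastai CAM notebook, not the exact code from the kaggle comments:

#minimal Grad-CAM sketch on the encoder output (illustrative adaptation of the
#fastai CAM notebook; HookBwd is a hypothetical helper, not code from the comments)
import torch.nn.functional as F

class HookBwd():
    def hook_func(self, m, gi, go): self.stored = go[0].detach().clone()

hook_a, hook_g = Hook(), HookBwd()
h1 = learn.model.enc.register_forward_hook(hook_a.hook_func)
h2 = learn.model.enc.register_backward_hook(hook_g.hook_func)

learn.model.eval()
output = learn.model(*x0)        # forward pass with gradients enabled
cls = output[0].argmax()         # predicted class of the first slide in the batch
output[0, cls].backward()        # backprop the chosen class score

act  = hook_a.stored             # (bs*N) x 2048 x 4 x 4 activations
grad = hook_g.stored             # gradients w.r.t. those activations, same shape
w = grad.mean(dim=[2, 3], keepdim=True)   # one weight per channel and tile
cam = F.relu((w * act).sum(1))            # (bs*N) x 4 x 4, one heatmap per tile

h1.remove(); h2.remove()

Each 4x4 map belongs to one tile, so to visualize it you upsample the map to the tile size and overlay it on that tile (or stitch the tiles back into the slide layout). Given how the notebook stacks the tiles, the first N maps should correspond to the first slide in the batch.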