Replicating a convolution

I’ve been trying to replicate the internals of the model by hand. Softmax works perfectly, but I’m not quite getting the right results for convolutions.

First I grab the input to the convolution using a forward hook:

class SaveInput:
    def __init__(self, m): self.hook = m.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output): self.inp = input  # input arrives as a tuple
    def remove(self): self.hook.remove()

x = x[None,0]                           # first image, keeping a batch dimension
sos = [SaveInput(o) for o in [m[-4]]]   # hook the conv layer
py = m(Variable(x.cuda()))              # forward pass fires the hook
for o in sos: o.remove()
inp = to_np(sos[0].inp[0])              # unwrap the tuple and convert to numpy
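To have something to compare against, the same hook pattern can also stash the layer’s own output; a minimal variant (the name is made up) would be:

class SaveFeatures:
    def __init__(self, m): self.hook = m.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output):
        self.inp, self.out = input, output   # input is a tuple, output a single tensor
    def remove(self): self.hook.remove()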

Then I check the convolution itself; it’s pretty straightforward:

m[-4]

Conv2d(512, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))

Then I multiply the first 3×3 input patch by the weights for output channel 0 and sum:

np.sum(inp[0,:,0:3,0:3]*to_np(m[-4].weight)[0,:,0:3,0:3])

-1.8281463

But when I check the corresponding output activation, the result is consistently off by a small amount (e.g. the model output for the value above is -1.83023). It looks like a rounding error, but I can’t work out where it would come from.

Any thoughts on why the results are different?
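Here’s a self-contained sketch of the comparison with random data (shapes mirror the layer above; note that with padding=1 and stride 1, the unpadded patch x[:, :, 0:3, 0:3] lines up with output position (1, 1), not (0, 0)):

import torch
import torch.nn as nn

# Rebuild the same shape of conv on random data and compare a
# hand-computed patch sum against the layer output at the matching spot.
conv = nn.Conv2d(512, 2, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 512, 7, 7)
y = conv(x)
manual = (x[0, :, 0:3, 0:3] * conv.weight[0]).sum()
print(manual.item(), y[0, 0, 1, 1].item())  # close, but not equal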


I worked it out in the end: it turned out the convolution has a bias term. The correct formula is:

np.sum(inp[0,:,0:3,0:3]*to_np(m[-4].weight)[0,:,0:3,0:3]) + to_np(m[-4].bias)[0]  # bias for output channel 0
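The same self-contained check as above, with the bias included, now matches (again a sketch on random data, not the actual activations):

import torch
import torch.nn as nn

# Hand-computed patch sum plus the channel-0 bias agrees with the
# conv output up to floating-point rounding.
conv = nn.Conv2d(512, 2, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 512, 7, 7)
y = conv(x)
manual = (x[0, :, 0:3, 0:3] * conv.weight[0]).sum() + conv.bias[0]
print(torch.allclose(manual, y[0, 0, 1, 1], atol=1e-5))  # True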
