I’ve been trying to replicate the model’s internals by hand. Softmax works perfectly, but I’m not quite getting the right results for convolutions.
First, I grab the inputs and outputs of a convolution using hooks:
class SaveInput():
    def __init__(self, m): self.hook = m.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output): self.inp = input
    def remove(self): self.hook.remove()
x = x[None,0]
sos = [SaveInput(o) for o in [m[-4]]]
py = m(Variable(x.cuda()))
for o in sos: o.remove()
inp = to_np(sos[0].inp[0])
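For reference, here is a self-contained sketch of the same hook pattern on a toy model, in current PyTorch on CPU (the `Variable` wrapper is no longer needed; the toy `Sequential` model and shapes below are just for illustration):

```python
import torch
import torch.nn as nn

class SaveInput:
    def __init__(self, m):
        self.hook = m.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output):
        self.inp = input   # tuple of input tensors to the module
        self.out = output  # module output, handy for comparing later
    def remove(self):
        self.hook.remove()

m = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU())
sv = SaveInput(m[0])              # hook the conv layer
y = m(torch.randn(1, 3, 8, 8))    # forward pass fires the hook
sv.remove()
print(sv.inp[0].shape)            # torch.Size([1, 3, 8, 8])
```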
Then I checked the convolution itself; it’s a standard layer:
m[-4]
Conv2d(512, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
Then I multiply a 3x3 patch of the input by the corresponding kernel weights and sum:
np.sum(inp[0,:,0:3,0:3]*to_np(m[-4].weight)[0,:,0:3,0:3])
-1.8281463
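As a sanity check, here is a self-contained version of that single-element comparison on a toy tensor in current PyTorch (the shapes and position are just illustrative). Note that with `padding=1`, output position `(1, 1)` corresponds to the unpadded input patch `[0:3, 0:3]`, and that `Conv2d` carries a learnable bias by default, so a full manual reconstruction has to add `conv.bias[co]` to the patch-times-weight sum:

```python
import torch

torch.manual_seed(0)
conv = torch.nn.Conv2d(4, 2, kernel_size=3, stride=1, padding=1)
x = torch.randn(1, 4, 8, 8)
out = conv(x)

co = 0  # output channel to check
# Patch [0:3, 0:3] of the input lines up with output position (1, 1)
manual = (x[0, :, 0:3, 0:3] * conv.weight[co]).sum() + conv.bias[co]
print(torch.allclose(manual, out[0, co, 1, 1], atol=1e-5))  # True
```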
But when I check the model’s actual output, the result is consistently off by a small amount (e.g. the model output for the value above is -1.83023). It looks like a rounding error, but I can’t work out where it would come from.
Any thoughts on why the results are different?