Actually, this really does look like a mistake. I tried to verify your assumptions in practice.
You are right that sfs_idxs == [6,5,4,2] and that the slice sfs_idxs[-2:-4:-1] == [4,5].
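For anyone who wants to double-check the slicing itself, it can be reproduced in a plain Python session with nothing but the list above:

```python
# sanity check of the slice, independent of the notebook
sfs_idxs = [6, 5, 4, 2]
print(sfs_idxs[-2:-4:-1])   # [4, 5]
```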
I added some debug output to the forward method of LateralUpsampleMerge:
```python
def forward(self, x):
    conv_lat_hook = self.conv_lat(self.hook.stored)
    print("conv_lat_hook.shape:", conv_lat_hook.shape, "+ x.shape:", x.shape)
    return conv_lat_hook + F.interpolate(x, self.hook.stored.shape[-2:], mode='nearest')
```
And when I run learn.summary() for a learner built on 256x256 images, these are the first lines of the output:
```
conv_lat_hook.shape: torch.Size([1, 256, 64, 64]) + x.shape: torch.Size([1, 256, 8, 8])
conv_lat_hook.shape: torch.Size([1, 256, 32, 32]) + x.shape: torch.Size([1, 256, 64, 64])
```
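Note that nothing crashes here even though the scales are clearly out of order, because F.interpolate resizes x to the spatial size of the stored hook before the addition. A minimal, self-contained illustration using the shapes from the first printed line (the tensor values are random, only the shapes matter):

```python
import torch
import torch.nn.functional as F

conv_lat_hook = torch.randn(1, 256, 64, 64)   # lateral feature taken from the hook
x = torch.randn(1, 256, 8, 8)                 # incoming top-down feature

# x is upsampled 8x in a single jump, so the addition succeeds despite the wrong ordering
out = conv_lat_hook + F.interpolate(x, conv_lat_hook.shape[-2:], mode='nearest')
print(out.shape)   # torch.Size([1, 256, 64, 64])
```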
I tried changing the slice from [-2:-4:-1] to [0:2:1] and got this:
```
conv_lat_hook.shape: torch.Size([1, 256, 16, 16]) + x.shape: torch.Size([1, 256, 8, 8])
conv_lat_hook.shape: torch.Size([1, 256, 32, 32]) + x.shape: torch.Size([1, 256, 16, 16])
```
That looks better: the merges now go 8x8 → 16x16 → 32x32, doubling the resolution at each step. It seems the author of the code forgot that the list of encoder layers that change the image size is already reversed in sfs_idxs.
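To spell out why the two slices behave so differently, here is a small sketch. The mapping from layer index to feature-map size (16x16 for layer 6, 32x32 for layer 5, 64x64 for layer 4, with 256x256 inputs) is only inferred from the debug output above, so treat it as an assumption rather than something checked against the model directly:

```python
sfs_idxs = [6, 5, 4, 2]                              # already reversed: deepest layer first
hook_size = {6: (16, 16), 5: (32, 32), 4: (64, 64)}  # inferred from the printed shapes above

for name, sl in [("[-2:-4:-1]", slice(-2, -4, -1)), ("[0:2:1]", slice(0, 2, 1))]:
    idxs = sfs_idxs[sl]
    print(f"{name:>11} -> {idxs} -> {[hook_size[i] for i in idxs]}")

# [-2:-4:-1] -> [4, 5] -> [(64, 64), (32, 32)]   jumps straight to 64x64, then back down
#    [0:2:1] -> [6, 5] -> [(16, 16), (32, 32)]   coarse to fine, one doubling per merge
```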
Looking forward to hearing from the authors.
Maybe we are wrong and that “mistake” was made on purpose and gives better results.
I can’t check that myself, since I still can’t get the notebook to run: Having problems running pascal.ipynb notebook