Learn.summary() - MaxPool2d image shape change not listed in Output Shape

Understanding the size of the image

From Chapter 14: I’m looking at the summary of the model produced from the following stem:

def _resnet_stem(*sizes):
    return [
        ConvLayer(sizes[i], sizes[i+1], 3, stride = 2 if i==0 else 1)
            for i in range(len(sizes)-1)
    ] + [nn.MaxPool2d(kernel_size=3, stride=2, padding=1)]


stem = _resnet_stem(3,32,32,64)

The summary starts with:

Layer (type)         Output Shape         Param #    Trainable 
                     128 x 32 x 112 x 112 
Conv2d                                    864        True      
BatchNorm2d                               64         True      
Conv2d                                    9216       True      
BatchNorm2d                               64         True      
                     128 x 64 x 112 x 112 
Conv2d                                    18432      True      
BatchNorm2d                               128        True      
Conv2d                                    4096       True      
BatchNorm2d                               128        True      
Conv2d                                    36864      True      
BatchNorm2d                               128        True      

I think each Output Shape row shows the shape output by the layer immediately below it, and is only printed when that shape differs from the previous layer’s output.

Looking at layer_info, it only tracks the shape for layers that have a weight attribute, so it doesn’t record the change in shape that MaxPool2d produces.
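For reference, here is a rough sketch of the shape I would expect the pooling layer to produce. It uses the standard output-size formula for convolution and pooling layers (dilation 1, floor rounding). The 224 pixel input size and the padding of 1 on each 3x3 conv are my assumptions, chosen to match the 112 x 112 shown in the summary; `out_size` is just an illustrative helper, not a fastai function.

```python
def out_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1, floor mode)."""
    return (n + 2 * padding - kernel_size) // stride + 1

# (kernel_size, stride, padding) for the three stem convs, then the MaxPool2d.
# Padding of 1 on the convs is an assumption (what a 3x3 "same"-style conv would use).
stem_layers = [(3, 2, 1), (3, 1, 1), (3, 1, 1), (3, 2, 1)]

size = 224  # assumed input height/width
sizes = []
for kernel_size, stride, padding in stem_layers:
    size = out_size(size, kernel_size, stride, padding)
    sizes.append(size)

print(sizes)  # [112, 112, 112, 56]
```

So under those assumptions the summary should show a 128 x 64 x 56 x 56 row after the pooling layer, which is the row that is missing.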

Is this expected behaviour of summary?

Summary is one of the more… complex… functions inside the fastai library. I spent a considerable amount of time in there refactoring things and it’s still hard for me to find my way around, but let me help clarify a few things:

For every single layer in the model, we check whether it has a weight, what its output shape is, and whether it’s trainable. This information is then fed into the little table we print. As to why the pooling layer isn’t showing up there, I’m not 100% sure.

Just tried an example with resnet, and the pooling layers don’t get their own output shape in that summary either, when ideally they should. I plan on looking into that.

I spent a bit of time poking around in it with a debugger; my thoughts are below. (Please bear in mind my diagnosis could easily be incorrect!) Hopefully in the near(ish) future I’ll be creating a PR instead, but I haven’t got my development workflow set up yet and still have to get a bit more familiar with notebooks.

At ### 1. in the code below, shape defaults to '' and is only set to an actual value at ### 2. and ### 3. Since MaxPool2d doesn’t have a weight attribute, its shape was never updated to the actual value. In the right column of the screenshot in the first message you can see the watches I had defined in layer_info; same was correctly being set to False based on the sizes of the activations.

Because of the above, at ### 4. the and not prev_sz == sz clause was evaluating to False (… I think; I didn’t screenshot that one). Also, sz was just an empty string, so it lost the actual numbers to print.

And maybe it would be a nice tidy-up to unpack the variables using the same names as in layer_info? Specifically, I found it slightly confusing that same is returned there but unpacked as chnged in the loop:

return (type(m).__name__, params, trainable, shape, same)

for typ,np,trn,sz,chnged in infos:

def layer_info(learn, *xb):
    "Return layer infos of `model` on `xb` (only support batch first inputs)"
    def _track(m, i, o):
        ### 1.
        params, trainable, shape = '', '', ''   
        same = any((x[0].shape[1:] == x[1].shape for x in zip(i, o)))
        ### 2.
        if hasattr(m, 'weight'): # non activation layer
            params, trainable = total_params(m)
            ### 3.
            shape = apply(lambda x: x.shape, o)
        return (type(m).__name__, params, trainable, shape, same)  

    with Hooks(flatten_model(learn.model), _track) as h:
        batch = apply(lambda o:o[:1], xb)
        train_only_cbs = [cb for cb in learn.cbs if hasattr(cb, '_only_train_loop')]
        with learn.removed_cbs(train_only_cbs), learn.no_logging(), learn as l:
            r = l.get_preds(dl=[batch], inner=True, reorder=False)
        return h.stored

    # (excerpt from the separate loop that builds the summary table from these infos)
    for typ,np,trn,sz,chnged in infos:
        if sz is None: continue
        if j == 0:
            res += f'\n{"":<20} {_print_shapes(sz, bs)[:19]:<20}' # to avoid a double line at the top
        ### 4.
        if not chnged and not prev_sz == sz and j > 0: res += "\n" + "_" * n + "\n" + f'{"":<20} {_print_shapes(sz, bs)[:19]:<20}'
        j = 1
        res += f"\n{typ:<20} {'':<20} {np:<10} {str(trn):<10}"
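To make the diagnosis concrete, here is a stripped-down sketch of just the shape bookkeeping, with no fastai or PyTorch involved. The `track` function and the hard-coded shapes are hypothetical stand-ins for `_track` and real activations, and the `always_record` flag is only my guess at one possible fix, not what the library does.

```python
def track(name, has_weight, in_shape, out_shape, always_record=False):
    """Mimic the shape handling in _track for one layer (shapes exclude the batch dim)."""
    shape = ''                        # "### 1.": shape defaults to an empty string
    same = in_shape == out_shape      # did the layer leave the shape unchanged?
    if has_weight or always_record:   # "### 2.": the original only checks for a weight
        shape = out_shape             # "### 3.": the only place shape gets a real value
    return name, shape, same

# (name, has_weight, input shape, output shape), using the stem shapes from the summary
layers = [
    ("Conv2d",    True,  (64, 112, 112), (64, 112, 112)),
    ("MaxPool2d", False, (64, 112, 112), (64, 56, 56)),
]

current = [track(*layer) for layer in layers]
patched = [track(*layer, always_record=True) for layer in layers]

print(current[1])  # ('MaxPool2d', '', False)
print(patched[1])  # ('MaxPool2d', (64, 56, 56), False)
```

With the current logic the pooling layer reports that the shape changed (same is False) but carries no shape to print. Recording the shape for every layer keeps that information available to the table-building loop, although the loop would probably still need its own changes to print it.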

Your assumptions are all correct here. I’m actually planning on refactoring the entire thing and simplifying it ideally this weekend, so that should help with that.

Will also put that on the todo :slight_smile:

I’ve opened an issue here, again ideally I can tackle this over the weekend: https://github.com/fastai/fastai/issues/3219

If you have any more thoughts or come up with anything definitely add it on to that discussion please @bgraysea! :smiley:

And of course if you figure out a working solution that hits all those points, absolutely feel free to put in your own PR :slight_smile:
