How does the create_body function work?

marceauclavel · June 11, 2019, 12:46pm

Hello fastai fellows!
I am trying to create a custom CNN for a regression task.
For the moment, I have created an architecture similar to the OpenFace one:

netOpenFace(
  (layer1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3))
  (layer2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layer3): ReLU()
  (layer4): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=1, ceil_mode=False)
  (layer5): LocalResponseNorm(5, alpha=0.0001, beta=0.75, k=1.0)
  (layer6): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
  (layer7): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layer8): ReLU()
  (layer9): Conv2d(64, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (layer10): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layer11): ReLU()
  (layer12): LocalResponseNorm(5, alpha=0.0001, beta=0.75, k=1.0)
  (layer13): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), dilation=1, ceil_mode=False)
  (layer14): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(192, 96, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(96, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_conv): Conv2d(192, 16, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(16, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(16, 32, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
        (5_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (2): Sequential(
        (1_pool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(0, 0), dilation=1, ceil_mode=False)
        (2_conv): Conv2d(192, 32, kernel_size=(1, 1), stride=(1, 1))
        (3_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4_relu): ReLU()
      )
      (3): Sequential(
        (1_conv): Conv2d(192, 64, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
      )
    )
  )
  (layer15): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(256, 96, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(96, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_conv): Conv2d(256, 32, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
        (5_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (2): Sequential(
        (1_pool): LPPool2d(norm_type=2, kernel_size=(3, 3), stride=(3, 3), ceil_mode=False)
        (2_conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
        (3_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4_relu): ReLU()
      )
      (3): Sequential(
        (1_conv): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
      )
    )
  )
  (layer16): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(320, 128, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (5_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_conv): Conv2d(320, 32, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(32, 64, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
        (5_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (2): Sequential(
        (1_pool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(0, 0), dilation=1, ceil_mode=False)
      )
    )
  )
  (layer17): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(640, 96, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(96, 192, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5_bn): BatchNorm2d(192, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_conv): Conv2d(640, 32, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(32, 64, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
        (5_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (2): Sequential(
        (1_pool): LPPool2d(norm_type=2, kernel_size=(3, 3), stride=(3, 3), ceil_mode=False)
        (2_conv): Conv2d(640, 128, kernel_size=(1, 1), stride=(1, 1))
        (3_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4_relu): ReLU()
      )
      (3): Sequential(
        (1_conv): Conv2d(640, 256, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
      )
    )
  )
  (layer18): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(640, 160, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(160, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(160, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1))
        (5_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_conv): Conv2d(640, 64, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(64, 128, kernel_size=(5, 5), stride=(2, 2), padding=(2, 2))
        (5_bn): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (2): Sequential(
        (1_pool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(0, 0), dilation=1, ceil_mode=False)
      )
    )
  )
  (layer19): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(1024, 96, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(96, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5_bn): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_pool): LPPool2d(norm_type=2, kernel_size=(3, 3), stride=(3, 3), ceil_mode=False)
        (2_conv): Conv2d(1024, 96, kernel_size=(1, 1), stride=(1, 1))
        (3_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4_relu): ReLU()
      )
      (2): Sequential(
        (1_conv): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
      )
    )
  )
  (layer21): Inception(
    (seq_list): ModuleList(
      (0): Sequential(
        (1_conv): Conv2d(736, 96, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
        (4_conv): Conv2d(96, 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (5_bn): BatchNorm2d(384, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (6_relu): ReLU()
      )
      (1): Sequential(
        (1_pool): MaxPool2d(kernel_size=(3, 3), stride=(2, 2), padding=(0, 0), dilation=1, ceil_mode=False)
        (2_conv): Conv2d(736, 96, kernel_size=(1, 1), stride=(1, 1))
        (3_bn): BatchNorm2d(96, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (4_relu): ReLU()
      )
      (2): Sequential(
        (1_conv): Conv2d(736, 256, kernel_size=(1, 1), stride=(1, 1))
        (2_bn): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (3_relu): ReLU()
      )
    )
  )
  (layer22): AvgPool2d(kernel_size=(3, 3), stride=(1, 1), padding=(0, 0))
  (layer25): Linear(in_features=736, out_features=128, bias=True)
  (resize1): UpsamplingNearest2d(scale_factor=3, mode=nearest)
  (resize2): AvgPool2d(kernel_size=4, stride=4, padding=0)

Then I pass this arch in cnn_learner to create my model:

cnn_learner(data, arch, metrics=[metric(i) for i in range(8)])

cnn_learner then tries to create a body from my custom arch, as follows (source code):

def create_body(arch:Callable, pretrained:bool=True, cut:Optional[Union[int, Callable]]=None):
    "Cut off the body of a typically pretrained `model` at `cut` (int) or cut the model as specified by `cut(model)` (function)."
    model = arch(pretrained)
    cut = ifnone(cut, cnn_config(arch)['cut'])
    if cut is None:
        ll = list(enumerate(model.children()))
        cut = next(i for i,o in reversed(ll) if has_pool_type(o))
    if   isinstance(cut, int):      return nn.Sequential(*list(model.children())[:cut])
    elif isinstance(cut, Callable): return cut(model)
    else:                           raise NamedError("cut must be either integer or a function")

But when the line model = arch(pretrained) runs, I get the exception AttributeError: 'bool' object has no attribute 'size'.
In fact, my architecture is called with pretrained (a boolean) as its input, whereas it expects a tensor as input…
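
The failure can be reproduced without fastai or PyTorch. The sketch below uses a hypothetical stand-in class, TinyNet, that mimics only one aspect of an nn.Module: calling the instance runs forward. Passing an instance as arch means arch(pretrained) becomes a forward pass on a boolean.

```python
class TinyNet:
    """Hypothetical stand-in for an nn.Module: calling the instance runs forward()."""

    def forward(self, x):
        # A real layer would inspect the input tensor here; a bool has no size().
        return x.size()

    def __call__(self, x):
        return self.forward(x)


arch = TinyNet()   # an instance, not a function that builds one
pretrained = True

try:
    model = arch(pretrained)   # the same call create_body makes
    error_message = None
except AttributeError as err:
    error_message = str(err)

print(error_message)
```

The message printed is the same AttributeError text as in the traceback above, which suggests the boolean is reaching the first layer of the network.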

Maybe someone can help me understand this buried part of the fastai library
Thank you

muellerzr · June 11, 2019, 1:00pm

Try passing it in as a function, e.g. mymodel(), since create_body needs something it can call to build the model. What create_body then does is cut the model off at whatever index you specify with cut. You can try this manually by running each step of the source code yourself.
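
Running the steps by hand could look like the sketch below. It is a simplified illustration, not fastai code: Layer and has_pool are hypothetical stand-ins for real modules and for fastai's has_pool_type, and the list of layer names is made up.

```python
class Layer:
    """Hypothetical stand-in for a child module; only its type name matters here."""

    def __init__(self, kind):
        self.kind = kind


def has_pool(layer):
    # Simplified stand-in for fastai's has_pool_type: match on the type name only.
    return "Pool" in layer.kind


# Stand-in for list(model.children())
children = [Layer(k) for k in
            ["Conv2d", "BatchNorm2d", "ReLU", "MaxPool2d",
             "Inception", "AvgPool2d", "Linear"]]

# Same steps as the `cut is None` branch of create_body:
indexed = list(enumerate(children))
cut = next(i for i, layer in reversed(indexed) if has_pool(layer))

# Same slicing as the `isinstance(cut, int)` branch: keep everything before `cut`.
body = children[:cut]

print(cut, [layer.kind for layer in body])
```

With this made-up list, the last pooling layer sits at index 5, so the body keeps the five layers before it and drops the pooling layer and the final linear layer.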

marceauclavel · June 11, 2019, 1:28pm

Ok, thank you,
I hadn't understood that arch cannot be a model instance.
The arch object must be a function that takes a boolean parameter (pretrained) and returns an instance of a class inheriting from nn.Module.
arch is not just an nn.Module object.
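
A minimal sketch of that contract, using a placeholder class instead of a real nn.Module so it runs on its own. The names NetOpenFace and net_openface are hypothetical, and the real class would subclass nn.Module and define the layers listed above.

```python
class NetOpenFace:
    """Placeholder for the custom network; the real one would subclass nn.Module."""

    def __init__(self):
        self.layers = []   # the real layers would be built here


def net_openface(pretrained=False):
    """Factory with the shape create_body expects: bool in, model instance out."""
    if pretrained:
        # No pretrained weights exist for this custom network (assumption).
        raise NotImplementedError("Pretrained weights are not available")
    return NetOpenFace()


# This mirrors `model = arch(pretrained)` inside create_body.
arch = net_openface
model = arch(False)
print(type(model).__name__)
```

With fastai, the function itself (not an instance) would then be passed as the architecture, presumably together with pretrained=False, since cnn_learner defaults to pretrained=True.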

Let’s face the next exception

KevinB · June 26, 2019, 6:55pm

For me, the key was to wrap the model I had created in a def, because fastai calls the architecture with True or False to say whether it should be pretrained. So I just had to do something like this:

def superres(pretrained=False, **kwargs):
    """
    Creating an architecture for super resolution as defined in this paper: http://arxiv.org/abs/1603.08155
    Supporting Material: https://cs.stanford.edu/people/jcjohns/papers/fast-style/fast-style-supp.pdf
    
    This is a modification of the x4 architecture because this is designed to go from 25 px to 100 px 
    instead of 72 px to 288 px which is what is done in the paper
    """
    model = nn.Sequential(
        nn.Conv2d(3, 64, 5, padding=2),
        res_block(64),
        res_block(64),
        res_block(64),
        res_block(64),
        nn.ConvTranspose2d(64, 64, 3, stride=2, padding=1, output_padding=1),
        nn.ConvTranspose2d(64, 64, 3, stride=2, padding=1, output_padding=1), 
        nn.Conv2d(64, 3, 5, padding=2)
    )
    if pretrained:
        # Leaving this structure in place because this is where the pretrained weights would be loaded
        raise NotImplementedError("Pretrained not currently available")
    return model

I found this pattern when I checked what the existing models looked like:

Signature: models.resnet18(pretrained=False, **kwargs)
Source:   
def resnet18(pretrained=False, **kwargs):
    """Constructs a ResNet-18 model.

    Args:
        pretrained (bool): If True, returns a model pre-trained on ImageNet
    """
    model = ResNet(BasicBlock, [2, 2, 2, 2], **kwargs)
    if pretrained:
        model.load_state_dict(model_zoo.load_url(model_urls['resnet18']))
    return model
File:      ~/anaconda3/envs/fastai/lib/python3.7/site-packages/torchvision/models/resnet.py
Type:      function