24_tutorial.siamese - Mismatched number of elements

nobistijnb · October 6, 2020, 1:37pm

Hello,

I’m trying to modify the ‘Using fastai on a custom new task’ tutorial to suit my problem. In the tutorial, 2 images are given as input, but I would like to pass a tensor of shape ‘torch.Size([18, 3])’ instead.

My implementations

I’ve created a custom fastuple and a Transform as described in the tutorial, and both appear to work as expected.

Joints - My equivalent of the SiameseImage class contains the aforementioned 18x3 tensor and its label.

class Joints(fastuple):
    def show(self, ctx=None, **kwargs):
        joints, label_idx = self
        if isinstance(label_idx, Tensor):
            label_idx = label_idx.item()
        label = body_labels.vocab[label_idx]
        tensor_img = get_joints_img_tensor(joints)
        return show_image(tensor_img, title=f'label: {label}', ctx=ctx, **kwargs)

JointsTransform - My equivalent of the SiameseTransform class returns a Joints object when encodes() is called.

class JointsTransform(Transform):
    def __init__(self, bodies, body_labels):
        self.joints = L([body.get('joints') for body in bodies])
        self.labels = L([body.get('class') for body in bodies])
        
        self.body_labels = body_labels
    
    def encodes(self, body):
        # Get joints and label
        joints = body.get('joints')
        label = body.get('class')
        label_idx = self.body_labels.l2i[label]
        
        # Transform joints and label to tensors
        joints_tensor = joints_to_tensor(joints)
        label_tensor = torch.tensor(label_idx)
        
        return Joints(joints_tensor, label_tensor)
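
To make the data flow concrete, here is a minimal sketch of what encodes() produces for one body. The helper joints_to_tensor is not shown in this post, so joints_to_tensor_sketch below is a hypothetical stand-in that assumes the joints are stored as a list of 18 triples; the label names and the plain dict used in place of body_labels.l2i are also made up for illustration.

```python
import torch

def joints_to_tensor_sketch(joints):
    # Hypothetical stand-in for joints_to_tensor: assumes a list of 18 [a, b, c] triples.
    return torch.tensor(joints, dtype=torch.float32)

# Made-up example body in the dict layout used above ('joints' and 'class' keys).
body = {'joints': [[0.0, 0.0, 0.0] for _ in range(18)], 'class': 'standing'}

# Made-up vocabulary; a plain dict plays the role of body_labels.l2i here.
vocab = ['sitting', 'standing']
label2idx = {label: i for i, label in enumerate(vocab)}

joints_tensor = joints_to_tensor_sketch(body['joints'])
label_tensor = torch.tensor(label2idx[body['class']])

print(joints_tensor.shape)  # torch.Size([18, 3])
print(label_tensor.shape)   # torch.Size([]), a 0-dim tensor holding the class index
```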

Both of the above classes seem to function as expected and I instantiate a DataLoaders object as follows:

splits = RandomSplitter()(bodies)
tfm = JointsTransform(bodies, body_labels)
tls = TfmdLists(bodies, tfm, splits=splits)
dls = tls.dataloaders(after_item=[ToTensor], 
                      after_batch=[IntToFloatTensor, 
                                   Normalize.from_stats(*imagenet_stats)],
                      bs=16)

Problem

But now I am stuck on using this data with a Learner. If I continue following the tutorial and create a custom model, the number of elements is mismatched:

class JointsModel(Module):
    def __init__(self, encoder, head):
        self.encoder, self.head = encoder, head
        
    def forward(self, x1):
        return self.head(x1)

joints_per_body = 18

encoder = create_body(resnet34, cut=-2)
head = create_head(joints_per_body*3, n_out=6)
model = JointsModel(encoder, head)

def joints_splitter(model):
    return [params(model.encoder), params(model.head)]

learn = Learner(dls,
                model,
                loss_func=CrossEntropyLossFlat(),
                splitter=joints_splitter,
                metrics=accuracy)

learn.freeze() results in

RuntimeError: running_mean should contain 2 elements not 54
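
One possible reading of this message, sketched in plain PyTorch rather than fastai: assuming the head built by create_head starts with a concat pooling layer, a flatten, and a BatchNorm1d over the 54 requested features (an assumption about the fastai version in use, not something verified here), a [16, 18, 3] batch would be pooled down to 2 values per item before it reaches the batch norm.

```python
import torch
import torch.nn as nn

batch = torch.randn(16, 18, 3)  # same shape as one batch of joints

# Assumed first layers of the head: max pool and average pool, concatenated on dim 1.
# A 3-D input is read as (channels, height, width), so each pool returns [16, 1, 1].
pooled = torch.cat([nn.AdaptiveMaxPool2d(1)(batch),
                    nn.AdaptiveAvgPool2d(1)(batch)], dim=1)  # [16, 2, 1]
flat = pooled.view(pooled.size(0), -1)                       # [16, 2]

# The batch norm was sized for 18 * 3 = 54 features but receives only 2.
bn = nn.BatchNorm1d(18 * 3)
try:
    bn(flat)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)  # exact wording may differ between PyTorch versions
```

If that reading is right, the 2 in the message comes from the two pooled values and the 54 from joints_per_body*3.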

However, the custom model seems unnecessary, so I also tried the cnn_learner() function:

learn = cnn_learner(dls, 
                    resnet34, 
                    metrics=error_rate, 
                    n_out=6, 
                    loss_func=CrossEntropyLossFlat())

And now learn.freeze() results in

RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [16, 18, 3] instead
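
The weight shape [64, 3, 7, 7] matches a first convolution with 3 input channels, 64 output channels and a 7x7 kernel, which expects image batches of shape [batch, 3, height, width]. The snippet below is a minimal plain-PyTorch sketch of that mismatch; the stride and padding values are assumptions chosen to resemble a ResNet stem, not taken from the fastai model itself.

```python
import torch
import torch.nn as nn

# Stand-in for the first convolution: 3 input channels, 64 filters, 7x7 kernel.
conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3, bias=False)

# A batch of images has 4 dimensions and passes through.
images = torch.randn(16, 3, 64, 64)
out = conv(images)
print(out.shape)  # torch.Size([16, 64, 32, 32])

# A batch of joints has only 3 dimensions and is rejected.
joints = torch.randn(16, 18, 3)
try:
    conv(joints)
    raised = False
except RuntimeError as err:
    raised = True
    print(err)  # exact wording depends on the PyTorch version
```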

I have looked around a lot but have not been able to find out what I did wrong. I hope someone can help and point me in the right direction.

I tried to make the post as concise as possible, but if more information, code, or error logs would be helpful, I will gladly provide them.

Thanks for reading and kind regards
Stijn