Hello,
I’m trying to modify the ‘Using fastai on a custom new task’ tutorial to suit my problem. In the tutorial, 2 images are given as input, but I would like to pass a tensor of shape ‘torch.Size([18, 3])’ instead.
My implementations
I’ve created a custom fastuple and Transform, as described at this point in the tutorial, and both appear to work as expected.
Joints - My equivalent of the SiameseImage class contains the aforementioned 18x3 tensor and its label.
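from fastai.vision.all import *  # provides fastuple, Transform, L, show_image, torch

class Joints(fastuple):
    def show(self, ctx=None, **kwargs):
        joints, label_idx = self
        # The label may arrive as a 0-dim tensor; convert it to a plain int
        if isinstance(label_idx, Tensor):
            label_idx = label_idx.item()
        label = body_labels.vocab[label_idx]
        tensor_img = get_joints_img_tensor(joints)
        return show_image(tensor_img, title=f'label: {label}', ctx=ctx, **kwargs)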
JointsTransform - My equivalent of the SiameseTransform class returns a Joints object when encodes() is called.
class JointsTransform(Transform):
    def __init__(self, bodies, body_labels):
        self.joints = L([body.get('joints') for body in bodies])
        self.labels = L([body.get('class') for body in bodies])
        self.body_labels = body_labels

    def encodes(self, body):
        # Get joints and label
        joints = body.get('joints')
        label = body.get('class')
        label_idx = self.body_labels.l2i[label]
        # Transform joints and label to tensors
        joints_tensor = joints_to_tensor(joints)
        label_tensor = torch.tensor(label_idx)
        return Joints(joints_tensor, label_tensor)
Both of the above classes seem to function as expected, and I instantiate a DataLoaders object as follows:
splits = RandomSplitter()(bodies)
tfm = JointsTransform(bodies, body_labels)
tls = TfmdLists(bodies, tfm, splits=splits)
dls = tls.dataloaders(after_item=[ToTensor],
                      after_batch=[IntToFloatTensor,
                                   Normalize.from_stats(*imagenet_stats)],
                      bs=16)
Problem
But now I am stuck on using this data with a learner. If I continue following the tutorial and create a custom model, the number of elements is mismatched:
class JointsModel(Module):
    def __init__(self, encoder, head):
        self.encoder, self.head = encoder, head

    def forward(self, x1):
        return self.head(x1)

joints_per_body = 18
encoder = create_body(resnet34, cut=-2)
head = create_head(joints_per_body*3, n_out=6)
model = JointsModel(encoder, head)

def joints_splitter(model):
    return [params(model.encoder), params(model.head)]

learn = Learner(dls,
                model,
                loss_func=CrossEntropyLossFlat(),
                splitter=joints_splitter,
                metrics=accuracy)
learn.freeze()
results in
RuntimeError: running_mean should contain 2 elements not 54
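The 2 in this message is consistent with the pooling at the start of the head collapsing the [16, 18, 3] batch, though that is an assumption about how create_head is built rather than something confirmed here. A minimal plain-PyTorch sketch of the effect, assuming the head begins with concatenated adaptive max and average pooling followed by a flatten:

```python
import torch
from torch import nn

batch = torch.randn(16, 18, 3)  # bs=16, 18 joints, 3 coordinates

# Assumption: the head starts by concatenating adaptive max and average pooling.
# A 3-D input is treated as one unbatched (C, H, W) image, so the batch
# dimension of 16 is read as 16 channels.
max_pooled = nn.AdaptiveMaxPool2d(1)(batch)  # shape [16, 1, 1]
avg_pooled = nn.AdaptiveAvgPool2d(1)(batch)  # shape [16, 1, 1]

# Concatenating along dim 1 and flattening leaves only 2 features per row
pooled = torch.cat([max_pooled, avg_pooled], dim=1)  # shape [16, 2, 1]
flat = nn.Flatten()(pooled)                          # shape [16, 2]
print(flat.shape)
```

If that reading is right, a BatchNorm1d layer built for 54 features would then receive only 2, which matches the numbers in the error.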
I note, however, that the custom model seems unnecessary, so I tried the cnn_learner() function instead:
learn = cnn_learner(dls,
                    resnet34,
                    metrics=error_rate,
                    n_out=6,
                    loss_func=CrossEntropyLossFlat())
And now learn.freeze() results in
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [64, 3, 7, 7], but got 3-dimensional input of size [16, 18, 3] instead
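The weight shape [64, 3, 7, 7] belongs to the first convolution of resnet34, which expects image batches of shape [batch, 3, height, width], not [16, 18, 3]. A minimal sketch of one alternative, assuming the joints can be treated as a flat vector of 54 features rather than as an image. JointsMLP and its hidden size are hypothetical and not part of fastai or the tutorial:

```python
import torch
from torch import nn

class JointsMLP(nn.Module):
    """Hypothetical fully connected classifier for [batch, 18, 3] joint tensors."""
    def __init__(self, n_joints=18, n_coords=3, n_out=6, hidden=128):
        super().__init__()
        n_in = n_joints * n_coords  # 18 * 3 = 54 input features
        self.net = nn.Sequential(
            nn.Flatten(),            # [batch, 18, 3] -> [batch, 54]
            nn.BatchNorm1d(n_in),
            nn.Linear(n_in, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_out),
        )

    def forward(self, x):
        return self.net(x)

model = JointsMLP()
batch = torch.randn(16, 18, 3)  # same shape as the batches in the error above
logits = model(batch)
print(logits.shape)
```

A model like this could be passed to Learner in place of JointsModel. IntToFloatTensor and Normalize.from_stats(*imagenet_stats) are image-oriented, so they would probably need to be dropped from the DataLoaders as well.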
I have looked around a lot but have not been able to find out what I did wrong. I hope someone can help and point me in the right direction.
I tried to make the post as concise as possible, but if more information, code, or error logs would be helpful, I will gladly provide them.
Thanks for reading and kind regards
Stijn