A single batch in dls is a tuple of 3 tensors:
- A batch of image 1.
- A batch of image 2.
- A batch of boolean labels.
SiameseModel is specified as follows:
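from fastai.vision.all import *

# Module here is fastai's Module, so no explicit super().__init__() call is needed.
class SiameseModel(Module):
    def __init__(self, encoder, head):
        self.encoder, self.head = encoder, head

    def forward(self, x1, x2):
        # Encode both images with the shared encoder, then concatenate the features.
        ftrs = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
        return self.head(ftrs)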
learn is specified in the usual way:
learn = Learner(dls, model, loss_func=loss_func, splitter=siamese_splitter, metrics=accuracy)
How does Learner know that the first two items in the tuple are the inputs? Does it always consider the last item in the tuple as the target?
If so, then what happens if there are multiple targets (e.g., an object’s class + location), but only one input (e.g., an image)?
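
To make the two cases concrete, here is a minimal pure-Python sketch of the convention being asked about. The `split_batch` helper is hypothetical and is not copied from the library. The `n_inp` name mirrors an attribute that fastai data loaders expose, but the default used here (everything except the last item is an input) is an assumption about the behaviour, and strings stand in for tensors.

```python
def split_batch(batch, n_inp=None):
    """Split a batch tuple into (inputs, targets).

    Hypothetical helper: when n_inp is not given, assume every item
    except the last one is an input.
    """
    if n_inp is None:
        n_inp = 1 if len(batch) == 1 else len(batch) - 1
    return batch[:n_inp], batch[n_inp:]


# Siamese case: two image batches and one label batch.
siamese_batch = ("imgs1", "imgs2", "labels")
xb, yb = split_batch(siamese_batch)
print(xb, yb)

# One image input with two targets: the default guess would treat
# "classes" as an input, so the number of inputs is stated explicitly.
detection_batch = ("imgs", "classes", "boxes")
xb, yb = split_batch(detection_batch, n_inp=1)
print(xb, yb)
```

If the split does work this way, the model would presumably be called as `model(*xb)` and the loss as `loss_func(pred, *yb)`, so the multiple-target case would need a loss function that accepts both targets.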