Chapter 15: `SiameseModel` - How does `Learner` distinguish inputs from target?

A single batch in dls is a tuple of 3 tensors:

  1. A batch of image 1.
  2. A batch of image 2.
  3. A batch of boolean labels.

SiameseModel is specified as follows:

class SiameseModel(Module):
    def __init__(self, encoder, head):
        self.encoder, self.head = encoder, head
    
    def forward(self, x1, x2):
        ftrs = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
        return self.head(ftrs)

learn is specified in the usual way:

learn = Learner(dls, model, loss_func=loss_func, splitter=siamese_splitter, metrics=accuracy)

How does Learner know that the first two items in the tuple are the inputs? Does it always consider the last item in the tuple as the target?

If so, then what happens if there are multiple targets (e.g., an object’s class + location), but only one input (e.g., an image)?

When the data was created, it was specified which items of the tuple are the inputs and which one is the target:

def get_x(t): return t[:2]
def get_y(t): return t[2]
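To see what these two functions do, here is a small sketch applied to a single item. The file names and the label are made up for illustration; they are not from the book's dataset.

```python
def get_x(t): return t[:2]
def get_y(t): return t[2]

# Hypothetical item: two image file names plus a "same class?" flag.
item = ("cat_1.jpg", "cat_2.jpg", True)

inputs = get_x(item)  # the first two elements, still a tuple
target = get_y(item)  # the third element on its own
print(inputs, target)
```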

Besides that, when you create a Datasets (https://docs.fast.ai/data.core.html#Datasets) there is an option n_inp that sets the number of inputs: the first n_inp items of each tuple are treated as inputs and the rest as targets.
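As far as I understand it, Learner then splits every batch at that index, passes the inputs to the model and the targets to the loss function. Here is a minimal plain-Python sketch of that idea. Note that split_batch is a hypothetical helper written for this post, not a fastai function, the strings are placeholders standing in for tensors, and the fallback of "everything except the last item is an input" is my assumption about the default.

```python
def split_batch(batch, n_inp=None):
    """Split a batch tuple into (inputs, targets) at index n_inp."""
    if n_inp is None:
        # Assumed default: all items except the last one are inputs.
        n_inp = 1 if len(batch) == 1 else len(batch) - 1
    return batch[:n_inp], batch[n_inp:]

# Siamese case: two inputs, one target.
siamese_batch = ("imgs1", "imgs2", "labels")
xb, yb = split_batch(siamese_batch, n_inp=2)
print(xb, yb)

# One input, two targets (e.g. class + location): set n_inp=1.
detect_batch = ("imgs", "classes", "boxes")
xb2, yb2 = split_batch(detect_batch, n_inp=1)
print(xb2, yb2)
```

With a split like this, the model would be called as model(*xb), which matches forward(self, x1, x2) above, and the loss function would receive the prediction followed by every item in yb. So for multiple targets the loss function has to accept more than one target argument.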
