How to adapt ConvLearner & DataBunch to a structured loss function and data input?

Hi,

I have been trying to adapt fast.ai to metric learning, but have failed so far. There is PyTorch code for doing this (https://github.com/DagnyT/hardnet), but the conversion is not obvious to me.
Could you please help me with the following case?

Data: the image patches are stored in a fairly custom format, but there is already a torchvision dataset class for it.
What is important is that the sampling procedure is required to output two patches of the same class:

def __getitem__(self, index):
    t = self.triplets[index]  # indices of a positive pair, as in the HardNet repo
    a, p = self.data[t[0]], self.data[t[1]]
    img_a = transform_img(a)
    img_p = transform_img(p)
    return (img_a, img_p)

Note that img_a and img_p are images of the same class, and they are aligned.

Training: during training, image patches go through the model to get descriptors, and then the distance matrix “all_a to all_p” is calculated. The loss function operates on the distance matrix itself and does not require labels.

pbar = tqdm(enumerate(train_loader))
for batch_idx, data in pbar:
    data_a, data_p = data
    out_a = model(data_a)
    out_p = model(data_p)
    loss = loss_HardNet(out_a, out_p, margin=1.0)

Note that the loss is not just an MSE forcing out_a to equal out_p; it compares every descriptor to every other one.
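
To illustrate, the core of such a loss is the full pairwise distance matrix. A minimal sketch of what I mean (not the actual loss_HardNet, which adds hardest-in-batch negative mining on top of this):

import torch

def distance_matrix(out_a, out_p, eps=1e-8):
    # out_a, out_p: (N, D) batches of L2-normalized descriptors
    # for unit vectors, ||a - p||^2 = 2 - 2 * a.p
    d_sq = 2.0 - 2.0 * out_a @ out_p.t()
    # (N, N) matrix: diagonal = positive pairs, off-diagonal = negatives
    return torch.sqrt(d_sq.clamp(min=0.0) + eps)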

So my question is: how do I adapt the DataBunch class and ConvLearner to accept such inputs and losses? Or maybe it would be easier for me to just cut out the one_cycle_policy and use it in plain PyTorch?

You can absolutely have a dataset that returns two images. You can even have data augmentation applied the same way to both of them. In your case, you want them as your input and you don’t seem to have a target (which fastai will want), so I’d suggest creating a custom dataset whose __getitem__ returns:

return [img_a, img_b], 0

(the 0 being there to please fastai; you won’t really use it). Then you wrap your model like this to treat those two inputs and return the two outputs:

class TopModel(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.model = model  # the shared descriptor network

    def forward(self, input):
        # input is the [img_a, img_b] pair from the dataset
        return [self.model(input[0]), self.model(input[1])]

Lastly, you pass a custom loss function to your Learner, which would look like this:

def custom_loss(output, target):
    # output is the [out_a, out_p] list from TopModel; target is the dummy 0
    return loss_HardNet(output[0], output[1], margin=1.0)

(note that we ignore the target).
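
Putting it all together would look something like this (just a sketch, assuming fastai v1; train_ds, valid_ds and base_model stand in for your own datasets and descriptor network):

from fastai.basics import *  # fastai v1: DataBunch, Learner, fit_one_cycle

data = DataBunch.create(train_ds, valid_ds, bs=128)
learn = Learner(data, TopModel(base_model), loss_func=custom_loss)
learn.fit_one_cycle(1)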


Hi,

Thanks for your help!
Your solution looks simpler than the one I implemented 10 minutes ago. Basically, I did the
return [img_a, img_b], 0 thing, then wrote a custom collate function which stacks the images in the batch dimension, and then modified the loss to split them back.

When training finishes, I’ll share the results. Thanks again for the prompt response!

def collate_pos(batch):
    # each sample is (img_a, img_p, label); stack anchors and positives
    # separately, then concatenate positives after anchors in the batch dim
    images_a, images_p, labels = zip(*batch)
    a = torch.stack(images_a, 0)
    p = torch.stack(images_p, 0)
    l = torch.tensor(labels).view(-1)  # labels are plain ints (the dummy 0)
    out_data = torch.cat([a, p], dim=0)
    labels = torch.cat([l, l], dim=0)
    return out_data, labels

from fastai.basic_data import DataBunch

data1 = DataBunch(train_loader, test_loader, collate_fn=collate_pos)
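
For reference, the loss that splits them back is roughly:

def custom_loss(output, target):
    # the collate function concatenated anchors and positives, so the
    # first half of the batch is out_a and the second half is out_p
    n = output.size(0) // 2
    return loss_HardNet(output[:n], output[n:], margin=1.0)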

Hi, I also have a similar problem, but I am new to this forum and have not yet mastered how to post.
I want to convert my PyTorch code to fastai. I have a dataset as follows:

class SceneFlowDatset(Dataset):
    ...
    def __getitem__(self, index):
        ...
        left_img = processed(left_img)            # tensor
        right_img = processed(right_img)          # tensor
        disparity = np.expand_dims(disparity, 0)  # numpy array
        return {"left": left_img,
                "right": right_img,
                "disparity": disparity}

Note: my torchvision dataset class returns image patches, and ‘disparity’ is the target. In all the DataBunch examples I have seen, the target (y) is separated from x, something like ‘train_ds = TensorDataset(train_x, train_y)’. So do I have to rewrite my dataset to separate out the target, something like the sketch below?
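
I guess the rewrite would be something like this (just my sketch):

class SceneFlowDatset(Dataset):
    ...
    def __getitem__(self, index):
        ...
        left_img = processed(left_img)
        right_img = processed(right_img)
        disparity = np.expand_dims(disparity, 0)
        # x is the image pair, y is the disparity target
        return [left_img, right_img], disparity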

About the loss_func:
my model returns a tuple of tensors:

class Pytorch_IresNet(nn.Module):
    ...
    def forward(self, left, right):
        ...
        return predict_final, r_res2_predict, r_res1_predict, r_res0

And my loss_func inputs are predict_final, r_res2_predict, r_res1_predict, r_res0, and the target:

loss = model_loss(predict_final, disp_predict, r_res2_predict, r_res1_predict, r_res0, target)

When I pass my custom loss function to my Learner, I think it would look like this:

def custom_loss(output, target):
    return model_loss(output[0], output[1], output[2], output[3], target)

The problem is: how do I pass in the target when my target (‘disparity’) is not separated from x (left_img, right_img)?