Siamese Network Architecture using fast.ai library

The objective of Kaggle’s Quora Question Pairs competition is to figure out whether two questions have the same meaning. This should help users find similar questions and reduce duplicate content on Quora.

One solution could be to create a language model using the dataset, and then form a Siamese network (a reference to Siamese twins; image below from a Medium article) that takes in the two questions and compares the output activations using cross entropy (or Manhattan distance).

How can this Siamese network architecture be implemented using the fast.ai library?
I have trained the language model on the Quora dataset.
Would I need to implement this architecture in PyTorch, or can I use some fast.ai modules to create it?
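
Conceptually, I’m picturing something like this rough PyTorch sketch, where the encoder stands in for the pretrained language model’s encoder (the names and shapes here are just my assumptions; it expects the encoder to return per-timestep activations of shape (seq_len, batch, n_hid)):

import torch
import torch.nn as nn

class SiamesePair(nn.Module):
    def __init__(self, encoder):
        super().__init__()
        self.encoder = encoder  # shared encoder (e.g. the LM's RNN) used for both questions

    def forward(self, q1, q2):
        h1 = self.encoder(q1)[-1]  # activations at the final time step for question 1
        h2 = self.encoder(q2)[-1]  # activations at the final time step for question 2
        # Manhattan-distance similarity between the two encodings, in (0, 1]
        return torch.exp(-torch.norm(h1 - h2, p=1, dim=1))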


@parth_hehe I’m looking into doing something similar. What did you end up doing?

This is as close as I got to a fully working model. I can’t seem to get the same accuracy as stated in the blog post, so I must have missed something in my implementation. Feel free to run it and see if it works for your dataset.

So far I’m getting ~80% accuracy.

import torch
import torch.nn as nn

class SiameseSentence(nn.Module):
    initrange = 0.1

    def __init__(self, ntoken, emb_sz, n_hid, n_layers, pad_token, bidir=False, dropouth=0.3, wdrop=0.5):
        super().__init__()
        # Note: bidir, dropouth and wdrop are accepted for interface compatibility but not used here
        self.ndir = 2 if bidir else 1
        self.bs = 0
        # Shared embedding and LSTM encoder: both questions go through the same weights
        self.encoder = nn.Embedding(ntoken, emb_sz, padding_idx=pad_token)
        self.rnns = nn.LSTM(emb_sz, n_hid, n_layers)

        self.encoder.weight.data.uniform_(-self.initrange, self.initrange)
        self.emb_sz,self.n_hid,self.n_layers,self.dropouth = emb_sz,n_hid,n_layers,dropouth

    def forward(self, inputs):
        # inputs: (seq_len, 2, batch_size) -- the two questions stacked along dim 1
        sl, _, bs = inputs.size()

        emb_0 = self.encoder(inputs[:,0,:])
        emb_1 = self.encoder(inputs[:,1,:])

        # Encode both questions with the same LSTM (shared weights)
        outputs0, hiddens0 = self.rnns(emb_0)
        outputs1, hiddens1 = self.rnns(emb_1)

        # Compare the final time-step activations of the two questions
        distance = self.distance(outputs0[-1], outputs1[-1])
        return distance

    def distance(self, x1, x2):
        # Manhattan (L1) similarity: exp(-||x1 - x2||_1), bounded in (0, 1]
        return torch.exp(-torch.norm(x1 - x2, 1, 1))
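
A quick sanity check on dummy data would look something like this (the sizes here are arbitrary: a toy vocab of 10,000 tokens, sequence length 20, batch of 4, with the two questions stacked along dim 1):

model = SiameseSentence(ntoken=10000, emb_sz=300, n_hid=256, n_layers=1, pad_token=1)
q_pair = torch.randint(0, 10000, (20, 2, 4))  # (seq_len, 2 questions, batch of 4)
sim = model(q_pair)                           # similarity per pair, shape (4,)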

I got a Siamese MobileNet working with fastai here: https://github.com/TheShadow29/pyt-mobilenet/blob/master/code/MobileNet_Siamese.ipynb. However, I couldn’t get contrastive loss to work for some reason, so I instead used a simple cross-entropy loss on the final distance. The experiments are run on CIFAR-10.
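
For reference, a standard contrastive loss looks roughly like this (here dist is assumed to be a Euclidean distance between the two embeddings, and the margin is just a typical default):

import torch

def contrastive_loss(dist, label, margin=1.0):
    # label == 1 for similar pairs, 0 for dissimilar pairs
    pos = label * dist.pow(2)                                      # pull similar pairs together
    neg = (1 - label) * torch.clamp(margin - dist, min=0).pow(2)   # push dissimilar pairs apart, up to the margin
    return 0.5 * (pos + neg).mean()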


It’s pretty close! One thing they do in the blog post is to freeze the weights on the embeddings - have you done that? Are you using the same optim and hyperparams? Have you checked whether your weight initialization is the same?
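
In plain PyTorch, freezing the embedding weights amounts to something like this (using the encoder attribute from the model above, and passing only the trainable parameters to the optimizer):

model.encoder.weight.requires_grad = False
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-3)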


@jeremy In my case, I didn’t use any pretrained embeddings. After training for around 10 epochs while decreasing the learning rate, I got around 82.8%, which is more or less the same result. Thanks! Really enjoying PyTorch so far.
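
Decreasing the learning rate across epochs can be done with a scheduler along these lines (train_one_epoch here is a hypothetical training function, not something from the fastai library, and the step size and decay factor are just example values):

import torch.optim as optim

optimizer = optim.Adam(model.parameters(), lr=1e-3)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=3, gamma=0.5)  # halve the LR every 3 epochs
for epoch in range(10):
    train_one_epoch(model, optimizer)  # hypothetical: one pass over the training set
    scheduler.step()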


Which one is better: extending Learner to create a Siamese learner, or using the example given by @javiersuweijie?