Sentence similarity

brian · August 29, 2018, 11:23pm

@jeremy I’ve been attempting to use the pre-trained LM from lesson 10 to create sentence vectors. I’d like to use the vectors to create a semantic search system.
My first attempt at using pooled hidden states as vectors ( described here ) showed that semantically different sentences weren’t appreciably different from semantically similar ones. Further attempts to build a classifier from the LM to predict entailment yielded similar results. The classifier is a Siamese Network available here

Questions:

Should the pooled hidden states of a LM produce vectors suitable for determining sentence similarity? In other words, would you expect 2 semantically similar sentence to have a greater cosine similarity than 2 unrelated sentences?
I’m not sure how to proceed. Does this look like a reasonable approach? What do you do when you get stuck on a problem like this?
Am I missing something obvious?

Any insight is greatly appreciated.

jeremy · August 30, 2018, 2:00am

ULMFit is all about fine-tuning. I wouldn’t expect it to work without that step. I would expect it to work well for semantic similarity if you fine tune a siamese network on a ULMFit encoder.

brian · August 30, 2018, 5:37pm

Thanks for your input!

brian · September 3, 2018, 1:07am

Update on my progress:
I made 2 changes that have boosted my performance from 40% to 50%.
The first was to sort my sentences by length.
The other was that I switched to the MultiBatchRNN encoder.

50% is still a very poor result, so I’m going to dig in further to the InferSent code to what might be different.

The other thing I did was to validate my loader and model code with the original IMDB task.
I was able to get good results, but not as good.

Update: I’ve gotten 61% accuracy now. Better but not great. Infersent gets an accuracy of 84.5% on SNLI.

jeremy · September 6, 2018, 4:45am

You’re making quite progress!

brian · September 20, 2018, 12:33pm

Update: I changed my vector concatenation to the way that InferSent does it. So my forward pass now looks like this:

def forward(self, in1, in2):
        u = self.encode(in1)
        v = self.encode(in2)
        features = torch.cat((u, v, torch.abs(u-v), u*v), 1)
        out = self.linear(features)
        return out

This has improved my accuracy to about 71%

armheb · September 20, 2018, 10:13pm

Hi
I read your code in your ULMFiT_Classify notebook and it seemed that you don’t tokenize your input when making the Dataset . sorry if I’m wrong .

brian · September 24, 2018, 4:57pm

Tokenizing happens in a different notebook. I wanted to split-up tokenizing, pertaining and classification to make the notebooks clearer.

avinregmi · October 13, 2018, 3:31pm

When I’m running your ULMFit_classify after running other two notebooks it’s giving me an error. Says “ValueError: optimizing a parameter that doesn’t require gradients”. I have been stuck here for the past few hours!!

brian · October 15, 2018, 4:22am

Hey sorry you got stuck!
Looks like you are using an earlier versions of PyTorch. FastAi defaults to 0.31 but I’m using 0.4.1.

Please upgrade and try again.

brian · October 16, 2018, 9:10pm

Did upgrading to 0.4.1 work fo you?

avinregmi · October 16, 2018, 9:28pm

Hello Brian, I got it to work but I’m not able to train it. When I’m running the fit function with multiple epochs I’m running out of GPU memory. I’m using Google Compute Engine with tesla 11gb memory. Which GPU did you use? Also if its possible can you please upload your pre-trained model? That would be a really good help for me. I’ve attached a picture where I run out of memory. After running this cell, it gives me CUDA memory error. I’ve also tried reducing the batch size.

08%20PM

brian · October 16, 2018, 9:59pm

I’m using a GTX 1080Ti. It’s also has 11GB of memory. You can try reducing the BPTT parameter as well as the batch size.

I can upload my retrained model, but it’ll take a while. I’ve got a slow upload speed.

avinregmi · October 16, 2018, 10:13pm

Thanks, Brian. I really appreciate your help. Are you uploading to github or another source?

brian · October 16, 2018, 10:17pm

@avinregmi Here is my pre-trained SNLI language model. Hope it helps:

avinregmi · October 16, 2018, 10:51pm

Thanks, Brian!!!

avinregmi · October 16, 2018, 11:12pm

Hey Brian, I only found one file for the model which is snli_language_model.pt. I’m looking for the siamese_model0.81.pt file that you trained. When I try to load the language model you gave me as siamese_model it gives me an error. I believe there suppose to be another file for siamese model.

Thanks for your help again Brian!

brian · October 17, 2018, 3:30pm

avinregmi · October 18, 2018, 6:26pm

Hey Brian, so i updated to the version or torch you’re using by $ pip install ‘torch==0.4.1’ and now i’m getting this error. I’m using the instruction from this link to install fastai Fastai v0.7 install issues thread . After creating a virtual environment, i’m installing torch 0.4.1 as the default fastai installation is 0.31. But after running ULMFIT_classify notebook i’m getting this error.

btw, this is the first cell in the notebook where all the imports are happening. Which version of python are you using. I’m using 3.6.6 and I believe that might be the issue. When I made a virtual environment with python 3.7 I’m getting another error. Please help!!!

stas · October 18, 2018, 6:38pm

FYI: fastai 0.7 requires pytorch<0.4

it appears you’re using fastai 1.0.x, if that’s what you want please follow the instructions completely: https://github.com/fastai/fastai/blob/master/README.md#installation