Sentence similarity


(Brian) #21

@jeremy I’ve been attempting to use the pre-trained LM from lesson 10 to create sentence vectors. I’d like to use the vectors to create a semantic search system.
My first attempt, using pooled hidden states as vectors (described here), showed that the similarity scores for semantically different sentences weren’t appreciably different from those for semantically similar ones. Further attempts to build a classifier from the LM to predict entailment yielded similar results. The classifier is a Siamese network, available here.

Questions:

  1. Should the pooled hidden states of an LM produce vectors suitable for determining sentence similarity? In other words, would you expect two semantically similar sentences to have a greater cosine similarity than two unrelated sentences?
  2. I’m not sure how to proceed. Does this look like a reasonable approach? What do you do when you get stuck on a problem like this?
  3. Am I missing something obvious?

Any insight is greatly appreciated.
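
For readers following along, the comparison in question 1 can be sketched as below. This is a minimal illustration, not the code from the linked notebook: it assumes mean pooling over per-token hidden states, and the vectors are made-up toy values rather than real LM outputs, so it only shows the mechanics of the test and says nothing about how a pre-trained LM actually behaves.

```python
import math

def mean_pool(hidden_states):
    """Average a list of per-token hidden-state vectors into one sentence vector."""
    n = len(hidden_states)
    dim = len(hidden_states[0])
    return [sum(h[i] for h in hidden_states) / n for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

# Hypothetical per-token hidden states (2 tokens x 2 dimensions each).
# In practice these would come from the LM encoder.
sent_a = [[1.0, 0.0], [0.8, 0.2]]
sent_b = [[0.9, 0.1], [1.0, 0.0]]  # constructed to point the same way as sent_a
sent_c = [[0.0, 1.0], [0.1, 0.9]]  # constructed to point in a different direction

vec_a, vec_b, vec_c = mean_pool(sent_a), mean_pool(sent_b), mean_pool(sent_c)
sim_ab = cosine_similarity(vec_a, vec_b)
sim_ac = cosine_similarity(vec_a, vec_c)
print(sim_ab > sim_ac)
```

If the pooled vectors were suitable for similarity search, related pairs would score consistently higher than unrelated pairs across a labelled evaluation set, not just on a handful of examples.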


(Jeremy Howard) #22

ULMFiT is all about fine-tuning. I wouldn’t expect it to work without that step. I would expect it to work well for semantic similarity if you fine-tune a Siamese network on a ULMFiT encoder.
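
The structure suggested here can be sketched roughly as follows. This is a forward-pass-only illustration in plain Python, not fastai or ULMFiT code: `ToyEncoder` is a hypothetical stand-in for the fine-tuned LM encoder, `SiameseClassifier` and `combine` are illustrative names, and the feature combination `[u, v, |u - v|, u * v]` is borrowed from InferSent as an assumption, not something stated in this thread. Training (the fine-tuning step itself) is omitted.

```python
import random

class ToyEncoder:
    """Hypothetical stand-in for a fine-tuned LM encoder: token ids -> one vector."""
    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.emb = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(vocab_size)]

    def encode(self, token_ids):
        # Mean-pool the token embeddings into a single sentence vector.
        n = len(token_ids)
        return [sum(self.emb[t][i] for t in token_ids) / n for i in range(self.dim)]

def combine(u, v):
    """Concatenate [u, v, |u - v|, u * v] (InferSent-style pair features)."""
    return (u + v
            + [abs(a - b) for a, b in zip(u, v)]
            + [a * b for a, b in zip(u, v)])

class SiameseClassifier:
    """One shared encoder applied to both sentences, then a linear head."""
    def __init__(self, encoder, n_classes, seed=1):
        rng = random.Random(seed)
        self.encoder = encoder
        in_dim = 4 * encoder.dim
        self.weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                        for _ in range(n_classes)]
        self.bias = [0.0] * n_classes

    def forward(self, premise_ids, hypothesis_ids):
        u = self.encoder.encode(premise_ids)      # same encoder, same weights
        v = self.encoder.encode(hypothesis_ids)   # for both inputs
        feats = combine(u, v)
        return [sum(w * f for w, f in zip(row, feats)) + b
                for row, b in zip(self.weights, self.bias)]

# Three output classes, matching SNLI's entailment / contradiction / neutral labels.
model = SiameseClassifier(ToyEncoder(vocab_size=10, dim=4), n_classes=3)
logits = model.forward([1, 2, 3], [2, 3, 4])
print(len(logits))
```

The point of the Siamese layout is that both sentences pass through the same weights, so gradients from the pair-level loss update a single encoder during fine-tuning.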


(Brian) #23

Thanks for your input!


(Brian) #24

Update on my progress:
I made two changes that have boosted my accuracy from 40% to 50%.
The first was to sort my sentences by length.
The other was that I switched to the MultiBatchRNN encoder.
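
As a rough sketch of the first change: sorting by length before batching keeps sentences of similar length together, which reduces the amount of padding in each batch. That is one common motivation and is an assumption here, since the post does not say why sorting helped. The function name `length_sorted_batches` and the `pad_id` value are hypothetical, and this is not the actual loader code.

```python
def length_sorted_batches(sentences, batch_size, pad_id=0):
    """Group token-id lists into batches of similar length, padding each batch
    only up to its own longest sentence.

    Returns a list of (original_indices, padded_sentences) tuples so results
    can be mapped back to the original order.
    """
    order = sorted(range(len(sentences)), key=lambda i: len(sentences[i]))
    batches = []
    for start in range(0, len(order), batch_size):
        idxs = order[start:start + batch_size]
        max_len = max(len(sentences[i]) for i in idxs)
        padded = [sentences[i] + [pad_id] * (max_len - len(sentences[i]))
                  for i in idxs]
        batches.append((idxs, padded))
    return batches

# Toy token-id sequences of different lengths.
sentences = [[5, 6, 7, 8, 9], [3], [4, 2], [7, 7, 7, 7]]
batches = length_sorted_batches(sentences, batch_size=2)
for idxs, padded in batches:
    print(idxs, padded)
```

For sentence pairs, the same idea applies with a sort key that takes both sentences into account, for example the length of the longer one.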

50% is still a very poor result, so I’m going to dig further into the InferSent code to see what might be different.

The other thing I did was to validate my loader and model code against the original IMDB task.
I was able to get good results, though not as good as the original ones.

Update: I’ve gotten 61% accuracy now. Better, but not great. InferSent gets an accuracy of 84.5% on SNLI.


(Jeremy Howard) #25

You’re making good progress!