@jeremy I’ve been attempting to use the pre-trained LM from lesson 10 to create sentence vectors. I’d like to use the vectors to create a semantic search system.
My first attempt at using pooled hidden states as vectors (described here) showed that semantically different sentences weren’t appreciably different from semantically similar ones. Further attempts to build a classifier from the LM to predict entailment yielded similar results. The classifier is a Siamese network, available here.
Questions:
- Should the pooled hidden states of an LM produce vectors suitable for determining sentence similarity? In other words, would you expect two semantically similar sentences to have a greater cosine similarity than two unrelated sentences?
- I’m not sure how to proceed. Does this look like a reasonable approach? What do you do when you get stuck on a problem like this?
- Am I missing something obvious?
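To make the first question concrete, here is a minimal sketch of the comparison, assuming the LM's final-layer hidden states for a sentence are available as a `(seq_len, hidden_dim)` array. The function names and the mean/max/last pooling are illustrative assumptions and not necessarily what the linked notebook does; the random arrays only stand in for real LM outputs.

```python
import numpy as np

def pool_hidden_states(hidden):
    """Turn per-token hidden states of shape (seq_len, hidden_dim) into one
    sentence vector by concatenating mean-pool, max-pool, and the last state.
    (Hypothetical helper; the pooling choice is an assumption.)"""
    hidden = np.asarray(hidden, dtype=float)
    return np.concatenate([hidden.mean(axis=0), hidden.max(axis=0), hidden[-1]])

def cosine_similarity(a, b):
    """Cosine of the angle between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in data: in practice these would come from running the LM encoder
# on each tokenized sentence.
rng = np.random.default_rng(0)
hidden_a = rng.normal(size=(12, 400))  # sentence A: 12 tokens, 400-dim states
hidden_b = rng.normal(size=(9, 400))   # sentence B: 9 tokens, 400-dim states

vec_a = pool_hidden_states(hidden_a)
vec_b = pool_hidden_states(hidden_b)
print(cosine_similarity(vec_a, vec_b))
```

The question is whether this score should be reliably higher for paraphrase pairs than for unrelated pairs when the hidden states come from the lesson 10 LM.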
Any insight is greatly appreciated.