Hello! I’m looking for suggestions on using BERT (or bert-as-service).
I’m building a kind of information retrieval system and trying to use BERT as a semantic search engine. In my DB I have objects with descriptions like “pizza”, “falafel”, “Chinese restaurant”, “I bake pies”, “Chocolate Factory Roshen”, and I want all of these objects to be retrieved by search queries like “food” or “I’m hungry”. With some score of semantic relatedness, of course: “pizza” should be at the top of the results for an “Italian food” search, and “falafel” for a “street food” search.
First of all, does this look like a semantic sentence similarity task, or more like word similarity? I expect a maximum sequence length of 10–15 words, and on average up to 5 words. Given that, should I look into fine-tuning, and if so, on what task? GLUE? Or maybe on my own data, creating a dataset like STS-B? Or maybe it’s better to extract ELMo-like contextual word embeddings and then average them?
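To clarify what I mean by averaging: I’d mean-pool the per-token contextual vectors into a single sentence vector. A toy sketch (a random matrix stands in for real ELMo/BERT token embeddings; the 768 dimension is just BERT-base’s hidden size, used as an example):

```python
import numpy as np

# Stand-in: in reality each row would be one token's contextual
# embedding from ELMo/BERT for a short description like "pizza".
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(5, 768))  # 5 tokens, 768 dims

# Mean pooling: collapse the token axis into one fixed-size
# sentence vector, regardless of how many tokens the text has.
sentence_vector = token_embeddings.mean(axis=0)
print(sentence_vector.shape)  # (768,)
```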
I’d really appreciate any suggestions. Thanks in advance!
P.S. My current approach is Universal Sentence Encoder + hnswlib. It works, but I’m seeking better results.
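For context, the retrieval step of my pipeline is roughly the following. This is a minimal sketch: random vectors stand in for actual Universal Sentence Encoder outputs (512-dimensional in the real model), and brute-force numpy cosine search stands in for hnswlib’s approximate index; the descriptions are just examples from my DB:

```python
import numpy as np

# Example descriptions; random vectors stand in for USE embeddings.
descriptions = ["pizza", "falafel", "Chinese restaurant", "I bake pies"]
rng = np.random.default_rng(42)
db_vecs = rng.normal(size=(len(descriptions), 512))
query_vec = rng.normal(size=512)  # would be the embedded query, e.g. "food"

# Normalize so a dot product equals cosine similarity (hnswlib's
# 'cosine' space does the same internally; brute force here for clarity).
db_vecs /= np.linalg.norm(db_vecs, axis=1, keepdims=True)
query_vec /= np.linalg.norm(query_vec)

scores = db_vecs @ query_vec
ranking = np.argsort(-scores)  # most semantically related first
for i in ranking:
    print(f"{descriptions[i]}: {scores[i]:.3f}")
```

In the real system the normalized DB vectors go into an hnswlib index once, and each query is a single k-NN lookup instead of a full scan.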
P.P.S. I tried fine-tuning BERT on STS-B, but the results were worse than with my current approach. Maybe I’m missing something.