Sentiment Pipeline in Transformers - # of Results

Hi all,

I’ve been experimenting with BERT & Friends in Hugging Face’s transformers package. I came across their pipelines and wanted to try sentiment analysis.

When I try to expand their example, I get some results I was not expecting.

When I copy their example verbatim, I get the expected result:

In [1]: from transformers import pipeline
   ...:
   ...: nlp = pipeline("sentiment-analysis")
   ...:
   ...: print(nlp("I hate you"))
   ...: print(nlp("I love you"))
[{'label': 'NEGATIVE', 'score': 0.9991129}]
[{'label': 'POSITIVE', 'score': 0.99986565}]

If I combine the two sentences in a list, I get the same results:

In [2]: print(nlp(["I hate you", "I love you"]))
[{'label': 'NEGATIVE', 'score': 0.9991129}, {'label': 'POSITIVE', 'score': 0.99986565}]

However, passing a list of three sentences raises a ValueError:

In [3]: print(nlp(["I hate you", "I love you", "I am ambivalent to you"]))
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-235662c92ee9> in <module>
----> 1 print(nlp(["I hate you", "I love you", "I am ambivalent to you"]))

/anaconda3/envs/bertExp2/lib/python3.6/site-packages/transformers/pipelines.py in __call__(self, *args, **kwargs)
    503     def __call__(self, *args, **kwargs):
    504         outputs = super().__call__(*args, **kwargs)
--> 505         scores = np.exp(outputs) / np.exp(outputs).sum(-1)
    506         return [{"label": self.model.config.id2label[item.argmax()], "score": item.max()} for item in scores]
    507

ValueError: operands could not be broadcast together with shapes (3,2) (3,)
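
Judging from line 505 in the traceback, the failure seems to come from the softmax normalization rather than from the model itself. Here is a minimal sketch that reproduces the same kind of error with plain NumPy. The logits are made-up values (not real model output), and `softmax_as_in_traceback` is a hypothetical name for my copy of that one line.

```python
import numpy as np

# Made-up logits for three sentences and two labels (not real model output).
logits = np.array([[2.0, -1.0],
                   [-1.5, 3.0],
                   [0.2, 0.1]])


def softmax_as_in_traceback(outputs):
    # Same expression as line 505 in the traceback: the sum has shape (3,),
    # which cannot be broadcast against the (3, 2) array of exponentials.
    return np.exp(outputs) / np.exp(outputs).sum(-1)


try:
    softmax_as_in_traceback(logits)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```

If this reading is right, the error depends only on the number of sentences in the list, not on their content.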

Moreover, in other cases, passing two sentences in a single nlp() call gives different results from passing them one at a time:

In [4]: sent1 = ("Four score and seven years ago our fathers brought forth on this "
   ...:          "continent, a new nation, conceived in Liberty, and dedicated to the "
   ...:          "proposition that all men are created equal.")
   ...: sent2 = ("We hold these truths to be self-evident, that all men are created "
   ...:          "equal, that they are endowed by their Creator with certain "
   ...:          "unalienable Rights, that among these are Life, Liberty and the "
   ...:          "pursuit of Happiness")
   ...: print(nlp(sent1))
   ...: print(nlp(sent2))
   ...: print(nlp([sent1, sent2]))
   ...: print(nlp([sent2, sent1]))
[{'label': 'POSITIVE', 'score': 0.99777234}]
[{'label': 'POSITIVE', 'score': 0.9786857}]
[{'label': 'POSITIVE', 'score': 1.627845}, {'label': 'POSITIVE', 'score': 0.9786857}]
[{'label': 'POSITIVE', 'score': 0.59639287}, {'label': 'POSITIVE', 'score': 0.99197847}]

(Note how the score for the first sentence in each pair is now very different from its single-sentence score, and that 1.627845 is not even a valid probability.)
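
My guess is that the two-sentence case does not crash only because a (2, 2) array divided by a (2,) array happens to broadcast, just along the wrong axis. The sketch below uses made-up logits (not real model output) to show how that can produce a "score" above 1. The `keepdims=True` variant is my assumption about what the normalization should look like, not something I have confirmed in the library. I am also not sure this accounts for every difference above.

```python
import numpy as np

# Made-up logits for two sentences and two labels (not real model output).
logits = np.array([[1.0, 3.0],
                   [0.5, 1.0]])

exp = np.exp(logits)

# Expression from the traceback: the row sums have shape (2,), so NumPy lines
# them up with the columns, giving wrong[i, j] = exp[i, j] / row_sum[j].
wrong = exp / exp.sum(-1)

# Assumed fix: keepdims=True keeps shape (2, 1), so each row is divided by
# its own sum.
right = exp / exp.sum(-1, keepdims=True)

print(wrong.max(axis=-1))  # not guaranteed to lie in [0, 1]
print(right.max(axis=-1))  # proper probabilities
```

In this sketch only the diagonal entries of `wrong` are normalized correctly. That would be consistent with the "I hate you" / "I love you" pair looking fine, since the winning labels there sit on the diagonal.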

Sometimes, the label even flips:

In [5]: sent3 = ("We the People of the United States, in Order to form a more perfect "
   ...:          "Union, establish Justice, insure domestic Tranquility, provide for "
   ...:          "the common defence, promote the general Welfare, and secure the "
   ...:          "Blessings of Liberty to ourselves and our Posterity, do ordain and "
   ...:          "establish this Constitution for the United States of America.")
   ...: print(nlp([sent2, sent3]))
[{'label': 'NEGATIVE', 'score': 0.12308767}, {'label': 'POSITIVE', 'score': 0.9993409}]

Now sent2 is labeled NEGATIVE, even though it was POSITIVE (score 0.9786857) when passed on its own!
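
Since the single-sentence calls are the only ones that gave me the results I expected, the workaround I am using for now is to call the pipeline once per sentence. This is a rough sketch: `classify_one_by_one` is a hypothetical helper of my own, and `fake_nlp` is only a stand-in with dummy scores so the snippet runs without downloading a model. In practice I pass the real `nlp` object instead.

```python
def classify_one_by_one(nlp, sentences):
    """Call the pipeline once per sentence and collect the results in order."""
    results = []
    for sentence in sentences:
        # Each single-sentence call returns a one-element list of dicts.
        results.extend(nlp(sentence))
    return results


def fake_nlp(text):
    # Stand-in for the real pipeline; labels and scores here are dummies.
    label = "NEGATIVE" if "hate" in text else "POSITIVE"
    return [{"label": label, "score": 1.0}]


results = classify_one_by_one(
    fake_nlp, ["I hate you", "I love you", "I am ambivalent to you"]
)
print(results)
```

This avoids batching entirely, so it is presumably slower on long lists, but it sidesteps the list behavior shown above.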

Does anyone with experience using these pipelines know how to pass multiple sentences to this function correctly?

Thanks!