Google's new assistant and Duplex


(Brian Holland) #1

This blows me away. There’s a link to a blog post with more details in the article.


(Bart Fish) #2

Incredible, it’s like Watson on steroids, or a complete fake. It’ll be interesting to see it in the wild.


(Ravi Sekar Vijayakumar) #3

Wonder what the ethical implications are. This reminded me of the line about doing everything one can do without stopping to ask whether one should.


(Jeremy Howard) #4

Indeed. There’s a good article about that here: https://www.washingtonpost.com/news/the-switch/wp/2018/05/08/a-google-program-can-pass-as-a-human-on-the-phone-should-it-be-required-to-tell-people-its-a-machine/


(nkiruka chuka-obah) #5

What about forensic countermeasures? Given the speed of development of fake images and fake voice/audio, there does not seem to be a similar ratcheting up of research into forensic countermeasures (looking through arXiv, CVPR, and ICCV papers). Do you know of people or places focused on these countermeasures?


(Vishal Pandey) #6

If it’s really this effective… we shouldn’t be far from passing the Turing test, should we? What do you think?


#7

To me it’s scary how unprepared for this the general public is. You mention AI and most people laugh and are dismissive. “AI will never happen because…”. And yet, we already have GANs generating fake videos, and WaveNet + RNNs creating human-sounding ‘assistants’.
Politicians seem to be clueless about what’s happening with AI, so we have to count on self-regulation and benevolence on the part of the tech giants. Everybody will finally wake up when malicious agents start using this technology to cause some major disaster.
WaveNet can be trained to imitate a person’s voice given a relatively short sample. Now imagine you receive a phone call from your parents or a friend who appear to be in distress and ask you to transfer them some money. Or someone sounding like your doctor or lawyer calls and asks you for personal details to ‘fill in some paperwork’ or gaps in documentation. Then it turns out you handed the information to a ‘duplex’ and it’s been used to defraud you in some way.


(Alessandro) #8

That call looks very scripted… I doubt it would work in other situations without specific training. It’s not general AI, but I’m eager to try it out anyway.


(Alex) #9

I wonder if they’ll eventually add an “angry customer” mode…


(Alex) #10

Likely there will soon be “defensive” services to distinguish these kinds of calls. Until the system has some genuine common-sense capability behind it, it should be relatively easy to train a classifier to detect them.
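The classifier idea above can be sketched in a few lines. This is purely a toy: the features (response-timing variance, pitch variance) and the numbers are made up for illustration, and a real detector would work on spectral/prosodic features extracted from the actual audio. It fits a trivial nearest-centroid rule rather than anything state of the art.

```python
# Toy sketch of a "defensive" synthetic-call classifier.
# Features are hypothetical: [response-timing variance, pitch variance].
# The premise (synthetic voices behave unnaturally uniformly) is an
# assumption for illustration, not a claim about Duplex.

def classify(features, weights, bias):
    """Linear decision rule: returns 'synthetic' or 'human'."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "synthetic" if score > 0 else "human"

def centroid(rows):
    n = len(rows)
    return [sum(r[i] for r in rows) / n for i in range(len(rows[0]))]

# Made-up labeled examples.
human_calls = [[0.9, 0.8], [0.7, 0.9], [0.8, 0.6]]
bot_calls = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.25]]

# "Training": a nearest-centroid classifier. The decision boundary is the
# hyperplane halfway between the two class centroids.
h, b = centroid(human_calls), centroid(bot_calls)
weights = [b[i] - h[i] for i in range(len(h))]          # points toward bot centroid
midpoint = [(h[i] + b[i]) / 2 for i in range(len(h))]
bias = -sum(weights[i] * midpoint[i] for i in range(len(h)))

print(classify([0.12, 0.18], weights, bias))  # near the bot cluster -> synthetic
print(classify([0.85, 0.75], weights, bias))  # near the human cluster -> human
```

The point is only that, as long as the synthetic voice has detectable statistical regularities, a simple discriminative model can separate it; an arms race (as with GAN-generated images) is the likely outcome.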


(Bart Fish) #11

Looks like someone at Google is listening. See this article.


(blake west) #12

Also, regarding “passing the Turing test”, it kinda depends what you mean. As with most things like this, we aren’t gonna wake up one day and have a general AI that “passes the Turing test”; it will be a slow, methodical process where we see each step along the way.
The official Google AI blog post notes:

One of the key research insights was to constrain Duplex to closed domains, which are narrow enough to explore extensively. Duplex can only carry out natural conversations after being deeply trained in such domains. It cannot carry out general conversations.

I’m not exactly sure how it’s a “research insight” that AI is more easily trained on narrow tasks, but regardless, it’s clear that you can’t start asking Duplex what its thoughts are on the Iran deal and hope to get sane responses.
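The closed-domain constraint quoted above can be illustrated with a toy agent that only recognizes the intents it was built for and refuses everything else. Everything here is hypothetical: Duplex’s real system is an RNN over the conversation history and audio features, not a keyword matcher, but the sketch shows the shape of the constraint.

```python
# Toy closed-domain agent: it handles only the intents in its "domain"
# (restaurant booking, in this made-up example) and rejects everything
# else. The keywords and intent names are invented for illustration.

DOMAIN_INTENTS = {
    "book_table": {"table", "reservation", "book", "party"},
    "opening_hours": {"open", "close", "hours"},
}

def handle(utterance):
    """Return the matched intent, or 'out_of_domain' if none applies."""
    words = set(utterance.lower().replace("?", "").split())
    for intent, keywords in DOMAIN_INTENTS.items():
        if words & keywords:
            return intent
    return "out_of_domain"  # e.g. opinions on the Iran deal

print(handle("I'd like to book a table for four"))    # book_table
print(handle("What are your opening hours?"))         # opening_hours
print(handle("What do you think of the Iran deal?"))  # out_of_domain
```

In the real system the equivalent of the `out_of_domain` branch is presumably handing the call to a human operator, which the blog post says Duplex does when a conversation goes beyond what it can handle.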

Also interesting, Duplex is not all DL! The speech generation (the wildest part, for me) is a mix of concatenative techniques (also very narrow, I assume) and DL techniques. And the “umms”, “ya knows”, and “mm-hmms” are programmed in specifically to make things sound more natural. So again, we’re not quite at the full AI thing yet…
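Those programmed-in disfluencies can be sketched as a simple post-processing pass over generated text. Google hasn’t published how they actually insert them, so the rule here (occasionally dropping an “um” after a comma) is entirely made up; it just shows that this part needs no deep learning at all.

```python
import random

# Sketch of inserting speech disfluencies into generated text, as the
# blog post describes Duplex doing to sound more natural. The insertion
# rule (after commas, with fixed probability) is purely hypothetical.

DISFLUENCIES = ["um", "uh", "you know"]

def add_disfluencies(text, prob=0.5, rng=None):
    rng = rng or random.Random(0)  # seeded for reproducible output
    out = []
    for word in text.split():
        out.append(word)
        if word.endswith(",") and rng.random() < prob:
            out.append(rng.choice(DISFLUENCIES) + ",")
    return " ".join(out)

print(add_disfluencies("Sure, a table for four, seven pm works."))
```

The downstream TTS would then render the filler words with the same voice, which is presumably why the demo sounds so natural.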

This is super impressive. But also, I feel like we know enough now to be aware of what the state of the art is, and we’ve seen how the state of the art tends to move pretty incrementally. If Google had some amazing SOTA DL product for text to speech, they would publish something on it. So if we see something that just blows the SOTA out of the water, we should probably assume it’s not using DL, or that we’re missing something.

Anyway, this is super cool! And yeah, it should totally tell you it’s a machine calling…


(Chris Butler) #13

I wrote a piece discussing the ethical, moral, legal, and human-centered issues with this technology:

Would love to hear people’s thoughts on it.


(Ravi Sekar Vijayakumar) #14

The video in these posts is truly scary. I wonder if nobody wants to question Google just because they said they won’t be evil, which, per recent news, they’ve stopped saying.

https://samim.io/p/2018-05-19-my-one-my-favorite-quotes-of-now-seems-to-be-very-mult/