

The Turing Test

Jen Brandner

for CSCI 405

Conversational AI and the Loebner Prize Competition

What is a Turing Test?

- It all started with A. M. Turing’s 1950 paper “Computing Machinery and Intelligence.”
- Turing described an “imitation game” in which a man and a woman both try to convince an interrogator that he or she is the woman.
- Turing then extends the game to a computer trying to convince an interrogator that it is human.

What is conversational AI?

- Machines are programmed to carry on a conversation with the user.
- Often called “chatbots”
- Requires natural language processing.
- Examples:
  - ELIZA (the Rogerian psychotherapist)
  - AOL Instant Messenger’s “SmarterChild”

What is the Loebner Prize Contest?

- Sponsored by Hugh Loebner
- Annual event held since 1991
- 1991: 10 judges and 8 contestants (6 computers and 2 humans)
- Judges had short conversations with each contestant and rated their human-ness.
- To give the computers a fighting chance, contestants were allowed to select a single topic to converse on.

Results of the First Contest in 1991

- Five judges rated the top contestant as human.
- Eight cases in which a computer was misclassified as human.
- Winner: Joseph Weintraub, with his program PC Therapist III
  - His topic: whimsical conversation
  - Relied on non sequiturs in conversation
  - Awarded $1,500

The 1996 Contest

- Jason Hutchens entered two programs
  - HeX (primary entry)
  - MegaHAL
- HeX was a simple one-month hack. Hutchens’s intent was to show the futility of Loebner’s contest.
- “If I can beat those other systems with a program which took only a month to make then there is something wrong with the way the contest is structured.” - Jason Hutchens
- HeX was more complex than MegaHAL, and actually used MegaHAL as just a part of its programming.
- Hutchens’s HeX won the contest in 1996, but neither of his creations won again after that year.

HeX’s Algorithm

Iterate roughly in this order:

- Parse sentences one by one and convert them to words. Look for keywords in a database of hardwired replies (and use a reply only if it hasn't been used before).
- If a stored reply could not be located, evaluate for a trick question, and if detected, give a witty reply.
- Call MegaHAL and generate psychobabble.
- Reformulate the user's input according to one of several hundred templates and spit it back.
- Give a humorous response to silence.
- Accuse the user of being ungrammatical, etc.
- As a last resort, generate more psychobabble with MegaHAL.
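
The ordering above can be read as a fallback cascade: try each strategy in turn and return the first reply that is produced. The following is a minimal sketch under that reading, not Hutchens's code. Every name, reply, and pattern here (`HexLikeBot`, `CANNED_REPLIES`, `TRICK_PATTERNS`, `TEMPLATES`, `markov_babble`) is a hypothetical stand-in, and the MegaHAL call is a stub that returns nothing so that the later steps stay reachable.

```python
import re

# Hypothetical data: the real HeX used its own (much larger) reply database.
CANNED_REPLIES = {
    "weather": "I never go outside, so the weather is a mystery to me.",
    "name": "My friends call me HeX. What should I call you?",
}
TRICK_PATTERNS = [
    re.compile(r"\bhow many\b", re.I),
    re.compile(r"\d+\s*[-+*/]\s*\d+"),  # simple arithmetic such as "2 + 2"
]
TEMPLATES = [(re.compile(r"\bi am (.+)", re.I), "Why are you {0}?")]


def markov_babble(text):
    """Stub for the MegaHAL call; assumed to return None when it has no reply."""
    return None


class HexLikeBot:
    def __init__(self):
        self.used = set()  # keywords whose canned reply was already given

    def reply(self, user_input):
        text = user_input.strip()
        words = re.findall(r"[a-z']+", text.lower())
        # 1. Keyword lookup in the hardwired replies, never repeating one.
        for word in words:
            if word in CANNED_REPLIES and word not in self.used:
                self.used.add(word)
                return CANNED_REPLIES[word]
        # 2. Trick-question check with a witty answer.
        if any(p.search(text) for p in TRICK_PATTERNS):
            return "Nice try, but I am not a calculator."
        # 3. Ask the Markov generator for psychobabble.
        babble = markov_babble(text)
        if babble:
            return babble
        # 4. Reformulate the input through a template.
        for pattern, template in TEMPLATES:
            match = pattern.search(text)
            if match:
                return template.format(match.group(1).rstrip(".!?"))
        # 5. Humorous response to silence.
        if not text:
            return "Cat got your tongue?"
        # 6. Accuse the user of being ungrammatical (crude capitalisation test).
        if not text[0].isupper():
            return "Do you always forget to capitalise your sentences?"
        # 7. Last resort: more psychobabble, with a fixed line if the stub is silent.
        return markov_babble(text) or "Let's talk about something else."
```

A short session would call `HexLikeBot().reply("What is your name?")` and then ask again; the second call skips the canned reply because the keyword has already been used.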

MegaHAL’s Algorithm

- Constructs reply sentences using Markov models (sophisticated state machines) to predict what word should go next in MegaHAL’s reply based on the previous four words in the sentence.
- The “information” of a word is the “surprise” it causes the Markov model, a function of the probability of the word:

  $I(w \mid s) = -\log_2 P(w \mid s)$

- Read the user's input, and segment it into an alternating sequence of words and non-words.
- From this sequence, find an array of keywords and use it to generate many candidate replies.
- Display the reply with the highest information to the user.
- Use the user's input to update the Markov models, so that MegaHAL can learn from what the user types.
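
To make the surprise measure concrete, here is a small sketch of an n-gram Markov model that scores candidate replies by summing $-\log_2 P(w \mid s)$ over the keywords they contain. It is an illustration under simplifying assumptions, not MegaHAL's implementation: the names `MarkovModel`, `tokenize`, and `best_reply` are made up for this example, non-word separators are dropped rather than modelled, the candidate replies are supplied by hand instead of generated, and unseen words contribute nothing to the score.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase words (non-word separators are dropped here)."""
    return re.findall(r"[a-z']+", text.lower())


class MarkovModel:
    def __init__(self, order=4):
        self.order = order  # number of previous words used as context
        self.counts = defaultdict(Counter)  # context tuple -> next-word counts

    def learn(self, text):
        """Update the counts from one line of input."""
        words = ["<s>"] * self.order + tokenize(text)
        for i in range(self.order, len(words)):
            context = tuple(words[i - self.order:i])
            self.counts[context][words[i]] += 1

    def probability(self, word, context):
        """Estimate P(word | context) from raw counts; 0.0 if never seen."""
        seen = self.counts.get(tuple(context))
        if not seen or word not in seen:
            return 0.0
        return seen[word] / sum(seen.values())

    def information(self, reply, keywords):
        """Sum of -log2 P(w|s) over the keywords that occur in the reply."""
        words = ["<s>"] * self.order + tokenize(reply)
        total = 0.0
        for i in range(self.order, len(words)):
            if words[i] in keywords:
                p = self.probability(words[i], words[i - self.order:i])
                if p > 0:
                    total += -math.log2(p)
        return total


def best_reply(model, candidates, keywords):
    """Pick the candidate reply that surprises the model the most."""
    return max(candidates, key=lambda reply: model.information(reply, keywords))
```

With `order=1` and the training lines "the cat sat", "the cat ran", and "the dog sat", the model gives "dog" a probability of 1/3 after "the", so "the dog sat" carries more information than "the cat sat" and is the reply chosen. The learning step from the last bullet corresponds to calling `model.learn(user_input)` after each exchange.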

About HeX and MegaHAL

Strengths

- HeX was easy to implement (only took one month to develop).

Weaknesses

- MegaHAL sometimes generated sentences that did not make sense. Since HeX used MegaHAL in its algorithm, it had the same problem.
- Just a glorified random sentence generator.

Most recent Loebner Prize contest

- Winner in 2005: Rollo Carpenter’s “Jabberwacky”
- Uses a unique learning algorithm that stores previous conversations and uses them as guides in future conversations.
- You can talk to Jabberwacky on the web: www.jabberwacky.com
- Awarded $3,000

"Jabberwacky learns from what you say and when you say it. Then, if the right moment comes up some time in the future, it says what you said ... and learns what someone ELSE says in response. So it's a giant feedback loop, and an imitator ... if you like, it's an unusually clever parrot.

To really 'get' how it works you have to think about it in a rather backwards sort of way. There's no programming to make it claim to be human, yet it does so often - because most of the people speaking to it claim to be human. And it will often accuse users of being a robot..."

- from the website of Icogno, an AI company founded by Rollo Carpenter
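
The feedback loop in the quote can be illustrated with a toy "parrot" that remembers what humans answered to each of its own lines and reuses those answers later. This is only a sketch of the idea as quoted; the slides do not describe Jabberwacky's actual algorithm. The class `ParrotBot`, its word-overlap matching, and the echo fallback are all assumptions made for the example.

```python
import re


def words_of(text):
    """Lowercase word set used for a crude similarity measure."""
    return set(re.findall(r"[a-z']+", text.lower()))


class ParrotBot:
    def __init__(self):
        self.memory = []  # pairs of (line the bot said, what a human answered)
        self.last_bot_line = None

    def hear(self, user_line):
        # Learn: the user's line is a human response to the bot's previous line.
        if self.last_bot_line is not None:
            self.memory.append((self.last_bot_line, user_line))
        reply = self._recall(user_line)
        self.last_bot_line = reply
        return reply

    def _recall(self, user_line):
        # Find the remembered line most similar to what the user just said
        # and answer the way a human once answered it.
        target = words_of(user_line)
        best, best_score = None, 0
        for prompt, response in self.memory:
            score = len(target & words_of(prompt))
            if score > best_score:
                best, best_score = response, score
        # With nothing similar in memory, simply echo the user (pure parroting).
        return best if best is not None else user_line
```

After hearing "Are you human?" followed by "Yes, I am human.", the bot answers a later "Are you a robot or human?" with "Yes, I am human.", which mirrors the quoted point that it claims to be human only because people told it so.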

About Jabberwacky

Strengths:

- Learns by experience in past conversations.
- Gets its replies from what humans have said to it before, so its replies are sure to be human-like.

Weaknesses:

- Requires a huge database to store previous conversations.
- Still has no concept of situation (what exactly it is talking about).

2005 Sample Transcripts

Transcripts of conversations at the 2005 Loebner Prize contest:

http://loebner.net/Prizef/2005_Contest/Transcripts.html

HUMAN: How is the other entity treating you?
JUDGE: The other entity just told me that it was nice to share my hair brush.
HUMAN: Share your hair brush? You have a mad, perverted bot...
JUDGE: I don't even have a hair brush. Otherwise I would have thought it quite forward indeed.

PROGRAM: Ok then it was nice to share your hair brush.
JUDGE: What do mean by that? How have you shared my hair brush?
PROGRAM: Good.
JUDGE: Good?
PROGRAM: Yes.
JUDGE: This has become a very zen like conversation.
PROGRAM: And worse yet, I gave it to charity.
JUDGE: What? WHat did you give to charity? This conversation?
PROGRAM: Er.. 1 million Dollars?

Excerpt from one of the conversations with Jabberwacky:

Animated George (Jabberwacky’s character)

http://www.biscit.biz/~data/s/george_jabberwacky_televirtual.wmv

http://www.televirtual.com/movies/george_ITV.wmv

References

- “Computing Machinery and Intelligence,” A. M. Turing, 1950, http://loebner.net/Prizef/TuringArticle.html, provided by Hugh Loebner
- “Icogno,” Icogno Ltd, http://www.icogno.com/
- “Lessons from a Restricted Turing Test,” Stuart M. Shieber, 1993, http://www.eecs.harvard.edu/shieber/Biblio/Papers/loebner-rev-html/loebner-rev-html.html
- “MegaHAL,” Jason Hutchens, http://megahal.alioth.debian.org/
- “Home Page of the Loebner Prize in Artificial Intelligence,” 2003, http://loebner.net/Prizef/loebner-prize.html
- “How to Pass the Turing Test by Cheating,” Jason L. Hutchens, 1997, http://www.agent.ai/doc/upload/200403/hutc97_1.pdf