what does(n't) work for chatbots - ai ukraine · what does(n't) work for chatbots ... i...

Post on 25-May-2020

4 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

what does(n't) work for chatbots

Oleksandr Khryplyvenkom3oucat@gmail.com

################################################################################

have you tried fixing your device

with a screw driver? | 10W 50K vocab

try screw driver | 3W 4K vocab

#### !2/50

!3/50

I'd like to refund i don't know

my unit is broken i don't know

do you work? i don't know

can i see your manager? i don't know

whatever question? i don't know

NNs suffer from imbalanced datasets

!4/50

data cleaning preprocessing

vocabulary reduction

data separation

labeling

clustering

dataset balancing!5/50

vocabulary reduction

frequency, lemmatization, spelling

dataset balancing

reduce bigger, increase smaller

AB A B

!6/50

labeling

intents, slots, rewards, start/end of answer

data separation

conversational, query, task completion

cluster separated data

for typical problems!7/50

componentsNLU

State Machine

NLG a=answer

o=question

B

s, π : S → A

→→

→→(SM)

!8/50

NLUselects a model to use from

{conversational, query, task completion}

intent detection & slot values https://dialogflow.com/

"wanna pizza margarita @ work, pay cash"

order_pizza(loc=_, pay=cash)!9/50

SMmakes you able to define policies

π S → A

"wanna pizza margarita @ work, pay cash"order_pizza(loc=_, pay=cash)o(bservation) =

H(istory): work: loc = 49,72S(tate) = order_pizza(loc=49,72, pay=cash)

(olicy): π

(ction space) (order_pizza(loc=*, pay=*)) =

= a_deliver(order(loc=loc, pay=pay)) = a(ction)!10/50

NLGruns on state machine

depends on policy

templates(user-friendly answers)

a_greet = random.choice(["Hi", "Hello", "Greetings"])

!11/50

ChatbotsB

conversational

task completion(TC)

text query(TQ)

!12/50

how can Ihelp you? prove

you’re nota bot

42

conversationalsustain a chat

covers a lot of existing text w/o labeling

suffers from imbalanced datasetsNLU, SM, NLG, TQ: all in one

hard to trainhard to manage NERs !13/50

conversational

plain seq2seq

seq2seq + attention

transformer

ranging

BIG models/ensembles

!14/50

plain seq2seq(uni/bi directional LSTMs)

lots of samples

very high probability to stuck in

simple to grasp

"i don't know" | dirty data

×

!15/50

seq2seq + attention

https://www.youtube.com/watch?v=RLWuzLLSIgw beam search

https://github.com/tensorflow/nmt

https://arxiv.org/pdf/1510.03055.pdf

official version is easy to modify

takes more time to train than transformer

if cooked properly, performs the bestbeam search out-of-the boxMMI may be easily addedworks(better) when trained on one to many

MMI

<latexit sha1_base64="JriRhTI76RygZDK6iijmYMxGb+4=">AAACBHicdVDLSgMxFM34rPVVdekmWARXZTJSO8uiG5cV7APaoWQyt21o5kGSUYahWz/ArX6CO3Hrf/gF/obpQ9CiBwKHc+7h5h4/EVxp2/6wVlbX1jc2C1vF7Z3dvf3SwWFLxalk0GSxiGXHpwoEj6CpuRbQSSTQ0BfQ9sdXU799B1LxOLrVWQJeSIcRH3BGtZE6PTYCNoagXyrbFeJUXZdgu3JRdci5Y4hNiOPWMKnYM5TRAo1+6bMXxCwNIdJMUKW6xE60l1OpORMwKfZSBQllYzqErqERDUF5+ey/E3xqlAAPYmlepPFM/ZnIaahUFvpmMqR6pJa9qfind09VZoJL6/XA9XIeJamGiM23D1KBdYynjeCAS2BaZIZQJrk5ALMRlZRp01vRNPN9Pv6ftJwKMfzGKdcvFx0V0DE6QWeIoBqqo2vUQE3EkECP6Ak9Ww/Wi/Vqvc1HV6xF5gj9gvX+Ba2PmTw=</latexit>

!16/50

i can't use this device in my lifeyou can use the browse by device type make search engine on our home page i searched there and it says it should workif you have any question later let us know /common answer | unrelated context/i would like to refund i will let them know to refund you the refund may i talk to your manageri will let her know to send it to you

!17/50

you're pissing me offyeswhy you're doing this?i do not know if it is possible to beto be what?you can also use the browse i made browsing and it assures me i should use my chargeryou can check our youtube channel to see if there is anything you could do /from time to time it gives helpful answers because trained on good data/

!18/50

i just watched your channel and it says i'm charging it in wrong wayi am sorry i am glad i can helpwhat are you sorry of? you haven't helped me!i am not sure what you mean by more question i will have to check on forum for that devicethank you a lot for keeping trying i will let her know to send you the documentyou just ensured me that you'll look for a solution on forumi will have to check on forum coupon codeno need a coupon, need my device working!you can use the search engine on our home page to link you to the manual for your device !19/50

my unit doesn't workwhat firmware version are you usei'm using firmware version 2.1 you can try update firmware if you wouldok, I'll update it, thanksyes correct

nice chat sample

!20/50

transformer

https://ai.googleblog.com/2017/08/transformer-novel-neural-network.html

available on github

does worse than nmt

feedforward

quick to train

official version is not so easy to customize

×

too generic answers(only attention)

!21/50

my ubuntu doesn't worki would say that is a bit of a ofa bit of what? i would say bitcan you tell me something meaningful? i do not know i can't get online on my laptop what is the problemi can't install softwarewhat is the errorthe error is: exception casei would try the alternate eol of the ubuntu !22/50

BIG models, ensembles

https://github.com/jiweil/Neural-Dialogue-Generation

if you've got lucky, you get diverse answers

require too much resources

not so easy to modify, even run

require labeled data

high chance "I don't know" | YOUR data

!23/50

ranging (concept)

q = argmin D(quser, ∀q ∈ Q)

π = f : Q → Aa = maxscorea |q

simple to implement

heavily depends on encoding quality

easy to debugworks for TQ(text query)

(huge labeled datasets)!24/50

OR

chat context

for sentence embedding - you can simply add avg so far to each embedding as a context

https://ai.googleblog.com/2018/05/smart-compose-using-neural-networks-to.html

!25/50

how can Ihelp you? prove

you’re nota bot

order goddamnthe most expensive

pizza!

TChave to complete a task

or conclude there's no way you can do it

tractable, easy to debug & modifyrequires lots of manual workmost used approaches require NLU !26/50

(RL) in discrete space

hard to keep LSH as non-linear homomorphism(metric changes)

f : Vℓ → ℝn → {0,1}m → Vℓ

π : S → Apretrained NN

LSH

pretrained NN = skipthoughts, BiMPM...R(eward) depends on how successfully you complete a task

word/sentence → embedding

×

!27/50

• We may say, we simply learn policy

• q/a = all successive user's/agent's sentences(utterance) before agent's/(user's) ones respectively

• a = a

• s = q(aq)(aq)(aq)... | all (q...q) so far

π : S → A

(RL) in continuous spacestate definition

?

!28/50

the problem even worse when a is a sentence-wise vector

s ∈ ℝm, a ∈ ℝn, Q ∈ ℝ(q . . . ), (a) ∈ vocab

a = arg maxQ

∂Q(s, a)∂a

s

aQ

(q…)

(a)

s

aQ

(q…)

(a)

DQN

DQN

!29/50

a ∈ ℝn!getting back to discrete V space is non-trivial

we have to train another network

a ∈ ℝn → a ∈ Vℓ, ℓ = sen len in words

!30/50

Formsinterpretable

lots of manual work

allows RL in discrete space(converges)

datasets are optional (coldstart)

need NLU

need to define default policy

if state defined wrong, pisses off users

<latexit sha1_base64="JriRhTI76RygZDK6iijmYMxGb+4=">AAACBHicdVDLSgMxFM34rPVVdekmWARXZTJSO8uiG5cV7APaoWQyt21o5kGSUYahWz/ArX6CO3Hrf/gF/obpQ9CiBwKHc+7h5h4/EVxp2/6wVlbX1jc2C1vF7Z3dvf3SwWFLxalk0GSxiGXHpwoEj6CpuRbQSSTQ0BfQ9sdXU799B1LxOLrVWQJeSIcRH3BGtZE6PTYCNoagXyrbFeJUXZdgu3JRdci5Y4hNiOPWMKnYM5TRAo1+6bMXxCwNIdJMUKW6xE60l1OpORMwKfZSBQllYzqErqERDUF5+ey/E3xqlAAPYmlepPFM/ZnIaahUFvpmMqR6pJa9qfind09VZoJL6/XA9XIeJamGiM23D1KBdYynjeCAS2BaZIZQJrk5ALMRlZRp01vRNPN9Pv6ftJwKMfzGKdcvFx0V0DE6QWeIoBqqo2vUQE3EkECP6Ak9Ww/Wi/Vqvc1HV6xF5gj9gvX+Ba2PmTw=</latexit>

!31/50

TEST_CASE = [ [ "Hey, buddy, I have nothing to do. I'd like to talk to someone", [ ("NO_INTENTS_DETECTED", dict()), ] ], [ "Could you give me a pepperoni and one more", [ ("specify_pizza", {"pizza_type": "pepperoni", "quantity": 1}), ("order_pizza", dict()), ] ], ... [ "yep, that's all", [ ("confirm", dict()) ] ] ]

SM test sample

!32/50

Hey, buddy, I have nothing to do. I'd like to talk to someone Hello What kind of pizza do you want? you can select pre-set pizzaz or you can make you own pizza from scratch from the following ingredients: cheddar salami mozzarella tomatoes sweet red pepper pineapple chicken bacon chili red beef preset pizza types are: margarita pepperoni chicken vegetarian calzone Could you give me a pepperoni and one more What kind of pizza do you want? you can select pre-set pizzaz or you can make you own pizza from scratch from the following ingredients: cheddar salami mozzarella tomatoes sweet red pepper pineapple chicken bacon chili red beef preset pizza types are: margarita pepperoni chicken vegetarian calzone which address deliver to? delivery time? which payment type do you prefer? ... Would you like to add something else to your order? yep, that's all {'delivery_address': '80 banana st', 'delivery_time': '14:00', 'payment_type': 'cash', 'cart': [('pizza', {'pizza_type': 'pepperoni', 'quantity': 1}), ('custom_pizza', {'quantity': 1, 'ingredients': {'cheese': 1}})]} !33/50

question:words = sentence.lower().split()

s1cat

s2cat

s1dog

(this something like google's dialog flow does)- is it a state?- when it is a state?

'cat' and 'meow' in words =

'cat' and 'scratch' in words =

'dog' and 'barks' in words =

S definition

!34/50

sdefault = sc = S∖{s1cat, s2

cat, s1dog}

This is a state if you define a complement state:

Then your task is to define a policy which will reach a terminalstate: when all slots for all intents are set or proven they couldn't be

Often, action for the default state is fallback to conversational

...Still, is S a well-defined state space?

!35/50

A definitiona is a predefined answer from a bot to a user which makes(convinces) the user to fill some slot or reveal an intent

also, there's hidden 'a' part - change bot state (if needed)

intent_1(_, _, 'slot_3_val') ->

db.save(intent_1.slot_3)

ask_a_user_fill_slot_1_for_intent_1()

!36/50

until there are ambiguities your state is bad defined

ambiguity resolution!(context, pronouns, ...) based on previous observations

π : S → A...for now your model is fully tractable

No ambiguities, then s as defined above is a state

intent_1("panda", "eats", "shoots"| bank in history) -> call_the_police

intent_1("panda", "eats", "shoots"| animal in history) -> take_a_photo

!37/50

SM(recall)makes you able to define policies

π S → A

"wanna pizza margarita @ work, pay cash"order_pizza(loc=_, pay=cash)o(bservation) =

H(istory): work: loc = 49,72S(tate) = order_pizza(loc=49,72, pay=cash)

(olicy): π

(ction space) (order_pizza(loc=*, pay=*)) =

= a_deliver(order(loc=loc, pay=pay)) = a(ction)!38/50

same works for ranging

!39/50

text query(TQ)the best text search you can

by documents you have

your card 70%cash 60%pound of flesh 90%

how can Ipay?

+ pretrained models available- hard/impossible to train on own data - hardcore labeling

!40/50

Used if you have some unstructured text DBs, like manuals, HOWTOs, etc.

BiDAFhttps://allenai.github.io/bi-att-flow/

https://research.fb.com/downloads/babi/

bAbi

!41/50

All 3 within a dialoguehow can Ihelp you? prove

you’re nota bot

order goddamnthe most expensive

pizza!

42

how can Ipay?

your card 70%cash 60%pound of flesh 90%

still hungry?

!42/50

company size

data

s2ss2s+att

att

ensemblesranging

forms

TQ

log scale

R<latexit sha1_base64="Oqw4IX1NbrtHQ4FdS01jhIKtqYc=">AAACBXicbVC7TsMwFL3hWcqrwMhiUSExVQkLbFRiYSyIPlAbVY7rtFZtJ7IdUIg68wGs8AlsFSvfwcjEb+C0HaDlSJaOzrlH9/oEMWfauO6ns7S8srq2Xtgobm5t7+yW9vYbOkoUoXUS8Ui1AqwpZ5LWDTOctmJFsQg4bQbDy9xv3lOlWSRvTRpTX+C+ZCEj2FjpriOwGQQBuumWym7FnQAtEm9Gyhfjx68qANS6pe9OLyKJoNIQjrVue25s/Awrwwino2In0TTGZIj7tG2pxIJqP5scPELHVumhMFL2SYMm6u9EhoXWqQjsZH6gnvdy8V/vAevUBufWm/Dcz5iME0MlmW4PE45MhPJKUI8pSgxPLcFEMfsBRAZYYWJscUXbjDffwyJpnFY8y6/dctWFKQpwCEdwAh6cQRWuoAZ1ICDgGV7g1Xly3pyx8z4dXXJmmQP4A+fjB0Dgm5o=</latexit><latexit sha1_base64="rTqW+aXUTRJS6OMND9p3m2O5PKA=">AAACBXicbVC7TsMwFL0pr1JeBUYWiwqJqUpY6FiJhbEg+kBtVDmu01q1nch2QFHUmQ9ghU9gQ6x8B1/Ab+C0GaDlSJaOzrlH9/oEMWfauO6XU1pb39jcKm9Xdnb39g+qh0cdHSWK0DaJeKR6AdaUM0nbhhlOe7GiWAScdoPpVe53H6jSLJJ3Jo2pL/BYspARbKx0PxDYTIIA3Q6rNbfuzoFWiVeQGhRoDavfg1FEEkGlIRxr3ffc2PgZVoYRTmeVQaJpjMkUj2nfUokF1X42P3iGzqwyQmGk7JMGzdXfiQwLrVMR2Mn8QL3s5eK/3iPWqQ0urTdhw8+YjBNDJVlsDxOOTITyStCIKUoMTy3BRDH7AUQmWGFibHEV24y33MMq6VzUPctv3FrTLToqwwmcwjl4cAlNuIYWtIGAgGd4gVfnyXlz3p2PxWjJKTLH8AfO5w+XQZkM</latexit>

R<latexit sha1_base64="Oqw4IX1NbrtHQ4FdS01jhIKtqYc=">AAACBXicbVC7TsMwFL3hWcqrwMhiUSExVQkLbFRiYSyIPlAbVY7rtFZtJ7IdUIg68wGs8AlsFSvfwcjEb+C0HaDlSJaOzrlH9/oEMWfauO6ns7S8srq2Xtgobm5t7+yW9vYbOkoUoXUS8Ui1AqwpZ5LWDTOctmJFsQg4bQbDy9xv3lOlWSRvTRpTX+C+ZCEj2FjpriOwGQQBuumWym7FnQAtEm9Gyhfjx68qANS6pe9OLyKJoNIQjrVue25s/Awrwwino2In0TTGZIj7tG2pxIJqP5scPELHVumhMFL2SYMm6u9EhoXWqQjsZH6gnvdy8V/vAevUBufWm/Dcz5iME0MlmW4PE45MhPJKUI8pSgxPLcFEMfsBRAZYYWJscUXbjDffwyJpnFY8y6/dctWFKQpwCEdwAh6cQRWuoAZ1ICDgGV7g1Xly3pyx8z4dXXJmmQP4A+fjB0Dgm5o=</latexit><latexit sha1_base64="rTqW+aXUTRJS6OMND9p3m2O5PKA=">AAACBXicbVC7TsMwFL0pr1JeBUYWiwqJqUpY6FiJhbEg+kBtVDmu01q1nch2QFHUmQ9ghU9gQ6x8B1/Ab+C0GaDlSJaOzrlH9/oEMWfauO6XU1pb39jcKm9Xdnb39g+qh0cdHSWK0DaJeKR6AdaUM0nbhhlOe7GiWAScdoPpVe53H6jSLJJ3Jo2pL/BYspARbKx0PxDYTIIA3Q6rNbfuzoFWiVeQGhRoDavfg1FEEkGlIRxr3ffc2PgZVoYRTmeVQaJpjMkUj2nfUokF1X42P3iGzqwyQmGk7JMGzdXfiQwLrVMR2Mn8QL3s5eK/3iPWqQ0urTdhw8+YjBNDJVlsDxOOTITyStCIKUoMTy3BRDH7AUQmWGFibHEV24y33MMq6VzUPctv3FrTLToqwwmcwjl4cAlNuIYWtIGAgGd4gVfnyXlz3p2PxWjJKTLH8AfO5w+XQZkM</latexit>

RL

!43/50

Yandex Алисаhttps://www.youtube.com/watch?v=_law_tey0OQ

Amazon Alexa + skills

Google dialog flowhttps://dialogflow.com/

https://getstoryline.com/Storyline(Alexa skills)

https://deeppavlov.ai/Deep Pavlov

Ecosystem?

!44/50

some open problems

!45/50

Chinese room (open domain)

!46/50

no good objectives

(ACC is mostly used)

Best evaluated by

!47/50

utterance-utterance

Hi. My phone doesn’t work I broke it.

Hello. Tell me model of your phone. You mean physically or some program doesn’t work?

Either boring data labeling...

... or merge utterance in a single sentence?!48/50

Hi. My phone doesn’t work I broke it.

Hello. Tell me model of your phone. You mean physically or some program doesn’t work?

˟

then nmt on I/O all pairs

sounds good, does work. But don't know why.

too long | high I/O variation sentences stuck in "I don't know"

!49/50

thanks!

see this soon on

m3ucat@gmail.com

https://oleksandr-khryplyvenko.github.io/

top related