siri, watson and natural language...

Siri, Watson and Natural Language Processing

Intelligent Software Lab. POSTECHProf. Gary Geunbae Lee

Contents Siri, Watson and NLP Multi-domain ontology reasoning dialog systems

for intelligent assistant Multi-strategy knowledge search Question

Answering systems Multi-party open proactive dialog systems Dialog system for English education

Siri, Watson and NLP

4

Apple SiriSection

• Siri (IOS)– Released in October 4, 2011– Collaborated dialog interface,

using Web-search engine (Bing, Google, Yahoo)

– offshoot of CALO project funded by DARPA

– Implemented with Nuance Speech

Siri – your wish is its command

• Services– Built-in apps

• Worldwide– Phone, FaceTime, Music, Mail, Messages,

Calendar, Reminders, Notes, Contacts, Weather, Stocks, WebSearch, find My Friends,Alarms, World Clock and Timer, Wolfram:Alpha,Wikipedia search

• In the US– Maps, Local search with Yelp!

– Chatting • (maybe) it does not support discourse contexts.

– Ecosystem of partners (before apple purchased Siri)• Restaurant and Business Questions: OpenTable, Gayot, CitySearch, BooRah,

Yahoo Local, ReserveTravel, Localeze• Events and Concert information: Eventful, StubHub, LiveKick• Movie Information: MovieTickets, RottenTomatoes, New York Times• Factual Question Answering: Bing Answers, Wolfram Alpha• Web Search: Bing, Yahoo, Google

Sample Dialogs (chatting)U: Who are youS: I’m just a humble virtual assistantU: Seriously, who are you?S: Who I am isn’t importantU: No for real I really want to know who you areS: OK, maybe not.U: Are you SiriS: That’s me!U: What is the meaning of lifeS: I can’t answer that now, but give me some time to write a very long

play in which nothing happens.U: Once again what is the meaning of lifeS: Try and be nice to people, avoid eating fat, read a good book every

now and then, get some waking in, and try to live together in peace and harmony with people of all creeds and nations

Sample Dialogs (tasks)U: Remind me to order some flowers for my lovely wife and adorable

daughter tomorrow at 9 AMS: Here’s your reminder for tomorrow at 9 am, Shall I create it?U: Yes pleaseS: OK, I’ll remind you.U: Check apple stock priceS: Apple Inc. went up significantly today. 400 dollars and 29 centsU: Send to EwdisonS: What do you want to say to Ewdison Then?U: Hey, hope you’re feeling better, see you tomorrow at work … (Siri barges

in)S: I updated your message. Ready to send it?

• (Message: Hey hope you’re feeling better)U: NoS: To continue, you can Send, Cancel, Review, or Change itU: Change it…

Reminder

Stock

Messages

Architecture

• Active ontologies as a method and apparatus for building Siri

Survey on Siri Patent 8

Fig. 1 Intelligent Automated Assistant (from Siri Patent)

9

Google NowSection

• Google now (Android)– Initial-Released in July, 2012

(Galaxy nexus) – Collaborated dialog interface,

using Google Voice search and Web search

– Leverages Google Knowledge Graph project, analyzing meaning and connections of result

– Context understanding for proactive service

10

MS CortanaSection

• Cortana (Windows Phone)– Released in April 2, 2014– Collaborated dialog interface,

using Bing search Engine and Azure Cloud service

– Can also recognize music– Well known for predicting

winners of first 14 matches of 2014 World Cup

– Show with MS deep neural network to identify cats

Dan Jurafsky

Question Answering: IBM’s Watson

• Won Jeopardy on February 16, 2011!

WILLIAM WILKINSON’S “AN ACCOUNT OF THE PRINCIPALITIES OF

WALLACHIA AND MOLDOVIA”INSPIRED THIS AUTHOR’SMOST FAMOUS NOVEL

Bram Stoker

Dan Jurafsky

Types of Questions in Modern Systems

• Factoid questions• Who wrote “The Universal Declaration of Human Rights”?• How many calories are there in two slices of apple pie?• What is the average age of the onset of autism?• Where is Apple Computer based?

• Why, how (procedure), what is (definition), list up, etc…• Complex (narrative) questions:

• In children with an acute febrile illness, what is the efficacy of acetaminophen in reducing fever?

• What do scholars think about Jefferson’s position on dealing with pirates?

13/44

KB

14

IBM Watson Platform and Application

GenieMD Inc.health care app

Majestyk Apps.edu support app

Red Ant.retail sale business intelligence app

15

IBM Watson - Recent ApplicationsSection

Watson Engagement Advisor

WatsonDiscovery Advisor

Watson Explorer

발표자프레젠테이션 노트--KNOW-MEReflexis StorePulseReflexis Systems, Inc.날씨, 소셜미디어 동향, 지역 행사, 뉴스 등의 정보원으로부터 소비동향을 예측하고이에 대응하기 위한 판매전략을 도출하여 통보http://www.businesswire.com/news/home/20141008006500/en/IBM-Reflexis-Tap-Power-Watson-Transform-Retail#.VDktWPl_swA

--EMPOWER-MEWatson discovery advisorBaylor College of Medicine, Johnson & Johnson자연어 이해 능력을 이용하여 다양한 분야에서 생산되는 대량의 문헌을 이해하고 분석하여인간이 미처 발견하지 못한 가설 혹은 데이터 상의 연결점을 도출하여 통보http://www.ibm.com/smarterplanet/us/en/ibmwatson/discovery-advisor.html

--ENGAGE-MERecipe generation demo at SXSW 2014Institute of Culinary EducationWatson 시스템이 메뉴에 사용할 주 재료와 문화권 등의 스타일을 지정받은 뒤기존 조리법의 선호도 및 재료간의 조화에 대한 정보를 이용하여 새로운 조리법을 생성http://asmarterplanet.com/blog/2014/02/food-thought-ibm-watson-whips-creativity.html

16

IBM Watson – EcosystemSection

Recipe generation

• Watson Developer Cloud• Public API

• Watson Content Store• Content providing network

• Watson Talent Hub• Talent expert matching

Dan Jurafsky

Language Technology

Coreference resolution

Question answering (QA)

Part-of-speech (POS) tagging

Word sense disambiguation (WSD)

Paraphrase

Named entity recognition (NER)

ParsingSummarization

Information extraction (IE)

Machine translation (MT)Dialog

Sentiment analysis

mostly solved

making good progress

still really hard

Spam detection

Let’s go to Agra!

Buy V1AGRA …

✓

✗

Colorless green ideas sleep furiously.

ADJ ADJ NOUN VERB ADV

Einstein met with UN officials in Princeton

PERSON ORG LOC

You’re invited to our dinner party, Friday May 27 at 8:30

PartyMay 27add

Best roast chicken in San Francisco!

The waiter ignored us for 20 minutes.

Carter told Mubarak he shouldn’t run again.

I need new batteries for my mouse.

The 13th Shanghai International Film Festival…

第13届上海国际电影节开幕…

The Dow Jones is up

Housing prices rose

Economy is good

Q. How effective is ibuprofen in reducing fever in patients with acute febrile illness?

I can see Alcatraz from the window!

XYZ acquired ABC yesterday

ABC has been taken over by XYZ

Where is Citizen Kane playing in SF?

Castro Theatre at 7:30. Do you want a ticket?

The S&P500 jumped

What’s hard – ambiguities, ambiguities, all different levels of ambiguities

John stopped at the donut store on his way home from work. He thought a coffee was good every few hours. But it turned out to be too expensive there. [from J. Eisner]

- donut: To get a donut (doughnut; spare tire) for his car?- Donut store: store where donuts shop? or is run by donuts? or looks like a

big donut? or made of donut?- From work: Well, actually, he stopped there from hunger and exhaustion,

not just from work.- Every few hours: That’s how often he thought it? Or that’s for coffee?- it: the particular coffee that was good every few hours? the donut store?

the situation- Too expensive: too expensive for what? what are we supposed to conclude

about what John did?

Dan Jurafsky

non-standard English

Great job @justinbieber! Were SOO PROUD of what youve accomplished! U taught us 2 #neversaynever & you yourself should never give up either♥

segmentation issues idioms

dark horseget cold feet

lose facethrow in the towel

neologisms

unfriendRetweet

bromance

tricky entity names

Where is A Bug’s Life playing …Let It Be was recorded …… a mutation on the for gene …

world knowledge

Mary and Sue are sisters.Mary and Sue are mothers.

But that’s what makes it fun!

the New York-New Haven Railroadthe New York-New Haven Railroad

Why else is natural language understanding difficult?

Levels of Language

• Phonetics/phonology/morphology: what words (or subwords) are we dealing with?

• Syntax: What phrases are we dealing with? Which words modify one another?

• Semantics: What’s the literal meaning?• Pragmatics: What should you conclude from the

fact that I said something? How should you react?

20

21

Recent Trend of Application using NLPSection

• Summary of Gartner Report, 2014

22

Recent Trend of Application using NLP/AISection

• Summary of Gartner Report (cont.)– Scale of Market:

• 53 billion $ in 2012, will grow to 113 billion $ in 2017• About 6 billion $ in 2015 in domestic market (2012, KISTI report)

– Riffle effect: • About 1.1 billion users will use Intelligent Personal Assistant system in 2015• About 1 billion vehicles will using Artificial Intelligence

– NLP using Deep Learning: • Recent Watson adopted cloud system for distributed computing• MS launched “Adam” project using Neural Network technique

IOT2H (Internet of things to Human)

23

Siri KGSDS UI/UX

WATSON

Red antMajestykapps

genieMD

IOT2H Platform-communication (logos-pathos-ethos): natural language processing/emotion-thinking (smart): reasoning/ontology-knowledge (exo-brain): knowledge question answering/retrieval

IOT2H service- co-op service (human in the loop) for health, home, mobile, education

Multi-domain ontology reasoning

dialog systems for intelligent

assistant

SPOKEN DIALOG SYSTEM (SDS)

Interactive Question Answering New challenges for Question Answering System [TREC ciQA; HLT-NAACL2006 workshop]

Series of related questions in a session / Interact with other people Should handle anaphora, ellispses and other discourse related problems But still mainly user initiative; no dialog “management”

POS Tagging

Answer TypeIdentification

AnswerJustification

Query Formation

Dynamic AnswerPassage Selection

Answer Finding

DocumentRetrieval

Answer Type

Answer1

Question-m

Question2Question1

……..

Answer2…….Answer-m

Tele-service

Car-navigation Home networking

Robot interface

SDS APPLICATIONS

ASR (automatic speech recognition)

FeatureExtraction Decoding

AcousticModel

PronunciationModel

LanguageModel

버스 정류장이어디에있나요?

Speech Signals Word Sequence

버스정류장이어디에있나요?

NetworkConstruction

SpeechDB

TextCorpora

HMMEstimation

G2P

LMEstimation

WO

)()|(maxargˆ WPWOPWLW∈

=

SPEECH UNDERSTANDING (in general)

Computer Program

Speaker ID /Language ID

Sentiment / Opinion

Named Entity / Relation

Topic / Intent

Speech Segment

Summary

Syntactic / Semantic Role

SQL

Meaning Representation

Dave /English

Nervous

LOC = pod bayOBJ = door

Control the Spaceship

Open the doors.

Open=Verb, the=Det. ...

select * from DOORS where ...

REPRESENTATION Semantic frame (slot/value structure) [Gildea and Jurafsky, 2002]

An intermediate semantic representation to serve as the interface between user and dialog system

Each frame contains several typed components called slots. The type of a slot specifies what kind of fillers it is expecting.

“Show me flights from Seattle to Boston”

ShowFlight

Subject Flight

FLIGHT Departure_City Arrival_City

SEA BOS

FLIGHT

SEABOS

Semantic representation on ATIS task; XML format (left) and hierarchical representation (right) [Wang et al., 2005]

Knowledge-based Systems Knowledge-based systems:

Developers write a syntactic/semantic grammar A robust parser analyzes the input text with the grammar Without a large amount of training data

Previous works MIT: TINA (natural language understanding) [Seneff, 1992] CMU: PHEONIX [Pellom et al., 1999] SRI: GEMINI [Dowding et al., 1993]

Disadvantages1) Grammar development is an error-prone process2) It takes multiple rounds to fine-tune a grammar3) Combined linguistic and engineering expertise is required to

construct a grammar with good coverage and optimized performance

4) Such a grammar is difficult and expensive to maintain

31

Two Classification Problems

HOW TO SOLVE: STATISTICAL APP

Find Korean restaurants in Daeyidong, PohangInput:

Output: SEARCH_RESTAURANT

Dialog Act Identification

FOOD_TYPE ADDRESS CITY

Find Korean restaurants in Daeyidong, PohangInput:

Output: Named Entity Recognition

Encoding:

x is an input (word), y is an output (NE), and z is another output (DA).

Vector x = {x1, x2, x3, …, xT} Vector y = {y1, y2, y3, …, yT} Scalar z

Goal: modeling the functions y=f(x) and z=g(x)

PROBLEM FORMALIZATION

x Find Korean restaurants

in Daeyidong

, Pohang .

y O FOOD_TYPE-B O O ADDRESS-B O CITY-B O

z SEARCH_RESTAURANT

MACHINE LEARNING FOR SLU Background: Maximum Entropy (a.k.a logistic regression)

Conditional and discriminative manner Unstructured! (no dependency in y) Dialog act classification problem

Conditional Random Fields [Lafferty et al. 2001] Structured versions of MaxEnt (argmax search in inference) Undirected graphical models Popular in language and text processing Linear-chain structure for practical implementation Named entity recognition problem

z

x

yt-1 yt yt+1

xt-1 xt xt+1

fk

gk

hk

DIALOG MANAGEMENT GOAL Answer your query (e.g., question and order)

given the task domain It includes : Provide query results Ask further slot information Confirm user utterance Notify invalid query Suggest the alternative

Related to dialog complexity and task complexity.

In practice Find the best system action a given the dialog state s

DESIGN ISSUES Task complexity How hard the task is? How much the system has domain knowledge?

Simple Complex

Call Routing

CollaborativePlanning

WeatherInformation

Conversational English Tutoring

AutomaticBanking

DESIGN ISSUES Dialog complexity Which dialog phenomena are allowed

Initiative strategies e.g., system-initiative vs. user-initiative vs. mixed-initiative

Meta-dialogs; the dialog itself e.g., Could you hold on for a minute?

Subdialogs; clarification/confirmation e.g., You selected KE airlines, is it right?

Multiple dialog threads e.g., domain switching

DIALOG EXAMPLES Example 3

U: I’d like to have African food in Gangnam, Seoul S: Sorry, there are no African restaurants. S: How about American restaurants in Gangnam, Seoul?U: No I don’t like it.S: What is your favorite food?U: I like grilled and seasoned beef S: So, how about Korean restaurants?U: Good.

Mixed-initiative Implicit/Explicit confirmation Recommends the alternative when query fails Most natural dialog flow

KNOWLEDGE-BASED DM (KBDM) Rule-based approaches Early KBDMs were developed with handcrafted

rules (e.g., information state update). Simple Example [Larsson and Traum, 2003]

Agenda-based approaches Recent KBDMs were developed with domain-

specific knowledge and domain-independent dialog engine.

AGENDA-BASED DM RavenClaw DM (CMU) Using Hierarchical Task Decomposition

A set of all possible dialogs in the domain Tree of dialog agents Each agent handles the corresponding part of the dialog

task

[Bohus and Rudnicky, 2003]

Vanilla EXAMPLE-BASED DM (EBDM) Example-based approaches

Dialog State Space

Domain = Building_GuidanceDialog Act = WH-QUESTIONMain Goal = SEARCH-LOCROOM-TYPE=1 (filled), ROOM-NAME=0 (unfilled)LOC-FLOOR=0, PER-NAME=0, PER-TITLE=0Previous Dialog Act = , Previous Main Goal = Discourse History Vector = [1,0,0,0,0]Lexico-semantic Pattern = ROOM_TYPE 이어디지 ?System Action = inform(Floor)

Dialog CorpusUSER: 회의 실이 어디지 ?[Dialog Act = WH-QUESTION][Main Goal = SEARCH-LOC][ROOM-TYPE =회의실]SYSTEM: 3층에 교수회의실, 2층에대회의실, 소회의실이있습니다. [System Action = inform(Floor)]

Turn #1 (Domain=Building_Guidance)

Dialog Example

Indexed by using semantic & discourse features

Having the similar state

),(argmax* heSe iEei∈

=

Cheongjae Lee, Sangkeun Jung, Seokhwan Kim, Gary Geunbae Lee. Example-based dialog modeling for practical multi-domain dialog system. speech communications, 51:5 (466-484), May 2009

STOCHASTIC DM Supervised approaches [Griol et al., 2008] Find the best system action to maximize the

conditional probability P(a|s) given the dialog state Based on supervised learning algorithms

MDP/POMDP-based approaches [Williams and Young, 2007] Find the optimal system action to maximize the reward

function R(a|s) given the belief state Based on reinforcement learning algorithms

In general, a dialog state space is too large So, generalizing the current dialog state is important

Template-based System Utterance Generation

System Utterance Generator

SystemTemplate

DB

System Action

Dialog Frame

Retrieved Result

Inform_cast

Program : 시크릿 가든

Cast : 현빈, 하지원

의주인공은입니다.

OOD/DD (Out-of-Domain/Domain Detection)

Utterance

Domain Detection

IN-DOMAIN

Task Dialog Service

OOD-CHAT

Chat Dialog Service

OOD-TASK

Rejection Message

OOD Utterance Rejection (Confidence Combination Approach)

Score– S(i) = λFOR * SFOR(i) + λDOD * SDOD + λDAC(i) * SDAC + λIDV(i) * SIDV(i)

FOR

DAC

IDV

DOD

NER

Positive example : IN-DOMAIN corpusNegative example : OOD-CHAT corpusFeature : lexical unigram & bigram

Data : IN-DOMAIN corpusFeature : lexical unigram & bigram

Corpus : TID corpusFeature : lexical unigram & bigram

Data : TID corpusFeature : lexical features+ Named entity dictionary

Positive example : TID corpusNegative example : OOD-CHAT corpusFeature : OOV-LSP unigram & bigram

ScoreFOR

ScoreDOD

ScoreDAC

ScoreIDV

FinalIn-DomainVerification

IN-DOMAIN

OOD

λ

Seonghan Ryu, Jaiyoun Song, Sangjoon Koo, Soonchoul Kwon, Gary Geunbae Lee. Detecting multiple domains from user’s utterance in spoken dialog system. Proceedings of the international workshop series on spoken dialog systems (IWSDS 2015), Jan 2015, Busan

MULTI-MODAL DIALOG SYSTEM

x y

InputGesture

OutputSystem

Response

(x, y)

Training examples

Learning algorithm

InputSpeech

Inputface

TASK PERFORMANCE AND USER PREFERENCE Task performance and user preference for

multimodal over speech only interfaces [Oviatt et al., 1997] 10% faster task completion, 23% fewer words, (Shorter and simpler linguistic constructions) 36% fewer task errors, 35% fewer spoken disfluencies, 90-100% user preference to interact this way.

• Speech-only dialog system

Speech: Bring the drink on the table to the side of bed

• Multimodal dialog System

Speech: Bring this to herePen gesture:

Easy, Simplified

user utterance !

Dialog System Development Toolkit Features

Web-based Interface Providing easy-to-use interfaces for developers Controlling complicated processes in an efficient and stable manner

Domain Dialog Corpus

Definition SLU Corpus

NLG Template

Contents

Statistics

Validation

Training

Evaluation

Dialog System

Log Analysis

Design Acquisition& Annotation

RunningTraining Maintenance

WorkflowScreen shot

Donghyeon Lee, Kyungduk Kim, Cheongjae Lee, Junhwi Choi, Gary Geunbae Lee. D3 toolkit: A development toolkit for daydreaming spoken dialog system. Proceedings of the 2nd International Workshop on Spoken Dialog Systems Technololgy (IWSDS 2010), Oct 2010, Japan. (LNAI 6392, Springer)

AUTOMATED DIALOG SYSTEM EVALUATION

Sangkeun Jung, Cheongjae Lee, Kyungduk Kim, Minwoo Jeong, Gary Geunbae Lee. Data-driven user simulation for automated evaluation of spoken dialog systems, computer speech and language, 23(4): 479-509, Oct 2009

Querying with Inference Engine

Match entry ChannelFeb 5 ManU vs Chelsea football KBS

Let’s watch Wayne Rooney’s game

SLU

Wayne Rooney : Person name

Query Generation

SELECT ?match ?entry ?channelFROM WHERE { ?match owl:hasMonth owl:Dec .

?match owl:hasDay owl:d_12 .owl:Rooney owl:isMemberOf ?t .?match owl:hasTeam ?t .?match owl:hasEntry ?entry?match owl:hasChannel ?channel}

Result

HyeongJong Nho, Cheongjae Lee, Gary Geunbae Lee. Ontology-based inference for information-seeking in natural language dialog system. Proceedings of the 6th IEEE international conference on industrial informatics (IEEE INDIN 2008) July 2008, Dajeon Korea

51

Platform: Multi-Domain Ontology Reasoning Intelligent Assistant Dialog System Platform

Spoken Language Understanding (SLU)

Input Sentence

Knowledge Graph

Intent Determination Named Entity Recognition

Output

Action Selection

Response Generation

Service Execution

Complete

POMDP-based Disambiguation

Discourse & Anaphor Processing

YesNo

Ontology / Reasoning Service AgentA

PITask DB/KB

52

Open-Domain Spoken Language Understanding

• Traditional spoken dialog systems first detect a domain from the input sentence and perform domain-specific SLU

Ontology

Input Sentence

Spoken Language Understanding (SLU)

Intent Determination Named Entity Recognition

Domain Selection

Semantic Representation

Domain

Input Sentence

Domain Detection

SLUTV Program Guide

SLUMusic Guide

SLURestaurant Guide

• However, we first perform open-domain SLU

• We exploit ontology as important resource in understanding processes

Patent pending

53

• Open named entity recognition (AIDA)– 1. mentions are detected using the Stanford NER Tagger– 2. mentions are mapped onto canonical entities in a knowledge base

Open-Domain Spoken Language Understanding

Mentions

Candidate EntitiesKnowledge Bases

Mention-EntityPair

Entity-EntityPair

Yosef et al. “AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables,” Proc. VLDB 2011

Open-Domain Named Entity Recognition

Detection of NE Mentions

Input Sentence

Dictionary

Filtering of NE Candidates

NE Candidates

Filtered NE Candidates

Evaluation of NE Combinations Semantic LM

Generation of NE Combinations

NE Combinations

Best NE Combination

Overall Architecture Goals

Mendes et al., “DBPedia Spotlight: Shedding Light on the Web of Documents”, Proc. International Conference on Semantic Systems 2011 Yosef et al. “AIDA: An Online Tool for Accurate Disambiguation of Named Entities in Text and Tables” Proc. International Conference on

Very Large Databases 2011 Roth et al. “Wikificationand Beyond: The Challenges of Entity and Concept Grounding”, Tutorial in ACL, 2014

Large-scale named entity

dictionary from Knowledge base

(e.g. DBpedia, Freebase, Yago)

Entity type disambiguation is

performed based on semantic

language model

Detection of Multi-intents from a Sentence

Traditional spoken dialog systems focus on processing simple input sentences

that express only one intent → single intent (SI) type

However, in the real world, users often express multiple intents (MIs) within one

dialog turn → MI conjunctive (MI.C) and MI non-conjunctive (MI.N) types

We named this task MI detection (MID)

“what is the genre of big bang theory and tell me the story about it”

Detection of multi-intents

search-genre

search-introduction

User’s Utterance

55

Detection of Multi-intents from a Sentence

POS Tagging

Detection of Conjunction

Disambiguation of Sentence Boundary

Restoration of Original Sentences

Evaluation of multi-intent hypotheses

Detection of single-intent

Input Sentence

POS-tagged Input Sentence

Multi-intenthypotheses

Single-intent

Final answer

Multi-intenthypotheses

SI MI.C MI.N Avg.

Baseline 97.04% 65.37% 65.08% 87.50%

Proposed 96.62% 92.11% 94.40% 95.61%

SI MI.C MI.N Avg.

Baseline 96.64% 60.32% 63.02% 86.15%

Proposed 95.95% 94.17% 92.07% 95.10%

Korean

English

Overall Architecture Results

Seonghan Ryu, Junhwi Choi, Younghee Kim, Sangjoon Koo, Gary Geunbae Lee. A two-stage approach to multi-intent detection for spoken language understanding. Submitted to the 40th international conference on acoustics, speech and signal processing (ICASSP 2015), April 2015, Brisbane

Out-of-Domain / Domain Detection

Traditional spoken dialog systems assumed that all user utterances belong to only one domain

0.8 0.2 0.3 0.9

Extraction of Features

“I want news now”

Binary Classification

...Feature vector: X

xi = [0 ... 1]y = {positive, negative}

Word sequence: W

x1 x2 xn-1 xn

PositivePositive or negative: y

Ryu et al. “A hierarchical domain model-based multi-domain selection framework for multi-domain dialog systems,” Proc. Coling 2012 Ryu et al. “Exploiting out-of-vocabulary words for out-of-domain detection in dialog systems,” Proc. BigComp 2014 Ryu et al. “Detecting Multiple Domains from User’s Utterance in Spoken Dialog System,” Proc. IWSDS 2015

However, in the real world, users often express multi-domain requests or out-of-domain requests

We proposed a framework that performs multi-domain detection and out-of-domain detection

In each domain, various features are extracted from an input sentence and perform binary classification

Any news is on now?

User’s Utterance

Spoken Language Understanding

TV epg

Radio epg

Out-of-Domain / Domain Detection

Extraction of features

Input sentence

Part-of-speech tagging

Preprocessed sentence

Intent determination

NER

Intent determination model

NER model

Lexical LM scoring

Intent and NEco-occurrence table

Lexical LM score

LSP LM scoring

LSP LM score

Intent

Named entities

Mapping

Semantic consistency

Lexical LM

LSP LM

LSP lexicon

x1: confidence score of intent determination

x4: probability of the input sentence x5: probability of the lexico-semantic pattern of the input sentence

x2: confidence score of named entity recognition

x3: semantic consistency of intent and named entities

※We are currently working on exploiting distributed word representation in language modeling

59

POMDP-DM with Hybrid ArchitectureSection

• Motivation of proposed method– Uncertainty Problem in Deterministic-DM

• Difficulty in making proper actions for given ambiguous input– Scalability Problem in POMDP-DM

• Difficulty in designing / tracking dialog state• Difficulty in training POMDP policy• Difficulty in eliciting system action

• Core idea of the hybrid architecture – Generate summary meta-actions with POMDP framework– Translate the actions into system output with Deterministic framework

60


• Concept diagram of proposed architecture

Ambiguous Input Meta Action

Meta Action Selector Service DM

Input

CorrespondingComponent

OutputMeta Action = Confirm

System Action

POMDPAction Selector

Service DM(Rule-based DM,

Example-based DM)

Meta Action = Submit

61


• Main architecture of proposed architecture

Tracker Part

TrackerModel

FeatureExtractor

Meta action selector

POMDPAction

Selector

SummaryState

Service Provider

ServiceDialog

Management

SlotDB

ResponseDB

User Input Recognition

ASR/NLUResult

Corresponding Architecture

POMDPModel

Ambiguous User Input

ASR/NLUResult

Tracked Result

MetaAction

PhonemeMatcher

Confirm 1st value

Request Slot Value

Provide Service Sentence

POMDPArchitecture

62


• Tracking Belief State– Estimation of observation 𝑜𝑜 from NLU hypothesis 𝐻𝐻 : 𝑃𝑃(𝑜𝑜|𝐻𝐻)

• Phoneme/Word-level Matcher• Example : 𝑃𝑃 𝑜𝑜𝑚𝑚𝑚𝑚𝑚𝑚𝑚𝑚 𝑎𝑎𝑚𝑚𝑙𝑙𝑙𝑙 = 𝑚𝑚𝑚𝑚𝑎𝑎𝑚𝑚 ≈ 0.78 ,𝑃𝑃 𝑜𝑜𝑙𝑙𝑒𝑒𝑒𝑒 𝑎𝑎𝑚𝑚𝑙𝑙𝑙𝑙 = 𝑚𝑚𝑚𝑚𝑎𝑎𝑚𝑚 ≈ 0.0

– Estimation of belief update : b s′ s = 𝑃𝑃(𝑠𝑠𝑠|𝑠𝑠, 𝑎𝑎, 𝑜𝑜)• Rule-based Tracking to relieve computational complexity

south north east west

Area

Probability

none

U1 : western food please?

S1 : How may I help you?

south north east west

Area

Probability

none

U2 : No, I don’t mean it

S2 : You mean west restaurant?

63


• Generating Meta-Action– Construction of Summary-State

• Bulid Summary-State for 1st , 2nd value in each slot value• Also Build “User Intention Slot” Summary-State for 1st , 2nd value

Blaise Thompson and Steve Young. “Bayesian update of dialog states : A POMDP framework for spoken dialog systems”Computer Speech & Language 2010 vol. 24 Issue 4. pp. 562-588

64


• Generating Meta-Action (cont.)– Construction of POMDP framework

• Construct separate POMDP framework for UI slot and NE slots• Train each POMDP framework independently

0

1

1st 2nd

b'(s)

0

1

1st 2nd

b'(s)


UI NE #1

POMDP Action Selector

Submit

Restart

Meta Response

SystemActionModel

Template DB

Slot DB

NE #2

0

1

1st 2nd

b'(s)

Output Sentence

Submit

65


• Generating Meta-Action (multiple slot values)– Construction of POMDP framework

• Construct separate POMDP framework for UI slot and NE slots• Train each POMDP framework independently

0

1

1st 2nd

b'(s)

0

1

1st 2nd

b'(s)


UI NE #1

POMDP Action Selector

Submit

Restart

Meta Response

SystemActionModel

Template DB

Slot DB

NE #2

0

1

1st 2nd

b'(s)

Output Sentence

Submit

UI – POMDP Model Training

NE – POMDP Model Training

Model Construction

UI – POMDP Model Training

NE – POMDP Model Training

Model Construction

66


• Experiment (Change of Reward on Learning curve)– Observing learning curve in training process

• Each POMDP Component were trained in 400 Epochs Convergence over reward was observed

-60

-50

-40

-30

-20

-10

0

10

0 100 200 300 400

Ave

rage

Rew

ard

Epoch

Average Reward [UI Slot]

-120

-100

-80

-60

-40

-20

0

20

0 100 200 300 400

Ave

rage

Rew

ard

Epoch

Average Reward [NE Slot]

Sangjun Koo, Seonghan Ryu, Kyusong Lee, Gary Geunbae Lee. Scalable summary-state pomdp hybrid dialog system for multiple goal drifting requests and massive slot entity instances. Proceedings of the international workshop series on spoken dialog systems (IWSDS 2015), Jan 2015, Busan

67

Ontology-based Inference System [1/3]Section

• Ontology-based inference– Integrate cross-domain knowledge by ontology and its inference rules

• Used for :– IOT dialog system– Smart home– Smart healthcare

68


• Ontology– OWL

• Family of knowledge representation languages for knowledge bases

• Inference rules– SWRL

• OWL-DL + RuleML– The datalog sublanguage of Horn clause

FastComputer(? c)← Computer ? c ⋀ hasCPU(? c, ? cpu)⋀hasSpeed(?cpu, ? sp)⋀HighSpeed(? sp)

69


Spoken Language Understanding

Natural Language Generation

Dialog Manager

Dialog Modeling

Knowledge Manager

Inference Engine

Ontology Resource

ASR output

Named entitiesUser intentions

System response

KnowledgeSystem action

Generated querystatement

70

An Example Scenario of Inference ProcessSection

Semantic Representation for Input Sentence

raw utterance sentence I want to eat something spicy.

intent ask_food_recommendationnamed entity something spicyAfter Searching Ontologiesintent ask_food_recommendation_in_fridge

named entity something spicyKnowledge for User/Environmentsspeaker Tomfavorite food Tteok-bokki, Spaghetti, ...fridge materials Tteok, hot pepper paste, spring onion,

...Knowledge for Reasoningpremise

1Tom likes Tteok-bokki, Spaghetti, ...

premise2

In the fridge, there are Tteok, hot pepper paste, spring onion, ...

premise3

The recipe of Tteok-bokki is Tteok, hot pepper paste, and spring onion

premise4

Tteok-bokki is a kind of spicy food

premise5

Tom now wants to eat something spicy

Output of DMsystem action suggest_foodoutput sentence template How about eating {food} ?

output sentence How about eating Tteok-bokki?

71

Experimental resultSection

Before inference

Added weather info

After inference

Jaiyoun Song, Seonhan Ryu, Sangjun Koo, Gary Geunbae Lee. Ontology reasoning-based intelligent assistant for smart home. Proceedings of the IEEE spoken language technology workshop (SLT 2014), Dec 2014, Nevada (demo presentation)

Deep Neural Network

Deep Neural Network

Neural network + Multiple non-linear hidden layers

Why Deep Neural Network?

Deep levels of abstraction Mimics the cognition process of human

From low level (simple) to high level (complex)

73

Why Deep Neural Network?

Integrated learning Automatic feature extraction

Traditional machine learning methods

Deep neural network : Feature extractor + Classifier

74

Deep Neural Network Today

Difficulties of DNN

1. Difficult to train

Cannot use back-propagation algorithm (vanishing gradients)

Unsupervised pre-training

2. Computation-intensive Many parameters

GPGPU, cloud computing

3. Over-fitting

regularization, drop-out technique

75


Deep Belief Network [Hinton 06]

Pre-train layers with an unsupervised learning algorithm

Then, fine-tune the whole network by supervised learning

DBN are stacks of restricted Boltzmann machines (RBMs)

76


Autoencoder

NN whose the output is the same as the input

To learn is to compress data

Autoencoder learns the encoding (representation) of data

�𝑚𝑚

𝑦𝑦𝑚𝑚 − 𝑥𝑥𝑚𝑚 2

Learn to minimize

77


Drop-out method [Hinton 12]

Drop out some weights randomly

Can reduce over-fitting problem

78


Rectified linear unit (ReLU) Activation function used instead of sigmoid

Sparse coding : only some neurons have non-zero values

𝑔𝑔 𝑥𝑥 = max(0,𝑥𝑥)

0

0

79


Convolution neural network [LeCun 98] Sparse network with local features within the window only

The weights are shared between windows

Popular for image recognition

Two kinds of layers, alternatively Convolution layer : Extract features from the previous layer

Max-pooling layer : Sub-sample by taking the maximum

80

Deep Neural Network Today for NLP

Word representation 1. One-hot vector

Ex. [0 0 0 0 0 …… 0 0 0 1 0 0 …… 0 0 0 0 0 0 0]

High dimension : 20K (speech) ~ 3M (Google 1T)

2. Class-based word representation

Hard clustering

Ex. Brown clustering (Brown et al. 1992)

3. Continuous representation

Ex. Latent semantic analysis (LSA)

Random projection

Latent Dirichlet Allocation (LDA)

HMM clustering

Distributed representations (Neural word embedding)

Dense vector

Used as pre-training and supervised training improves the representation

81


Neural network language model (NNLM) Language model

Model to predict the next word given the context

NN language model

Two hidden layers

Training complexity is high

Between hidden output

Ex. Hierarchical softmax

Negative sampling

Ranking (hinge loss)

w(t-3)

w(t-2)

w(t-1)

w(t)

input projection hidden output

82


Neural network language model (NNLM) Negative sampling (unsupervised training)

A word and its context is a positive sample

A random word in that context is a negative sample

Trained to be Score(positive) > Score(negative)

w(t-3)

w(t-2)

w(t-1)

w(t)

input projection hidden output

83


Word2vec Remove the hidden layer

1000x speed-up

Continuous bag-of-words (CBoW)

Predicts the current word given the context

Skip-gram

Predicts the context given the wordw(t-3)

w(t-2)

w(t-1)

w(t)

input projection output

84

DNN based Korean Dependency Parsing [Changki Lee,2014]

Transition-based + Backward parsing O(N)

Constituency corpus Dependency corpus

After pre-processing

Deep learning based ReLU + Dropout

Better than sigmoid

Korean Word Embedding NNLM, Ranking (hinge loss, logit loss)

Word2vec

Feature Embedding Auto tagged PoS (stack + buffer)

Dependency label (stack)

Distance information

Valency information

Mutual information Massive corpus automatic parsing 85

DNN based Multi-lingual Multi-task NLP [postech on-going]

Collecting corpus for all languages, all tasks is impossible

Adaptive learning

Methods to use one language/task’s information for another language/task

Distributed word embedding is essential

Language transfer One language to another language

Pre-train with one language and further train with another language

Multi-task learning

One task to another task

Hidden layers : trained with all tasks’ data

Output layer : trained with task-specific data Parsing

Tagging

Semanticrole labeling

*Sooncheol Kwon, Byungsoo Kim, Seonyeong Park, Sangdo Han, Gary Geunbae Lee. Multi-lingual knowledge transfer for dependency parsing using deep neural network, submitted

86

Demo – youtube postech isoft https://www.youtube.com/watch?v=4jg0Tknl-Rw

multi-domain dialog system multi-modal dialog system (smart home) Inference dialog system One-step asr error correction

Multi-strategy knowledge search

Question Answering (QA) systems

89

Multi-Source Hybrid Question Answering (QA) System

User Input

Keyword ProcessQuestion Processing

Sparql Query Search

Web basedRDF Triple Database

SparqlQuery

generator

SparqlQuerySearch

Answer Selection

Answer

AnswerMerging

DBpedia

Open Information Extraction (off line)

WebDocuments

(wiki)

RDF T ripleExtrac tion

Documentsindexing

Detect Question or Keywords

AnswerRanking

Question Keywords

Keyword to Entity

Keyword to Property

Query Generator

TripleExtractor

NaturalLanguage Generator

Report

TemplateDB

DocumentProcessing

Relevant WebDocuments

IR/web search(Lucene)

AnswerProcessing

PassageRetrieval

Passages

Possible A nswer Extrac tion &Formulation

Question Analysis

Slot,Template Extractor

Keyword,Answer Type

Extractor

Keyword,Answer Type

Slot, Template

90

Section

RDFKnowledge

Base

Human 지식베이스

Answer candidates Generation

Answer candidates

Features for Query

Lexical Alteration

Disambiguation

Merge retrieved

Real SPARQL generation

Focus, LAT, query template…Graph pattern for SPARQL

Synonym/alternative words

KB property/entity

• Template-based approach [Unger et al, WWW 2012]

Knowledge based QA System

Seonyeong Park, Hyosup Shim, Gary Geunbae Lee. Isoft at QALD-4: Semantic similarity based question answering system over linked data. in Cappellato, L., Ferro, N., Halvey, M., and Kraaij, W., (eds.), CLEF 2014 Labs and Workshops, Notebook Papers (Qald task). CEUR Workshop Proceedings, vol-1180, CEUR-WS.org (2014). Sept 2014, Sheffield

91

Section

Knowledge based QA System

Semantic parsing based approach [Berant et al, EMNLP 2013]

Where was Obama born?

92

QA on Knowledge base Section

• Semantic parsing based approach – Maps natural language sentences to formal semantic representations– Independent of word order, paraphrase– Translating into KB query language is relatively easy

– Requires

• Well-defined formal representation• Set of concepts (Knowledge base)

– Previous researches focused on toy-sized KBs, recent ones utilizes bigger, more general KBs like Dbpedia, Freebase such as Sempre, paraSempre.

93


• Semantic parsing based approach [Berant & Liang, ACL 2014]– Process

• Making segmentation– Generate segmentations of sentence

• Translate segments into KB vocabulary– Lookup each segment from KB vocabulary– Keep two dictionaries of KB concepts

» Dictionary of entity» Dictionary of property

– Match named entity by string similarity– Match property by natural language – property model

• Combining– Combine segment into single formal representation– Performed based on combining rule

• Rewrite to query language

94


• Semantic parsing based approach– Example

• Question : Where was Barack Obama born?• Segmentation

– [Where] was [Barack Obama] [born] ?

• Disambiguation– [Where] Type.Location– [Barack Obama] Barack_Obama– [born] PeopleBornHere

• Combining– Type.Location ^ PeopleBornHere.Barack_Obama

95


• Compiling natural language-to-property mapping– Aligning approach

• Align pseudo triples from text to ones from KB– Extract pseudo triple from text

» Ex> Mary Todd married Abraham Lincoln on November 4, 1842»

– Disambiguate to KB entities» »

– Align to existing “real” triples in KB» pseudo triple» real triple from KB

– Collect matched phrase-property pairs from aligned triples» prefix “!” means reverse order

96


• Experiment

– Test set• 138 questions from webquestions train sets• Wh-questions on figure with single answer• Containing no alternative forms of named entity

precision recall F1-score

0.7301 0.8911 0.8025

Seonyeong PARK, Hyosup SHIM, Sangdo HAN, Byeongsoo KIM, Gary Geunbae LEE. Multi-source hybrid question answering system. Proceedings of the international workshop series on spoken dialog systems (IWSDS 2015), Jan 2015, Busan (demo presentation)

97

Open Information Extraction [Etzioni, Wu@UW] Section

• Extract triples from an sentence.– triple format : < argument1 ; relation ; argument2 >– argument : noun phrases in an sentence– relation : phrase shows relationship between two arguments

• Ex) sentence : Gautama Buddha taught primarily in Northeastern Indiatriple : < Gautama Buddha ; taught in ; Northeastern India >

• Open IE does not require any pre-specified relations.• Suitable for IE on the Web scale.

98

Open Information ExtractionSection

Dependency Pattern based IE [WOE, Wu&Weld, ACL 2010]• Extraction template :

• Ex)

• Learning extraction template– collect training data (triple-sentence pair) automatically (bootstrapping)– learn extraction template from training data

arg1 arg2rel

nsubj prep_in< arg1 ; rel in ; arg2 >

Gautama Buddha taught primarily in Northeastern India

nsubjadvmod

nnamod

prep_in

triple : < Guatama Buddha ; taught in ; Northeastern India >

99


SRL based IE [Christensen et al, NAACL workshop 2010]• SRL : identifying arguments of a predicate with their roles

– possible to convert SRL result to Open IE triples

• Ex) Eli Whitney created the cotton gin in 1793– SRL result

• predicate : created• arg0 : Whitney (Eli Whitney)• arg1 : gin (cotton gin)• argm-TMP : in (in 1793)

– conversion to Open IE triple style• < Eli Whitney ; created ; cotton gin >• < Eli Whitney ; created cotton gin in ; 1793 >

100


Current Implementation (postech, combined)• Bootstrapped Dependency pattern + SRL result

– SRL : can only extract verb mediated relation with relatively high precision

– Bootstrapped Dependency pattern : can extract both verb and noun mediated relation

– Ex) Princeton economist Paul Krugman was awarded the Nobel prize• verb mediated relation :

– < Princeton economist Paul Krugman ; was awarded ; the Nobel prize >

• noun mediated relation : – < Princeton ; economist ; Paul Krugman >

– Apply SRL based extraction to verb mediated relation, and Dependency pattern based extraction to noun mediated relation

101


Experiment • Test Data

– Data from “Fader et al, Identifying Relations for Open Information Extraction, 2011, EMNLP”

– Randomly selected 500 web sentences -> We used 100 sentences among them

• Result–

*Byungsoo KIM, Hyosup SHIM, Sangdo HAN, Soonchoul KWON, Seonyeong PARK, Gary Geunbae LEE. Relation disambiguation using ontology type checking and semantic relatedness. Submitted

102

Applying Open IE to Knowledge Base Section

Knowledge Base Augmentation• Triples extracted from Open IE can be used to augment

existing knowledge bases– need argument and relation mapping to canonical form on the ontology

(disambiguation)

• Ex) Einstein married Elsa Lowenthal on 2 June 1919.– triple from Open IE

• < Einstein ; married ; Elsa Lowenthal >– disambiguation

• Einstein → Albert_Einstein• Elsa Lowenthal → Elsa_Einstein• married → spouse

– DBpedia ontology RDF triple• < dbr:Albert_Einstein ; dbo:spouse ; dbr:Elsa_Einstein >

103

Applying Open IE to Knowledge BaseSection

Disambiguation with Constraint• Relation phrases on the ontology have proper argument type.• Ex) < Alain_Connes ; birthPlace ; Draguignan >

< Ayn_Rand ; birthPlace ; Saint_Petersburg >< type:Person ; birthPlace ; type:Place >

< The_Birth_of_a_Nation ; director ; D._W._Griffith >

< type:Film ; director ; type:Person >

• Use this argument type constraint when disambiguating relation– disambiguate arguments first, then use argument type information for

relation disambiguation

104

Scenario• Find the appropriate answer to the user question using raw text

data

Information Retrieval-based QA

1. Where was Kim yunaborn?

2. When did Kim yuna got gold medal?

3. Who is Sotnikova?

User’s Question Raw Text

Text data

Answer

Bucheon, South Korea

Section

105

• Architecture

Information Retrieval-based QA

Question Processing

Answer Type Detection

Entity extraction

(NER)

Triple extraction

(Parser, SRL)

Document Processing

Question

PassagesScoring

Answer type Mapping to DBpedia

Answer Type, Keywords

Answer

Answer ProcessingAnswer Candidates

Extraction

Documents scoring

Passages

Text(Wikipedia) Database

Relevant Documents

Answer Selection

Answer Type

Section

106

• Answer Type is important !– It can reduce the search space to find answer.– Regard answer type as type of named entity – Use ontology in the knowledgebase.

Open Domain Semantic Answer Type

…

… …

Golf player Swimmer

TennisPlayer

Thing

AgentActivity Drug Event

Person

Athlete

Wrestler

Game Sport

GetMore detailInformation of answer

Section

Seonyeong Park, Donghyeon Lee, Seonghan Ryu, Byungsoo Kim, Gary Geunbae Lee. Hierarchical dirichlet process topic modelling for flexible answer type classification in open domain question answering. Proceedings of the 10th Asian information retrieval society conference (AIRS 2014), Sarawak, Dec 2014

107

• Hard to detect the answer type in the question using only lexical information.– Ex) Q: Who compose the “magic waltz” ? Answer Type: composer– Ex) Q: What did Bruce Carver die from? Answer type : reason

• Use Semantic Information in the Question

Open Domain Semantic Answer Type

Extract various information of input question

Input Questionquestion –answer pair

web log

Map the information to the class in ontology (need Inference)

Answer Type

Section

108

– Semantic answer type detector using Knowledgebase

Open Domain Semantic Answer Type Section

Extract property semantically similar with main verb in

DBpedia

Main verb, Focus, parsing result

DBpedia

Previous Hybrid(rule + supervised learning) Answer

type classifier

Question Parsing(Parser, SRL)

user question

Detect type of focus using type info of each property in

DBpedia

No

yes

Answer type Previous small size of answer type Ontology

Measuring semantic similarity between previous ontology and

DBpedia ontology

Answer type in DBpedia ontology

Answer type in DBpedia ontology

Focus: focus is the word which will be replaced with an answer. Therefore, type of focus is same as answer type.

109

– Example of Semantic answer type detector using Knowledgebase• Example 1

– Q: Who has been married to Tom Cruise?– Main verb: married– Focus: “Who”– Parsing information: who is the subject of married (main verb)– property semantically similar with main verb: wife– type of property information:– Answer type : person

• Example 2– Q: Who resides in the high-rise?– Main verb: resides– Focus: “Who”– Parsing information: who is the subject of resides(main verb)– property semantically similar with main verb: residence– type of property information:– Answer type: person


Gabrilovich, Evgeniy, and Shaul Markovitch. "Computing Semantic Relatedness Using Wikipedia-based Explicit Semantic Analysis." IJCAI. Vol. 7. 2007.

110

– Extract Answer Candidates using open domain semantic answer type detector


Keywords of the user question

Answer Type check

Answer Type

passages

Extract entities & recognize the type of entities

Answer Candidates

Open DomainSemantic Answer

Type Detector

user question

DBpediaSparql

Hoffart, Johannes, et al. "Robust disambiguation of named entities in text."Proceedings of the Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2011.

DBpediaOntology

111

• Sentence Scoring in Passage– Kidman has been married twice: first to actor Tom Cruise, and now to country singer Keith Urban. S

he has an adopted son and daughter with Cruise as well as two biological daughters. …. Cruise Kidman is the adopted son of actors Tom Cruise and Nicole Kidman. …..An example of the extravagance of a wedding locations is when Tom Cruise married Katie Holmes……/ …He was replaced as team leader by [[Ethan Hunt]] ([[Tom Cruise]]) after he was revealed: Impossible (film)……………………

• Measuring sentence importance– Measuring between text and question similarity– Content Selection : Choose sentences to extract from the document– Query Focused Multi Document Summarization

Answer Candidate Selection Section

112

• Select Important Sentence– We consider various similarity measures

• Term similarity (Jaccard coefficient)

– We can use not only answer type but also other information

• Compare not only Answer type and entity similarity but also semantic and syntactic structure of sentences.

• syntactic (Dependency Parser ) • semantic level (Semantic role labeler)

Answer Candidate Selection Section

113

Textual Entailment Answer Selection

T: Passages

H: Substituted sentence

Check the type of entities in the sentences (same as answer type or not) and select

important sentences

Answer Type

Answer Candidates

Answer Candidates scoring using Textual Entailment

Substitute Focus with Answer candidates

Answer

Section

Patent pending

KBQA Answer

114

• Textual Entailment– Text(t) : Entailing text ( Candidate sentence)– Hypothesis(h) : Entailed text (question)– Ex)

• true entailment– t : John is a fluent French speaker– h : John speaks French

• false entailment– t : John was born in France– h : John speaks French

– given t/h pair, cast the textual entailment task as a classification problem

Textual Entailment Answer Selection Section

115

• Textual Entailment– Other Textual Entailment Architecture

Textual Entailment Answer Selection Section

Collections of NLPApache UIMA framework

Standardized algorithms or knowledge resource (knowledge resources/ lexical syntactic resources).Different approach: transformation based, edit distance based, and classification based

Magnini, Bernardo, et al. "The Excitement Open Platform for Textual Inferences.“, ACL demo

Keyword QA

Goal Get keywords as input, return report answer

messi team manager

Lionel_Messi play at FC_Barcelona, Argentina_national_football_team.

Tito_Vilanova is manager of FC_Barcelona.

Extracted Data

Search from Database

116

Keyword QA

Query Generator

Natural Language Generator

KeywordTo

Entity

KeywordTo

Property

TemplateDB

Knowledge DB

Triple Extractor

Keyword Process

DB Entity DB PropertyProperty

Candidates

Keyword Input

Report

SPARQL QueryQuery

TripleTriple Set

Messi team manager

Lionel_Messi team

Person/heightbirthDatebirthPlacecareerStationNumberPositionTeam…

SELECT ?p ?o WHERE…

Lionel_Messi team FC_Barcelona

Lionel_Messi team FC_BarcelonaFC_Barcelona manager Tito_Vilanova

…

FC_Barcelona is Lionel_Messi’s team. FC_Barcelona’s manager is Tito_Vilanova.

Keyword Segmentator Messi, team, manager

117

http://...lionel_messi/

Keyword QA

118

Keyword Segmentation Segmentation using Lexicon

Wikipedia Lexicon + additional lexicon

Longest Match

Lionel messi team manager

Lionel messi, messi, team, manager,

birthday, …

Lionel messi, team, manager

Keyword QA

Keyword-Entity/Property Matching Module

Match user input keyword to entity/property of DB messi -> Lionel_Messi

Kim yuna -> Kim_Yu_Na

manager -> manager (property)

birth -> birthDate

AIDA (open source) Named entity disambiguation module

Match to wikipedia entity

ESA (open source) Word semantic similarity

Keyword-Entity Matching Module

Messi team

Lionel_Messi team

119

Keyword to Entity AIDA module (open-source)

Accurate Online Disambiguation of Named Entities

Find named entity, and match to Wikipedia page

Entity Matching

Input : “When did Barack Obama graduated Harvard Law School?”

120

Keyword to Property Semantic match between keyword & property

Explicit Semantic Analysis Module (open source)

Property Matching

Tom cruise

birthDate 1962-07-03

birthPlace Syracuse, New York, United States

religion Scientology

spouse Mimi_Rogers

starring Interview_with_the_vampirestarring Top_Gun

… …

Tom cruise’s triple example

Keyword ExampleTom cruise, birthday

121

Keyword to Property Semantic match between keyword & property

Explicit Semantic Analysis Module

Property Matching

Tom cruise

birthDate 1962-07-03

birthPlace Syracuse, New York, United States

religion Scientology

spouse Mimi_Rogers

starring Interview_with_the_vampirestarring Top_Gun

… …

Tom cruise’s triple example

Keyword ExampleTom cruise, wife

122

Keyword QA

Query Generator

SPQRQL query generation Extract related triples

Rule-based

Query Generator

Lionel_Messi team

SELECT , ?p, ?o WHERE { …

123

Query Generating Policy

Lionel_Messi

Person/heightteam

birthPlace…

Argentina_national_football_team

FC_BarcelonaFC_Barcelona_B

…

coachstadiummanager

…169.2

ArgentinaRosario,_Santa_FeSanta_Fe_Province

…

capacitychairmanmanager

…

areaTotalcapital

currency…

: Entity

: Property

………

Input keywords : messi, fc barcelona, manager

124

Query Generating Policy

Lionel_Messi

Person/heightteam

birthPlace…

Argentina_national_football_teamFC_Barcelona

FC_Barcelona_B…

coachstadium

manager…

169.2

ArgentinaRosario,_Santa_FeSanta_Fe_Province

…

capacitychairmanmanager

…

areaTotalcapital

currency…

: Entity

: Property

………

Input keywords : messi, team, manager

125

Keyword QA

Report Generator

Report triple set, template Property-template matching data

534 templates generated

Report Generator

Barak Obama graduated ColumbiaUniversity, and

HarvardLawSchool. …

Barak_Obama almaMater ColumbiaUniversity

Barak_Obama almaMater HarvardLawSchool

… … …

almaMater graduated

birthPlace borned in

… …

Extracted Triple Set

Template Set

126

Keyword QA

NLG Template Generator

Automatic Template Extraction Wikipedia-dbpedia

Template Generator

teamposition

…(properties)

play at plays as a plays as a

127

Keyword QA

Keyword Segmentation (81.64%) Whole data – 670 keyword queries

Well-segmented – 547 queries

Error - 123 queries

Out of vocabulary (segmentation lexicon)

System Answer Accuracy (95.1%) Whole data – 670 keyword queries

Right answers – 637 answers

Wrong answers – 33 answers

Error case : property / entity matching error

Sangdo Han, Hyosup Shim, Byungsoo Kim, Seonyeong Park, Seonghan Ryu, Gary Geunbae Lee. Keyworkd question answering system with reportgeneration for linked data. Proceedings of the 2015 International Conference on Big Data and Smart Computing (BigComp 2015), Jeju, Feb 2015 (short paper) 128

Demo movie Youtube postech isoft QA

http://www.youtube.com/watch?v=P6yL5QiJQo0 KBQA IRQA Keyword QA OpenIE

http://www.youtube.com/watch?v=P6yL5QiJQo0

Multi-party Open Proactive Dialog Systems

131

• Overall Architecture

Overall Architecture

Module Description

Situation Feature Extraction Extract Feature from Situation (Voice Activity Detection, Speaker Detection, Previous Info Stacking, ETC..)

Dialog Engagement Classify Dialog Engagement for Each Speaker ID

Speaker Identification Classify Speaker and Assign New Speaker ID

Always Listening ASR Recognize All Speech Sound

Sentence Formation Regularity Checking Check Sentence Formation Regularity for Dialog Situation Feature

Speaker ID Assignment to Sentence Assign Speaker ID to All Sentence from ASR

ASR Error Correction Correct ASR Error before Passing Sentence to the Next Step

Multiparty Language Understanding Language Understanding for Multiparty

Multiparty Dialog System Dialog System for Multiparty

Dialog Engagement

(CNN)

Always Listening ASR (Voice

Activity Detection )

Sound Signal

SituationFeature

Extraction

Dialog Situation(Vision)

Speaker Identification

Natural Language

Understanding &

Dialog Management

For Multiparty

Speaker IDAssignment to

Sentence

Sentence Formation Regularity

Checking (RNN)

ASR Error Correction

(RNN + Several Method)

132

• Dialog Engagement– Dialog Engagement to PC– Classify Dialog Engagement for Each Speaker ID

• ScenarioA: Let’s have a dinner outside, in some fancy restaurant!B: Great! Where should we go to?C: I like FANCYFANCY restaurant.A: Yeah, FF restaurant is good.B: Then let’s go there to have dinner.A: But we brought our car in for servicing this morning. How can we go there?B: Maybe we should take a taxi and go to the repair shop. Where was it?PosChat: NiceCar repair shop is where you brought your car in.B: Okay. Then we go to the shop and go to there for dinner. Make a reservation at 7 p.m., PosChat.PosChat: Okay. I’ll make a reservation to FF restaurant for three people at 7 p.m.

Dialog Engagement

Engage

Engage

Engage

Non-engageNon-engage

Bohus, Dan, and Eric Horvitz. "Models for multiparty engagement in open-world dialog." Proceedings of the SIGDIAL 2009 Conference: The 10th Annual Meeting of the Special Interest Group on Discourse and Dialogue. Association for Computational Linguistics, 2009.

133

• Dialog Engagement Architecture

Dialog Engagement

Engage

Non-engageNon-engage

Camera Face Direction

Lips Tracking

Voice Activity Detection

Always Listening ASR

Recognition Result

EngagementClassifier

Dialog Engagement

Model

DialogManagement

Learning by DNN

EngagementPrevious

State

Dialog ManagementPrevious

State

Junhwi Choi, Jeesoo Bang, Gary Geunbae Lee. “Multiparty open-world dialog system on NAO robot”. Proceedings of SLT 2014, Dec 2014, Nevada (demo presentation)

134

• Automatic ASR Error Correction– Two Step Process– ASR Error Detection

• Part of Speech Information based detection• Context based detection

– ASR Error Correction• Word Sequence Matching based Correction• Recurrent Neural Network based Correction

ASR Error Correction

Current Syllable

Previous Syllable Context

Confused Phoneme of Next Syllable

Probabilityof Next Syllable

ASR Error Correction Performance 1-WER

Basline (no Correction) 0.8357

Word Sequence Pattern Matching 0.8813 (27.8% Error Reduction)

Syllable RNN Only 0.8382 (1.5% Error Reduction)

Combined RNN (Syllable RNN + Phoneme RNN) 0.8480 (7.5% Error Reduction)

Word Sequence Pattern Matching + Combined RNN 0.8820 (28.1% Error Reduction)

ASR Error Detection Performance

F-Score DetectionAccuracy

POS Label Pattern 0.4744 0.8266

Word Dictionary by POS 0.3452 0.8653

Word Co-occurrence 0.4143 0.7587

Voting (threshold 2) 0.4967 0.8761

Voting (threshold 1) 0.4879 0.7337

Junhwi Choi, Donghyeon Lee, Seonghan Ryu, Kyusong Lee, Gary Geunbae Lee, ”Engine-independent ASR error management for dialog systems”, IWSDS 2014Junhwi Choi, Seonghan Ryu, Kyusong Lee, Younghee Kim, Jeesoo Bang, Seonyeong Park, and Gary Geunbae Lee. “ASR Independent Hybrid Recurrent Neural Network based Error Correction for Dialog System Applications”, Proceedings of the MA3HMI 2014 Workshop, Satellite workshop of INTERSPEECH 2014.

135

• Manual ASR Error Correction• ASR Error Correction Interface with Voice-only

• ScenarioA: Let’s have a dinner outside, in some fancy restaurant!B: Great! Where should we go to?C: I like FANCYFANCY restaurant.A: Yeah, FF restaurant is good.B: Then let’s go there to have dinner. Okay. Then we go to the shop and go to there for dinner. Make a reservation at 7 p.m., PosChat.PosChat: Okay. I’ll make a reservation to Effa restaurant for three people at 7 p.m.A: FF restaurant.PosChat: Okay. I’ll make a reservation to FF restaurant for three people at 7 p.m.

One-step Error Correction

User Utterance

Analysis Region

Detection

User Intention Understanding Correction

Proceed Dialog

Management

Confirmation(Optional)

Junhwi Choi, et al. "Seamless error correction interface for voice word processor." Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on. IEEE, 2012.Junhwi Choi, Seonghan Ryu, Kyusong Lee, Younghee Kim, Jeesoo Bang, Seonyeong Park, and Gary Geunbae Lee. “ASR Independent Hybrid Recurrent Neural Network based Error Correction for Dialog System Applications”, Proceedings of the MA3HMI 2014 Workshop, Satellite workshop of INTERSPEECH 2014.

Understand User Intention

Proceed Correction

136

• User Intention Understanding– Characteristic of Clear Speech (Prosodic: Pitch, Duration, Intensity)

– Characteristic of ASR Error (Pronunciation Similarity)– Accuracy of User Intention: 84.62%

One-step Error Correction

Junhwi Choi, Seonghan Ryu, Younghee Kim, Gary Geunbae Lee. “One-step error detection and correction approach for voice word processor”, (In preparation)

137

• Scenario

Long-term Memory Chatting System

Hi, I’m John.

Nice to meet you.

I’ll remember that.

Do you know where I live?

You are living in Pohang.

I’m not good at foreign language.

I live in Pohang.

(I, be, John)(I, live in, Pohang)

(…, …, …)

Long-term Memory

User Utterance System Response

Hi, I’m Jane. Nice to meet you.

Do you know where I live? You are living in Seoul.

Do you know where I live? I don’t know about that.


But I heard that you can speak Chinese.


Then, can I help you?

Example Database

Then, can I help you?

Jeesoo Bang, Hyungjong Noh, Yonghee Kim, and Gary Geunbae Lee. Example-based Chat-oriented Dialogue System with Personalized Long-term Memory. Proceedings of the 2nd International Conference on Big Data and Smart Computing (BigComp 2015), 2015.

138

• Architecture of the Chatting System

Architecture of the Chatting System

139

1. Extract user-related facts (triples) from user inputs, and store them into the long-term memory

2. Modify the system response by applying user-related facts

3. Select the most appropriate response


I’ll remember that.I live in John.

(I, be, John)(I, live in, Pohang)

Long-term Memory

User utterance System response


You are living in Seoul.


You are living in Pohang.

140

1. Knowledge Extractor• Extract user-related facts from user inputs, and store them into the long-

term memory

• RDF-style triple: trp = (arg1, rel, arg2)– arg: noun phrase– rel: textual fragment indicating semantic relation between two args– E.g. I like red apples. (I, like, red apple)


141

1. Knowledge Extractor• Long-term Memory (LTM)

– Define two types of triple patterns• Triple pattern with SBJ slot (e.g. (SBJ, be, my friend))• Triple pattern with OBJ slot (e.g. (I, like, OBJ))

– Matched triples are stored in the Long-term Memory


Triple patterns(SBJ, be, my friend)

(I, like, OBJ)(My name, be, OBJ)(I, can speak, OBJ)

…(Harry, be, my friend)

Matched triples(Harry, be, my friend) Long-term

Memory

Personal Knowledge Manager

User InputHarry is my friend.

Knowledge Extractor

142

2. Personal Knowledge Applier• Apply User-related facts to System Response Candidates

– For 𝑡𝑡𝑡𝑡𝑡𝑡𝑠𝑠𝑠𝑠 extracted from a response (candidate) 𝑠𝑠𝑠𝑠– Replace arg2 (arg1) of the 𝑡𝑡𝑡𝑡𝑡𝑡𝑠𝑠𝑠𝑠 with the arg2 (arg1) of a user-related triple,

when the two triples are similar enough except those arg2 (arg1)


objectmy

name

is

Chuck

subject

predicateTriple extracted

from system response(candidate)

objectMy

name

is

Bruce

subject

predicateTriple inLong-term Memory

Oh, your name is Chuck.Bruce

143

3. General Score• Put weight on the system response which is general• Assumption: general response has many similar responses in the example

database.

– E: the example database– e = (su, ss): an example; su is a user utterance, ss is a system response– sim(s1, s2): weighted dice similarity between two sentences s1, s2– … for any sentence s = {w1, w2, …, wn}; w is a word

– userIDF(w) = log(|E|/cnt(w)) … approximation for short sentences– cnt(w): the frequency (the number of occurrence) of the word w in E


‖𝑠𝑠‖ = �𝑢𝑢𝑠𝑠𝑚𝑚𝑡𝑡𝐼𝐼𝐼𝐼𝐼𝐼(𝑤𝑤)𝑤𝑤∈𝑠𝑠

144

• Anaphor: a word or phrase that refers back to an earlier word or phrase– My mother said she was leaving.

• ScenarioA: My best friend is Seonghan. His favorite fruit is strawberry.B: I like strawberry, too.A: Oh, I didn’t know that.B: My best friend is Sangdo. He likes computer games. His favorite game is FIFA online.A: Today is Seonghan’s birthday. Could you recommend a present for Senghan?PosChat: You can give him what he likes. You said Seonghan’s favorite fruit is strawberry.A: That’s good idea. B: Hmm… I am bored. Do you have any recommendations?PosChat: You can play computer games with your best friend. Sangdo’s favorite game is FIFA online.

Multi-party Chatting System

145

• Discourse stack– Stores the contents of multiparty dialog texts in structured format for

anaphora resolution

Multi-party Chatting System

Sentence Information

That’s good idea. DA: statement

Could you recommend a present for Senghan?

DA: yn_q

Today is Seonghan’s birthday. DA: statement

Oh, I didn’t know that. DA: statement

His favorite fruit is strawberry.

DA: statement

My best friend is Seonghan. DA: statementPerson: Senghan

Sentence Information

Do you have any recommendations?

DA: yn_q

Hmm… I am bored. DA: statement

His favorite game is FIFA online.

DA: statement

He likes computer games. DA: statement

My best friend is Sangdo. DA: statementPerson: Sangdo

I like strawberry, too. DA: statement

Sentence

You can play computer games with your best friend. Sangdo’s favorite game is FIFA online.

You can give him what he likes. You said Seonghan’sfavorite fruit is strawberry.

[I, like, strawberry]

…

[My best friend, be, Seonghan]

…

A stack B stack Poschat

A LTM B LTM

Junhwi Choi, Jeesoo Bang, Gary Geunbae Lee. “Multiparty open-world dialog system on NAO robot”. Proceedings of SLT 2014, Dec 2014, Nevada (demo presentation)

146

Distributed Word Representation Matching

Distributed word representation: n-dimensional vector

Can capture distributional syntactic and semantic information

Recursive Autoencoder (RAE) Combine word representations into vector representations of longer

phrases

The cats catch mice

147

Distributed Word Representation Matching

Paraphrase identification using distributed word representation

The cats catch mice Cats eat mice

1 2 3 4 1 2 3

6 5 4

7 5

1 2 3 4 51234567

Variable-sized similarity matrix

Fixed-sized matrix

Dynamic Pooling Softmax classifierParaphrase

Socher, Richard, et al. "Dynamic pooling and unfolding recursive autoencoders for paraphrase detection." Advances in Neural Information Processing Systems. 2011.

Emotional Dialog System - Issue Emotion-Based Dialog Strategy

Emotional Strategy Video / Audio Input

Facial Expression Angle of Mouth Angle of Eyes

Prosody, Accent Emotion Detection

Angery, Sad, Happy, Scared Emotional Keyword Detection

Ex) 슬퍼 -> Sad

Dialog Strategy Appropriate Strategy for Emotion

Emotion : Sad

My dog died yesterday

You look sad, cheer up!

User

AgentSangdo Han, Kyusong Lee, Donghyeon Lee, Gary Geunbae Lee. Counselling dialog system with 5W1H extraction. Proceedings of the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDIAL 2013), Aug 2013, Metz

Humor Dialog System - Solution Architecture

Natural Language Understanding

Chat-oriented DM

Humor Generation

Utterance Selection

Linguistic Analysis(DA, NE)

Natural Language Understanding

User Utterance

System Utterance

Utterance Selection

Dialog Manager

Natural LanguageGeneration

Chat-oriented DM

Humor Manager

Humor Generation

Humor GenerationModule 1

Humor GenerationModule 2

Humor GenerationModule N…

Keyword Analysis

④

③②

①

Korean Patent filed: 10-2013-0008478

Counseling System Features

Based on Counseling Skills Open-Ended Questioning Paraphrasing Reflect feeling

Based on User’s Emotion Strategy based on Emotion

Virtual Agent Talk with 3D Virtual Agent Emotion-based Attitude

Speaker Utterance

User My friend bugs me a lot

Agent Why does your friend bug you?

User He beats me saying I am dirty

Agent Wow! You must be angry

User Ya! I am depressed

Agent Since when did it happen?

Sangdo Han, Younghee Kim, Gary Geunbae Lee. Micro-counselling dialog system based on semantic content. Proceedings of the international workshopseries on spoken dialog systems (IWSDS 2015), Jan 2015, Busan

Demo video postech isoft

(NAO robot) https://www.youtube.com/watch?v=pcz228RDTlk

Counselling dialog system

Demo

https://www.youtube.com/watch?v=pcz228RDTlk

Dialog system for English education

Gary Geunbae LeeIntelligent Software Lab.

CALL:POSTECH approach to CALL

Intelligent Software Lab. 153

StudentModel

Learner

PronunciationProsodyGrammar

Error Detection Game Environment

Utterance Suggestion

DB-CALL

PronunciationProsody

Grammar

Error Feedback

Data Server

CALL:POSTECH approach to CALL

Intelligent Software Lab. 154

PRONUNCIATION PROSODY GRAMMAR Dialog system / Game

DETECTION & FEEDBACK

DETECTION & FEEDBACK

DETECTION & FEEDBACK DB-CALL & Gameplay

Pronunciation Training & Assessment

Intonation

Phrase Break

Grammar Error Simulation

Grammatical Error Detection

Data Collection

Various Platf orm(Mobile, Tablet PC)

Student Model

Pronunciation Detection / Feedback

Pronunciation Error Simulation

Stress/Rythm

English Tutoring System

3D virtual Environment

Pronunciation Assessment and Training:Architecture & data flow

Jisoo Bang, Jonghoon Lee, Gary Geunbae Lee, Minhwa Chung. A pronunciation variants prediction method for Korean learner’s mispronunciation detection.(accepted) ACM Trans. on Asian Language Information Processing (TALIP)

Prosody Assessment and Training:Definition– What is prosody?

• English is one of the stress-timed languages• Prosody consists of rhythm, stress and intonation

– Rhythm• Determined by the beats occurring in regular patterns

– Between stressed and unstressed syllables• We derived rhythm from sentence stress patterns

– Intonation• Pitch fluctuations in utterances Showing high degree of freedom Requiring prosodic components with relatively low degree

• Integrating pitch accent, phrase accent and boundary tone

156

Prosody Assessment and Training:Architecture including feedback provision

Alignment

Text TextAnalysis

Speech Analysis

ProsodyPrediction

Model

Rule ApplicationRules

PredictedProsody

ModelTraining Model

ProsodyDetection

DetectedProsody

FeedbackDiff.

TextAnalysis

Text

Speech Signal

ModelTraining

Sechun Kang, Gary Geunbae Lee, Ho-Young Lee, Byeongchag Kim. An automatic pitch accent feedback system for english learners with adapatation of english corpus spoken by Koreans. Proceedings of the 2012 IEEE workshop on spoken language technology (SLT2012), Dec 2012, Miami

157

Prosody Assessment and Training:Rhythm’s user interface

– Component interface view

• Words: the recognized (or given) text• Canonical: sentence stress prediction results• Actual: sentence stress detection results

• Score:DetectionPrediction

}Detection)Prediction( B ,Conf | {B1 B

∪∩∉≥

−τ

158

Collecting Grammar Error Data: POLC:Picture description task

• From English learners of Korean• Story Telling based on pictures• 80 Students (5 tasks for each student)

Hongsuck Seo, Kyusong Lee, Gary Geunbae Lee, Soo-Ok Kweon, Hae-Ri Kim. Grammatical error annotation for Korean learners of spoken English. Proceedings of the 8th international conference on language resources and evaluation (LREC2012), May 2012, Istanbul

Collecting Grammar Error Data: Error tagsets

• JLE Tagset– Consisting of 46 tags– Systematic tag structure– Some ambiguity caused by POS specific error tag structure

• CLC Tagset– World-widely used tagset including 76 tags– Systematic & Taxonomic tag structure– JLE issue is figured out by taxonomic tag structure

• NUCLE Tagset– 27 error tags– Quiet arbitrary tag structure

• UIUC Tagset– Only for articles and prepositions

TextErroneous TextGrammatical Error

Simulation

ASR ASR’

N-gram LM

Merged Hypotheses

Error-typeClassifier

GrammaticalityChecker

N-gram LM

Feedback

Error PatternsError Frequency

Grammar Assessment and Training:Grammar error detector architecture

Sungjin Lee, Hyungjong Noh, Kyusong Lee, Gary Geunbae Lee. Grammatical error detection for corrective feedback provision in oral conversations. Proceedings of the 25th AAAI conference on artificial intelligence (AAAI-11), Aug 2011, Sanfransisco

Grammar Assessment and Training:Grammatical Error Simulation

Automatic Speech Recognizer

Grammar Error Simulator

Incorrect Sentences

Correct Sentences

Error Types

Sungjin Lee, Gary Geunbae Lee. Realistic grammar error simulation using markov logic. Proceedings of the ACL 2009, August 2009 Singapore (short paper)

Spoken Dialog System DB-CALL System

Dialog-based Language Learning System: Dialog-based CALL system

Cheongjae Lee, Sangkeun Jung, Kyungduk Kim, Gary Geunbae Lee. Hybrid approach to robust dialog management using agenda and dialog examples. computer speech and language, 24 (4): 609-631, Oct 2010

Dialog-based Language Learning System:The Framework of Ranking DM

Scoring Module

User Intention: SLU N-best(System Intention)

CalculatedScores

Next System Intention(User Intention)

Ranking various scores Robust system action

Hyungjong Noh, Sungjin Lee, Kyusong Lee. Gary Geunbae Lee. Ranking dialog acts using discourse coherence indicator for English tutoring dialog systems. Proceedings of the 3rd international workshop on spoken dialog systems technology (IWSDS 2011), Sept 2011, Granada Spain

Dialog-based Language Learning System:POMY system architecture

Kyusong Lee, Soo-ock Kweon, Hyungjong Noh, Gary Geunbae Lee. Postech Immersive English Study (POMY): Dialog-based Language Learning Game.(accepted) IEICE transactions on information and systems

Pre-test Post-test Mean

Category N Mean SD Mean SD difference p

Listening 25 56.4 16.6 71.2 20.9 14.8 0.0001**

Vocabulary 25 74.0 31.4 117.6 32.7 43.6 0.0001**

Speaking 25 Pronunciation 25 42.08 6.80 44.48 6.80 2.40 0.0001**

Grammar 25 36.56 8.45 42.40 6.95 5.84 0.0001**

# of Words 25 136.31 55.30 170.04 80.88 33.73 0.003**

Table 1. Overall

Dialog-based Language Learning System:Cognitive effect on overalls students

• Significantly Improved• Students Spoke more words in post test

Demo video (Postech isoft dbcall) (postechisoft pesaa)

https://www.youtube.com/watch?v=k0TAdfngZpU

Robot dbcall system 2013 pesaa system

Demo

https://www.youtube.com/watch?v=k0TAdfngZpU

Thank You & QA

Siri, Watson and Natural Language ProcessingContentsSiri, Watson and NLPApple SiriSiri – your wish is its commandSample Dialogs (chatting)Sample Dialogs (tasks)Architecture�Google NowMS CortanaQuestion Answering: IBM’s WatsonTypes of Questions in Modern Systems슬라이드 번호 13IBM Watson Platform and ApplicationIBM Watson - Recent ApplicationsIBM Watson – EcosystemLanguage TechnologyWhat’s hard – ambiguities, ambiguities, all different levels of ambiguitiesWhy else is natural language understanding difficult?Levels of LanguageRecent Trend of Application using NLPRecent Trend of Application using NLP/AIIOT2H (Internet of things to Human)Multi-domain ontology reasoning dialog systems for intelligent assistantSPOKEN DIALOG SYSTEM (SDS)Interactive Question AnsweringSDS APPLICATIONSASR (automatic speech recognition)SPEECH UNDERSTANDING (in general)REPRESENTATIONKnowledge-based SystemsHOW TO SOLVE: STATISTICAL APPPROBLEM FORMALIZATIONMACHINE LEARNING FOR SLUDIALOG MANAGEMENTDESIGN ISSUESDESIGN ISSUESDIAL