![Page 1: U1, Speech in the interface: 1. Introduction1 Module u1: Speech in the Interface 1: Introduction Jacques Terken HG room 2:40 tel. (247) 5254 j.m.b.terken@tue.nl](https://reader030.vdocuments.us/reader030/viewer/2022032523/56649d815503460f94a65c13/html5/thumbnails/1.jpg)
U1, Speech in the interface: 1. Introduction 1
Module u1:
Speech in the Interface 1: Introduction
Jacques Terken
HG room 2:40, tel. (247) 5254
Contents
1. Aims and overview of course
2. Speech interfaces
3. Usability issues: introduction
4. Project
Aims
Acquire insight into usability issues and obtain an overview of the state of the art for speech in the interface
Obtain hands-on experience with the design of a speech-centric interface
Exercise project skills (organisation, collaboration, report, presentation)
Overview of Module
Introduction
Dialog management
Speech input technologies
Speech output technologies
Multimodal interaction
Evaluation
Human Communication
Exercises and project
Speech in the interface

| | Non-interactive | Interactive |
|---|---|---|
| Online | Monitoring speech communications, live speech processing | Dialogue systems |
| Offline | Speech data-mining | X |
Markets and applications
R. Moore 2005
Speech interfaces
Conversational interfaces:
natural language interaction with machines (Star Trek syndrome)
Command & Control applications:
voice-based equivalent of command-line interfaces and button interfaces (utterances need to adhere to strict grammar)
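
To make "utterances need to adhere to strict grammar" concrete, here is a minimal Python sketch of a command & control front end. It assumes the recognizer has already produced a word string. The grammar, the word lists and the function name `parse_command` are illustrative assumptions, not part of the course material.

```python
import re

# Hypothetical command grammar for a stereo: <action> [the] <device> [<number>]
# The word lists and all names here are illustrative assumptions.
ACTIONS = ("play", "stop", "pause")
DEVICES = ("radio", "cd")

COMMAND = re.compile(
    r"^(?P<action>" + "|".join(ACTIONS) + r")"
    r"(?: the)?"
    r" (?P<device>" + "|".join(DEVICES) + r")"
    r"(?: (?P<number>\d+))?$"
)


def parse_command(utterance):
    """Return the parsed command as a dict, or None if the utterance is out of grammar."""
    match = COMMAND.match(utterance.strip().lower())
    if match is None:
        return None
    # Keep only the slots that were actually spoken.
    return {key: value for key, value in match.groupdict().items() if value is not None}


print(parse_command("Play the radio"))
print(parse_command("play cd 3"))
print(parse_command("could you put some music on"))
```

The last call returns `None`: a conversational phrasing of the same request falls outside the grammar, which is exactly the restriction that separates command & control from conversational interfaces.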
Components of conversational interfaces
Speech recognition
Natural language analysis
Dialogue manager
Speech synthesis
Language generation
Application
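
As a rough illustration of how these components can cooperate during one user turn, here is a toy, text-only Python sketch. It assumes the conventional processing order (recognition, analysis, dialogue management with access to the application, generation, synthesis); the slide text itself only names the components. Every function is a stand-in stub with a made-up name and behaviour.

```python
# A toy, text-only sketch of one turn through a conversational interface.
# Each function is a stand-in stub; names and behaviour are assumptions.

def recognize_speech(audio):
    # Stub: a real recognizer maps audio to a word string.
    # Here the "audio" is already text.
    return audio.lower()


def analyze_language(words):
    # Stub: map the word string to a simple meaning representation.
    if "time" in words:
        return {"intent": "ask_time"}
    return {"intent": "unknown"}


def application(request):
    # Stub back-end; a fixed answer keeps the example deterministic.
    if request == "ask_time":
        return "12:00"
    return None


def dialogue_manager(meaning):
    # Decide what to do next, consulting the application when needed.
    if meaning["intent"] == "ask_time":
        return {"act": "inform", "value": application("ask_time")}
    return {"act": "ask_repeat"}


def generate_language(system_act):
    # Turn the abstract system act into a sentence.
    if system_act["act"] == "inform":
        return "It is " + system_act["value"] + "."
    return "Sorry, could you repeat that?"


def synthesize_speech(text):
    # Stub: a real synthesizer would produce audio.
    return "<speech>" + text + "</speech>"


def handle_turn(audio):
    words = recognize_speech(audio)
    meaning = analyze_language(words)
    system_act = dialogue_manager(meaning)
    text = generate_language(system_act)
    return synthesize_speech(text)


print(handle_turn("What time is it"))
print(handle_turn("Hello there"))
```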
Spin-offs
1. Dictation systems: what you say
Speech recognition
Natural language analysis
Dialogue manager
Speech synthesis
Language generation
Application (e.g. MS-Word)
2. Command-control: what you mean
Speech recognition
(Natural language) analysis
Dialogue manager
Speech synthesis
Language generation
Application (e.g. stereo)
3. Text-to-speech conversion
Speech recognition
Natural language analysis
Dialogue manager
Speech synthesis
Language generation: prosody
Application (e.g. E-mail)
Speech in HCI: “yes please”
Among others Zue (MIT):
Speech will be key technology of the 21st century
Background Zue et al.:
– Aim: developing the conversational interface
– Motivation: natural language interaction is the most natural form of communication (learned at a very early age); among other things, it offers very efficient error handling
Advantages of speech
direct access to functionality
supports mobility
suited for hands busy/dirty - eyes busy situations
no special motor abilities needed, optimal compatibility with communicative abilities of users
compatible with trend towards miniaturisation of equipment
Maturity hypothesis
Speech interfaces not yet mature because of complexity of technology:
– R.K. Moore:
“Spoken language interaction is the most sophisticated behaviour of the most complex organism in the known universe”
Phylogenetic argumentation
First: direct manipulation (“you do what I want”)
Later: symbolic manipulation (cf. management, commercials)
Physical manipulation and violence considered primitive
Ontogenetic argumentation
Russian educational psychology (Galperin):
– knowledge acquisition starts with direct manipulation
– later on: symbolic manipulation
“Stay off” warning to children: “look with your eyes, not with your hands”
Therefore
Direct manipulation phylogenetically and ontogenetically more primitive and less complex
Maturity hypothesis: same trajectory for HCI:
first direct manipulation
then symbolic manipulation (speech)
However: UI design principles (Shneiderman ’86):
– transparency: continuous representation of objects and actions
– fast, incremental and reversible operations with immediate effect
– physical actions or labelled buttons, avoid complex syntax/natural language as much as possible
Design principles difficult to realise in speech interfaces
In addition, language and speech technology is not (yet) very robust, and development costs are high
Getting towards the application semantics is more complicated for (natural) language than for direct manipulation
Finally: HCI is a domain in its own right, so there is no a priori reason to model HCI after HHI (human-human interaction)
SO: avoid natural language
Speech interfaces: yes or no
Speech not suited for all kinds of information or situations (e.g. “a picture is worth a thousand words”)
Nevertheless, speech is useful under certain conditions, e.g.
– hands busy - eyes busy
– mobility, miniaturisation
– disabilities (CTS/RSI!)
Use interface design guidelines for design of speech interfaces, e.g. http://www.larson-tech.com/MMGuide.html
And in return: offer human communication theory as model for HCI
Speech interfaces (SI) and direct-manipulation interfaces
Main problems with speech interfaces:
– no external support for functionality
– unreliability of input technology
Dealing with unreliability
Constrain domain
– restricted vocabulary
– restricted application / task domain
– restricted number of users: speaker-dependent speech recognition
Extensive verification (in connection with error cost)
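
One possible reading of "verification in connection with error cost" is that the system verifies more strictly when a misrecognition would be more expensive. The Python sketch below picks a verification strategy from a recognizer confidence score and a coarse error cost. The function name, the strategy labels and the thresholds are all illustrative assumptions, and the sketch assumes the recognizer exposes a confidence value between 0 and 1.

```python
def verification_strategy(confidence, error_cost):
    """Choose how to verify a recognized command.

    confidence: recognizer confidence in [0, 1] (assumed to be available).
    error_cost: "low" or "high", the cost of acting on a misrecognition.
    The thresholds below are illustrative, not empirically derived.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if error_cost not in ("low", "high"):
        raise ValueError("error_cost must be 'low' or 'high'")

    reject_below = 0.3
    # Demand more certainty before skipping explicit verification
    # when a mistake is expensive.
    accept_above = 0.8 if error_cost == "low" else 0.95

    if confidence < reject_below:
        return "reject"        # ask the user to repeat
    if confidence >= accept_above:
        if error_cost == "high":
            return "implicit"  # echo the command while executing it
        return "none"          # just execute
    return "explicit"          # ask "Did you say ...?" before acting


print(verification_strategy(0.9, "low"))
print(verification_strategy(0.9, "high"))
```

With these assumed thresholds, the same confidence of 0.9 leads to immediate execution for a cheap action (say, changing the volume) but to an explicit confirmation question for a costly one (say, deleting a message).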
Dealing with functionality problem
Quick reference card
Training
System-driven dialogue
Experience: need for adaptive systems (e.g. barge-in)
Aim
Provide hands-on experience with design and implementation of a speech-centric interface, involving (at least) voice-based control and speech output.
The topic: speech/multimodal interface for in-car information and entertainment systems.
Tools
Download CSLU toolkit from
http://www.cslu.ogi.edu/toolkit (requires registering)
Project stages
Task analysis (requirements gathering)
Design on paper (V0.1)
Wizard of Oz
Redesign, implementation of V1.0
Validation
Evaluation
Report
Exercise for today
CSLU Exercises: McTear ch. 7, pizza application
Extend the pizza application:
– Go to http://www.dominos.nl/
– Click “online bestellen” (order online)
– Extend the dialogue system to include all the topping options, the side dishes and the drinks (see “menukaart”, the menu)
– Test the system and discuss your experiences
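
The exercise itself is done in the CSLU toolkit. Purely as an illustration of the dialogue logic involved, the Python sketch below shows a system-driven dialogue that asks for toppings, side dish and drink in turn and re-prompts on out-of-vocabulary input. It is independent of the toolkit; the menu entries are placeholders (not taken from the actual menu), and the function name `run_dialogue` is an assumption. User replies are passed in as text, as if already recognized.

```python
# Placeholder menu: NOT taken from any real menu card.
MENU = {
    "topping": ["cheese", "mushrooms", "salami"],
    "side dish": ["salad", "garlic bread", "none"],
    "drink": ["cola", "water", "none"],
}


def run_dialogue(user_turns):
    """System-driven dialogue: ask for each slot in turn, re-prompt on unknown input.

    user_turns: list of (already recognized) user replies, consumed in order.
    Returns the filled order and the list of system prompts that were issued.
    """
    replies = iter(user_turns)
    order = {}
    prompts = []
    for slot, options in MENU.items():
        while slot not in order:
            prompts.append(
                "Which " + slot + " would you like? Options: " + ", ".join(options) + "."
            )
            reply = next(replies, None)
            if reply is None:
                # No more user input: stop with a partially filled order.
                return order, prompts
            reply = reply.strip().lower()
            if reply in options:
                order[slot] = reply
            else:
                prompts.append("Sorry, I did not understand '" + reply + "'.")
    return order, prompts


order, prompts = run_dialogue(["pineapple", "salami", "salad", "water"])
print(order)
for prompt in prompts:
    print(prompt)
```

Listing the options in every prompt is one way to address the "no external support for functionality" problem, at the price of longer prompts; with a full menu this trade-off becomes very noticeable, which is a useful point to examine when testing the extended application.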
Composition of project teams