Mobile Multimodal Applications.
Dr. Roman Englert, Gregor Glass March 23rd, 2006
Agenda:
Motivation
Potentials through Multimodality
Use Cases Map & Sound Logo
Components & Modules for Mobile Multimodal Interaction
User Perspective
Challenges
Moore's Law (growth of technology): Technical capability will double approximately every 18 months.
Buxton's Law (growth of functionality): Technology designers promise functionality proportional to Moore's Law.
Multimodality. Motivation.
The challenge is how to deliver more functionality without breaking through the complexity barrier and making the systems so cumbersome as to be completely unusable.
God's Law (growth of human capability, the complexity barrier): Human capacity is limited and does not increase over time!
(Bill Buxton)
Multimodality – New User Interfaces. Composite Usage Scenario: Map.
Example:
The user selects a point of interest by clicking with a stylus and speaking in order to focus it:
"Zoom in here"
Multimodality – New User Interfaces. Composite Usage Scenario: Sound Logo.
Example:
The user selects a sound logo by clicking on its title with a stylus and speaking in order to hear it.
Sound logo = personalized call-connect signal
"Play this sound logo"
Multimodality – New User Interfaces. Components of a multimodal end-to-end connection.
Client (user):
Input: voice, stylus, gesture, …
Output: voice, text, graphics, video, …
User interface – types of multimodality: sequential, parallel
Server (content back-end, voice & data):
Dialog management
Synchronisation management
Media resource management (ASR/TTS)
Architecture layer: Internet / services
Multimodality – New User Interfaces. Main modules for parallel interaction.
Recognition: speech (via grammar), ink, mouse/keyboard, and system-generated input
Interpretation: one semantic interpretation per input channel
Integration: an integration processor merges the per-channel results; an interaction manager connects to the session component, the application functions, and the system and environment
EMMA is the common exchange format passed between recognition, interpretation, and integration.
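The recognition → interpretation → integration chain described above can be illustrated with a minimal sketch. The dicts below stand in for real EMMA documents, and all function names (`recognize`, `interpret`, `integrate`) are invented for this example, not taken from a concrete framework:

```python
def recognize(raw_input, mode):
    """Recognition: turn a raw signal into a mode-tagged result."""
    return {"mode": mode, "tokens": raw_input}

def interpret(recognition):
    """Interpretation: assign application-level semantics per channel."""
    if recognition["mode"] == "speech" and "zoom" in recognition["tokens"]:
        # Speech carries the action but leaves the location slot open.
        return {"mode": "speech", "action": "zoom_in", "location": None}
    if recognition["mode"] == "ink":
        # Ink carries only the location.
        return {"mode": "ink", "location": recognition["tokens"]}
    return {"mode": recognition["mode"]}

def integrate(interpretations):
    """Integration processor: merge per-mode interpretations into one
    multimodal interpretation, filling the open slots."""
    merged = {"mode": "multimodal"}
    for part in interpretations:
        for key, value in part.items():
            if key != "mode" and value is not None:
                merged[key] = value
    return merged

speech = interpret(recognize(["zoom", "in", "here"], "speech"))
ink = interpret(recognize({"x": 17, "y": 54}, "ink"))
result = integrate([speech, ink])
# result: {"mode": "multimodal", "action": "zoom_in",
#          "location": {"x": 17, "y": 54}}
```

The key design point is that integration is a separate stage: neither recognizer needs to know about the other channel, and only the integration processor resolves the open location slot.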
Multimodality – New User Interfaces. User Perspective:
Feedback in a nutshell from diverse previous innovation projects: "Give us speech control"
Composite interaction with a full prototype implementation for customer self-service: two campaigns (SMS & personalized call-connect signal)
The possibilities & advantages of the new multimodal interaction paradigm need to be actively communicated to users.
Real appreciation of speech control & good acceptance of the "push-to-talk" mode
Expectation: symmetry & consistency between the interaction modes
BUT: How do users really want to speak to the machine? How to provide feedback? How to correct input errors?
Great for context-dependent service interaction, BUT: Which mode is most suitable for which task? For whom? Under which circumstances?
Multimodality – New User Interfaces. Challenges:
Sequential vs. parallel I/O
Unique interpretation of multimodal hypotheses
Discourse phenomena such as anaphora resolution and generation
Input correction loops
Encapsulation of I/O tools to achieve a generic front end
Model-driven architecture
Thank you for your attention!
Multimodality – New User Interfaces. Sequential and Parallel Input.

Sequential input:
Multimodal applications may allow the user to choose between different input modalities, e.g. to speak or to click on a button.
Only one input channel is interpreted at a time, i.e. the user may speak or click on a button.
Multiple input channels are interpreted sequentially, as defined by the application.
Example: select a field and then speak "My number is …"; then click only on a button; afterwards, navigate "Back to main menu".

Parallel input:
Also known as composite input.
Multimodal applications allow the user to use multiple input modes at nearly the same time, e.g. the user may speak and tap on the screen.
The multimodal application combines the multiple inputs and interprets them together.
Example: the user navigates in a map and speaks "zoom in here".
Parallel input needs additional platform or application capabilities in order to combine (integrate) and interpret multiple inputs.
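One way such a platform capability can be sketched is a time window: inputs whose timestamps fall close together are treated as one composite event, while a lone input is handled sequentially. The 250 ms window and all names below are illustrative assumptions, not values from the original system:

```python
COMPOSITE_WINDOW = 0.25  # seconds; assumed grouping window

def combine(inputs):
    """Group time-stamped inputs given as (timestamp, mode, payload)
    tuples: inputs within the window form one composite event,
    lone inputs remain sequential events of their own."""
    events, group = [], []
    for stamp, mode, payload in sorted(inputs, key=lambda t: t[0]):
        # A gap larger than the window closes the current group.
        if group and stamp - group[-1][0] > COMPOSITE_WINDOW:
            events.append(group)
            group = []
        group.append((stamp, mode, payload))
    if group:
        events.append(group)
    return events

inputs = [
    (0.00, "speech", "zoom in here"),
    (0.10, "ink", (17, 54)),    # tap while speaking -> composite
    (2.00, "stylus", "back"),   # later click -> sequential
]
events = combine(inputs)
# events[0] holds the composite speech+ink pair, events[1] the lone click
```

Real platforms use richer criteria than a fixed window (e.g. overlapping input intervals or dialog state), but the grouping step itself always precedes integration.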
Multimodality – New User Interfaces. Example: composite input for voice and stylus.
The user speaks and clicks on the screen:
"Zoom in <here>."
[Diagram: recognition (speech via grammar, ink) → interpretation (semantic interpretation) → integration (integration processor, interaction manager), with EMMA passed between the stages]
Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Semantic interpretation of the speech input:
action = zoom in
location = x, y from the stylus

Interpretation (EMMA):
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1" emma:mode="speech">
    <action>zoom_in</action>
    <location emma:hook="ink"/>
  </emma:interpretation>
</emma:emma>
Multimodality – New User Interfaces. Example: composite input for voice and stylus.
The user clicks on the map while speaking:
x = 17
y = 54
Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Interpretation (EMMA) of the ink input:
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1" emma:mode="ink">
    <location>
      <type>point</type>
      <x>17</x>
      <y>54</y>
    </location>
  </emma:interpretation>
</emma:emma>
Multimodality – New User Interfaces. Example: composite input for voice and stylus.
Integration result (EMMA):
<emma:emma version="1.0"
    xmlns:emma="http://www.w3.org/2003/04/emma">
  <emma:interpretation id="int1" emma:mode="multimodal">
    <action>zoom_in</action>
    <location>
      <type>point</type>
      <x>17</x>
      <y>54</y>
    </location>
  </emma:interpretation>
</emma:emma>
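The integration step for these two EMMA documents can be sketched with the Python standard library's `xml.etree`. This is a minimal illustration, not the original implementation: it fills the speech interpretation's open `<location emma:hook="ink"/>` slot with the ink result and retags the merged interpretation as `emma:mode="multimodal"`:

```python
import xml.etree.ElementTree as ET

EMMA = "http://www.w3.org/2003/04/emma"

speech_doc = """<emma:emma version="1.0" xmlns:emma="{0}">
  <emma:interpretation id="int1" emma:mode="speech">
    <action>zoom_in</action>
    <location emma:hook="ink"/>
  </emma:interpretation>
</emma:emma>""".format(EMMA)

ink_doc = """<emma:emma version="1.0" xmlns:emma="{0}">
  <emma:interpretation id="int1" emma:mode="ink">
    <location><type>point</type><x>17</x><y>54</y></location>
  </emma:interpretation>
</emma:emma>""".format(EMMA)

def integrate(speech_xml, ink_xml):
    speech = ET.fromstring(speech_xml)
    ink = ET.fromstring(ink_xml)
    interp = speech.find("{%s}interpretation" % EMMA)
    # Locate the placeholder slot marked with emma:hook="ink".
    hook = next(loc for loc in interp.findall("location")
                if loc.get("{%s}hook" % EMMA) == "ink")
    filled = ink.find("{%s}interpretation/location" % EMMA)
    interp.remove(hook)      # drop the placeholder
    interp.append(filled)    # splice in the ink point
    interp.set("{%s}mode" % EMMA, "multimodal")
    return speech

merged = integrate(speech_doc, ink_doc)
point = merged.find("{%s}interpretation/location" % EMMA)
# point now carries type=point, x=17, y=54 alongside the speech action
```

The `emma:hook` attribute comes from the slides' own speech interpretation; the merge strategy (replace the hooked element wholesale) is one plausible reading of the integration result shown above.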
Multimodality – New User Interfaces. Methods and functionalities: interaction manager.
Interaction manager (application-specific tasks):
Checking the input data: integrated input? speech only? ink/stylus only?
Checking the suitability of the integration results: is the input data compatible? (e.g. is the actual number of stylus inputs, say two, the same as the expected value?)
Mapping recognition results from different modalities, e.g.:
speech recognition error but stylus correct
speech recognition OK but stylus incorrect
speech confidence OK and stylus OK
Deciding on the error-handling output: graphical, audio, prompt, TTS
Handling redundant information and creating the related user reaction; prioritisation of input modalities
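The error-handling decision the interaction manager makes from these per-modality outcomes can be sketched as a small dispatch function. The confidence threshold, the tap counts, and all names below are invented for illustration; the original system's rules are not specified in the slides:

```python
def decide(speech_confidence, expected_taps, actual_taps):
    """Map recognition outcomes to a feedback channel."""
    if actual_taps != expected_taps:
        return "graphical"   # stylus input wrong: highlight on screen
    if speech_confidence < 0.5:
        return "prompt"      # speech unreliable: re-prompt the user
    return "tts"             # both OK: confirm via TTS

channel = decide(speech_confidence=0.9, expected_taps=2, actual_taps=2)
# channel == "tts": both modalities were acceptable
```

This mirrors the three cases listed above (stylus error, speech error, both OK), each mapped to one of the output channels the slide names.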