Speech Technology Opportunities and Challenges
David Nahamoo, Speech CTO, IBM Research
Dec 12, 2006
2
Needs for Speech Technology
Lower cost
Increased customer satisfaction
From cost to revenue
Ease of use
Speed, efficiency
Extended reach
Integration of voice/video with enterprise data
Indexing of large amounts of multimedia info
Breaking language barriers
Accessibility
[Figure: value diagram. Value areas: AUTOMATION, USABILITY, GLOBAL ACCESS. Solution areas: Multichannel Self-Service, Multimodal Interaction, Multimedia Analytics. Other labels: Devices, Commerce, Information; Command & control, Dictation, Information Access, Transactional, Problem solving, Accessibility, Multilingual communication, VoiceWeb, Transcription]
3
Major Speech Application Opportunities
Commerce
– Contact Centers
– Unified Communication
Global Access
– Speech To Speech Translation
– Translingual MultiMedia Mining
– Accessibility
Devices
– Automotive
– Set Top Box
– Mobile Phones
4
Speech Technology Innovation that Matters
• Conversational Interaction – Dealing with Complexity
• Speech Analytics – Extracting Insight / Knowledge
• Multilingual Dimension – Globalization
Contact Centers Of Future
6
Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth
1. Cost Reduction/Containment
2. Customer Experience Improvement
3. Revenue Growth
Differing emphasis can be placed on each one, but unless managed carefully and balanced effectively for the business, the effects can be disastrous…
BUT…
– Too much focus on Cost Reduction can actually lead to Poor Customer Experience and Limits on Revenue Growth
– Too much focus on Revenue Growth can actually lead to Rising Costs and Poor Customer Experience
– Too much focus on the Customer Experience can actually lead to Rising Costs and Limits on Revenue Growth
7
Contact Centers – Logical Components and Focus Areas
[Figure: contact center component diagram. Recoverable labels:]
Network Services: VoIP Gateway, Public Internet, Managed IP Network
Channel Services – Assisted: Web, Voice, Chat, Email, Voicemail, Video
Channel Services – Self-service: Web, Voice, Chat, Voicemail, Email, Portal
Agent Services: Agent Desktop, Routing, Skills, WFM
Outbound Services: Dialer
Back-end business processes, applications and information services (internal and external): ERP, SCM, CRM
Other labels: Presence, Fax, Universal Queue, Voice Callback, Contact Services, Data Services, ODS, ECM, MDM, EDW, Systems, Information, Analytics, QAM, KPIs, Alerts, Dashboards, RTA, KM, UIM, Search
8
Contact Centers – Logical Components and Focus Areas
[Figure: the contact center component diagram repeated, with focus areas highlighted: Self Service, Information Integration Analytics, Agent Performance, Revenue Growth, Multi-channel Access]
Self-Service
10
Increased Self-Service
Self-service to 80% levels and higher is possible in at least some centers
– Today’s contact centers are typically 10 to 20% self-service in most industries, but at least some companies claim 80% self-service now where Web-based interaction predominates; when voice predominates, numbers are much lower
– Live-agent costs are an order of magnitude higher than the costs of self-service
Self-service adoption has been slow to take off (8% growth 2003-2005)
– Self-service is more challenging technically than agent performance because of the difficulty of achieving high customer satisfaction
– Self-service is often run by a different group from the one that runs the contact center
– Self-service will be the end-game as labor arbitrage becomes increasingly difficult
Whichever vendor develops ways to drive self-service fastest (while maintaining customer satisfaction) will have a commanding position in the marketplace
– Self-service is clearly a huge cost-savings opportunity
11
Customers prefer the convenience and control aspect of self-service, and have high expectations
Customers prefer self-service
– Self-service preferred for many types of customer contact
• Viewing Bill (42%)
• Checking Minutes (44%)
• Checking/Changing a Talkplan/package (37%)
• Subscribing to Services (38%)
– Web preferred to the phone (50%)
• Provided one can obtain answers in the same amount of time
And their expectations are very high
– Ease of use
• 86% indicate they would stop using an organization if their IVR was difficult to use
– High level of service
• 82% indicate lower level of service via the web unacceptable
– Majority indicate they would abandon a web transaction or go to a competitor due to usability issues.
Source: Fujitsu Consulting and Netonomy, Modalis Research Technologies, Genesys, Inc., Harris Interactive
[Chart: “How important was the ability to serve yourself (as a customer) in your decision to use the service provider in the first place?” Responses (Very important, Quite important, Not very important, Not at all important) shown by sector: Mobile, Bank, Health insurance, Household insurance]
12
IVRs are still the dominant self-service channel and they are increasingly becoming speech-enabled
IVRs are still the dominant contact channel
– 45% of contact starts in IVR channel
IVRs are becoming speech-enabled
– Speech-enabled IVRs support more complex functionality and higher completion rates
• Well-designed voice user interfaces (VUI) can reduce call time by as much as 30% compared to traditional IVR systems and cut opt-out rates by 50% (Forrester, 2003)
• Increased IVR retention rate. Companies are up to 60% more likely to retain a caller within the IVR using speech vs. touch tone (Giga)
13
Conversational Interaction
Should bridge the gap between the user's mental model and the application model
– Task Complexity
– User Familiarity
– User Patience
Should minimize the user effort and task completion time
– Consistent
– Rapid
– Efficient
14
Conversational Solutions
[Figure: applications arranged by COMPLEXITY (LOW, MEDIUM, HIGH) across three types: INFORMATIONAL, TRANSACTIONAL, PROBLEM SOLVING. Applications shown: STOCK QUOTE, FLIGHT STATUS, PACKAGE TRACKING, CALL ROUTING, STOCK TRADING, FLIGHT RESERVATION, BANKING, TRAVEL, CUSTOMER CARE, TECHNICAL SUPPORT]
15
User and application models match, Time not a factor, No decision making
– PACKAGE TRACKING: String of numbers & characters + checksum ASR
– STOCK QUOTE: Large list of names and symbols ASR
User model is close to application, Some decision making, Time not a factor
– BANKING, STOCK TRADING: Directed Dialog and limited syntax NLU
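The package-tracking case pairs recognition of a digit string with a checksum. A minimal sketch of how that combination might work, assuming the recognizer returns an N-best list of digit strings and that the number carries a Luhn (mod 10) check digit. The slide does not name a checksum scheme, so Luhn and the names `luhn_valid` and `pick_hypothesis` are illustrative assumptions only.

```python
from typing import Optional, Sequence


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check."""
    if not digits.isdigit():
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def pick_hypothesis(nbest: Sequence[str]) -> Optional[str]:
    """Return the highest-ranked hypothesis with a valid checksum, else None.

    `nbest` is assumed to be ordered best-first by the recognizer.
    """
    for hyp in nbest:
        if luhn_valid(hyp):
            return hyp
    return None
```

When no hypothesis passes, the dialog would presumably re-prompt the caller rather than guess.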
16
User’s model might not match application’s, Involved decision making, Time a factor
– TRAVEL: Substantial Dialog (& Language Understanding)
User and application models match, No decision making, Long list of concepts
– CALL ROUTING: Substantial Language Understanding
17
Conversational Help Desk Challenges
Help Desk is the most complex of all three types of conversational speech applications
Complexity is based on Nature of the Call
• User domain model is limited at best
• User is usually upset
• Complex dialog and language understanding
Current Market Solution
• No Industry “best practices” have been established
Overview of IBM Help Desk
[Figure: help desk call flow. Recoverable labels:]
Incoming Calls
Main Menu: Workstation, host, password, business app, telephone
Agent handles 97-99.5% of calls
Self service (telephone) 0.5% to 3%
80% Service is HOWTO
Call types: Troubleshooting, Create Trouble Ticket, Password Reset, Self Help (FAQ/HOWTO), Not-entitled
Introducing Audrey
Speech Analytics
20
Contact Center Analytics
[Figure: analytics data-flow diagram. Recoverable labels:]
Contact Points: Branch office, Web, IVR, Call Center (Self Service, Agent)
Customer, Enterprise, Products & Services
Integrate & Analyze Structured & Unstructured Data
– Unstructured: Call logs & transcripts, Emails, Surveys
– Structured: Customer/Product Transaction Data
– Structured: Agent Data
Instant Market Intelligence: Customer preferences, Dissatisfaction Drivers, Lifetime Value Management
Analyze Agent Performance: Improve C-Sat, Upsell Rate
Analyze Contact Drivers: Improve FAQs, Web pages
Analytics enhances value for:
– Self-service
– Agent performance
– Cross-sell/up-sell
– Transformational Diagnostics
– Business Intelligence
– Marketing
– …
21
Call Center Operation Quality
Millions of Calls Every Day
Want general information:
– Are callers happy?
– Are processes followed?
– What are people asking for?
– What is the trend of occurrence of known problems?
– Are there new problems?
Need to know where to take action:
– Save a customer from defecting
– Apologize for mishandled calls
– Show call to agent for coaching
– Follow up on a missed sales opportunity
Currently
– Human monitoring is necessary for these things
– Only a small fraction of calls can be checked
– Most checking is wasted
– There is no permanent record of the calls
22
Speech Analytics for Automated Quality Monitoring
Background:
– IBM NA call center team listens to and evaluates ~1% of all calls
– 35 questions answered
• “did the agent use courteous words and phrases?”
• “did the agent speak in an appropriate tone?”
• “did the agent follow the closing procedure?”
• “did the agent solve the problem?”
– Mostly random calls, rarely interesting
– Typical of all call centers
CallRank Quality Monitoring Application:
– Monitor 100% of calls
– Answer questions and assign default ratings
– Provide a ranked list to human monitors to focus attention on bad calls
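The CallRank idea of answering questions automatically and handing monitors a ranked list could be sketched roughly as follows. This is only an illustration: the `CallEvaluation` type, the equal weighting of all questions, and the worst-first ordering by failed-question count are assumptions, not details of the IBM system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CallEvaluation:
    call_id: str
    # Question text -> True if the automated check passed (hypothetical format).
    answers: Dict[str, bool]

    @property
    def failed(self) -> int:
        """Number of questions this call failed."""
        return sum(1 for ok in self.answers.values() if not ok)


def rank_calls(calls: List[CallEvaluation]) -> List[CallEvaluation]:
    """Order calls worst-first so monitors listen to likely bad calls first."""
    # sorted() is stable, so calls with equal scores keep their input order.
    return sorted(calls, key=lambda c: c.failed, reverse=True)
```

A real system would likely weight questions differently (a missed resolution matters more than a missed greeting), which would replace the simple count with a weighted sum.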
[Figure: CallRank architecture (WebSphere). Audio from IBM call centers is extracted from CM/DB2 and passed through a UIMA processing pipeline (Collection Reader, Speech-to-Text Annotator, Analytics Annotators, CAS Consumer) that turns audio into text and evaluates calls; the analysis and transcripts are stored back into CM/DB2]
23
Example of a good call
24
Example of a bad call
25
Automated Quality Monitoring
• Status:
– Three times as many bad calls found for same listening effort
– Processing ~3000 calls/day now from all North American centers
• Technology:
– Answer many questions with pattern matching on decoded text
  Did the agent follow the appropriate closing script? Search for “THANK YOU FOR CALLING”, “ANYTHING ELSE”, “SERVICE REQUEST”
– Use other linguistic cues to improve the accuracy of the system
  Number of hesitations (UH, UM, HUM, etc), total silence, longest silence, …
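The pattern-matching approach can be illustrated with a short sketch. It assumes the decoded transcript is available as a plain string; the phrases and hesitation tokens come from the slide, while the function names and the `tail_words` window (how far from the end of the call to look for the closing script) are hypothetical choices.

```python
import re
from typing import List

CLOSING_PHRASES = ["THANK YOU FOR CALLING", "ANYTHING ELSE", "SERVICE REQUEST"]
HESITATIONS = {"UH", "UM", "HUM"}


def _words(transcript: str) -> List[str]:
    """Upper-case the transcript and split it into word tokens."""
    return re.findall(r"[A-Z']+", transcript.upper())


def closing_phrases_found(transcript: str, tail_words: int = 60) -> List[str]:
    """Return the closing-script phrases that occur near the end of the call."""
    # Assumption: the closing script falls within the last `tail_words` words.
    tail = " " + " ".join(_words(transcript)[-tail_words:]) + " "
    # Padding with spaces makes the match respect whole-word boundaries.
    return [p for p in CLOSING_PHRASES if f" {p} " in tail]


def count_hesitations(transcript: str) -> int:
    """Count hesitation tokens (UH, UM, HUM) in the transcript."""
    return sum(1 for w in _words(transcript) if w in HESITATIONS)
```

The silence cues mentioned on the slide would need word-level timestamps from the recognizer, which this sketch does not model.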
Agent Performance
27
Agent Performance
Personnel costs are by far the largest component of existing contact center costs
– Move to off-shore operations has resulted in significant (up to 75%) labor cost reductions
– Large contact centers have very large numbers of personnel
– Estimated 6M agents in U.S. in 2004 and continuing to grow
Even with the rise of self-service, a percentage of calls will still be handled by live agents
Numerous opportunities exist to improve performance by automation:
– Integration of systems across the business for use in the contact center
– On-boarding process (e.g., accent monitoring)
– Training (on-boarding, continuing education, real-time training)
– Agent quality monitoring
– Call logging (30% of agent time in some contact centers)
– Helping the agent find the answer to the customer’s question
– Workforce management
– Intelligent call routing globally
– Expert “multi-channel” agents
– Activity-centric computing and other collaborative projects
28
Agent Performance: Voice Assessment/Training
Increased number of off-shore centers
– e.g., India (>50% growth)
Key focus in off-shore contact centers
– Hiring: Shrinking candidate pool and high agent attrition rates
– Training: Train agents to have neutral accents to improve customer experience
Voice Assessment/Training System
– Candidate screening for
• Grammar
• Pronunciation
• Spoken language comprehension
– Accent training
• Correctness of pronunciation, intonation, speaking rate and syllable stress
29
Contact Centers Summary
Contact centers are focal points in an enterprise from which all customer contacts are managed
Contact Centers face a number of challenges as they attempt to balance costs, customer experience and revenue growth
Customers increasingly prefer self-service and speech self-service is now ready for prime time
Enterprises can achieve improved agent performance with agent productivity tools and agent hiring/training tools
Enterprises should focus on revenue growth, transforming their contact centers from cost centers to profit centers
Customer demand for choice, convenience and consistency is driving the adoption of multi-channel enablement in contact centers
Actionable intelligence from real-time and offline analytics of structured and unstructured customer interaction data will lead to new opportunities for cost reduction, revenue growth and improved customer experience
Increasing Global Reach
31
Global Language Barriers
Different languages spoken by people living in different regions or even by different ethnic groups living in the same region
Language barriers cause…
– High cost for agents – need both subject matter expertise and language skills
• Call centers, insurance agents, etc.
– Unreachable to broad international business or tourism travel market
– Life threatening in
• medical emergency
• natural disaster situations
• military
– Multilingual on demand media and entertainment
32
Data Point: Online population language Mismatch
*Global Internet Statistics (http://www.glreach.com/globstats/index.php3)
Mismatch:
The diversity of languages spoken online is increasing, yet the languages of web pages are consolidating
33
Multimodal Multimedia Translingual Access
[Figure: technology diagram. Recoverable labels:]
Inputs: Video, Audio, Image, Text
Technologies: Speech Recognition, OCR, Machine Translation, Information Analytics, Multimedia Analytics, Translingual Analytics
Capabilities: Multimodal Access, Translingual Access, Multimodal Translingual Access, Multimedia Translingual Analytics
Other labels: Informational & Transactional; Content; Context; Text mining, Categorization, Taxonomies, Entity extraction, Entity relation, Ontology, …; Transcription, biometrics, …
34
S2S Translation call for innovation
Speech Recognition Challenges
– Needs to work in noisy environments, with spontaneous, conversational speech in multiple languages; speech may be emotional when the speaker is under stress.
Translation has to handle output of ASR system
– Recognition errors
– Spoken language: different from written language
• Non-grammatical disfluencies
• Imperfect syntax
• Lack of formal characteristics of text: no punctuation or paragraphing
Translated text must be "speakable" for oral communication
– not enough to translate content adequately; output must be fluent
– Need to carefully consider and tune interactions between ASR, MT and NLG – need access to all components
Cost-effective development of new languages and domains
Intonation translation remains a grand challenge
35
Speech Technology Driving New Business Opportunities
• Increasing Self Service: More natural interaction with more difficult tasks is made possible
• Increasing Agent Productivity, Monitoring Quality, and Increasing Sales Opportunity: Extracting insight from the content of conversation
• Increasing the Global Reach: Breaking the language barrier