
DEEP LEARNING IN BUSINESS CONVERSATION ANALYSIS

ANTHONY SCODARY, GRIDSPACE

WONKYUM LEE, GRIDSPACE

“Which translation speech recognition so and so forth I mean there's a whole bunch of amazing applications that are made possible by deep learning and so internet service providers are using it for internal application development.

And then lastly what you mentioned as cloud service providers and basically because of the adoption of gp use and because of the success of kuta and so many applications are now able to be accelerate on gp use so that we can extend the capabilities of moore's law so that we can continue.

You'd have the benefits of of computing acceleration, which which in the cloud means reducing cost.

And that's on the serve cloud service provider side of of the Internet company so that would be amazon web services as the Google compute cloud.”

OVERVIEW

1. Business Conversations

2. Recognition

3. Analysis

1. Business Conversations

PROTOCOLS

SIGNAL PROCESSING

PROTOCOLS

- Symbol Set (Lexicon)

- Rules (Syntax)

- Meaning (Semantics)
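The three parts of a protocol listed above can be sketched as a toy example. This is a minimal illustration only: the command names, the syntax rule, and the meanings are all hypothetical and do not come from the talk.

```python
# Toy protocol with the three parts from the list above.
# Command names, rules, and meanings are hypothetical, for illustration only.
LEXICON = {"sit", "stay", "come"}           # symbol set

SEMANTICS = {                                # meaning of each valid message
    ("sit",): "dog sits",
    ("stay",): "dog holds position",
    ("come",): "dog approaches",
    ("sit", "stay"): "dog sits and holds position",
}


def is_well_formed(message):
    """Rules (syntax): one or two known symbols, with no symbol repeated."""
    symbols = message.split()
    return (
        1 <= len(symbols) <= 2
        and all(s in LEXICON for s in symbols)
        and len(set(symbols)) == len(symbols)
    )


def interpret(message):
    """Return the meaning of a message, or None if it cannot be decoded."""
    if not is_well_formed(message):
        return None
    return SEMANTICS.get(tuple(message.split()))


print(interpret("sit"))
print(interpret("sit stay"))
print(interpret("fetch"))
```

A message can fail at any layer: "fetch" is outside the symbol set, "sit sit" breaks the syntax rule, and "stay sit" is well formed but has no assigned meaning.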

TYPES OF PROTOCOLS

SOURCE MEDIUM

TYPES OF PROTOCOLS: ENDPOINTS

          NATURE           MACHINE       HUMAN
NATURE    BIRDCALL         SEISMOGRAPH   GROWLING
MACHINE   ELECTRIC FENCE   TCP           FIRE
HUMAN     “SIT”            SIRI          SPEECH

TYPES OF PROTOCOLS: H2H MEDIA

Axes: BANDWIDTH, INFORMATION DENSITY

Media plotted: SMS, VOICEMAIL, MISSED CALL, POSTCARD, WAVING, SPEECH

WHY DO WE STILL TALK?

- Fast

- Innate

- Layered

- Synchronous

- Dense in meaning

ORGANIZATIONS

INTERNAL COMMUNICATION

- Calls, Meetings, Hallway Chats

- Documents, Email, Chat, SMS

EXTERNAL COMMUNICATION

- Support Calls, In-Person Sales

- Chat Support, Social Media, Email


Mostly lost today

THIS DATA MATTERS

2. Recognition

REAL-TIME CALL ANALYSIS

ASR, DSP, SCANNER, CLASSIFIER

Feature Extraction (MFCC)

Acoustic Model (GMM)

Lexicon

Language Model

“hello”

Conventional ASR: a combination of blocks, each designed by experts in its own field

GMM-HMM: 1980-2010
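How the blocks above fit together can be sketched with a toy decoder. The acoustic model scores candidate phone sequences, the lexicon maps them to words, and the language model weighs how likely each word is. Every score, the three-word vocabulary, and the `lm_weight` parameter are hypothetical, and a real decoder searches over whole word sequences rather than a fixed candidate list.

```python
import math

# Hypothetical scores for one short audio segment; none of these numbers
# come from the talk.
# Acoustic model: log P(audio | phone sequence)
acoustic_logprob = {
    ("HH", "AH", "L", "OW"): -12.0,
    ("Y", "EH", "L", "OW"): -12.5,
    ("HH", "EH", "L", "P"): -15.0,
}
# Lexicon: phone sequence -> word
lexicon = {
    ("HH", "AH", "L", "OW"): "hello",
    ("Y", "EH", "L", "OW"): "yellow",
    ("HH", "EH", "L", "P"): "help",
}
# Language model: P(word), here a unigram model
lm_prob = {"hello": 0.6, "yellow": 0.1, "help": 0.3}


def decode(lm_weight=1.0):
    """Pick the word maximizing log P(audio | word) + lm_weight * log P(word)."""
    best_word, best_score = None, -math.inf
    for phones, am_score in acoustic_logprob.items():
        word = lexicon[phones]
        score = am_score + lm_weight * math.log(lm_prob[word])
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score


word, score = decode()
print(word, round(score, 3))
```

Because each block contributes a separate score, each can be tuned or replaced on its own, which is what made the block-by-block design practical for so long.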

Acoustic Model (GMM)

Lexicon

Language Model

“hello”

Lots of tuning to improve accuracy

Robust features, speaker adaptation, application-specific LM

Acoustic Model

Lexicon

Language Model

“hello”

Replacing acoustic model with deep neural net

DNN-HMM: 30%-40% improvement (2011-2017)
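One common way to plug a neural net into the existing HMM framework is the hybrid approach: the network predicts state posteriors for each frame, and these are divided by state priors to obtain scaled likelihoods that stand in for the GMM scores. The sketch below assumes that setup. The logits and priors are made up, and whether the system in the talk uses exactly this conversion is not stated.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_log_likelihoods(logits, state_priors):
    """Convert DNN outputs for one frame into HMM emission scores.

    By Bayes' rule, p(frame | state) is proportional to
    P(state | frame) / P(state), so the hybrid system uses
    log posterior - log prior in place of the GMM log-likelihood.
    """
    posteriors = softmax(logits)
    return [math.log(p) - math.log(q) for p, q in zip(posteriors, state_priors)]


# Hypothetical frame with three HMM states; priors would normally be
# estimated from state frequencies in the training alignments.
logits = [2.0, 1.0, 0.1]
priors = [0.7, 0.2, 0.1]
scores = scaled_log_likelihoods(logits, priors)
print([round(s, 3) for s in scores])
```

In this example the state with the highest posterior is not the one with the highest scaled likelihood, because dividing by the prior penalizes states that are frequent in the training data.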

All-in-one Deep Learning Model

“hello”

Someday in the near future: replacing all of the models with one neural net

End-to-End ASR: active research in progress

Chart: ASR error rate over the decades (in academia), WER on a log scale

- Simple linear model (GMM)

- Advanced linear model (GMM-SAT-DT)

- Deep learning model

- End-to-end deep learning (under development)

- “Human parity”

ASR HISTORY

“However, it’s still NOT easy for real-world business conversational speech”

Language Challenge

• Domain-specific terminology (company name, product name, …)

• Spontaneous speech (natural conversation)

• Accent, dialect, mispronunciation

Acoustic Challenge

• Noise (background, channel)

• Acoustic effects (reverberation, Lombard effect)

• Variability from speakers

• Microphone displacement (near/far field)

ASR CHALLENGES

Data is King!

- General conversational data + in-domain data (training with in-domain data improves accuracy by 15-30%)

- Simulated data with a variety of noise helps! (improves accuracy by 10-15%)

- Data collection with semi-supervised training helps

LARGE-SCALE DATA PROCESSING
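Simulating noisy data usually means mixing recorded or synthetic noise into clean speech at a chosen signal-to-noise ratio. The sketch below shows that one step under simple assumptions: a sine tone stands in for speech, Gaussian noise stands in for background noise, and the SNR values are arbitrary. The talk does not describe its actual augmentation recipe.

```python
import math
import random


def power(signal):
    """Mean squared amplitude of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the result has the requested SNR in dB."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[: len(speech)]
    # SNR_dB = 10 * log10(P_speech / P_noise)  =>  solve for the noise gain
    target_noise_power = power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


# Synthetic stand-ins for real recordings: a 440 Hz tone and Gaussian noise.
rng = random.Random(0)
sample_rate = 8000
speech = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sample_rate)]

# One clean utterance becomes several training examples at different SNRs.
augmented = {snr: mix_at_snr(speech, noise, snr) for snr in (20, 10, 5, 0)}
```

Each clean utterance yields several noisy variants, so the training set grows without any new recording or transcription effort.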

Multi-GPU Training

- 4x Titan X with parallel training

- One week for full training with 25k hours of audio

- 80x faster than a 32-core CPU machine

LARGE-SCALE DATA PROCESSING
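The figures on the multi-GPU slide imply some rough numbers worth working out. This is back-of-the-envelope arithmetic only. It assumes the 80x speedup applies to the whole training run, and it counts each hour of audio once, since the number of passes over the data is not given.

```python
# Figures from the slide.
audio_hours = 25_000        # training audio
gpu_wall_clock_days = 7     # "one week" on 4x Titan X
speedup_over_cpu = 80       # versus a 32-core CPU machine

gpu_wall_clock_hours = gpu_wall_clock_days * 24
# Hours of training audio covered per wall-clock hour, counting each hour
# of audio once (the number of passes over the data is not stated).
audio_hours_per_hour = audio_hours / gpu_wall_clock_hours

# Same job on the CPU machine, assuming the 80x factor holds end to end.
cpu_wall_clock_weeks = (gpu_wall_clock_days * speedup_over_cpu) / 7
cpu_wall_clock_years = cpu_wall_clock_weeks / 52

print(f"{audio_hours_per_hour:.0f} audio hours per wall-clock hour")
print(f"{cpu_wall_clock_weeks:.0f} weeks (~{cpu_wall_clock_years:.1f} years) on CPU")
```

Under these assumptions, a training run that takes a week on the GPUs would take about a year and a half on the CPU machine.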

Real-time adaptive processing

- Online i-vector adaptation (5-10% improvement)
  - Speaker characteristics
  - Environmental noise
  - Accent & dialect

- Context-based grammar adaptation (recognizes in-domain specific terms)

REAL-TIME ADAPTIVE PROCESSING
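One simple way to favor in-domain terms is to re-rank the recognizer's N-best hypotheses with a bonus for words from a context list. The sketch below illustrates that idea only. It is a swapped-in technique, not the grammar adaptation used in the talk, and the hypotheses, scores, and `bonus` value are hypothetical. It also matches single-word terms only.

```python
def rescore_with_context(nbest, context_terms, bonus=2.0):
    """Re-rank (text, score) hypotheses, adding a bonus per in-domain term found.

    Scores are log-domain, higher is better. The bonus value is a
    hypothetical tuning parameter.
    """
    rescored = []
    for text, score in nbest:
        words = set(text.lower().split())
        hits = sum(1 for term in context_terms if term in words)
        rescored.append((text, score + bonus * hits))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


# Hypothetical N-best list for one utterance, best first before rescoring.
nbest = [
    ("the success of kuta on gp use", -41.0),
    ("the success of cuda on gpus", -42.5),
]
# Terms supplied by the application context, e.g. a product glossary.
context_terms = {"cuda", "gpus"}

best_text, best_score = rescore_with_context(nbest, context_terms)[0]
print(best_text, best_score)
```

The second hypothesis starts with a lower score but contains two context terms, so it moves to the top after rescoring.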

State-of-the-art deep learning model

- Time-delay neural network

- Computation optimization (subsampling, bi-phone, etc.)

- WFST framework for search

“Purely sequence-trained neural networks for ASR based on lattice-free MMI”, Interspeech 2016
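The first two bullets can be illustrated with a single time-delay layer: each output frame combines input frames at a few fixed time offsets, and subsampling evaluates the layer only at every few time steps to save computation. This is a minimal pure-Python sketch of the general idea. The layer sizes, offsets, and stride are hypothetical and are not the configuration from the cited paper.

```python
def tdnn_layer(frames, offsets, weights, bias, stride=1):
    """One time-delay layer: each output frame sees input frames at t + offset.

    frames:  list of T input vectors, each of dimension D_in
    offsets: temporal context, e.g. (-1, 0, 1)
    weights: weights[k][j][i] maps input dim i at offsets[k] to output dim j
    stride:  evaluate only every stride-th time step (subsampling)
    """
    lo = max(0, -min(offsets))
    hi = len(frames) - max(0, max(offsets))
    outputs = []
    for t in range(lo, hi, stride):
        out = list(bias)
        for k, off in enumerate(offsets):
            x = frames[t + off]
            for j, row in enumerate(weights[k]):
                out[j] += sum(w * xi for w, xi in zip(row, x))
        outputs.append([max(0.0, v) for v in out])  # ReLU
    return outputs


# Ten 2-dimensional frames; the single output unit sums the first input
# dimension over the context window (hypothetical weights).
frames = [[float(t), 1.0] for t in range(10)]
offsets = (-1, 0, 1)
weights = [[[1.0, 0.0]] for _ in offsets]
bias = [0.0]

full_rate = tdnn_layer(frames, offsets, weights, bias)
subsampled = tdnn_layer(frames, offsets, weights, bias, stride=3)
print(len(full_rate), len(subsampled))
```

With a stride of 3 the layer computes roughly a third as many outputs, which is where the saving comes from.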

WER: 5~6% (Capital Market Model), 12~15% (Customer Intelligence Model)

Real-time factor: 0.3-0.35

STATE OF THE ART DEEP LEARNING MODEL
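For reference, the two metrics quoted on this slide can be computed as follows. Word error rate is the word-level edit distance between the reference and the hypothesis, divided by the number of reference words. Real-time factor is processing time divided by audio duration. The hypothesis below is a fragment of the transcript quoted at the start of the deck. The reference is an assumed reading of what was actually said, and the timing numbers are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution or match
            ))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(processing_seconds, audio_seconds):
    """RTF below 1.0 means decoding runs faster than the audio plays."""
    return processing_seconds / audio_seconds


# The reference is an assumption; only the hypothesis appears in the deck.
reference = "because of the adoption of gpus and because of the success of cuda"
hypothesis = "because of the adoption of gp use and because of the success of kuta"
print(f"WER: {word_error_rate(reference, hypothesis):.1%}")

# Hypothetical timing: a 60 s call decoded in 20 s.
print(f"RTF: {real_time_factor(20.0, 60.0):.2f}")
```

Note that a single misrecognized term such as "gpus" becoming "gp use" counts as two errors (one substitution and one insertion), which is part of why domain terminology weighs so heavily on WER.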

3. Analysis

IS TRANSCRIPTION REALLY WHAT YOU WANT ANYWAY?

STUFF WITH ACTUAL USE TO COMPANIES

- Prediction

- Classification

- Summarization

- Entity Extraction

- Anomaly Detection

“ARTIFICIAL INTELLIGENCE”

ARITHMETIC

GRAPH SEARCH

IMAGE RECOGNITION

CONVERSATION

EMOTION

CONSCIOUSNESS

ABOVE THIS LINE THIS SURELY IS

“REAL” INTELLIGENCE

TECHNOLOGY REVOLUTION

WASTE OF MONEY AND

We focus on industry needs as an engineering task.

ANALYSIS

1. Speech is complex.

Let models decide what features matter for a task or application.

ANALYSIS

2. Speech is high dimensional.

Datasets must be large enough to train large models to match.

ANALYSIS

3. Conversational speech is noisy.

Large, well-augmented datasets are necessary to be robust.

ANALYSIS

aardvark

One-hot (D dimensions) → ℝ³⁰⁰

ANALYSIS
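The step from a one-hot vector to a dense vector can be sketched as a matrix product that reduces to a row lookup. The example uses tiny stand-in sizes: a 5-word vocabulary and 4 dimensions instead of 300. The vocabulary is partly made up and the embedding values are random rather than trained.

```python
import random

# Tiny stand-ins: a 5-word vocabulary (D = 5) and 4-dimensional embeddings
# instead of the 300 dimensions on the slide. Values are random, not trained.
vocab = ["aardvark", "brother", "sister", "call", "meeting"]
index = {word: i for i, word in enumerate(vocab)}
rng = random.Random(0)
embedding_dim = 4
E = [[rng.uniform(-1, 1) for _ in range(embedding_dim)] for _ in vocab]  # D x 4


def one_hot(word):
    """D-dimensional vector with a single 1 at the word's index."""
    vec = [0.0] * len(vocab)
    vec[index[word]] = 1.0
    return vec


def embed_by_matmul(word):
    """Multiply the one-hot row vector by E (1 x D times D x 4)."""
    x = one_hot(word)
    return [sum(x[i] * E[i][j] for i in range(len(vocab))) for j in range(embedding_dim)]


def embed_by_lookup(word):
    """Equivalent shortcut: the product just selects one row of E."""
    return list(E[index[word]])


assert embed_by_matmul("aardvark") == embed_by_lookup("aardvark")
```

In practice only the lookup is ever computed. The one-hot form is useful mainly for seeing that the embedding matrix is an ordinary trainable weight matrix.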

BROTHER

SISTER

ANALYSIS
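The slide pairs BROTHER and SISTER, presumably to show that related words end up close together in the embedding space. Closeness between word vectors is commonly measured with cosine similarity, sketched below. The three vectors are hand-made and purely illustrative.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made 3-dimensional vectors, purely illustrative. Trained embeddings
# would have hundreds of dimensions and learned values.
vectors = {
    "brother": [0.9, 0.8, 0.1],
    "sister": [0.9, 0.7, -0.1],
    "aardvark": [-0.2, 0.1, 0.9],
}

print(round(cosine_similarity(vectors["brother"], vectors["sister"]), 3))
print(round(cosine_similarity(vectors["brother"], vectors["aardvark"]), 3))
```

With these toy values the first pair scores close to 1 and the second close to 0, which is the pattern trained embeddings tend to show for related and unrelated words.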

i have no political party actually

≈ ‘democrat’

ANALYSIS

gridspace.com

QUESTIONS?
