barga didc'14 invited talk

Post on 09-Feb-2017


A Look into the Future by Learning from the Past

Roger S. Barga

Cloud Machine Learning, Cloud and Enterprise

Microsoft Corporation

This isn’t an academic talk…

This isn’t an applied research talk…

1 1 5 4 3

7 5 3 5 3

5 5 9 0 6

3 5 2 0 0

1. Learn it when you can’t code it

2. Learn it when you can’t scale it

3. Learn it when you have to adapt/personalize

4. Learn it when you can’t track it

2011 (“Big Data, DNN”)

• Distributed computing and storage

• Deep Neural Networks

• Learning = Scalable, Adaptive Computation for Various Big Data

2005 (“Graphical Models”)

• Wide application in products

• Statistical Modeling of Data

• Learning = Parameter Estimation or Inference

2000 (“Kernel Machines”)

• Statistical Learning Theory

• Scoring Systems

• Learning = Optimization of Convex Functions

1990 (“Symbolic”)

• Expert Systems

• Decision-Tree Learning (C4.5)

• Learning = Methods to automatically build Expert Systems

1980 (“Neuro”)

• Neural Networks

• Artificial Intelligence

• Learning = Adaptation of Neurons based on External Stimuli

The future will belong to those who can turn their historical data into predictive models…

• Vision Analytics

• Recommendation engines

• Advertising analysis

• Weather forecasting for business planning

• Social network analysis

• Legal discovery and document archiving

• Pricing analysis

• Fraud detection

• Churn analysis

• Equipment monitoring

• Location-based tracking and services

• Personalized Insurance

Machine learning and predictive models are core new capabilities that will touch everything in the new enterprise

training data (expensive)

synthetic training data (cheaper)

solve hard problems

value from Big Data

data analytics

Machine learning enables nearly every value proposition of web search.

Hundreds of thousands of machines…

Hundreds of metrics and signals per machine…

Which signals correlate with the real cause of a problem?

How can we extract effective repair actions?
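One simple way to start on the question of which signals track a problem is to rank each signal by how strongly it correlates with a failure indicator. The sketch below is only an illustration under stated assumptions: the signal names, the sample values, and the helper names `pearson` and `rank_signals` are hypothetical and do not come from the talk, and a high correlation does not by itself identify the real cause.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant signal carries no information about failures
    return cov / math.sqrt(var_x * var_y)


def rank_signals(signals, failures):
    """Return (name, correlation) pairs, strongest absolute correlation first."""
    scores = {name: pearson(values, failures) for name, values in signals.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# Hypothetical per-machine observations; 1 in `failures` marks a machine with a problem.
signals = {
    "disk_errors": [0, 1, 0, 5, 6],
    "cpu_temp": [50, 50, 50, 50, 50],
    "net_latency": [3, 1, 2, 1, 3],
}
failures = [0, 0, 0, 1, 1]

ranking = rank_signals(signals, failures)
for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

With hundreds of signals per machine, a ranking like this is at most a first filter for deciding which signals deserve closer investigation.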

solve hard problems

value from Big Data

data analytics

human intelligence

[Figure: word error rate (WER %), on a 0% to 100% axis, by year from 1993 to 2012]
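For readers unfamiliar with the WER metric, it is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between the recognizer output and a reference transcript, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the evaluation code behind the chart; the function name `word_error_rate` and the sample sentences are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("they seldom attack a human", "they seldom attack human")
print(f"WER = {wer:.0%}")
```

Because insertions count as errors, WER can exceed 100% when the hypothesis is much longer than the reference.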

[Diagram: speech recognition pipeline. Training: English training data. Runtime: English speech input in, English words out, with some errors; with more training data, fewer errors.]

How much data? About the same as a human needs…

Can we learn the internal representation of human speech?

[Diagram: the same training and runtime pipeline, with English, or French, or Chinese training data, speech input, and output words.]

Shetland Sheepdog (0.72) Shoe Store (0.56) Attack Aircraft Carrier (0.81)

Steel Arch Bridge (0.74) Ballplayer, Baseball Player (0.86) Catamaran (0.51)

Wood Rabbit, Cottontail, Cottontail Rabbit (0.18)

The first image returned is Rajiv Gandhi (her husband) in the Answer.

An image of Lindsay Lohan appears in the Images Answer

not really

solve hard problems

value from Big Data

data analytics

human intelligence

engineering practices

intelligence will become ambient

intelligence from machine learning

[Figure: Overall NDCG, Bing NDCG vs. Google NDCG, on an axis running from 55 to 71]
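NDCG (normalized discounted cumulative gain) scores a ranked result list by summing graded relevance judgments, discounted by rank position, and dividing by the score of the ideal ordering. The sketch below uses the common exponential-gain form, (2^rel - 1) / log2(rank + 1); which variant and which relevance scale the chart uses is not stated in the slides, so treat the formula choice and the sample judgments as assumptions.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with exponential gain and log2 rank discount."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """NDCG of a ranked list of graded relevance labels, optionally cut off at rank k."""
    ranked = list(relevances)[:k]
    ideal = sorted(relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    if ideal_dcg == 0:
        return 0.0  # no relevant results at all
    return dcg(ranked) / ideal_dcg


# Hypothetical relevance judgments (0 = bad, 3 = perfect) for one query's top results.
perfect_order = ndcg([3, 2, 1, 0])
swapped_order = ndcg([0, 1])
print(f"{perfect_order:.3f} {swapped_order:.3f}")
```

The function returns a value between 0 and 1; averaging it over a query set gives a single search-quality number that can be tracked over time.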

The razor-toothed piranhas of the genera Serrasalmus and Pygocentrus are the most ferocious freshwater fish in the world. In reality they seldom attack a human.

Template matching

pypygygogoc

Pygocentrus

Sentence-level decoding

The Intelligent Cloud

• Machine Learning & Analytics

• Crowd Sourcing

• Massive & Diverse Data

The Cloud - Where Everything Comes Together
