1545 Amazon MaschinellesLernen-FRAv4 · Machine Learning at Amazon

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. · Barbara Pogorzelska, TPM Machine Learning · June 30, 2016 · Machine Learning at Amazon

Upload: lamdang

Post on 30-Jul-2018


TRANSCRIPT

Page 1

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Barbara Pogorzelska
TPM Machine Learning
June 30, 2016

Machine Learning at Amazon

Page 2

Agenda

• Introduction to Amazon Machine Learning
• Machine Learning at Amazon
• Customer Case Studies

Page 3

Introduction to Amazon Machine Learning

Page 4

Machine Learning at Amazon

Page 5

Machine Learning Opportunities @ Amazon

Retail
• Demand Forecasting
• Vendor Lead Time Prediction
• Pricing
• Packaging
• Substitute Prediction

Customers
• Product Recommendation
• Product Search
• Visual Search
• Product Ads
• Shopping Advice
• Customer Problem Detection

Seller
• Fraud Detection
• Predictive Help
• Seller Search & Crawling

Catalog
• Browse-Node Classification
• Meta-data validation
• Review Analysis

Digital
• Named-Entity Extraction
• XRay
• Plagiarism Detection
• Echo Speech Recognition

Page 6

Locations

• ML Seattle
• ML Bangalore
• S9
• A9
• A2Z
• Ivona
• ML Berlin
• Evi

Page 7

Machine Learning in Berlin

ML @ Amazon
• Forecasting (Retail)
• Content Linkage (Digital)
• Scalable Algorithms & Services (AWS)
• Visual Services (Retail & Digital)

Page 9

Forecasting

Setting
• Given past sales of a product in every region, predict regional demand up to one year into the future

Challenges
• New Products: No past demand!
• Regionalized: 150+ fulfillment centers worldwide
• Sparsity: Huge skew – many products sell very few items
• Seasonal: Huge variation due to external, seasonal events
• Distributions: Future is uncertain → predictions must be distributions
• Scale: 20M+ products fulfilled by Amazon alone!
• Orders: Customers demand bundles of products
• Censored: Past sales ≠ past demand (inventory constraint)
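Two of the forecasting challenges, that predictions must be distributions and that past sales are censored by inventory, can be illustrated with a minimal sketch. This is not the method used at Amazon: it assumes a naive bootstrap over past weekly sales that ignores seasonality and regional structure, and the names `uncensored_weeks`, `quantile_forecast`, and `pinball_loss` are hypothetical.

```python
import random


def uncensored_weeks(weekly_sales, weekly_stock):
    """Keep only the weeks in which sales stayed below available stock.

    In a sold-out week, sales are only a lower bound on demand. This sketch
    simply drops those weeks (an assumption; a real system would model the
    censoring rather than discard data).
    """
    return [s for s, stock in zip(weekly_sales, weekly_stock) if s < stock]


def quantile_forecast(weekly_sales, horizon_weeks, quantiles=(0.1, 0.5, 0.9),
                      n_samples=2000, seed=0):
    """Bootstrap quantiles of total demand over the next `horizon_weeks`."""
    rng = random.Random(seed)
    totals = sorted(
        sum(rng.choice(weekly_sales) for _ in range(horizon_weeks))
        for _ in range(n_samples)
    )
    # Read each requested quantile off the sorted bootstrap samples.
    return {q: totals[min(int(q * n_samples), n_samples - 1)] for q in quantiles}


def pinball_loss(actual, predicted, q):
    """Quantile (pinball) loss for a forecast of the q-th quantile."""
    diff = actual - predicted
    return q * diff if diff >= 0 else (q - 1) * diff


if __name__ == "__main__":
    sales = [3, 0, 5, 2, 8, 8, 1]
    stock = [10, 10, 10, 10, 8, 8, 10]  # the two weeks with stock 8 sold out
    history = uncensored_weeks(sales, stock)
    print(quantile_forecast(history, horizon_weeks=4))
```

The forecast is a set of quantiles rather than a single number, and the pinball loss scores each quantile separately: with `q = 0.9`, under-forecasting costs nine times as much as over-forecasting by the same amount.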

Page 10

Forecasting Seasonality

Page 12

Content Linkage

Setting
• Enrich Every Piece of Digital Content Continuously by Linking it to Relevant Content on Amazon and the Web

Challenges
• Scale: Millions of books – with 1000’s added each day!
• Languages: Over 20 different languages (Machine Translation!)
• Media: Link books, movies, products and maps together
• Web: Web grows by 1B+ pages per day
• Representation: Language and media-independent (Wiki?)

Page 13

XRay

Page 14

ASIN Machine Translation

Figure (axes: ASINs, Contribution Profit; labels: Human Translation, Machine Translation, Selection Gap)

Page 16

Scalable Algorithms & Services

Setting
• No limitations on model size and data size!

Challenges
• Distributed: Parameters need to be distributed
• Fault Tolerance: Data and model chunks might fail
• Simplicity: Zero-parameter algorithms for engineers
• Any-Time: Any-time convergence of algorithms
• Resource Constraints: Learning algorithms that optimize under resource & budget constraints
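Two of these challenges, distributing parameters and any-time convergence, can be sketched in a few lines. This is an illustration under stated assumptions, not the design of any Amazon service: parameters are assigned to servers by a stable hash of their name, and training is written as a generator so the caller can stop at any step and keep the latest estimate. The names `shard_for`, `partition`, and `anytime_sgd` are hypothetical.

```python
import zlib


def shard_for(param_name, num_servers):
    """Pick a server for a parameter using a stable hash of its name."""
    return zlib.crc32(param_name.encode("utf-8")) % num_servers


def partition(params, num_servers):
    """Split a {name: value} parameter dict into one dict per server."""
    shards = [{} for _ in range(num_servers)]
    for name, value in params.items():
        shards[shard_for(name, num_servers)][name] = value
    return shards


def anytime_sgd(data, max_steps, lr=0.05):
    """Fit y = w * x by SGD, yielding the current w after every step.

    An estimate is available after each step, so the caller can stop
    whenever its time or budget runs out and keep the latest value.
    """
    w = 0.0
    for step in range(max_steps):
        x, y = data[step % len(data)]
        w -= lr * 2 * (w * x - y) * x  # gradient of (w * x - y) ** 2
        yield w


if __name__ == "__main__":
    shards = partition({"w_%d" % i: 0.0 for i in range(10)}, num_servers=3)
    print([len(s) for s in shards])

    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # generated by y = 2x
    for step, w in enumerate(anytime_sgd(data, max_steps=1000)):
        if step == 50:  # stand-in for "budget exhausted"
            break
    print(w)
```

Hashing the parameter name keeps the assignment deterministic without a central lookup table, while the generator form makes the stopping rule the caller's decision rather than the algorithm's.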

Page 17

Three types of data-driven development

• Retrospective: analysis and reporting (Amazon Redshift, Amazon RDS, Amazon S3, Amazon EMR)
• Here-and-now: real-time processing and dashboards (Amazon Kinesis, Amazon EC2, AWS Lambda)
• Predictions: to enable smart applications (Amazon Machine Learning)

Page 19

Automated Produce Inspection: The Goal

New Automated Inspection

Current Inspection

Computer Vision

Page 20

Customer Case Studies

Page 21

AdiMap Case Study

Company
• Data science company that combines the disciplines of computer science, statistics, and business

AdiMap Jobs
• Employee & Employer: what is the salary of jobs in US companies?

AdiMap Apps
• App Business: what are the financials of apps and developers?

AdiMap Spend
• Advertiser & Publisher: what is the ad spend and revenue worldwide?

AdiMap Elections
• Voter & Candidate: what is the ad spend of US presidential candidates?

Page 22

BuildFax Case Study

Company
• Aggregates dispersed building permit data from across the United States
• Provides the processed data to other businesses, such as insurance companies, building inspectors, and economic analysts

• Old predictive models were based on ZIP codes and other general data, using the Python and R languages

• New models, based on data sets from public sources and from customers, estimate job costs with 80% accuracy

Page 23

Fraud.net Case Study

Company
• Aggregates and analyzes large amounts of fraud data from thousands of online merchants in real time
• Protects more than 2 percent of all U.S. e-commerce
• Fraud.net saves its customers about $1 million a week by helping them detect and prevent fraud

• “On any given day, we might see 100 different fraud schemes, each one with 100 different variations.”

• “As new fraud schemes pop up, we have to identify and create models around those specialized situations.”

Page 24

47Lining

Company
• 47Lining is an AWS Advanced Consulting Partner with Big Data Competency designation
• Develops big data solutions built using AWS building blocks: Redshift, Kinesis, S3, DynamoDB, Machine Learning and Elastic MapReduce

Churn Prediction with 71% Accuracy

Consumer Credit Behavior

Propensity to Purchase Real Estate