© 2018 Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Predictive Analytics with Amazon SageMaker
Steve Shirkey
Specialist SA, AWS (Singapore)
Explosion in AI and ML Use Cases
Image recognition and tagging for photo organization
Object detection, tracking and navigation for Autonomous Vehicles
Speech recognition & synthesis in Intelligent Voice Assistants
Algorithmic trading strategy performance improvement
Sentiment analysis for targeted advertisements
~1997
Thousands of Amazon Engineers Focused on Machine Learning
Fulfillment & logistics
Search & discovery
Existing products
New products
At AWS
Over 20 years of AI at Amazon…
• Applied research
• Core research
• Alexa
• Demand forecasting
• Risk analytics
• Search
• Recommendations
• AI services
• Q&A systems
• Supply chain optimization
• Advertising
• Machine translation
• Video content analysis
• Robotics
• Lots of computer vision…
• NLP/NLU
ML @ AWS
OUR MISSION
Put machine learning in the hands of every developer and data scientist
Customers Running Machine Learning On AWS Today
The Amazon Machine Learning Stack
APPLICATION SERVICES
Rekognition, Transcribe, Translate, Polly, Comprehend, Lex
PLATFORM SERVICES
Amazon SageMaker, AWS DeepLens, Amazon EMR (Spark ML), Amazon Mechanical Turk
FRAMEWORKS & INTERFACES
AWS Deep Learning AMIs
Caffe2, CNTK, Apache MXNet, PyTorch, TensorFlow, Chainer, Keras, Gluon
Let’s Review the ML Process
The Machine Learning Process
Business Problem – ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Data Visualization & Analysis
Feature Engineering
Model Training & Parameter Tuning
Model Evaluation
Are Business Goals met? (Yes / No)
Model Deployment
Monitoring & Debugging
Predictions
Data Augmentation
Feature Augmentation
Re-training
Discovery: The Analysts
[Figure: the machine learning process diagram, repeated]
• Help formulate the right questions
• Domain Knowledge
Integration: The Data Architecture
[Figure: the machine learning process diagram, repeated]
• Build the data platform:
  • Amazon S3
  • AWS Glue
  • Amazon Athena
  • Amazon EMR
  • Amazon Redshift Spectrum
Why We built Amazon SageMaker: Model Training Undifferentiated Heavy Lifting
[Diagram stages: Data Visualization & Analysis, Feature Engineering, Model Training & Parameter Tuning, Model Evaluation]
• Set up and manage Notebook Environments
• Set up and manage Training Clusters
• Write Data Connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithm to multiple machines
• Secure Model artifacts
Why We built Amazon SageMaker: Model Deployment Undifferentiated Heavy Lifting
[Diagram stages: Model Deployment, Monitoring & Debugging, Predictions]
• Set up and manage Model Inference Clusters
• Manage and Scale Model Inference APIs
• Monitor and Debug Model Predictions
• Model versioning and performance tracking
• Automate New Model version promotion to production (A/B testing)
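One way to picture the A/B testing bullet: SageMaker hosting lets an endpoint configuration list several production variants, each with a traffic weight. The sketch below is illustrative only. The model names, the 90/10 split, and the instance type are assumptions, not values from this deck. It builds the request body in the shape that boto3's `create_endpoint_config` call accepts, without calling AWS.

```python
def build_ab_endpoint_config(config_name, current_model, candidate_model,
                             candidate_share=0.1, instance_type="ml.m4.xlarge"):
    """Return a CreateEndpointConfig request that splits traffic between two models.

    All argument values are hypothetical; the models must already exist in SageMaker.
    """
    if not 0.0 < candidate_share < 1.0:
        raise ValueError("candidate_share must be strictly between 0 and 1")

    def variant(name, model_name, weight):
        # One production variant = one model plus the fleet that serves it.
        return {
            "VariantName": name,
            "ModelName": model_name,
            "InitialInstanceCount": 1,
            "InstanceType": instance_type,
            "InitialVariantWeight": weight,
        }

    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            variant("current", current_model, 1.0 - candidate_share),
            variant("candidate", candidate_model, candidate_share),
        ],
    }


request = build_ab_endpoint_config("demo-ab-config", "model-v1", "model-v2")
# To submit it (needs AWS credentials and existing models):
#   import boto3
#   boto3.client("sagemaker").create_endpoint_config(**request)
```

Traffic is routed in proportion to the variant weights, so promoting the candidate amounts to shifting weight toward it.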
Amazon SageMaker
Amazon SageMaker
Easily build, train, and deploy machine learning models
Collect and prepare training data
Choose and optimize your ML algorithm
Set up and manage environments for training
Train and tune model (trial and error)
Deploy model in production
Scale and manage the production environment
Amazon SageMaker
Easily build, train, and deploy machine learning models
BUILD
• Pre-built notebooks for common problems
• Built-in, high performance algorithms
Set up and manage environments for training
Train and tune model (trial and error)
Deploy model in production
Scale and manage the production environment
Training ML Models Using Amazon SageMaker
SageMaker Built-in Algorithms
k-Means Clustering
PCA
Neural Topic Modelling
Factorisation Machines
Linear Learner
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
DeepAR Forecasting
BlazingText (word2vec)
Random Cut Forest
k-Nearest Neighbor
Object Detection
Training ML Models Using Amazon SageMaker
SageMaker Built-in Algorithms
K-means Clustering
PCA
Neural Topic Modelling
Factorisation Machines
Linear Learner – Regression
Linear Learner – Classification
XGBoost
Latent Dirichlet Allocation
Image Classification
Seq2Seq
DeepAR Forecasting
SageMaker Framework SDK
TensorFlow SDK
MXNet (Gluon) SDK
Chainer SDK
PyTorch SDK
Bring Your Own Algorithms (ML Algorithms)
R
MXNet
TensorFlow
Caffe
PyTorch
Keras
CNTK
…
Apache Spark Estimator
Apache Spark Python library
Apache Spark Scala library
Amazon EMR
Amazon SageMaker
Easily build, train, and deploy machine learning models
BUILD
• Pre-built notebooks for common problems
• Built-in, high performance algorithms
TRAIN
• One-click training
• Hyperparameter tuning
Deploy model in production
Scale and manage the production environment
Amazon SageMaker
Easily build, train, and deploy machine learning models
BUILD
• Pre-built notebooks for common problems
• Built-in, high performance algorithms
TRAIN
• One-click training
• Hyperparameter tuning
DEPLOY
• One-click deployment
• Fully managed hosting with auto-scaling
Reference Architecture
[Diagram labels, grouped by stage]
Prepare: Raw Data (Amazon S3), SageMaker Notebooks, Prepared Data (Amazon S3)
Train & Optimise: Training Data, Training Algorithm, Algorithm Container, SageMaker Training, HPO, Trained Model
Deploy: Trained Model, SageMaker Hosting, AWS Lambda, API Gateway, Inference requests, User Interactions
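The Deploy side of this diagram can be sketched as a small AWS Lambda handler. This is a minimal sketch that assumes requests flow from API Gateway to Lambda to the SageMaker-hosted endpoint. The endpoint name and the CSV payload format are hypothetical. `invoke_endpoint` on the boto3 `sagemaker-runtime` client is the real call; the optional `runtime` argument exists only so the handler can be exercised without AWS.

```python
import json

ENDPOINT_NAME = "my-sagemaker-endpoint"  # hypothetical endpoint name


def get_runtime_client():
    # Imported lazily so the module loads even where boto3 is not installed.
    import boto3
    return boto3.client("sagemaker-runtime")


def handler(event, context, runtime=None):
    """Forward the API Gateway request body to the endpoint and return its prediction."""
    runtime = runtime or get_runtime_client()
    payload = event["body"]  # with proxy integration the body arrives as a string
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="text/csv",  # assumption: the model accepts CSV rows
        Body=payload,
    )
    prediction = response["Body"].read().decode("utf-8")
    return {"statusCode": 200, "body": json.dumps({"prediction": prediction})}
```

Lambda itself calls `handler(event, context)`; the third parameter is only for local testing with a stand-in client.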
Walkthrough: SageMaker Console
Amazon SageMaker Architecture
[Diagram built up over six slides; labels from the final build]
Client application
Amazon SageMaker
Amazon ECR: Training code, Inference code
Model Training (on EC2): Training code, Helper code
Training data
Model artifacts
Model Hosting (on EC2): Inference code, Helper code
Inference Endpoint: Inference request, Inference response
Ground Truth
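The training half of this architecture (a training image from Amazon ECR, training data in, model artifacts out) maps onto the fields of a SageMaker training job request. The sketch below only builds that request as a dictionary. The image URI, role ARN, S3 locations, and instance settings are all placeholder assumptions, and storing data and artifacts in S3 is assumed here rather than stated on the slide.

```python
def build_training_job_request(job_name, image_uri, role_arn,
                               train_s3_uri, output_s3_uri, hyperparameters=None):
    """Return a CreateTrainingJob request body; every argument value is a placeholder."""
    return {
        "TrainingJobName": job_name,
        # Training code: a container image pulled from Amazon ECR.
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        # Training data: copied onto the training instances before the job starts.
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        # Model artifacts: written here when training finishes.
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # Model Training (on EC2): the managed instances that run the container.
        "ResourceConfig": {
            "InstanceType": "ml.m4.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects hyperparameter values as strings.
        "HyperParameters": {k: str(v) for k, v in (hyperparameters or {}).items()},
    }


request = build_training_job_request(
    job_name="demo-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-algo:latest",
    role_arn="arn:aws:iam::123456789012:role/DemoSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    hyperparameters={"epochs": 10, "learning_rate": 0.1},
)
# To submit it (needs AWS credentials and real resources):
#   import boto3
#   boto3.client("sagemaker").create_training_job(**request)
```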
Hyperparameter Tuning
Photo credit: https://www.flickr.com/photos/ceasedesist/5821282085 (Licensed for commercial use)
Option 1: Grid Search
[Figure: axes are Learning Rate and Regularization]
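Grid search over the two axes on this slide (learning rate and regularization) can be sketched in a few lines of Python. The `validation_loss` function is a hypothetical stand-in for training a model and measuring its validation loss, and the grid values are illustrative.

```python
import itertools


def validation_loss(learning_rate, regularization):
    # Hypothetical stand-in for "train a model, return its validation loss".
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def grid_search(objective, grid):
    """Evaluate every combination of the listed values and keep the best one."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid = {
    "learning_rate": [0.001, 0.01, 0.1, 1.0],
    "regularization": [0.0, 0.01, 0.1],
}
best_params, best_score = grid_search(validation_loss, grid)
```

The cost is the product of the grid sizes (here 4 × 3 = 12 training runs), which grows quickly as hyperparameters are added.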
Option 2: Random Search
[Figure: axes are Learning Rate and Regularization]
Bergstra, Bengio, “Random Search for Hyper-Parameter Optimization”
https://dl.acm.org/citation.cfm?id=2188395
“Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time.”
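Random search replaces the fixed grid with independent random draws from each hyperparameter's range. The sketch below is illustrative: `validation_loss` is a hypothetical stand-in for a real training run, and sampling on a log scale between 1e-4 and 1 is an assumption chosen for this example.

```python
import random


def validation_loss(learning_rate, regularization):
    # Hypothetical stand-in for "train a model, return its validation loss".
    return (learning_rate - 0.1) ** 2 + (regularization - 0.01) ** 2


def random_search(objective, n_trials, seed=0):
    """Sample each hyperparameter independently and keep the best trial."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {
            # Log-uniform draws between 1e-4 and 1 (an assumed search range).
            "learning_rate": 10 ** rng.uniform(-4, 0),
            "regularization": 10 ** rng.uniform(-4, 0),
        }
        score = objective(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(validation_loss, n_trials=12)
```

With the same budget of 12 runs as a 4 × 3 grid, every trial tries a new value of each hyperparameter instead of revisiting the same few.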
Option 3: Bayesian Optimization Papers
• A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning (https://arxiv.org/abs/1012.2599)
• Practical Bayesian Optimization of Machine Learning Algorithms (https://arxiv.org/abs/1206.2944)
• Taking the Human Out of the Loop: A Review of Bayesian Optimization (https://ieeexplore.ieee.org/document/7352306)
Hyperparameter Tuning in Amazon SageMaker
Amazon SageMaker’s Hyperparameter Tuning feature is
based on an implementation of Bayesian Optimization,
along with some additional optimizations
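To give a feel for the general idea, here is a minimal Bayesian optimization sketch. It is not SageMaker's implementation, whose details are not described in this deck. It assumes a single hyperparameter scaled to [0, 1], a Gaussian-process surrogate with an RBF kernel, and the expected-improvement acquisition function. The kernel length scale, the candidate grid, and the toy `validation_loss` are all illustrative choices.

```python
import math

import numpy as np


def rbf_kernel(a, b, length_scale=0.2):
    # Squared-exponential similarity between two 1-D arrays of points.
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)


def gp_posterior(x_obs, y_obs, x_new, jitter=1e-6):
    """Posterior mean and standard deviation of a GP with unit prior variance."""
    y_mean = y_obs.mean()
    k_obs = rbf_kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    k_cross = rbf_kernel(x_obs, x_new)
    mean = y_mean + k_cross.T @ np.linalg.solve(k_obs, y_obs - y_mean)
    var = 1.0 - np.sum(k_cross * np.linalg.solve(k_obs, k_cross), axis=0)
    return mean, np.sqrt(np.clip(var, 1e-12, None))


_erf = np.vectorize(math.erf)


def expected_improvement(mean, std, best):
    # Expected amount by which a candidate beats the best loss seen so far.
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf


def bayesian_search(objective, n_init=3, n_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    candidates = np.linspace(0.0, 1.0, 201)
    x_obs = rng.uniform(0.0, 1.0, size=n_init)  # a few random starting trials
    y_obs = np.array([objective(x) for x in x_obs])
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, candidates)
        ei = expected_improvement(mean, std, y_obs.min())
        x_next = candidates[np.argmax(ei)]  # most promising next trial
        x_obs = np.append(x_obs, x_next)
        y_obs = np.append(y_obs, objective(x_next))
    best = int(np.argmin(y_obs))
    return float(x_obs[best]), float(y_obs[best])


def validation_loss(x):
    # Hypothetical stand-in for a training run; minimum at x = 0.3.
    return (x - 0.3) ** 2


best_x, best_y = bayesian_search(validation_loss)
```

Unlike grid or random search, each new trial is chosen using the results of all earlier trials, which is why the approach suits expensive training runs.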
Walkthrough: Hyperparameter Tuning
Additional Resources
Next Steps with Amazon SageMaker
• Getting started with Amazon SageMaker:
• https://aws.amazon.com/sagemaker/
• Use the Amazon SageMaker SDK:
• For Python: https://github.com/aws/sagemaker-python-sdk
• For Spark: https://github.com/aws/sagemaker-spark
• SageMaker Code Samples / Workshops:
• https://github.com/awslabs/amazon-sagemaker-examples
• https://github.com/awslabs/amazon-sagemaker-workshop
Amazon ML Solutions Lab
Brainstorming, Modeling, Teaching
Amazon ML Solutions Lab provides ML expertise
Leverage Amazon experts with decades of ML experience with technologies like Amazon Echo, Amazon Alexa, Prime Air and Amazon Go
AWS Training Offer
Make your data driven decisions count, and make a career in Big Data on AWS. Follow the Big Data Specialty learning path and become a specialist in Big Data:
• Implement core AWS Big Data services according to best practices
• Design and maintain Big Data
• Leverage tools to automate data analysis
Who should attend
• Enterprise solutions architects
• Data scientists
• Big Data solutions architects
• Data analysts
Free AWS digital training: Foundational knowledge
Free AWS digital training: Big Data Technology Fundamentals
Big Data on AWS – 3-day Classroom Training
Certified Cloud Practitioner
Associate-level Certification
AWS Certified Big Data - Specialty
Visit www.aws.training to find out more.
Q&A
Thank You For Attending
AWS Data Driven Decisions Webinar Series.
We hope you found it interesting! A kind reminder to complete the survey. Let us know what you thought of today’s event and how we can improve the event experience for you in the future.
twitter.com/AWSCloud
facebook.com/AmazonWebServices
youtube.com/user/AmazonWebServices
slideshare.net/AmazonWebServices
twitch.tv/aws