Post on 27-Mar-2022

© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.

Deploying and managing machine learning models at scale

AIM348

Sireesha Muppala

AI/ML Specialist SA

Amazon Web Services

Nitin Wagh

Sr. BDM, Amazon SageMaker

Amazon Web Services

Our awesome support experts

Arun Nagarajan

Sr. SDE, AI Platforms

Kiran Bakshi

Consultant, ProServ

Piyush Bothra

Sr. Solutions Architect

Workshop map

Related Breakouts

AIM307 - Amazon SageMaker deep dive: A modular solution for ML

AIM311 - Choose the right instance type in Amazon SageMaker

AIM318 - Amazon SageMaker: Automatically tune hyperparameters

AIM306 - How to build high-performance machine learning solutions at low cost

https://tinyurl.com/w5wu595

You are a data scientist at Media Company

• Build music recommendation model for customers

• Dataset provides user purchase/listening patterns

• Develop model monitoring solution to ensure it is up-to-date

• Build movie recommendation model

• Dataset provides user purchase/viewing patterns

You are an engineer responsible for deployment at Media Company

• Deploy models as real-time endpoints at scale

• Set up model drift detection pipeline that triggers training if required

• Save cost and efficiently run large number of models

Amazon SageMaker at re:Invent 2019

Build, Train, Deploy Machine Learning Models Quickly at Scale

Amazon SageMaker:

• Ground Truth

• Algorithms & Frameworks

• Notebooks

• Training & Tuning

• Deployment & Hosting

• RL

• ML Marketplace

• Neo

NEW!

• SageMaker Studio

• Quick-start Notebooks (Preview)

• Experiments

• Debugger

• Autopilot

• Model Monitor

• Processing

Model deployment in SageMaker - Overview

Model deployment in SageMaker – Key features

Model deployment – Security and compliance

Amazon Confidential – Do not share or distribute

Deploying a model is not the end. You need to continuously monitor models in production and iterate.

• Concept drift due to divergence of data

• Model performance can change due to unknown factors

• Continuous monitoring involves a lot of tooling and expense

Model monitoring is cumbersome but critical

Introducing Amazon SageMaker Model Monitor

Continuous monitoring of models in production

• Automatic data collection

• Continuous monitoring

• CloudWatch integration

• Visual data analysis

• Flexibility with rules

Deploy trained model (XGBoost movie recommendation model)

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications

Enable data capture for Amazon SageMaker Endpoint

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; captured data: requests, predictions

Create Endpoint with data capture enabled

s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl
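
As a rough sketch of what enabling capture involves, the snippet below builds the `DataCaptureConfig` structure accepted by the SageMaker `CreateEndpointConfig` API (boto3 `create_endpoint_config`) and the S3 prefix under which an hour of captured files would land, following the path pattern above. The bucket, endpoint, and variant names are hypothetical, and the boto3 call itself is left as a comment so the snippet runs without AWS credentials.

```python
from datetime import datetime, timezone


def data_capture_config(destination_s3_uri, sampling_percentage=100):
    """Build the DataCaptureConfig dict for CreateEndpointConfig."""
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        # Capture both the request payload and the model's response.
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


def capture_prefix(destination_s3_uri, endpoint_name, variant_name, when):
    """Return the S3 prefix for one hour of captured data (.../yyyy/mm/dd/hh/)."""
    return "{}/{}/{}/{:%Y/%m/%d/%H}/".format(
        destination_s3_uri.rstrip("/"), endpoint_name, variant_name, when
    )


# Hypothetical names, for illustration only.
DESTINATION = "s3://example-bucket/datacapture"
config = data_capture_config(DESTINATION)
prefix = capture_prefix(
    DESTINATION,
    "movie-rec-endpoint",
    "AllTraffic",
    datetime(2019, 12, 3, 17, tzinfo=timezone.utc),
)
print(prefix)

# With boto3 and credentials available, the config would be passed as:
# boto3.client("sagemaker").create_endpoint_config(
#     EndpointConfigName=..., ProductionVariants=[...], DataCaptureConfig=config)
```

Lowering `sampling_percentage` is one way to limit storage when an endpoint receives heavy traffic.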

Data captured from SageMaker Endpoint

Example of collected prediction request and response
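
The slide's example record is not preserved in this transcript. A minimal sketch of reading captured `.jsonl` lines might look like the following; the record layout (`captureData`, `endpointInput`, `endpointOutput`, `eventMetadata`) is an assumption about the capture format, and the sample line is made up for illustration.

```python
import json

# Made-up sample line; the field layout is an assumption, not copied from the slide.
SAMPLE_LINE = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "42,0,1,0,7",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0.87",
            "encoding": "CSV",
        },
    },
    "eventMetadata": {"eventId": "example-id", "inferenceTime": "2019-12-03T17:05:00Z"},
    "eventVersion": "0",
})


def parse_capture_lines(lines):
    """Yield (inference_time, request_payload, response_payload) per record."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        capture = record["captureData"]
        yield (
            record["eventMetadata"]["inferenceTime"],
            capture["endpointInput"]["data"],
            capture["endpointOutput"]["data"],
        )


records = list(parse_capture_lines([SAMPLE_LINE, ""]))
print(records)
```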

Run predictions and view captured data

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; view captured data

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; analyze baseline results

Generate baseline: Create a ProcessingJob

Generate baseline: Under the hood ProcessingJob

Baseline results – Statistics

Baseline results – Constraints (suggested)
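
To make the idea of a baseline concrete, here is a conceptual sketch in plain Python of the kind of per-feature statistics and suggested constraints a baselining job might produce. It is not the code Model Monitor runs; the feature names, the sample rows, and the output field names are all illustrative assumptions.

```python
from statistics import mean


def baseline(rows, feature_names):
    """Compute simple per-feature statistics and suggested constraints."""
    stats, constraints = {}, {}
    for i, name in enumerate(feature_names):
        column = [row[i] for row in rows]
        present = [v for v in column if v is not None]
        stats[name] = {
            "min": min(present),
            "max": max(present),
            "mean": mean(present),
        }
        constraints[name] = {
            # Suggested type: integral only if every observed value is an int.
            "inferred_type": "Integral"
            if all(isinstance(v, int) for v in present)
            else "Fractional",
            # Fraction of rows where the feature is not missing.
            "completeness": len(present) / len(column),
        }
    return stats, constraints


# Made-up training rows: (age, avg_rating); None marks a missing value.
rows = [(25, 4.5), (31, None), (40, 3.0), (22, 5.0)]
stats, constraints = baseline(rows, ["age", "avg_rating"])
print(stats["age"], constraints["avg_rating"])
```

Later monitoring runs can then compare fresh statistics against these values and report a violation when, for example, completeness falls below the baseline.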

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; MonitoringSchedule job; results: statistics and violations

Model monitoring: Under the hood ProcessingJob

Monitoring Schedule Execution Summary

MonitoringSchedule execution: constraint_violations.json

MonitoringSchedule execution: Violation sample

MonitoringSchedule execution: Violation types

{
  "violations": [{
    "feature_name": "string",
    "constraint_check_type":
        "data_type_check"
      | "completeness_check"
      | "baseline_drift_check"
      | "missing_column_check"
      | "extra_column_check"
      | "categorical_values_check",
    "description": "string"
  }]
}
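
Given the violation schema above, a short sketch of how a `constraint_violations.json` report could be summarised by check type follows. The sample report is invented for illustration; only the three field names and the check-type values come from the slide.

```python
import json
from collections import defaultdict

# Invented sample report that follows the schema shown on the slide.
SAMPLE_REPORT = json.dumps({
    "violations": [
        {"feature_name": "age",
         "constraint_check_type": "data_type_check",
         "description": "example description"},
        {"feature_name": "avg_rating",
         "constraint_check_type": "baseline_drift_check",
         "description": "example description"},
        {"feature_name": "age",
         "constraint_check_type": "baseline_drift_check",
         "description": "example description"},
    ]
})


def features_by_check_type(report_json):
    """Group violating feature names by constraint_check_type."""
    grouped = defaultdict(list)
    for violation in json.loads(report_json).get("violations", []):
        grouped[violation["constraint_check_type"]].append(violation["feature_name"])
    return dict(grouped)


summary = features_by_check_type(SAMPLE_REPORT)
print(summary)
```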

CloudWatch metrics

/aws/sagemaker/Endpoints/data-metric namespace with EndpointName and ScheduleName dimensions

For numerical fields:

Metric: Max → query for MetricName: feature_data_{feature_name}, Stat: Max

Metric: Min → query for MetricName: feature_data_{feature_name}, Stat: Min

Metric: Sum → query for MetricName: feature_data_{feature_name}, Stat: Sum

Metric: SampleCount → query for MetricName: feature_data_{feature_name}, Stat: SampleCount

Metric: Average → query for MetricName: feature_data_{feature_name}, Stat: Average

For both numerical and string fields:

Metric: Completeness → query for MetricName: feature_non_null_{feature_name}, Stat: Sum

Metric: BaselineDrift → query for MetricName: feature_baseline_drift_{feature_name}, Stat: Sum
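
As an illustration of how these names fit together, the sketch below assembles the arguments for a CloudWatch `GetMetricStatistics` request (boto3 `get_metric_statistics`) for the BaselineDrift metric of one feature. The namespace, dimension names, metric-name pattern, and statistic are taken from the slide; the endpoint, schedule, and feature names are hypothetical, as are the 24-hour window and one-hour period.

```python
from datetime import datetime, timedelta, timezone

NAMESPACE = "/aws/sagemaker/Endpoints/data-metric"  # as given on the slide


def drift_metric_query(endpoint_name, schedule_name, feature_name, end_time, hours=24):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    return {
        "Namespace": NAMESPACE,
        "MetricName": "feature_baseline_drift_{}".format(feature_name),
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "ScheduleName", "Value": schedule_name},
        ],
        "StartTime": end_time - timedelta(hours=hours),
        "EndTime": end_time,
        "Period": 3600,  # one data point per hour (assumed granularity)
        "Statistics": ["Sum"],
    }


# Hypothetical names, for illustration only.
query = drift_metric_query(
    "movie-rec-endpoint",
    "movie-rec-monitor-schedule",
    "age",
    datetime(2019, 12, 4, 0, tzinfo=timezone.utc),
)
print(query["MetricName"])

# With boto3 and credentials available:
# boto3.client("cloudwatch").get_metric_statistics(**query)
```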

Alerting and automated training trigger

Diagram: Amazon SageMaker training job, Model, Amazon SageMaker Endpoint, Applications; requests, predictions; baseline statistics and constraints; results: statistics and violations; Amazon CloudWatch metrics; analysis of results; notifications

• Model updates

• Training data updates

• Retraining

MonitoringSchedule execution: CloudWatch Alarms
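
One possible shape for such an alarm is sketched below: the function builds the arguments for CloudWatch `PutMetricAlarm` (boto3 `put_metric_alarm`) on a feature's baseline drift metric. The threshold, period, alarm name, and SNS topic ARN are hypothetical choices made for this example, not values from the workshop.

```python
NAMESPACE = "/aws/sagemaker/Endpoints/data-metric"  # namespace as given in the slides


def drift_alarm_params(endpoint_name, schedule_name, feature_name, threshold, topic_arn):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": "{}-{}-baseline-drift".format(endpoint_name, feature_name),
        "Namespace": NAMESPACE,
        "MetricName": "feature_baseline_drift_{}".format(feature_name),
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "ScheduleName", "Value": schedule_name},
        ],
        "Statistic": "Sum",
        "Period": 3600,
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        # Hours without a monitoring run should not raise the alarm.
        "TreatMissingData": "notBreaching",
        # Notify an SNS topic when the alarm fires.
        "AlarmActions": [topic_arn],
    }


# Hypothetical values, for illustration only.
alarm = drift_alarm_params(
    "movie-rec-endpoint",
    "movie-rec-monitor-schedule",
    "age",
    threshold=0.1,
    topic_arn="arn:aws:sns:us-east-1:123456789012:model-drift-alerts",
)
print(alarm["AlarmName"])

# With boto3 and credentials available:
# boto3.client("cloudwatch").put_metric_alarm(**alarm)
```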

Take corrective action: Retrigger model training
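
A retraining trigger could, for instance, be an AWS Lambda function subscribed to the SNS topic that a CloudWatch alarm notifies. The sketch below assumes that wiring: it reads the alarm notification from the SNS event and calls a `start_training` callback when the alarm is in the `ALARM` state. The callback is a hypothetical stand-in for whatever launches the SageMaker training job.

```python
import json


def alarms_in_alarm_state(event):
    """Return names of alarms whose SNS notification reports state ALARM."""
    names = []
    for record in event.get("Records", []):
        message = json.loads(record["Sns"]["Message"])
        if message.get("NewStateValue") == "ALARM":
            names.append(message["AlarmName"])
    return names


def handler(event, context=None, start_training=None):
    """Lambda-style entry point; start_training is a hypothetical callback."""
    triggered = alarms_in_alarm_state(event)
    if triggered and start_training is not None:
        start_training(triggered)
    return {"retrain": bool(triggered), "alarms": triggered}


# Invented SNS event carrying one alarm notification.
sample_event = {
    "Records": [
        {"Sns": {"Message": json.dumps({
            "AlarmName": "movie-rec-endpoint-age-baseline-drift",
            "NewStateValue": "ALARM",
        })}}
    ]
}

started = []
result = handler(sample_event, start_training=started.append)
print(result)
```

Passing the training launcher in as a callback keeps the decision logic easy to exercise without AWS access.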

Number of models can add up quickly …

• Price a house

• Find regulatory violations (USA, Brazil, Singapore, …)

• Next best action (C-000001, C-000002, C-945821, …)

Multi-model endpoints

Flexible cost savings as number of models scale

Separate endpoints: EP-1 (Model 1), EP-2 (Model 2), EP-10 (Model 10)

Multi-model endpoint: EP (Model 1, Model 2, … Model 10)

Sample scenario: ml.c5.xlarge, $0.238/hr, 2 instances running 24/7

• 10 separate endpoints: $3,430/mo

• 1 multi-model endpoint: $343/mo
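
The arithmetic behind the sample scenario can be checked with a few lines of Python. The 30-day month is an assumption; with it, one endpoint comes to $342.72 per month (about $343) and ten endpoints to $3,427.20, so the $3,430 figure appears to be ten times the rounded single-endpoint cost.

```python
HOURLY_RATE = 0.238        # ml.c5.xlarge, USD per instance-hour (from the slide)
INSTANCES = 2              # instances per endpoint, running 24/7
HOURS_PER_MONTH = 24 * 30  # assumes a 30-day month


def monthly_cost(endpoints):
    """Monthly cost in USD for the given number of always-on endpoints."""
    return endpoints * INSTANCES * HOURLY_RATE * HOURS_PER_MONTH


separate = monthly_cost(10)    # one endpoint per model
multi_model = monthly_cost(1)  # all ten models behind one endpoint
print(round(separate, 2), round(multi_model, 2))
```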

Multi-Model Endpoints

Mode:

Artifact location: s3://bucket/your-endpoint-models/

• new_york.tar.gz

• texas.tar.gz

• florida.tar.gz

• nevada.tar.gz

Diagram: Amazon SageMaker multi-model endpoint (predict), S3 model storage (load)
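
A minimal sketch of how this layout maps onto API requests: the container definition for `CreateModel` points `ModelDataUrl` at the S3 prefix with `Mode` set to `MultiModel`, and each prediction names the artifact to use through the `TargetModel` argument of `InvokeEndpoint`. The image URI, endpoint name, and payload are hypothetical placeholders, and the boto3 calls are left as comments so the snippet runs offline.

```python
MODEL_PREFIX = "s3://bucket/your-endpoint-models/"  # artifact location from the slide


def multi_model_container(image_uri, model_prefix=MODEL_PREFIX):
    """Container definition for create_model on a multi-model endpoint."""
    return {"Image": image_uri, "Mode": "MultiModel", "ModelDataUrl": model_prefix}


def invoke_params(endpoint_name, target_model, payload, content_type="text/csv"):
    """Keyword arguments for sagemaker-runtime invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        # Artifact path relative to the S3 prefix; loaded on demand.
        "TargetModel": target_model,
        "ContentType": content_type,
        "Body": payload,
    }


# Hypothetical image URI, endpoint name, and payload, for illustration only.
container = multi_model_container("<account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>")
request = invoke_params("house-price-endpoint", "texas.tar.gz", "3,2,1800")
print(request["TargetModel"])

# With boto3 and credentials available:
# boto3.client("sagemaker").create_model(
#     ModelName=..., ExecutionRoleArn=..., Containers=[container])
# boto3.client("sagemaker-runtime").invoke_endpoint(**request)
```

Adding another model then amounts to uploading a new `.tar.gz` under the same prefix, with no change to the endpoint itself.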

multi-model-endpoint

Thank you!
