© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Thomas Reske Global Solutions Architect, Amazon Web Services Serverless Predictions at Scale

Post on 07-Aug-2020


TRANSCRIPT

Page 1:

Thomas Reske

Global Solutions Architect, Amazon Web Services

Serverless Predictions at Scale

Page 2:

Serverless computing allows you to build and run applications and services without thinking about servers.

Page 3:

What are the benefits of serverless computing?

• No server management

• Flexible scaling

• Automated high availability

• No excess capacity

Page 4:

Machine Learning Process

Fetch data

Clean and format data

Prepare and transform data

Model training

Evaluation and verification

Integration and deployment

Operations, monitoring, and optimization

Page 5:

Machine Learning Process

Fetch data → Clean and format data → Prepare and transform data → Model training → Evaluation and verification → Integration and deployment → Operations, monitoring, and optimization

How can I use pre-trained models for serving predictions?

How to scale effectively?

Page 6:

What exactly are we deploying?

Page 7:

A Pre-trained Model...

• ... is simply a model which has been trained previously

• Model artifacts (structure, helper files, parameters, weights, ...) are files, e.g. in JSON or binary format

• Can be serialized (saved) and de-serialized (loaded) by common deep learning frameworks

• Allows fine-tuning of models ("head start" or "freezing") as well as applying them to similar problems and different data ("transfer learning")

• Models can be made publicly available, e.g. in the MXNet Model Zoo
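
The bullets above describe model artifacts as ordinary files that can be saved and loaded again. A minimal sketch of that idea in standard-library Python follows. The two-file layout (`structure.json`, `weights.bin`), the toy linear model, and the helper names `save_model`, `load_model`, and `predict` are assumptions made for illustration; they are not the format any particular deep learning framework writes.

```python
import json
import os
import tempfile
from array import array


def save_model(directory, structure, weights):
    """Write the structure as JSON and the weights as raw 32-bit floats."""
    with open(os.path.join(directory, "structure.json"), "w") as f:
        json.dump(structure, f)
    with open(os.path.join(directory, "weights.bin"), "wb") as f:
        array("f", weights).tofile(f)


def load_model(directory):
    """Read both artifacts back and return (structure, weights)."""
    with open(os.path.join(directory, "structure.json")) as f:
        structure = json.load(f)
    weights = array("f")
    with open(os.path.join(directory, "weights.bin"), "rb") as f:
        weights.frombytes(f.read())
    return structure, list(weights)


def predict(structure, weights, features):
    """Apply a toy linear model: dot(coefficients, features) + bias."""
    *coefficients, bias = weights
    assert len(coefficients) == structure["inputs"]
    return sum(w * x for w, x in zip(coefficients, features)) + bias


# "Train" once, save the artifacts, then load them elsewhere and predict.
with tempfile.TemporaryDirectory() as tmp:
    save_model(tmp, {"type": "linear", "inputs": 2}, [0.5, 0.25, 1.0])
    structure, weights = load_model(tmp)
    result = predict(structure, weights, [2.0, 4.0])
```

Serving a pre-trained model follows the same pattern at a larger scale: the serving process only needs the artifacts and the code that knows how to load them.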

Page 8:

What's missing?

Page 9:

Blueprints?

Page 10:

Generic Architecture

[Diagram: Load Balancing, three Model Server instances, Model Artifacts]

Page 11:

Amazon API Gateway and AWS Lambda Integration

Amazon API Gateway (API Proxy)

AWS Lambda (Data Processing)

Amazon S3 (Model Archive)
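
The integration above can be sketched as a Lambda handler that fetches the model archive from S3 on a cold start and answers API Gateway proxy requests. This is only an illustration under stated assumptions: the environment variables `MODEL_BUCKET` and `MODEL_KEY`, the JSON model file with a `weights` list, the `features` request field, and the extra `loader` parameter (added so the handler can be exercised without AWS) are all hypothetical and not taken from the talk.

```python
import base64
import json
import os

_MODEL = None  # cached so that warm invocations skip the S3 download


def load_model():
    """Download a hypothetical JSON model file from S3 and return a callable."""
    import boto3  # imported lazily; only needed inside AWS Lambda

    local_path = "/tmp/model.json"  # /tmp is the writable path in Lambda
    boto3.client("s3").download_file(
        os.environ["MODEL_BUCKET"], os.environ["MODEL_KEY"], local_path
    )
    with open(local_path) as f:
        weights = json.load(f)["weights"]
    return lambda features: sum(w * x for w, x in zip(weights, features))


def handler(event, context, loader=load_model):
    """Handle an API Gateway proxy event and return a proxy response."""
    global _MODEL
    if _MODEL is None:
        _MODEL = loader()

    body = event.get("body") or "{}"
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    features = json.loads(body).get("features", [])

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"prediction": _MODEL(features)}),
    }
```

Keeping the model in a module-level variable is the main design choice here: the download cost is paid once per execution environment rather than once per request.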

Page 12:

Trade-Offs?

Page 13:

Model Server for Apache MXNet

• Flexible and easy to use open-source project (Apache License 2.0) for serving deep learning models

• Supports MXNet and Open Neural Network Exchange (ONNX)

• CLI to package model artifacts and pre-configured Docker images to start a service that sets up HTTP endpoints to handle model inference requests

• More details at https://github.com/awslabs/mxnet-model-server and https://onnx.ai/

Page 14:

Quick Start - CLI

# create directory
mkdir -p ~/mxnet-model-server-quickstart && cd "$_"

# create and activate virtual environment
python3 -m venv ~/mxnet-model-server-quickstart/venv
source ~/mxnet-model-server-quickstart/venv/bin/activate

# install packages
pip install --upgrade pip
pip install mxnet-model-server

# run model server for Apache MXNet
mxnet-model-server --models \
  squeezenet=https://s3.amazonaws.com/model-server/models/squeezenet_v1.1/squeezenet_v1.1.model

# classify an image
curl -s -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -s http://127.0.0.1:8080/squeezenet/predict -F "data=@kitten.jpg" | jq -r .

Page 15:

Quick Start - Docker

# create directory
mkdir -p ~/mxnet-model-server-quickstart-docker && cd "$_"

# run model server for Apache MXNet
docker run -itd --name mms -p 8080:8080 awsdeeplearningteam/mms_cpu \
  mxnet-model-server start --mms-config /mxnet_model_server/mms_app_cpu.conf

# classify an image
curl -s -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg
curl -s http://127.0.0.1:8080/squeezenet/predict -F "data=@kitten.jpg" | jq -r .
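
The same prediction request can be sent from Python instead of curl. The sketch below uses only the standard library and assumes the model server from the quick start is listening on 127.0.0.1:8080. The form field name `data` is an assumption (the field name is garbled in the transcript), and the helper names `build_multipart` and `classify` are made up for this example.

```python
import json
import os
import urllib.request
import uuid


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as multipart/form-data; return (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def classify(image_path, url="http://127.0.0.1:8080/squeezenet/predict", field_name="data"):
    """POST an image to the prediction endpoint and return the parsed JSON reply."""
    with open(image_path, "rb") as f:
        body, content_type = build_multipart(
            field_name, os.path.basename(image_path), f.read()
        )
    request = urllib.request.Request(url, data=body, headers={"Content-Type": content_type})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running and `kitten.jpg` downloaded, `print(classify("kitten.jpg"))` would show the server's JSON reply.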

Page 16:

Why developers love containers

• Software Packaging

• Distribution

• Immutable Infrastructure

Page 17:

Managing One Host Is Straightforward

[Diagram: one server with a guest OS hosting two applications, App1 and App2, each with its own bins/libs]

Page 18:

Managing a Fleet Is Hard

[Diagram: a large fleet of servers, each running its own guest OS, spread across Availability Zone 1, Availability Zone 2, and Availability Zone 3]

Page 19:

Generic Architecture

[Diagram: Load Balancing, three Model Server instances, Model Artifacts]

Page 20:

AWS Fargate and Apache MXNet Model Server

Elastic Load Balancing

AWS Fargate (Data Processing and Model Archive)

Amazon ECR (Inference Code Image)

Page 21:

Quick Start - AWS Fargate CLI - Basic Task Definition

{
  "family": "mxnet-model-server-fargate",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "mxnet-model-server-fargate-app",
      "image": "awsdeeplearningteam/mms_cpu",
      "portMappings": [
        {
          "containerPort": 8080,
          "hostPort": 8080,
          "protocol": "tcp"
        }
      ],
      "essential": true,
      "entryPoint": [
        "mxnet-model-server",
        "start",
        "--mms-config",
        "/mxnet_model_server/mms_app_cpu.conf"
      ]
    }
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "256",
  "memory": "512"
}
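
Before registering a task definition like the one above, it can help to check the settings that Fargate depends on: the `FARGATE` compatibility, the `awsvpc` network mode, task-level `cpu` and `memory`, and host ports that match container ports. The following is a rough sanity-check sketch, not an exhaustive validator; the function name `check_fargate_task_definition` is hypothetical, and the embedded JSON simply repeats the task definition from the slide.

```python
import json


def check_fargate_task_definition(task_def):
    """Return a list of problems found; an empty list means the checks passed."""
    problems = []
    if "FARGATE" not in task_def.get("requiresCompatibilities", []):
        problems.append("requiresCompatibilities must include FARGATE")
    if task_def.get("networkMode") != "awsvpc":
        problems.append("networkMode must be awsvpc")
    for key in ("cpu", "memory"):
        if key not in task_def:
            problems.append(f"task-level {key} is required")
    for container in task_def.get("containerDefinitions", []):
        for mapping in container.get("portMappings", []):
            # With awsvpc, hostPort may be omitted or must equal containerPort.
            host_port = mapping.get("hostPort", mapping["containerPort"])
            if host_port != mapping["containerPort"]:
                problems.append(f"{container['name']}: hostPort must equal containerPort")
    return problems


task_definition = json.loads("""
{
  "family": "mxnet-model-server-fargate",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "mxnet-model-server-fargate-app",
      "image": "awsdeeplearningteam/mms_cpu",
      "portMappings": [
        {"containerPort": 8080, "hostPort": 8080, "protocol": "tcp"}
      ],
      "essential": true,
      "entryPoint": [
        "mxnet-model-server", "start",
        "--mms-config", "/mxnet_model_server/mms_app_cpu.conf"
      ]
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "256",
  "memory": "512"
}
""")

problems = check_fargate_task_definition(task_definition)
```

In practice the JSON would be read from the file that is later passed to `aws ecs register-task-definition`, rather than embedded in the script.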

Page 22:

Quick Start - AWS Fargate CLI

# register task definition
aws ecs register-task-definition --cli-input-json file://basic_taskdefinition.json

# create AWS Fargate cluster
aws ecs create-cluster --cluster-name mxnet-model-server

# create service
aws ecs create-service \
  --cluster mxnet-model-server \
  --service-name mxnet-model-server-fargate-service \
  --task-definition mxnet-model-server-fargate:1 \
  --desired-count 2 \
  --launch-type "FARGATE" \
  --network-configuration "awsvpcConfiguration={subnets=[$MMS_SUBNETS],securityGroups=[$MMS_SG_ID]}" \
  --load-balancers "targetGroupArn=$MMS_TG_ARN,containerName=$MMS_CONTAINER_NAME,containerPort=8080"

Page 23:

Trade-Offs?

Page 24:

Amazon SageMaker

Amazon SageMaker (ML Hosting Service)

Amazon S3 (Model Archive)

Amazon ECR (Inference Code Image)

Page 25:

Quick Start - Amazon SageMaker CLI

# register model
aws sagemaker create-model \
  --model-name image-classification-model \
  --primary-container Image=$INFERENCE_IMG,ModelDataUrl=$MODEL_DATA_URL \
  --execution-role-arn $EXECUTION_ROLE_ARN

# register next-generation model
aws sagemaker create-model \
  --model-name image-classification-model-nextgen \
  --primary-container Image=$INFERENCE_IMG_NG,ModelDataUrl=$MODEL_DATA_URL_NG \
  --execution-role-arn $EXECUTION_ROLE_ARN

Page 26:

Quick Start - Amazon SageMaker CLI

# create endpoint configuration
aws sagemaker create-endpoint-config \
  --endpoint-config-name image-classification-config \
  --production-variants '[
    {
      "VariantName": "A",
      "ModelName": "image-classification-model",
      "InitialInstanceCount": 1,
      "InstanceType": "ml.t2.medium",
      "InitialVariantWeight": 8
    },
    {
      "VariantName": "B",
      "ModelName": "image-classification-model-nextgen",
      "InitialInstanceCount": 1,
      "InstanceType": "ml.t2.medium",
      "InitialVariantWeight": 2
    }
  ]'

# create endpoint and wait until it is in service
aws sagemaker create-endpoint \
  --endpoint-name image-classification \
  --endpoint-config-name image-classification-config \
  && aws sagemaker wait endpoint-in-service --endpoint-name image-classification
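
The variant weights in the endpoint configuration are relative: each variant receives traffic in proportion to its weight divided by the sum of all weights. A small sketch of that arithmetic, using the two variants from the slide (the helper name `traffic_shares` is made up for this example):

```python
def traffic_shares(production_variants):
    """Map each variant name to its fraction of the endpoint's traffic."""
    total = sum(v["InitialVariantWeight"] for v in production_variants)
    return {
        v["VariantName"]: v["InitialVariantWeight"] / total
        for v in production_variants
    }


variants = [
    {"VariantName": "A", "ModelName": "image-classification-model",
     "InitialVariantWeight": 8},
    {"VariantName": "B", "ModelName": "image-classification-model-nextgen",
     "InitialVariantWeight": 2},
]

shares = traffic_shares(variants)
```

With weights 8 and 2, variant A would serve about 80% of requests and the next-generation variant B about 20%, which is a common setup for A/B testing a new model.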

Page 27:

Other Options?

Page 28:

Amazon API Gateway (API Proxy)

Elastic Load Balancing

AWS Fargate (Data Processing)

Amazon S3 (Model Archive)

Amazon ECR (Inference Code Image)

Page 29:

Amazon API Gateway (API Proxy)

AWS Lambda (Data Pre-Processing)

Amazon SageMaker (ML Hosting Service)

Amazon S3 (Model Archive)

Amazon ECR (Inference Code Image)

https://medium.com/@julsimon/using-chalice-to-serve-sagemaker-predictions-a2015c02b033
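
In this variant, the Lambda function pre-processes the request and forwards it to a SageMaker endpoint. The sketch below shows one way this could look with the `sagemaker-runtime` client's `invoke_endpoint` call. The `pixels` request field, the scaling step, the JSON payload sent to the endpoint, the `ENDPOINT_NAME` environment variable, and the injectable `client` parameter are assumptions for illustration; the real payload format depends on the inference container.

```python
import json
import os


def get_runtime_client():
    """Create the SageMaker runtime client (boto3 is imported lazily)."""
    import boto3

    return boto3.client("sagemaker-runtime")


def handler(event, context, client=None):
    """Pre-process an API Gateway proxy request and call a SageMaker endpoint."""
    client = client or get_runtime_client()
    payload = json.loads(event.get("body") or "{}")

    # Hypothetical pre-processing step: scale 8-bit pixel values to [0, 1].
    features = [value / 255.0 for value in payload.get("pixels", [])]

    response = client.invoke_endpoint(
        EndpointName=os.environ.get("ENDPOINT_NAME", "image-classification"),
        ContentType="application/json",
        Body=json.dumps({"features": features}),
    )
    result = json.loads(response["Body"].read())

    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(result),
    }
```

Passing the client in as a parameter keeps the handler easy to test locally with a stub, while Lambda itself calls it with only `event` and `context`.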

Page 30:

Summary

Fetch data → Clean and format data → Prepare and transform data → Model training → Evaluation and verification → Integration and deployment → Operations, monitoring, and optimization

How can I use pre-trained models for serving predictions?

How to scale effectively?

Page 31:

Summary

Fetch data → Clean and format data → Prepare and transform data → Model training → Evaluation and verification → Integration and deployment → Operations, monitoring, and optimization

1. API Gateway + Lambda + S3

2. ELB + AWS Fargate

3. Amazon SageMaker

Page 32:

Resources

• Serverless Computing and Applications: https://aws.amazon.com/serverless/

• Seamlessly Scale Predictions with AWS Lambda and MXNet: https://aws.amazon.com/blogs/compute/seamlessly-scale-predictions-with-aws-lambda-and-mxnet/

• Model Server for Apache MXNet: https://github.com/awslabs/mxnet-model-server

• Serverless Predictions at Scale: https://github.com/aws-samples/aws-ai-bootcamp-labs/blob/master/serverless_predictions.MD

• Serverless Inference with MMS on AWS Fargate: https://github.com/awslabs/mxnet-model-server/blob/master/docs/mms_on_fargate.md

• Introducing Model Server for Apache MXNet: https://aws.amazon.com/blogs/machine-learning/introducing-model-server-for-apache-mxnet/

• Using Chalice to serve SageMaker predictions: https://medium.com/@julsimon/using-chalice-to-serve-sagemaker-predictions-a2015c02b033

Page 33:

Before you go...

Page 34:

Sessions (AI & ML)

Wednesday, June 6

• 13:00-15:00 AWS ML & AI Deep Dive (Julien Simon, AWS)

• 15:00-16:00 Build Your Recommendation Engines on AWS Today (Yotam Yarden & Cyrus Vahid, AWS)

• 16:00-17:00 Serverless Predictions at Scale (Thomas Reske, AWS)

Thursday, June 7

• 13:00-14:00 Machine Learning & AI at Amazon (Ian Massingham, AWS)

• 14:00-15:00 The Machine Learning Process: From Business Model to Machine Learning in Production (Constantin Gonzalez, AWS)

• 15:00-16:00 Using Machine Learning Algorithms to create Market Transparency (Dr. Markus Schmidberger & Dr. Markus Ludwig, Scout24 AG)

• 16:00-17:00 Needle in the Live Video Cloud haystack – AI/ML for mass live acquisition (Matija Tonejc & David Querq, Make.TV)

Page 35:

Please complete the session survey in the summit mobile app.

Page 36:

Thank you!