15-319 / 15-619 cloud computingmsakr/15619-f18/recitations/f18_recitation05.pdf · aws s3 cp...

15-319 / 15-619 Cloud Computing Recitation 5 September 25th, 2018


Page 1: 15-319 / 15-619 Cloud Computingmsakr/15619-f18/recitations/F18_Recitation05.pdf · aws s3 cp video.mp4 s3://your-video-bucket Review the CloudWatch Logs for errors Use the /tmp directory

15-319 / 15-619 Cloud Computing

Recitation 5

September 25th, 2018


Overview

● Administrative issues
  ○ Office Hours, Piazza guidelines
● Last week’s reflection
  ○ Project 2.2, OLI Unit 2 (Modules 5, 6)
● This week’s schedule
  ○ Quiz 4 - Sep 28, 2018 (Modules 7, 8, 9)
  ○ Project 2.3 - Sep 30, 2018
  ○ Primers - NoSQL, HBase, Profiling a Cloud Service, Storage I/O Benchmarking
  ○ Team Formation


Team Project - Time to Team Up

15-619 Students:
● Start to form your teams
  ○ Choose carefully as you cannot change teams
  ○ Look for a mix of skills in the team
    ■ Web tier: web framework performance
    ■ Storage tier: deploy and optimize MySQL and HBase
    ■ Extract, Transform and Load (ETL)
● Create an AWS account only for the team project

15-319 Students:
● You are allowed to participate in the team project
● Once committed to a team, you can’t quit
● Earn a significant bonus for participating in the team project
● If you are a 15-319 student and want to participate in the team project, please email the professors.


Team Formation - Deadlines

Follow the instructions in @1396 carefully
● By Friday 9/28 at 11:59 PM ET
  ○ Identify your team members
  ○ One team member should form a team on TPZ and all other team members should accept the invitation
    ■ Completing this step will freeze your team
● By Saturday 9/29 at 11:59 PM ET
  ○ Create a new AWS account ⇒ only used for the team project
  ○ Update the team profile in TPZ with
    ■ the new AWS ID (aws-id) and
    ■ the time slot in which your team will participate in a group exercise
● By Sunday 9/30 at 11:59 PM ET
  ○ Finish reading the Profiling a Cloud Service primer to prepare for the online group exercise before 10/1


Administrative

• COSTS!!
  – Please monitor your expenses regularly
  – Especially for 2.3 and AWS Lambda
  – Keep in mind that besides instances, other resources cost money, e.g. CloudSearch and Rekognition
• TAGS!!
  – Tag Immediately - Tag requests for spot instances do not always propagate
  – Tag Correctly - Tag for the current running project
• Piazza
  – Public posts help everyone
• Office hours


Administrative

● Quiz
  ○ Need to click submit
  ○ Quiz duration: 2 hrs (keep track of time)
● DO NOT upload code to public repositories (GitHub/Bitbucket/etc.)
● Reflection posts
  ○ Discuss the approach adopted to solve the problem, challenges faced, cool insights
  ○ DO NOT share code or pseudocode
● Provide clean, modular and well-documented code
  ○ Helps everyone


Reflection on Last Week

● OLI: Conceptual Content
  ○ Unit 2 - Modules 5 and 6:
    ■ Cloud Management & Software Deployment Considerations
  ○ Quiz 3 completed
● P2.2:
  ○ Deploying a containerized application on container clusters on multiple clouds
    ■ Containers
    ■ Docker
    ■ Kubernetes


Reflection on P2.2

● Docker
  ○ Benefits and challenges.
● Kubernetes
  ○ Explain how containers communicate with each other, across the Kubernetes cluster, and with the host machine.
  ○ Work with Kubernetes clusters and deploy applications to multiple clouds.
  ○ Describe the advantages of using the container cluster framework, recognizing the importance of such tools for application scalability, portability, and management.


Project 2.2

● 15% manual grading

○ Readability

○ Use CheckStyle

○ References

■ List every resource you consulted

■ Omitting a resource may incur penalties or an academic integrity violation (AIV)!

■ Validate the references file using JSONlint.com
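
As a local complement to JSONlint.com, Python's standard `json` module can confirm that a references file is syntactically valid JSON. This is a minimal sketch: the function name and the sample strings are hypothetical, and it checks syntax only, not whatever fields the course expects in the file.

```python
import json


def check_json(text):
    """Return (True, None) if text parses as JSON, else (False, error message)."""
    try:
        json.loads(text)
        return True, None
    except ValueError as err:  # json.JSONDecodeError is a subclass of ValueError
        return False, str(err)


# Hypothetical sample inputs, not the course's actual references format.
good = '[{"url": "https://example.com", "note": "example entry"}]'
bad = '[{"url": "https://example.com",}]'  # trailing comma is invalid JSON

print(check_json(good))
print(check_json(bad))
```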


Example References File in JSON


This Week’s Schedule

● Unit 3 (Modules 7, 8 and 9)
● Quiz 4
  ○ Deadline: Friday, Sep 28, 11:59 PM ET
● Project 2.2 Project Reflection Feedback
  ○ Deadline: Sunday, Sep 30, 11:59 PM ET
● Complete Project 2.3 (including Project Reflection)
  ○ Using AWS Lambda/Azure Functions/GCP Cloud Functions
  ○ Deadline: Sunday, Sep 30, 11:59 PM ET
● 4 Primers
  ○ NoSQL, HBase, Profiling a Cloud Service, Storage I/O Benchmarking
● 15-619 Team Project, Team Formation:
  ○ Read instructions on Piazza carefully: https://piazza.com/class/jkvtywetsu35vh?cid=1396


Project Primers

● P2.3
  ○ Introduction to Cloud Functions (released on 9/17)
● 4 new primers released this week (9/24)
  ○ Team Project
    ■ Profiling a Cloud Service
      ● Read content to prepare for the team programming exercise starting 10/1 on Cloud9
  ○ P3.1
    ■ NoSQL
    ■ HBase
    ■ Storage I/O Benchmarking
      ● Relevant to the team project as well!


This Week: Content

● UNIT 3: Virtualizing Resources for the Cloud

  ○ Module 7: Introduction and Motivation
  ○ Module 8: Virtualization
  ○ Module 9: Resource Virtualization - CPU
  ○ Module 10: Resource Virtualization - Memory
  ○ Module 11: Resource Virtualization - I/O
  ○ Module 12: Case Study
  ○ Module 13: Network and Storage Virtualization


OLI Module 7 - Virtualization
Introduction and Motivation

● Why virtualization

○ Elasticity

○ Resource sandboxing

○ Mixed OS environment

○ Resource sharing

○ Improved system utilization and reduced costs


OLI Module 8 - Virtualization

● What is Virtualization
  ○ Involves the construction of an isomorphism that maps a virtual guest system to a real (or physical) host system
  ○ A sequence of operations e modifies the guest state
  ○ Mapping function V(Si) takes guest state Si to the corresponding host state
● Virtual Machine Types
  ○ Process Virtual Machines
  ○ System Virtual Machines
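
The isomorphism above can be written out as follows. This follows the common textbook formulation; the primed symbols for host states and host operations are notation assumed here, not taken from the slide.

```latex
\[
  e(S_i) = S_j \ \text{(guest)}, \qquad
  e'(S_i') = S_j' \ \text{(host)}, \qquad
  V(S_i) = S_i', \quad V(S_j) = S_j'
\]
\[
  \text{so that} \quad e'\bigl(V(S_i)\bigr) = V\bigl(e(S_i)\bigr).
\]
```

In words: running the guest operations and then mapping to the host gives the same state as mapping first and then running the equivalent host operations.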


OLI Module 9
Resource Virtualization - CPU

● Steps of CPU Virtualization
  ○ Multiplexing a physical CPU among virtual CPUs
  ○ Virtualizing the ISA (Instruction Set Architecture) of a CPU
● Code Patch, Full Virtualization and Paravirtualization
● Emulation (Interpretation & Binary Translation)
● Virtual CPU


Project 2

Running Theme: Automating and scaling distributed systems

● P2.1: EC2 VMs
● P2.2: Containers
● P2.3: Functions (this week!)


Serverless Computing

● Develop and run applications on servers without worrying about server management.
● Applications can have one or more functions.
● The cloud service provider provides the server to run the application.
● The developer designs the application and runs it without having to manage any servers or worry about scaling.
  ○ Pay-per-invocation model
● Functions-as-a-Service (FaaS) is a use case of serverless computing


Why use Cloud Functions

● High availability
● Resiliency: each execution is contained and isolated, and thus has no impact on other executions
● Built-in logging and monitoring: easily accessible logs from CloudWatch ensure traceability
● Event-driven: functions can be triggered by events like S3 file uploads


Cloud Functions

● Possible use cases
  ○ Chat bots
  ○ Mobile backends
  ○ How about MapReduce?
    • Not suitable for FaaS
    • EMR is PaaS
● FaaS: typically stateless functions
● Pay per number of function invocations + running time
● Scalability is automatic through the provider


AWS Lambda

● AWS FaaS offering
● Only pay for the number of invocations and the compute time, that is, when your code runs
● Stateless…
  ○ Every Lambda function has 500 MB of non-persistent disk space in its own /tmp directory
● Debug?
  ○ CloudWatch is your friend


AWS Lambda Programming Model

● Handler
  ○ Execution starts here
  ○ Triggered by events
● Context
  ○ Interact with the AWS Lambda execution environment, e.g. getFunctionName(), getLogger(), etc.
● Logging
● Exceptions
● Be stateless!
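
The programming model above can be sketched in Python. The slide names the Java context methods; the Python context object exposes comparable members such as `function_name` and `get_remaining_time_in_millis()`. The `name` field in the event and the `FakeContext` class are hypothetical, added only so the sketch runs locally.

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)


def handler(event, context):
    """Entry point: Lambda calls this with the triggering event and a context object."""
    # Logging: in Lambda, records written via the logging module go to CloudWatch Logs.
    logger.info("function=%s remaining_ms=%d",
                context.function_name,
                context.get_remaining_time_in_millis())
    # Exceptions: raising marks the invocation as failed.
    if "name" not in event:
        raise ValueError("event is missing the 'name' field")
    # Be stateless: the result depends only on the event.
    return {"greeting": "hello " + event["name"]}


class FakeContext:
    """Local stand-in for the Lambda context (hypothetical, for testing only)."""
    function_name = "demo-function"

    def get_remaining_time_in_millis(self):
        return 3000


print(handler({"name": "p23"}, FakeContext()))
```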


Triggers

● Events trigger Lambda functions
● Events are passed to the function as an input parameter
● Event sources publish events that cause the cloud function to be invoked
  ○ AWS Lambda: object creation in an S3 bucket, SNS topic, etc.
  ○ Azure: HTTPTrigger, BLOBTrigger, etc.
  ○ GCP: file upload to Cloud Storage, message on a Cloud Pub/Sub topic, etc.
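
To illustrate how an event arrives as the input parameter, here is a minimal sketch that reads the bucket and object key out of an S3 notification event. The helper name and the trimmed sample event are hypothetical; real events carry many more fields.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded (e.g. spaces as '+'), so decode them.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Trimmed, hypothetical sample event.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "your-video-bucket"},
                "object": {"key": "my+video.mp4"}}}
    ]
}

print(s3_objects(sample_event))
```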


Project 2.3 - Overview

● Task 1
  ○ Fibonacci function in Azure Functions
  ○ Power sets function in GCP Cloud Functions
  ○ CIDR block statistics in AWS Lambda
● Task 2
  ○ Lambda and FFmpeg to generate thumbnails
● Task 3
  ○ Rekognition and CloudSearch to label thumbnails and index for video search


P2.3 Task1: Cloud Functions

● HTTP triggered functions
● Subtask 1
  ○ Fibonacci - Azure Functions
● Subtask 2
  ○ PowerSets - GCP Functions
● Subtask 3
  ○ CIDR - AWS Lambda
    • Java: use the SubnetUtils class from Apache Commons Net.
    • Python users should consider the python-iptools package or ipaddress from the Python standard library.
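
For the Python route, the sketch below shows what the standard library's `ipaddress` module exposes for a CIDR block. It assumes Python 3, where `ipaddress` is built in. The function name and the fields returned are illustrative choices; the slides do not say which statistics the task requires.

```python
import ipaddress


def cidr_summary(cidr):
    """Return a few basic facts about a CIDR block."""
    # strict=False accepts input with host bits set, e.g. "192.168.1.7/24".
    net = ipaddress.ip_network(cidr, strict=False)
    return {
        "network": str(net.network_address),
        "broadcast": str(net.broadcast_address),
        "netmask": str(net.netmask),
        "num_addresses": net.num_addresses,
    }


print(cidr_summary("192.168.1.0/24"))
```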


P2.3 Task2: Event Driven Functions

● Event driven functions using AWS Lambda
● Possible event sources:
  ○ S3, SNS, and CloudWatch Logs
● Event delivery protocol
  ○ Pull (Kinesis, DynamoDB Streams)
  ○ Push (S3, API Gateway)
    • S3 does not support notifying multiple Lambda functions for the same event type
    • Use SNS to fan out the event trigger
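
When S3 notifications are fanned out through SNS, each subscribed function receives an SNS event whose `Message` field holds the original S3 notification as a JSON string. A minimal sketch of unwrapping it follows; the helper name and the trimmed sample payload are hypothetical.

```python
import json


def s3_records_from_sns(event):
    """Unwrap S3 notification records from an SNS-delivered Lambda event."""
    records = []
    for record in event.get("Records", []):
        # SNS carries the original S3 notification as a JSON string.
        s3_event = json.loads(record["Sns"]["Message"])
        records.extend(s3_event.get("Records", []))
    return records


# Trimmed, hypothetical sample: an S3 notification wrapped in an SNS event.
inner = {"Records": [{"s3": {"bucket": {"name": "your-video-bucket"},
                             "object": {"key": "video.mp4"}}}]}
sns_event = {"Records": [{"Sns": {"Message": json.dumps(inner)}}]}

for rec in s3_records_from_sns(sns_event):
    print(rec["s3"]["bucket"]["name"], rec["s3"]["object"]["key"])
```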


Project 2.3 - Task 3
Video Processing Pipeline

● AWS Lambda and FFmpeg to process videos
● AWS Rekognition for image labeling
● AWS CloudSearch to index videos based on labels


AWS CloudSearch

● Fully managed AWS service for searching
  ○ Scaling - disable auto-scaling for this project
  ○ Cost - use search.m1.small ($0.059 per hour)
  ○ Batch uploads ($0.10 per 1,000 Batch Upload Requests, each batch <= 5 MB)


AWS Rekognition

● Image recognition service
● Pricing!!
  ○ $1.00 per 1,000 images
  ○ Keep budget for testing
  ○ Be careful not to exhaust your budget
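
A quick back-of-the-envelope estimate helps with budgeting. The sketch below uses the prices quoted on the CloudSearch and Rekognition slides. The usage figures are hypothetical, and other charges (Lambda, S3, data transfer) are ignored.

```python
SEARCH_INSTANCE_PER_HOUR = 0.059  # search.m1.small, from the CloudSearch slide
BATCH_UPLOAD_PER_1000 = 0.10      # per 1,000 batch upload requests
REKOGNITION_PER_1000 = 1.00       # per 1,000 images


def estimate_cost(search_hours, batch_uploads, images):
    """Rough spend in dollars for the given usage."""
    return (search_hours * SEARCH_INSTANCE_PER_HOUR
            + batch_uploads / 1000 * BATCH_UPLOAD_PER_1000
            + images / 1000 * REKOGNITION_PER_1000)


# Hypothetical usage: 24 instance-hours, 200 batch uploads, 1,500 images.
total = estimate_cost(24, 200, 1500)
print(f"${total:.2f}")
```

With these assumed numbers, Rekognition and the search instance each account for roughly half of the total, which is why leaving the search domain running or re-labeling the same images repeatedly adds up quickly.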


CloudWatch Alarm

● To mitigate overspending on Rekognition, we have provided you with a CloudWatch Alarm Terraform template that you must apply.
● The submitter will verify that you have not exceeded a threshold of calls to Rekognition before proceeding.
● Useful trick for future projects!


Project 2.3 - Hints

● Make sure you are authenticated before submitting (EC2 IAM role, az login, gcloud auth)
● Review the suggested libraries
● ffmpeg + ffmpy works well in Python 2.7
● Test your functions and triggers manually first
  ○ aws s3 cp video.mp4 s3://your-video-bucket
● Review the CloudWatch Logs for errors
● Use the /tmp directory for storing images and video during a function execution
● But remember that functions are stateless
● Check for misconfigured permissions
  ○ S3 -> SNS, SNS -> Lambda, Lambda -> S3 / CloudSearch (CS)
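
The /tmp hint can be sketched as follows, calling the ffmpeg binary directly through `subprocess` rather than through ffmpy. This assumes Python 3 and an ffmpeg executable on the PATH (in Lambda it would have to be bundled with the function). The function names and the choice of grabbing a single frame at the one-second mark are hypothetical, not the project's required behavior.

```python
import os
import subprocess

TMP_DIR = "/tmp"  # scratch space for files created during one execution


def thumbnail_command(video_path, second=1):
    """Build an ffmpeg command that saves one frame as a PNG under /tmp."""
    base = os.path.splitext(os.path.basename(video_path))[0]
    out_path = os.path.join(TMP_DIR, base + ".png")
    # -y overwrites existing output, -ss seeks, -vframes 1 writes a single frame.
    cmd = ["ffmpeg", "-y", "-ss", str(second), "-i", video_path,
           "-vframes", "1", out_path]
    return cmd, out_path


def make_thumbnail(video_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits non-zero."""
    cmd, out_path = thumbnail_command(video_path)
    subprocess.run(cmd, check=True)
    return out_path
```

Because functions are stateless, anything written under /tmp should be uploaded (for example back to S3) before the handler returns, not relied on in a later invocation.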


Project 2.3 Penalties


Upcoming Deadlines

• Quiz 4: Modules 7, 8 and 9
  Due: Friday, September 28, 2018, 11:59 PM Pittsburgh time
• Project 2.3: Functions as a Service
  Due: Sunday, September 30, 2018, 11:59 PM Pittsburgh time
• Team Project: Team Formation
  Due: Friday 9/28 at 11:59 PM Pittsburgh time


Team Formation - Deadlines

Follow the instructions in @1396 carefully
● By Friday 9/28 at 11:59 PM ET
  ○ Identify your team members
  ○ One team member should form a team on TPZ and all other team members should accept the invitation
    ■ Completing this step will freeze your team
● By Saturday 9/29 at 11:59 PM ET
  ○ Create a new AWS account ⇒ only used for the team project
  ○ Update the team profile in TPZ with
    ■ the new AWS ID (aws-id) and
    ■ the time slot in which your team will participate in a group exercise
● By Sunday 9/30 at 11:59 PM ET
  ○ Finish reading the Profiling a Cloud Service primer to prepare for the online group exercise before 10/1


Questions?
