15-319 / 15-619 Cloud Computing
Recitation 5
September 25th, 2018
Overview
● Administrative issues
○ Office Hours, Piazza guidelines
● Last week’s reflection
○ Project 2.2, OLI Unit 2 (Modules 5, 6)
● This week’s schedule
○ Quiz 4 - Sep 28, 2018 (Modules 7, 8, 9)
○ Project 2.3 - Sep 30, 2018
○ Primers - NoSQL, HBase, Profiling a Cloud Service, Storage I/O Benchmarking
○ Team Formation
Team Project - Time to Team Up
15-619 Students:
● Start to form your teams
○ Choose carefully as you cannot change teams
○ Look for a mix of skills in the team
■ Web tier: web framework performance
■ Storage tier: deploy and optimize MySQL and HBase
■ Extract, Transform and Load (ETL)
● Create an AWS account only for the team project
15-319 Students:
● You are allowed to participate in the team project
● Once committed to a team, you can’t quit
● Earn a significant bonus for participating in the team project
● If you are a 15-319 student and want to participate in the team project, please email the professors.
Team Formation - Deadlines
Follow the instructions in @1396 carefully
● By Friday 9/28 at 11:59 PM ET
○ Identify your team members
○ One team member should form a team on TPZ and all other team members should accept the invitation
■ Completing this step will freeze your team
● By Saturday 9/29 at 11:59 PM ET
○ Create a new AWS account ⇒ only used for the team project
○ Update the team profile in TPZ with
■ the new AWS ID (aws-id) and
■ the time slot in which your team will participate in a group exercise
● By Sunday 9/30 at 11:59 PM ET
○ Finish reading the Profiling a Cloud Service primer to prepare for the online group exercise before 10/1
Administrative
• COSTS!!
– Please monitor your expenses regularly
– Especially for 2.3 and AWS Lambda
– Keep in mind that besides instances, other resources cost money, such as CloudSearch and Rekognition
• TAGS!!
– Tag Immediately - Tag requests for spot instances do not always propagate
– Tag Correctly - Tag for the current running project
• Piazza
– Public posts help everyone
• Office hours
Administrative
● Quiz
○ Need to click submit
○ Quiz duration: 2 hrs (keep track of time)
● DO NOT upload code to public repositories (GitHub/Bitbucket/etc.)
● Reflection posts
○ Discuss the approach adopted to solve the problem, challenges faced, cool insights
○ DO NOT share code or pseudocode
● Provide clean, modular and well documented code
○ Helps everyone
Reflection on Last Week
● OLI: Conceptual Content
○ Unit 2 - Modules 5 and 6:
■ Cloud Management & Software Deployment Considerations
○ Quiz 3 completed
● P2.2:
○ Deploying a containerized application on container clusters on multiple clouds
■ Containers
■ Docker
■ Kubernetes
Reflection on P2.2
● Docker
○ Benefits and challenges.
● Kubernetes
○ Explain how containers communicate with each other, across the Kubernetes cluster, and with the host machine.
○ Working with Kubernetes clusters and deploying applications to multiple clouds.
○ Describe the advantages of using a container cluster framework, and recognize the importance of such tools for application scalability, portability, and management.
Project 2.2
● 15% manual grading
○ Readability
○ Use CheckStyle
○ References
■ List all resources that you took help from
■ Not doing so may incur penalties/AIV!
■ Validate references file using JSONlint.com
Example References File in JSON
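A local syntax check can catch the same class of errors that JSONlint.com reports before you submit. The following is a minimal sketch using Python's standard `json` module; the helper name and sample strings are placeholders for illustration, not the course's required references schema.

```python
import json


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Placeholder inputs; the real references schema is defined by the course.
print(is_valid_json('[{"url": "https://example.com"}]'))  # well-formed
print(is_valid_json('[{url: "https://example.com"}]'))    # unquoted key
```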
This Week’s Schedule
● Unit 3 (Modules 7, 8 and 9)
● Quiz 4
○ Deadline, Friday, 11:59pm ET, Sep 28
● Project 2.2 Project Reflection Feedback
○ Deadline, Sunday, 11:59pm ET, Sep 30
● Complete Project 2.3 (including Project Reflection)
○ Using AWS Lambda/Azure Functions/GCP functions
○ Deadline, Sunday, 11:59pm ET, Sep 30
● 4 Primers
○ NoSQL, HBase, Profiling a Cloud Service, Storage I/O Benchmarking
● 15-619 Team Project, Team Formation:
○ Read instructions on Piazza carefully
https://piazza.com/class/jkvtywetsu35vh?cid=1396
Project Primers
● P2.3
○ Introduction to Cloud Functions (released on 9/17)
● 4 new primers released this week (9/24)
○ Team Project
■ Profiling a Cloud Service
● Read content to prepare for team programming exercise starting 10/1 on Cloud9
○ P3.1
■ NoSQL
■ HBase
■ Storage I/O Benchmarking
● Relevant to the team project as well!
This Week: Content
● UNIT 3: Virtualizing Resources for the Cloud
○ Module 7: Introduction and Motivation
○ Module 8: Virtualization
○ Module 9: Resource Virtualization - CPU
○ Module 10: Resource Virtualization - Memory
○ Module 11: Resource Virtualization - I/O
○ Module 12: Case Study
○ Module 13: Network and Storage Virtualization
OLI Module 7 - Virtualization
Introduction and Motivation
● Why virtualization
○ Elasticity
○ Resource sandboxing
○ Mixed OS environment
○ Resource sharing
○ Improved system utilization and reduced costs
OLI Module 8 - Virtualization
● What is Virtualization
○ Involves the construction of an isomorphism that maps a virtual guest system to a real (or physical) host system
○ A sequence of operations e modifies the guest state
○ A mapping function V(Si) maps each guest state Si to a host state
● Virtual Machine Types
○ Process Virtual Machines
○ System Virtual Machines
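The isomorphism idea can be illustrated with a toy check: applying a guest operation e and then mapping with V should give the same host state as mapping first and then applying the corresponding host operation. This is only a sketch under assumptions: the register-tuple guest state, the dict-based host state, and the name `e_prime` for the host-side operation are all invented for illustration.

```python
def V(guest_state):
    """Map a guest state (tuple of register values) to a host state (dict)."""
    return {f"guest.r{i}": value for i, value in enumerate(guest_state)}


def e(guest_state):
    """Guest operation: increment register r0."""
    return (guest_state[0] + 1,) + guest_state[1:]


def e_prime(host_state):
    """Host operation that corresponds to the guest operation e."""
    updated = dict(host_state)
    updated["guest.r0"] += 1
    return updated


def commutes(guest_state):
    """Check that V(e(Si)) equals e_prime(V(Si)) for one guest state."""
    return V(e(guest_state)) == e_prime(V(guest_state))


print(commutes((0, 5)))
```

If the two paths ever disagreed for some state, the host would no longer be a faithful stand-in for the guest.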
OLI Module 9
Resource Virtualization - CPU
● Steps of CPU Virtualization
○ Multiplexing a physical CPU among virtual CPUs
○ Virtualizing the ISA (Instruction Set Architecture) of a CPU
● Code Patch, Full Virtualization and Paravirtualization
● Emulation (Interpretation & Binary Translation)
● Virtual CPU
Project 2
Running Theme: Automating and scaling distributed systems
● P2.1: EC2 VMs
● P2.2: Containers
● P2.3: Functions (this week!)
Serverless Computing
● Develop and run applications on servers without worrying about server management.
● Applications can have one or more functions.
● The cloud service provider provides the server to run the application.
● The developer designs the application and runs it without having to manage any servers or worry about scaling.
○ Pay-per-invocation model
● Functions-as-a-Service (FaaS) is a use-case of serverless computing
Why use Cloud Functions
● High availability
● Resiliency: each execution is contained and isolated, and thus has no impact on other executions
● Built-in logging and monitoring: easily accessible logs from CloudWatch ensure traceability
● Event-driven: functions can be triggered by events like S3 file uploads
Cloud Functions
● Possible use cases
○ Chat bots
○ Mobile backends
○ How about MapReduce?
• Not suitable for FaaS
• EMR is PaaS
● FaaS, typically stateless functions
● Pay per # of function invocations + running time
● Scalability is automatic through provider
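The "invocations + running time" billing model can be sketched as a simple formula. This is a rough illustration only: weighting compute time by allocated memory (GB-seconds) is an assumption about how a provider might meter it, and every price passed in below is a made-up placeholder, not a real rate.

```python
def faas_cost(invocations, avg_seconds, memory_gb,
              price_per_invocation, price_per_gb_second):
    """Estimate cost as a per-request charge plus a compute-time charge."""
    request_charge = invocations * price_per_invocation
    compute_charge = (invocations * avg_seconds * memory_gb
                      * price_per_gb_second)
    return request_charge + compute_charge


# Placeholder prices chosen for easy arithmetic, not actual provider rates.
print(faas_cost(invocations=1000, avg_seconds=2, memory_gb=0.5,
                price_per_invocation=0.001, price_per_gb_second=0.01))
```

Because both terms scale with the number of invocations, an idle function costs nothing under this model.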
AWS Lambda
● AWS FaaS offering
● Only pay for the number of invocations and the compute time, that is, when your code runs
● Stateless…
○ Every Lambda function has 512 MB of non-persistent disk space in its own /tmp dir
● Debug?
○ CloudWatch is your friend
AWS Lambda Programming Model
● Handler
○ Execution starts here
○ Triggered by events
● Context
○ Interact w/ AWS Lambda execution environment, e.g. getFunctionName(), getLogger() etc.
● Logging
● Exceptions
● Be stateless!
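The pieces of the programming model can be sketched in a Python handler. The slide's method names are from the Java API; in Python the context exposes attributes such as `function_name` instead. The event shape (`{"name": ...}`) and the `fake_context` stand-in are hypothetical, used only so the sketch runs locally without AWS.

```python
import logging
from types import SimpleNamespace

logger = logging.getLogger(__name__)


def handler(event, context):
    """Entry point: Lambda calls this with the trigger's event and a context."""
    logger.info("running %s", context.function_name)
    try:
        name = event["name"]
    except KeyError:
        # Raising lets the failure surface in the function's logs.
        raise ValueError("event is missing 'name'")
    # No module-level state is read or written, so invocations stay independent.
    return {"message": f"hello {name}"}


# Local stand-in for the context object Lambda would pass in.
fake_context = SimpleNamespace(function_name="demo-function")
print(handler({"name": "recitation"}, fake_context))
```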
Triggers
● Events trigger Lambda functions
● Events are passed to the function as an input parameter
● Event sources publish events that cause the cloud function to be invoked
○ AWS Lambda: Create object in S3 bucket, SNS topic etc.
○ Azure: HTTPTrigger, BLOBTrigger etc.
○ GCP: file upload to Cloud Storage, message on a Cloud Pub/Sub topic etc.
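As an illustration of an event arriving as the input parameter, the sketch below pulls bucket and object names out of an S3 notification. The helper name and the sample event are hypothetical, and the sample is trimmed to the fields used here; a real notification carries many more.

```python
from urllib.parse import unquote_plus


def s3_objects(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    pairs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        pairs.append((bucket, key))
    return pairs


# Trimmed, hand-written sample event for local testing.
sample_event = {"Records": [{"s3": {"bucket": {"name": "your-video-bucket"},
                                     "object": {"key": "my+video.mp4"}}}]}
print(s3_objects(sample_event))
```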
Project 2.3 - Overview
● Task 1
○ Fibonacci function in Azure Functions
○ Power sets function in GCP Cloud Functions
○ CIDR block statistics in AWS Lambda
● Task 2
○ Lambda and FFmpeg to generate thumbnails
● Task 3
○ Rekognition and CloudSearch to label thumbnails and index for video search
P2.3 Task1: Cloud Functions
● HTTP triggered functions
● Subtask 1
○ Fibonacci - Azure Functions
● Subtask 2
○ PowerSets - GCP Functions
● Subtask 3
○ CIDR - AWS Lambda
• Java: use the SubnetUtils class from Apache’s Commons Net.
• Python users should consider the python-iptools package or ipaddress from the Python standard library.
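For a feel of what the standard-library `ipaddress` module offers, here is a small sketch that summarizes an IPv4 CIDR block. Which statistics the project actually asks for is not stated here, so the fields returned by the hypothetical `cidr_summary` helper are only examples.

```python
import ipaddress


def cidr_summary(cidr):
    """Return a few basic facts about an IPv4 CIDR block."""
    # strict=False accepts input whose host bits are set, e.g. 192.168.1.10/24.
    network = ipaddress.ip_network(cidr, strict=False)
    return {
        "network": str(network.network_address),
        "broadcast": str(network.broadcast_address),
        "netmask": str(network.netmask),
        "addresses": network.num_addresses,
    }


print(cidr_summary("192.168.1.10/24"))
```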
P2.3 Task2: Event Driven Functions
● Event driven functions using AWS Lambda
● Possible event sources:
○ S3, SNS, and CloudWatch Logs
● Event delivery protocol
○ Pull (Kinesis, DynamoDB Streams)
○ Push (S3, API Gateway)
• S3 does not support notifying multiple Lambda functions for the same event type
• Use SNS to fan out the event trigger
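When S3 notifications are fanned out through SNS, each Lambda function receives an SNS event whose message body carries the original S3 notification as a JSON string. A minimal sketch of unwrapping it follows; the helper name and the hand-built sample event are assumptions for illustration, trimmed to the fields used.

```python
import json


def s3_events_from_sns(event):
    """Unwrap S3 notifications delivered through an SNS topic."""
    unwrapped = []
    for record in event.get("Records", []):
        # SNS delivers the original notification as a JSON string.
        unwrapped.append(json.loads(record["Sns"]["Message"]))
    return unwrapped


# Hand-built sample: an S3 notification serialized inside an SNS record.
inner = {"Records": [{"s3": {"bucket": {"name": "your-video-bucket"},
                              "object": {"key": "video.mp4"}}}]}
sns_event = {"Records": [{"Sns": {"Message": json.dumps(inner)}}]}
print(s3_events_from_sns(sns_event))
```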
Project 2.3 - Task 3
Video Processing Pipeline
● AWS Lambda and FFmpeg to process videos
● AWS Rekognition for image labeling
● AWS CloudSearch to index videos based on labels
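The thumbnail step of such a pipeline could look roughly like the sketch below, which calls the `ffmpeg` binary through `subprocess` rather than a wrapper library. The function names, the one-second offset, and the PNG output are illustrative choices, not the project's specification; `make_thumbnail` assumes an `ffmpeg` executable is available on the PATH.

```python
import os
import subprocess


def thumbnail_command(video_path, thumbnail_path, at_seconds=1):
    """Build an ffmpeg command that grabs one frame from a video."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it exists
        "-ss", str(at_seconds),  # seek to this offset in the input
        "-i", video_path,
        "-frames:v", "1",        # emit exactly one video frame
        thumbnail_path,
    ]


def make_thumbnail(video_path, workdir="/tmp"):
    """Run ffmpeg and return the thumbnail path (requires ffmpeg on PATH)."""
    stem = os.path.splitext(os.path.basename(video_path))[0]
    thumbnail_path = os.path.join(workdir, stem + ".png")
    subprocess.run(thumbnail_command(video_path, thumbnail_path), check=True)
    return thumbnail_path


print(thumbnail_command("/tmp/video.mp4", "/tmp/video.png"))
```

Keeping the command construction separate from the call makes it easy to inspect or test the arguments without running ffmpeg.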
AWS CloudSearch
● Fully managed AWS service for searching
○ Scaling - disable auto-scaling for this project
○ Cost - use search.m1.small ($0.059 per hour)
○ Batch uploads ($0.10 per 1,000 Batch Upload Requests, each batch <= 5 MB)
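Since uploads are billed per batch, it pays to group documents into as few batches as the size limit allows. The sketch below serializes documents into a CloudSearch-style JSON batch of "add" operations and rejects one that is too large. The `labels` field name is hypothetical, and reading the 5 MB limit as 5 * 1024 * 1024 bytes is an assumption.

```python
import json

MAX_BATCH_BYTES = 5 * 1024 * 1024  # assumed reading of the 5 MB batch limit


def build_batch(documents):
    """Serialize {id: fields} into one batch of 'add' operations."""
    batch = [{"type": "add", "id": doc_id, "fields": fields}
             for doc_id, fields in documents.items()]
    payload = json.dumps(batch).encode("utf-8")
    if len(payload) > MAX_BATCH_BYTES:
        raise ValueError("batch exceeds 5 MB; split it into smaller batches")
    return payload


# Hypothetical document: a video id mapped to its labels.
print(build_batch({"video1": {"labels": ["cat", "sofa"]}}))
```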
AWS Rekognition
● Image recognition service
● Pricing!!
○ $1.00 per 1,000 images
○ Keep budget for testing
○ Be careful not to exhaust your budget
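A quick back-of-the-envelope estimate helps with budgeting. This sketch uses only the prices quoted on the CloudSearch and Rekognition slides; the function name and the usage figures in the example call are made up for illustration.

```python
SEARCH_INSTANCE_PER_HOUR = 0.059    # search.m1.small
BATCH_UPLOAD_PER_1000 = 0.10        # per 1,000 batch upload requests
REKOGNITION_PER_1000_IMAGES = 1.00  # per 1,000 images


def task3_cost(instance_hours, batch_uploads, images):
    """Estimate spend in dollars from the per-unit prices above."""
    total = (instance_hours * SEARCH_INSTANCE_PER_HOUR
             + batch_uploads / 1000 * BATCH_UPLOAD_PER_1000
             + images / 1000 * REKOGNITION_PER_1000_IMAGES)
    return round(total, 4)


# Made-up usage: 10 instance hours, 2,000 batch uploads, 500 images.
print(task3_cost(10, 2000, 500))
```

With these example figures the search instance and image labeling dominate, which is why leaving a search domain running or re-labeling the same images repeatedly eats the budget quickly.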
CloudWatch Alarm
● To mitigate overspending on Rekognition, we have provided you with a CloudWatch Alarm Terraform template that you must apply.
● The submitter will verify that you have not exceeded some threshold of calls to Rekognition before proceeding.
● Useful trick for future projects!
Project 2.3 - Hints
● Make sure you are authenticated before submitting (EC2 IAM role, az login, gcloud auth)
● Review the suggested libraries
● ffmpeg + ffmpy works well in Python 2.7
● Test your functions and triggers manually first
○ aws s3 cp video.mp4 s3://your-video-bucket
● Review the CloudWatch Logs for errors
● Use the /tmp directory for storing images and video during a function execution
● But... remember that functions are stateless
● Check for misconfigured permissions
○ S3 -> SNS, SNS -> Lambda, Lambda -> S3 / CS
Project 2.3 Penalties
Upcoming Deadlines
• Quiz 4: Modules 7, 8 and 9:
Due: Friday September 28, 2018 11:59PM Pittsburgh
• Project 2.3: Functions as a Service
Due: Sunday September 30, 2018 11:59PM Pittsburgh
• Team Project: Team Formation
Due: Friday 9/28 at 11:59 PM Pittsburgh
Questions?