Exploring the Cloud
Chris Sosa, Dr. Andrew Grimshaw
sosa, grimshaw @ cs.virginia.edu
University of Virginia
Introduction - 1
- Ever-increasing demand for computing resources
- During non-peak times, computing resources sit idle
  - Still paying! Power, cooling, etc.
- Total Cost of Ownership (TCO) is much more than the cost of hardware
  - Maintenance
  - Administration
  - Cooling
  - Etc.
Introduction - 2
- Observation: load on the main ITC clusters exhibits a bimodal distribution
- Can we pay only for what we use?
Enter Cloud Computing (field trip!)
- What is it?
  - Infrastructure-related capabilities provided as a service
  - Also known as utility computing, and associated with very basic APIs
- Lots of industry support
  - Amazon Infrastructure Services: EC2, S3, …
  - Google App Engine
  - Microsoft Azure
  - IBM-led initiatives
Cloud Computing Paradigms
- Top-down: client provides only the program and deployment information
  - Microsoft Azure
  - Google App Engine
- Bottom-up: raw infrastructure provided (virtualized hardware)
  - Amazon
  - Nirvanix
  - Flexiscale
  - GoGrid
Advantages and Disadvantages of Using the Cloud
- Advantages
  - Pay for what you use: the model is based on how long you use resources, and you can allocate and deallocate them on-the-fly
  - Hardware cost, set-up time, maintenance, and cooling all go down to zero
  - Can start developing immediately
- Disadvantages
  - No control over physical resources. Do you trust Amazon?
  - SLAs may not be good enough. Is 99.95% availability good enough?
  - Some limitations on what you can run. Must stay within the API / framework given
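To make the availability question concrete, the arithmetic can be sketched as follows. Only the 99.95% figure comes from the slide; the function name `max_downtime_hours` and the 365-day year are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: a non-leap year


def max_downtime_hours(availability_percent, period_hours=HOURS_PER_YEAR):
    """Hours of downtime that an availability guarantee still permits."""
    return (100.0 - availability_percent) / 100.0 * period_hours


if __name__ == "__main__":
    hours = max_downtime_hours(99.95)
    print(f"99.95% availability permits up to {hours:.2f} hours of downtime per year")
```

Under these assumptions the guarantee still permits a little over four hours of downtime per year, which is the number to weigh against what a batch workload can tolerate.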
Why Cloud Computing
- We only have to pay for what we use
- The disadvantages do not affect most users in a batch system
Amazon Leading the Push
- Amazon has been the most successful player so far
- Over 29 billion objects stored on S3
- Using over 60% of their resources for Cloud services
- EC2 just went out of beta in October (new)
… the rest of these slides will assume we use Amazon
Outline
- Introduction
- Overview of Amazon Cloud Services
- Proposal of Hybrid Scheduler
- Questions to be Answered
- Conclusion
Amazon S3
- Simple storage for the Internet
- Applications can interact through various mechanisms
  - REST
  - SOAP
  - BitTorrent
- 250 Mb/second network link
- Objects stored in buckets
  - Buckets have their own namespace
  - Up to 100 buckets per account
  - Unlimited objects per bucket
  - 5 GB limit on the size of objects
  - Objects are write-once
- SLA guarantees 99.9% availability
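As a rough illustration of what the object-size limit and link speed imply for a batch job's data, here is a minimal sketch. The function names are hypothetical, and the unit conventions (1 GB = 10**9 bytes, 1 Mb = 10**6 bits) are assumptions; the slide states neither.

```python
OBJECT_LIMIT_BYTES = 5 * 10**9        # 5 GB per object (assumes 1 GB = 10**9 bytes)
LINK_BITS_PER_SECOND = 250 * 10**6    # 250 Mb/second (assumes 1 Mb = 10**6 bits)


def objects_needed(total_bytes):
    """Minimum number of objects needed to store total_bytes under the size limit."""
    return (total_bytes + OBJECT_LIMIT_BYTES - 1) // OBJECT_LIMIT_BYTES


def transfer_seconds(total_bytes):
    """Best-case transfer time, assuming the whole link is available to us."""
    return total_bytes * 8 / LINK_BITS_PER_SECOND


if __name__ == "__main__":
    dataset = 12 * 10**9  # a hypothetical 12 GB input set
    print(objects_needed(dataset), "objects,", transfer_seconds(dataset), "seconds")
```

The transfer time is a lower bound only; real throughput depends on the path between the local cluster and Amazon.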
S3 Pricing
- Storage
  - $0.15 per GB-month of storage used
- Data Transfer
  - $0.10 per GB - all data transfer in
  - $0.18 per GB - first 10 TB / month data transfer out
  - $0.16 per GB - next 40 TB / month data transfer out
  - $0.13 per GB - data transfer out / month over 50 TB
  - FREE to EC2
- Requests
  - $0.01 per 1,000 PUT or LIST requests
  - $0.01 per 10,000 GET and all other requests*
* No charge for delete requests
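The price list above can be turned into a monthly estimate along the following lines. This is a sketch, not Amazon's billing logic: the function names are hypothetical, 1 TB is assumed to be 1,000 GB, request charges are assumed to be pro-rated, and traffic to EC2 (free) is assumed to be excluded from `gb_out`.

```python
# Prices copied from the slide (USD).
STORAGE_PER_GB_MONTH = 0.15
TRANSFER_IN_PER_GB = 0.10
# (tier size in GB, price per GB); None marks the unbounded last tier.
# Assumption: 1 TB = 1,000 GB.
TRANSFER_OUT_TIERS = [(10_000, 0.18), (40_000, 0.16), (None, 0.13)]
PUT_LIST_PER_1000 = 0.01
GET_OTHER_PER_10000 = 0.01


def transfer_out_cost(gb_out):
    """Apply the tiered outbound-transfer prices to gb_out gigabytes."""
    cost = 0.0
    remaining = gb_out
    for tier_size, price in TRANSFER_OUT_TIERS:
        if remaining <= 0:
            break
        billed = remaining if tier_size is None else min(remaining, tier_size)
        cost += billed * price
        remaining -= billed
    return cost


def s3_monthly_cost(stored_gb, gb_in, gb_out, put_list_requests, get_other_requests):
    """Estimated S3 charge for one month; delete requests are free and ignored."""
    return (
        stored_gb * STORAGE_PER_GB_MONTH
        + gb_in * TRANSFER_IN_PER_GB
        + transfer_out_cost(gb_out)
        + put_list_requests / 1000 * PUT_LIST_PER_1000
        + get_other_requests / 10000 * GET_OTHER_PER_10000
    )
```

For example, storing 100 GB, uploading 50 GB, downloading 20 GB, and issuing 10,000 PUT and 100,000 GET requests gives `s3_monthly_cost(100, 50, 20, 10_000, 100_000)`, roughly $23.80 under these assumptions.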
Amazon EC2
- Provides virtual compute resources
  - Purchase CPUs on an hourly basis
  - Can use provided virtual machine images, or make your own
  - Virtual machines run atop Xen
- Can do metadata operations with REST, SOAP, or command-line tools
- Instances are assigned an IP address for SSH, remote desktop, etc.
- SLA guarantees 99.95% availability
EC2 Pricing
- Instances
  - $0.10 / hr - Small Instance: 1.7 GB of memory, 1 EC2 Compute Unit (1 virtual core - 1.7 GHz processor), 160 GB of instance storage, 32-bit platform (can buy in sets of 1, 4, 8)
  - $0.20 / hr - High-CPU Medium Instance: 1.7 GB of memory, 5 EC2 Compute Units (2 virtual cores with 2.5 EC2 Compute Units each), 350 GB of instance storage, 32-bit platform (can buy in sets of 1 or 4)
- Data Transfer
  - $0.10 per GB - data in
  - $0.18 per GB - first 10 TB out
  - FREE to S3
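A short sketch of how these prices combine for a single batch run follows. The function names and dictionary keys are hypothetical; billing partial hours as full hours is an assumption not stated on the slide, and only the first outbound-transfer tier is modelled.

```python
import math

# Hourly prices and EC2 Compute Units copied from the slide.
INSTANCE_TYPES = {
    "small": {"usd_per_hour": 0.10, "compute_units": 1},
    "high_cpu_medium": {"usd_per_hour": 0.20, "compute_units": 5},
}
TRANSFER_IN_PER_GB = 0.10
TRANSFER_OUT_PER_GB = 0.18  # first 10 TB out; higher tiers are not modelled


def ec2_cost(instance_type, instances, hours, gb_in=0.0, gb_out=0.0):
    """Estimated charge for running `instances` machines for `hours` each."""
    billed_hours = math.ceil(hours)  # assumption: partial hours billed as full hours
    rate = INSTANCE_TYPES[instance_type]["usd_per_hour"]
    return (
        instances * billed_hours * rate
        + gb_in * TRANSFER_IN_PER_GB
        + gb_out * TRANSFER_OUT_PER_GB
    )


def usd_per_compute_unit_hour(instance_type):
    """Price of one EC2 Compute Unit for one hour on the given instance type."""
    spec = INSTANCE_TYPES[instance_type]
    return spec["usd_per_hour"] / spec["compute_units"]
```

With these figures, four small instances for 2.5 hours with 10 GB in and 5 GB out come to about $3.10, and the High-CPU Medium instance works out to $0.04 per compute-unit hour against $0.10 for the small instance, which matters for CPU-bound batch jobs.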
Main Idea
- Reduce the number of resources we keep active and improve peak performance
- Modify the local scheduler
  - When CPU usage is above a threshold, allocate new machines from EC2 and schedule jobs on them
  - As usage decreases, deallocate resources and return to normal usage
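The threshold rule described above can be sketched as a small decision function. This is an illustrative model, not the proposed scheduler: the names, the 0.9 default threshold, the one-node-per-step policy, and the cap on cloud nodes are all assumptions.

```python
def next_cloud_node_count(cpu_usage, cloud_nodes, threshold=0.9, max_cloud_nodes=8):
    """Decide how many cloud nodes to hold after one scheduling cycle.

    cpu_usage is the fraction of local CPU in use (0.0 to 1.0).
    """
    if cpu_usage > threshold and cloud_nodes < max_cloud_nodes:
        return cloud_nodes + 1  # overloaded: allocate one more EC2 machine
    if cpu_usage <= threshold and cloud_nodes > 0:
        return cloud_nodes - 1  # load has dropped: release one machine
    return cloud_nodes


def simulate(usage_trace, threshold=0.9, max_cloud_nodes=8):
    """Replay a trace of CPU usage samples and record the cloud node count."""
    nodes = 0
    history = []
    for usage in usage_trace:
        nodes = next_cloud_node_count(usage, nodes, threshold, max_cloud_nodes)
        history.append(nodes)
    return history


if __name__ == "__main__":
    print(simulate([0.5, 0.95, 0.97, 0.92, 0.6, 0.4]))  # [0, 1, 2, 3, 2, 1]
```

A single threshold can make the node count oscillate when usage hovers near it; separate allocate and release thresholds would be one way to damp that, though the slides do not specify either design.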
Design
[Figure: design diagram. Components: Local Cluster, Batch Scheduler, Amazon EC2 (virtualized PCs hosting virtual machines running Genesis II), Amazon S3 (AMI). Arrow labels: "Schedule jobs", "Check usage", "Allocate resource on-the-fly", "Off-load jobs".]
Research Setup
- Instead of spending funds on running experiments with EC2 and S3, we will use Eucalyptus to emulate EC2
  - Eucalyptus is an open-source implementation of the EC2 interface
  - Requires Xen to be installed on host machines (need dedicated machines)
- Create a centralized repository for our test data (to stand in for S3)
  - NFS share
  - Other possibilities?
Task Bar
1. Decide on the software that will be installed on the virtual machines
   1. PBS licensing is complicated and expensive
   2. Several alternatives such as Genesis II, Hadoop, etc.
2. Create an AMI and register it with Eucalyptus
3. Incorporate virtual machines from Eucalyptus into the existing scheduler and create a mechanism to do this on-the-fly
4. Modify the scheduler to take a threshold into account
5. Build stubs to measure how much bandwidth, time, etc. is being used by the scheduler so that we can determine the price we would be charged by Amazon's EC2 and S3
6. Incorporate these costs and build an economic model using actual workloads at UVa, differing thresholds, and various ways of passing jobs to the Cloud
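One possible shape for the measurement stubs in step 5 is sketched below. The class `UsageMeter` and its methods are hypothetical; the default rates are the small-instance and first-tier transfer prices from the pricing slides, 1 GB is assumed to be 10**9 bytes, and rounding the accumulated runtime up to whole hours is a simplifying assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class UsageMeter:
    """Accumulates what off-loaded jobs consumed (hypothetical stub)."""

    instance_seconds: float = 0.0
    bytes_in: int = 0
    bytes_out: int = 0
    jobs: int = 0

    def record_job(self, runtime_seconds, input_bytes, output_bytes):
        """Called by the scheduler each time an off-loaded job finishes."""
        self.instance_seconds += runtime_seconds
        self.bytes_in += input_bytes
        self.bytes_out += output_bytes
        self.jobs += 1

    def estimated_charge(self, usd_per_hour=0.10, usd_per_gb_in=0.10, usd_per_gb_out=0.18):
        """Price the recorded usage; assumes 1 GB = 10**9 bytes."""
        hours = math.ceil(self.instance_seconds / 3600)  # assumption: whole hours
        gb_in = self.bytes_in / 10**9
        gb_out = self.bytes_out / 10**9
        return hours * usd_per_hour + gb_in * usd_per_gb_in + gb_out * usd_per_gb_out


if __name__ == "__main__":
    meter = UsageMeter()
    meter.record_job(runtime_seconds=3600, input_bytes=2 * 10**9, output_bytes=10**9)
    meter.record_job(runtime_seconds=1800, input_bytes=0, output_bytes=0)
    print(meter.jobs, "jobs, estimated charge:", round(meter.estimated_charge(), 2))
```

Keeping the raw counters separate from the pricing step means the same recorded run can be re-priced later if the rates or the billing assumptions change.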
Questions to be Answered
What is the Cost Model associated with working with Cloud computing?
What costs would be associated with common jobs being run at UVa?
What software will we have installed on the Virtual Machines in the Cloud?
How can we set a threshold that lets us decide on-the-fly when to start offloading jobs to Cloud resources?
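One way to explore the threshold question is to replay a demand trace against several candidate thresholds and compare the resulting cloud bill. The sketch below uses made-up demand numbers, hypothetical function names, and assumes one off-loaded core costs the $0.10 / hr small-instance rate; it is an illustration, not a result.

```python
def offloaded_core_hours(demand_per_hour, local_cores, threshold):
    """Core-hours sent to the cloud when demand exceeds threshold * local_cores."""
    limit = threshold * local_cores
    return sum(max(0.0, demand - limit) for demand in demand_per_hour)


def sweep(demand_per_hour, local_cores, thresholds, usd_per_core_hour=0.10):
    """Estimated cloud cost for each candidate threshold."""
    return {
        t: offloaded_core_hours(demand_per_hour, local_cores, t) * usd_per_core_hour
        for t in thresholds
    }


if __name__ == "__main__":
    demand = [40, 60, 120, 150, 90, 30]  # hypothetical cores requested per hour
    for threshold, cost in sweep(demand, 100, [0.6, 0.8, 1.0]).items():
        print(f"threshold {threshold:.1f}: ${cost:.2f}")
```

Lower thresholds off-load earlier and cost more in cloud charges; the economic model would weigh that against the queueing delay saved, which this sketch does not capture.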
Conclusions
- It is important to reduce costs as well as get more bang for your buck
- Offloading job processing to Cloud computing infrastructures can reduce costs while improving peak throughput
Questions?