University of Massachusetts Amherst • Department of Computer Science • 2008
Cloud Computing and High Performance Networking
David Irwin
Computer Science Department
University of Massachusetts, Amherst
Cloud Computing
Wikipedia: “Internet-based computing, whereby shared resources, software and information are provided to computers and other devices on-demand, like the electricity grid”
Cloud Computing
Shared resources == lots of computers in data centers
Benefit from “economy of scale”
  Cost per unit falls as scale increases
  I.e., like Costco, but for computers and software services
Outline
Virtualized Data Centers
  The foundation for cloud computing
Hardware Virtualization
  The foundation for virtualized data centers
Public/private Cloud Computing
  Relevance to education
Shared testbeds
  NSF’s GENI prototype
Data Center Overview
Large server and storage farms
  Used by enterprises to run server applications
  Used by Internet companies
    Google, Facebook, YouTube, Amazon
  Size varies depending on needs
Architecture
  Traditional: applications run on physical servers
    Manual mapping of applications to servers
    IT admins deal with “change”
  Modern: virtualized data centers
    Applications run inside virtual servers; VMs are mapped to physical servers
    Provides flexibility in mapping from virtual to physical resources
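The mapping from virtual to physical servers can be illustrated with a minimal sketch. It assumes a simple first-fit policy and a single capacity dimension (memory in GB); the function `place_vms`, the VM names, and the sizes are hypothetical and are not taken from the talk.

```python
def place_vms(vm_demands, server_capacity, num_servers):
    """Map each VM to the first physical server with enough free capacity.

    vm_demands: dict of VM name -> memory demand in GB (illustrative unit).
    Returns a dict of VM name -> server index; raises if a VM does not fit.
    """
    free = [server_capacity] * num_servers
    placement = {}
    for vm, demand in vm_demands.items():
        for server, available in enumerate(free):
            if demand <= available:
                free[server] -= demand
                placement[vm] = server
                break
        else:
            # No server had room; a real system would add capacity or migrate.
            raise RuntimeError(f"no server has room for {vm}")
    return placement


# Hypothetical workload: three VMs packed onto two 8 GB servers.
placement = place_vms({"web": 4, "db": 6, "cache": 3},
                      server_capacity=8, num_servers=2)
print(placement)
```

Because the mapping is just data, changing it (for example, after a server fails) does not require touching the applications themselves, which is the flexibility the slide refers to.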
Virtualized Data Center
Simplifies resource management
  Applications started from preconfigured VM images, e.g., virtual appliances
  Virtualization layer permits resource allocations to vary dynamically
  Migrate VMs between physical machines with no downtime
Workload management
  Internet applications have dynamic workloads
  How much capacity to allocate to applications?
  Traditional approach: IT admins estimate peak workloads and provision sufficient servers
  Flash crowd: react manually by adding capacity
  Time scale of hours: lost revenue, bad publicity for application
Dynamic Provisioning
Track workload and dynamically provision capacity
  Monitor, predict, provision
Predictive versus reactive provisioning
  Predictive: predict future workload and provision
  Reactive: react whenever capacity falls short of demand
Traditional data centers: bring up a new server
  Borrow from free pool or reclaim under-used server
Virtualized data center: exploit virtualization to speed up application startup time
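The difference between reactive and predictive provisioning can be sketched in a few lines. This is only an illustration under stated assumptions: load is measured in requests per second, every server has the same fixed capacity, and the predictor is a naive linear extrapolation. All function names are hypothetical.

```python
import math


def servers_needed(load, per_server_capacity):
    """Smallest number of servers that can carry `load` requests/sec."""
    return max(1, math.ceil(load / per_server_capacity))


def predict_next(history):
    """Hypothetical predictor: extrapolate the last observed change linearly."""
    if len(history) < 2:
        return history[-1]
    return max(0, history[-1] + (history[-1] - history[-2]))


def reactive(current_servers, observed_load, per_server_capacity):
    """Add capacity only after demand already exceeds what is provisioned."""
    if observed_load > current_servers * per_server_capacity:
        return servers_needed(observed_load, per_server_capacity)
    return current_servers


def predictive(history, per_server_capacity):
    """Provision for the forecast load before it arrives."""
    return servers_needed(predict_next(history), per_server_capacity)


history = [100, 200, 300]                # monitored requests/sec per interval
print(reactive(2, history[-1], 100))     # sized for the load already seen
print(predictive(history, 100))          # sized for the load expected next
```

With a rising workload, the reactive policy is always one interval behind, while the predictive policy is only as good as its forecast. Faster VM startup narrows the gap for both.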
Virtual machines are hot
Headlines from August 2007
VMware IPO: $19.1 billion
Xen sale: $500 million
Traditional OS Structure
Host Machine
Operating System
App App App App
OS abstractions
Hardware
OS
Applications
What are the interfaces and the resources? What is being virtualized?
Interface            Resource
Instructions         CPU
Virtual addresses    Physical memory
System calls         I/O devices
Virtual Machine Structure
Host Machine
Virtual Machine Monitor (Hypervisor)
Guest OS    Guest OS    Guest OS
Guest App   Guest App   Guest App
Why are VMs useful?
Code reuse
  Can run old operating systems + apps on new hardware
  Original purpose of VMs at IBM in the 1960s
Encapsulation
  Can put entire state of an “application” in one thing
  Move it, restore it, copy it, etc.
Isolation, security
  All interactions with hardware are mediated
  Hypervisor can keep one VM from affecting another
  Hypervisor cannot be corrupted by guest operating systems
Encapsulation
Say I want to suspend/restore an application
  I decide to write the process memory to disk
  I reboot my kernel and restart the process
Will this work?
  No, application state is spread out in many places
  Application might involve multiple processes
  Applications have state in the kernel (lost on reboot), e.g., open files, locks, process IDs, driver states, etc.
Encapsulation
Virtual machines capture all of this state
Can suspend/restore an application
  On same machine between boots
  On different machines
Very useful in server farms
  As we discussed earlier
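The encapsulation argument can be made concrete with a toy model. This is not how a hypervisor stores a VM; it only assumes that the machine state (process memory, kernel state, device state) can be gathered into one object, which is the property the slides attribute to virtual machines. The class `VMState` and its fields are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class VMState:
    """Toy stand-in for everything the hypervisor sees (not a real VM format)."""
    memory: dict = field(default_factory=dict)    # memory of all guest processes
    kernel: dict = field(default_factory=dict)    # open files, locks, process IDs
    devices: dict = field(default_factory=dict)   # virtual device state


def suspend(vm):
    """Serialize the whole machine state into one blob."""
    return json.dumps(asdict(vm))


def restore(blob):
    """Rebuild the machine state, on this host or on a different one."""
    return VMState(**json.loads(blob))


vm = VMState(
    memory={"pid 12": "web server heap", "pid 13": "worker heap"},
    kernel={"open_files": ["/var/log/app.log"], "locks": ["db.lock"]},
    devices={"nic": "up"},
)
blob = suspend(vm)        # one thing to move, copy, or store
resumed = restore(blob)
print(resumed == vm)
```

Saving only the `memory` entry of a single process would correspond to the failed approach from the previous slide: the kernel and device state would be lost on reboot.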
Examples
Full virtualization
  Run any OS; expose full x86 ISA
  E.g., VMware, Xen on HVM
Para-virtualization
  Run any (slightly modified) OS; expose (slightly modified) x86 ISA
  E.g., Xen
OS-level virtualization
  Run multiple copies of same OS
  E.g., VServers, User-mode Linux
Types of Cloud Computing
Software-as-a-Service
  E.g., Gmail, Google Calendar
Platform-as-a-Service
  E.g., Azure, AppEngine
Infrastructure-as-a-Service
  E.g., Amazon EC2, S3, EBS
Implementations at different levels of abstraction
  Analogous to a software stack: hardware, OS, application
  Lower-level == more difficult to use + more freedom to innovate
Infrastructure-as-a-Service is the focus of much of this talk. Good for the classroom, since it provides the most freedom to innovate.
Cloud Computing Benefits
Low upfront capital expenditure
  No need to buy, power, cool, or maintain hardware
  Cheaper for small or short-term operations
  E.g., educators teaching project-based courses
Predictable costs
  Flat fees for usage, e.g., $/hour
  Nice for fixed budgets: $250 class computing budget
  Computing budgets are hard to predict
Flexible pricing plans
  On-demand: pay $0.10 every hour, quit at any time
  Spot: use computers when price <= $0.10/hour
  Reserved: reserve computers for long period at $0.05/hour
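The pricing plans above can be compared with some simple arithmetic. The sketch below uses the slide's example prices and the $250 class budget, works in whole cents to avoid rounding issues, and makes two assumptions not stated in the talk: a reserved plan is treated as a plain hourly rate with no upfront fee, and a spot user pays the market price in each hour it is at or below their maximum. The spot price trace is invented for illustration.

```python
def machine_hours(budget_cents, price_cents_per_hour):
    """Whole machine-hours a fixed budget buys at a flat hourly price."""
    return budget_cents // price_cents_per_hour


def spot_usage(hourly_prices_cents, max_price_cents):
    """Run only in hours when the spot price is at or below the maximum.

    Returns (hours_run, total_cost_cents).
    """
    paid = [p for p in hourly_prices_cents if p <= max_price_cents]
    return len(paid), sum(paid)


BUDGET_CENTS = 250 * 100                  # the $250 class budget from the slide

print(machine_hours(BUDGET_CENTS, 10))    # on-demand at $0.10/hour
print(machine_hours(BUDGET_CENTS, 5))     # reserved at $0.05/hour
print(spot_usage([8, 12, 9, 10, 15], 10)) # hypothetical price trace, in cents
```

Under these assumptions the same budget buys twice as many machine-hours on the reserved plan, while the spot plan trades availability (it skips expensive hours) for a lower average price.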
Example: Amazon EC2
Types of Amazon Resources
Elastic Compute Cloud (EC2) ($0.085/hour)
Elastic IPs
AutoScaling
CloudWatch ($0.015/hour)
Elastic MapReduce
Elastic LoadBalancing
Simple Storage Service (S3) ($0.15/GB-month)
Elastic Block Store (EBS) ($0.10/GB-month)
Elastic Snapshots
SimpleDB
Prices are continuing to fall as usage increases...
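As a rough illustration of how these unit prices combine, the sketch below estimates a monthly bill from the figures on the slide. It assumes the CloudWatch price applies per monitored instance-hour and that storage is billed on the amount held for the whole month; the usage numbers (720 hours, 10 GB, 20 GB) are hypothetical, and the prices are the historical ones quoted in the talk, not current ones.

```python
from decimal import Decimal

# Unit prices as quoted on the slide, in dollars.
PRICES = {
    "ec2_hour": Decimal("0.085"),
    "cloudwatch_hour": Decimal("0.015"),   # assumed per monitored instance-hour
    "s3_gb_month": Decimal("0.15"),
    "ebs_gb_month": Decimal("0.10"),
}


def monthly_cost(instance_hours, s3_gb, ebs_gb):
    """Estimate one month's bill in dollars for compute plus storage."""
    compute = instance_hours * (PRICES["ec2_hour"] + PRICES["cloudwatch_hour"])
    storage = s3_gb * PRICES["s3_gb_month"] + ebs_gb * PRICES["ebs_gb_month"]
    return compute + storage


# One monitored instance running for a 30-day month, 10 GB in S3, 20 GB in EBS.
print(monthly_cost(720, 10, 20))
```

`Decimal` is used instead of floats so that the dollar amounts add up exactly.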
Educators and Cloud Computing
Excellent platforms for experimentation
  Don’t care if students “break” things
  Give root access to machine
  Install arbitrary software
  Students isolated from other students/users
Enables distributed application projects
  Can access many machines for a short time period
  E.g., a class of 20 working in pairs, developing an application that runs over 5 machines, needs 50 computers!
  Costs $500 for a two-week assignment (via Amazon Spot Pricing)
  Also can use “free” testbeds, e.g., PlanetLab, GENI
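The class-sizing example can be checked with a short calculation. The student count, pair size, machines per application, and $500 figure come from the slide; the assumption that every machine runs continuously for all 14 days is not stated in the talk and is made here only to derive an implied hourly rate.

```python
students = 20
pair_size = 2
machines_per_app = 5
budget_dollars = 500
hours = 14 * 24            # assumption: machines run 24/7 for two weeks

teams = students // pair_size              # 10 pairs
machines = teams * machines_per_app        # computers needed at once
machine_hours = machines * hours           # total machine-hours consumed
implied_rate = budget_dollars / machine_hours   # dollars per machine-hour

print(machines, machine_hours, round(implied_rate, 4))
```

Under that assumption the $500 estimate corresponds to roughly three cents per machine-hour; if machines are shut down between work sessions, the same budget tolerates a higher hourly price.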
Importance of Distributed Apps
Internet services use many computers to serve Internet users
  E.g., Google, Facebook, YouTube, Yahoo, Microsoft
  Services may use thousands of computers running at multiple data centers throughout the world
  Importance of these applications is still increasing at a rapid rate
Important for students to learn how to develop distributed applications in this environment
  Traditionally a difficult task, since access is expensive
Private Clouds
If you already own a lot of mostly idle machines, you can install private cloud software
  Make your own infrastructure look like a cloud
  Don’t get the low-cost benefits...
  ...but maybe make your infrastructure more flexible or usable
Good for class project maintenance if you have the time to invest to learn and install software
Example: Eucalyptus, emulates Amazon EC2 interfaces
Shared Testbeds
Run by universities and companies to experiment with new research prototypes
Often are “free”: may need to contribute a few computers
  Formed from donations by participants
  Not as stable as commercial clouds
Focus on one or more characteristics
  Compute isolation
  Network isolation
  Geographic diversity
  Hardware diversity
Examples: PlanetLab, Emulab, NSF’s Global Environment for Network Innovations (GENI)
PlanetLab
Get access to machines hosted around the world
  Experiment with global network services
  No resource isolation
  Network presence but little compute power
Success led to GENI
Status
GENI connects diverse testbeds together
  Many components/link types
  Routers, edge nodes, wireless, wired, storage, sensors, fiber-optic, etc.
Prototyping started last year (http://geni.net)
  Hasn’t been built, but isn’t vaporware
Existing systems form foundation
  80 projects clustered around 4 “control frameworks”
  PlanetLab, Emulab, Orca, Orbit
  4 projects at UMass-Amherst!
  Run by BBN (Cambridge); also at UMass-Lowell and Williams
GENI Overview
Sliverable GENI Substrate (Contributing domains/Aggregates)
Wind tunnel
Experiments (Guests occupying slices)
Embedding
Petri dish
Observatory