Rethinking the Cloud - Limitations and Opportunities - 2011 NexCom
Post on 20-May-2015
Woohyun Kim
Cloud Platform Team
S-Core
2011-05-18
Rethinking the Cloud - A View of Virtualization, Storage, Network, and Platform
Cloud Success Stories
What is the Cloud?
• A computing environment that elastically provides virtualized resources as a service over the Internet in a pay-as-you-go manner
Amazon’s Challenge and Paradigm Shift
Success Cases in Amazon
SmugMug (http://www.smugmug.com/)
• an online photo storage application that stores more than half a petabyte of data on S3
• estimates cost savings on service and storage to be close to $1 million
37Signals (http://37signals.com/)
• maker of the popular online project-management software Basecamp; uses S3 for its storage needs
New York Times (http://www.nytimes.com)
• used EC2 to process terabytes of archival data on hundreds of EC2 instances within 36 hours
Animoto (http://animoto.com/)
• an online presentation video generator that needs gobs of computing power for video processing
• recently withstood a surge in Web traffic that would kill most companies’ systems by quickly scaling up its processing power using EC2 with RightScale
• ramped from 25,000 users to 250,000 users in three days, signing up 20,000 new users per hour at peak
• using RightScale, EC2 instances automatically scaled out from 40 to 4,000 at that time
• for more detail, refer to http://blog.rightscale.com/2008/04/23/animoto-facebook-scale-up/
Powerset had a great idea: “Natural Language Search”
• It had to index millions of pages of data and content
• They knew that this would require a massive datacenter and extensive computing power: CPUs, terminal switches, cables, racks, datacenters, hosting, power, maintenance, staff
• But they needed to keep infrastructure costs to a minimum
Start-up Company: Powerset
“By using Amazon EC2, Powerset is able to match the infrastructure of large scale search companies on a startup budget.” - Barney Pell, Founder and CEO of Powerset
“Amazon EC2 is a complete game-changer. EC2 and Amazon Web Services make it easy for start-ups to build a complete infrastructure without having to spend much on capital.” - Paul Hammann
$100 million
The New York Times is a 150-year-old company, and serves the largest newspaper Website, NYTimes.com
• 1 billion page views per month
• 20+ million monthly unique visitors
They tried to convert TIFF images to PDFs
• TIFF images (405,000)
• Articles (3.3 million) in SGML
• XML files (405,000) mapping articles to TIFFs
• PNG images (810,000)
• JavaScript files (405,000)
Temporary & Data-intensive : The New York Times
“I got access to a few more machines and churned through all 11 million articles in just under 24 hours using 100 EC2
instances, and generated another 1.5TB of data to store in S3. It just costs $3000.” - Derek Gottfrid
“I had was this: upload 4TB of source data into S3, write some code that would run on numerous EC2 instances to read the source data, create PDFs, and store the results back into S3.
S3 would then be used to serve the PDFs to the general public.” - Derek Gottfrid
₩64,865,381,400
Cloud Skepticism
Amazon’s cloud outages receive a lot of exposure …
• April 21 ~ 22, 2011: A networking glitch made its storage volumes automatically create back-ups of themselves, filling up storage capacity and causing connectivity issues, lasts two days
  – Amazon’s customers include start-ups like the social networking site Foursquare but also big companies like Pfizer, Netflix and Nasdaq
  – Affected web sites included Quora.com, Reddit.com, GroupMe.com and Scvngr.com
• July 20, 2008: Failure due to stranded zombies, lasts 5 hours
• Feb 15, 2008: Authentication overload leads to two-hour service outage
• October 2007: Service failure lasts two days
• October 2006: Security breach where users could see other users’ data
… and their current SLAs don’t match those (99.99%) of enterprises
• Amazon EC2: 99.95%
• Amazon S3: 99.9%
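To make the SLA gap concrete, the percentages can be converted into a downtime budget. The sketch below assumes a yearly measurement window of 365 days (an assumption for illustration; the actual SLA terms define their own windows and exclusions), and the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years (assumed window)


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# The three availability figures quoted on the slide.
for name, sla in [("Enterprise target", 99.99),
                  ("Amazon EC2", 99.95),
                  ("Amazon S3", 99.9)]:
    hours = allowed_downtime_hours(sla)
    print(f"{name}: {sla}% -> {hours:.2f} hours/year")
```

Under that assumption, 99.99% leaves under one hour of downtime per year, 99.95% about 4.4 hours, and 99.9% about 8.8 hours.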
The Cloud is Falling
Cloud Is NOT A Brand-New Technology
• Utility Computing
• Amazon EC2 (August 2006)
• Amazon S3 (March 2006)
• Google App Engine (April 2008)
• Microsoft Azure (Oct 2008)
• GFS, MapReduce, BigTable, Hadoop
Cloud is just Buzz, and Marketing Hype Campaign
• Cloud computing is simply a buzzword used to repackage grid computing and utility computing, both of which have existed for decades
- Definition of Cloud Computing, whatis.com
• What is it? What is it? ... Is it - 'Oh, I am going to access data on a server on the Internet.' That is cloud computing?
• The interesting thing about cloud computing is that we’ve redefined cloud computing to include everything that we already do
- During Oracle’s Analyst Day, Larry Ellison
• .. cloud computing was simply a trap aimed at forcing more people to buy into locked, proprietary systems that would cost them more and more over time
• It's stupidity. It's worse than stupidity: it's a marketing hype campaign
- GNU founder, Richard Stallman
• Server revenue for public cloud computing will grow from $582 million in 2009 to $718 million in 2014
• Server revenue for the much larger private cloud market will grow from $7.3 billion to $11.8 billion in the same time period
- Worldwide Enterprise Server Cloud Computing 2010-2014 Forecast, IDC
Cloud Wars and Strategies
The Cloud Wars
[Figure: map of cloud vendors, their acquisitions, and related technologies]
• EMC / VMware: SpringSource, CloudFoundry, Isilon, Greenplum, GemStone, RabbitMQ, Hyperic (figure labels: 650M, 450M)
• SalesForce.com: Force.com, VMForce, Heroku (212M)
• RedHat: OpenShift, JBoss (420M), Qumranet (107M; KVM, SPICE)
• Citrix: XenSource, Xen, NetScaler (figure label: 300M)
• Oracle: Sun, Java, MySQL
• Others shown: NexentaStor, ZFS, BtrFS, Ceph, OpenStack, Rackspace, rPath rBuilder, Quest vControl, 3Tera AppLogic
Infra is getting more Programmable
Cloud Disruptors
• RightScale & enStratus
Cloud Disruptors (cont’d)
• Cloud Mgmt. Functionality in enStratus
Cloudburst and Hybrid Cloud
• CloudSwitch
Who Moved My Cheese?
Disrupting or Disrupted??!
Virtualization
Virtualization on x86 Architecture
• VMM (Virtual Machine Monitor) or Hypervisor
  – Since the VMM must run at the privileged level (ring 0), the OS is moved to a non-privileged level (ring 1 or 3)
[Figure: three virtual machines, each running its own operating system (#1 Win-XP, #2 Mac-OS, #3 Linux) with its apps, on top of a Virtual Machine Manager that sits on the physical CPU, Memory, NIC, and Disk]
• Problems on x86 Architecture
  – Privileged Instruction
    • Traps when called from CPU user mode, and its effect is emulated by the VMM
  – Sensitive Non-privileged Instruction
    • Causes the physical state of the CPU to leak

        smsw %eax            # reads CR0 into EAX
        mov %cr0, %edx       # reads CR0 into EDX
        sub %eax, %edx       # what’s the difference?
        jnz emulation_flaw   # it ought to be zero!!

No trap, no emulation => the VMM eventually crashes
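The difference between the two instruction classes can be shown with a toy model. This is only a sketch: the class, the instruction names, and the register values are hypothetical, and real SMSW behavior is simplified to "returns the physical CR0".

```python
from dataclasses import dataclass, field

# Hypothetical instruction names for this toy model.
PRIVILEGED = {"mov_from_cr0"}      # faults outside ring 0, so the VMM can intercept
SENSITIVE_UNPRIVILEGED = {"smsw"}  # reads CPU state but never faults


@dataclass
class ToyVMM:
    physical_cr0: int = 0x8005003B  # illustrative value held by the real CPU
    virtual_cr0: int = 0x00000011   # illustrative value the guest believes it set
    trapped: list = field(default_factory=list)

    def run_guest_instruction(self, instr: str) -> int:
        if instr in PRIVILEGED:
            # Trap: the VMM gets control and emulates against virtual state.
            self.trapped.append(instr)
            return self.virtual_cr0
        if instr in SENSITIVE_UNPRIVILEGED:
            # No trap: runs natively and exposes the physical state.
            return self.physical_cr0
        raise ValueError(f"unknown instruction: {instr}")


def guest_detects_flaw(vmm: ToyVMM) -> bool:
    """Mirror the assembly snippet: compare the two reads of CR0."""
    eax = vmm.run_guest_instruction("smsw")
    edx = vmm.run_guest_instruction("mov_from_cr0")
    return (edx - eax) != 0


vmm = ToyVMM()
print("emulation flaw visible to guest:", guest_detects_flaw(vmm))
print("instructions the VMM intercepted:", vmm.trapped)
```

Only the privileged read appears in the intercepted list; the sensitive read bypasses the VMM entirely, which is why pure trap-and-emulate is not sufficient on classic x86.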
CPU Virtualization on x86 Architecture
• How to handle nonvirtualizable instructions
  – Full virtualization using binary translation
  – Paravirtualization using hypercalls
  – Hardware-assisted virtualization using root/non-root mode
• VT-x: Virtualization Technology for x86 (IA-32 and Intel 64) CPUs
• VT-i: Virtualization Technology for Itanium (IA-64) CPUs
• VT-d: Virtualization Technology for Directed I/O
• VT-c: Virtualization Technology for Connectivity
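Whether a Linux host exposes hardware-assisted virtualization can be checked from the CPU flags. The sketch below is a minimal example: the function name is hypothetical, and it looks for the `vmx` flag (Intel VT-x) and, as an addition not covered on the slide, the `svm` flag (AMD-V) in text shaped like `/proc/cpuinfo`.

```python
def hardware_virtualization_flags(cpuinfo_text: str) -> set:
    """Return the virtualization-related CPU flags found in cpuinfo-style text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # vmx = Intel VT-x, svm = AMD-V
            found |= set(value.split()) & {"vmx", "svm"}
    return found


# Sample text in the shape of /proc/cpuinfo (made up for illustration).
sample = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae vmx sse sse2\n"
print(hardware_virtualization_flags(sample))
```

On a real Linux host the same function could be fed the contents of `/proc/cpuinfo`; an empty result suggests the feature is absent or disabled in firmware.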
Virtualization on x86 Architecture
Hurdles in Server Virtualization
• Storage Allocation & Interfacing
  – On-demand, Pre-allocation
  – NAS, iSCSI, Local Storage
• VM Management
  – Snapshot, Fast Cloning, Thin Provisioning, Live Migration, DRS
• Virtual Network
  – L2/L3 Network Design, Directed, Bridged, NAT, VLAN, Load-Balance, Firewall
• Resource Sharing
  – Resource Pool, High Availability, Scheduling, Workload Mgmt.
• Migration
  – P2V, V2V
• Hardware-Assisted Support
  – Privileged instruction virtualization
    • De-privileging or ring compression to handle privileged instructions
  – Memory virtualization
    • Memory partitioning and allocation of physical memory
  – Device and I/O virtualization
    • Routing I/O requests between virtual devices and physical hardware
Storage
File System with Shared Storage
Cluster File Systems
• GFS2 – DLM, scaling to 100
• GlusterFS – fuse, poor performance
• Lustre – dfs
Unified Storage using Virtual Block Pool
• NexentaStor based on ZFS
• GlusterFS
Cluster File System using Virtual Block Pool
A Feasible SAN File System
• IBM TotalStorage SAN File System
Pooled Storage
• GlusterFS
• ZFS
• Openfiler
• iSCSI+GNBD+DRBD
Network
Basic Virtual Network
Tap vs. Tun
• Tap – simulates an Ethernet device and operates with layer 2 packets such as Ethernet frames
• Tun(nel) – simulates a network layer device and operates with layer 3 packets such as IP packets
• TAP is used to create a network bridge, while TUN is used with routing.
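The layer 2 versus layer 3 distinction shows up in what a program reading from each device has to parse. The sketch below builds a sample frame by hand rather than opening a real device; the function names and addresses are made up for illustration, and any packet-information prefix a real TUN/TAP device may add is ignored.

```python
import socket
import struct


def parse_ethernet_frame(frame: bytes) -> dict:
    """Layer 2 view (TAP): destination MAC, source MAC, EtherType, payload."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dst_mac": ":".join(f"{b:02x}" for b in dst),
        "src_mac": ":".join(f"{b:02x}" for b in src),
        "ethertype": ethertype,
        "payload": frame[14:],
    }


def parse_ipv4_packet(packet: bytes) -> dict:
    """Layer 3 view (TUN): the data starts directly at the IPv4 header."""
    fields = struct.unpack("!BBHHHBBH4s4s", packet[:20])
    return {
        "version": fields[0] >> 4,
        "ttl": fields[5],
        "protocol": fields[6],
        "src": socket.inet_ntoa(fields[8]),
        "dst": socket.inet_ntoa(fields[9]),
    }


# Hand-built sample: a minimal IPv4 header wrapped in an Ethernet header.
ip_packet = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, 64, 6, 0,
                        socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"))
frame = (bytes.fromhex("ffffffffffff") + bytes.fromhex("020000000001")
         + struct.pack("!H", 0x0800) + ip_packet)

l2 = parse_ethernet_frame(frame)         # what a TAP reader would see
l3 = parse_ipv4_packet(l2["payload"])    # what a TUN reader would see
print(l2["src_mac"], hex(l2["ethertype"]), l3["src"], "->", l3["dst"])
```

A bridge only needs the MAC addresses from the first parser, which is why TAP pairs with bridging; a router only needs the IP addresses from the second, which is why TUN pairs with routing.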
VDE(Virtual Distributed Ethernet) and VDE Switch
IPTables vs. Bridging
• IPTables – lets the host forward packets between the taps, each on its own subnet
• Bridging – lets all the taps connect to a specific bridge to put them on the same subnet
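Which of the two approaches applies can be read off the addressing plan. The following sketch uses made-up addresses and a hypothetical helper name: two tap interfaces on the same subnet can talk through a bridge, while taps on separate subnets need the host to forward between them.

```python
import ipaddress


def needs_host_forwarding(tap_a: str, tap_b: str) -> bool:
    """True when the two interfaces are on different subnets (routing case)."""
    net_a = ipaddress.ip_interface(tap_a).network
    net_b = ipaddress.ip_interface(tap_b).network
    return net_a != net_b


# Bridging case: both taps share one subnet.
print(needs_host_forwarding("192.168.100.11/24", "192.168.100.12/24"))
# IPTables case: each tap sits on its own small subnet.
print(needs_host_forwarding("10.0.1.2/30", "10.0.2.2/30"))
```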
[Figure: three physical machines, each with a NIC and three VMs whose eth interfaces attach to a software bridge (br100). Labels in the figure: VLAN-1, public IP, private IP from dnsmasq, supports VLAN tagging, manual config, bridge (br101), dnsmasq, auto eth0, VPN VM, Nova users, dhcpdiscover]
① Flat Mode
• manual config. of bridge
• get fixed public IP from the pool
② Flat DHCP Mode
• auto config. of bridge
• get fixed public IP
③ VLAN DHCP Mode (default)
• auto config. of bridge
• auto config. of VLAN: range of private IPs for project VLAN
• get fixed private IP: iptables + NAT (private/public)
• VLAN: cloudpipe (= OpenVPN VM template, TAP/TUN)
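The per-project bookkeeping implied by VLAN DHCP mode can be sketched as follows. This is not Nova code: the class, the default address range, the starting VLAN ID, and the subnet size are all assumptions chosen so the bridge names match the br100/br101 labels in the figure.

```python
import ipaddress
from itertools import count


class VlanAllocator:
    """Hand each project its own VLAN ID, bridge name, and private subnet."""

    def __init__(self, fixed_range="10.0.0.0/16", vlan_start=100, prefix=24):
        # Assumed defaults for illustration only.
        self._subnets = ipaddress.ip_network(fixed_range).subnets(new_prefix=prefix)
        self._vlans = count(vlan_start)
        self.projects = {}

    def network_for(self, project: str) -> dict:
        if project not in self.projects:
            vlan = next(self._vlans)
            self.projects[project] = {
                "vlan": vlan,
                "bridge": f"br{vlan}",
                "subnet": next(self._subnets),
            }
        return self.projects[project]


allocator = VlanAllocator()
for project in ("alpha", "beta"):
    net = allocator.network_for(project)
    print(project, net["vlan"], net["bridge"], net["subnet"])
```

Each project's VMs would then receive private addresses from that subnet via DHCP, with NAT translating between private and public addresses as the slide describes.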
OpenStack Nova Network Virtualization
CloudStack Network Virtualization
Virtual Private Network for Each Account
CloudStack Network Virtualization
Detail Virtual Private Network in Node A and Node B
Hurdles in Network Virtualization
• L2 Network
  – Problem: Scalability, Performance, Security
  – Solution: VLAN (for Scalability and Security), RBridge (for Scalability and STP Limitation), L2 over L3
• Multi-tier Networking Design vs. Migration Limitation
  – Limitation of Spanning Tree Protocol
    • Keep Layer 2 networks relatively small and join them together via Layer 3 segments
    • But VM migration cannot be live across the multi-tier networks
  – Port Consistency
    • Map the settings such as VLAN, ACL, QoS, and security profiles to all the network ports
    • But some VMs are not able to meet required service levels
• L2 (Switching) and L3 (Routing) Networking Design
  – Scalability and Efficiency on the service provider side
    • Amazon EC2 using L3 – 500,000 VM on 60,000 PM
  – Legacy Support on the service consumer side
    • Amazon VPC, 3Tera AppLogic
      – Define virtual network topology
      – Select IP address range
      – Create public subnets and private subnets
      – Configure route table and network gateway
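The consumer-side steps (select an address range, create public and private subnets, configure a route table and gateway) can be sketched with plain address arithmetic. The address range, target names, and helper function below are assumptions for illustration and are not calls to any provider's API.

```python
import ipaddress

# Step 1: select an IP address range for the virtual network (assumed value).
vpc_range = ipaddress.ip_network("10.0.0.0/16")

# Step 2: carve out one public and one private subnet.
subnets = vpc_range.subnets(new_prefix=24)
public_subnet = next(subnets)
private_subnet = next(subnets)

# Step 3: a route table with a local route and a default route to a gateway.
route_table = [
    (vpc_range, "local"),
    (ipaddress.ip_network("0.0.0.0/0"), "internet-gateway"),
]


def next_hop(destination: str, routes) -> str:
    """Pick the most specific matching route (longest prefix match)."""
    addr = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in routes if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(public_subnet, private_subnet)
print(next_hop("10.0.1.7", route_table))
print(next_hop("8.8.8.8", route_table))
```

Traffic between the two subnets stays on the local route, while everything else falls through to the default route and leaves via the gateway.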
Cloud Platform
Cloud Technologies
Anatomy of Cloud Technologies
• API Server – acts as the Web services front end for the cloud controller
• Compute Controller – provides compute server resources
• Object Store – provides storage services
• Auth Manager – provides authentication and authorization services
• Volume Controller – provides fast and permanent block-level storage for the compute servers
• Cloud Controller – represents the global state and interacts with all other components
• Network Controller – provides virtual networks to enable compute servers to interact with each other and with the public network
• Scheduler – selects the most suitable compute controller to host an instance
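The Scheduler's role can be illustrated with a minimal placement function. This is a sketch under stated assumptions, not Nova's scheduler: the data class, the field names, and the "most free RAM wins" policy are all hypothetical choices.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ComputeNode:
    name: str
    free_ram_mb: int
    free_vcpus: int


def pick_host(nodes: List[ComputeNode], ram_mb: int, vcpus: int) -> Optional[ComputeNode]:
    """Filter out nodes that cannot fit the instance, then prefer the most free RAM."""
    candidates = [n for n in nodes
                  if n.free_ram_mb >= ram_mb and n.free_vcpus >= vcpus]
    if not candidates:
        return None  # no compute controller can host this instance
    return max(candidates, key=lambda n: n.free_ram_mb)


nodes = [
    ComputeNode("node-a", free_ram_mb=4096, free_vcpus=2),
    ComputeNode("node-b", free_ram_mb=16384, free_vcpus=8),
    ComputeNode("node-c", free_ram_mb=1024, free_vcpus=1),
]
chosen = pick_host(nodes, ram_mb=2048, vcpus=2)
print(chosen.name if chosen else "no host available")
```

Other policies (fewest running instances, random choice, affinity rules) would fit the same filter-then-rank shape.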
OpenStack Nova Architecture
[Figure: Management Servers (Mgmt. Server VMs on Host A and Host B, behind a Load Balancer) manage a Zone. Within the Zone, Computing Nodes (Host X, Host Y, Host Z) run Guest VMs with attached Volumes on Primary Shared Storage holding VM Images; Guest VMs can be live-migrated and Computing Nodes added dynamically. Secondary Shared Storage holds Templates, ISO images, and snapshots of VM Images (copy, create, boot, attach).]
• Guest VM: max. 6 Volumes per Guest VM
• Cluster: max. 16 Computing Nodes per Cluster; Computing Nodes should be in the same subnet, and homogeneous
• Pod: Computing Nodes should be in the same subnet, and have no limit to number of nodes
CloudStack Architecture
Conclusion
Top 10 Cloud Obstacles and Opportunities
• A View of Cloud Computing, ACM, April, 2010
2011 Predictions of IaaS, PaaS, and NoSQL
• IaaS Prediction
  – Hybrid is the way to go: the public-private cloud discussion isn’t relevant anymore
  – OpenStack will dominate the open IaaS offering
• PaaS Prediction
  – 2011 is the year of PaaS
    • CloudFoundry – VMware
    • OpenShift – Red Hat
  – A new PaaS category will emerge – Building your own PaaS
    • CEAP (Cloud Enabled Application Platform) is being specifically designed to handle multi-tenancy, scalability, and on-demand provisioning, but not a higher degree of flexibility and control
  – Application servers will change their name to PaaS – But won’t change their stripes
  – VMForce will fail to deliver on its promise => Already open Cloud
• NoSQL (+Big Data) predictions
  – NoSQL will become compatible with SQL
  – More applications will run entirely In-Memory
  – Real-time/stream-based analytics will replace the majority of MapReduce batch processing
    • e.g., Yahoo S4, Google’s Percolator
written by Nati Shalom at Gigaspaces http://natishalom.typepad.com/nati_shaloms_blog/2010/12/2011-cloud-paas-nosql-predictions.html
Thank you.