Building a Big IaaS Cloud
David Nalley @ke4qqq
#whoami
• Recovering Sysadmin
• F/LOSS contributor
• Committer on Apache CloudStack
Assumptions
• You have a need for an IaaS compute cloud platform
• You know what ‘IaaS’ and ‘cloud’ mean
Massively scalable
• Scalable - this is the easy part
• Massively - this part is much harder. Getting to thousands of physical hosts is complex; getting to tens of thousands of physical hosts is a problem of a completely different magnitude.
So I have some questions
Cloud
Built for traditional enterprise apps & client-server compute
• Scale-up (pool-based resourcing)
• IT management-centric
• 1 administrator for 100s of servers
• Proprietary vendor stack
Designed around big data, massive scale & next-gen apps
• Scale-out (horizontal resourcing)
• Autonomic management
• 1 administrator for 1,000s of servers
• Open, value-added stack
Virtualization alone does not make a cloud
Server Virtualization
CloudStack Overview
• CloudStack is an open source Infrastructure-as-a-Service (IaaS) orchestration platform that enables users to build, manage and deploy compute cloud environments.
• CloudStack was recently donated by Citrix to the Apache Software Foundation and is currently undergoing incubation.
What is Apache CloudStack?
• CloudStack offers an administrator's Web interface, used for provisioning and managing the cloud, as well as an end-user's Web interface, used for running VMs and managing VM templates.
• The UI can be customized to reflect the desired service provider or enterprise look and feel.
Graphical User Interface
• CloudStack Web Services Query HTTP API is loosely based on the REST architecture and allows developers to create new management solutions or integrate existing systems with CloudStack. It supports output in both XML and JSON.
• EC2/S3 support (translation layer) is also present.
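A minimal sketch of calling the query API from Python. The signing steps used here (sort the parameters, lowercase the query string, HMAC-SHA1 with the secret key, Base64-encode) are an assumption based on the commonly documented CloudStack scheme, so check them against the developer guide for your version. The endpoint URL and both keys are placeholders, and `sign_request` and `build_signed_url` are hypothetical helper names.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def sign_request(params: dict, secret_key: str) -> str:
    """Return a Base64 HMAC-SHA1 signature over the sorted, lowercased query string."""
    # Sort by parameter name (case-insensitive) and URL-encode each value.
    pairs = sorted(params.items(), key=lambda kv: kv[0].lower())
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in pairs)
    digest = hmac.new(secret_key.encode(), query.lower().encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def build_signed_url(endpoint: str, command: str, api_key: str, secret_key: str, **args) -> str:
    """Build a GET URL for one API command, requesting JSON output."""
    params = {"command": command, "response": "json", "apiKey": api_key, **args}
    signature = sign_request(params, secret_key)
    query = "&".join(f"{k}={quote(str(v), safe='')}" for k, v in params.items())
    return f"{endpoint}?{query}&signature={quote(signature, safe='')}"


url = build_signed_url(
    "http://cloud.example.com:8080/client/api",  # hypothetical endpoint
    "listVirtualMachines",
    api_key="EXAMPLE_API_KEY",        # placeholder
    secret_key="EXAMPLE_SECRET_KEY",  # placeholder
)
print(url)
```

Switching `response` to `xml` would request the XML output mentioned above.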
Benefits of CloudStack
Self Service
Capital Leverage
Workforce Leverage
Management Automation
Workload Standardization
Remove IT as a service delivery critical path
Reduce IT operational costs
Consistent application and service deployment
Usage Metering
Centralized Management
Smarter Virtualization
Visibility into user and line of business usage
Manage complete infrastructure, regardless of scale
Drive reduced capital requirements
Create Custom Virtual Machines via Service Offerings
Dashboard Provides Overview of Consumed Resources
• Running, Stopped & Total VMs
• Public IPs
• Private networks
• Latest Events
Virtual Machine Management
Users
Start
Stop
Restart
Destroy
VM Operations Console Access
• CPU Utilized
• Network Read
• Network Writes
VM Status Change Service Offering
2 CPUs
1 GB RAM
20 GB
20 Mbps
4 CPUs
4 GB RAM
200 GB
100 Mbps
Volume & Snapshot Management
Volume
VM 1
Add / Delete Volumes
Schedule Snapshots
Hourly
Daily
Weekly
Monthly
Now
Create Templates from Volumes
Volume Template
View Snapshot History 12/2/2012 7.30 am…. 2/2/2012 7.30 am
Network & Network Services
• Create Networks and attach VMs
• Acquire public IP address for NAT & load balancing
• Control traffic to VM using ingress and egress firewall rules
• Set up rules to load balance traffic between VMs
CloudStack Architecture
Availability and Security
Servers Network Storage
Virtualization Layer
Service Management (Metering, Accounts, etc.)
Resource Management
Servers Storage Network
Dynamic Workload Management
snapshots LB HA Monitoring
User Interface Developer API Amazon*
Image Libraries
Application Catalog
Custom Templates
Operating System ISOs
Integration API
Operational Integration (OSS/BSS, Monitoring, Identity Management, etc.)
Administrator End User Console
Zone
Zone
Zone
Cloud Infrastructure Overview - Summary
• One or more hosts grouped into a cluster
• One or more clusters grouped into a pod
• One or more pods grouped into a zone
• One or more zones controlled by one management server
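The containment hierarchy in this summary can be sketched as nested data classes. This is only an illustrative model: the class and field names are hypothetical and do not reflect CloudStack's internal schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    name: str


@dataclass
class Cluster:
    name: str
    hosts: List[Host] = field(default_factory=list)


@dataclass
class Pod:
    name: str
    clusters: List[Cluster] = field(default_factory=list)


@dataclass
class Zone:
    name: str
    pods: List[Pod] = field(default_factory=list)

    def host_count(self) -> int:
        # Walk zone -> pods -> clusters and count the hosts in each cluster.
        return sum(len(c.hosts) for p in self.pods for c in p.clusters)


@dataclass
class ManagementServer:
    zones: List[Zone] = field(default_factory=list)

    def host_count(self) -> int:
        return sum(z.host_count() for z in self.zones)


cluster = Cluster("cluster-1", [Host("host-1"), Host("host-2")])
zone = Zone("zone-1", [Pod("pod-1", [cluster])])
ms = ManagementServer([zone])
print(ms.host_count())
```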
Pod
Secondary Storage
MySQL Cloud_db
Management Server
• Hosts: Servers onto which services will be provisioned
• Primary Storage: VM disk storage
• Cluster: A grouping of hosts and their associated storage
• Pod: Collection of clusters in the same failure boundary
• Network: Logical network associated with service offerings
• Secondary Storage: Template, snapshot and ISO storage
• Zone: Collection of pods, network offerings and secondary storage
• Management Server Farm: Management and provisioning tasks
Components
Zone
CloudStack Pod
Cluster
Host
Host
Network
Primary Storage
VM
VM
CloudStack Pod
Cluster
Secondary Storage
CloudStack Infrastructure - Overview
• CloudStack provides a number of ‘infrastructure’ pieces, external to the management server, that provide scalable services.
• Secondary Storage (SSVM)
• Console Proxy (CPVM)
• Virtual Router (VR or domR)
Secondary Storage
• Secondary Storage - provides storage for machine images and snapshots
• Secondary Storage VM - provides stateless and scalable management and interaction with Secondary Storage.
Console Proxy
• Hypervisors provide access to the ‘console’ of a virtual machine, generally via VNC.
• Accessing that console requires direct access to the hypervisor, including hypervisor credentials.
• CPVM proxies access to the VNC session and provides access control so that others can’t get access.
• Automatically scales to handle demand of console sessions.
• Provides an AJAX interface that is usable on virtually any device.
Virtual router
• Lowest common denominator (so far) is a virtual machine.
• Provides a number of services
• DHCP
• Routing
• DNS
• Load balancing
• Firewall
• NAT
VMOps Pod VMOps Pod VMOps Pod CloudStack Pod CloudStack Pod
Availability Zone
CloudStack Pod CloudStack Pod
CloudStack Scale
San Jose
Austin
Frankfurt
Tokyo
Availability Zones Deployed Globally
CloudStack Cluster
CloudStack Cluster
San Jose
Austin
Frankfurt
Tokyo
Private Delhi
Private Rio
Availability Zones Can be Private
Management Server Managing Multiple Zones
Zone 1
Data Center 1
Data Center 2
Zone 3
Zone 2
Data Center 3
Zone 4
Management Server
• Single Management Server can manage multiple zones
• Zones can be geographically distributed but low latency links are expected for better performance
• Single MS node can manage up to 5K hosts.
• Multiple MS nodes can be deployed as cluster for scale or redundancy
Data Center 1
Multi-Site Deployment
Availability Zone 1
Primary Management Server
Data Center 2
Secondary Management Server
Data Center 3
Data Center 4
Availability Zone 2
Availability Zone 3
Availability Zone 4
Deployment Architectures
Deployment Architecture
• The architecture used in a deployment will vary depending on the size and purpose of the deployment.
• From a small-scale deployment useful for dev/test and PoC deployments
• To a fully-redundant large-scale setup for production deployments.
Management Server Deployment Architecture
Management Server
MySQL DB
Backup DB
Infrastructure Resources
User API
Admin API
Load Balancer
Management Server
Management Server
MySQL DB
Infrastructure Resources
User API
Admin API
Single-node Deployment Multi-node Deployment
• MS is stateless. MS can be deployed as a physical server or VM
• Single MS node can manage up to 8K hosts. Multiple nodes can be deployed for scale or redundancy
Replication
NFS Server
Small-Scale Deployment
Secondary Storage
Primary Storage
Computing Nodes
Management Server
Layer-2 Switch
Router & Firewall
Public IP 62.43.51.125
Internet
192.168.10.0/24
192.168.10.10 to 192.168.10.13
192.168.10.3 192.168.10.4
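A quick sanity check of the address plan on this slide, using Python's standard `ipaddress` module. The slide does not say which device holds which address, so the sketch only verifies that the private addresses fall inside 192.168.10.0/24 and that the public IP does not. The variable names are illustrative.

```python
import ipaddress

subnet = ipaddress.ip_network("192.168.10.0/24")

# Contiguous range shown on the slide: 192.168.10.10 to 192.168.10.13
first = ipaddress.ip_address("192.168.10.10")
last = ipaddress.ip_address("192.168.10.13")
range_hosts = [ipaddress.ip_address(int(first) + i) for i in range(int(last) - int(first) + 1)]

# The two standalone private addresses and the public IP from the slide
others = [ipaddress.ip_address("192.168.10.3"), ipaddress.ip_address("192.168.10.4")]
public = ipaddress.ip_address("62.43.51.125")

all_private = range_hosts + others
print(len(range_hosts))                       # number of addresses in the range
print(all(a in subnet for a in all_private))  # every private address is in the /24
print(public in subnet)                       # the public IP sits outside it
```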
Large-Scale Redundant Deployment
Primary Storage
NFS/Swift
Secondary Storage
Management Server Cluster
Layer-3 switches with firewall modules
Layer-2 switches
Internet
Primary Storage
Primary Storage
Primary Storage
NFS/Swift
Secondary Storage
Internet
Computing Nodes
Primary Storage Servers
Secondary Storage Servers
The Three C’s of Complexity
• Control
• Choice
• Compliance
Compute
Giving Control Brings Complexity
Network Storage
Admin
Users
Org A
Admin
Users
Org B
Users
End User
Admin
VMware
XenServer
KVM
NFS
iSCSI
FC
NetScaler
F5
Juniper SRX
Local Disk
Swift
HDFS
• ACL
• Limits
• Governance
Bare Metal
Guest Virtual Layer-2 Network
Guest 1 VM 1
Guest 1 VM 2
Guest 1 VM 3
Guest 1 Virtual Network 10.1.1.0/24
Gateway 10.1.1.1
Guest 10.1.1.2
Guest 10.1.1.3
Guest 10.1.1.4
Guest 1 Virtual Router
Guest 2 VM 1
Guest 2 VM 2
Guest 2 VM 3
Guest 2 Virtual Network 10.1.1.0/24
Gateway 10.1.1.1
Guest 10.1.1.2
Guest 10.1.1.3
Guest 10.1.1.4
Guest 2 Virtual Router
Public IP 65.37.141.24, 65.37.141.80
Public IP 65.37.141.11, 65.37.141.36
Internet
Multi-tier Network
Private IP 10.1.1.112
DHCP, DNS, User-data
Public IP 65.37.141.112
10.1.1.1 Web VM 1
10.1.1.3 Web VM 2
10.1.1.4 Web VM 3
10.1.1.5 Web VM 4
NetScaler Load Balancer
Private IP 10.1.1.111
Public IP 65.37.141.111
Juniper SRX Firewall
Virtual Router
Virtual Network 10.1.1.0/24, VLAN 100
Virtual Network 10.1.2.0/24, VLAN 1001
10.1.2.21
10.1.2.18
10.1.2.38
10.1.2.39
10.1.2.31 App VM 1 10.1.3.21
Virtual Network 10.1.3.0/24, VLAN 141
10.1.2.24 App VM 2 10.1.3.45
10.1.3.24 DB VM 1
DHCP, DNS, User-data
DHCP, DNS, User-data, Source-NAT, VPN
Public IP 65.37.141.115
Virtual Router
Virtual Router
Unified Multi-tier Network
10.1.1.1 Web VM 1
10.1.1.3 Web VM 2
10.1.1.4 Web VM 3
10.1.1.5 Web VM 4
Virtual Network 10.1.1.0/24, VLAN 100
Virtual Network 10.1.2.0/24, VLAN 1001
10.1.2.31 App VM 1
Virtual Network 10.1.3.0/24, VLAN 141
10.1.2.24 App VM 2
10.1.3.24 DB VM 1
Virtual Router
Customer Premises
IPsec or SSL site-to-site VPN
Internet
Monitoring VLAN
Virtual Router Services
• IPAM
• DNS
• LB [intra]
• S-2-S VPN
• Static Routes
• ACLs
• NAT, PF
• FW [ingress & egress]
• BGP
Load Balancer
Other Topologies
Guest Virtual Network 10.1.1.0/24, VLAN 100
Gateway address 10.1.1.1
10.1.1.1 Guest VM 1
10.1.1.3 Guest VM 2
10.1.1.4 Guest VM 3
10.1.1.5 Guest VM 4
Guest Virtual Network 10.1.1.0/24, VLAN 100
DHCP, DNS, User-data
10.1.1.1 Guest VM 1
10.1.1.3 Guest VM 2
10.1.1.4 Guest VM 3
10.1.1.5 Guest VM 4
No services [Static IPs]
Dedicated VLAN with DHCP and DNS
User can request specific IP[s] for NIC
Core switch
Gateway address 10.1.1.1
Core switch
Virtual Router
Other Topologies
Guest Virtual Network 10.1.1.0/24, VLAN 100
Gateway address 10.1.1.1
10.1.1.100 Guest VM 1
10.1.1.200 Guest VM 2
10.1.1.101 Guest VM 3
10.1.1.115 Guest VM 4
Guest Virtual Network 10.1.1.0/24, VLAN 100
DHCP, DNS, User-data
10.1.1.1 Guest VM 1
10.1.1.3 Guest VM 2
10.1.1.4 Guest VM 3
10.1.1.5 Guest VM 4
MPLS Use Case
Shared VLAN with DHCP and DNS
CS Virtual Router
Core switch
Gateway address 10.1.1.1
Core switch
MPLS VLAN 100
DHCP, DNS, User-data
CS Virtual Router
…
DB Security Group
Web Security Group
Layer 3 Networking (Amazon Style)
… …
Web VM
Web VM
Web VM
Web VM
DB VM
Web VM
DB VM
Web VM
CloudStack Mgmt. Server
MySQL (Master)
Per Availability Zone
Virtual Router SSVM CPVM
Secondary Storage
KVM XenServer vCenter
HTTP File Share
Per Customer Per Pod / Cluster
MySQL (Slave)
CloudStack Mgmt. Server
3306
9090
8250
User/API
8080
3922
8250
3306
111/2049
111/2049
80/443
443
22/443
22
Making it all scale
Thinking about cloud orchestration at scale
• Host management
• Capacity management
• What host to use to deploy a new VM
• Failure handling
• Security group propagation
• Set a goal
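To make "what host to use to deploy a new VM" concrete, here is a deliberately simple first-fit placement sketch. It is not CloudStack's allocator: `HostCapacity` and `pick_host` are hypothetical names, and a real orchestrator would also consider storage, network, tags and much more. The request uses the 4 CPU / 4 GB RAM service offering shown earlier in the deck.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class HostCapacity:
    name: str
    free_cpus: int
    free_ram_gb: int
    up: bool = True  # failure handling: hosts marked down are skipped


def pick_host(hosts: List[HostCapacity], cpus: int, ram_gb: int) -> Optional[HostCapacity]:
    """First fit: reserve capacity on the first healthy host that can hold the VM."""
    for host in hosts:
        if host.up and host.free_cpus >= cpus and host.free_ram_gb >= ram_gb:
            host.free_cpus -= cpus
            host.free_ram_gb -= ram_gb
            return host
    return None  # no capacity anywhere


hosts = [
    HostCapacity("host-1", free_cpus=2, free_ram_gb=2),             # too small
    HostCapacity("host-2", free_cpus=8, free_ram_gb=16, up=False),  # down
    HostCapacity("host-3", free_cpus=8, free_ram_gb=16),
]
placed = pick_host(hosts, cpus=4, ram_gb=4)
print(placed.name if placed else "no capacity")
```

A linear scan like this is fine for a handful of hosts; at tens of thousands of hosts the scan itself and the capacity bookkeeping behind it become part of the scaling problem.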
CPU utilization while deploying 30,000 VMs on 30,000 hosts
[Figure: CPU utilization (400% is maximum) over time; plot annotations: 20,000, 5000, 5000, Idle]
Deploy time from 25,000 to 30,000 VMs
[Figure: seconds to deploy versus VM number (25,000 plus X)]
Storage at scale
• Storage is cluster specific (typically 8-16 nodes)
• Scaling out with a SAN typically does not work well. Some newer-generation products help, but keeping up I/O at thousands of nodes remains a daunting problem.
• Distributed filesystems - they are better...but....
• Local storage - failure prone, but cheap, and scales easily with the number of nodes.
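Some back-of-the-envelope arithmetic shows why cluster-specific storage gets hard at this scale. It assumes the 30,000-host target from the earlier slides and the 8-16 node cluster sizes given above; `clusters_needed` is a hypothetical helper.

```python
def clusters_needed(hosts: int, hosts_per_cluster: int) -> int:
    """Ceiling division: every host must belong to some cluster."""
    return -(-hosts // hosts_per_cluster)


for size in (8, 16):
    # Each cluster has its own primary storage to provision and monitor.
    print(size, clusters_needed(30_000, size))
```

Even at the larger cluster size, that is well over a thousand separate primary storage pools to provision and keep healthy.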