Commercial in Confidence www.metron-athene.com
Virtualisation Oversubscription (What’s so scary?)
Topics
• What led me here
• Oversubscription Overview
• CPU Oversubscription
• Memory Oversubscription
• What’s the worst that can happen? (Queueing theory, the simple version)
Overcommit vs Oversubscribe
• Overcommit = Oversubscribe
What led me here
• Clients
– “Oh, we don’t oversubscribe”
• Fear
• Misunderstanding
Flying Navigation by Dead Reckoning
• You know where you started
• You know how long you flew for
• You know your air speed
• You know what direction you flew in
• What if the wind changed in the last 8 hours?
• WW2 bombing saw 1 in 5 bomb loads within 5 miles of the target.
Virtualisation Used Capacity by Dead Reckoning
• You know what you started with
• You know what you provisioned
• You know how much is left
• Not especially efficient
Oversubscription
• Allocating more than you have
– Thin Provisioning
– Deduplication & Compression
[Diagram: three comparisons of capacity: Allocated vs Exists, Allocated vs Exists, Allocated vs Used]
What can be oversubscribed?
• CPUs
• Memory
• Disk
• NICs
– Nobody ever seems to think about that one
– VMs on a single host = no NIC involved
– Otherwise…
CPU VMware Maximums
• Virtual Machine Maximum
– 128 vCPUs per VM
• Host CPU maximums
– Logical CPUs per host: 480
– Virtual machines per host: 1024
– Virtual CPUs per host: 4096
– Virtual CPUs per core: 32
• The achievable number of vCPUs per core depends on the workload and specifics of the hardware. For more information, see the latest version of Performance Best Practices for VMware vSphere
https://www.vmware.com/pdf/vsphere6/r60/vsphere-60-configuration-maximums.pdf
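As a rough illustration of how these limits relate to a planned host, the sketch below checks a hypothetical placement against the figures quoted on this slide. The function names and the sample host are assumptions made for the example, and passing the check says nothing about whether performance will be acceptable.

```python
# Limits copied from the slide above (vSphere 6.0 configuration maximums).
VSPHERE60_MAX = {
    "vcpus_per_vm": 128,
    "vms_per_host": 1024,
    "vcpus_per_host": 4096,
    "vcpus_per_core": 32,
}


def vcpus_per_core(cores, vm_vcpus):
    """Oversubscription ratio: total vCPUs divided by physical cores."""
    return sum(vm_vcpus) / cores


def check_host(cores, vm_vcpus):
    """Return a list of limit violations for a hypothetical host.

    cores    -- number of physical cores in the host
    vm_vcpus -- list holding the vCPU count of each VM placed on it
    """
    problems = []
    if len(vm_vcpus) > VSPHERE60_MAX["vms_per_host"]:
        problems.append("too many VMs on the host")
    if any(v > VSPHERE60_MAX["vcpus_per_vm"] for v in vm_vcpus):
        problems.append("a VM exceeds the per-VM vCPU maximum")
    if sum(vm_vcpus) > VSPHERE60_MAX["vcpus_per_host"]:
        problems.append("too many vCPUs on the host")
    if vcpus_per_core(cores, vm_vcpus) > VSPHERE60_MAX["vcpus_per_core"]:
        problems.append("vCPU to core ratio exceeds the maximum")
    return problems


# Hypothetical host: 16 cores running forty 4-vCPU VMs (10 vCPUs per core).
print(vcpus_per_core(16, [4] * 40), check_host(16, [4] * 40))
```

The maximums are a ceiling, not a target: a 10:1 ratio can be fine or painful depending on how busy the vCPUs actually are.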
Memory VMware Maximums
• 6TB per Host
– Well 12TB on specific hardware
• 4TB per VM
Memory Oversubscription
• How?
– Free Space
– Page Sharing
– Balloon Driver (VMware)
– Reservations
– Shares
Memory
• Transparent Page Sharing
– Deduplication in memory
• Balloon Driver
– The vmmemctl driver “steals” memory inside the VM, allowing that memory to be used by other VMs. This may cause the guest OS to page.
• VMkernel Swap
– The VM thinks its pages are in memory. ESX has put that memory on disk in a VMkernel swap file.
– “Performance is NOT optimal”
Transparent Page Sharing
[Diagram: VM1 and VM2 running on ESX]
Balloon Driver (vmmemctl)
[Diagram: VM1 and VM2 running on ESX]
Memory test
• Memory vs. disk speed is…?
– A) Memory is 100x faster than disk
– B) Memory is 1,000x faster than disk
– C) Memory is 10,000x faster than disk
– D) Memory is 100,000x faster than disk
– E) Memory is 1,000,000x faster than disk
– F) I have no memory of the event, your honour
VMkernel Swap
[Chart: stacked bar from 0% to 100% of VM memory, split into Balloon, Swap File and Reservation MB]
Example:
• Assume maximum memory contention
• By default, up to 65% can be reclaimed by the balloon driver
• Example reservation is 30%
• The remaining 5% goes to the VMkernel swap (.vswp) file
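The arithmetic behind this example can be sketched as follows. This is a simplified illustration that assumes the reservation is honoured first, the balloon driver then reclaims up to its default 65% limit, and whatever is left goes to the swap file; the function name is made up for the example.

```python
def reclaim_split(reservation_pct, balloon_max_pct=65.0):
    """Split a VM's configured memory under maximum contention.

    Returns (reservation, balloon, swap) as percentages of configured
    memory. Simplified model: reserved memory is never reclaimed, the
    balloon driver takes up to balloon_max_pct, and the rest is swapped.
    """
    if not 0 <= reservation_pct <= 100:
        raise ValueError("reservation_pct must be between 0 and 100")
    unreserved = 100.0 - reservation_pct
    balloon = min(balloon_max_pct, unreserved)
    swap = unreserved - balloon
    return float(reservation_pct), balloon, swap


# The slide's example: a 30% reservation leaves 65% balloon and 5% swap.
print(reclaim_split(30))
```

Raising the reservation shrinks the slice that can end up in the swap file, which is the slowest of the three places for memory to live.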
Memory
433MB Active Memory
2.6GB Unique Memory
1.4GB Shared Memory
50MB Balloon Driver Memory
150MB ESX Overhead for the VM
Reservations
• Resource Pools or VMs
• If they want it, they get it
• If they don’t want it, it’s available to all
• Cannot reserve more than exists
• Oversubscribe
– Protect core VMs with a reservation
Memory Idle Tax
• Memory has Shares
• Memory Tax associates a value to each page used
• Default Idle Tax rate is 75%
• This makes idle memory cost 4 times as many shares as active memory
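Where the "4 times" comes from: an idle page is charged 1 / (1 - tax rate), so a 75% tax gives 1 / 0.25 = 4. The sketch below also shows an adjusted shares-per-page ratio in the form described in published accounts of ESX memory management; treat the second function as an assumption about the general shape of the calculation, not as the exact ESX implementation.

```python
def idle_page_cost(tax_rate=0.75):
    """Cost of an idle page relative to an active page."""
    if not 0 <= tax_rate < 1:
        raise ValueError("tax_rate must be in [0, 1)")
    return 1.0 / (1.0 - tax_rate)


def shares_per_page(shares, pages, active_fraction, tax_rate=0.75):
    """Adjusted shares-per-page ratio for a VM (assumed formula).

    Idle pages are weighted by idle_page_cost, so a VM hoarding idle
    memory ends up with a lower ratio and is reclaimed from first.
    """
    k = idle_page_cost(tax_rate)
    weighted_pages = pages * (active_fraction + k * (1.0 - active_fraction))
    return shares / weighted_pages


print(idle_page_cost())                    # default 75% tax
print(shares_per_page(1000, 1000, 1.0))    # fully active VM
print(shares_per_page(1000, 1000, 0.0))    # fully idle VM, same shares
```

Two VMs with identical shares and sizes therefore rank very differently under contention if one is busy and the other is sitting on idle pages.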
CPU Oversubscription
• How?
– Time slicing
– Co-Scheduling
– Reservations
– Shares
– Limits
Time Slicing
• Cores are shared between vCPUs in time slices
– 1 vCPU to 1 core at any point in time
• More vCPUs = More time slicing
• Processes do this on CPUs all the time
– So why is it so scary?
– Over 100 processes on my laptop share 4 CPUs
[Diagram: VM1 time slices, Running vs Dormant/Idle]
VMware Processor Scheduling: vCPU Co-Scheduling & Ready Time
[Diagram: VMs placed across four numbered rows (1 to 4); legend: Idle, Ready, Threads]
Reservations\Shares\Limits
Reservations
[Chart: CPU Used by Production VM (0% to 100%) against CPU Used by Test VM (0% to 100%), with the Prod VM Reservation marked; the two VMs meet at 50% CPU each]
1) The Production VM wants to use all the CPU available.
2) The Test VM starts and also wants to use all the CPU available.
3) Each uses 50% CPU.
4) The Production VM wants 250MHz CPU while Test wants to use 4000MHz CPU. Production gets 100% of its request. Test does not.
Reservations & Shares
[Chart: CPU Used by Production VM (0% to 100%) against CPU Used by Test VM (0% to 100%), with the Prod VM Reservation marked; under contention Production gets 66% CPU and Test gets 33% CPU]
1) The Production VM (2000 Shares) wants to use all the CPU available.
2) The Test VM (1000 Shares) also wants to use all the CPU available.
3) Production gets 66% CPU, Test gets 33% CPU.
4) The Production VM wants 250MHz CPU while Test could still use 4000MHz CPU. Production gets 100% of its request. Test does not.
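The behaviour in these steps can be approximated with a small share-based allocator. This is a minimal sketch and not the ESX scheduler: it assumes a 4000 MHz host and a 1000 MHz Production reservation (neither figure is given on the slide), treats each reservation as a floor and each demand as a ceiling, and searches for the per-share "level" at which the grants add up to the host capacity.

```python
def allocate(capacity_mhz, vms):
    """Divide CPU between VMs by shares, honouring reservations.

    vms maps a VM name to a dict with 'demand', 'shares' and
    'reservation' (MHz, shares, MHz). Assumes reservations fit
    within capacity and shares are positive.
    """
    floor = {n: min(v["reservation"], v["demand"]) for n, v in vms.items()}
    ceiling = {n: v["demand"] for n, v in vms.items()}
    if sum(ceiling.values()) <= capacity_mhz:
        return dict(ceiling)  # no contention: everyone gets their demand

    def grant(level):
        # Share-proportional grant, clamped between floor and ceiling.
        return {n: min(max(level * vms[n]["shares"], floor[n]), ceiling[n])
                for n in vms}

    # Bisection on the MHz-per-share level until capacity is just used up.
    low, high = 0.0, capacity_mhz / min(v["shares"] for v in vms.values())
    for _ in range(100):
        mid = (low + high) / 2
        if sum(grant(mid).values()) > capacity_mhz:
            high = mid
        else:
            low = mid
    return grant(low)


# Step 3: both VMs want everything, shares are 2000 vs 1000.
busy = {
    "prod": {"demand": 4000, "shares": 2000, "reservation": 1000},
    "test": {"demand": 4000, "shares": 1000, "reservation": 0},
}
print(allocate(4000, busy))

# Step 4: Production only wants 250 MHz, Test still wants 4000 MHz.
quiet = dict(busy, prod={"demand": 250, "shares": 2000, "reservation": 1000})
print(allocate(4000, quiet))
```

Giving both VMs equal shares reproduces the 50/50 split from the previous slide; shares only matter when the VMs are actually competing.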
Expandable Reservation 1
Root (RP): Total CPU 10200 MHz
  Software (RP): Reservation 3000 MHz, Expandable: Yes
    Production (RP): Reservation 1200 MHz, Expandable: Yes
    Test (RP): Reservation 1000 MHz, Expandable: No
      VM1: Reservation 400 MHz
      VM2: Reservation 300 MHz
      VM7: Reservation 500 MHz
Why can’t VM7 start?
1200 MHz required. 1000 MHz available.
Expandable Reservation 2
Root (RP): Total CPU 10200 MHz
  Software (RP): Reservation 3000 MHz, Expandable: Yes
    3200MHz Requested, 3000MHz Reservation, 200MHz used by Test (RP)
    Production (RP): Reservation 1200 MHz, Expandable: Yes
      VM3: Reservation 500 MHz
      VM4: Reservation 500 MHz
      VM5: Reservation 500 MHz
      VM6: Reservation 500 MHz
      2000MHz Requested, 1200MHz Reservation, 2000MHz of Parent Used
    Test (RP): Reservation 1000 MHz, Expandable: Yes
      VM1: Reservation 400 MHz
      VM2: Reservation 300 MHz
      VM7: Reservation 500 MHz
      1200MHz Requested, 1000MHz Available In Parent. Where is the “extra” taken from?
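The two expandable reservation slides can be replayed with a toy admission check. This is a simplified sketch of the idea, not VMware's admission control: the `Pool` class and `scenario` function are hypothetical, Production and Test are assumed to be children of Software, and the Production VMs are assumed to power on before the Test VMs.

```python
class Pool:
    """Toy resource pool: a reservation that may borrow from its parent."""

    def __init__(self, name, reservation, expandable, parent=None):
        self.name = name
        self.reservation = reservation
        self.expandable = expandable
        self.parent = parent
        self.capacity = reservation  # reservation plus anything borrowed
        self.used = 0                # MHz handed to child pools and VMs
        if parent is not None and not parent.reserve(reservation):
            raise ValueError(f"parent cannot back the reservation of {name}")

    def reserve(self, mhz):
        """Try to reserve mhz here, borrowing any shortfall if allowed."""
        shortfall = self.used + mhz - self.capacity
        if shortfall > 0:
            if not (self.expandable and self.parent is not None):
                return False
            if not self.parent.reserve(shortfall):
                return False
            self.capacity += shortfall
        self.used += mhz
        return True


def scenario(test_expandable):
    """Replay the slides; return (VM7 started?, MHz Software borrowed)."""
    root = Pool("Root", 10200, expandable=False)
    software = Pool("Software", 3000, True, parent=root)
    production = Pool("Production", 1200, True, parent=software)
    test = Pool("Test", 1000, test_expandable, parent=software)
    for _ in range(4):        # VM3 to VM6, 500 MHz each (second slide)
        production.reserve(500)
    test.reserve(400)         # VM1
    test.reserve(300)         # VM2
    vm7_started = test.reserve(500)
    return vm7_started, software.capacity - software.reservation


print(scenario(test_expandable=False))  # Test cannot grow, VM7 is refused
print(scenario(test_expandable=True))   # the extra comes from Root
```

With Test expandable, its 200 MHz shortfall is passed up to Software, which is already fully committed and so borrows the same 200 MHz from Root.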
What’s the worst that can happen?
• Memory
– It fills up
– Then bad things happen
• CPU
– Bad things happen
– Then it’s full/maxed
– Queueing Theory
Contention and Queuing
• Finite system resources
• Single workstation = no contention (usually)
• More than One User = Possible Contention
• Contention = Queuing
– This is COMPLETELY NORMAL
– It’s how operating systems work.
• Excessive Queuing = Poor Performance and Long Response Times
Basic Ideas of Queuing
[Diagram: arriving customers/transactions (A) join a Queue, are handled by a Server, then leave (L). Response Time = Queuing Time (Q) + Service Time (S)]
Utilization and Response Time
[Chart: Response Time against Utilization (0 to 1.0); the curve starts at the Service Time when Utilization is 0]
R = S / (1 - U)
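To get a feel for the formula, the sketch below evaluates R = S / (1 - U) at a few utilisation levels. The 10 ms service time is an arbitrary figure chosen for the example.

```python
def response_time(service_time, utilisation):
    """Single-server response time, R = S / (1 - U)."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    return service_time / (1.0 - utilisation)


# Hypothetical 10 ms service time at increasing utilisation.
for u in (0.0, 0.5, 0.75, 0.9, 0.95):
    print(f"U = {u:.2f}  R = {response_time(10.0, u):6.1f} ms")
```

Response time doubles by 50% busy and is ten times the service time at 90% busy, which is why the last few percent of utilisation are the expensive ones.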
Benefits of Multiple Servers
[Chart: Response Time against Utilization (0 to 1.0) for Single CPU, Dual CPU and 16-way CPU, each starting at the Service Time]
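The slide does not say which model produced these curves. One common choice, assumed here, is an M/M/c queue (random arrivals, exponential service times, c identical CPUs fed from one queue), where the chance of having to wait is given by the Erlang C formula. The sketch compares 1, 2 and 16 CPUs at the same utilisation, with a service time of 1 unit.

```python
def erlang_c(servers, offered_load):
    """Probability that an arrival has to queue in an M/M/c system."""
    # Build Erlang B iteratively, then convert it to Erlang C.
    b = 1.0
    for k in range(1, servers + 1):
        b = offered_load * b / (k + offered_load * b)
    rho = offered_load / servers
    return b / (1.0 - rho * (1.0 - b))


def mmc_response_time(service_time, utilisation, servers):
    """Mean response time for an M/M/c queue at a given utilisation."""
    if not 0 <= utilisation < 1:
        raise ValueError("utilisation must be in [0, 1)")
    offered_load = utilisation * servers
    p_wait = erlang_c(servers, offered_load)
    mean_wait = p_wait * service_time / (servers * (1.0 - utilisation))
    return service_time + mean_wait


# Same utilisation, different numbers of CPUs.
for cpus in (1, 2, 16):
    times = [mmc_response_time(1.0, u, cpus) for u in (0.5, 0.8, 0.9)]
    print(cpus, [round(t, 2) for t in times])
```

With one CPU this reduces to R = S / (1 - U); with more CPUs the curve stays flat for longer before turning upwards, which matches the shape shown on the slide.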
Why are we interested in this queue stuff again?
• VMs Queue for free CPUs
– Ready Time
– Co-Stop time
– Higher utilisation = higher contention
– More concerned about CPU busy than vCPU to logical CPU ratio
– Because it’s maths, you can model it
Roundup
• Oversubscription does not equal unacceptable performance
• Virtualisation is expecting you to oversubscribe
– It’s the reason it exists
• Take the fear out of oversubscription through proper planning
– Plan for performance, not ratios