RMOUG 2013 - Where did my CPU go?
“Where did my CPU go?”
monitoring & capacity planning
adventures on a consolidated
environment
Presented by:
Karl Arao
whoami
Karl Arao
• Senior Technical Consultant @ Enkitec
• Performance and Capacity Planning Enthusiast
6 years 11 months 12 days DBA experience
Oracle ACE, OCP-DBA, RHCE, OakTable
Blog: karlarao.wordpress.com
Wiki: karlarao.tiddlyspot.com
Twitter: @karlarao
www.enkitec.com 2
Agenda
• HOWTO compare CPU speeds
• Cores vs Threads
• The different CPU events
• CPU Monitoring/Capacity Planning on consolidated environments
12:27:15 SYS@DEMO1> show parameter cpu_count

NAME                                 TYPE        VALUE
------------------------------------ ----------- --------
cpu_count                            integer     16
Socket0
  Core0: CPU0 CPU8
  Core1: CPU1 CPU9
  Core2: CPU2 CPU10
  Core3: CPU3 CPU11
Socket1
  Core0: CPU4 CPU12
  Core1: CPU5 CPU13
  Core2: CPU6 CPU14
  Core3: CPU7 CPU15
Exadata V2 => 2s8c16t (2 sockets, 8 cores, 16 threads)
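The numbering on this slide (CPU0 and CPU8 sharing Core0 of Socket0) can be reproduced with a little arithmetic. The sketch below assumes the enumeration order shown here, where logical CPUs 0-7 are the first thread of each core and 8-15 are the second thread. The function names and parameters are hypothetical, and other platforms may number their CPUs differently.

```python
def locate_cpu(cpu, sockets=2, cores_per_socket=4, threads_per_core=2):
    """Map a logical CPU number to (socket, core, thread) for a 2s8c16t host.

    Assumes the numbering on the slide: every core's first thread is
    numbered before any second (sibling) thread.
    """
    total_cores = sockets * cores_per_socket
    if not 0 <= cpu < total_cores * threads_per_core:
        raise ValueError(f"CPU {cpu} is outside the topology")
    thread = cpu // total_cores        # 0 = first thread, 1 = sibling thread
    core_index = cpu % total_cores     # position among all physical cores
    socket = core_index // cores_per_socket
    core = core_index % cores_per_socket
    return socket, core, thread


def siblings(cpu, sockets=2, cores_per_socket=4, threads_per_core=2):
    """Return every logical CPU that shares a physical core with `cpu`."""
    total_cores = sockets * cores_per_socket
    base = cpu % total_cores
    return [base + t * total_cores for t in range(threads_per_core)]


if __name__ == "__main__":
    for cpu in range(16):
        print(cpu, locate_cpu(cpu), siblings(cpu))
```

With the defaults, CPU12 resolves to Socket1/Core0 as the second thread, matching the layout on the slide.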
PART1: compare CPU speeds
Different methods:
• Published benchmarks
– TPC-C
– SPECint_rate2006
• Actual Benchmarking
– cputoolkit
– SLOB (lio test)
TPC-C
• Transaction Processing Performance Council (TPC)
• Throughput => transactions per minute (tpmC)
• Price/Performance => USD / tpmC
• CPU performance => tpmC / core
• 1609186.39 / 16 = 100574
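The per-core normalization above is a one-line division, but wrapping it in a helper makes it easy to apply to any published result. This is a minimal sketch; the function names are hypothetical, and only the 1609186.39 tpmC / 16 cores figures come from the slide.

```python
def per_core(score, cores):
    """Normalize a whole-system benchmark score to a per-core figure."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return score / cores


def price_performance(total_price_usd, tpmc):
    """Price/performance as defined on the slide: USD per tpmC."""
    if tpmc <= 0:
        raise ValueError("tpmC must be positive")
    return total_price_usd / tpmc


if __name__ == "__main__":
    # Figures from the slide: 1,609,186.39 tpmC on a 16-core system.
    print(f"{per_core(1609186.39, 16):.0f} tpmC per core")
```

Comparing two systems then reduces to comparing their per-core figures rather than their headline throughput.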
SPECint_rate2006
• Standard Performance Evaluation Corporation (SPEC)
• SPECint_rate2006
• Integer performance
• All CPUs are used
• Used by OEM12c Consolidation Planner (SYSMAN.EMCT_SPEC_RATE_LIB)
• CPU performance => SPECint_rate2006/core
• 702/16 = 43.875
$ cat spec.txt | grep -i sun | grep -i x3-2 | sort -rnk1
44.0625, 16, 2, 8, 2, 632, 705, Oracle Corporation, Sun Blade X3-2B (Intel Xeon E5-2690 2.9GHz)
44.0625, 16, 2, 8, 2, 630, 705, Oracle Corporation, Sun Server X3-2L (Intel Xeon E5-2690 2.9GHz)
43.875, 16, 2, 8, 2, 628, 702, Oracle Corporation, Sun Server X3-2 (Intel Xeon E5-2690 2.9GHz)
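The same filtering and ranking can be sketched in Python. The column meanings are an assumption inferred from the numbers (the first field equals the seventh divided by the second, e.g. 705 / 16 = 44.0625), so the sketch treats field 2 as the core count, field 7 as the peak SPECint_rate2006 result and the last field as the system name. The dictionary keys and function name are hypothetical.

```python
import csv
import io

# The three rows shown on the slide.
SAMPLE = """\
44.0625, 16, 2, 8, 2, 632, 705, Oracle Corporation, Sun Blade X3-2B (Intel Xeon E5-2690 2.9GHz)
44.0625, 16, 2, 8, 2, 630, 705, Oracle Corporation, Sun Server X3-2L (Intel Xeon E5-2690 2.9GHz)
43.875, 16, 2, 8, 2, 628, 702, Oracle Corporation, Sun Server X3-2 (Intel Xeon E5-2690 2.9GHz)
"""


def parse_spec(text):
    """Parse spec.txt-style rows and rank them by peak result per core.

    Assumed layout: field 2 = cores, field 7 = peak result,
    field 9 = system name. Rows with fewer fields are skipped.
    """
    rows = []
    for fields in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(fields) < 9:
            continue
        cores = int(fields[1])
        peak = float(fields[6])
        rows.append({
            "system": fields[8],
            "cores": cores,
            "peak": peak,
            "peak_per_core": peak / cores,
        })
    # Highest per-core figure first, like `sort -rnk1`.
    return sorted(rows, key=lambda r: r["peak_per_core"], reverse=True)


if __name__ == "__main__":
    for row in parse_spec(SAMPLE):
        print(f'{row["peak_per_core"]:8.4f}  {row["system"]}')
```

Recomputing the per-core figure from the raw columns, instead of trusting the first field, guards against rows where the precomputed value is stale.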
2007 vs 2012
Actual Benchmarking
• cputoolkit and SLOB (lio test)
• LIOs/sec
[Figure: two panels, each showing CPU1 through CPU8]
cputoolkit
./runcputoolkit-auto <start CPU> <end CPU> <db name>
./runcputoolkit-auto 1 2 dw
SLOB
./runit.sh <writers> <readers>
while :; do ./runit.sh 0 2; done
Both at 25% CPU utilization
V2 and X2 CPU perf comparison
X2: 3.6M LIOs/sec
V2: 2.1M LIOs/sec
V2 -> X2 migration
chip efficiency factor = (source LIOs/sec) / (destination LIOs/sec)
                       = 2.1M / 3.6M
                       = .5833
X2 CPU requirement = source host CPUs * utilization * chip efficiency factor
                   = 16 * .46 * .5833
                   = 7.36 * .5833
                   = 4.29 CPUs
X2 CPU utilization = CPU requirement / CPU capacity
                   = 4.29 / 24
                   = 17.9 %
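The three formulas above chain together naturally, so they can be sketched as small functions. The function names are hypothetical; the inputs (2.1M and 3.6M LIOs/sec, 16 source CPUs at 46% utilization, 24 destination CPUs) are the ones from the slide.

```python
def chip_efficiency_factor(source_lios_per_sec, dest_lios_per_sec):
    """Ratio of per-CPU speed: below 1.0 means the destination is faster."""
    return source_lios_per_sec / dest_lios_per_sec


def destination_cpu_requirement(source_cpus, source_utilization, factor):
    """CPUs needed on the destination to carry the source workload."""
    return source_cpus * source_utilization * factor


def destination_utilization(cpu_requirement, dest_cpus):
    """Fraction of the destination host consumed by the migrated workload."""
    return cpu_requirement / dest_cpus


if __name__ == "__main__":
    factor = chip_efficiency_factor(2.1e6, 3.6e6)            # V2 -> X2
    required = destination_cpu_requirement(16, 0.46, factor)
    utilization = destination_utilization(required, 24)
    print(f"chip efficiency factor: {factor:.4f}")
    print(f"X2 CPU requirement:     {required:.2f} CPUs")
    print(f"X2 CPU utilization:     {utilization:.1%}")
```

Carrying full precision through the chain lands at roughly 4.3 CPUs and just under 18% utilization; rounding at each intermediate step can shift the last digit.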
PART2: Cores vs Threads
Socket0
  Core0 Core1 Core2 Core3
Socket0
  Core0: CPU1 CPU5
  Core1: CPU2 CPU6
  Core2: CPU3 CPU7
  Core3: CPU4 CPU8
~30% performance benefit from Hyper-Threading, depending on the workload
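For capacity planning, the practical consequence is that 16 logical CPUs do not behave like 16 cores. The sketch below is a rough illustration that assumes the ~30% gain applies uniformly, which will not hold for every workload; the function names and the default `ht_gain` are hypothetical.

```python
def effective_core_capacity(physical_cores, ht_gain=0.30):
    """Approximate capacity in core-equivalents with Hyper-Threading on.

    Assumes the second thread of each core adds `ht_gain` on top of the
    core's own throughput (an assumption, since it depends on the workload).
    """
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return physical_cores * (1.0 + ht_gain)


def per_thread_share(physical_cores, threads_per_core=2, ht_gain=0.30):
    """Average core-equivalents delivered by each logical CPU when all are busy."""
    logical_cpus = physical_cores * threads_per_core
    return effective_core_capacity(physical_cores, ht_gain) / logical_cpus


if __name__ == "__main__":
    cores = 8  # e.g. a 2s8c16t host
    print(f"effective capacity: {effective_core_capacity(cores):.1f} core-equivalents")
    print(f"per logical CPU:    {per_thread_share(cores):.2f} of a core")
```

Under that assumption an 8-core, 16-thread host offers about 10.4 core-equivalents, so sizing against cpu_count alone would overstate the headroom.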
cputoolkit: 17%
SLOB: 21%
Intel HT Technology Technical User's Guide http://goo.gl/3Ec5Z
PART3: Different CPU events
CPU
CPU Wait
CPU Scheduler
AAS CPU
CPU Wait
CPU Scheduler
Putting it all together
Instances caged at 12 CPUs each.
SQL applied to lock in good plan.
Problem: A single SQL statement overwhelming CPU resources.
BEFORE: vmstat procs r: 52 58 50 58 49
AFTER: vmstat procs r: 13 13 12 12 14
load average: 10.36, 17.76, 36.42
PART4: CPU monitoring and
Capacity Planning
OS Tools
• The usual Operating System commands
– vmstat
– top
– mpstat -P ALL 1 5
• Cool tools
– collectl -sC (http://collectl.sourceforge.net)
– turbostat.c
– dcli (Exadata)
  • dcli -l oracle -g /home/oracle/dbs_group --vmstat 2
  • dcli -l oracle -g /home/oracle/dbs_group uptime
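The numbers that `uptime` reports are easier to judge once they are divided by the CPU count. Here is a minimal sketch using only the Python standard library; `os.getloadavg()` is available on Unix-like systems only, and the helper names are hypothetical.

```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Load average normalized by the number of logical CPUs."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_avg / cpu_count


def summarize(loads, cpu_count):
    """Label the 1, 5 and 15 minute load averages, normalized per CPU."""
    labels = ("1 min", "5 min", "15 min")
    return {label: round(load_per_cpu(load, cpu_count), 2)
            for label, load in zip(labels, loads)}


if __name__ == "__main__":
    if hasattr(os, "getloadavg"):
        cpus = os.cpu_count() or 1
        # Values persistently above 1.0 mean more runnable work than CPUs.
        print(cpus, summarize(os.getloadavg(), cpus))
    else:
        print("load averages are not available on this platform")
```

On Exadata the same check could be run on every node through dcli, since each host reports its own load averages.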
Load Map
Performance Page – Historical View
AWR Toolkit
• DIY performance data warehouse
1 Customer site
  run_awr
  run_extract
  – Extract AWR data points as csv files
  – Package all the csv files
2 DIY DW server
  FRESH_LOAD: Create new client “dimension” tables
  CHECK_LOAD: Check new data points
  DELTA_LOAD: Load new data points
  awr_topevents_(ClientNameX), awr_cpuwl_(ClientNameX), awr_iowl_(ClientNameX)
  awr_topevents_(ClientNameY), awr_cpuwl_(ClientNameY), awr_iowl_(ClientNameY)
  awr_topevents_(ClientNameZ), awr_cpuwl_(ClientNameZ), awr_iowl_(ClientNameZ)
3 Tableau Analytics
• Tableau automatically creates a time dimension from the time column (“MM/DD/YY HH24:MI:SS”) of the AWR csv output
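Outside Tableau, the same time column can be parsed and rolled up by hour with the standard library. The Oracle mask “MM/DD/YY HH24:MI:SS” corresponds to the `strptime` pattern `%m/%d/%y %H:%M:%S`. The column names `snap_time` and `aas_cpu` and the sample rows below are hypothetical, not taken from the actual toolkit output.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# strptime equivalent of the Oracle mask MM/DD/YY HH24:MI:SS
AWR_TIME_FORMAT = "%m/%d/%y %H:%M:%S"

# Hypothetical extract: column names and values are made up for illustration.
SAMPLE_CSV = """\
snap_time,aas_cpu
02/11/13 01:00:00,3.2
02/11/13 01:30:00,4.8
02/11/13 02:00:00,1.0
"""


def hourly_average(text, time_col="snap_time", value_col="aas_cpu"):
    """Group rows into hourly buckets and average the value column."""
    buckets = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        ts = datetime.strptime(row[time_col], AWR_TIME_FORMAT)
        hour = ts.replace(minute=0, second=0)   # truncate to the hour
        buckets[hour].append(float(row[value_col]))
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


if __name__ == "__main__":
    for hour, avg in hourly_average(SAMPLE_CSV).items():
        print(hour.strftime("%Y-%m-%d %H:00"), f"{avg:.2f}")
```

Truncating to the hour gives the kind of 1-2AM and 2-3AM summary buckets that a time dimension provides, while the raw rows remain available as the underlying data.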
• Summary and Underlying data
1-2AM
2-3AM
CPU usage across half rack Exadata
CPU usage per host
CPU redistribution across nodes
Wrap up!
• HOWTO compare CPU speeds
o SPECint_rate2006, TPC-C, Actual benchmarking
• Cores vs Threads
o Always have HT on
o ~30% performance benefit from threads beyond the core count
• The different CPU events
o 1 AAS CPU = 1 CPU core
o Oracle CPU may not correlate with Host CPU if you have a lot of CPU activity outside of the database
• CPU Monitoring/Capacity Planning on consolidated environments
o AWR analytics
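The "1 AAS CPU = 1 CPU core" rule of thumb can be turned into a quick check. Average active sessions on CPU is the database CPU time consumed in an interval divided by the length of that interval. The sketch below uses hypothetical function names and made-up input values; only the rule itself comes from the slides.

```python
def aas_cpu(db_cpu_seconds, elapsed_seconds):
    """Average active sessions on CPU over an interval."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return db_cpu_seconds / elapsed_seconds


def core_utilization(aas, cores):
    """Fraction of the host's cores kept busy, using 1 AAS CPU = 1 core."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return aas / cores


if __name__ == "__main__":
    # Hypothetical snapshot: 14,400 CPU seconds used in a 1-hour interval
    # on a host with 8 physical cores.
    aas = aas_cpu(14_400, 3_600)
    print(f"AAS CPU: {aas:.1f}")
    print(f"core utilization: {core_utilization(aas, 8):.0%}")
```

If the host reports much higher CPU utilization than this figure suggests, the difference is CPU activity outside the database, which is the mismatch noted in the wrap-up.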
Resources
• cputoolkit - http://karlarao.wordpress.com/scripts-resources/
• AWR Tableau and R toolkit Visualization Examples - http://goo.gl/xZHHY
• AAS investigation - http://goo.gl/5WaAg
• Cores vs Threads - http://goo.gl/1MLFf
• Turbostat.c - http://goo.gl/jDUKg
• cpu_topology - http://goo.gl/EUDG7
• CPU centric benchmark comparisons - http://goo.gl/nR9Yy
• SLOB - http://goo.gl/yKa45
• Kyle Hailey - http://dboptimizer.com/2011/07/21/oracle-cpu-time/
• The mindmap of this presentation - http://goo.gl/XeY0e