© 2008 IBM Corporation
Power Systems Technical Conference
nmon for AIX6 & POWER6: Introduction & New Features in Version 12
Nigel Griffiths, IBM Europe
nmon developer
IBM Systems & Technology Group – Power Systems 2008 – nmon 12
• Abstract:
► Introduction to the nmon toolset
► Top 20 improvements for nmon 12, including POWER6/AIX6
► Finish off with support and favourite warnings
• Assuming you:
► are familiar with performance tuning and
► understand basic POWER5, SMT and Shared CPU LPARs.
• WARNING!! No warranty given or implied.
• nmon is supported by me, not by IBM.
• If it messes up, it’s my fault, sorry!!
nmon: Why and Principles
• Why nmon?
► AIX benchmark monitoring & tuning → simple, small & safe
► Produce benchmark reports with graphs (I am lazy)
► For myself (personal pet project) but everyone wanted a copy!!
• Design principles
► For the performance expert
► Less than 1% CPU
► Zero installation time & simple to use
► Maximum info on one screen
► Support for older AIX versions - AIX 4.1, 4.2, 4.3, 5.1, 5.2
► Capture data for post mortem & reporting
► For large machines
● 64 CPUs, 4000+ disks, 35,000 processes
nmon basics
• Freely available
• Performance monitor for
► AIX 5 and 6
► AIX 4 (via older nmon version)
► Linux (POWER, x86 and mainframe)
• Near zero installation
• Two modes
1. Online (see screenshot)
2. Save data to CSV file then either:
● nmon2rrd & rrdtool for .gifs/website
● nmon Analyser Excel spreadsheet
– From Stephen Atkins
– IBM, UK.
– Requires Excel 2000
– Very good chap.
nmon flow
[Diagram: how nmon output reaches the reporting tools]
• nmon online → screen
• nmon with the -f or -F option → file
• File → Stephen’s nmon Analyser → Excel graphs
• File → nmon2rrd (C filter) → rrdtool (open source) scripts (create rrd, load rrd, graph rrd) → index.html
• File → Federico’s pGraph → Java dynamic graphs
• File → Bruce’s nmon2web (Perl) → rrdtool (open source) scripts + CPU & RAM aggregation → website + .jpg graphs
• File → Stephen’s nmon Consolidator
nmon on POWER6 & AIX6 + New Features for V12
User Requested
1. Disk Service Times
2. Selecting Particular Disks
3. Time Drift
4. Multiple Page Sizes
5. Timestamps in UTC & number of digits
6. More Kernel & Hypervisor Stats
7. High Priority nmon
Advanced, POWER6 and AIX6 items
8. Virtual I/O Server SEA
9. Partition Mobility (POWER6)
10. WPAR & Application Mobility (AIX6)
11. Dedicated Donating (POWER6)
12. Folded CPU count (SPLPAR)
13. Multiple Shared Pools (POWER6)
14. Fibre Channel stats via fcstat
Housekeeping items
15. Bug fixes: small or fine tuning, see the nmon wiki for a list
16. Network packet sizes now saved to file
17. Warnings from nmon, such as network overflow
18. TOP online: sizes in KB, MB, GB to keep columns aligned
19. Fast abort in capture mode: SIGUSR1 during config collection stops nmon
20. Return an error on invalid options; return codes let you check it finished normally
21. Adapter "not available" workaround: see the nmon FAQ wiki about these AIX bugs
22. EMC hdiskpower renamed to “power” to save screen space
23. NFS v4 for AIX 5.3 ML5+
Disk Service Times → D D D D (file capture -d)
1. Service time msec → DISKSERV
2. Wait time msec → DISKWAIT
3. Service Queue size
4. Wait Queue size
5. Service Queue full
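The DISKSERV and DISKWAIT sections can be pulled out of a capture file with a few lines of Python. This is a minimal sketch, assuming the usual nmon CSV layout (a header line of the form `SECTION,description,disk names` followed by `SECTION,Tnnnn,values` lines). The `read_section` helper and the sample lines are made up for illustration and are not real nmon output.

```python
import csv
import io

def read_section(lines, section):
    """Return {disk_name: [value per snapshot]} for one section of an nmon file."""
    names = None
    data = {}
    for row in csv.reader(lines):
        if not row or row[0] != section:
            continue
        if names is None:
            # The first line of a section is assumed to be its header.
            names = row[2:]
            data = {name: [] for name in names}
        else:
            for name, value in zip(names, row[2:]):
                data[name].append(float(value))
    return data

# Hypothetical capture-file fragment (values invented for this example).
sample = io.StringIO(
    "DISKSERV,Disk Service Time msec/xfer,hdisk0,hdisk1\n"
    "DISKSERV,T0001,4.5,0.0\n"
    "DISKSERV,T0002,6.5,1.0\n"
    "DISKWAIT,Disk Wait Queue Time msec/xfer,hdisk0,hdisk1\n"
    "DISKWAIT,T0001,0.0,0.0\n"
)

serv = read_section(sample, "DISKSERV")
for disk, values in serv.items():
    print(disk, "average service time (msec):", sum(values) / len(values))
```

The same helper works for DISKWAIT by changing the section name; for real files, pass an open file object instead of the `StringIO` sample.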
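Disk Service Times
[Diagram: I/O path from the application through the AIX kernel queues to the adapter, labelled with the five statistics]
1. Service Time (request being serviced by the adapter)
2. Wait Time
3. Service Queue Size
4. Wait Queue Size
5. Service Queue Full count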
Select-a-disk
• On large machines there are many disks
• Some users only want to monitor selected disks (not all)
• Previously you had to use disk groups
► but then you don’t get service times
• nmon -k hdisk1:hdisk2
• Side effect: you get service times for the selected disks
Time Drift
• On large machines nmon takes time to run
• These delays add up
• Later snapshots are “late”
• Some tools don’t handle this well
• nmon now adds up the delay and shortens the “sleep” time to counteract it.
[Diagram: timeline in minutes showing sleep time and nmon runtime; with drift, one minute ends up with no data collected]
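The idea of shortening the sleep can be sketched as scheduling each snapshot against the start time instead of sleeping a fixed interval. This is an illustration of the technique only, not nmon's actual code; `next_sleep` and the simulated 0.4 second collection time are assumptions made for the example.

```python
def next_sleep(start, interval, snapshot, now):
    """Seconds to sleep so that snapshot number `snapshot` starts on schedule.

    The target time is start + snapshot * interval. Time already spent
    collecting data is absorbed, and the result never goes below zero.
    """
    target = start + snapshot * interval
    return max(0.0, target - now)

# Simulated run: 10 second interval, each collection takes 0.4 seconds.
start, interval, runtime = 0.0, 10.0, 0.4
clock = start
for snapshot in range(1, 4):
    clock += runtime                       # time spent collecting the previous snapshot
    pause = next_sleep(start, interval, snapshot, clock)
    clock += pause                         # the shortened sleep absorbs the runtime
    print(f"snapshot {snapshot}: slept {pause:.1f}s, starts at t={clock:.1f}s")
```

With a fixed 10 second sleep the third snapshot would start 1.2 seconds late; with the compensated sleep it starts on the 30 second mark.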
Multiple Page Sizes
• Hit “M”
• For performance
► Reduced memory management
• Four sizes (currently)
• 37 x 4 stats!!
• Two types
► Page counts
► Pages/sec
• We show them all!
► Important ones → you decide
• Second “M” gives you MBs
• For file output -M
► MEMPAGES4K
► MEMPAGES64K, etc
Timestamps in UTC & number of digits
• For file capture mode only
• nmon -f -G
► -G uses Coordinated Universal Time (UTC, also known as GMT)
► Useful if you have lots of machines/LPARs in different time zones
► You then get a consistent picture
• nmon -f -w N
► Timestamps are the T0001, T0002, … numbers in the file
► Where N is between 4 and 16 = the number of digits (if botched, 8 is used)
► Also reported in the “AAA,timestampsize” line
► Default is still 4 digits
► IMHO:
● 4 is already too many, but users want very long-running + detailed collections
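To see why the number of digits matters for long collections, here is a small sketch of the label arithmetic. It assumes that downstream tools rely on fixed-width labels; the helper names are made up for the example.

```python
def timestamp_label(snapshot, digits=4):
    """Format a snapshot number as an nmon-style label such as T0001."""
    return f"T{snapshot:0{digits}d}"

def max_snapshots(digits):
    """Largest snapshot number that still fits in the given number of digits."""
    return 10 ** digits - 1

print(timestamp_label(1))             # T0001
print(timestamp_label(1, digits=8))   # T00000001
# Hours covered by 4-digit labels at one snapshot per second (under 3 hours).
print(max_snapshots(4) / 3600)
```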
More Kernel and Hypervisor Stats
• In the past nmon collected the useful data
► Getting harder to determine what is useful → increasingly complex platform
• So by popular request …
• Use nmon -f -K
• libperfstat structures are “dumped” with no interpretation
► perfstat_cpu_total_t
► perfstat_partition_total_t
• New sections:
► RAWCPU
► RAWLPAR
• Good luck with the numbers ☺
• Online these stats are shown in the CPU and Partition screens
RAWLPAR:
name,type,lpar_id,group_id,pool_id,online_cpus,max_cpus,min_cpus,online_memory,max_memory,min_memory,entitled_proc_capacity,max_proc_capacity,min_proc_capacity,proc_capacity_increment,unalloc_proc_capacity,var_proc_capacity_weight,unalloc_var_proc_capacity_weight,online_phys_cpus_sys,max_phys_cpus_sys,phys_cpus_pool,puser,psys,pidle,pwait,pool_idle_time,phantintrs,invol_virt_cswitch,vol_virt_cswitch,timebase_last
RAWCPUTOTAL:
ncpus,ncpus_cfg,description,processorHZ,user,sys,idle,wait,pswitch,syscall,sysread,syswrite,sysfork,sysexec,readch,writech,devintrs,softintrs,lbolt,loadavg1,loadavg5,loadavg15,runque,swpque,bread,bwrite,lread,lwrite,phread,phwrite,runocc,swpocc,iget,namei,dirblk,msg,sema,rcvint,xmtint,mdmint,tty_rawinch,tty_caninch,tty_rawoutch,ksched,koverf,kexit,rbread,rcread,rbwrt,rcwrt,traps,ncpus_high,puser,psys,pidle,pwait,decrintrs,mpcrintrs,mpcsintrs,phantintrs
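Because the raw sections come with no interpretation, the first step is usually to pair the field names with the values and then work with differences between snapshots. The sketch below assumes that puser, psys, pidle and pwait are ever-increasing tick counters, so only the change between two snapshots is meaningful. Only four of the RAWLPAR fields are used, the numbers are invented, and the helper names are hypothetical.

```python
def to_record(header, line):
    """Pair the field names of a raw section with one line of values."""
    return dict(zip(header.split(","), line.split(",")))

def physical_cpu_split(before, after):
    """Percentage of physical CPU time per state between two snapshots."""
    states = ("puser", "psys", "pidle", "pwait")
    delta = {s: int(after[s]) - int(before[s]) for s in states}
    total = sum(delta.values())
    if total == 0:
        return {}
    return {s: 100.0 * d / total for s, d in delta.items()}

# Invented counter values for two consecutive snapshots.
header = "puser,psys,pidle,pwait"
first = to_record(header, "1000,500,8000,500")
second = to_record(header, "1600,700,9000,700")
print(physical_cpu_split(first, second))
```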
High Priority nmon
• On very busy machines (100% CPU busy) there are a number of problems
1. nmon does not get scheduled: its CPU priority is average but the run queue is large
● nmon -Z N
● Where N is the renice value used in the setpriority() system call
● Range -20 to 20
● Only root can make it higher priority (negative numbers)
● nmon -Z -20 → is the best you can get, and it works too
2. nmon gets swapped out
● Not fixed, I am too nervous to pin nmon in memory
3. Thousands of processes make nmon use a lot of CPU handling the data
● I have a special nmon version for this!
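The renice step can be tried from Python, whose `os.setpriority` wraps the same setpriority() system call on Unix. This is a minimal sketch, not nmon's code: the -20 to 20 range is taken from the slide, and `clamp_nice` and `renice_self` are names invented for the example.

```python
import os

def clamp_nice(value, lowest=-20, highest=20):
    """Keep a requested renice value inside the range quoted above."""
    return max(lowest, min(highest, value))

def renice_self(value):
    """Try to set this process's nice value; return True on success."""
    try:
        os.setpriority(os.PRIO_PROCESS, 0, clamp_nice(value))  # 0 = this process
        return True
    except OSError:
        # Negative (higher priority) values normally need root.
        return False

current = os.getpriority(os.PRIO_PROCESS, 0)
print("current nice value:", current)
print("keeping the current value worked:", renice_self(current))
```

Calling `renice_self(-20)` as a normal user is expected to return False, which matches the "only root" rule above.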
VIOS SEA Stats → Virtual I/O Server only
• This was handled via the nmon external data module
• Now internal, by running the entstat command → larger overhead than an API
• nmon -f -O or hit “O” → Ocean!
• To locate the SEAs: lsdev -t sea -F name
[Diagram: VIO Server with the SEA bridging the physical and virtual networks to the clients]
Partition Mobility (POWER6)
• How does PM affect nmon?
► It doesn’t really!!
► nmon migrates along with everything else
Dynamic Reconfig API – Class 101
• Programs are notified of the migration via the Dynamic Reconfig API
► Same interface as CPU and memory Dynamic LPAR changes
• Information: see the docs for
► function dr_reconfig()
► /usr/include/sys/dr.h
► signal SIGRECONFIG
• Basics …
• A DR change triggers a signal to processes (or a shell script)
► Via asynchronous software interrupts
► You have to acknowledge them or PM fails!
• Multi-phase: check → pre → doit → post (or failed with error)
Partition Mobility (POWER6)
• DR events do not fit the normal nmon output: they do not arrive at the regular capture times
• nmon is now DR aware
► Reported online → Resources screen & LPAR screens
► Captured to file → new BBBR section, only output after the first DR operation
• What changes on PM?
► Jumped between machines = Machine Serial Number changes
● few seconds later
► Jumped between LPARs = LPAR Number & Name (may change)
► New environment could be different
● Same CPU and Entitlement but …
● Faster CPUs, more or fewer CPUs in the pool, different pool contention.
Partition Mobility → Online
• Reported online → Resources screen & LPAR screens
Partition Mobility → File output
► Captured to file → new BBBR section
► Only output after first DR operation
► Example below = Add 2 CPUs
BBBR,000,when,add,remove,cpu,mem,check,pre,doit,post,posterror,force,bindproc,softpset,hardpset,plock,pshm,ent_cap,var_wgt,splpar_capable,splpar_shared,splpar_capped,cap_constrained,migrate,hibernate,partition,wpar,checkpoint,restart,logical_cpu,bind_cpu,memory_change,capacity,delta_cap,old_serialno,current_serialno,lpar_number,lpar_name
BBBR,001,13:33:52,add,-,cpu,-,-,check,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A
BBBR,002,13:33:53,add,-,cpu,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A
BBBR,003,13:33:53,add,-,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A
The same records reduced to the key columns (… marks omitted columns):
BBBR,000,when,add,remove,cpu,mem,check,pre,doit,post,posterror … cpu … serialno,lpar_number,lpar_name
BBBR,001,13:33:52,add,-,cpu,-,-,check … 2 … C1CD7F,3,p06-AIX6 0736A
BBBR,002,13:33:53,add,-,cpu,-,-,-,pre … 2 … C1CD7F,3,p06-AIX6 0736A
BBBR,003,13:33:53,add,-,cpu,-,-,-,-,post … 2 … C1CD7F,3,p06-AIX6 0736A
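A post-processing script can follow a DR operation through its phases by reading the BBBR lines. The sketch below uses the three data records from the slide and matches the phase by value instead of by column position, so it does not depend on the exact column layout. The `summarise` helper is hypothetical, and taking the last three fields as serial number, LPAR number and LPAR name is an assumption based on the header line.

```python
PHASES = ("check", "pre", "doit", "post", "posterror")

def summarise(line):
    """Return (time, phase, current serial number, LPAR name) for one BBBR data line."""
    fields = line.strip().split(",")
    phase = next((f for f in fields if f in PHASES), None)
    return fields[2], phase, fields[-3], fields[-1]

records = [
    "BBBR,001,13:33:52,add,-,cpu,-,-,check,-,-,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A",
    "BBBR,002,13:33:53,add,-,cpu,-,-,-,pre,-,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A",
    "BBBR,003,13:33:53,add,-,cpu,-,-,-,-,post,-,-,-,-,-,-,-,-,-,splpar_capable,-,splpar_capped,-,-,-,-,-,-,-,0,2,0,0,0,C1CD7F,C1CD7F,3,p06-AIX6 0736A",
]

for record in records:
    print(summarise(record))
```

The header line (BBBR,000) names every phase, so it should be skipped before calling the helper.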
Workload Partition (WPAR) (AIX6)
• Inside WPAR level
► Inside a WPAR you only see your own processes
● Can’t see the Global view
► Note: no disks, no adapters and no paging spaces
► No point in monitoring inside a WPAR?
► Unless you are the system admin only for that WPAR
• WPAR Global level
► WPARs run like Workload Manager classes and nmon does WLM class stats
► Online
● @ = WPAR stats
● W = WLM classic view
● W and @ together: WLM then shows the WPAR dynamic classes too
● tU = Top with User cmd, includes the WPAR names
► Capture mode: look for WPARCPU and WPARMEM
► NFS mounted WPAR filesystems don’t appear in the stats much!
[Diagram: one AIX6 instance hosting four WPARs]
[Screenshot: nmon WPAR, WLM and Top Processes screens, with callouts]
• WPAR: App/Sys, Defined/Active
• Usr/Sys % CPU
• RAM use & FScache
• RunQ/pSwitch and Fork
• Dynamic WPAR classes
• Default: no WLM control
• Top Procs: WPAR name; System/Default etc. = Global AIX
Application (WPAR) Mobility (AIX6)
• AM moves the WPAR processes to a new copy of AIX
► Uses the Dynamic Reconfiguration API like PM
• You can move a WPAR between
► AIX6 on different machines: Serial Number changes + LPAR name may change
► AIX6 on the same machine: Serial Number does not change + LPAR name will change
• Environment could be different
► Faster CPUs, more or less CPU power,
► memory, adapters (Etherchannel or MPIO)
[Diagram: a WPAR moving from one AIX6 instance to another]
Application (WPAR) Mobility (AIX6)
[Screenshot callouts: “Check & Pre” = migrate phase; “Post” = clean up, done; different LPARs, same serial number]
POWER6 Dedicated Donating (POWER6)
• Dedicated CPU LPAR profile
• Dedicated Donating = CPU sharing when active
• Not obvious!!!!
POWER6 Dedicated Donating (POWER6)
• Hit “p” for Partition stats at bottom
• LPAR Idle
► Donated to pool
• LPAR Busy
► Pulled back to owning LPAR
• “Stolen” means cycles used by the Hypervisor
• Data captured to file by default (if Donating is enabled) → see DONATE section
• Good learning exercise
► I thought this worked like a Dedicated LPAR that loans CPU cycles on demand
► “Doh!! No” to quote Homer
► It behaves like an SPLPAR but gets preference for its CPUs on demand
POWER6 Dedicated Donating (POWER6)
• Actual measured data: LPAR has 2 Dedicated CPUs with SMT and Donating
► Start 8 programs (semi-CPU intensive)
► At 1 second intervals, each running for 30 seconds
• pIdle (Physical Idle time) is CPU pulled back to the LPAR
► Occasionally an over-reaction, but playing it safe
[Chart: “Donating Logical Partition p06 10/09/2007”. Y axis 0.0 to 2.0; X axis time from 23:26 to 23:28. Series: pUser, pSys, pWait, pIdle, pDonateIdle, pDonateBusy, pStolenIdle, pStolenBusy. Callouts: “Actively running progs” and “Hypervisor making sure CPU is available”.]
Folded CPU count (SPLPAR)
• Virtual Processor folding is a clever optimisation from AIX
• There are no stats for this but we can detect it.
• Folded VP: Idle = 100% of CPU time across the logical SMT CPU pair
► Also zero system calls.
• This is shown in the CPU section and in a new column of the LPAR stats
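The detection rule above can be written down as a short function. This is a sketch of the heuristic as described on the slide, not nmon's implementation: the input layout (one `(idle_percent, syscalls)` tuple per logical CPU, with the SMT threads of a VP adjacent) and the sample numbers are assumptions made for the example.

```python
def count_folded(logical_cpus, threads_per_vp=2):
    """Count virtual processors that look folded.

    A VP is treated as folded when every one of its logical CPUs is
    100% idle and made no system calls during the interval.
    """
    folded = 0
    for i in range(0, len(logical_cpus), threads_per_vp):
        threads = logical_cpus[i:i + threads_per_vp]
        if all(idle >= 100.0 and calls == 0 for idle, calls in threads):
            folded += 1
    return folded

# Invented sample: 4 VPs with SMT on = 8 logical CPUs.
sample = [(35.0, 1200), (80.0, 300),   # VP 0 is busy
          (100.0, 0), (100.0, 0),      # VP 1 looks folded
          (100.0, 0), (99.0, 4),       # VP 2 is nearly idle but not folded
          (100.0, 0), (100.0, 0)]      # VP 3 looks folded
print("folded VPs:", count_folded(sample))
```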
Folded CPU count (SPLPAR)
• Notes:
► Actual measured data: LPAR has 4 VPs with SMT=on
● Start 10 jobs at 1 second intervals, each running for 30 seconds
► Folded must be a whole physical CPU
► Can’t fold the last CPU: one is always running
► As CPU use decreases, folding is slow
► As CPU use increases, unfolding is quick
► Plotted here is:
● Used Physical CPU
● Unfolded = VP - Folded
[Chart: Y axis 0 to 4.5; X axis time from 15:39:47 to 15:40:39 in 4 second steps. Series: PhysicalCPU, Unfolded.]
Multiple Shared Pools (POWER6)
[Screenshot with callout: “Select a machine”]
Multiple Shared Pools (POWER6)
• To aid licence control & different workload requirements in LPAR groups
► Limiting factor for your LPAR & important for tuning
• Shared Pool number in the LPAR stats (as below) → changes with PM
► Also now saved to file in the last column of the LPAR sheet
• File capture, new sheet: POOLS, with the numbers below plus Pool id and Entitlement
[Screenshot callouts: Pool id=1; Shared CPUs; All Shared CPU use; My Pool; Sum(E) in my Pool]
Fibre Channel Stats from the fcstat command
• Fibre Channel stats are not available via the libperfstat API
► Adapter stats are only the sum of the disk stats
• Using the fcstat command to get the raw I/O stats
► This has a performance hit: nmon has to run the fcstat command
► This is a point-in-time number, not an average over the period
• Hit: ^
► Screenshot: VIO Server use of an FC adapter while a VIO Client is doing I/O
• File capture, new sheets: FCREAD, FCWRITE, FCXFERIN, FCXFEROUT
• This will include Fibre Channel tape I/O
► If an FC adapter only has tape drives → we have tape stats, hurray!!!
Finishing off
• nmon Bugs
• nmon Support
• Warning: Uncapped Shared CPU LPAR Utilisation
• Final words and URLs
Bugs
• AIX 5.3 ML5
► Symptom: nmon core dumps with “Invalid instruction”
► Cause: AIX upgrade script bug, see FAQ Question 54
► Fix: check with lslpp -L | grep -i perfstat, then upgrade to match
● bos.perf.libperfstat 5.3.0.50 C F Performance Statistics Library
● bos.perf.perfstat 5.3.0.60 C F Performance Statistics
• AIX 5.3 ML6
► Symptom: nmon file corrupt with NULLs, some CPU stats missing, “nfs”
► Cause: AIX libperfstat bug, NFS stats return junk & overwrite data!
► Fix: Leave off -N (= NFS) or use nmon 12
• VIO Server 1.5: the “raso” command causes an AIX kernel panic!
► Fix: Avoided in nmon 12e
• AIX 5.3 TL7
► Symptom: Memory percent for Processes & Users → massively large numbers
► Cause: AIX libperfstat bug, an estimated value is unfortunately negative in an unsigned!
► Fix: Workaround in nmon 12e
• AIX 5.3 TL7
► Symptom: fcstat command on an adapter with missing cables hangs/spins.
► Fix: in nmon 12e
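The "negative in an unsigned" cause behind the massive memory percentages is easy to reproduce. The sketch below shows what a small negative value looks like once it is stored in an unsigned integer; the 64-bit and 32-bit widths are assumptions for illustration, since the slide does not say which field was affected.

```python
def as_unsigned(value, bits=64):
    """Reinterpret a signed integer as an unsigned integer of the given width."""
    return value & ((1 << bits) - 1)

print(as_unsigned(-1))        # 18446744073709551615
print(as_unsigned(-1, 32))    # 4294967295
print(as_unsigned(42))        # 42, positive values are unchanged
```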
nmon Support – often requested by customers
• nmon is already Fully Supported
► I and other nmon users support nmon
1. You can ask questions → Performance Tools Forum
2. Report bugs to be fixed → Performance Tools Forum
3. Get newer versions → nmon Wiki (on AIX Wiki)
4. nmon docs and FAQ → nmon Wiki (on AIX Wiki)
5. Send nmon data to IBM Support regarding Performance Problems
● nmon is NOT a problem determination tool (it’s a tuning tool)
● Use the right tools → snap and perfPMR
• Note:
► Running nmon on the VIO Server does not invalidate your IBM support
Shared CPU LPAR Utilization is Very Misleading
• Very common mistake with customers on POWER5 and soon POWER6
► Decades of monitoring with Utilisation → User + System + Wait + Idle = 100%
► QED: wedded/welded to this number
• Utilisation gets to 100% at Entitlement level, but an uncapped LPAR can go x10 faster, and this is not reported as 1000%
• Therefore measure used CPU cycles
► AIX commands = Physc (Physical Consumed)
► nmon tool = UsedCPU
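A small numeric example makes the warning concrete. This is a deliberately simplified model, assuming utilisation simply saturates at 100% once the LPAR reaches its entitlement; real utilisation figures come from the user/system/wait/idle split. The function names and the 0.5 CPU entitlement are invented for the example.

```python
def utilisation_percent(physc, entitlement):
    """Simplified model: utilisation saturates at 100% once entitlement is reached."""
    return min(100.0, 100.0 * physc / entitlement)

def entitlement_percent(physc, entitlement):
    """Physical CPU consumed as a percentage of entitlement (can exceed 100%)."""
    return 100.0 * physc / entitlement

entitlement = 0.5  # hypothetical uncapped LPAR entitled to half a CPU
for physc in (0.25, 0.5, 2.0, 5.0):
    print(f"physc={physc}: utilisation={utilisation_percent(physc, entitlement)}%, "
          f"of entitlement={entitlement_percent(physc, entitlement)}%")
```

At 5.0 physical CPUs consumed the utilisation figure still reads 100% while the LPAR is using ten times its entitlement, which is why the physical CPU number is the one to watch.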
nmon - final words
1) Run the latest version of nmon
• Changing one file is hardly an upgrade!!
2) Physical CPU use for Shared Processor LPARs (SPLPAR)
• Not Utilisation: it is only an interesting ratio now
3) nmon problem?
• Read → nmon FAQ → Performance Forum → nmon Manual
• Then add a question to the Forum
• I check the Forum most days
References
Forum
• Performance Tools Forum
• Mostly about nmon ☺
• For questions & answers
Wiki
→ AIX 5L Wiki
• “Other Performance Tools”
• nmon
• Download
• FAQ
• Manual and lots more
http://www.ibm.com/systems/p/community/
• “Got a good idea for nmon 14?”
► Good ideas welcome, bad ones are a good laugh too ☺
• Thank you for using nmon
• Questions
http://www.ibm.com/collaboration/wiki/display/WikiPtype/Movies