performance case studies common europe june 2012

IBM Power Systems © 2012 IBM Corporation Performance Case Studies Examples of performance analysis Dawn May - [email protected]


DESCRIPTION

COMMON Europe Congress 2012 - Vienna

TRANSCRIPT

Page 1: Performance case studies Common Europe june 2012

IBM Power Systems

© 2012 IBM Corporation

Performance Case Studies

Examples of performance analysis

Dawn May - [email protected]

Page 2: Performance case studies Common Europe june 2012

Viewing Collection Services Data

General Analysis Review

Page 3: Performance case studies Common Europe june 2012

• In this example, we are not looking at a reported problem. Rather, we are surveying Collection Services data to see if anything interesting shows up.

• Open the Collection Services content package

• Select “CPU Utilization and Waits Overview”

• Select the COMMON2 collection library

• Select Q071123119 for the collection

• Display this

Page 4: Performance case studies Common Europe june 2012

• This chart is a basic system overview of CPU utilization and the more common wait conditions.

• Use the Tooltips tool from the tool box. As you move it over various areas of the chart, it tells you which metric you are looking at and gives more detailed information on that point.

• Note the drop in CPU consumption in interval 11. Also note the operating system contention time just prior to that, in interval 10; the two may be related.

Tooltips
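The visual check described on this slide, spotting the interval where CPU utilization falls, can also be expressed as a small calculation. The Python sketch below is illustrative only: the utilization values are made-up numbers, not data from the Q071123119 collection, and `largest_drop` is a hypothetical helper name.

```python
def largest_drop(cpu_by_interval):
    """Return (interval_number, drop) for the biggest decrease from the previous interval.

    Intervals are numbered from 1, so the first possible answer is interval 2.
    """
    drops = [
        (index + 2, previous - current)
        for index, (previous, current) in enumerate(
            zip(cpu_by_interval, cpu_by_interval[1:])
        )
    ]
    return max(drops, key=lambda item: item[1])


# Assumed CPU utilization percentages for intervals 1..12 (not real data).
cpu_percent = [62, 60, 65, 63, 61, 64, 66, 62, 60, 58, 21, 55]

interval, drop = largest_drop(cpu_percent)
print(interval, drop)  # 11 37
```

Once the interval is known, the interval just before it is the natural place to look for a wait condition that might explain the drop.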

Page 5: Performance case studies Common Europe june 2012

Selection tool

Selected point

• Using the selection tool, click on the CPU utilization line at the low point.

• Note that you can select multiple points or a range. When you drill down after selecting points or a range, that information is remembered and used as input to future charts.

Page 6: Performance case studies Common Europe june 2012

• Since we are wondering why CPU utilization has dropped, let's look at wait data.

• Go to the Select Action window where you will find a list of possible actions.

• Select the Waits Overview chart.

Page 7: Performance case studies Common Europe june 2012

• By using the fly-over, we can see that most of our wait time is due to disk page faulting.

• Several next steps are possible. Which one to take depends on experience and on knowledge of the workload running on the system.

Page 8: Performance case studies Common Europe june 2012

• Using the selection tool, click on the large orange area that reflects Disk Page Faults Time.

• Select Waits by Job or Task.

Page 9: Performance case studies Common Europe june 2012

• Use the Tooltips tool to look at the page faults for the QRWTSRVR/QUSER/436662 jobs.

• Note that because we clicked on Disk Page Faults Time on the prior chart, this chart is sorted by Disk Page Faults Time.

Page 10: Performance case studies Common Europe june 2012

• One point of interest here is that we are looking at a couple of jobs that have thousands of seconds of wait time. How can that be? A reasonable guess is that the server job is multi-threaded.

• Click on the first job with the selection tool and then select All Waits by Thread or Task from the Select Action window.

Page 11: Performance case studies Common Europe june 2012

• Yes, there are four threads spending nearly all their time waiting on page faults. But how do you know that for sure?

• The interval size is 900 seconds. Most of that is spent waiting, with not much left over for running. We can see that by looking at the wait data.
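The arithmetic behind "thousands of seconds of wait time" in a 900-second interval can be worked through directly. The sketch below uses the interval size and thread count stated in this lab; the run time per thread is an assumed round number, and `max_wait_seconds` is a hypothetical helper, not part of any IBM tool.

```python
# Illustrative arithmetic only.
INTERVAL_SECONDS = 900       # interval size stated in the lab
THREADS = 4                  # threads seen waiting on page faults
RUN_SECONDS_PER_THREAD = 1   # assumption: each thread runs for about one second


def max_wait_seconds(threads, interval, run_per_thread):
    """Upper bound on the wait time one job can accumulate in a single interval.

    Each thread has its own clock, so a job's wait time is summed across threads.
    """
    return threads * (interval - run_per_thread)


wait_budget = max_wait_seconds(THREADS, INTERVAL_SECONDS, RUN_SECONDS_PER_THREAD)
print(wait_budget)  # 3596
```

A single-threaded job could never exceed 900 seconds of wait in one interval, which is why a job-level total in the thousands points to multiple threads.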

Page 12: Performance case studies Common Europe june 2012

• Use the Zoom Region tool from the tool bar and draw a rectangle over the far left-hand part of the chart.

• Do this a couple of times to zoom into the leftmost data.

• (If you have trouble with the zoom region, use the Reset Zoom tool to start over.)

Zoom Region Tool

Page 13: Performance case studies Common Europe june 2012

• This data was hidden from view due to the large time

• You can see that each thread did run for almost a second.

• You can press the Reset Zoom tool button to get back to where this chart started.

Reset Zoom Tool

Page 14: Performance case studies Common Europe june 2012

• What more can we find out? Collection Services does not know what data this server is working with (you would need to use Disk Watcher or Job Watcher for that). We can, however, find out what this server is and who the client is.

• Click on the History arrow and go back to Waits Overview.

• From the Select Action window, select Waits by Server Type.

Page 15: Performance case studies Common Europe june 2012

• This is the DDM/DRDA server job.

• Press Done.

• When back at the Waits Overview chart, select the Waits by Job Current User Profile chart.

Page 16: Performance case studies Common Europe june 2012

• Now we can see that the user this server job is doing work for is VCPANYLT.

• From the History menu, select Home to go back to the main Investigate Data panel.

Page 17: Performance case studies Common Europe june 2012

• Once again, open the CPU Utilization and Waits Overview on the same collection we have been working with.

• Select interval 10, which we identified earlier in this lab as having Operating System Contention Time, by clicking on its bar.

• Drill down into Contention Waits Overview.

Page 18: Performance case studies Common Europe june 2012

• From this chart we can see that the blue bar is “Machine Level Gate Serialization Time” and the pink bar is “Semaphore Contention Time”. Semaphore contention is often a normal wait condition, so we are not really interested in that. We are, however, interested in what is causing the machine level gate serialization.

• Drill down into Waits by Job or Task to see if we can figure out what jobs are contributing to this contention.

Page 19: Performance case studies Common Europe june 2012

• We have identified the QRWTSRVR jobs as the jobs with the machine level gate serialization. However, Collection Services cannot tell us more. Understanding this machine level gate serialization requires Job Watcher data, which has more information, including holders and call stacks.

• Press Done to return to the main Investigate Data panel. Or, you can use the History arrow to go “Home”.

Page 20: Performance case studies Common Europe june 2012

This concludes the exercise using the Performance Data Investigator with Collection Services data.

We have barely scratched the surface of the capabilities of the Performance Data Investigator and of what you can discover by looking at Collection Services data.

We hope you found this interesting and useful, and that you realize you do not need to be a performance expert to benefit from the performance data available to you when using the Performance Data Investigator.

Page 21: Performance case studies Common Europe june 2012

Viewing Waits with Job Watcher

Example of Machine Level Gate Serialization

Page 22: Performance case studies Common Europe june 2012

Start with the IBM Systems Director Navigator for i.

Expand the “IBM i Management” tree.

Select the Performance category. Click on “Investigate Data”.

Performance Tasks

Page 23: Performance case studies Common Europe june 2012

• Open the Job Watcher content package

• Select the “CPU Utilization and Waits Overview”

• Select the COMMON collection library

• Select DAWNJW2 for the collection

• Display this (the next chart may take a minute or two to display)

Page 24: Performance case studies Common Europe june 2012

• You will see the following chart

• Click on the “Full Zoom Out” icon

Page 25: Performance case studies Common Europe june 2012

• Look for unusual patterns as a way to start the investigation

• Here, note the drop in CPU utilization just before 8:52, along with a corresponding increase in wait time.

• Let's zoom into that timeframe using the zoom region tool. The zoom region will let you draw a box around the timeframe you are interested in.

Page 26: Performance case studies Common Europe june 2012

By zooming in, we can see that “Operating System Contention Time” is a significant wait contributor during the time when CPU utilization dropped.

Use the Tooltips tool to see the information for the Operating System Contention Time.

You will also note that there are gaps in the graph between some of the stacked bars. With Job Watcher, it is possible for a collection interval to take longer to complete than the Job Watcher definition specifies. When these “long” collection intervals occur, they will show up as gaps in the graph.
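The "long interval" effect can be illustrated with a short calculation: compare the elapsed time between consecutive interval end times with the interval defined for the collection. This Python sketch is an illustration under stated assumptions. The timestamps, the arbitrary date, and the 5-second nominal interval are invented, and `find_gaps` is a hypothetical helper, not a Job Watcher interface.

```python
from datetime import datetime, timedelta


def find_gaps(interval_ends, nominal, tolerance=timedelta(seconds=1)):
    """Return (previous_end, next_end, elapsed) for intervals longer than nominal."""
    gaps = []
    for previous, current in zip(interval_ends, interval_ends[1:]):
        elapsed = current - previous
        if elapsed > nominal + tolerance:
            gaps.append((previous, current, elapsed))
    return gaps


# Assumed interval end times; the interval ending at 8:51:22 took 12 seconds.
ends = [datetime(2012, 1, 1, 8, 51, second) for second in (0, 5, 10, 22, 27)]

gaps = find_gaps(ends, nominal=timedelta(seconds=5))
for previous, current, elapsed in gaps:
    print(previous.time(), "->", current.time(), elapsed.total_seconds())
```

Each tuple returned corresponds to one visible gap between stacked bars in the chart.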

Page 27: Performance case studies Common Europe june 2012

Using the selection tool (arrow), click on the first bar with significant Operating System Contention. Also select the last bar with significant Operating System Contention.

Drill into Contention Waits Overview once the two data points have been selected.

By selecting specific data points in the graph, all future drill-downs will now be limited to the timeframe which has been selected.

Using the zoom tool (as we did a few steps earlier) does NOT select data points and does not limit the scope of drill-downs.
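The difference between selecting and zooming can be modeled as a filter that is applied only when data points have been selected. The sketch below is a conceptual model, not how the Performance Data Investigator is implemented. The `Row` type, the job names, and the numbers are hypothetical, and it assumes that selecting a first and a last bar limits drill-downs to the range between them.

```python
from dataclasses import dataclass


@dataclass
class Row:
    interval: int
    job: str
    contention_seconds: float


def drill_down(rows, selected_intervals):
    """Keep only rows inside the range spanned by the selected intervals."""
    if not selected_intervals:
        # Nothing selected (for example, the chart was only zoomed): no filter.
        return list(rows)
    low, high = min(selected_intervals), max(selected_intervals)
    return [row for row in rows if low <= row.interval <= high]


# Assumed per-interval data for two hypothetical jobs.
rows = [
    Row(8, "JOB_A", 0.2),
    Row(9, "JOB_A", 4.1),
    Row(10, "JOB_B", 3.7),
    Row(11, "JOB_B", 5.0),
    Row(12, "JOB_A", 0.1),
]

selected = drill_down(rows, selected_intervals=[9, 11])
print([row.interval for row in selected])  # [9, 10, 11]

zoomed_only = drill_down(rows, selected_intervals=[])
print(len(zoomed_only))  # 5
```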

Page 28: Performance case studies Common Europe june 2012

• Machine Level Gate Serialization now shows up as a wait type

• Use the flyover tool to display the wait information

• Return to using the selection tool (arrow)

Page 29: Performance case studies Common Europe june 2012

• Select an interval to investigate further by clicking on a bar as shown below

• Drill down into All Waits by Thread or Task sorted by Machine Level Gate Serialization

Page 30: Performance case studies Common Europe june 2012

• You will see the following chart

• Use the zoom tool to get a closer view

Page 31: Performance case studies Common Europe june 2012

• You will get the following chart when you zoom in.

• Use the flyover tool to display information about the machine level gate serialization waits.

• Return to using the selection tool.

Page 32: Performance case studies Common Europe june 2012

• Select a thread to investigate further.

• Drill into All Waits for One Thread or Task.

Page 33: Performance case studies Common Europe june 2012

• Select the interval to investigate further.

• Drill into Interval Details for One Thread or Task.

Page 34: Performance case studies Common Europe june 2012

Here you see this thread is waiting for the QAUDJRN journal at 8:51:05.

In the call stack you will see an entry that shows the job is creating an audit journal entry.

Note that access to the audit journal is serialized by a “gate”. So why is this job blocked and waiting to create the audit record?

Page 35: Performance case studies Common Europe june 2012

                         Display Journal Entries

Journal  . . . . . . :   QAUDJRN        Library  . . . . . . :   QSYS
Largest sequence number on this screen . . . . . . :   00000000000088885894
Type options, press Enter.
  5=Display entire entry

Opt   Sequence  Code  Type  Object      Library     Job         Time
      88885883   T     GS                           BEIJINGA    8:51:02
      88885884   T     SK                           QSYSARB     8:51:02
      88885885   J     NR                           QDBSRV02    8:51:02
      88885886   J     PR                           QDBSRV02    8:51:06
      88885887   T     GS                           BEIJINGA    8:51:07
      88885888   T     GS                           BEIJINGA    8:51:07
      88885889   T     GS                           BEIJINGA    8:51:07
      88885890   T     SK                           QSYSARB     8:51:07
      88885891   T     GS                           BEIJINGA    8:51:07
      88885892   T     GS                           BEIJINGA    8:51:07
      88885893   T     GS                           BEIJINGA    8:51:07
      88885894   T     GS                           BEIJINGA    8:51:07
                                                                More...
F3=Exit   F12=Cancel

If the audit journal information was still available, you could look at it.

This screen capture shows the audit journal entries from the matching time period.

● NR is Next Receiver

● PR is Previous Receiver
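The pause in the audit journal can be read off the screen by comparing the times of consecutive entries. The Python sketch below does that with the twelve entries shown in the screen capture. Treating the name column as the job name is an assumption about the flattened screen layout, and `largest_gap` is a hypothetical helper.

```python
from datetime import datetime

# (sequence, code, type, name, time) copied from the Display Journal Entries screen.
ENTRIES = [
    (88885883, "T", "GS", "BEIJINGA", "8:51:02"),
    (88885884, "T", "SK", "QSYSARB", "8:51:02"),
    (88885885, "J", "NR", "QDBSRV02", "8:51:02"),
    (88885886, "J", "PR", "QDBSRV02", "8:51:06"),
    (88885887, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885888, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885889, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885890, "T", "SK", "QSYSARB", "8:51:07"),
    (88885891, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885892, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885893, "T", "GS", "BEIJINGA", "8:51:07"),
    (88885894, "T", "GS", "BEIJINGA", "8:51:07"),
]


def parse_time(text):
    """Parse an H:MM:SS time as displayed on the screen."""
    return datetime.strptime(text, "%H:%M:%S")


def largest_gap(entries):
    """Return (entry_before, entry_after, gap) for the longest pause between entries."""
    pairs = zip(entries, entries[1:])
    return max(
        ((before, after, parse_time(after[4]) - parse_time(before[4]))
         for before, after in pairs),
        key=lambda item: item[2],
    )


before, after, gap = largest_gap(ENTRIES)
print(before[2], before[4], "->", after[2], after[4], gap.total_seconds())
# NR 8:51:02 -> PR 8:51:06 4.0
```

The longest pause sits between the NR and PR entries, and the thread examined earlier was waiting on QAUDJRN at 8:51:05, which falls inside that window.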

Page 36: Performance case studies Common Europe june 2012

This exercise shows how a normal system function, switching to a new journal receiver, affected the CPU utilization of the system for a short period of time.

In this scenario, the next steps would be to evaluate what information is being captured in the security audit journal to ensure you are not auditing information you do not need.

This exercise also shows how powerful the Job Watcher capabilities are for understanding the details of what is happening on the system.

This is something only IBM i can do!

Page 37: Performance case studies Common Europe june 2012

If you were to start this lab over, the graph which is displayed after doing the “full zoom out” will show other potentially interesting timeframes in the data.

At about 9:04 and a bit after 9:26 there are additional spikes in operating system contention time.

Throughout the graph there are several drops in CPU utilization.

Feel free to examine this Job Watcher data further if time allows.

With the Job Watcher data and the Performance Data Investigator, you can learn quite a bit about the performance of your IBM i.

Page 38: Performance case studies Common Europe june 2012

Viewing Waits with Job Watcher

Example of Object Lock Contention

Page 39: Performance case studies Common Europe june 2012

Job Watcher: CPU Utilization and Waits Overview

Look at the run/wait signature for the entire collection.

Look for the wait time that appears to be the most pervasive throughout the collection. In this case, it is Lock Contention Time.

Drill down into the details for that wait bucket.
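One way to make "most pervasive" concrete is to count, for each wait bucket, the number of intervals in which it is the largest wait contributor. The sketch below shows that idea with invented numbers. The bucket names match chart labels used in this deck, but the values and the `most_pervasive_bucket` helper are assumptions for illustration.

```python
from collections import Counter


def most_pervasive_bucket(intervals):
    """Return the wait bucket that is the top contributor in the most intervals."""
    dominant = Counter(
        max(waits, key=waits.get) for waits in intervals if waits
    )
    return dominant.most_common(1)[0][0]


# Assumed wait seconds per bucket for four intervals (not real data).
intervals = [
    {"Lock Contention Time": 6.5, "Disk Page Faults Time": 0.5,
     "Operating System Contention Time": 0.3},
    {"Lock Contention Time": 1.0, "Disk Page Faults Time": 3.0,
     "Operating System Contention Time": 0.2},
    {"Lock Contention Time": 7.0, "Disk Page Faults Time": 0.2,
     "Operating System Contention Time": 0.4},
    {"Lock Contention Time": 8.0, "Disk Page Faults Time": 0.1,
     "Operating System Contention Time": 0.1},
]

print(most_pervasive_bucket(intervals))  # Lock Contention Time
```

Counting dominant intervals rather than summing seconds favors a wait that shows up throughout the collection over a single large spike.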

Page 40: Performance case studies Common Europe june 2012

Seizes and Locks Waits Overview All Waits by Thread or Task …

Look at all the waits by thread or task for that wait type

Page 41: Performance case studies Common Europe june 2012

All Waits by Thread or Task All Waits for One Thread or Task

Select the job with the object lock contention time.

Look at all waits for that one thread or task.

Page 42: Performance case studies Common Europe june 2012

All Waits for One Thread or Task Interval Details

Select an interval where the wait is displayed by clicking on it.

Display the interval details for that thread or task.

Page 43: Performance case studies Common Europe june 2012

Interval Details

The information about the object waited on and who is holding the lock to that object can be found here.

The call stack is below. The call stack can give an idea of where to look to find the root cause of the problem.

Very powerful!!

Page 44: Performance case studies Common Europe june 2012

This document was developed for IBM offerings in the United States as of the date of publication. IBM may not make these offerings available in other countries, and the information is subject to change without notice. Consult your local IBM business contact for information on the IBM offerings available in your area.

Information in this document concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. Send license inquiries, in writing, to IBM Director of Licensing, IBM Corporation, New Castle Drive, Armonk, NY 10504-1785 USA.

All statements regarding IBM future direction and intent are subject to change or withdrawal without notice, and represent goals and objectives only.

The information contained in this document has not been submitted to any formal IBM test and is provided "AS IS" with no warranties or guarantees either expressed or implied.

All examples cited or described in this document are presented as illustrations of the manner in which some IBM products can be used and the results that may be achieved. Actual environmental costs and performance characteristics will vary depending on individual client configurations and conditions.

IBM Global Financing offerings are provided through IBM Credit Corporation in the United States and other IBM subsidiaries and divisions worldwide to qualified commercial and government clients. Rates are based on a client's credit rating, financing terms, offering type, equipment type and options, and may vary by country. Other restrictions may apply. Rates and offerings are subject to change, extension or withdrawal without notice.

IBM is not responsible for printing errors in this document that result in pricing or information inaccuracies.

All prices shown are IBM's United States suggested list prices and are subject to change without notice; reseller prices may vary.

IBM hardware products are manufactured from new parts, or new and serviceable used parts. Regardless, our warranty terms apply.

Any performance data contained in this document was determined in a controlled environment. Actual results may vary significantly and are dependent on many factors including system hardware configuration and software design and configuration. Some measurements quoted in this document may have been made on development-level systems. There is no guarantee these measurements will be the same on generally-available systems. Some measurements quoted in this document may have been estimated through extrapolation. Users of this document should verify the applicable data for their specific environment.

Revised September 26, 2006

Special notices

Page 45: Performance case studies Common Europe june 2012

IBM, the IBM logo, ibm.com, AIX, AIX (logo), AIX 5L, AIX 6 (logo), AS/400, BladeCenter, Blue Gene, ClusterProven, DB2, ESCON, i5/OS, i5/OS (logo), IBM Business Partner (logo), IntelliStation, LoadLeveler, Lotus, Lotus Notes, Notes, Operating System/400, OS/400, PartnerLink, PartnerWorld, PowerPC, pSeries, Rational, RISC System/6000, RS/6000, THINK, Tivoli, Tivoli (logo), Tivoli Management Environment, WebSphere, xSeries, z/OS, zSeries, Active Memory, Balanced Warehouse, CacheFlow, Cool Blue, IBM Systems Director VMControl, pureScale, TurboCore, Chiphopper, Cloudscape, DB2 Universal Database, DS4000, DS6000, DS8000, EnergyScale, Enterprise Workload Manager, General Parallel File System, GPFS, HACMP, HACMP/6000, HASM, IBM Systems Director Active Energy Manager, iSeries, Micro-Partitioning, POWER, PowerExecutive, PowerVM, PowerVM (logo), PowerHA, Power Architecture, Power Everywhere, Power Family, POWER Hypervisor, Power Systems, Power Systems (logo), Power Systems Software, Power Systems Software (logo), POWER2, POWER3, POWER4, POWER4+, POWER5, POWER5+, POWER6, POWER6+, POWER7, System i, System p, System p5, System Storage, System z, TME 10, Workload Partitions Manager and X-Architecture are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries.

A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml.

Adobe, the Adobe logo, PostScript, and the PostScript logo are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States, and/or other countries.

AltiVec is a trademark of Freescale Semiconductor, Inc.

AMD Opteron is a trademark of Advanced Micro Devices, Inc.

InfiniBand, InfiniBand Trade Association and the InfiniBand design marks are trademarks and/or service marks of the InfiniBand Trade Association.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.

Java and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.

Linear Tape-Open, LTO, the LTO Logo, Ultrium, and the Ultrium logo are trademarks of HP, IBM Corp. and Quantum in the U.S. and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries or both.

Microsoft, Windows and the Windows logo are registered trademarks of Microsoft Corporation in the United States, other countries or both.

NetBench is a registered trademark of Ziff Davis Media in the United States, other countries or both.

SPECint, SPECfp, SPECjbb, SPECweb, SPECjAppServer, SPEC OMP, SPECviewperf, SPECapc, SPEChpc, SPECjvm, SPECmail, SPECimap and SPECsfs are trademarks of the Standard Performance Evaluation Corp (SPEC).

The Power Architecture and Power.org wordmarks and the Power and Power.org logos and related marks are trademarks and service marks licensed by Power.org.

TPC-C and TPC-H are trademarks of the Transaction Performance Processing Council (TPPC).

UNIX is a registered trademark of The Open Group in the United States, other countries or both.

Other company, product and service names may be trademarks or service marks of others.

Revised December 2, 2010

Special notices (cont.)