© 2009 IBM Corporation
IBM Integration Bus V10 PerformanceHow to analyse your system to optimise performance and throughput
Martin Ross - [email protected]
© 2015 IBM Corporation
14 July 2015
Overview
The purpose of this presentation is to demonstrate how to find the cause of poor performance for an IBM Integration Bus node (broker) for two different types of problem.
The examples are obtained on a Windows system but the principles of investigation and problem determination apply equally on all platforms. The system level tools will differ though.
© 2015 IBM Corporation
14 July 2015
Agenda
IntroductionToolsTechniquesDemonstration
© 2015 IBM Corporation
14 July 2015
What are the main performance costs in message flows?
ZYX…CBAParsing Tree Navigation
Root.Body.Level1.Level2. Level3.Description.Line[1];
Tree Copying
Set OutputRoot = InputRoot;
Resource Access Processing Logic
© 2015 IBM Corporation
14 July 2015
Integration Bus Processes
Integration Node
Integration Server
Application ApplicationMessage
flowsMessage
flows
Libraries Libraries
Integration Server [n]
Application ApplicationMessage
flowsMessage
flows
Libraries Libraries
Bipservice– Lightweight and resilient process that starts
and monitors the bipbroker process– If the bipbroker process fails, bipservice will
restart it
Bipbroker– A more substantial process. Contains the
deployment manager and administrative agent. All commands, toolkit connections and WebUI go through this process.
– Responsible for starting and monitoring the biphttplistener, bipMQTT and DataFlowEngine processes.
– If any process fail, bipbroker will restart them.
BipMQTT– Handles MQTT events
Biphttplistener– Runs the brokerwide HTTP connector for
HTTP and SOAP nodes.
DataFlowEngine– Runtime engine for all deployed resources.
© 2015 IBM Corporation
14 July 2015
Which resources and how much
Understand typical resource utilisation – need to understand if resource utilisation is higher than expected or running as normal...
In busy times expect to use what is needed (!)– Exactly what will depend on the configuration and the applications– Typical to use CPU and memory plus I/O to some level
In quiet times Message Broker and MQ processes– Should use very little CPU– Should use very little I/O capacity– Will retain memory
Some memory sizes whilst running the Coordinated Request Reply sample– Bipservice 3.7 MB– Bipbroker 112 MB– Biphttplistener 35 MB– DataFlowEngine 154 MB
• Can use from ~100 MB to GigaBytes depending on number of flows, complexity of the message flow, the size of the messages
MQ processes– Expect it to be less than IBM Integration Bus (76 MB for a simple queue manager)– Will depend on number of open queues, channels, queue buffer sizes etc.
© 2015 IBM Corporation
14 July 2015
Tools that are needed
Monitoring tools– At the operating system level to observe
• System resource usage – CPU, memory, I/O activity• Heaviest resource users
– At the component level to observe• Behaviour within the particular component (MQ / IBM Integration Bus)
– Both types of tools are needed• They have different views of the world• They are complimentary
Driving tools– Needed to generate a continuous workload
• Important to assess performance after warm-up during sustained activity
© 2015 IBM Corporation
14 July 2015
UNIX tools
Vmstat
Iostat
Nmon
filemon
System Configuration: lcpu=64 mem=8192MB
kthr memory page faults cpu----- ----------- ------------------------ ------------ ----------- r b avm fre re pi po fr sr cy in sy cs us sy id wa 1 0 1977672 25823 0 0 0 0 0 0 3 958 696 4 0 96 0 1 0 1977838 25719 0 2 0 98 100 0 29 2941 2250 4 0 96 0 1 0 1977685 25872 0 0 0 0 0 0 2 636 483 4 0 96 0 System configuration: lcpu=64 drives=5 paths=6 vdisks=2
tty: tin tout avg-cpu: % user % sys % idle % iowait 0.0 29.5 3.6 0.1 96.2 0.0
Disks: % tm_act Kbps tps Kb_read Kb_wrtnhdisk3 0.0 0.0 0.0 0 0hdisk2 0.0 0.0 0.0 0 0hdisk0 0.0 4.0 1.0 8 0hdisk1 0.0 0.0 0.0 0 0cd0 0.0 0.0 0.0 0 0
© 2015 IBM Corporation
14 July 2015
Windows tools – Process Explorer
Watch system activity in detail on Windows
Watch– CPU usage– Commit charge– I/O activity– Physical memory history– Summary information– Individual processes
Download from https://technet.microsoft.com/en-us/sysinternals/bb896653.aspx
© 2015 IBM Corporation
14 July 2015
Process Explorer
DataFlowEngine.exe:– This is the Integration
Server
amqzlaa0.exe– This is the MQ agent for
LOCAL connections (including the broker)
amqrmppa.exe– This is the MQ agent for
CLIENT connections
Can quickly see system is busy
Customise by selecting columns of choice
© 2015 IBM Corporation
14 July 2015
Process Explorer – Task Information
© 2015 IBM Corporation
14 July 2015
Find Integration Server name from PIDmqsilist DEMO -d2-----------------------------------BIP1286I: Integration server 'default' on integration node 'DEMO' is running.Number of message flows that are enabled to run: '4'.Number of applications that are enabled to run: '2'.Process ID: '8248'UUID: 'e1306ebc-3c3a-43c2-b18a-bbdb99e07d5c'Short description: ''Long description: ''BIP8071I: Successful command completion.-----------------------------------BIP1286I: Integration server 'default' on integration node 'DEMO' is running.
Number of message flows that are enabled to run: '4'.Number of applications that are enabled to run: '2'.Process ID: '8248'UUID: 'e1306ebc-3c3a-43c2-b18a-bbdb99e07d5c'Short description: ''Long description: ''BIP8071I: Successful command completion.-bash-4.1$ ps -ef | grep DataFlowEnginemqm 4331 4302 0 13:12 pts/0 00:00:00 grep DataFlowEnginemqm 28788 28701 99 12:57 ? 00:41:56 DataFlowEngine CSIM d2cd939a-a7a3-46ce-8168-b89c77744511 default
© 2015 IBM Corporation
14 July 2015
Key tools at the component level
Integration Bus– User trace– Trace nodes– Activity Log– WebUI
• Accounting & Statistics: Compare flow statistics at the node (broker), server (execution group), container (application or library) or at an individual message flow level
• Resource Statistics: View resource use at the execution group level
MQ Explorer
Java Healthcenter
© 2015 IBM Corporation
14 July 2015
Statistics scope
Node (broker)
Server (execution group)
Node
Message Flow
Thread
Message ModelTerminals
ResourceStatistics
Accounting &Statistics
© 2015 IBM Corporation
14 July 2015
Accounting & Statistics
Dynamic reporting of message flow usage for• Problem determination• Chargeback• Profiling
Data at two levels1. Universal (CPU cost, IO, elapsed time)2. Broker specific (messages, commits,
backouts)
A Choice of intervals• Short or snapshot• Long or archive
Attach a label to the information
– AccountingOrigin to collect data to identify and collate this information according to the specific origin of a message, even in consolidated flows. Choice of destination
1. Trace2. PubSub3. SMF (z/OS only)
© 2015 IBM Corporation
14 July 2015
Subscribing to Accounting & Statistics
Publish/Subscribe data is published on the topic
$SYS/Broker/brokerName/StatisticsAccounting/recordType/executionGroupLabel/messageFlowLabel
A subscription for $SYS/Broker/+/StatisticsAccounting/+ receives all statistics for all brokers
Notes: The following three characters have a special meaning:– The topic level separator "/"– The multilevel wildcard "#"– The single-level wildcard "+"
© 2015 IBM Corporation
14 July 2015
Resource Statistics
Dynamic reporting of the performance and operating details of resources used by execution groups• Problem determination• Profiling
Supported resources1. CICS2. CORBA3. FTEAgenet4. JDBCConnectionPools5. JVM6. ODBC7. SOAPInput8. Security9. Outbound Sockets
Data • XML messages
distributed using PubSub
• Access as raw data or view in WebUI
Regular reporting • Data published approximately every 20 seconds
© 2015 IBM Corporation
14 July 2015
Subscribing to Resource Statistics
Data is published on the topic
$SYS/Broker/brokerName/ResourceStatistics/executionGroupLabel
A subscription for $SYS/Broker/+/ResourceStatistics/+ receives all statistics for all brokers
Notes: The following three characters have a special meaning:– The topic level separator "/"– The multilevel wildcard "#"– The single-level wildcard "+"
© 2015 IBM Corporation
14 July 2015
WebUI – Accounting & Statistics
Using the WebUI in IBM Integration Bus v10:
– Control statistics at all levels– Easily view and compare flows,
helping to understand which are processing the most messages or have the highest elapsed time
– Easily view and compare nodes, helping to understand which have the highest CPU or elapsed times
– View all statistics metrics available for each flow
– View historical flow data
© 2015 IBM Corporation
14 July 2015
WebUI – Resource Statistics
View resource statistics for resource managers in IIB such as:– JVM– ODBC– JDBC– parsers, etc.
© 2015 IBM Corporation
14 July 2015
MQ Explorer
© 2015 IBM Corporation
14 July 2015
IBM Support Assistant and Java Health Centre
Java Health Centre is provided as part of the IBM Support Assistant– Offers very low overhead monitoring tool– Runs along side an IBM Java application
Get visibility, monitoring and profiling in the following application areas:– Performance
• Java method profiling• Lock analysis• Garbage collection
– Memory– System– Java Class– File input and– Object
Enable the application JVM prior to use– IBM_JAVA_OPTIONS=-Xhealthcenter
© 2015 IBM Corporation
14 July 2015
Demonstration of analysing performance issues
Identify problems in two message flows using– Process Explorer– WebUI Statistics– MQ Explorer– Java Healthcenter
Java Compute TransformCoordinated Request Reply
© 2015 IBM Corporation
14 July 2015
Demonstration 1
Analysing a performance problem in the Coordinated Request Reply Scenario
© 2015 IBM Corporation
14 July 2015
Coordinated Request Reply message flows
Consists of three message flows– Request
• Converts incoming message from XML to CWF• Saves the incoming message in a queue for subsequent reply processing• Writes a message for the back end reply message flow
– BackendReplyApp• Sets the completion time in the message• Writes a reply message
– Reply• Reads the message from the back end message flow• Retrieves the original message saved by the request message flow• Writes an output message
© 2015 IBM Corporation
14 July 2015
Coordinated Request Reply queues
The queues– Request
– BackendReplyApp
– Reply
CSIM_SERVER_IN_Q
GET_REPTO_STORE
GET_REPTO_STORE
GET_BACKEND_REPGET_BACKEND_REQ
GET_BACKEND_REQ
GET_BACKEND_REP CSIM_COMMON_REPLY_Q
© 2015 IBM Corporation
14 July 2015
Run and investigate
Steps1. Ensure all components are started and the applications works as expected
- Message flows, databases, external applications etc.2. Start a load generator [JMSPerfharness in this case]3. Look at activity
- Is processing happening at the expected rate?- Is CPU usage as expected?- Is memory usage as expected?
4. If things do not seem as expected- Look for build up of messages- Poor service times
5. Enable and view statistics6. Analyse statistics7. Examine message flows
© 2015 IBM Corporation
14 July 2015
Step 1 – Check flows are running using the WebUI
Check the server is running
Check the flows are running
Check the event/sys log for any errors
Processing messages and no errors
© 2015 IBM Corporation
14 July 2015
Step 2 – Start a load generator
Run JMSPerfharness– Using 10 threads
All threads start successfully– Each thread PUTs a message then GETs a
message so should be no messages on queues for any period of time
Check event/sys log for any error messages
© 2015 IBM Corporation
14 July 2015
Step 3 – Look at CPU activity
Messages being processed but:– Rate is low, much lower than expected– Very little CPU being used
• Integration Server does not register any CPU activity
© 2015 IBM Corporation
14 July 2015
Step 4 – Look for a build up of messages
Key queues are Build up of messages on queues:– GET_REPTO_STORE– GET_BACKEND_REQ
What does this mean?
GET_BACKEND_REQGET_BACKEND_REP
CSIM_SERVER_IN_QGET_REPTO_STOREGET_BACKEND_REQ
GET_BACKEND_REPGET_REPTO_STORECSIM_COMMON_REPLY_Q
Request
BackendReplyApp
Reply
© 2015 IBM Corporation
14 July 2015
Step 4 – Look for a build up of messages...
Looking at the flows– Queue GET_REPTO_STORE is used by Request and Reply message flows– Queue GET_BACKEND_REQ is used by BackendReplyApp message flow
– GET_REPTO_STORE is used mid-flow (so flows using this are less likely to be the problem)– GET_BACKEND_REQ is the input queue for the BackendReplyApp
• Indicates flow is not running fast enough or not enough instances allocated
Need to investigate what is happening with BackendReplyApp– For this use WebUI flow statistics
© 2015 IBM Corporation
14 July 2015
Step 5 – Enable flow statistics
Start and stop statistics using the WebUI for:– All flows in a server– All flows in a container– Individual flows
© 2015 IBM Corporation
14 July 2015
Step 5 – View statistics
Select the statistics view
Drill down to the problem flow
Start by comparing flows
Flow analysis view for most detail
© 2015 IBM Corporation
14 July 2015
Step 6 – Compare flows
Compare flows to determine which one might be causing the problem
We can see that the BackendReplyApp flow has an average elapsed time of 1,000.9 milliseconds. It only has 1 active thread, and has processed 20 messages in the 20 second statistical snapshot period.
This matches the rate we see in JMSPerfHarness!
© 2015 IBM Corporation
14 July 2015
Step 6 – Analyse the flow
Display historical flow details such as message rate, CPU and elapsed time
View all nodes within the flow to determine and sort by average elapsed and CPU times
The compute node Modify_CompletionTime seems to be a problem!
What does high elapsed time and low CPU time suggest the problem might be?
© 2015 IBM Corporation
14 July 2015
Step 7 – Review the code
Having worked out which node is causing the problem
We can quickly see why the node is taking 1 second elapsed time but little CPU
© 2015 IBM Corporation
14 July 2015
Problem found!!
1 second sleep in the compute node within the message flow is causing slow processing times and no CPU usage
– Matches the observations at the start• Low CPU and low message rate
Unlikely to be so easy in future but slow service times, like slow synchronous web service invocations would have the same effect
If it was slow web service response times then allocate more additional instances to improve processing rate
© 2015 IBM Corporation
14 July 2015
Summary of steps for this investigation
Use a systemic approach– Key steps used were
1.Ensure all components are started and the applications works as expected- Message flows, databases, external applications etc.
2.Start a load generator [JMSPerfharness in this case]3.Look at activity
- Is processing happening at the expected rate?- Is CPU usage as expected?- Is memory usage as expected?
4.If things do not seem as expected- Look for build up of messages- Poor service times
5.Enable and view statistics6.Analyse statistics7.Examine message flows
It is very important to– Use tools
• System level and component level– Start at a high level – system level and then close-in on the problem
© 2015 IBM Corporation
14 July 2015
Demonstration 2
Analysing a performance problem in the Java Compute Transform Scenario
© 2015 IBM Corporation
14 July 2015
JavaComputeTransform message flows
Consists of one message flow– JavaCompute
• Reads an XML message• Transforms to a different format using a Java Compute node
JAVA_COMPUTE_IN JAVA_COMPUTE_OUT
© 2015 IBM Corporation
14 July 2015
What is the problem we need to solve?
The problem is characterised by– Low message rate– High CPU usage at both system and Integration Server level– Sufficient messages on the input queue
Likely issue is one of high CPU usage in a message flow– But which flow and which node?
© 2015 IBM Corporation
14 July 2015
Compare the flows
All of the elapsed and CPU time is in the JavaCompute message flow, so continue investigation here
© 2015 IBM Corporation
14 July 2015
Finding the processing Node for investigation
The majority of the elapsed and CPU time within the flow is spent in the JavaCompute Node
What might cause this?
As this is a Java Compute Node continue investigation using the Java Healthcenter
© 2015 IBM Corporation
14 July 2015
Find the Integration Server port for Java Health Center
Environment variable:
IBM_JAVA_OPTIONS=-Xhealthcenter
Opens ports starting 1972, the Integration Server running the JavaComputeTransform scneario is using port 1974
© 2015 IBM Corporation
14 July 2015
Alternate method for finding the port number
> mqsilist DEMO -d2
-----------------------------------
BIP1286I: Integration server 'default' on integration node 'DEMO' is running.
Number of message flows that are enabled to run: '4'.
Number of applications that are enabled to run: '2'.
Process ID: '7284'
UUID: 'e1306ebc-3c3a-43c2-b18a-bbdb99e07d5c'
Short description: ''
Long description: ''
BIP8071I: Successful command completion.
> netstat -a -b -n -o
[runmqlsr.exe]
TCP [::]:1972 [::]:0 LISTENING 7352
[bipbroker.exe]
TCP [::]:1973 [::]:0 LISTENING 7748
[biphttplistener.exe]
TCP [::]:1974 [::]:0 LISTENING 7284
[DataFlowEngine.exe]
TCP [::]:4417 [::]:0 LISTENING 7352
[bipbroker.exe]
TCP [::]:49152 [::]:0 LISTENING 924
© 2015 IBM Corporation
14 July 2015
Invoking the Java Health Center
© 2015 IBM Corporation
14 July 2015
Attaching to the Integration Server JVM
© 2015 IBM Corporation
14 July 2015
Connect to a port
© 2015 IBM Corporation
14 July 2015
Connection complete and ready to analyse
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – CPU
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Classes
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Environment
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Garbage Collection
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – I/O
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Locking
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Native Memory
© 2015 IBM Corporation
14 July 2015
Analysis and Recommendations – Profiling
© 2015 IBM Corporation
14 July 2015
The cause
Having worked out which node is causing the problem
We can quickly see why the node is consuming a lot of CPU
A call to the method bubble_sort() just before propagating out of the node is sorting the entire output message
© 2015 IBM Corporation
14 July 2015
If you suspect there is a product problem
Identify the problem as best you can
Find the simplest test that recreates the problem
Collect the data identified in the Must Gather list– For IIB http://www-01.ibm.com/support/docview.wss?rs=849&uid=swg21209857– For MQ: http://www-01.ibm.com/support/docview.wss?uid=swg21229861#MG6
© 2015 IBM Corporation
14 July 2015
Summary
Wide range of tools available covering operating system and component performance– Expect to use multiple tools– After all it is important to understand what is happening at different levels– Demonstration has shown how to use the key tools for MQ and IIB to debug a problem
Practice before hand– Being familiar with the tools is a great help in a crisis– Learning a new tool and solving a crisis is not a good combination
Know your applications and systems– What is normal in terms of processing rate, CPU usage etc.– This information allows to know whether there is a problem and to what extent
© 2015 IBM Corporation
14 July 2015
Additional Information
WebSphere Message Broker: Designing for Performance– http://www-01.ibm.com/support/docview.wss?rs=849&uid=swg24006518
WebSphere Message Broker: Message display, test & performance utilities (IH03)– http://www-01.ibm.com/support/docview.wss?rs=171&uid=swg24000637
IBM Monitoring and Diagnostic Tools for Java – Getting started with Health Center– http://www.ibm.com/developerworks/java/jdk/tools/healthcenter/getting_started.html
IBM Monitoring and Diagnostic Tools for Java – Health Center– http://www.ibm.com/developerworks/java/jdk/tools/healthcenter/
IBM Monitoring and Diagnostic Tools for Java – Knowledge Center– http://www-01.ibm.com/support/knowledgecenter/#!/SS3KLZ/SS3KLZ/welcome_tools_family.html
© 2015 IBM Corporation
14 July 2015
Backup chart
MQ processes
Additional Instances usage and tuning
© 2015 IBM Corporation
14 July 2015
WebSphere MQ V7 Processes
Task Function
AMQALMPX The checkpoint processor that periodically takes journal checkpoints.
AMQZMUC0 Utility manager. This job executes critical queue manager utilities, for example the journal chain manager.
AMQZXMA0 The execution controller that is the first job started by the queue manager. It handles MQCONN requests, and starts agent processes to process WebSphere MQ API calls
AMQZFUMA Object authority manager (OAM) AMQZLAA0 Queue manager agents that perform most of the work for applications that connect to
the queue manager using MQCNO_STANDARD_BINDING. AMQZLAS0 Queue manager agent. AMQZMUF0 Utility Manager AMQZMGR0 Process controller. This job is used to start up and manage listeners and services. AMQZMUR0 Utility manager. This job executes critical queue manager utilities, for example the
journal chain manager. AMQZDMAA Deferred Message Processor
AMQFQPUB Publish/subscribe process.
AMQFCXBA Broker worker job.
RUNMQBRK Broker control job.
AMQRMPPA Channel process pooling job.
AMQCRSTA TCP/IP-invoked channel responder.
© 2015 IBM Corporation
14 July 2015
WebSphere MQ V7 Processes
Task Function
AMQCRS6B LU62 receiver channel and client connection.AMQRRMFA Repository manager for clusters.
AMQCLMAA Non-threaded TCP/IP listener.
AMQPCSEA PCF command processor that handles PCF and remote administration requests. RUNMQTRM Trigger monitor.
RUNMQDLQ Dead letter queue handler. RUNMQCHI The channel initiator. RUNMQCHL Sender channel job that is started for each sender channel. RUNMQLSR Threaded TCP/IP listener. AMQXSSVN Shared memory servers.
AMQRCMLA Channel MQSC and PCF command processor.
AMQZTRCN Trace.
Number present at any time will vary– Dependent on configuration, applications running, etc.– Some will always be present, such as AMGZXMA0 (Execution Controller)
© 2015 IBM Corporation
14 July 2015
Additional Instances usage and tuning
Integration Server level data contains the following data for each message flow in it:
– MessageFlowName– TotalElapsedTime– MaximumElapsedTime– MinimumElapsedTime– TotalCPUTime– MaximumCPUTime– MinimumCPUTime– CPUTimeWaitingForInputMessage– ElapsedTimeWaitingForInputMessage– TotalInputMessages– TotalNumberOfTimeOutsWaitingForRepliesToAggregateMessages
– TotalSizeOfInputMessages– MaximumSizeOfInputMessages– MinimumSizeOfInputMessages– NumberOfThreadsInPool– TimesMaximumNumberOfThreadsReached– TotalNumberOfMQErrors– TotalNumberOfMessagesWithErrors– TotalNumberOfErrorsProcessingMessages– TotalNumberOfCommits– TotalNumberOfBackouts
Fields NumberOfThreadsInPool and TimesMaximumNumberOfThreadsReached show for every message flow the number of additional instances allocated and the number of times they were all used
– Use this data to determine if:• More additional instances are required• Too many are allocated
© 2015 IBM Corporation
14 July 2015
Additional Instances usage and tuning
% Time Thread Pool Limit Reached = TimesMaximumNumberOfThreadsReached / TotalInputMessages