Performance Forensics - Understanding Application Performance
An introduction to performance measurement and diagnosis
Performance Forensics
Alois Reitbauer, dynaTrace Software
Forensics
forensis, adj.: "of or before the forum." In Roman times, a criminal charge meant presenting the case before a group of public individuals in the forum. … The individual with the best argument and delivery would determine the outcome of the case.
Collecting Evidence
What does this chart tell us?
[Chart: response time over time]
A - Response time problem?
B - CPU problem?
C - Sync problem?
D - Database problem?
We don't know
We have to collect our evidence first
System/OS/Virtualization-Level
Container/App Server-Level
Application-Level
User-Level
Multi-Layered Measurement
Understand your measurements
Response Time only
Response Time and GC
A: Our response time is 2.3 seconds
B: Our response time is 1.5 seconds
C: Our response time is 6 seconds
How can this happen?
Browser
Firewall
Network Sniffer
Web Server
Application Server
Page Load Time
HTTP Request Time
Request Time (max)
95% Servlet Time
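The three different answers can all be correct at once if each one is taken at a different measurement point and summarized with a different statistic. The following is a minimal sketch of that effect; the timings, the field names (`servlet`, `network`, `render`) and the `percentile` helper are hypothetical and not taken from the talk.

```python
import math

# Hypothetical per-request timings in seconds (illustrative only).
# servlet: time spent inside the application server
# network: transfer time between server and browser
# render:  time the browser needs to build the page
requests = [
    {"servlet": 0.4, "network": 0.3, "render": 0.8},
    {"servlet": 0.5, "network": 0.3, "render": 0.8},
    {"servlet": 0.6, "network": 0.4, "render": 0.8},
    {"servlet": 1.5, "network": 0.5, "render": 0.8},
    {"servlet": 4.0, "network": 2.0, "render": 0.8},
]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# The same requests, seen from three measurement points.
servlet_times = [r["servlet"] for r in requests]
http_times = [r["servlet"] + r["network"] for r in requests]
page_load_times = [r["servlet"] + r["network"] + r["render"] for r in requests]

report = {
    "avg page load time": sum(page_load_times) / len(page_load_times),
    "95th percentile servlet time": percentile(servlet_times, 95),
    "max HTTP request time": max(http_times),
}

for name, value in report.items():
    print(f"{name}: {value:.1f} s")
```

Each line of the report is a legitimate "response time", which is why a number is only useful together with where and how it was measured.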
The beauty of measures ...
... is that there are so many to choose from
Types of Measurements
• Cyclic Measurements
– Are collected at regular time intervals
– Are time based
– JMX, CPU, Memory
• Event-based measurements
– Are collected as a request occurs
– Are transactional
– Response Times, CPU consumption
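The two collection styles can be sketched side by side. This is an illustrative sketch only: `measure_request` and `sample_gauge` are hypothetical helper names, and a real cyclic sampler would usually run in a background thread rather than inline as it does here.

```python
import time
from contextlib import contextmanager

event_samples = []   # one entry per request (event-based)
cyclic_samples = []  # one entry per sampling tick (cyclic)


@contextmanager
def measure_request(name):
    """Event-based: record the duration of one request at the moment it occurs."""
    start = time.perf_counter()
    try:
        yield
    finally:
        event_samples.append((name, time.perf_counter() - start))


def sample_gauge(read_value, ticks, interval_s):
    """Cyclic: read a gauge at a fixed interval, whether or not there is traffic."""
    for _ in range(ticks):
        cyclic_samples.append((time.time(), read_value()))
        time.sleep(interval_s)


# One request produces exactly one event-based sample.
with measure_request("login"):
    time.sleep(0.01)  # stand-in for real work

# Three ticks produce exactly three cyclic samples, independent of load.
sample_gauge(lambda: len(event_samples), ticks=3, interval_s=0.01)

print(len(event_samples), "event sample(s),", len(cyclic_samples), "cyclic sample(s)")
```

The number of event-based samples follows the load, while the number of cyclic samples follows the clock, which is why the two kinds call for different statistics.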
Types of Statistics
• Min/Max
• Average
• Median
• Percentiles
Use percentiles for event-based measures and averages (or max) for cyclical measures
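A small worked example of why this rule of thumb helps, using Python's standard `statistics` module. The sample values are invented for illustration: 20 requests of which two are slow, and five CPU readings taken at regular intervals.

```python
import statistics

# Event-based measure: hypothetical response times in seconds, one per request.
response_times = [0.2] * 18 + [6.0, 7.0]

mean = statistics.mean(response_times)
median = statistics.median(response_times)

# quantiles(..., n=100) returns 99 cut points; index 94 is the 95th percentile.
cuts = statistics.quantiles(response_times, n=100, method="inclusive")
p95 = cuts[94]

print(f"mean={mean:.2f}s median={median:.2f}s p95={p95:.2f}s")

# Cyclic measure: hypothetical CPU utilization in percent, sampled at a fixed interval.
cpu_samples = [10, 12, 11, 95, 10]
cpu_avg = statistics.mean(cpu_samples)
cpu_max = max(cpu_samples)

print(f"cpu avg={cpu_avg:.1f}% max={cpu_max}%")
```

The mean of the response times describes no real user: most requests are far faster and the slow ones are far slower. The median and the 95th percentile together show both groups.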
How you measure is important
Typical Measurements we work with
• Memory
– Consumption, GC
• CPU
– Usage, Load Average
• Response Time
– Transactions
• Database
– Statements, Pool Sizes
• Communication
– Calls, Latency, Size, Threads
A Clearer Picture
What is the problem?
We have response times of 6 seconds.
We have response times of 6 seconds for 95 percent of our users at a load of 500 users with a CPU utilization of 10 percent.
Identify the suspect
Don't trust your assumptions
Top 10 Optimizations are bad
Separate transaction types
Baseline and Delta
Understand the difference
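One way to put "separate transaction types" and "baseline and delta" into practice is to compare each transaction type against its own baseline. The sketch below assumes invented transaction names and timings, and the `compare` helper is hypothetical.

```python
import statistics

# Hypothetical response times in seconds, grouped by transaction type.
baseline = {
    "search":   [0.20, 0.22, 0.21, 0.25, 0.23],
    "checkout": [1.0, 1.1, 0.9, 1.2, 1.0],
}
current = {
    "search":   [0.21, 0.22, 0.20, 0.24, 0.23],
    "checkout": [2.0, 2.2, 1.9, 2.1, 2.0],
}


def compare(baseline, current):
    """Return the median per transaction type for both runs and the relative change."""
    rows = {}
    for tx in sorted(baseline.keys() & current.keys()):
        before = statistics.median(baseline[tx])
        after = statistics.median(current[tx])
        rows[tx] = {
            "baseline": before,
            "current": after,
            "delta_pct": (after - before) / before * 100,
        }
    return rows


result = compare(baseline, current)
for tx, row in result.items():
    print(f"{tx}: {row['baseline']:.2f}s -> {row['current']:.2f}s ({row['delta_pct']:+.0f}%)")
```

Mixing both transaction types into one pool would blur the picture: the unchanged search requests would dilute the regression that only affects checkout.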
Make the problem reproducible. Otherwise you cannot check whether you fixed it.
When are we done?
... when there are no more why questions
Solving the case
Ensure you solved the real problem
Fight problems not symptoms
When a measure supports a problem, check all measures affected by the problem.
Have you tuned at the right place?
Watch out for side effects
The usual suspects
O/R Access
Rendering
State Handling
Latency
Data Volume
Comm. Behavior
JavaScript
Database
Business Tier
Browser
Web Tier
Data Volume
Number of Requests
Memory and GC
Memory and GC
Client Application
Stub
Serialization
Client Infrastructure
Networking
Server Infrastructure
Deserialization
Facade
Server Application
Application Developers View
Remoting Stack
Avoid Protocol Overhead
Reduce Interactions
Create Data Locality
Adjust Interfaces
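"Reduce interactions" and "adjust interfaces" can be illustrated by counting round trips. In this sketch `RemoteCustomerService` is a hypothetical stand-in for a remote stub: it counts calls instead of doing real network I/O, and the customer data is invented.

```python
class RemoteCustomerService:
    """Stand-in for a remote stub; every method call represents one network round trip."""

    def __init__(self, customers):
        self._customers = customers
        self.round_trips = 0

    # Fine-grained interface: one remote call per attribute.
    def get_name(self, customer_id):
        self.round_trips += 1
        return self._customers[customer_id]["name"]

    def get_email(self, customer_id):
        self.round_trips += 1
        return self._customers[customer_id]["email"]

    def get_city(self, customer_id):
        self.round_trips += 1
        return self._customers[customer_id]["city"]

    # Coarse-grained interface: one call returns the whole record.
    def get_customer(self, customer_id):
        self.round_trips += 1
        return dict(self._customers[customer_id])


data = {42: {"name": "Ann", "email": "ann@example.com", "city": "Linz"}}

chatty = RemoteCustomerService(data)
chatty_result = {
    "name": chatty.get_name(42),
    "email": chatty.get_email(42),
    "city": chatty.get_city(42),
}

coarse = RemoteCustomerService(data)
coarse_result = coarse.get_customer(42)

print("chatty round trips:", chatty.round_trips)
print("coarse round trips:", coarse.round_trips)
```

Both variants return the same data, but each round trip pays latency, serialization and protocol overhead again, so the coarse-grained call is usually cheaper once the network is involved.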
Application Code
O/R Mapping Layer
Caching Layer
Connection Pool
Connection
Statement
PreparedStatement
Result Set
SQL
TCP/IP
Database
The DB layer
Persistence Framework
JDBC Layer
Database
Execution Plan Cache
Prepared Statement Cache
Cross-Session Cache
Session Caches
Query Cache(s)
Caching in the DB layer
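As a rough illustration of what a session-level cache does, the sketch below serves repeated reads of the same entity from memory. `SessionCache` and `load_from_db` are hypothetical names, not the API of any particular persistence framework, and invalidation on writes is deliberately left out.

```python
class SessionCache:
    """Keeps entities loaded during one session so repeated reads skip the database."""

    def __init__(self, load_from_db):
        self._load = load_from_db
        self._entities = {}
        self.db_calls = 0

    def get(self, key):
        if key not in self._entities:
            # Cache miss: go to the database once and remember the result.
            self.db_calls += 1
            self._entities[key] = self._load(key)
        return self._entities[key]


def load_from_db(key):
    """Stand-in for a real database read."""
    return {"id": key, "name": f"customer-{key}"}


session = SessionCache(load_from_db)
first = session.get(7)
second = session.get(7)   # served from the cache
other = session.get(8)    # different entity, needs its own read

print("database calls:", session.db_calls)
```

The caches in the stack differ mainly in scope and lifetime, so a read that misses one layer may still be answered by the next one before it reaches the database.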
Reduce DB Calls
Tune Loading Behavior
Optimize for caching
Define Proper Entities
select … from a,b,c
select … from b,c,a
… join fetch ….
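The effect of loading related data with one joined statement instead of one statement per parent row can be shown with the standard `sqlite3` module. This is a sketch under assumptions: the `orders` and `items` tables, their contents and the `query` helper are invented, and plain SQL stands in for what an O/R mapper would generate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE items (id INTEGER PRIMARY KEY, order_id INTEGER, name TEXT);
INSERT INTO orders VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
INSERT INTO items VALUES (1, 1, 'pen'), (2, 1, 'ink'), (3, 2, 'cup'), (4, 3, 'mug');
""")

counter = {"statements": 0}


def query(sql, params=()):
    """Run one statement and count it, so both loading strategies can be compared."""
    counter["statements"] += 1
    return conn.execute(sql, params).fetchall()


def load_lazy():
    """One statement for the orders plus one per order for its items (N+1)."""
    result = {}
    for order_id, customer in query("SELECT id, customer FROM orders ORDER BY id"):
        rows = query("SELECT name FROM items WHERE order_id = ? ORDER BY id", (order_id,))
        result[customer] = [name for (name,) in rows]
    return result


def load_joined():
    """One joined statement that fetches orders and items together."""
    result = {}
    rows = query(
        "SELECT o.customer, i.name FROM orders o "
        "JOIN items i ON i.order_id = o.id ORDER BY o.id, i.id"
    )
    for customer, name in rows:
        result.setdefault(customer, []).append(name)
    return result


counter["statements"] = 0
lazy = load_lazy()
lazy_statements = counter["statements"]

counter["statements"] = 0
joined = load_joined()
joined_statements = counter["statements"]

print("lazy:", lazy_statements, "statement(s); joined:", joined_statements, "statement(s)")
```

With three orders the lazy variant already issues four statements, and the count grows with every additional order, while the joined variant stays at one.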
The Web Layer
Browser
Server
JavaScript Performance
HTML Rendering
Many AJAX/HTTP calls
High Latency
High Data Volume
Thread/Connection Pools
Network
Database Access
Web Service/Backend Calls
Browser
Caching on the Web
Clients
Server
Cache per Client
Server providing Caching Information
Proxy Cache for Many Clients
Server-side Data Cache
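"Server providing caching information" usually means sending standard HTTP headers such as `Cache-Control` and `ETag`, and answering conditional requests with `304 Not Modified`. The handler below is a framework-independent sketch; the `respond` function, its parameters and the five-minute lifetime are assumptions made for illustration.

```python
import hashlib


def respond(body: bytes, request_headers: dict, max_age: int = 300):
    """Return (status, headers, body) for a cacheable resource."""
    # A validator derived from the content: it changes whenever the body changes.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    headers = {
        "Cache-Control": f"public, max-age={max_age}",
        "ETag": etag,
    }
    if request_headers.get("If-None-Match") == etag:
        # The client or proxy already holds the current version: send no body.
        return 304, headers, b""
    return 200, headers, body


page = b"<html><body>product list</body></html>"

# First request: full response plus caching information.
status1, headers1, body1 = respond(page, {})

# Revalidation: the cache sends back the ETag it received earlier.
status2, headers2, body2 = respond(page, {"If-None-Match": headers1["ETag"]})

print(status1, len(body1), "bytes;", status2, len(body2), "bytes")
```

Within the `max-age` window a browser or proxy cache can answer without contacting the server at all, and afterwards the conditional request saves the data volume even though it still costs one round trip.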
Alois Reitbauer
@AloisReitbauer
blog.dynatrace.com