Save the Users! Monitoring and Diagnosing Problems in Production
TRANSCRIPT
![Page 1: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/1.jpg)
Save the Users! Monitoring and Diagnosing Problems in Production
Steven Haines
J2EE Architect and Evangelist
Quest Software
![Page 2: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/2.jpg)
Agenda
• Speaker introduction
• Overview of the problem
• Overview of the strategy throughout the development lifecycle
• What needs to be monitored
• Solving problems quickly in production
• Questions & Answers
![Page 3: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/3.jpg)
Speaker Introduction
• J2EE Architect and Evangelist for Quest Software
• Author of Java 2 Primer Plus and Java 2 From Scratch
• Co-author of Java Web Services Unleashed
• Java Host and columnist on InformIT.com (Pearson Education)
• Java Instructor at the University of California, Irvine (UCI) and previously Learning Tree University (LTU)
• Recruited as a J2EE architect in the “real world”
![Page 4: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/4.jpg)
Why Do I Need To Worry?
![Page 5: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/5.jpg)
Market Trends: The Pain of J2EE
IT must answer to the business
Fewer than 20% of J2EE applications meet their performance SLAs in production. (IDC Research)
![Page 6: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/6.jpg)
The Cost of Failure
Business to Consumer
– Site abandonment = lost revenue
Business to Business
– Damaged business relationships = lost opportunity
Internal
– Loss of organizational efficiency
– Slower time-to-market
![Page 7: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/7.jpg)
The Strategy:
Performance Throughout the Development Lifecycle
![Page 8: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/8.jpg)
Analysts & Industry-Experts Agree…
Gartner recommends you approach J2EE performance throughout the lifecycle
![Page 9: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/9.jpg)
Full Lifecycle Analysis
Application-level code assurance
Certify applications before deployment
24x7 application performance management
![Page 10: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/10.jpg)
In Development…
Put performance requirements in use cases
Unit test your components for performance
– Both for memory usage and response time
Test your application for performance at every integration milestone
Integration of un-tuned components is analogous to building a car with broken parts!
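As a rough illustration of unit testing a component for both response time and memory usage, here is a minimal sketch. It is written in Python for brevity rather than in the J2EE stack the talk targets. The names `measure`, `check_budget` and `build_report`, and the budget values, are hypothetical. `tracemalloc` only counts allocations made through the Python allocator, so treat the memory figure as an approximation.

```python
import time
import tracemalloc


def measure(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def build_report(n):
    """Hypothetical component under test: formats n report lines."""
    return [f"row {i}" for i in range(n)]


def check_budget(elapsed, peak, max_seconds, max_bytes):
    """Return a list of budget violations; an empty list means a pass."""
    problems = []
    if elapsed > max_seconds:
        problems.append(f"response time {elapsed:.3f}s exceeds {max_seconds}s")
    if peak > max_bytes:
        problems.append(f"peak memory {peak} bytes exceeds {max_bytes} bytes")
    return problems


# Example: fail the unit test if the component misses either budget.
_, elapsed, peak = measure(build_report, 1000)
violations = check_budget(elapsed, peak, max_seconds=1.0, max_bytes=10_000_000)
```

The budgets would come from the performance requirements written into the use case, so a regression shows up at the component level before integration.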
![Page 11: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/11.jpg)
In QA Testing…
Test performance along with functionality
Try to create load scripts that mirror your users’ actions
Analyze the reality (as much as possible)
Failed performance is not acceptable
![Page 12: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/12.jpg)
The Performance Stakeholders
What code is behind the symptom?
Is the application architecture a problem?
What component is at fault?
Who should fix the problem?
Which SQL statements need tuning?
Is the DB really the problem?
Is the application available?
Is the app server configured correctly?
![Page 13: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/13.jpg)
In Live Production…
Measure end-user performance
Watch for resource contention problems
Make sure you get warnings early, but avoid alarm storms
Keep historical data for trending and capacity planning
![Page 14: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/14.jpg)
What Needs To Be Monitored?
![Page 15: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/15.jpg)
End-User Measurement
This is the most important measurement
Passive versus Active
A combination gives the best balance
Either way you must be able to follow users through the system
![Page 16: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/16.jpg)
Resource Contention
Helps you to avoid system failure
Makes capacity planning easier and more reliable
Combine with end-user data
Tiered, composite alerts make your life easier
![Page 17: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/17.jpg)
Setting Up An Alert
• Tiered alerts
  – Normal
  – Warning
  – Critical
  – Fatal
• Composite conditions
• Intelligent messaging
• Evaluation options
• Actions when triggered versus actions when cleared
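The alert features listed above can be sketched in a few lines. This is an illustrative Python sketch, not the configuration model of any particular monitoring product. `TieredAlert`, its fields, the metric names and the threshold values are all assumptions. A tier is reached only when every metric in its composite condition crosses its threshold. The `consecutive` evaluation option requires several matching samples in a row before the state changes. The alert logs a different message when it triggers than when it clears.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Severity(IntEnum):
    NORMAL = 0
    WARNING = 1
    CRITICAL = 2
    FATAL = 3


@dataclass
class TieredAlert:
    """Composite alert: all metrics must cross a tier's thresholds to reach it."""
    # thresholds[severity] maps metric name -> minimum value for that tier
    thresholds: dict
    # Evaluation option: samples in a row needed before the state changes
    consecutive: int = 2
    state: Severity = Severity.NORMAL
    _candidate: Severity = Severity.NORMAL
    _streak: int = 0
    log: list = field(default_factory=list)

    def classify(self, sample):
        """Return the highest tier whose conditions all hold for this sample."""
        level = Severity.NORMAL
        for severity in sorted(self.thresholds):
            limits = self.thresholds[severity]
            if all(sample[name] >= limit for name, limit in limits.items()):
                level = severity
        return level

    def evaluate(self, sample):
        """Feed one sample; change state only after enough matching samples."""
        level = self.classify(sample)
        if level == self.state:
            self._streak = 0
            return self.state
        if level == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = level, 1
        if self._streak >= self.consecutive:
            previous, self.state = self.state, level
            self._streak = 0
            if level == Severity.NORMAL:
                # Action when cleared
                self.log.append(f"CLEARED (was {previous.name}): {sample}")
            else:
                # Action when triggered; the message carries the offending values
                self.log.append(f"{level.name}: {sample}")
        return self.state


# Hypothetical thresholds: slow responses alone do not alert; slow AND busy does.
alert = TieredAlert(thresholds={
    Severity.WARNING: {"response_ms": 2000, "cpu_pct": 70},
    Severity.CRITICAL: {"response_ms": 5000, "cpu_pct": 90},
})
```

Requiring consecutive samples is one simple way to get early warnings without an alarm storm from a single spike.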
![Page 18: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/18.jpg)
Loaded System Behavior
[Figure: response time (R), throughput (X), and utilization (U) plotted against the number of concurrent users (load). Labeled regions: Light Load, Heavy Load, Buckle Zone, Resource Saturated.]
![Page 19: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/19.jpg)
Consider Service Demand
Best measure of resource utilization
Service Demand = Utilization / Throughput
Normalizes your utilization against throughput, provides clarity
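A short worked example of the formula above, as a Python sketch. The function name and the before/after measurements are hypothetical. Utilization is taken as a busy fraction and throughput as completed requests per second, so service demand comes out in resource-seconds per request.

```python
def service_demand(utilization, throughput):
    """Service demand D = U / X: resource-seconds consumed per completed request.

    utilization: busy fraction of the resource over the interval (0.0 to 1.0)
    throughput:  completed requests per second over the same interval
    """
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return utilization / throughput


# Hypothetical measurements of one CPU, before and after a release
before = service_demand(utilization=0.40, throughput=50.0)  # about 0.008 s/request
after = service_demand(utilization=0.60, throughput=60.0)   # about 0.010 s/request
```

In this example utilization rose from 40% to 60%, but throughput rose too, so utilization alone does not say whether the application got worse. Service demand shows that each request now costs about 10 ms of CPU instead of 8 ms.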
![Page 20: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/20.jpg)
Breadth And Depth
J2EE problems come from many points across the system – not just the application or application server (Gartner Group)
You need to combine a broad, system-wide view and deep domain-specific data
![Page 21: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/21.jpg)
Supporting Systems
Problem areas according to an IBM study
1. Database
2. Application Code
3. Application Server Configuration
4. Infrastructure: OS, Network
![Page 22: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/22.jpg)
J2EE Complexity: Vertical and Horizontal
![Page 23: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/23.jpg)
Solving Problems Quickly In Production
![Page 24: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/24.jpg)
Solving Problems Effectively
Fast detection + clear diagnosis = quick resolution
Need to be able to transition from detecting to diagnosing problems quickly
Triaging is essential
![Page 25: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/25.jpg)
What Makes Fast Detection Possible?
Targeted, composite alerts
Trending analysis
A clear process for dealing with issues
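One simple form of trending analysis is to fit a straight line to historical samples and estimate when the line will cross a limit. The Python sketch below assumes equally spaced samples and a linear trend. The function names and the sample data are hypothetical, and real capacity planning would usually need something more robust than a single line fit.

```python
def linear_trend(values):
    """Least-squares slope and intercept for equally spaced samples."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until(values, limit):
    """Periods after the last sample until the trend line reaches limit.

    Returns None when the trend is flat or falling.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope
    return max(0.0, crossing - (len(values) - 1))


# Hypothetical weekly peak utilization (%) of a connection pool
weekly_peak = [50, 52, 54, 56, 58]
weeks_left = periods_until(weekly_peak, limit=80)
```

With these sample values the trend climbs two points per week, so the estimate is 11 weeks until the 80% limit is reached. That gives time to act before any alert fires.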
![Page 26: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/26.jpg)
How Do I Get Clear Diagnostic Data?
Make sure you can get deep data from:
– Application code
– Application server
– Database
– OS, Network and support systems
The data needs to be presented in a way that is tied to end-user requests
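To illustrate what tying diagnostic data to end-user requests can look like, the Python sketch below groups per-tier timings by a request identifier. The tuple layout, the tier names and the request ids are assumptions made for the example. How a real system propagates a request id across tiers is outside this sketch.

```python
from collections import defaultdict


def breakdown_by_request(samples):
    """Group per-tier timings by request id.

    samples: iterable of (request_id, tier, milliseconds) tuples.
    Returns {request_id: {tier: total_ms}}.
    """
    requests = defaultdict(lambda: defaultdict(float))
    for request_id, tier, millis in samples:
        requests[request_id][tier] += millis
    return {rid: dict(tiers) for rid, tiers in requests.items()}


def slowest_tier(tiers):
    """Return the tier that contributed the most time to one request."""
    return max(tiers, key=tiers.get)


# Hypothetical timings collected from different tiers
samples = [
    ("req-1", "app_code", 120.0),
    ("req-1", "database", 900.0),
    ("req-1", "database", 100.0),
    ("req-2", "app_server", 30.0),
]
per_request = breakdown_by_request(samples)
```

Here `slowest_tier(per_request["req-1"])` points at the database, which tells the operator which domain expert to involve for that user's slow request.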
![Page 27: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/27.jpg)
Guaranteeing Quick Resolution
Quick resolution is in the hands of the domain experts
They can work miracles with the right data at their fingertips
Getting the data smoothly from production to the developer is essential
![Page 28: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/28.jpg)
Conclusions
![Page 29: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/29.jpg)
Conclusions
Take a full lifecycle approach
Measure end-user response time
Track resource utilization / saturation
Ensure a smooth transition from detection to diagnosis and resolution
![Page 30: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/30.jpg)
Quest APM Suite for J2EE
Product Overview
![Page 31: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/31.jpg)
An integrated solution that empowers all the stakeholders in J2EE application performance management to accelerate the detection, diagnosis and resolution of business-threatening performance issues.
Quest’s Application Performance Management Suite for the J2EE platform
![Page 32: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/32.jpg)
Breadth And Depth
[Diagram labels: Detect, Diagnose, Resolve; High-Level Systemic View; Deep Source-Code View]
• Expert advice and intuitive interfaces make finding the root cause of problems simple in:
  – The application server
  – The database
  – ERP, CRM, network or operating system
• Our developers’ real-world experience creates tools that are intuitive for your domain experts
• Broad coverage ensures problems are found before they impact your users
![Page 33: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/33.jpg)
Measuring End-Users and the System
Customizable web dashboard may include all of the following and more…
Availability
Business Status
System Usage
Alert Viewing
![Page 34: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/34.jpg)
Business- and Silo-Specific Reporting
Choose from hundreds of out-of-the-box reports
Create fully customized reports
Automate their generation
Control how often they are created
Use them to review historical performance and plan capacity
![Page 35: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/35.jpg)
Problem Solving in QA and Production
Drill down to the Java components
– Enterprise JavaBeans (EJBs)
– Servlets, JSPs
– HTTP sessions
– Class and method response times
Find JDBC and database problems
Expose OS and network resource contention problems
![Page 36: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/36.jpg)
Problem Solving in QA and Production
Real-time diagnostics
Context-sensitive expert help suggests solutions
End-to-end view includes in-depth data on:
– Web servers
– Application servers
– Databases
– Windows
– Unix
– ERP, CRM
![Page 37: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/37.jpg)
Auto-Record for Deeper Diagnosis
Capture the transaction flow and easily find J2EE application bottlenecks and resource contention
The appearance of an anomaly in the J2EE system automatically starts deeper data recording for the domain-expert
![Page 38: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/38.jpg)
Unique Call Tree
Shows the path of a request through an application
From HTTP to SQL
Cumulative response time shows critical path
Individual response time shows method-level bottleneck
Popup windows show relevant metrics
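The difference between cumulative and individual response time in a call tree can be shown with a small Python sketch. This is not how the product computes its view. The `Call` class, the call names and the timings are hypothetical, and the sketch assumes nested calls run one after another, so that cumulative time is own time plus the children's cumulative times.

```python
from dataclasses import dataclass, field


@dataclass
class Call:
    name: str
    self_ms: float  # individual time: spent in this call alone
    children: list = field(default_factory=list)

    def cumulative_ms(self):
        """Own time plus the cumulative time of every nested call."""
        return self.self_ms + sum(c.cumulative_ms() for c in self.children)


def critical_path(call):
    """Follow the child with the largest cumulative time at each level."""
    path = [call.name]
    while call.children:
        call = max(call.children, key=Call.cumulative_ms)
        path.append(call.name)
    return path


def bottleneck(call):
    """Return the call with the largest individual (self) time in the tree."""
    best = call
    for child in call.children:
        candidate = bottleneck(child)
        if candidate.self_ms > best.self_ms:
            best = candidate
    return best


# Hypothetical request, from HTTP down to SQL
tree = Call("HTTP GET /orders", 5, [
    Call("OrderServlet.doGet", 10, [
        Call("OrderBean.find", 15, [Call("SQL SELECT", 400)]),
        Call("render.jsp", 40),
    ]),
])
```

In this example the cumulative times lead from the HTTP request through the servlet and the bean to the SQL statement, and the individual times single out the SQL statement as the bottleneck.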
![Page 39: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/39.jpg)
Correlated Metrics View
Quickly correlate metrics from the:
– Java/J2EE code
– Application server
– Database
– Operating system
– Web, ERP, CRM servers
– Network
Dynamically add metrics to the same graph to see them side-by-side
![Page 40: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/40.jpg)
Line-of-Code Resolution
It’s Easy
– Bottlenecks are automatically highlighted in red
It’s Fast
– Find memory leaks quickly with the most detailed object allocation information
It’s Flexible
– Line-of-code differencing
– Reporting in Excel, HTML or text
![Page 41: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/41.jpg)
Customer Results
“AutoDesk saves up to 80% of our time in investigation and diagnosing performance issues in our clustered WebLogic environment, which previously was done through manual log sifting and trial and error techniques.” – Senior Applications Manager, AutoDesk
“…Helped the team narrow down the bottlenecks within our Java code in days versus weeks.” – J2EE Architect, HSBC
“Within minutes, we were able to profile two different J2EE applications and get valuable results immediately.” – Manager of Technical Architecture, UICI
Using Quest Software products…
At Toyota, our tools found a problem that had plagued them for six months; with our deep data and expert advice, it was solved in less than two days.
![Page 42: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/42.jpg)
Attend a PerformaSure Web Cast
Presented every Thursday, 1:00pm PST / 4:00pm EST
http://www.quest.com/events/webcast_index.asp
![Page 43: Save the Users! Monitoring and Diagnosing Problems in Production](https://reader035.vdocuments.us/reader035/viewer/2022062514/557cce27d8b42a0c368b46db/html5/thumbnails/43.jpg)
Thank you
http://www.quest.com