Real Life Java EE Performance Tuning
Matt Brasier, Principal Consultant, C2B2 Consulting Ltd

Upload: mbrasier

Post on 18-Nov-2014


TRANSCRIPT

Page 1: Real Life Java EE Performance Tuning

Real Life Java EE Performance Tuning

Matt Brasier
Principal Consultant
C2B2 Consulting Ltd

Page 2: Real Life Java EE Performance Tuning

About Me

Professional Services Consultant
Customers include
• Red Hat (JBoss)
• BEA
• Cape Clear
• Government/Finance/Telecoms

C2B2 Consulting
• SOA and Java EE consultancy
• Fast, Reliable, Manageable, Secure

Page 3: Real Life Java EE Performance Tuning

What we will cover

Philosophy
• How I approach a performance problem situation

Enterprise Java Performance
• What kind of things affect performance of Enterprise Systems

Case Study 1
• A new version of the application runs slowly

Case Study 2
• Logging in takes a long time in the live environment

Case Study 3
• The application does not scale

Page 4: Real Life Java EE Performance Tuning

What we will learn

Philosophy
• Suggestions to keep in mind when looking at a performance problem

Tools
• Suggested tools for looking at a performance problem

Techniques
• How to use the tools, knowledge and skills to solve your performance problem

Page 5: Real Life Java EE Performance Tuning

Philosophy

‘A good understanding’ is the best performance tuning tool
Prefer common and open source tools
Observe, Hypothesize, Tweak, Test
‘Trust no-one’

Page 6: Real Life Java EE Performance Tuning

Classic Java performance problems

Memory leaks
• Increased GC time

Poor GC or JVM memory configuration
CPU bound code
IO bound code
Memory bound code
• Increased GC time

Page 7: Real Life Java EE Performance Tuning

Enterprise Java Performance

CAVEAT: Consultancy Selection Bias
80/20: 80% of time finding, 20% fixing
Many ‘Enterprise’ Java performance problems turn out not to be ‘classic’ performance bottlenecks
• Infrastructure/Middleware performance

There are many factors that can affect the performance of an enterprise system
• Not just code

Page 8: Real Life Java EE Performance Tuning

Enterprise Java Performance

Not all Java EE performance problems are classical ‘Java performance problems’
Common types of Java EE performance problem
• Resource starvation
• Threading problems
• ‘Suboptimal configuration’
• Network related problems
• Scalability problems

Page 9: Real Life Java EE Performance Tuning

A Good Understanding

Consider the system as a whole
Know how infrastructure components work
• Not just what they do, but how they do it

How do the Java EE specifications say they should work?

Page 10: Real Life Java EE Performance Tuning

Approach

Understand the system
Understand the environment
Understand the situation
Talk to people who know
• But trust no-one

Take a look for myself
Observe, Hypothesize, Tweak, Test
• Rinse and repeat

Page 11: Real Life Java EE Performance Tuning

Case Study 1

Page 12: Real Life Java EE Performance Tuning

Case Study 1

Existing customer calls
• “We deployed a new version of the application, and it is running a lot slower”

The Environment
• Sun Java 5
• WebLogic Server 9.2 Cluster (3 nodes)
• WebLogic Integration 9.2 Cluster (3 nodes)
• Documentum Document Management
• Oracle Database
• Solaris OS

Page 13: Real Life Java EE Performance Tuning

Case Study 1

The System
• Web Application
• WLI based workflow system

The Situation
• New version deployed into the performance testing environment
• Automated performance tests indicate the application is approximately 30% slower

Page 14: Real Life Java EE Performance Tuning

Case Study 1

Observe
• No monitoring in place
• Some alerting, but no historical data

Hypothesize
• If we had more monitoring, we would stand a better chance

Tweak
• Put some monitoring in place
• Hyperic HQ from SpringSource

Page 15: Real Life Java EE Performance Tuning

Case Study 1

Test
• Re-run tests

Observe
• Monitoring indicates that one server is slower
    • Handling fewer requests per second
    • Lots of transaction timeouts
    • Higher CPU
    • Less network traffic

Tweak
• Add more monitoring to the slow server
• Examine log files
• Thread dumps!

Page 16: Real Life Java EE Performance Tuning

Case Study 1

Hypothesize
• Thread dumps show lots of threads in logging code waiting to write to the log file
• Log files for the slow server have DEBUG messages in them
    • The other servers don’t

“The logging configurations are identical, the servers are configured with Maven”
• Trust no one

Test
• Log in to the server and manually check the logging configuration

Page 17: Real Life Java EE Performance Tuning

Case Study 1

Solution
• Debug logging was enabled on one server
• Turned debug logging off; the system was then about the same speed as the old release
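The manual check of the logging configuration can also be done from inside the running server. The following is a minimal sketch, assuming the application logs through `java.util.logging`; the slides do not say which logging framework was in use, and the class name `EffectiveLogLevelSketch` and the logger name `com.example.app` are hypothetical. It reports the level a logger actually runs at, including levels inherited from parent loggers, so the value can be compared across the servers in a cluster.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class EffectiveLogLevelSketch {

    /** Walks up the logger hierarchy until a logger with an explicit level is found. */
    static Level effectiveLevel(Logger logger) {
        for (Logger current = logger; current != null; current = current.getParent()) {
            if (current.getLevel() != null) {
                return current.getLevel();
            }
        }
        // Fallback only; the root logger normally has a level set.
        return Level.INFO;
    }

    public static void main(String[] args) {
        // Hypothetical logger name; pass the real one as the first argument.
        String name = args.length > 0 ? args[0] : "com.example.app";
        Level level = effectiveLevel(Logger.getLogger(name));
        System.out.println(name + " runs at level " + level);
        // FINE and below correspond roughly to debug output in java.util.logging.
        if (level.intValue() <= Level.FINE.intValue()) {
            System.out.println("Debug-level logging is enabled on this server");
        }
    }
}
```

Running the same check on every node makes a difference between supposedly identical servers visible without relying on what the build tooling claims.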

Page 18: Real Life Java EE Performance Tuning

Hyperic HQ

Page 19: Real Life Java EE Performance Tuning

Hyperic HQ

Monitoring tool
• Not a profiling tool

Historical data
• Trends
• Abnormal behaviour
• ‘Hot’ spots

Wide variety of data
• JVM level statistics
• JMX statistics
• OS statistics

Page 20: Real Life Java EE Performance Tuning

Thread Dumps

My number 2 tool for finding performance problems
• Ctrl-Break on Windows
• kill -3 on Unix/Linux
• jstack tool
• Available from consoles of many application servers

All threads in the VM and what they are doing at that moment
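On Java 6 or later, similar information can be collected from inside the process through the standard `java.lang.management` API. This is a minimal sketch, not how `kill -3` or `jstack` work internally; the class name `ThreadDumpSketch` is made up for illustration, and the output format is simplified compared with a real HotSpot dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

final class ThreadDumpSketch {

    /** Returns a simplified text dump of every live thread in this JVM. */
    static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```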

Page 21: Real Life Java EE Performance Tuning

Thread Dumps

A number of thread dumps over time gives a good picture
• Any operation that appears a lot is a suspect
• Understand what ‘normal’ thread dumps look like

http://java.sun.com/developer/technicalArticles/Programming/Stacktrace/

Page 22: Real Life Java EE Performance Tuning

Thread Dump

Page 23: Real Life Java EE Performance Tuning

Thread Dumps

Look near the top of each stack
Look for stacks with your code in them
Look for long stacks
Look for deadlocks and other threading issues

Page 24: Real Life Java EE Performance Tuning

The Understanding

What does a normal WebLogic thread dump look like?
It is not normal to see logging code frequently in a thread dump
Lots of threads all waiting on a single lock object is a Bad Thing™
If three servers are supposed to do the same thing, their thread dumps should look similar
• Over time
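Spotting many threads queued on one lock can be partly automated. The sketch below assumes a HotSpot-style text dump, in which a blocked thread's stack contains a line of the form `- waiting to lock <0x...> (a some.Class)`, and counts the waiters per lock. The class name `LockContentionSketch` and the sample dump in `main` are invented for illustration; they are not taken from the case study.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LockContentionSketch {

    // Matches lines such as: - waiting to lock <0x1a2b> (a com.example.FileAppender)
    private static final Pattern WAITING =
            Pattern.compile("- waiting to lock <(0x[0-9a-fA-F]+)> \\(a ([^)]+)\\)");

    /** Counts, per lock object, how many threads in the dump are blocked waiting for it. */
    static Map<String, Integer> waitersByLock(String dumpText) {
        Map<String, Integer> counts = new TreeMap<>();
        Matcher matcher = WAITING.matcher(dumpText);
        while (matcher.find()) {
            String lock = matcher.group(2) + "@" + matcher.group(1);
            counts.merge(lock, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented sample: two threads blocked on the same appender.
        String sample =
                "\"ExecuteThread-1\" waiting for monitor entry\n"
              + "    at com.example.FileAppender.append(FileAppender.java:42)\n"
              + "    - waiting to lock <0x1a2b> (a com.example.FileAppender)\n"
              + "\"ExecuteThread-2\" waiting for monitor entry\n"
              + "    at com.example.FileAppender.append(FileAppender.java:42)\n"
              + "    - waiting to lock <0x1a2b> (a com.example.FileAppender)\n";
        System.out.println(waitersByLock(sample));
    }
}
```

Running this over several dumps taken a few seconds apart, and over dumps from each server, shows whether the same lock keeps collecting waiters on one node only.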

Page 25: Real Life Java EE Performance Tuning

Lessons

Thread dumps hold a lot of information
Infrastructure configuration faults are more common than infrastructure bugs
Automated/continuous build and deploy solutions are no silver bullet
• Check the results yourself

Believe your ‘instincts’

Page 26: Real Life Java EE Performance Tuning

Case Study 2

Page 27: Real Life Java EE Performance Tuning

Case Study 2

Customer Call
• “We deployed our application into the live environment and it takes several minutes for users to log in”

Environment
• Apache web servers
• WebLogic Portal 8.1 Cluster (2 nodes)
• Oracle Database
• Windows Server 2003
• Bespoke Single Sign On server

Page 28: Real Life Java EE Performance Tuning

Case Study 2

The System
• Web application based on WSRP portlets
• Oracle database storing user data

The Situation
• The first users to log in in the morning find that it takes several minutes
• After the first few log-ins, the application runs fine

Page 29: Real Life Java EE Performance Tuning

Case Study 2

Hypothesize
• The bespoke Single Sign On server makes me suspicious
    • Bespoke code is tested less

Test
• Turn on debug logging for the SSO implementation
• Observe timings of log messages
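Observing the timings of log messages amounts to looking for the largest pause between consecutive entries. A minimal sketch of that idea follows, assuming each log line starts with a 19-character ISO-8601 timestamp such as `2014-11-18T08:00:01`; the real SSO log format is not given in the slides, and the class name `LogGapSketch` and the sample lines are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.Arrays;
import java.util.List;

final class LogGapSketch {

    /**
     * Returns the index of the line that follows the longest pause,
     * or -1 if no line comes later than its predecessor.
     */
    static int lineAfterLongestGap(List<String> lines) {
        int worst = -1;
        Duration longest = Duration.ZERO;
        LocalDateTime previous = null;
        for (int i = 0; i < lines.size(); i++) {
            // Assumption: the first 19 characters are an ISO-8601 local timestamp.
            LocalDateTime current = LocalDateTime.parse(lines.get(i).substring(0, 19));
            if (previous != null) {
                Duration gap = Duration.between(previous, current);
                if (gap.compareTo(longest) > 0) {
                    longest = gap;
                    worst = i;
                }
            }
            previous = current;
        }
        return worst;
    }

    public static void main(String[] args) {
        // Invented sample: the pause sits between starting and finishing the profile load.
        List<String> lines = Arrays.asList(
                "2014-11-18T08:00:01 SSO token validated",
                "2014-11-18T08:00:02 loading user profile",
                "2014-11-18T08:03:10 user profile loaded");
        int index = lineAfterLongestGap(lines);
        System.out.println("Longest pause ends at: " + lines.get(index));
    }
}
```

The step logged just before the pause is the one worth investigating next.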

Page 30: Real Life Java EE Performance Tuning

Case Study 2

Observe
• The logs indicate that the SSO log-in is proceeding as expected
• It appears that loading the user’s profile data from the database is taking a long time

Hypothesize
• TCP timeouts when connecting to the database due to a firewall

Page 31: Real Life Java EE Performance Tuning

Case Study 2

Test
• Observe the connection pool statistics in the WebLogic console
• The console indicates that a large number of connections have been opened during the time the application has been running
    • Connections are not normally closed and re-opened
• See how long you need to leave the system before the problem occurs

Page 32: Real Life Java EE Performance Tuning

Case Study 2

Solution
• Discussions with the networking team indicated that there was a firewall, configured to silently terminate network connections that were idle for 60 minutes
• Set WebLogic to test connections after they have been idle for 50 minutes.
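The idea behind the fix can be illustrated independently of WebLogic: test any pooled connection that has been idle for 50 minutes, so that traffic flows before the firewall's 60-minute limit is reached and dead connections are found before a user request needs them. This is a sketch of the principle only, not WebLogic's implementation or configuration; the class and method names are hypothetical, and the 5-second validation timeout is an arbitrary choice. `Connection.isValid` is the standard JDBC 4 call.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.util.concurrent.TimeUnit;

final class IdleConnectionCheckSketch {

    // From the case study: the firewall drops connections idle for 60 minutes.
    static final long FIREWALL_IDLE_LIMIT_MS = TimeUnit.MINUTES.toMillis(60);
    // Test earlier than that, leaving a 10-minute safety margin.
    static final long TEST_AFTER_IDLE_MS = TimeUnit.MINUTES.toMillis(50);

    /** True once a connection has been idle long enough that it should be tested. */
    static boolean shouldTest(long lastUsedMs, long nowMs) {
        return nowMs - lastUsedMs >= TEST_AFTER_IDLE_MS;
    }

    /** Returns false if the connection was due for a test and failed it. */
    static boolean usable(Connection connection, long lastUsedMs, long nowMs)
            throws SQLException {
        if (!shouldTest(lastUsedMs, nowMs)) {
            return true; // recently used, assumed still alive
        }
        return connection.isValid(5); // arbitrary 5-second timeout
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        long idleFor55Minutes = now - TimeUnit.MINUTES.toMillis(55);
        System.out.println("Test a connection idle for 55 minutes? "
                + shouldTest(idleFor55Minutes, now));
    }
}
```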

Page 33: Real Life Java EE Performance Tuning

Lessons

Consider the system as a whole
• Hardware
• Networking
• OS
• Middleware
• Application

Page 34: Real Life Java EE Performance Tuning

The Understanding

Firewalls are often configured to silently terminate idle TCP connections
The TCP protocol requires that a connection is closed by both sides, or times out
• The timeout is several minutes

In a healthy WebLogic connection pool, the number of connections opened since the server started = the maximum number in the pool

Page 35: Real Life Java EE Performance Tuning

Case Study 3

Page 36: Real Life Java EE Performance Tuning

Case Study 3

Customer call
• “It takes about 20 seconds to render a page, and the performance does not scale”

Environment
• WebLogic Portal 9.1 Cluster (2 nodes)
• Oracle 10g Database
• Red Hat Enterprise Linux

Page 37: Real Life Java EE Performance Tuning

Case Study 3

The System
• Online content delivery system
• WebLogic Portal with a commercial set of portlets

The Situation
• Two problems
    • Running the performance tests with 20 threads in JMeter is twice as slow as running the tests with 10 threads
    • Viewing a content item takes around 20 seconds

Page 38: Real Life Java EE Performance Tuning

Case Study 3

Handle the two problems separately
• They may be related, they may not be

Page 39: Real Life Java EE Performance Tuning

Case Study 3

Observe
• Viewing a content item takes around 16 seconds on my laptop

Test
• Is the rendering speed dependent on the browser used?
• Is the rendering speed dependent on the client machine?
• What does the page source look like?

Page 40: Real Life Java EE Performance Tuning

Case Study 3

Observe
• In Opera the page renders quickly except for the table of contents on the left
• In Firefox, the whole page renders at the same time
• The page renders faster in IE and Opera than in Firefox
• The page renders faster on faster machines
• There is a lot of JavaScript, and AJAX is used to load the table of contents

Page 41: Real Life Java EE Performance Tuning

Case Study 3

Hypothesize
• The AJAX rendering of the TOC is taking a long time, and slowing down the whole page load

Tweak
• Remove the TOC from the page
• Disable JavaScript in the browser

Test
• The page renders in less than 2 seconds

Page 42: Real Life Java EE Performance Tuning

Case Study 3

Hypothesize
• JMeter does not execute the JavaScript, so the poor results in the JMeter tests are not related to the poor page load speed

Page 43: Real Life Java EE Performance Tuning

Case Study 3

Solution 1
• The portlet developers have used AJAX to render the table of contents for a content item; this is much slower than just constructing the table of contents on the server side
• Rewrite the portlet to construct the table of contents on the server side
• Developers sometimes select a technology to enhance their CVs, not to implement a business requirement

Page 44: Real Life Java EE Performance Tuning

Case Study 3

Problem 2 – Scalability

Observe
• Running the tests in JMeter with 10 users, each page response takes 5s
• Running the tests with 20 users, each page response takes 12s
• JMeter is being run on an old laptop, which is at 100% CPU in both cases
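One way to read these two measurements is to convert them into throughput. Assuming each JMeter user sends its next request as soon as the previous response arrives (no think time, which the slides do not state), throughput is roughly the number of users divided by the response time. The class name `ThroughputSketch` is made up for this illustration.

```java
final class ThroughputSketch {

    /** Approximate pages per second for a closed test with no think time. */
    static double pagesPerSecond(int concurrentUsers, double responseSeconds) {
        return concurrentUsers / responseSeconds;
    }

    public static void main(String[] args) {
        double tenUsers = pagesPerSecond(10, 5.0);     // 10 / 5  = 2.0 pages/s
        double twentyUsers = pagesPerSecond(20, 12.0); // 20 / 12 = about 1.67 pages/s
        System.out.printf("10 users: %.2f pages/s%n", tenUsers);
        System.out.printf("20 users: %.2f pages/s%n", twentyUsers);
        if (twentyUsers < tenUsers) {
            System.out.println("Doubling the users reduced throughput: something is saturated");
        }
    }
}
```

Under that assumption, doubling the load lowers throughput from 2.0 to about 1.67 pages per second. The numbers show that some component is saturated, but not which one; that still has to be established by observation.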

Page 45: Real Life Java EE Performance Tuning

Case Study 3

Hypothesize
• As the test machine is at 100% CPU, it is the performance of JMeter that is being measured, not the performance of WebLogic

Observe
• WebLogic is running at around 2% CPU usage, with many idle threads

Page 46: Real Life Java EE Performance Tuning

Case Study 3

Tweak
• Run the test from a number of more modern machines, and make sure each one does not exceed 70% CPU

Observe
• Four machines can each run 20 threads and get responses in 1.5 seconds, and WebLogic is still running at around 5% CPU and not struggling
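The 70% rule can be applied as a simple guard on each test run: if any CPU sample taken on a load-generating machine exceeds the limit, the run is treated as a measurement of the client rather than the server. This is a sketch under the assumption that CPU samples (as fractions between 0 and 1) have already been collected with an OS tool; the class and method names are hypothetical.

```java
final class LoadGeneratorGuardSketch {

    // From the case study: keep each load generator at or below 70% CPU.
    static final double CPU_LIMIT = 0.70;

    /** True only if no CPU sample from the load generator exceeded the limit. */
    static boolean resultsTrustworthy(double[] clientCpuSamples) {
        for (double sample : clientCpuSamples) {
            if (sample > CPU_LIMIT) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        double[] oldLaptop = {1.00, 1.00, 1.00};     // pegged at 100%, as in the first runs
        double[] modernMachine = {0.40, 0.55, 0.65}; // invented sample values
        System.out.println("Old laptop run usable: " + resultsTrustworthy(oldLaptop));
        System.out.println("Modern machine run usable: " + resultsTrustworthy(modernMachine));
    }
}
```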

Page 47: Real Life Java EE Performance Tuning

Case Study 3

Solution
• The problem was that the test client was not able to generate the loads requested, resulting in the performance of the test client being measured
• Use a larger test client

Page 48: Real Life Java EE Performance Tuning

Useful tools

Ethereal/Wireshark
• Network traffic sniffer
• See when requests/responses were sent/received

Firebug + YSlow
• Firefox plugin for performance analysis

Page 49: Real Life Java EE Performance Tuning

Lessons

Separate problems should initially be prioritised and investigated separately
• Keep in mind that they may be related

Ensure the test system can generate the required load
• It should have plenty of free resources available

Page 50: Real Life Java EE Performance Tuning

Lessons

The consultant effect
• Take a step back
• Get a fresh perspective

Page 51: Real Life Java EE Performance Tuning

The Understanding

A slow test client will give slow results
Client side rendering is usually less efficient than server side
WebLogic is normally fast!

Page 52: Real Life Java EE Performance Tuning

What did we learn?

Simple tools can provide a lot of information
Understanding how the system should behave will help highlight possible causes
Experience is vital
• Write a log of what you find

Take a step back from the problem
• Use a second pair of eyes

Page 53: Real Life Java EE Performance Tuning

What did we learn?

Philosophy
• Understand the system as a whole
• A deep understanding of how it should work

Tools
• Thread dumps
• Monitoring tools
• Packet sniffing

Techniques
• Observe, Hypothesize, Tweak, Test

Page 54: Real Life Java EE Performance Tuning

Questions

Page 55: Real Life Java EE Performance Tuning

Session Evaluation

Please complete a session evaluation and turn it in to any conference staff member or at the registration desk. Thank you.