Post on 16-Oct-2020


Take Your Oracle WebLogic Applications to the Next Level with Oracle Enterprise Manager 12c

Mojahedul Hoque Abul Hasanat, CTO, Therap Services

Neelima Bawa, Consulting Tech. Lead, SCP, EM, Oracle

Agenda

• Background of Therap Services

• The problem

• Application Performance Management

• Quick description of Oracle’s APM offering

• How OEM and RUEI helped us

• Actual scenarios

• The future

• Tips

• Q&A

Therap Services - OOW 2013 2

Therap Services, LLC

• Documentation and Communication Software for MR/DD

• Closest description: an EHR for the DD industry

• Niche segment in the health sector

• Improve quality of life for people with DD by improving efficiency of delivery through communication

• SaaS business model

• 150K+ active users

• 1000+ providers in 48 states

• State customers

• Extensive usage for DD in DHS ND and DHHS NE

• 150+ employees

• Based in CT, dev center in Bangladesh

• http://www.therapservices.net

The Application

• The application is our business

• 1M+ lines of code

• 60+ modules

• 1M+ sustained HTTP requests/hour

• 30K+ peak requests/minute

• 6000+ concurrent users

• Based on JEE and the Spring Framework

• Hibernate

• Seam

• Grails

Delivery Platform

• 2 identical sites in two states

• Primary hosts (per site):

• 4 WebLogic application servers in cluster

• 1 memory-based data server (in-house, Java)

• 1 Oracle database server

• 1 NetApp storage (SAN)

• 1 F5 Load balancer

• Supporting hosts

• Use Dyn for site high availability

• Data replication with Oracle GoldenGate

What Matters

• Availability

• Application is used 24x7

• Application use is critical to the business of our customers

• Performance

• A user needs to spend as little time as possible in our application

• Most users use it daily, multiple times

• Data integrity

• Fast development turnaround

Evolution of Therap

• Improved testing

• Formal code review

• Improved processes

• Removed repeating problems

• Now all problems we face are new

• Acquired large customers

• Availability and performance have become critical factors

Before OEM & RUEI

• Heavy use of logging

• Nagios

• Cacti

• kill -3

The Problem

• Diagnosis of performance issues

• Has become much harder with the growing system

• Application availability

• With a larger customer base, uptime has become a major factor

• Complexity of the system increases difficulty

• Limits of logging

• Works for known unknowns

• Need infrastructure to visualize and store historic data

• Limits of OS based monitoring

• Limited metrics

• Limits of simple JMX monitoring

Application Performance Management

• Deep insight into running application

• Profiling at runtime

• Some bottlenecks are only visible at runtime

• Historic data

• Invaluable for preventing performance regression

Which Vendor?

• We were moving from JBoss to WebLogic

• JVM Diagnostics

• Extensive WebLogic metrics

• Probably the best database diagnostics

• Our team was already familiar with OEM

• Deep integration with the database

• Integration of JVMD and app server metrics

• Expect better support from Oracle

Oracle’s APM

• Oracle Enterprise Manager 12c

• WebLogic Metrics

• Middleware Diagnostics Advisor

• JVM Diagnostics

• Configuration Management

• Incident Management

• Lifecycle Management

• Oracle Real User Experience Insight

• Oracle Business Transaction Management

WebLogic Metrics

• Proactive monitoring

• Helps us avoid downtime

• Correlation between various metrics

• Middleware Diagnostics Advisor

JVM Diagnostics

• Deep insight into the JVM

• Invaluable for understanding application performance issues

• Helped us identify the log4j bottleneck

• Early identification of performance problems

Oracle Real User Experience Insight

• Measure performance seen from the customer end

• Detect performance regression

• Enables shorter release cycles

• Quick and real feedback for performance tuning operations

Our Journey

Timeline

• Evaluation of various vendors – July 2012

• Purchase of WebLogic, OEM, RUEI – Nov 2012

• Start JBoss to WebLogic migration

• Start building expertise on OEM

• Start using OEM in test environment

• Fix problems found through OEM, JVMD

• Production deployment – Mar 2013

How OEM and RUEI helped us

The log4j bottleneck

• During load testing, we could not increase load beyond a certain point

• CPU load was low

• JVMD showed us something that we could hardly believe

• Many threads were contending for the lock needed to write to the log file

• The contention showed up only at high load

• Used JVMD heavily to find the best logging backend and the best configuration
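The contention described above is typical of a logger where every application thread takes the same lock to write. The slides do not say which backend or configuration was finally chosen, so the following is only a minimal sketch of one common remedy, assuming an asynchronous design: callers put messages on a queue and a single background thread does the slow write. The class `AsyncLogSketch` and its methods are hypothetical names for illustration and are not log4j's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical sketch: callers enqueue messages, one thread performs the write. */
class AsyncLogSketch {
    // Unique sentinel object; compared by identity so no real message can match it.
    private static final String STOP = new String("STOP");

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> sink = new ArrayList<>(); // stands in for the log file
    private final Thread writer;

    AsyncLogSketch() {
        writer = new Thread(() -> {
            try {
                while (true) {
                    String msg = queue.take();
                    if (msg == STOP) {
                        return;
                    }
                    sink.add(msg); // only this thread ever touches the sink
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        writer.start();
    }

    /** Called by application threads; they never wait on the slow write itself. */
    void log(String msg) {
        queue.add(msg);
    }

    /** Drains the queue, stops the writer, and returns how many messages were written. */
    int close() throws InterruptedException {
        queue.add(STOP);
        writer.join();
        return sink.size();
    }

    /** Starts several producer threads and returns the number of messages written. */
    static int run(int threads, int perThread) {
        AsyncLogSketch log = new AsyncLogSketch();
        List<Thread> producers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int id = t;
            Thread p = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.log("thread " + id + " message " + i);
                }
            });
            producers.add(p);
            p.start();
        }
        try {
            for (Thread p : producers) {
                p.join();
            }
            return log.close();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }
}
```

The trade-off of this design is that messages still in the queue can be lost if the process dies, which is one reason the right backend and configuration has to be measured under load, as the slide describes.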

EJB Transaction Optimization

• Noticed an abnormally high number of bean transaction commits

• We had forgotten to optimize some frequently used EJBs

• Read-only methods do not need to be transactional

Unexpected Top Method

• Noticed a JMS listener in the top method list

• In production!

• Did not show up during synthetic load testing

• We forgot to add a “message selector” on the listener

The MDA Catch

• MDA reported an unexpected “The EJB is taking too long to execute”

• The related method was showing in the top methods list

• There were extraneous calls to the EJB

• The method did not need to be in an EJB

The Slow Library

• A library call for producing JSON showed up on the top method list

• JSON is needed for AJAX

• It was totally unexpected

• The library was old and inefficient

• Replaced it with a newer and more efficient library

Inefficient Network Write

• Initially discovered in production through JVMD

• There were instances of high network waits

• Methods of a certain module in the application showed up in the top list during the high network wait periods

• Discovered a three-level nested loop that writes data

• Further inspection through JProfiler confirmed it
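The slides do not show the loop or say how it was fixed. As a hedged illustration of why a nested loop that writes data can cause high network waits, the sketch below assumes the loop wrote small pieces directly to an unbuffered stream, so that every innermost iteration became a separate write on the underlying connection. `NestedWriteSketch` and `CountingStream` are hypothetical names; `CountingStream` stands in for a socket and simply counts the writes that reach it.

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class NestedWriteSketch {
    /** Stand-in for a socket stream: counts write calls that reach it. */
    static final class CountingStream extends OutputStream {
        int calls;
        final ByteArrayOutputStream data = new ByteArrayOutputStream();

        @Override
        public void write(int b) {
            calls++;
            data.write(b);
        }

        @Override
        public void write(byte[] b, int off, int len) {
            calls++;
            data.write(b, off, len);
        }
    }

    /** A three-level loop that emits one byte per innermost iteration. */
    static void writeAll(OutputStream out, int a, int b, int c) throws IOException {
        for (int i = 0; i < a; i++) {
            for (int j = 0; j < b; j++) {
                for (int k = 0; k < c; k++) {
                    out.write('x');
                }
            }
        }
    }

    /** Writes straight to the underlying stream: one call per byte. */
    static int unbufferedCalls(int a, int b, int c) {
        CountingStream sink = new CountingStream();
        try {
            writeAll(sink, a, b, c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }

    /** Same loop behind a buffer: bytes are batched before reaching the stream. */
    static int bufferedCalls(int a, int b, int c) {
        CountingStream sink = new CountingStream();
        try (OutputStream out = new BufferedOutputStream(sink)) {
            writeAll(out, a, b, c);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.calls;
    }
}
```

With a 10 x 10 x 10 loop, the unbuffered variant reaches the underlying stream 1000 times, while the buffered variant batches the same bytes into a single write. Whether buffering was the actual fix in this case is not stated in the source.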

Automatic Thread Snapshots

• Previously, we relied on kill -3

• Manual, so we missed dumps at crucial moments

• Now, JVMD takes thread snapshots when an abnormal thread state is reached on any WebLogic server

• Combined with auto-restart from WebLogic, this eliminated unplanned downtime
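For readers who have only used kill -3, the idea of a condition-triggered snapshot can be sketched with the standard `java.lang.management` API. This is not how JVMD works internally, and the slides do not define "abnormal thread state"; the threshold on the number of BLOCKED threads below is purely an assumption for illustration, as is the class name `ThreadSnapshotSketch`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadSnapshotSketch {
    /** Returns a text dump of all live threads, similar in spirit to kill -3 output. */
    static String snapshot() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        // false, false: skip lock details so the call works on any JVM.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    /** Counts threads currently waiting to enter a monitor. */
    static int blockedCount() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        int blocked = 0;
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                blocked++;
            }
        }
        return blocked;
    }

    /**
     * Takes a snapshot only when more than maxBlocked threads are BLOCKED
     * (an assumed definition of "abnormal"); otherwise returns an empty string.
     */
    static String snapshotIfAbnormal(int maxBlocked) {
        return blockedCount() > maxBlocked ? snapshot() : "";
    }
}
```

A scheduled task could call `snapshotIfAbnormal` periodically and store any non-empty result, which removes the need for someone to run kill -3 at exactly the right moment.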

Compliance

• Helps us identify patches needed for:

• WebLogic

• Oracle Database

• OEM

• Downtime log

• Useful for tracking operations improvement

The JDK Upgrade

• Upgraded JDK from 1.6.0_29 to 1.6.0_45

• At that time, we had not yet brought RUEI into our regular operations process

• Started getting slowness complaints after a few days

• A look into RUEI instantly revealed the performance regression

• Downgrading the JDK fixed the regression completely

The JDK Upgrade…

• We could not reproduce the performance regression in the lab environment

• We now have a new procedure to upgrade JDK

• Do normal tests and load tests as before

• In production, upgrade the JDK of one server only

• Wait a few days

• Compare performance

• Decide whether to upgrade the rest or roll back
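The "compare performance" and "decide" steps above can be sketched as a simple comparison of response-time samples from the upgraded server against the servers still on the old JDK. The slides do not say which statistic or threshold was used, so the median and the tolerance parameter here are assumptions, and `CanaryCompareSketch` is a hypothetical name.

```java
import java.util.Arrays;

class CanaryCompareSketch {
    /** Median of a non-empty array of response times in milliseconds. */
    static double median(double[] samples) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int n = sorted.length;
        return n % 2 == 1
                ? sorted[n / 2]
                : (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;
    }

    /**
     * Returns true when the upgraded server's median response time exceeds the
     * baseline median by more than the given tolerance (0.10 means 10 percent).
     */
    static boolean shouldRollBack(double[] baselineMs, double[] canaryMs, double tolerance) {
        return median(canaryMs) > median(baselineMs) * (1.0 + tolerance);
    }
}
```

In practice the samples would come from real-user measurements such as those RUEI collects, gathered over the same few days for both groups of servers so that traffic patterns are comparable.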

The Results

• Only 1 unplanned downtime incident in the last 4 months!

• Improving DevOps culture

• Foster collaboration between dev, ops and DBA

• No more fighting between dev and DBA

• With RUEI, even the business team joins the fun

Most Importantly

I can do this!

Future with OEM and RUEI

Future

• Involve more people

• Use KPI in RUEI

• Integrate KPI with OEM by defining a Business Application

• Leverage alerting and incident management in OEM

Challenges

• Steep learning curve

• Took us time to understand the role of BTM and JRF

• Better documentation available now

• Lack of full WebLogic 12c support when we started

• Should be solved in the latest release of OEM

• Support has been fanatical!

• Thank you Oracle!

Tips

• Understand what you need

• Any web app with performance requirements needs RUEI

• If you face performance issues, you will need JVMD

• Engage as many people in your team as possible

• Give access to the tools widely

• The learning curve

• The problems these tools help you with are also complex

Q&A

Other Sessions

• JVM Diagnostics: Java Profiling in Production Environments [CON9571]

• Thursday 2:00pm – 3:00pm @ Moscone North - 130

Contact

neelima.bawa@oracle.com

http://neelimabawa.blogspot.com

masum@therapservices.net

@Masum6
