Connecticut Computer Measurement Group 2015 Spring Meeting

[email protected]

5 Ingredients to Executing Application Performance Management on z/OS

“The translation of IT metrics into business meaning (value) is what APM is all about.”

http://apmdigest.com/the-anatomy-of-apm-4-foundational-elements-to-a-successful-strategy

• Agreed?

• If so, what are the prerequisites to getting this done efficiently?

• Is there anything special in doing APM on the mainframe?

• While user satisfaction is based on response time and availability, you have to watch the consumed CPU seconds on z/OS to optimize and control costs.

Motivation

Ingredient 1: One View

• Our daily life is full of “communication issues” caused by

• Different (technical) languages

• Different metric systems

• Missing information

• Looking at different spots

• …

• Let’s avoid those troublemakers wherever we can

• How would this translate into APM?

One View, One Language, One Solution

• End-To-End

• End User Perspective

• No Gaps

• No Blind Spots

One View on the whole Environment

[Diagram: the tiers covered end-to-end: Browsers / Rich Client, Mobile Apps, Web Server, Java, .NET, ESB/MB/MQ, CTG, Mainframe, CICS, IMS, DB2, DL/I, Database, SQL Server, PostgreSQL. Key metrics are shown for each tier.]

One Hotspot!

Ingredient 2: Top Down

• Where is the issue?

• On my Mobile App?

• A poor Network Connection?

• On the Web Server?

• On the Mainframe/DB2?

• …

• What’s the Root Cause?

• How could it be fixed?

Detect the Hotspot and Answer Open Questions

30,000 Feet

One user tap in the mobile app

Introduced 12 calls to CICS

Generated 2,671 DB2 calls

Which programs are executed and why?

Which program generated these DB2 statements?

• Where is the issue?

• On my Mobile App? – No

• A poor Network Connection? – No

• On the Web Server? – At least it’s part of the problem

• On the Mainframe/DB2? – Yes

• What’s the Root Cause?

• Inefficient use of the mainframe

• Too many DB2 statements

• How can it be fixed?

• We need more details to answer that

Detect the Hotspot and Answer Open Questions
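The top-down step (one tap, then the calls it fans out into on each tier) can be sketched as a roll-up over a trace tree. This is a minimal illustration, not an APM product API: the `Span` structure, the tier names, and the sample data are assumptions, and the sample fan-out is much smaller than the 12 CICS calls and 2,671 DB2 calls on the slide.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Span:
    """One node of a hypothetical end-to-end trace (not a real APM API)."""
    tier: str                      # e.g. "mobile", "web", "cics", "db2"
    name: str
    children: list = field(default_factory=list)


def calls_per_tier(root):
    """Count the calls on each tier that one user action triggers."""
    counts = Counter()
    stack = [root]
    while stack:
        span = stack.pop()
        counts[span.tier] += 1
        stack.extend(span.children)
    return counts


def hotspot(counts):
    """Return the tier with the most calls, a crude first hint for drill-down."""
    return counts.most_common(1)[0][0]


# Illustrative data only: one tap, 3 CICS calls, 4 DB2 statements per CICS call.
tap = Span("mobile", "tap: show account", [
    Span("web", "GET /account", [
        Span("cics", f"PROG{i}", [Span("db2", "SELECT ...") for _ in range(4)])
        for i in range(3)
    ])
])

counts = calls_per_tier(tap)
print(dict(counts))
print("hotspot:", hotspot(counts))
```

Call count is only one possible hotspot criterion; CPU seconds or elapsed time per tier could be rolled up the same way.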

Who: A Java application is triggering the mainframe.

How: Using the CICS Transaction Gateway

What: Call stack for all programs and DB statements on z/OS

Zoooooooooom

Ingredient 3: Start Early

• Agile Development and Continuous Integration

• Forces teams to automate their build and testing processes

• Shortens development cycles from months (years?) to weeks or even days

• To maintain such a system you have to watch your builds with a handful of smart KPIs

• Is this also applicable for z/OS?

• If yes, what would be a smart set of KPIs?

Trend your Builds with valuable KPIs

Test Automation

Columns 1 to 3: Test Framework Results. Columns 4 to 6: Detailed z/OS Data.

Build #   Test Case      Status    # Trans.   # DB2   # Abend
17        testPurchase   OK        2          122     0
17        testSearch     OK        1          34      0
18        testPurchase   FAILED    2          122     1
18        testSearch     OK        1          34      0
19        testPurchase   OK        6          758     0
19        testSearch     OK        1          34      0
20        testPurchase   OK        2          122     0
20        testSearch     OK        1          34      0

We identified a regression

Problem solved

Let’s look behind the scenes

Abend is probably the reason for the failed test

Problem fixed, but now we have an architectural regression

Now we have functional and architectural confidence
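The build trend above can be checked automatically. The following is a minimal sketch, assuming the KPI rows belong to builds 17 to 20 in order and that build 17 serves as the baseline. `BuildRun`, `findings`, and the 10% tolerance are hypothetical names and values chosen for illustration; they do not come from the slides or from any APM product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildRun:
    build: int
    test: str
    passed: bool
    transactions: int   # z/OS transactions triggered by the test
    db2_calls: int      # DB2 statements executed
    abends: int


# testPurchase rows from the build table (mapping of rows to builds is assumed).
RUNS = [
    BuildRun(17, "testPurchase", True, 2, 122, 0),
    BuildRun(18, "testPurchase", False, 2, 122, 1),
    BuildRun(19, "testPurchase", True, 6, 758, 0),
    BuildRun(20, "testPurchase", True, 2, 122, 0),
]


def findings(baseline, run, tolerance=0.10):
    """Compare one run with a baseline run of the same test.

    `tolerance` is an assumed threshold for KPI growth, not a value from the slides.
    """
    found = []
    if not run.passed:
        found.append("functional failure")
    if run.abends > baseline.abends:
        found.append("new abend")
    for kpi in ("transactions", "db2_calls"):
        before, after = getattr(baseline, kpi), getattr(run, kpi)
        if after > before * (1 + tolerance):
            found.append(f"{kpi} {before} -> {after}")
    return found


baseline = RUNS[0]
for run in RUNS[1:]:
    print(run.build, findings(baseline, run) or "OK")
```

With this data, build 18 is flagged for the failure and the abend, build 19 passes functionally but is flagged for the jump in transactions and DB2 calls, and build 20 is clean.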

What you currently measure

• # Functional Test Failures

• Overall Duration

What you should measure

Related to a User Action:

• # of z/OS transactions

• # executed programs

• # executed DB2 statements

• # MQ calls

• # Abends

• CPU seconds

• Execution time of tests

• …

[Diagram: delivery pipeline with the stages Unit Testing, Acceptance Testing, Performance Testing, and Release, with a quality gate between stages. Labels: Automated, Semi-Automated. Steps: Monitor Tests, Analyze Results, Integrate with Build Infrastructure.]
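A quality gate between stages could be as simple as a set of per-user-action budgets that a build has to meet before it moves on. This is a sketch under stated assumptions: the KPI names follow the "what you should measure" list, but the `BUDGETS` values and the `gate` function are placeholders for illustration, not recommended thresholds.

```python
# Hypothetical budgets for one user action; the numbers are placeholders.
BUDGETS = {
    "zos_transactions": 3,
    "db2_statements": 200,
    "abends": 0,
    "cpu_seconds": 0.5,
}


def gate(measured, budgets=BUDGETS):
    """Return (passed, violations) for one user action measured in a test stage.

    `violations` maps each KPI over budget to (measured value, budget).
    KPIs missing from `measured` are treated as 0.
    """
    violations = {}
    for kpi, limit in budgets.items():
        value = measured.get(kpi, 0)
        if value > limit:
            violations[kpi] = (value, limit)
    return not violations, violations


passed, violations = gate(
    {"zos_transactions": 6, "db2_statements": 758, "abends": 0, "cpu_seconds": 0.3}
)
print("promote to next stage" if passed else f"stop: {violations}")
```

In a build pipeline, the boolean result would typically decide whether the job fails; how that is wired up depends on the build infrastructure in use.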

Ingredient 4: Focus

• You tuned your top X z/OS transactions; they are really fast and efficient now

• Those transactions account for ~90% of your CPU time on z/OS

• So your mainframe environment looks like this:

Let’s assume

Are you now done with APM?

Efficient, Fast & Beautiful

My Mainframe

• You can’t let these new small, agile drivers ding your beautiful car (while they are texting).

• But that’s exactly what can happen when distributed services are using the mainframe

• Too many mainframe transactions can be triggered.

• Huge/expensive transactions can be triggered, where only a very small portion of the response is used/required.

• The mainframe is simply not being used as it was designed to be used.

• How to tackle this issue, and prevent those dings?

Now you’ve introduced the mobile users…

• Analyze the top X user actions based on production data

• What’s the use case?

• APM End-to-End can tell you what your top user actions are, by invocation count or response time

• What mainframe transactions are currently invoked to serve this use case?

• APM End-to-End can tell you exactly what mainframe transactions, programs, and DB2 activity are generated by these user actions

Focus on user actions
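Ranking user actions by invocation count or response time, together with the mainframe transactions they trigger, might look like the sketch below. The record layout, the action names, and the transaction IDs are invented for illustration; real input would come from whatever the APM solution exports.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical production records:
# (user action, response time in ms, mainframe transactions invoked)
RECORDS = [
    ("login", 420, ["TXN1"]),
    ("show balance", 1800, ["TXN2", "TXN3", "TXN2"]),
    ("login", 380, ["TXN1"]),
    ("show balance", 2200, ["TXN2", "TXN3", "TXN2"]),
    ("transfer", 950, ["TXN4"]),
    ("login", 400, ["TXN1"]),
]


def summarize(records):
    """Aggregate count, average response time, and mainframe calls per user action."""
    by_action = defaultdict(lambda: {"times": [], "txns": []})
    for action, ms, txns in records:
        by_action[action]["times"].append(ms)
        by_action[action]["txns"].extend(txns)
    return {
        action: {
            "count": len(data["times"]),
            "avg_ms": mean(data["times"]),
            "mainframe_txns": len(data["txns"]),
        }
        for action, data in by_action.items()
    }


def top(summary, key, x=2):
    """Return the top x user actions ordered by the given KPI, highest first."""
    return sorted(summary, key=lambda action: summary[action][key], reverse=True)[:x]


summary = summarize(RECORDS)
print("by invocation count:", top(summary, "count"))
print("by response time:  ", top(summary, "avg_ms"))
```

The two rankings differ here, which is the point of looking at both: the most frequent action is not necessarily the slowest or the one that drives the most mainframe work.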

• Analyze the top X user actions based on production data

• How many times is the same transaction/data needed?

• Could it be cached on the distributed side?

• Are new transactions needed to fit the needs of the distributed side in the most efficient way?

• Ultimately, what you want is an efficient, fast, and beautiful experience for your mobile users

Focus just on solutions for these transactions
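To judge how often the same transaction/data is requested, and so whether caching on the distributed side is worth considering, one can count identical calls. A minimal sketch with invented transaction names and keys; whether a given result may actually be cached depends on how fresh the data has to be, which this code does not decide.

```python
from collections import Counter


def repeat_report(calls):
    """Summarize repeated mainframe calls.

    `calls` is an iterable of (transaction, key) pairs sent to the mainframe.
    Returns the per-call counts, the total number of calls, and the number of
    calls that repeat an identical earlier call.
    """
    counts = Counter(calls)
    total = sum(counts.values())
    avoidable = total - len(counts)
    return counts, total, avoidable


# Hypothetical calls observed during one user session.
calls = [
    ("GETCUST", "cust=42"),
    ("GETRATE", "EUR/USD"),
    ("GETCUST", "cust=42"),
    ("GETRATE", "EUR/USD"),
    ("GETRATE", "EUR/USD"),
    ("GETACCT", "acct=7"),
]

counts, total, avoidable = repeat_report(calls)
print(f"{avoidable} of {total} calls repeat an identical earlier call")
for call, n in counts.most_common():
    print(call, n)
```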

Efficient, Fast & Beautiful

Ingredient 5: Always On

• What benefits could be worth this investment?

• What data should be captured in production?

• How many MIPS are burned for this purpose?

• We try to reduce MIPS wherever we can.

• Now we should monitor ALL transactions on the mainframe, 24/7, in production? Really?!

Always on – Brilliant Idea

1. You did APM in pre-production

• You know what your transactions look like

• You know how they are used from the distributed side

• You fixed any performance issues

2. Based on this knowledge you are able to predict

• What production data you are interested in

• How much data will be captured

• How big the investment is to capture this data

• MIPS

• APM Infrastructure

Prerequisites for always on APM
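The prediction step is simple arithmetic once the pre-production numbers are known. The sketch below estimates the daily capture volume and the CPU time spent on monitoring. Every input value is a made-up placeholder, and `daily_capture_estimate` is a hypothetical helper; converting CPU seconds into MIPS depends on the machine and is left out on purpose.

```python
SECONDS_PER_DAY = 86_400


def daily_capture_estimate(tx_per_second, monitored_share, bytes_per_tx, cpu_ms_per_tx):
    """Estimate what always-on monitoring captures and costs per day.

    tx_per_second    average mainframe transaction rate
    monitored_share  fraction of transactions that are of interest (0.0 to 1.0)
    bytes_per_tx     monitoring data captured per transaction
    cpu_ms_per_tx    CPU overhead of monitoring one transaction, in milliseconds
    """
    monitored_tx = tx_per_second * monitored_share * SECONDS_PER_DAY
    return {
        "monitored_tx_per_day": monitored_tx,
        "gigabytes_per_day": monitored_tx * bytes_per_tx / 1e9,
        "cpu_seconds_per_day": monitored_tx * cpu_ms_per_tx / 1000,
    }


# Placeholder inputs; replace them with values measured in pre-production.
estimate = daily_capture_estimate(
    tx_per_second=500,
    monitored_share=0.2,
    bytes_per_tx=2_000,
    cpu_ms_per_tx=0.05,
)
for name, value in estimate.items():
    print(f"{name}: {value:,.2f}")
```

Restricting `monitored_share` to the transactions of interest is what keeps both the data volume and the CPU overhead predictable.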

Reason #1 – Mobile Workload Pricing

• What is the operational cost for your new web application or mobile application?

• Across the entire enterprise

• Which user actions are completed in this rollout?

• How do these new/modified actions affect mainframe activity?

• Impact on your KPI metrics

• Did these increase/decrease compared to the previous version?

• Baselining

• Session comparison.

Reason #2 – Total Cost of Ownership

• How is the response time?

• Who are the main contributors?

• What’s the bounce rate?

• What’s the conversion rate?

• How many user actions are failing?

• Failing due to z/OS activity

• MQ Queues

• CICS problems

Reason #3 – User Satisfaction

• Let’s discover issues before they affect the end user

• If a user does experience an issue

• What went wrong with this particular transaction?

• Attach this information to a ticket and pass it to development

• No need to reproduce the issue; all the information is there

• The root cause is identified in minutes, with no war room

• Fix it and prevent other users from experiencing the same issue

Reason #4 – Customer Care

Conclusion

• One view – over all departments/teams, end-to-end from the tap on the mobile device down to the DB2 backend.

• Top down – Start at 30,000 feet and dig into the details for root cause analysis.

• Start early – Catch performance issues as early as possible.

• Focus – Just focus on the applications/transactions that matter.

• Always on – Trend CPU resources and response time in production, for 100% of the transactions of interest, 24x7.

And don’t forget the recipe…