Practical Tips for Ops: End User Monitoring
TRANSCRIPT
Andreas Grabner, Chief DevOps Activist @ Dynatrace, Twitter: @grabnerandi
Brian Chandler, Sales Engineer @ Dynatrace, Twitter: @Channer531
Practical Tips for Ops: End User Monitoring
The DevOps Journey Series Part 3
State of DevOps Report Adoption Metrics
200x more frequent deployments, 2,555x faster lead times than their peers
Dynatrace DevOps Adoption Metrics
12x more feature releases
170 deployments / day
93% of production bugs found before impacting end users
Interesting Ops Learnings from Adopters
New Tech Stack and Architectures
3rd Party / CDN
More Apps / Multi-Version
“Twitter-Driven” Load Models
DevOps Requirements and Engagement Options for Ops
Feedback through High Quality App & User Data
Ops as a Service: “Self-Service for Application Teams” + Promote YOUR Monitoring through Shift-Left
Bridge the Gap between Server Side and End User
Shift-Left: (No)Ops as “Part of Application Delivery”
Requirements
Engagement Options
1: Basic App Monitoring
2: App Dependencies
3: End User Monitoring
How to monitor mobile vs desktop vs tablet vs service endpoints? How much network bandwidth is required per app, service and feature? Where to start optimizing bandwidth: CDNs, Caching, Compression?
Are our applications up and running? What load patterns do we have per application? What is the resource consumption per application?
What are the dependencies between apps, services, DB and infra? How to monitor “non-custom app” tiers? Where are the dependency bottlenecks? Where is the weakest link?
Closing the Ops to Dev Feedback Loop: One Step at a Time!
4: “Soft-Launch” Support
5: Virtualization Monitoring
How to automatically monitor virtual and container instances? What to monitor when deploying into public or private clouds?
How to deploy and monitor multiple versions of the same app / service? What and how to baseline? Do we have a better or worse version of an app/service/feature?
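The soft-launch questions above boil down to baselining: record response times for the version already in production, derive a threshold from them, and check the candidate version against it. A minimal sketch of that idea — all sample data and the 3-sigma tolerance are hypothetical, not the tool's actual algorithm:

```python
import statistics

def build_baseline(samples, tolerance=3.0):
    """Derive an alerting threshold (mean + tolerance * stdev) from
    historical response times, given in seconds."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean + tolerance * stdev

def version_regressed(baseline_samples, new_samples, tolerance=3.0):
    """Flag the new version if its median response time breaks the baseline."""
    threshold = build_baseline(baseline_samples, tolerance)
    return statistics.median(new_samples) > threshold

# Hypothetical response times (seconds): current version vs. soft-launched one
current = [0.8, 0.9, 1.0, 0.85, 0.95, 0.9]
candidate = [2.4, 2.6, 2.5, 2.7]
print(version_regressed(current, candidate))  # a clear regression
```

The same comparison, run per app, service, or feature, answers “do we have a better or worse version?” for each deployment.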
Ops: Need answers to these questions! Closing the gap to App/Biz/Dev
Ready for “Cloud Native”
How to alert on real problems and not architectural patterns? How to consolidate monitoring between Cloud Native and Enterprise?
Who is using our apps? Geo? Device? Which features are used? What's the behavior? Where to start optimizing? App Flow? Page Size? Conversion Rates? Bounce Rates?
Where are the performance / resource hotspots? When and where do applications break?
Do we have bad dependencies through code or config? How does the system really behave in production? What to learn for future architectures?
What are the usage patterns for A/B or Green/Blue? Difference between different versions and features?
Does the architecture work in these dynamic environments? Does scale up/down work as expected?
6: Provide “Monitoring as a Service” for Cloud Native Application Teams
Today
confidential
How End User Monitoring Works!
Outside-In Perspective: See your app from your users' perspective
User Experience = Availability (Synthetic) + Performance, Errors & User Behavior (Real Users)
Every User, Every Click, Every App/Version
Visibility into Visitors and Sessions!
#1: Unique Visitors
#2: All Sessions
#3: Across all Apps
#4: Full Details for each Session
Seeing Every Single Step Along the Way!
#1: Timeline of a single User Session
#2: Details for each User Action
#3: User Experience Breakdown
+ Events (Errors/Crashes)
#4: User Experience
Optimize Performance to Impact Behavior
#1: Performance Data
#2: Behavior Data
Key User Experience Metrics Feedback
#1: Who are they?
#2: Bandwidth!
#3: Response Time Breakdown
#4: Conversions: Total & Rate
#5: Client-Side Errors!
#6: CPU / Memory
#7: Key User Action(s)
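Conversion total and rate fall straight out of the session data: count the sessions that contain the key user action and divide by all sessions. A sketch with invented session records — the field names are a stand-in, not the actual data model:

```python
# Hypothetical session records: did the visit include the key conversion action?
sessions = [
    {"user": "u1", "converted": True},
    {"user": "u2", "converted": False},
    {"user": "u3", "converted": True},
    {"user": "u4", "converted": False},
    {"user": "u5", "converted": False},
]

def conversion_metrics(sessions):
    """Return (total conversions, conversion rate in percent)."""
    total = sum(1 for s in sessions if s["converted"])
    rate = 100.0 * total / len(sessions) if sessions else 0.0
    return total, rate

total, rate = conversion_metrics(sessions)
print(f"{total} conversions, {rate:.1f}% conversion rate")  # 2 conversions, 40.0% conversion rate
```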
Questions to answer!
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
50,000 Foot View on User Experience
Bird's-eye view of holistic user experience
Green – Satisfied
Yellow – Tolerating
Red – Frustrated
• Line chart represents volume
• Market Open
• 60 User Actions per second
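The satisfied / tolerating / frustrated buckets follow the Apdex model: one threshold T separates satisfied from tolerating, 4T separates tolerating from frustrated, and the score weights tolerating actions by half. A sketch with a hypothetical T of 3 seconds and made-up action times:

```python
def classify(action_time, threshold=3.0):
    """Apdex-style bucketing: satisfied <= T, tolerating <= 4T, else frustrated."""
    if action_time <= threshold:
        return "satisfied"   # green
    if action_time <= 4 * threshold:
        return "tolerating"  # yellow
    return "frustrated"      # red

def apdex(action_times, threshold=3.0):
    """Apdex score: (satisfied + tolerating / 2) / total actions."""
    buckets = [classify(t, threshold) for t in action_times]
    satisfied = buckets.count("satisfied")
    tolerating = buckets.count("tolerating")
    return (satisfied + tolerating / 2) / len(buckets)

times = [1.2, 2.8, 3.5, 14.0, 0.9]  # hypothetical user action durations (seconds)
print(apdex(times))
```

A score near 1.0 means mostly green; mostly red drives it toward 0.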
50,000 Foot View on User Experience
#1: User Experience
#2: Load
#3: App Layer
Focus on high value users and branches
Visual recognition of a problem
Popular dashboard template for execs
10,000 Foot View on User Experience
Hyperlyzer: Close-Up View
#1: Multi-Dimensional Analysis
#2: Top Findings
Understanding user click path
Analyze browser performance problems
Recognize performance patterns within branches
Ground-Level View
Automated Key User Experience Findings
#1: Key WPO Findings
#2: Actionable for Devs
Automated Comparison
#1: Compare with previous Timeframe / Release
#2: Actionable Diff-View for Devs
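A diff view like the one above can be approximated by comparing per-action medians across two timeframes and surfacing only meaningful changes. Everything here — action names, timings, the 0.25 s tolerance — is made up for illustration:

```python
# Hypothetical median action times (seconds) for two timeframes / releases
before = {"/login": 1.1, "/search": 0.8, "/checkout": 2.0}
after = {"/login": 1.0, "/search": 1.9, "/checkout": 2.1, "/wishlist": 0.7}

def diff_view(before, after, tolerance=0.25):
    """List actions whose median changed by more than `tolerance` seconds,
    plus actions that only exist in the new timeframe."""
    findings = []
    for action, t in sorted(after.items()):
        if action not in before:
            findings.append((action, "new action"))
        elif abs(t - before[action]) > tolerance:
            direction = "slower" if t > before[action] else "faster"
            findings.append((action, f"{direction} by {abs(t - before[action]):.1f}s"))
    return findings

for action, finding in diff_view(before, after):
    print(action, "->", finding)
```

Actions inside the tolerance band drop out, leaving developers only the actionable differences.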
User Experience: Green – Satisfied, Yellow – Tolerating, Red – Frustrated
API Performance: Green – Fast, Yellow – Warning, Red – Slow, Purple – Error
• Problem with mainframe (HPNS)
• Major outage on proprietary web server
• Notification of the problem at 5:30am
Purple creeping death
Automated JavaScript Error Analysis
#1: JavaScript Errors
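Grouping raw client-side error beacons by message and counting distinct affected users is the core of such an analysis. The beacon format below is a stand-in, not the actual Dynatrace data model:

```python
from collections import Counter

# Hypothetical client-side error beacons: (error message, user id)
beacons = [
    ("TypeError: undefined is not a function", "u1"),
    ("TypeError: undefined is not a function", "u2"),
    ("ReferenceError: ga is not defined", "u1"),
    ("TypeError: undefined is not a function", "u1"),
]

def top_errors(beacons):
    """Rank JavaScript errors by occurrence count and distinct affected users."""
    counts = Counter(msg for msg, _ in beacons)
    users = {}
    for msg, user in beacons:
        users.setdefault(msg, set()).add(user)
    return [(msg, n, len(users[msg])) for msg, n in counts.most_common()]

for msg, occurrences, affected in top_errors(beacons):
    print(f"{occurrences}x ({affected} users): {msg}")
```

Sorting by affected users rather than raw occurrences often changes which error gets fixed first.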
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Daily Traffic Pattern – bucketizing usage
Client Center sees a peak of about 3,800 Requests/min against its API.
60 unique calls/functions that make up the Client Center API
~20% of that traffic is ClientCenter/API/Holdings
~20% of that traffic is ClientCenter/API/ClientDetails
~20% of that traffic is ClientCenter/API/RecentSearch
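The bucketizing shown above is a group-by over the request log: count requests per endpoint and express each as a share of the total. A sketch with made-up volumes mirroring the ~20% splits:

```python
from collections import Counter

# Hypothetical request log: one endpoint name per request
requests = (
    ["ClientCenter/API/Holdings"] * 200
    + ["ClientCenter/API/ClientDetails"] * 200
    + ["ClientCenter/API/RecentSearch"] * 200
    + ["ClientCenter/API/Other"] * 400
)

def traffic_share(requests):
    """Percentage of total traffic per endpoint, largest share first."""
    counts = Counter(requests)
    total = len(requests)
    return [(endpoint, 100.0 * n / total) for endpoint, n in counts.most_common()]

for endpoint, share in traffic_share(requests):
    print(f"{share:.0f}% {endpoint}")
```

With real data, the long tail below the top three endpoints is usually where cache or compression wins hide.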
Auto-Detect Top/Flop User Actions
#1: Feature Analysis by Usage, Performance, Failures, …
#2: Frontend Analysis
#3: Backend Analysis
Automated Resource (DB) Usage Analysis
#1: DB Usage per Feature / Page
Feature Resource Analytics
Automated Resource Impact Analysis
#1: Impact by Resource Type
#2: Impact by Delivery Model
Automated CPU Consumption for User Actions
#1: Server-Side CPU Impact
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
Automated Mobile Version Usage Monitoring
#1: Which versions are used?
#2: Where and When is mobile used?
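Version-usage monitoring reduces to counting sessions per reported app version. The session tuples below are hypothetical:

```python
from collections import Counter

# Hypothetical mobile sessions: (app version, platform)
sessions = [
    ("3.2.1", "iOS"), ("3.2.1", "Android"), ("3.2.0", "Android"),
    ("3.2.1", "iOS"), ("3.1.9", "iOS"), ("3.2.1", "Android"),
]

def version_usage(sessions):
    """Session count per app version, most-used version first."""
    return Counter(version for version, _ in sessions).most_common()

print(version_usage(sessions))
```

The same group-by over platform or geography answers “where and when is mobile used?”.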
Automated Mobile Crash Analytics
#1: Crash Overview
#2: Crash details
Questions to answer!
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
How Can You Scale in the New DevOps World?
New Tech Stack and Architectures
3rd Party / CDN
More Apps / Multi-Version
“Twitter-Driven” Load Models
Confidential, Dynatrace, LLC
Monitoring redefined: Every user, every app, everywhere. AI-powered, full-stack, automated.
Full lifecycle - development, test, and production
Complete monitoring coverage for all applications
Digital experience analytics | Application performance | Cloud, container, infrastructure
Agents | Wire data | Synthetics | Log data | Real user monitoring
Auto Discover Apps, Monitor, Baseline and Alert
#1: Peak Load on frontend
#2: Auto Detected Errors
Automated Problem and Impact Detection
Automatic Integration with ChatOps
A better way
Self-service for all
Automated monitoring
User experience is everything
More time innovating, not monitoring
DXS DevOps Xcelerator will: Differentiate your sale Create value based outcomes Accelerate growth opportunities
Watch the DXS Enablement Course on Dynatrace University!
Stop by the DXS networking table to learn more!
Q & A
Brian Chandler, Sales Engineer @ Dynatrace, @Channer531
Andreas Grabner, Chief DevOps Activist @ Dynatrace, @grabnerandi
Try Dynatrace: http://bit.ly/dtsaastrial
Listen to our Podcast: http://bit.ly/pureperf
Read more on our blog: http://blog.dynatrace.com