Practical Tips for Ops: End User Monitoring
TRANSCRIPT
Andreas Grabner, Chief DevOps Activist @ Dynatrace, Twitter: @grabnerandi
Brian Chandler, Sales Engineer @ Dynatrace, Twitter: @Channer531
Practical Tips for Ops: End User Monitoring
The DevOps Journey Series Part 3
State of DevOps Report Adoption Metrics
200x more frequent deployments, 2,555x faster lead times than their peers
Dynatrace DevOps Adoption Metrics
12x more feature releases
170 deployments / day
93% of production bugs found before impacting end users
Interesting Ops Learnings from Adopters
New Tech Stack and Architectures
3rd Party / CDN
More Apps / Multi-Version
“Twitter-Driven” Load Models
DevOps Requirements and Engagement Options for Ops
Feedback through High Quality App & User Data
Ops as a Service: “Self-Service for Application Teams” + Promote YOUR Monitoring through Shift-Left
Bridge the Gap between Server Side and End User
Shift-Left: (No)Ops as “Part of Application Delivery”
Requirements
Engagement Options
1: Basic App Monitoring
2: App Dependencies
3: End User Monitoring
How to monitor mobile vs desktop vs tablet vs service endpoints? How much network bandwidth is required per app, service and feature? Where to start optimizing bandwidth: CDNs, Caching, Compression?
Are our applications up and running? What load patterns do we have per application? What is the resource consumption per application?
What are the dependencies between apps, services, DB and infra? How to monitor “non-custom app” tiers? Where are the dependency bottlenecks? Where is the weakest link?
Closing the Ops to Dev Feedback Loop: One Step at a Time!
4: “Soft-Launch” Support
5: Virtualization Monitoring
How to automatically monitor virtual and container instances? What to monitor when deploying into public or private clouds?
How to deploy and monitor multiple versions of the same app / service? What and how to baseline? Do we have a better or worse version of an app/service/feature?
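The soft-launch questions above boil down to baselining: record response times for the version already in production, derive a threshold from them, and check the candidate version against it. A minimal sketch of that idea — all sample data and the 3-sigma tolerance are hypothetical, not the tool's actual algorithm:

```python
import statistics

def build_baseline(samples, tolerance=3.0):
    """Derive an alerting threshold (mean + tolerance * stdev) from
    historical response times, given in seconds."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples)
    return mean + tolerance * stdev

def version_regressed(baseline_samples, new_samples, tolerance=3.0):
    """Flag the new version if its median response time breaks the baseline."""
    threshold = build_baseline(baseline_samples, tolerance)
    return statistics.median(new_samples) > threshold

# Hypothetical response times (seconds): current version vs. soft-launched one
current = [0.8, 0.9, 1.0, 0.85, 0.95, 0.9]
candidate = [2.4, 2.6, 2.5, 2.7]
print(version_regressed(current, candidate))  # a clear regression
```

The same comparison, run per app, service, or feature, answers “do we have a better or worse version?” for each deployment.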
Ops: Need answers to these questions! Closing the gap to App/Biz/Dev
Ready for “Cloud Native”
How to alert on real problems and not architectural patterns? How to consolidate monitoring between Cloud Native and Enterprise?
Who is using our apps? Geo? Device? Which features are used? What's the behavior? Where to start optimizing? App Flow? Page Size? Conversion Rates? Bounce Rates?
Where are the performance / resource hotspots? When and where do applications break?
Do we have bad dependencies through code or config? How does the system really behave in production? What to learn for future architectures?
What are the usage patterns for A/B or Green/Blue? Difference between different versions and features?
Does the architecture work in these dynamic environments? Does scale up/down work as expected?
6: Provide “Monitoring as a Service” for Cloud Native Application Teams
Today
confidential
How End User Monitoring Works!
Outside-In Perspective: See your app from your users' perspective
User Experience = Availability (Synthetic) + Performance, Errors & User Behavior (Real Users)
Every User, Every Click, Every App/Version
Visibility into Visitors and Sessions!
#1: Unique Visitors
#2: All Sessions
#3: Across all Apps
#4: Full Details for each Session
Seeing Every Single Step Along the Way!
#1: Timeline of a single User Session
#2: Details for each User Action
#3: User Experience Breakdown
+ Events (Errors/Crashes)
#4: User Experience
Optimize Performance to Impact Behavior
#1: Performance Data
#2: Behavior Data
Key User Experience Metrics Feedback
#1: Who are they?
#2: Bandwidth!
#3: Response Time Breakdown
#4: Conversions: Total & Rate
#5: Client-Side Errors!
#6: CPU / Memory
#7: Key User Action(s)
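Conversion total and rate fall straight out of the session data: count the sessions that contain the key user action and divide by all sessions. A sketch with invented session records — the field names are a stand-in, not the actual data model:

```python
# Hypothetical session records: did the visit include the key conversion action?
sessions = [
    {"user": "u1", "converted": True},
    {"user": "u2", "converted": False},
    {"user": "u3", "converted": True},
    {"user": "u4", "converted": False},
    {"user": "u5", "converted": False},
]

def conversion_metrics(sessions):
    """Return (total conversions, conversion rate in percent)."""
    total = sum(1 for s in sessions if s["converted"])
    rate = 100.0 * total / len(sessions) if sessions else 0.0
    return total, rate

total, rate = conversion_metrics(sessions)
print(f"{total} conversions, {rate:.1f}% conversion rate")  # 2 conversions, 40.0% conversion rate
```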
Questions to answer!
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
50,000 Foot View on User Experience
Bird's-eye view of holistic user experience
Green – Satisfied
Yellow – Tolerating
Red – Frustrated
• Line chart represents volume
• Market Open
• 60 User Actions per second
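The satisfied / tolerating / frustrated buckets follow the Apdex model: one threshold T separates satisfied from tolerating, 4T separates tolerating from frustrated, and the score weights tolerating actions by half. A sketch with a hypothetical T of 3 seconds and made-up action times:

```python
def classify(action_time, threshold=3.0):
    """Apdex-style bucketing: satisfied <= T, tolerating <= 4T, else frustrated."""
    if action_time <= threshold:
        return "satisfied"   # green
    if action_time <= 4 * threshold:
        return "tolerating"  # yellow
    return "frustrated"      # red

def apdex(action_times, threshold=3.0):
    """Apdex score: (satisfied + tolerating / 2) / total actions."""
    buckets = [classify(t, threshold) for t in action_times]
    satisfied = buckets.count("satisfied")
    tolerating = buckets.count("tolerating")
    return (satisfied + tolerating / 2) / len(buckets)

times = [1.2, 2.8, 3.5, 14.0, 0.9]  # hypothetical user action durations (seconds)
print(apdex(times))
```

A score near 1.0 means mostly green; mostly red drives it toward 0.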
50,000 Foot View on User Experience
#1: User Experience
#2: Load
#3: App Layer
Focus on high value users and branches
Visual recognition of a problem
Popular dashboard template for execs
10,000 Foot View on User Experience
Hyperlyzer: Close-Up View
#1: Multi-Dimensional Analysis
#2: Top Findings
Understanding user click path
Analyze browser performance problems
Recognize performance patterns within branches
Ground-Level View
Automated Key User Experience Findings
#1: Key WPO Findings
#2: Actionable for Devs
Automated Comparison
#1: Compare with previous Timeframe / Release
#2: Actionable Diff-View for Devs
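A diff view like the one above can be approximated by comparing per-action medians across two timeframes and surfacing only meaningful changes. Everything here — action names, timings, the 0.25 s tolerance — is made up for illustration:

```python
# Hypothetical median action times (seconds) for two timeframes / releases
before = {"/login": 1.1, "/search": 0.8, "/checkout": 2.0}
after = {"/login": 1.0, "/search": 1.9, "/checkout": 2.1, "/wishlist": 0.7}

def diff_view(before, after, tolerance=0.25):
    """List actions whose median changed by more than `tolerance` seconds,
    plus actions that only exist in the new timeframe."""
    findings = []
    for action, t in sorted(after.items()):
        if action not in before:
            findings.append((action, "new action"))
        elif abs(t - before[action]) > tolerance:
            direction = "slower" if t > before[action] else "faster"
            findings.append((action, f"{direction} by {abs(t - before[action]):.1f}s"))
    return findings

for action, finding in diff_view(before, after):
    print(action, "->", finding)
```

Actions inside the tolerance band drop out, leaving developers only the actionable differences.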
User Experience: Green – Satisfied, Yellow – Tolerating, Red – Frustrated
API Performance: Green – Fast, Yellow – Warning, Red – Slow, Purple – Error
• Problem with mainframe (HPNS)
• Major outage on proprietary web server
• Notification of the problem at 5:30am
Purple creeping death
Automated JavaScript Error Analysis
#1: JavaScript Errors
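Grouping raw client-side error beacons by message and counting distinct affected users is the core of such an analysis. The beacon format below is a stand-in, not the actual Dynatrace data model:

```python
from collections import Counter

# Hypothetical client-side error beacons: (error message, user id)
beacons = [
    ("TypeError: undefined is not a function", "u1"),
    ("TypeError: undefined is not a function", "u2"),
    ("ReferenceError: ga is not defined", "u1"),
    ("TypeError: undefined is not a function", "u1"),
]

def top_errors(beacons):
    """Rank JavaScript errors by occurrence count and distinct affected users."""
    counts = Counter(msg for msg, _ in beacons)
    users = {}
    for msg, user in beacons:
        users.setdefault(msg, set()).add(user)
    return [(msg, n, len(users[msg])) for msg, n in counts.most_common()]

for msg, occurrences, affected in top_errors(beacons):
    print(f"{occurrences}x ({affected} users): {msg}")
```

Sorting by affected users rather than raw occurrences often changes which error gets fixed first.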
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Daily Traffic Pattern – bucketizing usage
Client Center sees a peak of about 3,800 Requests/min against its API.
60 unique calls/functions that make up the Client Center API
~20% of that traffic is ClientCenter/API/Holdings
~20% of that traffic is ClientCenter/API/ClientDetails
~20% of that traffic is ClientCenter/API/RecentSearch
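The bucketizing shown above is a group-by over the request log: count requests per endpoint and express each as a share of the total. A sketch with made-up volumes mirroring the ~20% splits:

```python
from collections import Counter

# Hypothetical request log: one endpoint name per request
requests = (
    ["ClientCenter/API/Holdings"] * 200
    + ["ClientCenter/API/ClientDetails"] * 200
    + ["ClientCenter/API/RecentSearch"] * 200
    + ["ClientCenter/API/Other"] * 400
)

def traffic_share(requests):
    """Percentage of total traffic per endpoint, largest share first."""
    counts = Counter(requests)
    total = len(requests)
    return [(endpoint, 100.0 * n / total) for endpoint, n in counts.most_common()]

for endpoint, share in traffic_share(requests):
    print(f"{share:.0f}% {endpoint}")
```

With real data, the long tail below the top three endpoints is usually where cache or compression wins hide.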
Auto-Detect Top/Flop User Actions
#1: Feature Analysis by Usage, Performance, Failures, …
#2: Frontend Analysis
#3: Backend Analysis
Automated Resource (DB) Usage Analysis
#1: DB Usage per Feature / Page
Feature Resource Analytics
Automated Resource Impact Analysis
#1: Impact by Resource Type
#2: Impact by Delivery Model
Automated CPU Consumption for User Actions
#1: Server-Side CPU Impact
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
Automated Mobile Version Usage Monitoring
#1: Which versions are used?
#2: Where and When is mobile used?
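Version-usage monitoring reduces to counting sessions per reported app version. The session tuples below are hypothetical:

```python
from collections import Counter

# Hypothetical mobile sessions: (app version, platform)
sessions = [
    ("3.2.1", "iOS"), ("3.2.1", "Android"), ("3.2.0", "Android"),
    ("3.2.1", "iOS"), ("3.1.9", "iOS"), ("3.2.1", "Android"),
]

def version_usage(sessions):
    """Session count per app version, most-used version first."""
    return Counter(version for version, _ in sessions).most_common()

print(version_usage(sessions))
```

The same group-by over platform or geography answers “where and when is mobile used?”.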
Automated Mobile Crash Analytics
#1: Crash Overview
#2: Crash details
Questions to answer!
Efficiency: How to optimize end user experience, infrastructure & costs? Optimize Top vs Remove Flop Features! Analyze and optimize page load, network traffic and costs!
Impact: Do we impact our end users' experience? Is the issue in Content Delivery, Network or Server Side? Can users use our services? Crashes? Bad or Slow Responses?
Mobile as First-Class Citizen! Usage feedback based on mobile versions & user experience. Analyze crashes and optimize server-side resource usage.
How Can You Scale in the New DevOps World?
New Tech Stack and Architectures
3rd Party / CDN
More Apps / Multi-Version
“Twitter-Driven” Load Models
Confidential, Dynatrace, LLC
Monitoring redefined: Every user, every app, everywhere. AI-powered, full-stack, automated.
Full lifecycle - development, test, and production
Complete monitoring coverage for all applications
Digital experience analytics | Application performance | Cloud, container, infrastructure
Agents | Wire data | Synthetics | Log data | Real user monitoring
Auto Discover Apps, Monitor, Baseline and Alert
#1: Peak Load on frontend
#2: Auto Detected Errors
Automated Problem and Impact Detection
Automatic Integration with ChatOps
A better way
Self-service for all
Automated monitoring
User experience is everything
More time innovating, not monitoring
DXS DevOps Xcelerator will: Differentiate your sale Create value based outcomes Accelerate growth opportunities
Watch the DXS Enablement Course on Dynatrace University!
Stop by the DXS networking table to learn more!
Q & A
Brian Chandler, Sales Engineer @ Dynatrace, @Channer531
Andreas Grabner, Chief DevOps Activist @ Dynatrace, @grabnerandi
Try Dynatrace: http://bit.ly/dtsaastrial
Listen to our Podcast: http://bit.ly/pureperf
Read more on our blog: http://blog.dynatrace.com