a taste of monitoring and post mortem debugging with node
Post on 16-Feb-2017
1.320 Views
Preview:
TRANSCRIPT
§ Michael Dawson/Michael Tunnicliffe
§ September 2015
A Taste of Monitoring and Post Mortem Debugging with Node
A Workshop
2
To talk about: Application reliability and performance● Common problem scenarios● Monitoring● Problem diagnosis● Tailoring tooling to your application
Try it for yourself with prepared hands-on activities
To share and learn from each others experiences
Why are we here ?
3
• Meet and Greet
• Runtime Monitoring• Concepts and Tools• Hands on - Monitor an App
• GC Visualisation• Concepts and Tools• Hands on – visualize gc activity
• Post Mortem Debugging • Concepts and Tools• Hands on with a heap dump• Hands on with a core dump
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour (if appropriate)
Execution State/Native Memory
Retrospective
Improve your tools
Modelled on typical workflow
Agenda
4
About Michael Dawson Loves the web and building software
Senior Software Developer @ IBM22 years development experience10 years in runtime development
Contact me: Michael_dawson@ca.ibm.comhttps://www.linkedin.com/pub/michael-dawson/2/128/605Twitter: @mhdawson1
5
About Mike TunnicliffeBug detective and low level systems programmer
Software Developer @ IBM15 years development experience12 years in runtime development
Contact me: mike.tunnicliffe@uk.ibm.com@fierydrake
6
• Interest in Monitoring/Post mortem debugging ?
• What you hope to get out of the workshop today ?
About You (workshop participants)
Used Node Before ?
Role (Dev/QA/Ops) ?
Apps in Production ?
Production Monitoring ?
Background
Memory Usage ?
GC activity
Throughput
Reponse Time
Something else ?
Metrics – what do you track
High Memory
Memory Exhaustion
Memory Leaks
Footprint
Performance
Crash
Unhandled Exceptions
Problems – what have you seen
7
• USB drive provided for easy installation
• IBM is not a distributor
• While freely available, licences apply to some components● Licences are included on USB drive and in install packages● By copying from USB drive you are accepting licence terms in USB drive
as licensee
• Printed copy also available
• Alternatively follow full download/install instructions.
• Install Instructions● Pre-Install/Instructions.txt● Self-Install/Instructions.txt
Workshop materials
8
• Runtime Monitoring
• GC Visualization
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Modelled on typical workflow
Runtime Monitoring
9
To understand the “real” characteristics of the application
• In production – health check under real load
• During development – test under artificial load
Why do runtime monitoring?
10
Data about the application and the environment on which it runs, typically:
• Machine and operating system• Node.js and dependencies (eg V8)• Application and dependencies (eg http, mongodb)
What will it give you?
11
Typically production monitoring consists of:
• Data sources• Data store aggregator• Web-based visualizer
How it works
Browser
MonitoringData
HTTP
Loa
d B
alan
cer
12
Production monitoring solutions
13
Setup
• Run startelk
• Run startapps
• Open Kibana in a browserhttp://localhost:5601/
• Open the dashboard
Try to identify any potential problems with the monitored applications
Monitoring – Hands on – 20 minutes
14
Data Sources (open source)
• ElasticSearch / Kibana (ELK)● bunyan-child, hapi-good-logstash, winston-kibana, log4js-elasticsearch,
black-pearl, appmetrics-elk, etc
• StatsD / Graphite● Express-metrics, connect-statsd, mongodb-datadog-statsd, node-statsd,
statsd-profiler, node-statsd-instrument, appmetrics-statsd, etc.
• Munin● nodejs-munin, munin-plugin, etc
• Custom:● monitor, appmetrics, etc
15
Node Application Metrics ('appmetrics')
• https://www.npmjs.com/package/appmetrics
• https://github.com/RuntimeTools/appmetrics
• Open Source under Apache 2.0 License*
• Provides:● producer and consumer APIs● monkey-patching framework and “dropins”● collection of connectors to existing monitoring stacks
*has a (currently) proprietary dependency
16
appmetrics Data Collection
Data Type Event Name Data Provided
Application Data
HTTP request/response 'http' Time, Method, URL, Duration
MySQL database queries 'mysql' Time, SQL, Duration
MongoDB database queries 'mongo' Time, Query, Duration
Request tracking 'request' Time, Event Tree, Duration
JavaScript Runtime
V8 Garbage Collection 'gc' Time, Type, Used, Size, Duration
V8 Profiler 'profiling' Time, Function Array, Samples
System Resources
CPU Usage 'cpu' Time, Process, System
Memory Usage 'memory' Time, Process (Physical, Reserved, Virtual), System (Physical, Total)
17
Production Monitoring with 'appmetrics'
18
Development Monitoring with 'appmetrics'
Health Centerfor Eclipse
appmetrics-dash
19
Appmetrics and ElasticSearch/Kibana (ELK)var appmetrics = require('appmetrics-elk').monitor();
var http = require('http');var server = http.createServer(function handler(req, res) {
…
}).listen(process.env.PORT || 3000); console.log('App listening on port 3000');});
ES ES ES ES
client.bulk()
visualize
20
'appmetrics' and Health Center for Eclipse
• Free from Eclipse Marketplace• http://marketplace.eclipse.org/content/ibm-monitoring-and-diagnostic-tools-health-center• Supports Node and Java
• Uses MQTT for the connection
Pub/Sub BrokerTopic namespace:/ibm/healthcenter
/<namespace>/<agent_id>/<datatype>
Node.js
Application
LibertyJava
Application
MQTT
MQTT
MQTT
MQTT
MQTT Broker, eg. mosca(npm install -g mosca)
21
Health CenterEnvironment Hardware and Operating System ConfigurationProcess environment and configurationHighlights incorrect or non-standard configurations
Memory UtilizationDetect native memory leaks in applicationDetermine if external forces are using more memoryView components using the most native memory
CPU Utilization Visualizes process CPU usage over time Visualizes system CPU usage over time
22
Health CenterGarbage CollectionVisualizes heap usage and GC pause timesIdentifies memory leaksSuggests command-line and tuning parameters
Function ProfilingFunction profiling shows application activityIdentifies the hottest methods in an applicationFull call stacks to identify where functions are being
called from and what functions they call
23
Monitoring – Hands on – 20 minutes• Run sample app (gcload.bat/gcload.sh) and connect with Health Center
• Look at environment information• Look through other available panels• Look at gc activity so see periodic allocation load
• Run sample app with heavy cpu usage (cpuload.bat/cpuload.sh) • Enable profiling • Look at profile, map back to application source
• Run custom program that consumes data. • Edit gcload.js to un-comment line that create monitor instance• Re-run and see output on console.
24
• Runtime Monitoring
• GC Visualisation
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Modelled on typical workflow
Runtime Monitoring
25
GC Visualization• Visualisation of gc data long known to be important
• There are many existing gc visualizers (mostly Java)
•
•
•
•
•
• Javascript● https://github.com/adobe-research/GCview● http://marketplace.eclipse.org/content/ibm-monitoring-and-diagnostic-tools-garbage-collection-and-memory-visualizer-gcmv
We’ll use GCMV since we are most familiar with it
26
GCMV
• Free from Eclipse Marketplace • http://marketplace.eclipse.org/content/ibm-monitoring-and-diagnostic-tools-
garbage-collection-and-memory-visualizer-gcmv
• Supports both Node and Java
27
GC Visualization – Hands on (15mins)• Run sample app with memory leak with gc options enabled (gcloadWithVerbose)
• Observe there is a lot of information
• Run again sending to file
• Open up resulting file in GCMV
28
• Runtime Monitoring
• GC Visualisation
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Post Mortem Debugging
29
• 2 Common artifacts• Heap Dumps• Core Dumps
• No standards yet but workgroup:• https://github.com/nodejs/post-mortem
• Heap dumps provide insight into Heap memory usage
• Core Dumps can provide insight into:• Heap memory usage• Runtime State
• What was running at the time of a crash – native and JS stacks• Values in native memory
• Native memory usage
Post Mortem Debugging
30
• Run Sample OOM program
• Generate Heapdump on OOM
• Load OOM into Chrome developer tools
• Show how to find what is keeps memory alive
• Participants explore
Chrome Developer Tools – Hands on – 10 minutes
31
• Free from Eclipse Marketplace• http://marketplace.eclipse.org/content/ibm-monitoring-and-diagnostic-tools-interactive-diagnostic-data-explorer-idde
• Multi-runtime → Supports Node and Java
• Currently requires dumps from IBM SDK for Node.js
• http://www.ibm.com/developerworks/web/nodesdk/
Interactive Diagnostic Data Explorer (IDDE)
32
• Open pre-provided core file
• Use IDDE commands to show where crash occurred
• Help to list all commands
• Explore
IDDE – Hands On – 10 minutes
33
• Runtime Monitoring
• GC Visualisation
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Retrospective
34
• Experience from Participants (Open discussion)– Other tools
– Key issues
– Workflow
• Gaps Brainstorm
– Key gaps
– Post mortem debugging – useful commands ?
Other Experiences/Gaps Brainstorm (10-15 mins)
35
• Runtime Monitoring
• GC Visualisation
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Improve
36
You may collect timing and event data from your application, or modules that you use.
Wouldn't it be great if you could capture and view that in the same way as the other monitoring data?
Custom monitoring
37
The appmetrics module provides:
• A producer API for emitting custom data
• A consumer API for receiving all data
Using appmetrics APIs
38
Using the emit() and Event APIs
var appmetrics = require('appmetrics');var data = {
time: Date.now(),mydata: 7
};
appmetrics.emit('myevent', data);
var appmetrics = require('appmetrics');var monitor = appmetrics.monitor();
monitor.on('myevent', function(data) {console.log('[myevent] ' + data.mydata);
});
• emit() API is on the appmetrics object
• Convention to have a “time” field– Allows it to be used in ElasticSearch
• Events are on the monitored instance object– Allows for future remote monitoring
39
Instrument a simple piece of test code:
• Emit a custom 'iteration' event with the following data:counttime taken to generate the arraytime taken to serialize
• Add a handler that logs the data
Improve – Instrument an application (10 mins)
// Crunch some numberssetInterval(function() { var object = []; var count = Math.random()*10000000; for (var i=0; i<count; i++) { object.push(i) } JSON.stringify(object);}, 1000);
40
You may not be able to modify the code you want to instrument
• For example: a third-party or core module
• You could monkey patch it
Instrumenting modules
41
A simple monkey patching examplevar appmetrics = require('appmetrics');var fs = require('fs');
var originalReadFileSync = fs.readFileSync;
// Instrument readFileSyncfs.readFileSync = function() { var startTime = Date.now(); originalReadFileSync.apply(fs, arguments); var endTime = Date.now(); appmetrics.emit('loadfile', { 'duration': endTime - startTime });};
// Set up monitoring listenersappmetrics.monitor().on('loadfile', function(e) { console.log('File load duration=' + e['duration'] + 'ms');});
// Applicationfor (var i=0; i<100; i++) { fs.readFileSync('bigfile.dat');}
Store function to patch
Patch function by replacing
Custom “visualizer”
42
Asynchronous monkey patching examplevar appmetrics = require('appmetrics');var fs = require('fs');
var originalReadFile = fs.readFile;
// Instrument readFileSyncfs.readFile = function() { var startTime = Date.now(); var callbackIndex = (typeof(arguments[1]) === 'function') ? 1 : (typeof(arguments[2]) === 'function') ? 2 : -1; if (callbackIndex == -1) return; var originalCallback = arguments[callbackIndex]; var wrap = function() { appmetrics.emit('loadfile', { 'duration': Date.now() - startTime }); originalCallback.call(arguments); }; arguments[callbackIndex] = wrap; originalReadFile.apply(fs, arguments);};
// Set up monitoring listenersappmetrics.monitor().on('loadfile', console.log);
// Applicationfor (var i=0; i<100; i++) { fs.readFile('bigfile.dat', function(err, data) { console.log('readFile completed'); });}
Store function to patch
Patch function by replacing
Custom “visualizer”
43
Deferred monkey patching examplevar module = require('module');var moduleLoad = module._load;
module._load = function(mod, parent, isMain) { var loadedModule = moduleLoad.apply(this, arguments); if (mod === 'http') { var addListener = loadedModule.Server.prototype.addListener; loadedModule.Server.prototype.addListener = function(type, listener) { if (type === 'request' && typeof listener === 'function'){ var oldListener = listener; listener = function(req, res){ if (typeof res.startTime === 'undefined') { res.startTime = new Date().getTime(); res.once('finish', function() { res.endTime = new Date().getTime(); var dur = res.endTime - res.startTime; console.log('['+ res.startTime + '] ' + req.method + ' : ' + req.url + ' ('+dur+')'); }); }; oldListener.call(this, req, res); }; return addListener.call(this, type, listener); } else { return addListener.apply(this, arguments); } }; }; return loadedModule;};
44
The appmetrics module provides:
• Drop-in “probes”Self-contained facility to deploy patches in appmetrics
• An aspect framework for use by probesSimplify complex method replacement scenarios
Making patching easier
45
Using the Probes framework
Probe scripts are automatically loaded from:node_modules/appmetrics/probes
Template probe:var Probe = require('../lib/probe.js');var aspect = require('../lib/aspect.js');var util = require('util');
function MyExampleProbe() { Probe.call(this, 'name_of_module_to_patch');}util.inherits(MyExampleProbe, Probe);
MyExampleProbe.prototype.attach = function(name, target, am) { if (target.__probeAttached__) return target; target.__probeAttached__ = true;
// Code to patch and emit event (can use aspect)
return target;};
module.exports = MyExampleProbe;
46
Aspect API
aspect.before(obj, methods, func);
aspect.after(obj, methods, func);
aspect.around(obj, methods, func1, func2);
aspect.aroundCallback(obj, methods, func1, func2);
• Provides APIs to locate and patch functions– before, after and around supported
HTTPProbe.prototype.attach = function(name, target, am) { if (target.__probeAttached__) return target; target.__probeAttached__ = true; aspect.before(target.Server.prototype, ['on', 'addListener'], function(obj, args) { if (args[0] !== 'request') return; aspect.aroundCallback(args, function(obj, args) { var request = args[0], response = args[1]; // args for callback are [request, response] response.startTime = Date.now(); aspect.after(response, 'finish', function(obj, args, ret) { var dur = Date.now() - response.startTime; console.log('['+ response.startTime + '] ' + request.method + ' : ' + request.url + ' ('+dur+')'); }); }); }); return target;};
• Reduces the complex monkey patch example to...
47
Using the Aspects frameworkAspects framework makes it easy to patch specific functions in the moduleSimple use:
The “around” function is basically “before” and “after” combined
MyExampleProbe.prototype.attach = function(name, target, am) { if (target.__probeAttached__) return target; target.__probeAttached__ = true;
var start; aspect.around(target, 'readFileSync', function(obj, args) { start = Date.now(); }, function(obj, args, returnObject) { am.emit('file', {time: start, method: 'readFileSync', file: args[0], duration: Date.now() - start}); return returnObject; } );
return target;};
48
Using the Aspects frameworkThis, more complex, scenario can happen when you want to patch async functions
that accept callback arguments
The “aroundCallback” function will search a given list of objects and patch the first function it finds (good for patching async handlers)
MyExampleProbe.prototype.attach = function(name, target, am) { if (target.__probeAttached__) return target; target.__probeAttached__ = true;
// 1. Run some code after moduleToPatch.functionToPatch() to... // 2. - run some code before a function on its return value (returnObject.functionOnReturnObjectToPatch()) to... // 3. - search that function's arguments for a callback and run some code before that callback aspect.after(target, 'functionToPatch', function(target, args, returnObject) { aspect.before(returnObject, 'functionOnReturnObjectToPatch', function(target, methodArgs) { aspect.aroundCallback(methodArgs, function(target, args){ // code }); } ); return returnObject; });
return target;};
12
3
49
Instrument fs.readFile() by creating a probe
Start with the template code → ● Set the name to 'fs'● Replace the comment with
a call into the aspect frameworkto patch readFile() and associatedcallback
● Drop into the probes directoryinside appmetrics
Use this simple script to test →
Improve – Instrument a module (10 mins)
var fs = require('fs');setInterval(function() { fs.readFile('bigfile.dat', function(err, data) { console.log(data.length); });}, 1000);
var Probe = require('../lib/probe.js');var aspect = require('../lib/aspect.js');var util = require('util');
function MyExampleProbe() { Probe.call(this, 'name_of_module_to_patch');}util.inherits(MyExampleProbe, Probe);
MyExampleProbe.prototype.attach = function(name, target, am) { if (target.__probeAttached__) return target; target.__probeAttached__ = true;
// Code to patch and emit event (can use aspect)
return target;};
module.exports = MyExampleProbe;
50
• Runtime Monitoring
• GC Visualisation
• Post Mortem Debugging
• Experience/Gaps Brainstorm
• Monitoring – Instrument a Module
Heap usage/Object layout
Initial Problem Determination
Dig into GC behaviour
Execution State/Native Memory
Retrospective
Improve your tools
Summary and Wrap up
51
Copyrights and Trademarks
© IBM Corporation 2015. All Rights Reserved
IBM, the IBM logo, ibm.com are trademarks or registered
trademarks of International Business Machines Corp.,
registered in many jurisdictions worldwide. Other product and
service names might be trademarks of IBM or other companies.
A current list of IBM trademarks is available on the Web at
“Copyright and trademark information” at
www.ibm.com/legal/copytrade.shtml
Node.js is an official trademark of Joyent. IBM SDK for Node.js is not formally related to or endorsed by the official Joyent Node.js open source or
commercial project.
Java, JavaScript and all Java-based trademarks and logos are trademarks or registered trademarks of Oracle and/or its affiliates.
"All other servicemarks and trademarks are the property of their respective owner."
top related