![Page 1: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/1.jpg)
Performance analysis and troubleshooting using DTrace
“I keep six honest serving-men (They taught me all I knew);Their names are What and Why and When And How and Where and Who.” - Rudyard Kipling
Graeme Jenkinson - Computer Laboratory
![Page 2: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/2.jpg)
Today’s lecture
Overview of tracing and profiling
DTrace: Dynamic Instrumentation of Production Systems
Performance analysis with DTrace - case study
Tracing and profiling on Linux
Wrap up and questions
![Page 3: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/3.jpg)
Overview of tracing and profiling
• A profile provides an inventory of performance events and timings for the execution as a whole.• Sampling at a fixed time interval• Aggregated results - requests/per second• Absolute chronology of events is ignored
• A trace records the chronology, often with timestamps:• Absolute chronology of events is preserved• Trace increasing at runtime - must be periodically written out for processing• Can be used to construct a profile
Focus on tracing
![Page 4: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/4.jpg)
Who did what to who?
04 Intuition libraryAddress of task
![Page 5: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/5.jpg)
Tracing software as a black box
• print statements:
• printf(“Here”);
• #ifndef NDEBUGprintf(“State: Connected”);#endif
• Logging frameworks:
• info!(“State: connected”);
• logger.error(“Failed: to write file”);
Can declaratively turn on/off logging at compile time
Can declaratively turn on/off logging at runtime
![Page 6: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/6.jpg)
Tracing software as a black box - summary
Print statements Logging framework
Pros• As simple as it gets• Declarative control at
compile time
• Declarative control at runtime or compile time
• Finer grained control of logging (per class/module)
• Structured output• Flexibility
Cons
• Adding tracing requires a recompile (in some case)
• Ad-hoc adoption• Inflexible (requiring
compile time decisions)
• Tiny bit more effort to adopt
• Added dependency to codebase
![Page 7: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/7.jpg)
Logging can only get you so far
• Errors frequently the result of complex interactions across the software stack
• Logging limited to a single layer of the the software stack
• Logging frameworks don’t capture the events you want to see
![Page 8: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/8.jpg)
Debuggers
• When logging really doesn’t cut it we often break out the debugger
• Set breakpoints, single step code, watch variables/registers
• Debuggers often struggle when facing certain types of problems:
• Heisenbug: “a software bug that seems to disappear or alter its behaviour when one attempts to study it"
![Page 9: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/9.jpg)
Probe effects
• Probe effect - act of measuring disturbs system
• Electronics - probes introduce additional capacitance, resistance or inductance
• Software tracing - probes take time to execute
• Bugs from subtle timing effects arecommon, probe effects can rarelybe ignored
![Page 10: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/10.jpg)
Observability/transparency
• Rather than treating the system as a black box we need observability into the system when executing
• What’s needed is more flexible less intrusive tools, thankful there are plenty of them
• Do these tools have a probe effect?
![Page 11: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/11.jpg)
Observability/transparency tools
![Page 12: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/12.jpg)
Observability tools: top
NAME
top - display Linux processes
top (table of processes) - reads kernel state from /proc filesystem (on Linux, FreeBSD); procfs is a pseudo-filesystem that presents information about processes and other system information in a hierarchical filesystem-like structure.
Note: procfs doesn’t come free (probe effect). What operations need to be performed?
![Page 13: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/13.jpg)
Observability tools: strace (dtruss Mac OS X, truss FreeBSD)
Linux
strace - trace system calls and signals
Mac OS X (requires disabling System Integrity Protection - El Capitan+)
dtruss - process system call details. Uses DTrace.
FreeBSD
truss - trace system calls
![Page 14: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/14.jpg)
Utilization, Saturation and Errors (USE)
• Utilization: the average time that the resource was busy servicing work
• Saturation: the degree to which the resource has extra work which it can't service, often queued
• Errors: the error count
• Resources:
• CPUs: sockets, cores, hardware threads (virtual CPUs)
• Memory: capacity
• Network interfaces
• Storage devices: I/O, capacity
• Controllers: storage, network cards
• Interconnects: CPUs, memory, I/O
![Page 15: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/15.jpg)
Layered architectures and observability
• System performance problems are typically introduced at the highest layers of abstraction
• Effects are often first encountered and attributed at the lowest layers of abstraction
• Observability is required across all layers of the stack
![Page 16: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/16.jpg)
In production analysis
• As software engineers we graduate from screwing up our own system to screwing up other peoples
• In production tracing must be safe
• Probe effect must be zero when disabled - “low” when tracing
![Page 17: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/17.jpg)
DTrace: one tool to rule them all
• Developed by Bryan Cantrill, Mike Shapiro, and Adam Leventhal at Sun Microsystems
• Release in 2005 by Sun Microsystems
• “Performance analysis infrastructure has generally not kept pace with the shift to in-production performance analysis” - Cantrill et al
Illumos, FreeBSD, NetBSD, MacOS X
Linux (ish), Windows (yuck)
![Page 18: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/18.jpg)
DTrace overview (Cantrill ’04)
• Dynamic instrumentation
• Zero probe effect when disabled
• Unified instrumentation
• Instruments both user and kernel level software
• Arbitrary-context kernel instrumentation
• Can instrument any aspect of the kernel including delicate sub-systems such as the scheduler and synchronisation facilities
As a result probes can’t block
What is the impact when enabled?
![Page 19: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/19.jpg)
DTrace overview (cont)
• Data integrity
• Records may be dropped under load - but never silently
• High-level control language
• Scriptable “arbitrary” actions on probe firing
• Predicates that determine whether a probe fires
• Scripts executed by in-kernel virtual machine
• Production safe
• Aborts when system is heavily loaded
• Scalable in kernel aggregations (profiling)
How is safety preserved?
![Page 20: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/20.jpg)
DTrace overview - anatomy of a dynamic tracer
Consumer
Compiler
fbt::vm_fault:entry { fault_count += 1;}
Probes
Userspace
Buffers (per-CPU)
Kernel
Results
D script
![Page 21: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/21.jpg)
DTrace providers
• fbt: Function Boundary Tracing
• syscall: System Calls
• pid: User space processes
• proc: Process Operations
• sched: Scheduler
• Network Protocols
• IP, UDP, TCP
• lock: Kernel locking points
• io: I/O calls
• vfs: Filesystem Routines
• profile: Timing source
• mac_framework: Mandatory Access Control framework
![Page 22: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/22.jpg)
DTrace providers (cont)
dtrace -l List all registered probes
dtrace -l -P fbt List probes registered by Function Boundary Tracing provider
dtrace -l -f vm_fault List probes for function vm_fault
provider:module:function:name
kernel moduleDTrace provider
function to trace
probe name
![Page 23: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/23.jpg)
DTrace providers (cont)
• Statically added probes - added by developers
• Known useful points of instrumentation: starting new threads, TCP state machine transitions
• Dynamically added - probes automatically work out where to insert probes
• Static probes are stable instrumentation points, dynamic probes depend on the specifics of the implementation and therefore are subject to change
![Page 24: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/24.jpg)
DTrace tracing example
syscall:::entry/execname == “ls”/{ printf(“syscall = %s\n”, probefunc);}
dtrace -c ls -n ‘…’
Predicate - only for the ls command
Action - Print the probes function Built in variable
![Page 25: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/25.jpg)
DTrace profiling example
BEGIN { syscall_count = 0;}
syscall:::entry/execname == “ls”/{ syscall_count++;}
END { printf(“syscall_count = %d\n”, syscall_count);}
Initialize the count (global)
Print the count
Instrument all syscall entries
Only for the ls command
![Page 26: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/26.jpg)
DTrace profiling example 2
syscall:::entry/execname == “ls”/{ t[probefunc] = timestamp;}
syscall:::return/t[probefunc] != 0/{ @funct[probefunc] = sum(timestamp - t[probefunc]); t[probefunc] = 0;}
Other useful profiling probes include profile-199 - why 199 seconds?
![Page 27: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/27.jpg)
DTrace user land example 1
• pid (process ID) provider• Allow tracing of and arbitrary process such as a Web browser of
databases• Process does not need to be restarted (important in production)
dtrace -n ‘pid$target:::entry { trace(timestamp); }’ -c ls
dtrace -n ‘pid$target:::entry { trace(timestamp); }’ -p …
process id to trace
![Page 28: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/28.jpg)
DTrace internals - overview
![Page 29: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/29.jpg)
DTrace internals
• DTrace core framework
• Interfaces to userspace through ioctl() to /dev/dtrace/*
• Majority of the code (dtrace.c) is generic framework code
• Delegates instrumentation to providers
• Implements a virtual machine for the D Intermediate Format (DIF)
• Ensures safety of D scripts
• Analogous to Java bytecode
![Page 30: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/30.jpg)
DTrace internals
• Provider
• Holds the knowledge about specific instrumentation techniques
• Enables probes when consumer says so (specific to each provider)
• Transfer control to DTrace framework by calling dtrace_probe(probe_id) somehow
• Then, DTrace framework can performs actions associated with probe
• dtrace_probe()
• Runs in specialized context known as probe context
• Interrupts are disabled for the CPU executing the probe
• Does no memory allocation and takes no locks
![Page 31: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/31.jpg)
DTrace internals
• D compiler:
• Compiles D scripts to D Intermediate Format (DIF)
• Assembles DIF into DIFOs (for predicate and actions)
• Merges DIFOs into DOF (D Object Format)
• DIF has a RISC-like instruction set that supports D language constructs
dtrace -S -n 'syscall:freebsd:open:entry {printf("%s %s\n", execname, copyinstr(arg0)); }'
There’s no spec but we are working on it
![Page 32: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/32.jpg)
DTrace internals
• Enabling Control Blocks (ECBs)
• Kernel data structure that represents an enabled probe
• Each ECB contains DIFO for predicates and actions associated with probe enabling
• Can be chained together if multiple consumers are interested in a given probe
Probe
ECB ECB
PredicateDIFO
ActionDIFO
ActionDIFO
![Page 33: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/33.jpg)
DTrace internals - Function Boundary Tracing (FBT)
• sys_getpid, no DTrace
<sys_getpid> push %rbp<sys_getpid+1> mov %rsp,%rbp<sys_getpid+4> mov 0x8(%rdi),%rax<sys_getpid+8> movslq 0xbc(%rax),%rax
• sys_getpid, while running dtrace -n ‘fbt:kernel:sys_getpid:entry'• int3 replaces first sys_getpid instruction• push instruction is emulated in the kernel• original push instruction is restored after dtrace exits
<sys_getpid> int3<sys_getpid+1> mov %rsp,%rbp<sys_getpid+4> mov 0x8(%rdi),%rax<sys_getpid+8> movslq 0xbc(%rax),%rax
Used by debuggers to set breakpoints - architecture dependent
DTrace uses CTF Compressed Type Format) to identify function entry point
![Page 34: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/34.jpg)
DTrace in the real world
• How many software engineers are there in the world working in systems?
• Boem estimates 700k work in “infrastructure”, which is a fairly broad definition
• How many people have used DTrace? ~1000s.
• How many people have written DTrace scripts? <1000
• How many people now how DTrace works in any detail? ~100
![Page 35: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/35.jpg)
DTrace case study - Customers are not the enemy
• Scenario: Customer system reporting terrible performance
• NFS server
• SAN backend
• ~ 300 Mb/s
• 4 socket, 6 cores per socket
• Symptoms: Latency is “bad”, our customers are angry. Your product sucks
https://www.infoq.com/presentations/dtrace
Customers don’t report details like latency is twice average specified in service level agreement!
![Page 36: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/36.jpg)
DTrace case study - Performance goals
• Why work isn’t being done?
• Thread not being schedule
• Queue of work isn’t being processed
• Get idle into the system
• Identify work being done inefficiently
• Identify where cycles are wasted doing the wrong thing
Useful DTrace probes of-cpu
![Page 37: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/37.jpg)
DTrace case study - Measuring NFS latency
• Analysis: Develop DTrace scripts for measuring NFS latency
nfsv3:::op-read-start,nfsv3:::op-write-start{ self->ts = timestamp}
nfsv3:::op-read-done,nfsv3:::op-write-done{ @[probename == “op-write-done” ? “write” : “read” = quantize(timestamp - self.ts); self.ts = 0;}
quantize
![Page 38: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/38.jpg)
DTrace case study
Synchronousnfsv3:::op-write-start{ self->sync = args[2]->stable != 0;}
Cache misssdt:::arc-miss,sdt:::blocked-read{ self->io = 1;}
Cached, synchronous and asynchronous read/writes very different performance - different populations
![Page 39: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/39.jpg)
DTrace case study - synchronous write
• Average latency looks pretty good
![Page 40: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/40.jpg)
DTrace case study - synchronous write
Average latency not so useful - outliers dominate
Linear Log
![Page 41: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/41.jpg)
DTrace case study - summary
• Instrumentation with DTrace identified issues with…
• Throttling, inefficiencies in code and a heavily contented lock
• Lessons learnt
• Need to analyse performance on real-world systems
• Right tools, right questions
• Remedies without analysis don’t work and antagonise customers
• Systems shouldn't absorb all the pain
![Page 42: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/42.jpg)
DTrace in the real world
• DTrace Toolkit
• http://www.brendangregg.com/dtracetoolkit.html
• DTrace one liners
• http://www.brendangregg.com/DTrace/dtrace_oneliners.txt
• https://wiki.freebsd.org/DTrace/One-Liners
![Page 43: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/43.jpg)
One tool to rule them all…
• In 2017 in production means a distributed system
• Gap opened up once again between the types of in-production analysis we’d like to perform and available tools
• Distributed DTrace - see CADETS project
• Tracing internode flows?
• Volumes of data
• Security?
• What about heterogenous environments?
• Need to instrument Linux, Ilumos, FreeBSD environments seamlessly
![Page 44: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/44.jpg)
Linux: One tool to rule them all… not so much
![Page 45: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/45.jpg)
ftrace and perf
• ftrace
• Tracing utility built directly into the Linux kernel
# cd /sys/kernel/debug/tracing# cat available_tracers
• To enable tracer:
# echo function_graph > current_tracer
![Page 46: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/46.jpg)
perf
• perf - instruments "events", which are a unified interface for different kernel instrumentation frameworks.
• Hardware Events: These instrument low-level processor activity based on CPU performance counters. For example, CPU cycles, instructions retired, memory stall cycles, level 2 cache misses, etc. Some will be listed as Hardware Cache Events.
• Software Events: These are low level events based on kernel counters. For example, CPU migrations, minor faults, major faults, etc.
![Page 47: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/47.jpg)
perf (cont)
• perf - instruments "events", which are a unified interface for different kernel instrumentation frameworks.
• Tracepoint Events: This are kernel-level events based on the ftrace framework. These tracepoints are placed in interesting and logical locations of the kernel, so that higher-level behavior can be easily traced. For example, system calls, TCP events, file system I/O, disk I/O, etc. These are grouped into libraries of tracepoints; eg, "sock:" for socket events, "sched:" for CPU scheduler events.
• Dynamic Tracing: Software can be dynamically instrumented, creating events in any location. For kernel software, this uses the kprobes framework. For user-level software, uprobes.
• Timed Profiling: Snapshots can be collected at an arbitrary frequency, using perf record -FHz. This is commonly used for CPU usage profiling, and works by creating custom timed interrupt events.
![Page 48: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/48.jpg)
SystemTap
• Scripting language and tool for dynamically instrumenting running production Linux kernel-based operating systems
• stap -v -e 'probe vfs.read {printf("read performed\n"); exit()}'
virtual file system read
• “safe to use in production?” “yes”
https://sourceware.org/systemtap/wiki/SystemtapDtraceComparison
![Page 49: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/49.jpg)
BPF (nee eBPF)
• Berkley Packet Filter (PBF)
• Interface to data link layers to capture raw packets (tcpdump)
• BPF virtual machine executes user scripts to filter packets
• JIT BPF instructions to native code - to improve performance
• Bytecode Probe Framework (maybe?)
• DTrace-like facilities attaching BPF scripts to tracepoints: kprobes, uprobes, tracepoints (soon)
• bcc front-end (and others)
• What can we do with BPF?
• “crazy stuff”
Designed for tracing 10GbE networking - highly performant
Walk stacks…
![Page 50: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/50.jpg)
Bytecode Probe Framework
BPF
kprobes
uprobes
tracepoints
maps
KernelUserland
BPF bytecode
per-event data
statistics
generate load
async read
perf output
per-event output and in-kernel summarisation
![Page 51: Performance analysis and troubleshooting using DTrace](https://reader038.vdocuments.us/reader038/viewer/2022110108/58b893241a28ab3e3a8b6197/html5/thumbnails/51.jpg)
Any questions?
Tracing tools quickly outperform log based analysis in production
Vast array of tracing tools in Linux and BSD-based operating systems
DTrace provides unified tracing with guaranteed safety in production
All tracing mechanisms introduce probe effects
Probe effects can change the very behaviour we are investigating
Measure (on the real system), analyse and remediate - don’t guess
DTrace on Linux isn’t production ready but other tools such as SystemTap and eBPF are roughly equivalent