monitoring distributed (micro-)services
![Page 1: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/1.jpg)
Tracing distributed service calls: implementing APM for the JVM
![Page 2: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/2.jpg)
Disclaimer
I am contracting for the APM vendor Instana and gained most of my experience working with APM while being with the company. In order to discuss what I have been factually working with, I cannot avoid showcasing the tool I helped to make. I am not paid to feature Instana in this presentation.
![Page 3: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/3.jpg)
Outline
I. Inventory
II. Tracing (micro-)services
III. Implementing APM
IV. Advanced topics
![Page 4: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/4.jpg)
Where we are coming from: the “distributed monolith”.
[Diagram: six EARs deployed across node A, node B and node C.]
![Page 5: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/5.jpg)
Where we are coming from: distribution by cloning.
Sources: Pictures by IBM and Oracle blogs.
![Page 6: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/6.jpg)
Where we are coming from: inferred tracing.
Use of standard APIs: limits development to the app server’s capabilities.
Sacrifice freedom in development but ease operation.
Standard APIs allow the server to interpret program semantics.
Sources: Picture by IBM blog.
![Page 7: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/7.jpg)
Where we transition to: greenfield “micro”-services.
[Diagram: JARs deployed as service A, service B and service C, communicating over HTTP/protobuf.]
Sources: Lagom is a Lightbend trademark, Spring Boot is a Pivotal trademark.
![Page 8: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/8.jpg)
Where we transition to: “The wheel of doom”
[Screenshots: service topologies of Twitter, Hailo and The Empire.]
Simple bits, complex interaction: death star topologies.
Sources: Screenshots from Zipkin on Twitter Blog and Hailo Blog. Death Star by “Free Star Wars Fan Art Collection”.
![Page 9: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/9.jpg)
Where we transition to: simple services, complex operation.
1. Writing distributed services can ease development but adds challenges on integration.
2. Distributed services without standardized APIs cannot easily be observed in interaction.
3. Distributed services make it harder to collect structured monitoring data.
4. Distributed (micro-)services require DevOps to successfully run in production.
Sources: Picture by Reddit.
![Page 10: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/10.jpg)
The next big thing: serverless architecture?
[Diagram: JARs deployed behind dispatcher A, dispatcher B and dispatcher C.]
App servers: now with free vendor lock-in.
Sources: AWS Lambda is an Amazon trademark.
![Page 11: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/11.jpg)
The next big thing: serverless architecture?
Sources: Picture by Golden Eagle Coin. Concept from: “silver bullet syndrome” by Hadi Hariri.
silver bullet syndrome
![Page 12: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/12.jpg)
Outline
I. Inventory
II. Tracing (micro-)services
III. Implementing APM
IV. Advanced topics
![Page 13: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/13.jpg)
What information do we want?
[Diagram: a request from 77.44.250.1 reaches 192.168.0.1/foo.jar over HTTP; foo.jar calls 192.168.0.2/bar.jar over HTTP; bar.jar queries 192.168.0.3/MySQL over JDBC. Each hop records entry and exit events tagged with unique ids (uid(A), uid(B)) and reports them to a trace collector, which lays them out on a timeline (0ms, 100ms, 200ms).]
![Page 14: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/14.jpg)
How do we get it?
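[Diagram: a trace is made up of spans (192.168.0.1/foo.jar, 192.168.0.2/bar.jar, 192.168.0.3/MySQL) linked by uid(A) and uid(B). Each span is delimited by the events cs (client send), sr (server receive), ss (server send) and cr (client receive), and can carry annotations such as { query = select 1 from dual }.]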
Source: Logo from zipkin.io.
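As a rough sketch of how the four events relate, the snippet below models a span that carries the cs, sr, ss and cr timestamps. The class `RpcSpan` and its methods are made up for illustration and are not Zipkin's API. Because client and server clocks are not synchronized, only differences taken on the same host are compared.

```java
/** Hypothetical value class, not part of Zipkin. All timestamps are in microseconds. */
final class RpcSpan {
    final long traceId;   // shared by all spans that belong to one trace
    final long spanId;    // unique per span
    final Long parentId;  // null for the root span
    final long cs, sr, ss, cr;

    RpcSpan(long traceId, long spanId, Long parentId,
            long cs, long sr, long ss, long cr) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parentId = parentId;
        this.cs = cs;
        this.sr = sr;
        this.ss = ss;
        this.cr = cr;
    }

    /** Time the caller waited, measured on the client's clock. */
    long clientDuration() {
        return cr - cs;
    }

    /** Time the callee worked, measured on the server's clock. */
    long serverDuration() {
        return ss - sr;
    }

    /** Time spent on the wire in both directions; independent of clock skew. */
    long networkTime() {
        return clientDuration() - serverDuration();
    }
}
```

For a call with cs=0, sr=10, ss=90 and cr=100, the caller waited 100 microseconds, of which 80 were spent in the callee and 20 on the network.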
![Page 15: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/15.jpg)
How do we get it?
Source: Zipkin screenshot.
![Page 16: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/16.jpg)
    Span.Builder span = Span.builder()
        .traceId(42L)
        .name("foo")
        .id(48L)
        .timestamp(System.currentTimeMillis());

    long now = System.nanoTime();
    // do hard work

    span = span.duration(System.nanoTime() - now);
    // send span to server
How do we get it?
Several competing APIs:
1. Most popular are Zipkin (core) and Brave.
2. Some libraries such as Finagle (RPC) offer built-in Zipkin-compatible tracing.
3. Many plugins exist to add tracing as a drop-in to several libraries.
4. Multiple APIs exist for different non-JVM languages.
![Page 17: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/17.jpg)
A standard to the rescue.
Source: Logo from opentracing.io.
    Span span = tracer.buildSpan("foo")
        .asChildOf(parentSpan.context())
        .withTag("bar", "qux")
        .start();

    // do hard work

    span.finish();
![Page 18: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/18.jpg)
A standard to the rescue.
Source: “Standards” by xkcd.
Problems:
1. A single missing element in the chain breaks the entire trace.
2. Requires explicit hand-over on every context switch.
(The span is typically stored in thread-local storage.)
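To illustrate the second problem, here is a minimal sketch using only the JDK. The holder class `CurrentSpan` is hypothetical and stands in for whatever a tracing library uses internally. A value stored in a `ThreadLocal` is visible on the thread that set it but not on a freshly started worker thread, so the parent span is lost unless it is passed along explicitly.

```java
/** Hypothetical holder for the active span id; not taken from any tracing library. */
final class CurrentSpan {
    private static final ThreadLocal<String> ACTIVE = new ThreadLocal<>();

    static void set(String spanId) {
        ACTIVE.set(spanId);
    }

    static String get() {
        return ACTIVE.get();
    }

    static void clear() {
        ACTIVE.remove();
    }

    /** Starts a new thread and returns the span id that is visible there. */
    static String seenOnOtherThread() {
        String[] seen = new String[1];
        Thread worker = new Thread(() -> seen[0] = get());
        worker.start();
        try {
            worker.join(); // join() makes the worker's write visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen[0];
    }
}
```

After `CurrentSpan.set("A")`, the calling thread still reads "A", while `seenOnOtherThread()` yields `null`: the worker has no parent to attach its own span to.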
![Page 19: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/19.jpg)
Outline
I. Inventory
II. Tracing (micro-)services
III. Implementing APM
IV. Advanced topics
![Page 20: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/20.jpg)
Drop-in tracing.
    import java.lang.instrument.Instrumentation;

    public class TracingAgent {

        public static void premain(String arg, Instrumentation inst) {
            inst.addTransformer(
                (classLoader, typeName, type, pd, classFile) -> {
                    if (shouldTraceClass(typeName)) {
                        return addTracing(classFile);
                    } else {
                        return null; // null leaves the class unchanged
                    }
                });
        }

        private static boolean shouldTraceClass(String typeName) {
            return false; // TODO: implement
        }

        private static byte[] addTracing(byte[] binary) {
            return binary; // TODO: implement
        }
    }
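The `shouldTraceClass` method is left as a TODO on the slide. One possible shape for such a filter is sketched below; the class name `TraceFilter` and the application package `com/example/` are assumptions for illustration. The transformer receives class names in the internal form with slashes (for example `java/lang/String`), and the name can be `null` for some VM-generated classes, so the filter should tolerate that.

```java
/** Hypothetical filter; the package prefixes are placeholders for a real configuration. */
final class TraceFilter {

    /** Prefixes of classes that should never be instrumented in this sketch. */
    private static final String[] EXCLUDED = {"java/", "javax/", "sun/", "jdk/"};

    /** Assumed root package of the application to trace, in internal form. */
    private static final String APPLICATION_PREFIX = "com/example/";

    static boolean shouldTraceClass(String typeName) {
        if (typeName == null) {
            return false; // no name to match against
        }
        for (String prefix : EXCLUDED) {
            if (typeName.startsWith(prefix)) {
                return false;
            }
        }
        return typeName.startsWith(APPLICATION_PREFIX);
    }
}
```

Returning `false` makes the transformer return `null`, which tells the JVM to keep the original class file.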
![Page 21: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/21.jpg)
High-level instrumentation with Byte Buddy.
Code generation and manipulation library:
1. Apache 2.0 licensed.
2. Mature: Over 2 million downloads per year.
3. Requires zero byte-code competence.
4. Safe code generation (no verifier errors).
5. High-performance library (even faster than vanilla-ASM).
6. Already supports Java 9 (experimental).
7. Offers fluent API and type-safe instrumentation.
Check out http://bytebuddy.net and https://github.com/raphw/byte-buddy
![Page 22: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/22.jpg)
    class Foo {
        String bar() { return "bar"; }
    }

    assertThat(new Foo().bar(), is("Hello World!"));

    public static void premain(String argument, Instrumentation instrumentation) {
        new AgentBuilder.Default()
            .type(named("Foo"))
            .transform((builder, type, classLoader) ->
                builder.method(named("bar"))
                       .intercept(value("Hello World!")))
            .installOn(instrumentation);
    }
Java agents with Byte Buddy
![Page 23: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/23.jpg)
    class ServletAdvice {

        @OnMethodEnter
        static void enter(@Argument(0) HttpServletRequest request) {
            String traceId = request.getHeader("X-Trace-Id");
            String method = request.getMethod();
            String uri = request.getRequestURI();
            if (traceId != null) {
                ServletTracer.continueTrace(traceId, method, uri);
            } else {
                ServletTracer.startTrace(method, uri);
            }
        }

        @OnMethodExit
        static void exit(@Argument(1) HttpServletResponse response) {
            ServletTracer.complete(response.getStatus());
        }
    }
Inlining code with Byte Buddy advice
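The `ServletTracer` that the advice calls is not shown on the slides. As a minimal sketch of what such a helper might do, the hypothetical `SimpleServletTracer` below keeps the active entry in a thread-local, reuses an incoming trace id or generates a new one, and measures the duration on completion. It uses only the JDK; the names and the random hex id format are assumptions, not Instana's implementation.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical stand-in for the ServletTracer referenced by the advice. */
final class SimpleServletTracer {

    /** Mutable record of one traced request. */
    static final class Entry {
        final String traceId;
        final String name;
        final long startNanos;
        int status;
        long durationNanos;

        Entry(String traceId, String name, long startNanos) {
            this.traceId = traceId;
            this.name = name;
            this.startNanos = startNanos;
        }
    }

    private static final ThreadLocal<Entry> CURRENT = new ThreadLocal<>();

    /** No incoming id: this request becomes the root of a new trace. */
    static void startTrace(String method, String uri) {
        String traceId = Long.toHexString(ThreadLocalRandom.current().nextLong());
        continueTrace(traceId, method, uri);
    }

    /** Incoming id present: join the caller's trace. */
    static void continueTrace(String traceId, String method, String uri) {
        CURRENT.set(new Entry(traceId, method + " " + uri, System.nanoTime()));
    }

    /** Closes the active entry; returns null if no trace was active on this thread. */
    static Entry complete(int status) {
        Entry entry = CURRENT.get();
        CURRENT.remove();
        if (entry == null) {
            return null;
        }
        entry.status = status;
        entry.durationNanos = System.nanoTime() - entry.startNanos;
        return entry; // a real tracer would hand this to a reporter instead
    }
}
```

Because the advice code is inlined into every instrumented `service` method, keeping the helper's entry points small and static keeps the inlined footprint small as well.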
![Page 24: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/24.jpg)
    public class ServletTraceAgent {

        public static void premain(String arg, Instrumentation inst) {
            new AgentBuilder.Default()
                .type(isSubTypeOf(Servlet.class))
                .transform((builder, type, classLoader) ->
                    builder.visit(Advice.to(ServletAdvice.class)
                                        .on(named("service"))))
                .installOn(inst);
        }
    }
Inlining code with Byte Buddy advice
![Page 25: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/25.jpg)
    public class ServletTraceAgent {

        public static void agentmain(String arg, Instrumentation inst) {
            new AgentBuilder.Default()
                .disableClassFormatChanges()
                .with(AgentBuilder.RedefinitionStrategy.RETRANSFORMATION)
                .type(isSubTypeOf(Servlet.class))
                .transform((builder, type, classLoader) ->
                    builder.visit(Advice.to(ServletAdvice.class)
                                        .on(named("service"))))
                .installOn(inst);
        }
    }
Inlining code with Byte Buddy advice
![Page 26: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/26.jpg)
APM architecture: example of Instana
[Architecture diagram. Labels: JAR, JS, PHP; traces/metrics; metrics; traces; feedback.]
![Page 27: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/27.jpg)
Trace view in Instana (example)
![Page 28: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/28.jpg)
Logical view in Instana (example)
![Page 29: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/29.jpg)
Outline
I. Inventory
II. Tracing (micro-)services
III. Implementing APM
IV. Advanced topics
![Page 30: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/30.jpg)
(Adaptive) sampling
![Page 31: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/31.jpg)
(Adaptive) sampling: events per second (without queue-bound)
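The slide shows only a chart here. As one hedged illustration of what a queue bound and an adaptive sample rate could look like, the sketch below uses a bounded JDK queue that drops spans instead of blocking the application thread, plus a helper that derives a sample rate from the observed load. The class `BoundedSpanBuffer`, its methods and the budget numbers are assumptions, not the approach of any particular product.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded hand-over buffer between application threads and a reporter. */
final class BoundedSpanBuffer {
    private final ArrayBlockingQueue<String> queue;
    private final AtomicLong dropped = new AtomicLong();

    BoundedSpanBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Never blocks the caller; returns false if the span had to be dropped. */
    boolean offer(String span) {
        if (queue.offer(span)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    /** Called by the reporter thread; returns null when the buffer is empty. */
    String poll() {
        return queue.poll();
    }

    long droppedCount() {
        return dropped.get();
    }

    /** Fraction of events to keep so that roughly budgetPerSecond events are reported. */
    static double sampleRateFor(long eventsPerSecond, long budgetPerSecond) {
        if (eventsPerSecond <= budgetPerSecond) {
            return 1.0;
        }
        return (double) budgetPerSecond / eventsPerSecond;
    }
}
```

With a budget of 100 events per second and 1000 observed events per second, the helper suggests keeping one event in ten; below the budget, everything is kept.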
![Page 32: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/32.jpg)
(Adaptive) sampling: marketing “X% overhead”
    class MyApp {
        void foo() {
            while (true) {
                handleWebRequest();
            }
        }
    }

    class MyOtherApp {
        void foo() throws InterruptedException {
            while (true) {
                Thread.sleep(100L);
            }
        }
    }
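The two loops suggest why a single "X% overhead" figure says little: the same absolute tracing cost weighs very differently depending on how much work the application does per traced event. The arithmetic can be sketched as follows; `OverheadEstimate` and the nanosecond figures are hypothetical and not measurements.

```java
/** Hypothetical helper that turns a fixed per-event tracing cost into a percentage. */
final class OverheadEstimate {

    /** Relative overhead in percent when tracingNanos is added to workNanos of real work. */
    static double percent(long workNanos, long tracingNanos) {
        return 100.0 * tracingNanos / workNanos;
    }
}
```

Assuming a fixed cost of 1,000 ns per traced event, a request that takes 1,000,000 ns sees 0.1% overhead, while a request that takes 10,000 ns sees 10%. An application that mostly sleeps produces few events and sees almost none.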
![Page 33: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/33.jpg)
JIT-friendly tracing
[Diagram labels: JIT-optimized; C, Java, C; incoming, outgoing.]
-XX:MaxInlineSize=35 (auto-reduced)
-XX:FreqInlineSize=325
-XX:InlineSmallCode=2000
-XX:MaxInlineLevel=9
![Page 34: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/34.jpg)
[Diagram: call-site states monomorphic, bimorphic, polymorphic and megamorphic, connected by optimization and deoptimization arrows. Figure labels: direct link; conditional direct link; vtable lookup; home of rumors; (about 90%); (data structures); (but dominant targets).]
Most available tracers know three types of spans: client, server and local. This often leads to “trace call megamorphism” in production systems.
JIT-friendly tracing: enforcing monomorphism
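One way to keep tracer call sites monomorphic is to avoid a span interface with separate client, server and local implementations and to use a single final class that carries its kind as data. The sketch below shows that idea only; `FlatSpan` is a hypothetical name and this is not claimed to be how any particular tracer is built.

```java
/** Hypothetical span type: one concrete class, the kind is a field rather than a subclass. */
final class FlatSpan {
    static final byte CLIENT = 0;
    static final byte SERVER = 1;
    static final byte LOCAL = 2;

    private final byte kind;
    private final String name;

    FlatSpan(byte kind, String name) {
        this.kind = kind;
        this.name = name;
    }

    /** Every caller sees the same receiver type, so the call can be linked directly. */
    String describe() {
        switch (kind) {
            case CLIENT:
                return "client:" + name;
            case SERVER:
                return "server:" + name;
            default:
                return "local:" + name;
        }
    }
}
```

The branch on `kind` replaces the virtual dispatch: the JIT compiler sees exactly one implementation of `describe()` at every call site, regardless of how many span kinds flow through it.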
![Page 35: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/35.jpg)
JIT-friendly tracing: copy&paste monomorphism
    class ServletAdvice {

        @OnMethodEnter
        static void enter(@Argument(0) HttpServletRequest request) {
            String traceId = request.getHeader("X-Trace-Id");
            String method = request.getMethod();
            String uri = request.getRequestURI();
            if (traceId != null) {
                ServletTracer.continueTrace(traceId, method, uri);
            } else {
                ServletTracer.startTrace(method, uri);
            }
        }

        @OnMethodExit
        static void exit(@Argument(1) HttpServletResponse response) {
            ServletTracer.complete(response.getStatus());
        }
    }
![Page 36: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/36.jpg)
Memory-friendly tracing
    package com.twitter.zipkin.gen;

    public class Span implements Serializable {

        public volatile Long startTick;
        private long trace_id;
        private String name;
        private long id;
        private Long parent_id;
        private List<Annotation> annotations = emptyList();
        private List<BinaryAnnotation> b_annotations = emptyList();
        private Boolean debug;
        private Long timestamp;
        private Long duration;

        // ...
    }
![Page 37: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/37.jpg)
Memory-friendly tracing: Zero-garbage tracer
Scala-style model:
- Span per event
- Immutable events
- Some primitives
- Linked list attachments
- Allocation rate correlates with events
- Vulnerable to false-sharing
- User-thread centric

Java-style model:
- Span (container) per thread
- Fully mutable events
- All primitives (void ids)
- Raw-data array annotations
- Allocation rate correlates with sampled events
- Ensures thread-locality
- Tracer-thread centric
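The "Java-style" column can be pictured as one mutable container per thread that stores events in preallocated primitive arrays, so that recording an event allocates nothing. The sketch below is a simplified illustration under that assumption; `ThreadSpanBuffer` and its fixed capacity of 64 are hypothetical, and the synchronization needed to hand the data to a tracer thread is left out.

```java
/** Hypothetical per-thread event container backed by reusable primitive arrays. */
final class ThreadSpanBuffer {
    private static final ThreadLocal<ThreadSpanBuffer> LOCAL =
        ThreadLocal.withInitial(() -> new ThreadSpanBuffer(64));

    private final long[] ids;
    private final long[] startNanos;
    private final long[] durationNanos;
    private int size;

    private ThreadSpanBuffer(int capacity) {
        this.ids = new long[capacity];
        this.startNanos = new long[capacity];
        this.durationNanos = new long[capacity];
    }

    /** Returns the container owned by the calling thread; allocated once per thread. */
    static ThreadSpanBuffer current() {
        return LOCAL.get();
    }

    /** Writes the event in place; returns false when the buffer is full and the event is dropped. */
    boolean record(long id, long start, long duration) {
        if (size == ids.length) {
            return false;
        }
        ids[size] = id;
        startNanos[size] = start;
        durationNanos[size] = duration;
        size++;
        return true;
    }

    int size() {
        return size;
    }

    long idAt(int index) {
        return ids[index];
    }

    /** Marks the buffer as drained; the arrays are kept and overwritten by later events. */
    void reset() {
        size = 0;
    }
}
```

After the one-time allocation per thread, the allocation rate no longer depends on the number of recorded events, which is the property the right-hand column aims for.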
![Page 38: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/38.jpg)
Span identification
[Diagram: foo() sits between the incoming and the outgoing call.]

    class MyBatchFramework {

        @com.instana.sdk.Span("trace-me")
        void doBatchJob() {
            // do hard work...
        }
    }
![Page 39: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/39.jpg)
Context-switch tracing
[Diagram: incoming and outgoing calls spread across thread 1 and thread 2.]
Requires explicit context hand-over upon each context-switch.
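A hand-over of this kind can be sketched with plain JDK classes: capture the trace id on the submitting thread and reinstate it around the task on the executing thread. The class `TraceContext` and its methods are hypothetical; an agent would typically apply such wrapping automatically rather than require calls in application code.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical context holder that carries a trace id across a thread switch. */
final class TraceContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    static void set(String id) {
        TRACE_ID.set(id);
    }

    static String current() {
        return TRACE_ID.get();
    }

    /** Captures the caller's trace id now and reinstates it while the task runs. */
    static Runnable wrap(Runnable task) {
        String captured = TRACE_ID.get();
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                task.run();
            } finally {
                restore(previous);
            }
        };
    }

    private static void restore(String previous) {
        if (previous == null) {
            TRACE_ID.remove();
        } else {
            TRACE_ID.set(previous);
        }
    }

    /** Demonstration: sets an id, runs a wrapped task on a pool thread, returns what it saw. */
    static String observedOnPool(String id) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            set(id);
            String[] seen = new String[1];
            pool.submit(wrap(() -> seen[0] = current())).get();
            return seen[0];
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            TRACE_ID.remove();
            pool.shutdown();
        }
    }
}
```

Restoring the previous value in the `finally` block matters for pooled threads, which would otherwise leak one request's trace id into the next task.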
![Page 40: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/40.jpg)
Tracing sandboxed applications
    class AccessController {

        public static void checkPermission(Permission permission)
                throws AccessControlException {
            AccessControlContext stack = getStackAccessControlContext();
            // perform check based on stack
        }
    }

    class AccessController {

        public static void checkPermission(Permission permission)
                throws AccessControlException {
            if (isInstanaSandboxed()) {
                return;
            }
            AccessControlContext stack = getStackAccessControlContext();
            // perform check based on stack privileges
        }
    }
![Page 41: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/41.jpg)
Testing instrumentation and trace collection
[Diagram: a JAR started through main(String[]) reports to a TestCollector.]
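The TestCollector itself is not shown. As a sketch of the idea, an in-memory collector can stand in for the real backend so that a test runs the instrumented application and then asserts on what was reported. `InMemoryCollector`, `runApp` and the span names below are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical in-memory replacement for a trace backend, for use in tests. */
final class InMemoryCollector {
    private final List<String> spans = Collections.synchronizedList(new ArrayList<>());

    /** Called by the tracer instead of sending data over the network. */
    void report(String spanName) {
        spans.add(spanName);
    }

    /** Returns a snapshot of everything reported so far, in arrival order. */
    List<String> received() {
        synchronized (spans) {
            return new ArrayList<>(spans);
        }
    }

    /** Stand-in for launching the instrumented application's main(String[]). */
    static void runApp(InMemoryCollector collector) {
        collector.report("GET /cart");
        collector.report("SELECT cart");
    }
}
```

A test then compares `received()` against the spans the instrumentation is expected to produce for a known workload.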
![Page 42: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/42.jpg)
Java 9: challenges ahead of us
    ClassLoader.getSystemClassLoader()
        .getResourceAsStream("java/lang/Object.class");

Java 8: URL
Java 9: null

    class MyServlet extends MyAbstractServlet
    class MyAbstractServlet extends Servlet

Applies class hierarchy analysis (CHA) without using the reflection API! (Cannot load types during instrumentation, unless retransforming.)
CHA is also required for inserting stack map frames. Byte Buddy allows for on-the-fly translation of such frames. This way, Byte Buddy is often faster than vanilla ASM with frame computation enabled.
Byte Buddy automatically uses loaded-type reflection upon retransformation.
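To make "class hierarchy analysis without the reflection API" concrete, the sketch below reads the super class name straight from class file bytes by walking the constant pool, following the class file layout in the JVM specification. It is a simplified illustration using only the JDK and is not how Byte Buddy implements it; `SuperClassReader` is a hypothetical name, and the overload taking a `Class` exists only to obtain bytes for the demonstration.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical reader that extracts the super class name from raw class file bytes. */
final class SuperClassReader {

    /** Demonstration helper: locates the class file of an already loaded class. */
    static String superClassOf(Class<?> type) {
        String name = type.getName();
        String resource = name.substring(name.lastIndexOf('.') + 1) + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("No class file found for " + name);
            }
            return superClassOf(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the internal name of the super class, or null for java/lang/Object. */
    static String superClassOf(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("Not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort();
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count];
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                    // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20: skip(in, 2); break;
                case 15: skip(in, 3); break;
                case 3: case 4: case 9: case 10: case 11: case 12:
                case 17: case 18: skip(in, 4); break;
                case 5: case 6: skip(in, 8); i++; break; // long and double use two slots
                default: throw new IOException("Unknown constant pool tag " + tag);
            }
        }
        in.readUnsignedShort(); // access flags
        in.readUnsignedShort(); // this_class
        int superClass = in.readUnsignedShort();
        return superClass == 0 ? null : utf8[classNameIndex[superClass]];
    }

    private static void skip(DataInputStream in, int bytes) throws IOException {
        in.readFully(new byte[bytes]);
    }
}
```

Repeating the lookup for each returned name walks the hierarchy (for example from `MyServlet` to `MyAbstractServlet` to `Servlet`) without ever loading the classes involved.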
![Page 43: Monitoring distributed (micro-)services](https://reader031.vdocuments.us/reader031/viewer/2022022200/58a952991a28ab77408b511d/html5/thumbnails/43.jpg)
http://rafael.codes
@rafaelcodes
http://documents4j.com
https://github.com/documents4j/documents4j
http://bytebuddy.net
https://github.com/raphw/byte-buddy