splunking the jvm (java virtual machine)
Copyright © 2012 Splunk Inc.
Splunking the Java Virtual Machine(JVM)
Presented by Damien Dallimore
Developer Evangelist at Splunk
About me
• Developer Evangelist at Splunk since July 2012
  • http://dev.splunk.com
  • http://splunk-base.splunk.com
  • Slides available for my “Using the Java SDK” session
• Splunk Community Member
  • Splunk for JMX
  • SplunkJavaLogging
  • SplunkBase Answers
• Splunk Architect and Administrator
• Coder, hacker, architect of Enterprise Java solutions around the globe in many different industries (aviation, core banking, card payments etc.)
• If Splunk had been there at the start of my career I would have a lot more hair today
Agenda
• The JVM Landscape
• JVM Machine Data
• Splunk for JMX
• Community Projects – call to arms
• Questions (feel free to yell out at any time also)
The JVM Landscape
What is this JVM thing ?
• Circa 1991, Dr. James Gosling at Sun started developing a technology for next generation smart devices/appliances
• “Green” became “Oak” which became “Java”
• Java 1.0 first appeared in January 1996.
• The JVM is a virtual machine that runs programs that are compiled into Java bytecode
• Available for many hardware and software platforms
• 17 years later, the JVM has evolved from a consumer device technology, to a browser-oriented technology with the explosion of the web, to now being deeply rooted in the enterprise software landscape on the server side and in the cloud
17 years later
• Oracle took ownership of Java from Sun in January 2010
• The Java Community Process (JCP) is the forum where members develop specifications for Java technology
• Java Specification Requests (JSRs) are submitted for new features, reviewed, and then voted on by the JCP Executive Committee
• Editions
  • Embedded Java, Java ME, Java SE, Java EE
• Current version is Java 7 (Dolphin)
  • Java 8 scheduled for 2013
Application Servers Enterprise Service Buses Databases
NoSQL Distributed Big Data Web Servers
Directory Servers Search Engines Build Systems
Gaming Platforms Trading Systems Reservation Systems
Core Banking Messaging Infrastructure Proprietary Systems
JVM Variants
• Oracle Hotspot (formerly SUN)
  • the primary reference JVM implementation
• Oracle JRockit (formerly BEA)
  • free since May 2011
  • code base currently being merged with Hotspot, ETA ~JDK 8
• Open JDK
  • SUN open sourced Hotspot and the Java class library in 2006
  • Slight differences with Oracle Java still
  • OpenJDK is the official Java SE7 Reference Implementation
• J9
  • IBM’s JVM for AIX, Linux, MVS, OS/400, Pocket PC, z/OS
• Azul Systems Zing
  • based on HotSpot
  • supports memory heaps up to 512 GB without GC pauses and is able to grow and shrink the heap based on load
http://en.wikipedia.org/wiki/List_of_Java_virtual_machines
The JVM has a healthy future
• Hotspot / JRockit code merge creating a best of breed JVM, Oracle to contribute this to OpenJDK
• OpenJDK is thriving, Oracle are contributing and being good stewards of Java (despite initial
skepticism)
• Proliferation of alternative JVM languages that can all cohabit in the JVM, and new features in Java 8 to further enhance this multi-language platform
• Scala
• Groovy
• Clojure
• The JVM is evolving organically with the shifting tides of Enterprise software, it isn’t about the “J”
anymore.
• From the clustered Application Server domination of the 00’s we now see an explosion of Big Data
products running in massively distributed environments on commodity hardware or in the cloud
• Apache Hadoop family (MapReduce, Hive, HBase, Cassandra, HDFS)
What is running in JVMs ?
JVM “Fanboi”
Dr. Gosling
Fanboi
Speaking of Java as a language as opposed to the JVM platform, James Gosling, the Father of Java, said "Most people talk about Java the language, and this may sound odd coming from me, but I could hardly care less." He went on to explain, "What I really care about is the Java Virtual Machine as a concept, because that is the thing that ties it all together."
JVM Machine Data
JVM Machine Data
Custom Developed Code: WAR file
Application Code: Tomcat
JVM: Hotspot
Operating System: Linux
• The JVM footprint cross cuts the data centre and represents a massive source of valuable machine data
• Large scale Application/Web Server clusters
• Hadoop & Cassandra Node topologies in the 100’s and in some cases 1000’s
JMX, SNMP, HPROF,GC Logs, Custom Agents, Usage Tracker
JMX, Application Logs
JMX, Developer Logs, Splunk Java SDK, SplunkJavaLogging
JVM process OS resource metrics
CORRELATE
Application & Developer Logs
• Application logs
  • default logs that are part of the product
• Developer logs
  • any custom code created and deployed to the application that has its own logging
• Written to local disk or a mounted network volume
• Monitor with a Splunk UF
Splunk Indexer
Splunk Universal Forwarder
Monitor Log Files / Directories
Developed Code
Application
JVM
OS
Splunk Java SDK / SplunkJavaLogging
Splunk Indexer
Developed Code
Application
JVM
OS
HTTP REST / TCP / UDP
• Alternative to writing to log file or needing to deploy a Splunk Universal Forwarder
• Use the Splunk Java SDK to input events directly to Splunk via HTTP REST.
• Use SplunkJavaLogging to input events directly to Splunk using custom logging appenders.
• Come to my “Using the Java SDK” session for more on this !!
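The idea of a logging appender that sends events straight to Splunk over the network can be sketched with the JDK alone. The class below is a hypothetical stand-in, not the SplunkJavaLogging API: it assumes a raw TCP input is listening on the target host and port, and it writes one key=value line per log record. The class name, the field names and the loopback demo are all illustrative assumptions.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;

/** Hypothetical java.util.logging handler: one key=value line per record over TCP. */
final class TcpKeyValueHandler extends Handler {
    private final Socket socket;
    private final PrintWriter out;

    TcpKeyValueHandler(String host, int port) throws IOException {
        socket = new Socket(host, port);
        // autoflush=true so each println is pushed to the socket immediately
        out = new PrintWriter(
                new OutputStreamWriter(socket.getOutputStream(), StandardCharsets.UTF_8), true);
    }

    /** Formats a record as a single line of key=value pairs (field names are assumptions). */
    static String format(LogRecord r) {
        return "level=" + r.getLevel().getName()
             + " logger=\"" + r.getLoggerName() + "\""
             + " message=\"" + r.getMessage() + "\"";
    }

    @Override public void publish(LogRecord r) {
        if (isLoggable(r)) {
            out.println(format(r));
        }
    }

    @Override public void flush() {
        out.flush();
    }

    @Override public void close() {
        out.close();
        try {
            socket.close();
        } catch (IOException ignored) {
            // nothing useful to do when closing fails
        }
    }

    /** Loopback demo: a local ServerSocket plays the part of the TCP input. */
    static String loopbackDemo() {
        try (ServerSocket server = new ServerSocket(0)) {
            TcpKeyValueHandler handler =
                    new TcpKeyValueHandler("127.0.0.1", server.getLocalPort());
            LogRecord record = new LogRecord(Level.WARNING, "payment declined");
            record.setLoggerName("com.example.Payments");
            handler.publish(record);
            handler.close();
            try (Socket peer = server.accept();
                 BufferedReader in = new BufferedReader(
                         new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8))) {
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In an application the handler would be attached with `Logger.getLogger("...").addHandler(new TcpKeyValueHandler(host, port))`. A production appender would also need reconnects and buffering, which this sketch leaves out.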
JVM Process OS Metrics
• By JVM Process ID : Process State, Memory, CPU, Disk Usage, Disk I/O, Network I/O, File Descriptor Usage.
• Some OS metrics also exposed via JMX
• Splunk for Unix and Linux
• Splunk for Windows
• Correlate this OS data across your JVM and application events, e.g. your JVM may have hung because of CPU starvation caused by some other process thrashing
Splunk Indexer
Splunk for Unix or Linux
Monitor Log Files & Directories
Developed Code
Application
JVM
OS
Poll output from
commands
Garbage Collection logs
Splunk Indexer
Splunk Universal Forwarder
Monitor GC Log Files
Developed Code
Application
JVM
OS
• Extended Hotspot JVM options
-verbose:gc
-Xloggc:/home/damien/jvm_logs/gc.log
-XX:+PrintGC
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
• The log is written to at garbage collection time
• Be careful, this can affect performance
• Need to perform field extractions in Splunk
• GC metrics also available via JMX
54.736: [Full GC 54.737: [Tenured: 172798K->18092K(174784K), 2.3792658 secs] 257598K->18092K(259584K), [Perm : 20476K->20476K(20480K)], 2.4715398 secs] [Times: user=0.56 sys=0.05, real=0.07 secs]
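To make the field-extraction step concrete, here is a minimal Java sketch that pulls the main values out of a GC line like the one above with a regular expression. In practice the extraction would live in Splunk itself; the class name and the field names (`uptime_s`, `heap_before_kb` and so on) are assumptions for illustration, and the pattern is only written against the line format shown on this slide.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative parser for a -XX:+PrintGCDetails line; field names are assumptions. */
final class GcLogLine {
    /** The sample line from the slide. */
    static final String SAMPLE =
        "54.736: [Full GC 54.737: [Tenured: 172798K->18092K(174784K), 2.3792658 secs] "
      + "257598K->18092K(259584K), [Perm : 20476K->20476K(20480K)], 2.4715398 secs] "
      + "[Times: user=0.56 sys=0.05, real=0.07 secs]";

    // uptime, GC type, then whole-heap before->after(total), an optional
    // bracketed section such as [Perm ...], and finally the total pause time
    private static final Pattern GC = Pattern.compile(
        "^(\\d+\\.\\d+): \\[(Full GC|GC).*?\\] "
      + "(\\d+)K->(\\d+)K\\((\\d+)K\\)"
      + "(?:, \\[.*?\\])?, (\\d+\\.\\d+) secs\\]");

    /** Returns the extracted fields in insertion order, or an empty map if the line does not match. */
    static Map<String, String> parse(String line) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = GC.matcher(line);
        if (!m.find()) {
            return fields;
        }
        fields.put("uptime_s", m.group(1));
        fields.put("gc_type", m.group(2));
        fields.put("heap_before_kb", m.group(3));
        fields.put("heap_after_kb", m.group(4));
        fields.put("heap_total_kb", m.group(5));
        fields.put("pause_s", m.group(6));
        return fields;
    }
}
```

Calling `GcLogLine.parse(GcLogLine.SAMPLE)` yields the whole-heap figures (257598K before, 18092K after) and the 2.4715398 second pause, not the per-generation Tenured numbers.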
Custom JVMTI Agents (Advanced)
Splunk Indexer
Splunk Universal
Forwarder
Monitor Agent Log Files
Developed Code
Application
JVM
OS
REST/TCP/UDP
• Java Virtual Machine Tool Interface
• Write custom agents that get injected into the natively running JVM
• Dynamically inspect the state of applications running in the JVM
• Profiling, debugging, monitoring, thread/memory analysis… the JVMTI interface has extensive coverage
• As you write the agent code, the data output can be file based or over the network
Usage Tracker for Oracle JVMs
Splunk Indexer
Splunk Universal
Forwarder
Usage Tracker Log Files
Developed Code
Application
JVM
OS
UDP
• Enable via a JVM system property and a config file
-Dcom.oracle.usagetracker.config.file=/path/usagetracker.properties
• Output to CSV file or over UDP
VM start,Fri Oct 22 14:13:03 BST 2010,examplehost/192.0.2.0,AppName,/path/to/jre,1.7.0,19.0-b09,Oracle Corporation,Oracle Corporation,Linux,i386,2.6.29.x86_64,-Xmx128m,/opt/programs,user.home=/home/username foo.bar=null
• All these metrics also available via JMX
SNMP
• The JVM SNMP Agent provides a single MIB that exposes the JVM’s Management and Monitoring API
http://docs.oracle.com/javase/1.5.0/docs/guide/management/JVM-MANAGEMENT-MIB.mib
• Set up the JVM (just the basic settings shown)
Open a UDP port: -Dcom.sun.management.snmp.port=9004
Configure the ACL: $JAVA_HOME/jre/lib/management/snmp.acl
• Traps can be caught locally to file and monitored
• A scripted input on the Splunk UF can poll the JVM SNMP objects
pysnmp python module : http://pysnmp.sourceforge.net
snmpget command : http://www.net-snmp.org/docs/man/snmpget.html
There is a nice example of this on SplunkBase
Splunk Indexer
Splunk Universal Forwarder
Developed Code
Application
JVM
OS
SNMP Objects Polled
JVM MIB
snmptrapd UDP:162
SNMP Traps written to file
HPROF Profiling Dumps
Splunk Indexer
Splunk Universal
Forwarder
Binary HPROF dump file
Developed Code
Application
JVM
OS
Monitor and decode into
textual key=value pairs
• Binary JVM dumps that allow for deeper JVM resource inspection
• Typical use case is diagnosing memory issues after JVM crashes with java.lang.OutOfMemoryError
• Binary file is usually batch loaded into a third party memory analysis tool like Eclipse MAT
• Generate a heap dump on demand via JMX
• Or tell the JVM to generate a heap dump under certain conditions:
  • -XX:HeapDumpPath=./java_pid<pid>.hprof
  • -XX:+HeapDumpOnOutOfMemoryError
• But what if we could Splunk this awesome source of information? This could be really useful in dev/test!
Warning : heap dumping is an expensive operation as a full GC gets performed
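Generating a heap dump on demand via JMX can be sketched from inside the JVM itself, using the HotSpot diagnostic MXBean. This is a minimal example assuming a HotSpot or OpenJDK runtime; the class and method names are illustrative. Note that `dumpHeap` fails if the target file already exists, and recent JDKs reject file names that do not end in `.hprof`.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

/** Illustrative helper that writes an HPROF heap dump of the current JVM. */
final class HeapDumper {
    /**
     * Dumps the heap to the given file and returns the file size in bytes.
     * liveOnly=true dumps only reachable objects, which forces a full GC first.
     */
    static long dump(Path target, boolean liveOnly) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(target.toString(), liveOnly);
            return Files.size(target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, `HeapDumper.dump(Paths.get("/tmp/app.hprof"), true)` writes a dump that a tool such as Eclipse MAT can open. As the warning above says, this is an expensive operation and is better suited to dev/test than to a busy production JVM.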
Splunk HPROF Decoder
• A scripted input that monitors for HPROF file dumps, reads the binary file in and rolls it out into key=value format for Splunking
• Deploy the scripted input to a Universal Forwarder
• Use Splunk for JMX to periodically trigger an HPROF dump via a JMX operation
• Splunk is now a JVM heap profiling utility
• Diagnose heap issues before they hit production
• Splunk for JMX can tell you that the heap is growing
• This will tell you what is causing the growth
Splunk Universal Forwarder
Binary HPROF dump file
JVM
Monitor and decode into textual key=value pairs
Trigger HPROF file generation via a JMX operation
Splunk Heap Memory Analysis
JMX (Java Management Extensions)
Splunk Indexer
Developed Code
Application
JVM
OS
Splunk Universal
Forwarder
JMX
• Manage and monitor the JVM and application via exposed MBeans
  • JVM MBeans (java.lang domain)
  • Vendor MBeans (most vendors ship their products with extensive MBean coverage)
  • Custom coded MBeans (whatever your devs wish to code)
• MBeans expose attributes, operations and notifications to give you a powerfully dynamic insight into the runtime state of the JVM and your application.
• Add Splunk to the mix for historical and real-time operational visibility, proactive issue detection etc.
• Splunk for JMX app on SplunkBase
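As a rough illustration of what polling a JVM MBean looks like, the sketch below reads the composite `HeapMemoryUsage` attribute of the standard `java.lang:type=Memory` MBean from the local platform MBean server and flattens it into a key=value line. The class name and the output field names are assumptions for illustration; this is not the output format of Splunk for JMX.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

/** Illustrative poller for the java.lang:type=Memory MBean of the current JVM. */
final class HeapUsagePoller {
    /** Returns one key=value line describing current heap usage, in bytes. */
    static String poll() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            // HeapMemoryUsage is a composite attribute with init/used/committed/max items
            CompositeData heap = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            return "mbean=\"" + memory + "\""
                 + " heap_used=" + heap.get("used")
                 + " heap_committed=" + heap.get("committed")
                 + " heap_max=" + heap.get("max");
        } catch (JMException e) {
            throw new IllegalStateException("could not read heap usage", e);
        }
    }
}
```

Printing `HeapUsagePoller.poll()` on a schedule gives a time series of heap usage that can be indexed like any other event.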
JMX vs SNMP
JMX
• Open and easily extensible
• Developers can simply create new MBeans
• Vendor products (JBoss, Cassandra, Hadoop etc.) ship with thorough MBean coverage, not MIBs
SNMP
• The built-in SNMP agent of the JVM is not extensible
• You will not be able to use it to expose your own custom MIB
• If you do want to expose your own MIB, you’d have to create a custom agent
Putting it all together, JVM Splunking Nirvana
Splunk Indexer Cluster
Developed Code
Application
JVM
OS
JMX
HPROF
OS Metrics/Logs
Splunk Forwarder
Logs
JMX
REST/TCP/UDP
Auto Load Balanced
JMX
Logs
Distributed Search
Splunk for JMX
Splunk for JMX
• Connect to any local or remote JVM's JMX server, Hotspot/JRockit/IBM J9
• Query any MBean running on that server
• Extract any MBean attributes (simple, composite or tabular)
• Invoke MBean operations
• Write attributes and operation results out in a default key/value format, or plug in your own custom format, for Splunk indexing and searching
• Transport events over STD OUT(default), TCP, Syslog, Splunk REST endpoint or direct to file.
• Declare clusters of JVM's for larger scale JVM deployments
• Runs on *Nix and Windows
• Out of the box dashboards for common JVM MBeans
• Freely available from SplunkBase, all source code is on GitHub
Connectivity Options
Remote JMX interface
• rmi (JSR160 Standard Implementation and MX4J's JSR160 Implementation)
• iiop (JSR160 Standard Implementation and MX4J's JSR160 Implementation)
Direct Process attachment
• Connect directly to a locally running JVM process
MX4J HTTP connectors (requires MX4J in the target JVM also)
• soap, soap+ssl
• hessian, hessian+ssl
• burlap, burlap+ssl
Setup and Configuration
The main goal of the app was to make it as simple and intuitive as possible to connect to your JVMs and start Splunking JMX data
• Enable your target JVM’s remote JMX interface , test connectivity with JConsole
• Install Splunk for JMX
  • Set your SPLUNK_HOME and JAVA_HOME environment variables, JRE 6+ required
  • Extract the Splunk for JMX tarball to SPLUNK_HOME/etc/apps
  • Restart Splunk
  • At the setup screen, choose a scripted input for your platform (Nix / Windows)
• Set up your JMX configuration file
  • The default config.xml file is preconfigured for common JVM MBeans
  • Browse your JVM (using JConsole) for other MBeans that you wish to poll and configure these
  • You can have as many config files as you require, and you might set these up to fire off at different scheduled frequencies
Configuration Examples - Simple
• MBean object name format: “domain:key=value,key2=value2”
• * and ? wildcards are supported in the MBean name
Around 25KBytes per dump on Hotspot JVMs
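The object name format and its wildcards are standard JMX, so they can be tried out directly against the local platform MBean server. The sketch below is a minimal illustration (the class name is hypothetical): it resolves a pattern such as `java.lang:type=Mem*` to the concrete MBean names it matches.

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import java.util.TreeSet;
import javax.management.MBeanServer;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

/** Illustrative helper that expands an ObjectName pattern on the local MBean server. */
final class MBeanPatterns {
    /** Returns the sorted names of all registered MBeans matching the pattern. */
    static Set<String> match(String pattern) {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            Set<String> names = new TreeSet<>();
            // a null query expression means "no extra filter beyond the name pattern"
            for (ObjectName name : server.queryNames(new ObjectName(pattern), null)) {
                names.add(name.toString());
            }
            return names;
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("bad object name pattern: " + pattern, e);
        }
    }
}
```

A pattern like `java.lang:type=GarbageCollector,*` uses the trailing `,*` to match MBeans that carry additional keys (here, the collector's `name`), which is a convenient way to pick up every collector without listing them.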
Configuration Examples - Clusters
• Define clusters of JVM’s that share the same MBean definitions
• Note: in these examples, for brevity, I am using “dumpAllAttributes”, but in production you’d want to pick and choose the specific MBean attributes you are interested in, and perhaps split definitions over multiple files run at varying frequencies
Configuration Examples - Operations
• Invoke JMX operations that return a value or simply perform some action on the target JVM
• Operation definitions can take parameters
Use Case 1 : your developers might code a JMX operation that returns a CSV or JSON formatted snapshot of some metrics for Splunking
Use Case 2 : dynamically trigger HPROF dumps. The “com.sun.management:type=HotSpotDiagnostic” MBean exposes a “dumpHeap” operation
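Use Case 1 can be sketched with a standard MBean. In the example below, the `com.example:type=OrderMetrics` object name, the `snapshotCsv` operation and the metric values are all hypothetical; the point is only to show a developer-coded operation that returns a CSV snapshot, and how a JMX client would invoke it by name.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

/** Illustrative custom MBean with an operation that returns a CSV metrics snapshot. */
final class OrderMetricsDemo {
    /** Standard MBean convention: the interface is named <ImplClass>MBean and is public. */
    public interface OrderMetricsMBean {
        String snapshotCsv();
    }

    /** Hypothetical application metrics holder. */
    public static final class OrderMetrics implements OrderMetricsMBean {
        private final int placed;
        private final int failed;

        public OrderMetrics(int placed, int failed) {
            this.placed = placed;
            this.failed = failed;
        }

        @Override public String snapshotCsv() {
            return "orders_placed,orders_failed\n" + placed + "," + failed;
        }
    }

    /** Registers the MBean (if needed) and invokes its operation the way a JMX client would. */
    static String registerAndInvoke() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("com.example:type=OrderMetrics");
            if (!server.isRegistered(name)) {
                server.registerMBean(new OrderMetrics(42, 3), name);
            }
            // no-argument operation: empty parameter and signature arrays
            return (String) server.invoke(name, "snapshotCsv", new Object[0], new String[0]);
        } catch (JMException e) {
            throw new IllegalStateException("JMX call failed", e);
        }
    }
}
```

The same `invoke` call shape applies to Use Case 2: the `dumpHeap` operation takes a file name and a boolean, so its signature array would be `{"java.lang.String", "boolean"}`.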
Configuration Examples - Connecting
• IP Address with credentials
• Hostname
• Static Process ID
• Process ID lookup from file
• Process ID lookup from command output
• Raw JMX Service URL
• MX4J HTTP Connector
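For reference, the raw JMX service URL option corresponds to the standard JSR 160 URL form. The sketch below shows how a plain Java client would build the usual RMI URL from a host and port and read one attribute from a remote JVM. The class and method names are illustrative, and it assumes the target JVM has its remote JMX interface enabled (for HotSpot, typically via the `com.sun.management.jmxremote.*` system properties) without authentication or SSL.

```java
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

/** Illustrative remote JMX client using the standard RMI connector. */
final class RemoteJmx {
    /** Builds the conventional service URL for a JVM exposing JMX over RMI. */
    static JMXServiceURL rmiUrl(String host, int port) {
        try {
            return new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Connects, reads a single MBean attribute, and closes the connection. */
    static Object readAttribute(JMXServiceURL url, String mbean, String attribute)
            throws Exception {
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection connection = connector.getMBeanServerConnection();
            return connection.getAttribute(new ObjectName(mbean), attribute);
        }
    }
}
```

A call such as `RemoteJmx.readAttribute(RemoteJmx.rmiUrl("apphost", 9010), "java.lang:type=Threading", "ThreadCount")` would return the live thread count of the remote JVM; the host name and port here are placeholders.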
Custom Formatters/Transports
• The Splunk for JMX configuration is user extensible
• You can code and configure your own Formatters and Transports
Formatters
• Takes the raw MBean polled output and formats it for Splunking
• A Java implementation of the "com.dtdsoftware.splunk.formatter.Formatter" interface
• If the optional formatter declaration is omitted, then the default formatter will be used
Transports
• Takes the formatted output and transports it to a destination
• A Java implementation of the "com.dtdsoftware.splunk.transport.Transport" interface
• If the optional transport declaration is omitted, then the default transport (STD out) will be used
Formatter Examples
Transport Examples
Deployment Architectures 1
• Simplest scenario
• Monolithic Splunk installation
• Splunk for JMX polling 1 or more remote/local JVMs via the remote JMX interface
• There is support for many target JVMs in the configuration schema, but to really scale out you need a more advanced Splunk architecture
Deployment Architectures 2
Splunk UF running locally
with target JVM
Splunk Indexer
Cluster
Splunk Search Head
Pool
Load Balancer
• Run Splunk UF locally with the target JVM. It can connect using the remote JMX interface or direct process attachment.
• Each tier scales out horizontally.
• Can overcome firewall issues that are sometimes inherent with Java RMI
• Deploy Splunk for JMX components and configurations with Splunk Deployment Server, Puppet or Chef.
Community Projects – call to arms !!
Remember this slide ?
SplunkBase JVM Apps
• I’ve already started on some, but I can’t do it all myself!
• You can use Splunk for JMX as the “kernel” upon which to build Splunk for Tomcat, Splunk for JBoss, Splunk for Mule etc.
• I have found that with most of the JVM apps that I have looked at or been asked to build a Splunk app for, most of the useful data is in the JMX metrics and operations
• And this can of course be augmented with any useful log data
• Build Simple/Advanced XML dashboards
• Bundle up the app and post it on Splunkbase, share with the community and perhaps someone else will create an app that you can use too
• Note: you are publishing a common app, so you can’t take into account any custom developer code, just the metrics and logs that are inherent to the core JVM app
Contact Details
Always more than happy to be contacted for questions, feedback, collaborations, ideas that will change the world etc…
Email : [email protected]
SplunkBase: damiend
Github: damiendallimore
Twitter : @damiendallimore
Blog : http://blogs.splunk.com/dev
Splunk Dev Platform Team : [email protected]
Links
Splunk for JMX: http://splunk-base.splunk.com/apps/25505/splunk-for-jmx
SplunkJavaLogging: https://github.com/damiendallimore/SplunkJavaLogging
Splunk Java SDK: http://dev.splunk.com/view/java-sdk/SP-CAAAECN
Oracle Java: http://www.oracle.com/us/technologies/java/overview/index.html
Open JDK : http://openjdk.java.net/
JMX : http://www.oracle.com/technetwork/java/javase/tech/javamanagement-140525.html
Azul Zing : http://www.azulsystems.com/products/zing/whatisit
JVMTI : http://docs.oracle.com/javase/6/docs/technotes/guides/jvmti/
Usage Tracker : http://docs.oracle.com/javase/products/usagetracker.html
Usage Tracker w/ Splunk : http://javalandtales.blogspot.co.uk/#!/2012/05/using-java-usage-tracker-feature-with.html
Thanks for coming !