Lifting The Veil – Reading Java Byte Code
Description: Java Bytecode Explained
Alexander Shopov
By day: Software Engineer at Cisco
By night: OSS contributor, Coordinator of Bulgarian Gnome TP
Contacts:
E-mail: [email protected]
[email protected]
LinkedIn: http://www.linkedin.com/in/alshopov
Google: Just search “al_shopov”
Please Learn And Share
License: CC-BY v3.0 (Creative Commons Attribution v3.0)
Disclaimer
My opinions, knowledge and experience!
Not my employer's.
Contents
● Why read?
● How to read?
    ● JVM Internals;
    ● JVM Data Types;
    ● JVM Opcodes.
● Let's read some code.
● What next?
Why Read Byte code?
● Understand your platform
● It is interesting and not too hard
● How does Java function? How does X function?
● Job interviews
● Catch compiler bugs/optimizations
● Learn to read before you write
● Source may not correspond to binary
● C/C++ people know their assembler
● Java language evolution vs. Java platform evolution
Bad News And Good News
Bad: We will be reading assembler
Good: Easiest assembler in the world
What Is The JVM?
● Stack-based, byte-oriented virtual machine without registers, easily implementable on 32-bit hardware.
● 206 (<256) instructions that are easy to group; there is no need to remember them all
● Some leeway in implementations (even with Oracle)
Dramatis Personæ
● The JVM
● The threads
● The frames
● The stacks – LIFO
● The local variables – array of slots
● The runtime constant pool – array of values
● The bytecode – the instructions
● Class files – serialized form of constants and byte code
Enter JVM
JVM OS process
Enter Threads
[Diagram: threads A, B, C and D inside the JVM process]
Enter Frames
[Diagram: threads A, B, C and D with stacks of frames: F0 to F4, F0 to F2, F0 to F1, F0 to F3]
Enter Frames, Really!
[Diagram: the stacks of frames alone: F0 to F4, F0 to F2, F0 to F1, F0 to F3]
What Is A Frame Actually?
[Diagram: frame F0]
Let's Peek Inside A Frame
[Diagram: frame F0]
Enter Local Variables
[Diagram: frame F0 with local variables in slots 0 1 2 3 4 5 6 …]
Enter Stack
[Diagram: frame F0 with local variables and a stack]
Enter Pool Of Constants
[Diagram: frame F0 with a pool of constants, local variables and a stack]
Where Is The Code?
[Diagram: the JVM (heap) containing frame F0 with its pool of constants, local variables and stack, plus the class with its method code and the PC]
Where is the code?
[Diagram: frame F0 with pool of constants, local variables, stack, class, method code and PC; a local variable holds 6]
Load
[Diagram: the 6 is also shown on the stack]
And…
[Diagram: an 8 appears in the frame alongside the 6]
Store
[Diagram: the local variables now hold 6 and 8]
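The load, compute, store sequence from the frame diagrams can be sketched as a toy frame in Java. This is only an illustration, not how a real JVM is implemented: the class TinyFrame and its methods are hypothetical names, and the step that turns 6 into 8 is assumed here to be "add the constant 2", since the slides do not say which instruction runs in between.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy model of one frame: an array of local variable slots
// plus a LIFO operand stack. Only int values are modelled.
final class TinyFrame {
    final int[] locals;
    final Deque<Integer> stack = new ArrayDeque<>();

    TinyFrame(int maxLocals) {
        this.locals = new int[maxLocals];
    }

    // iload: copy a local variable onto the top of the stack
    void iload(int slot) { stack.push(locals[slot]); }

    // istore: pop the top of the stack into a local variable
    void istore(int slot) { locals[slot] = stack.pop(); }

    // iconst: push a small int constant
    void iconst(int value) { stack.push(value); }

    // iadd: pop two ints, push their sum
    void iadd() {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(left + right);
    }

    // Assumed scenario: slot 0 holds 6, we add 2 and store the result in slot 1.
    static int demo() {
        TinyFrame frame = new TinyFrame(7);
        frame.locals[0] = 6;
        frame.iload(0);   // stack: 6
        frame.iconst(2);  // stack: 6 2
        frame.iadd();     // stack: 8
        frame.istore(1);  // stack: empty, locals[1] = 8
        return frame.locals[1];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Every instruction either pushes onto the stack, pops from it, or does both, which is why the bytecode needs no registers.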
JVM Datatypes
● Primitive types
    ● Java { numeric – integral: byte (±8), short (±16), int (±32), long (±64), char (+16); floating point: float (±32), double (±64); boolean (int or byte) }
    ● returnAddress – pointers to the opcodes of JVM (jumps - loops)
● Reference types
    ● class, array, interface
    ● null
JVM Datatypes Descriptors
Java type     Type descriptor
boolean       Z
char          C
byte          B
short         S
int           I
float         F
long          J
double        D
Object        Ljava/lang/Object;
byte[]        [B
String[][]    [[Ljava/lang/String;
void          V
JVM Method Descriptors
Source Code Method declaration        Method Descriptor
void m1(int i, double d, float f)     (IDF)V
byte[] m2(String s)                   (Ljava/lang/String;)[B
Object m3(int[][][] i)                ([[[I)Ljava/lang/Object;
boolean[] m4()                        ()[Z
long m5(Object, Long)                 (Ljava/lang/Object;Ljava/lang/Long;)J
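One way to check descriptors like these without writing them by hand is to let the JDK build them. The sketch below uses java.lang.invoke.MethodType, which is part of the standard library; the class name DescriptorDemo and the helper descriptor are made up for this illustration. Note that boolean[] maps to [Z, because B is already taken by byte.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper: builds JVM method descriptors from Class objects.
final class DescriptorDemo {
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes)
                         .toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // void m1(int i, double d, float f)
        System.out.println(descriptor(void.class, int.class, double.class, float.class));
        // byte[] m2(String s)
        System.out.println(descriptor(byte[].class, String.class));
        // Object m3(int[][][] i)
        System.out.println(descriptor(Object.class, int[][][].class));
        // boolean[] m4()
        System.out.println(descriptor(boolean[].class));
        // long m5(Object, Long)
        System.out.println(descriptor(long.class, Object.class, Long.class));
    }
}
```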
206 instructions
DON'T PANIC!
Level 1 – Do Nothing/1
● nop
Level 2 – Load Constants/20
● aconst_null
● iconst_m1, iconst_0, iconst_1, iconst_2, iconst_3, iconst_4, iconst_5
● lconst_0, lconst_1
● fconst_0, fconst_1, fconst_2
● dconst_0, dconst_1
● bipush, sipush – 1, 2 bytes
● ldc, ldc_w, ldc2_w – load from index in constant pool; 1, 2, 2 bytes for index
Level 3 – Load Variables/33
● iload, lload, fload, dload, aload
● iload_0, iload_1, iload_2, iload_3, lload_0, lload_1, lload_2, lload_3, fload_0, fload_1, fload_2, fload_3, dload_0, dload_1, dload_2, dload_3, aload_0, aload_1, aload_2, aload_3
● iaload, laload, faload, daload, aaload, baload, caload, saload – consume reference to array and int index in it
Level 4 – Conversions/15
● i2l, i2f, i2d, l2i, l2f, l2d, f2i, f2l, f2d, d2i, d2l, d2f, i2b, i2c, i2s
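The narrowing conversions in this group have a few surprises that are easy to see from plain Java casts, which javac compiles to exactly these opcodes. The sketch below is only an illustration; the class name ConversionDemo and the method names (chosen to mirror the opcodes) are hypothetical.

```java
// Each method body is a single cast, which javac translates to the
// conversion instruction named by the method.
final class ConversionDemo {
    static byte i2b(int value)   { return (byte) value; }  // keeps the low 8 bits, sign-extended
    static char i2c(int value)   { return (char) value; }  // keeps the low 16 bits, zero-extended
    static int  l2i(long value)  { return (int) value; }   // keeps the low 32 bits
    static int  d2i(double value) { return (int) value; }  // NaN becomes 0, out-of-range values saturate

    public static void main(String[] args) {
        System.out.println(i2b(200));
        System.out.println((int) i2c(-1));
        System.out.println(l2i(1L << 32));
        System.out.println(d2i(Double.NaN));
        System.out.println(d2i(1e20));
    }
}
```

Integral narrowing simply drops the high bits, while floating point to integer conversion never throws: it maps NaN to 0 and clamps values that do not fit.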
Level 6 – Maths/37
● iadd, ladd, fadd, dadd, isub, lsub, fsub, dsub, imul, lmul, fmul, dmul, idiv, ldiv, fdiv, ddiv, irem, lrem, frem, drem, ineg, lneg, fneg, dneg, ishl, lshl, ishr, lshr, iushr, lushr, iand, land, ior, lor, ixor, lxor
● iinc - increment local variable #index by signed byte const
Level 7 – Stores/33
● istore, lstore, fstore, dstore, astore, istore_0, istore_1, istore_2, istore_3, lstore_0, lstore_1, lstore_2, lstore_3, fstore_0, fstore_1, fstore_2, fstore_3, dstore_0, dstore_1, dstore_2, dstore_3, astore_0, astore_1, astore_2, astore_3, iastore, lastore, fastore, dastore, aastore, bastore, castore, sastore
Level 8 – No-branch Comparisons/5
● lcmp, fcmpl, fcmpg, dcmpl, dcmpg (beware NaN)
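The NaN warning is about the only difference between the "l" and "g" variants: when either operand is NaN, fcmpl pushes -1 and fcmpg pushes 1. A minimal Java model of that behaviour follows; the class FloatCompareDemo and its methods are hypothetical names that only imitate the two opcodes.

```java
// Models the int result that fcmpl / fcmpg leave on the stack.
final class FloatCompareDemo {
    static int fcmpl(float a, float b) {
        if (a > b) return 1;
        if (a == b) return 0;
        return -1;            // a < b, or at least one operand is NaN
    }

    static int fcmpg(float a, float b) {
        if (a < b) return -1;
        if (a == b) return 0;
        return 1;             // a > b, or at least one operand is NaN
    }

    public static void main(String[] args) {
        System.out.println(fcmpl(Float.NaN, 1f));
        System.out.println(fcmpg(Float.NaN, 1f));
        // Both source-level comparisons are false when NaN is involved.
        System.out.println(Float.NaN < 1f);
        System.out.println(Float.NaN > 1f);
    }
}
```

Having both variants lets a compiler pick the one whose NaN result makes the following branch fail, so any ordered comparison with NaN can evaluate to false.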
Level 9 – Objects/15
● getstatic, putstatic
● getfield, putfield
● invokevirtual, invokespecial, invokestatic, invokeinterface
● new, newarray, anewarray
● arraylength
● athrow
● checkcast, instanceof (difference is treatment of null)
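The different treatment of null by checkcast and instanceof is visible from ordinary Java source. The sketch below is illustrative only; NullCastDemo and its method names are hypothetical.

```java
final class NullCastDemo {
    // Compiles to instanceof: null is never an instance of anything.
    static boolean isString(Object value) {
        return value instanceof String;
    }

    // Compiles to checkcast: null passes the check and stays null.
    static String asString(Object value) {
        return (String) value;
    }

    public static void main(String[] args) {
        System.out.println(isString(null));
        System.out.println(asString(null));
    }
}
```

A non-null value of the wrong type makes instanceof answer false, while checkcast throws ClassCastException.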
Level 10 – Return/6
● ireturn, lreturn, freturn, dreturn, areturn, return
165 of 206
80%
We Have Enough Mana/Resources!
Let's dive in bytecode!
Enter Bytecode
javap – your only true friend now
javap -classpath PATH -p -c -l -s CLASS
Example 1
public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: istore_3
     4: iload_3
     5: iload_2
     6: iadd
     7: istore_3
     8: iload_3
     9: ireturn
public static int whatIsThis(int a, int b, int c) {
    int result = a + b;
    result += c;
    return result;
}
Example 2
public static int whatIsThis(int, int, int);
  Signature: (III)I
  Code:
     0: iload_0
     1: iload_1
     2: iadd
     3: iload_2
     4: iadd
     5: ireturn
public static int whatIsThis(int a, int b, int c) {
    return a + b + c;
}
Example 3
public static int whatIsThis(int, float, double);
  Signature: (IFD)I
  Code:
     0: iload_0
     1: i2f
     2: fload_1
     3: fadd
     4: f2d
     5: dload_2
     6: dadd
     7: d2i
     8: ireturn
  LineNumberTable:
    line 6: 0
  LocalVariableTable:
    Start  Length  Slot  Name  Signature
        0       9     0     a  I
        0       9     1     b  F
        0       9     2     c  D
public static int whatIsThis(int a, float b, double c) {
    return (int) (a + b + c);
}
Example 4
public static void main(java.lang.String[]);
  Signature: ([Ljava/lang/String;)V
  Code:
     0: getstatic     #16  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: ldc           #22  // String BGOUG
     5: invokevirtual #24  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
     8: return
More verbosity
javap -v -classpath PATH -p -c -l -s CLASS
Example 4
Constant pool:
   #1 = Class      #2       // org/kambanaria/readbytecode/bgoug/Example4
   #2 = Utf8       org/kambanaria/readbytecode/bgoug/Example4
  …
  #16 = Fieldref   #17.#19  // java/lang/System.out:Ljava/io/PrintStream;
  …
  #22 = String     #23      // BGOUG
  #23 = Utf8       BGOUG
  #24 = Methodref  #25.#27  // java/io/PrintStream.println:(Ljava/lang/String;)V
  …
Example 4
public static void main(String[] args) {
    System.out.println("BGOUG");
}
// Hello, BGOUG!
Example 5
public char[] whatIsThis();
  Code:
     0: aload_0
     1: getfield      #12  // Field content:[C
     4: areturn
public static void main(java.lang.String[]);
  Code:
     0: getstatic     #22  // Field java/lang/System.out:Ljava/io/PrintStream;
     3: new           #1   // class org/kambanaria/readbytecode/bgoug/Example5
     6: dup
     7: invokespecial #28  // Method "<init>":()V
    10: invokevirtual #29  // Method whatIsThis:()[C
    13: invokestatic  #31  // Method java/util/Arrays.toString:([C)Ljava/lang/String;
    16: invokevirtual #37  // Method java/io/PrintStream.println:(Ljava/lang/String;)V
    19: return
public char[] whatIsThis() {
    return content;
}
public static void main(String[] args) {
    System.out.println(Arrays.toString(new Example5().whatIsThis()));
}
Level 11 – Stack/9
(stack contents before ➔ after, top of stack on the right)
● pop: a ➔ (empty)
● pop2: ba ➔ (empty)
● dup: a ➔ aa
● dup_x1: ba ➔ aba
● dup_x2: cba ➔ acba
● dup2: ba ➔ baba
● dup2_x1: cba ➔ bacba
● dup2_x2: dcba ➔ badcba
● swap: ba ➔ ab
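The before and after pictures for the stack instructions can be replayed with a small string-based model, where the rightmost character is the top of the stack. This is a sketch under simplifying assumptions: every value is treated as a single-slot (category 1) value, and the class StackOpsDemo with its apply method is a hypothetical name, not part of any library.

```java
// Models the nine stack-manipulation opcodes on a string such as "cba",
// where the last character is the top of the operand stack.
final class StackOpsDemo {
    static String apply(String op, String stack) {
        int n = stack.length();
        switch (op) {
            case "pop":     return stack.substring(0, n - 1);
            case "pop2":    return stack.substring(0, n - 2);
            case "dup":     return stack + stack.charAt(n - 1);
            case "dup_x1":  return stack.substring(0, n - 2) + stack.charAt(n - 1) + stack.substring(n - 2);
            case "dup_x2":  return stack.substring(0, n - 3) + stack.charAt(n - 1) + stack.substring(n - 3);
            case "dup2":    return stack + stack.substring(n - 2);
            case "dup2_x1": return stack.substring(0, n - 3) + stack.substring(n - 2) + stack.substring(n - 3);
            case "dup2_x2": return stack.substring(0, n - 4) + stack.substring(n - 2) + stack.substring(n - 4);
            case "swap":    return stack.substring(0, n - 2) + stack.charAt(n - 1) + stack.charAt(n - 2);
            default:        throw new IllegalArgumentException("unsupported op: " + op);
        }
    }

    public static void main(String[] args) {
        System.out.println(apply("dup_x1", "ba"));
        System.out.println(apply("dup2_x2", "dcba"));
    }
}
```

The new/dup/invokespecial pattern in Example 5 relies on dup: one copy of the fresh reference is consumed by the constructor call, the other remains for the code that follows.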
Example 6
public void whatIsThis(java.lang.String);
  Code:
     0: aload_1
     1: ifnonnull     12
     4: new           #18  // class java/lang/NullPointerException
     7: dup
     8: invokespecial #20  // Method java/lang/NullPointerException."<init>":()V
    11: athrow
    12: aload_0
    13: aload_1
    14: putfield      #21  // Field s:Ljava/lang/String;
    17: return
public void whatIsThis(String s) {
    if (null == s) {
        throw new NullPointerException();
    }
    this.s = s;
}
Level 12 – conditions, branches, loops/19
● ifeq, ifne, iflt, ifge, ifgt, ifle
● if_icmpeq, if_icmpne, if_icmplt, if_icmpge, if_icmpgt, if_icmple
● if_acmpeq, if_acmpne
● ifnull, ifnonnull
● goto, jsr, ret
193 of 206
94%
Example 7
public static int parse(java.lang.String);
  Code:
     0: aload_0
     1: invokestatic  #16  // Method java/lang/Integer.parseInt:(Ljava/lang/String;)I
     4: ireturn
     5: astore_1
     6: iconst_0
     7: ireturn
  Exception table:
     from    to  target  type
        0     4       5  Class java/lang/NumberFormatException
public static int parse(String s) {
    try {
        return Integer.parseInt(s);
    } catch (NumberFormatException e) {
        return 0;
    }
}
Example 8
public class org.kambanaria.readbytecode.bgoug.Example8 {
  static final boolean $assertionsDisabled;
  static {};
    Code:
       0: ldc           #1   // class org/kambanaria/readbytecode/bgoug/Example8
       2: invokevirtual #10  // Method java/lang/Class.desiredAssertionStatus:()Z
       5: ifne          12
       8: iconst_1
       9: goto          13
      12: iconst_0
      13: putstatic     #16  // Field $assertionsDisabled:Z
      16: return
public class Example8 {
    private static String repeat(String s) {
        assert s != null;
        return s + s;
    }
}
Example 8
private static java.lang.String repeat(java.lang.String);
  Code:
     0: getstatic     #16  // Field $assertionsDisabled:Z
     3: ifne          18
     6: aload_0
     7: ifnonnull     18
    10: new           #28  // class java/lang/AssertionError
    13: dup
    14: invokespecial #30  // Method java/lang/AssertionError."<init>":()V
    17: athrow
    18: new           #31  // class java/lang/StringBuilder
    21: dup
    22: aload_0
    23: invokestatic  #33  // Method java/lang/String.valueOf:(Ljava/lang/Object;)Ljava/lang/String;
    26: invokespecial #39  // Method java/lang/StringBuilder."<init>":(Ljava/lang/String;)V
    29: aload_0
    30: invokevirtual #42  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
    33: invokevirtual #46  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
    36: areturn
Now You Know
Beware Asserts In Public Methods!
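The bytecode of Example 8 shows why: the whole check is skipped whenever $assertionsDisabled is true, which is the default unless the JVM is started with -ea. A hedged sketch of the usual alternative for public methods is an explicit check that always runs. The class PublicCheckDemo and its public repeat method are hypothetical and are not taken from the talk.

```java
final class PublicCheckDemo {
    // Reports whether assertions are enabled for this class (java -ea ...).
    static boolean assertionsEnabled() {
        boolean enabled = false;
        assert enabled = true; // the assignment only executes when assertions are on
        return enabled;
    }

    // Hypothetical public variant of repeat: the argument check does not
    // depend on how the JVM was started.
    public static String repeat(String s) {
        if (s == null) {
            throw new IllegalArgumentException("s must not be null");
        }
        return s + s;
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println(repeat("ab"));
    }
}
```

With an assert, a null argument would only be caught when the caller happens to run with assertions enabled; the explicit check behaves the same in every run.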
Further resources
● Oracle: The JVM Specification, Java SE 7 Edition
● A. Arhipov: Java Bytecode For Discriminating Developers
● Wikipedia: Java Bytecode Instruction Listings
● S. H. Park: Understanding JVM Internals
● C. McGlone: Looking "Under the Hood" with javap
● P. Haggar: Java bytecode
Presentation background
● Alexander Wilms: Hexagons