TS-1507: Inside the K Virtual Machine (KVM). Frank Yellin, Sun Microsystems, Inc.

Page 1: Inside the KVM

TS-1507: Inside the K Virtual Machine

Inside the K Virtual Machine (KVM)

Frank Yellin
Sun Microsystems, Inc.

Page 2: Inside the KVM

Goals of This Talk

• You should be able to
  – Read the source code to the KVM without difficulty
  – Port the KVM
  – Debug your port
  – Understand some of the design decisions
• You should want to get a copy of the sources right away

Page 3: Inside the KVM

KVM Design Goals

• A small-footprint virtual machine for the Java™ platform (“Java virtual machine”) for resource-constrained devices

• Build a Java VM that would:
  – Be easy to understand and maintain
  – Be highly portable
  – Be small without sacrificing features of the Java programming language

Page 4: Inside the KVM

K Virtual Machine

• KVM is a “low tech” Java virtual machine
  – No dynamic compilation or other advanced performance optimization techniques
  – Easy to read and port

• KVM is based on the Spotless system developed originally at Sun Labs

Page 5: Inside the KVM

KVM Technical Facts

• Implemented in C
  – Core of the VM is about 24,000 lines of well-commented source code
• Static size of VM executable
  – 40-80 kbytes depending on the platform and compilation options (w/o romizing)
  – On Palm and Win32 about 60 kB
• Runs anywhere from 30-80% of the speed of JDK™ 1.1 software without a JIT

Page 6: Inside the KVM

KVM Portability

• Source code available under SCSL (Sun Community Source License)

• Release includes three KVM ports
  – Win32
  – PalmOS (3.01 or newer)
  – Solaris™ operating environment

• Sun’s early access partners and customers have ported the KVM to various other platforms
  – More than 25 ports have been done so far

Page 7: Inside the KVM

KVM Design

• Unlike some other small Java technology implementations
  – KVM supports dynamic class loading and regular class files
  – Supports JAR file format

• Supports full byte code set (basic word size 32 bits)

Page 8: Inside the KVM

JVM™/KVM Compatibility

• General goal
  – Full Java language and Java virtual machine specification compatibility
• Main language-level difference
  – Floating point not supported in CLDC 1.0
    • No hardware floating point support on most devices due to space limitations

• Other differences (next slide) are mainly because
  – Libraries included in CLDC are limited
  – Can’t afford to use full J2SE™ platform security model

Page 9: Inside the KVM

Virtual Machine Compatibility

• VM implementation differences
  – No Java™ native interface (JNI)
  – No reflection
  – No thread groups
  – No weak references
  – No finalization
  – Limited error handling support
  – New implementation of bytecode verification

Page 10: Inside the KVM

Features of the VM

• Memory model
• Garbage collector
• Interpreter
• Frames
• Threads and monitors
• File loading
• Verification
• Security
• Native methods
• Romizer
• Palm-romizer
• Inflater
• Restartability
• Port specific changes
• Compiler flags
• Porting dangers
• In the future

Page 11: Inside the KVM

Important Data Structures

• cell
• CLASS
  – INSTANCE_CLASS
  – ARRAY_CLASS
• FIELD
• METHOD
• FIELDTABLE
• METHODTABLE
• OBJECT
  – INSTANCE
  – ARRAY
  – STRING_INSTANCE
• FRAME
• THREAD
• BYTEARRAY
• SHORTARRAY
• POINTERLIST
• HASHTABLE

Page 12: Inside the KVM

Variable Length

struct instanceStruct {
    INSTANCE_CLASS ofClass;   /* class of this instance */
    MONITOR monitor;          /* monitor associated with the object */
    union {
        cell *cellp;
        cell cell;
    } data[1];                /* first of n fields; the rest follow in memory */
};

#define SIZEOF_INSTANCE(n) \
    (StructSizeInCells(instanceStruct) + ((n) - 1))
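The size computation above can be tried out in isolation. The following is a minimal sketch, assuming a cell is one pointer-sized word; `demoInstance`, `DEMO_SIZE_IN_CELLS`, `DEMO_SIZEOF_INSTANCE`, and `demo_new_instance` are hypothetical stand-ins, not KVM definitions, and `calloc` is used only because the sketch has no VM heap.

```c
#include <stdint.h>
#include <stdlib.h>

typedef intptr_t cell;                   /* assumption: one cell = one pointer-sized word */

struct demoInstance {                    /* hypothetical stand-in for instanceStruct */
    void *ofClass;
    void *monitor;
    union {
        cell *cellp;
        cell cell;
    } data[1];                           /* declared with one slot, allocated with n */
};

/* Size of a type rounded up to whole cells. */
#define DEMO_SIZE_IN_CELLS(type) \
    ((sizeof(type) + sizeof(cell) - 1) / sizeof(cell))

/* Cells needed for an instance with n fields (n >= 1): the struct already
 * contains one data slot, so only n - 1 extra cells are added. */
#define DEMO_SIZEOF_INSTANCE(n) \
    (DEMO_SIZE_IN_CELLS(struct demoInstance) + ((n) - 1))

/* Allocate a zeroed instance with room for n fields. */
static struct demoInstance *demo_new_instance(size_t n) {
    return calloc(DEMO_SIZEOF_INSTANCE(n), sizeof(cell));
}
```

Declaring `data[1]` and over-allocating is the classic pre-C99 idiom for variable-length structures; the header is laid out once and the fields simply continue past the declared end of the array.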

Page 13: Inside the KVM

Hash Tables

Three system hashtables
• ClassTable
  – Package/base to CLASS
• InternStringTable
  – char*/length to unique java.lang.String
• UTFStringTable
  – char*/length to unique instance

Page 14: Inside the KVM

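UTFStringTable and UString

[Figure: the UTFStringTable as a hash table with a bucket count and chained UString entries. Each entry has next, len, and key fields followed by its characters; the examples shown are “abcde....”, “java.lang”, and “Connector”.]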

Page 15: Inside the KVM

UString Creation

getUString(const char *string)

getUStringX(const char *string, int length)

strcmp(x, y) == 0 iff getUString(x) == getUString(y)
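The uniqueness property above (equal text if and only if equal pointer) is what any interning table provides. Below is a minimal sketch of such a table, assuming a chained hash table like the one in the previous figure; the names `demoUString`, `demo_get_ustring`, and `demo_get_ustring_x`, the hash function, and the key assignment are all hypothetical and not taken from the KVM sources.

```c
#include <stdlib.h>
#include <string.h>

#define DEMO_BUCKETS 32

struct demoUString {                     /* hypothetical: next / len / key / characters */
    struct demoUString *next;
    size_t len;
    unsigned short key;
    char text[];                         /* characters follow the header */
};

static struct demoUString *demo_table[DEMO_BUCKETS];
static unsigned short demo_next_key = 1;

/* Return the unique entry for the first len bytes of s, creating it if needed. */
static const struct demoUString *demo_get_ustring_x(const char *s, size_t len) {
    unsigned hash = 0;
    struct demoUString **bucket, *u;
    size_t i;

    for (i = 0; i < len; i++)
        hash = hash * 37u + (unsigned char)s[i];
    bucket = &demo_table[hash % DEMO_BUCKETS];

    /* Reuse an existing entry with identical contents. */
    for (u = *bucket; u != NULL; u = u->next)
        if (u->len == len && memcmp(u->text, s, len) == 0)
            return u;

    /* Otherwise add a new entry at the head of the chain. */
    u = malloc(sizeof *u + len + 1);
    if (u == NULL)
        return NULL;
    memcpy(u->text, s, len);
    u->text[len] = '\0';
    u->len = len;
    u->key = demo_next_key++;
    u->next = *bucket;
    *bucket = u;
    return u;
}

static const struct demoUString *demo_get_ustring(const char *s) {
    return demo_get_ustring_x(s, strlen(s));
}
```

Once every name goes through the table, comparing two names is a pointer (or key) comparison rather than a `strcmp`.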

Page 16: Inside the KVM

ClassTable and Classes

[Figure: ClassTable buckets chaining class structures. Two of the boxes show the fields ofClass, monitor, base name, package name, next, key, superclass, field table, method table, constant pool, and interfaces. A third box shows ofClass, monitor, base name, package name, next, primitiveType, itemSize, and GC type. The table itself is labelled with a count.]

Page 17: Inside the KVM

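Interned String Table

[Figure: InternStringTable entries (next, string) point to java.lang.String instances with the fields monitor, charArray, offset, and length. Each charArray refers to an array of class [C with monitor and length fields followed by the characters a, b, c, d.]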

Page 18: Inside the KVM

Key Spaces

• Classes (and field types)
  – Raw classes
  – Array classes

• Names
  – Field and method names
  – Class packages
  – Class base names

• Method signatures

Page 19: Inside the KVM

Class Keys

• 0 - 0xff: primitive type
  – Integer = ‘I’, boolean = ‘Z’, etc.

• 0x100 - 0x1fff: instance class
  – Find the item in the class table

• 0x2000 - 0xdfff: array class, dim <= 6
  – High three bits give the dimension

• 0xe000 - 0xffff: array class, dim >= 7
  – Look up low 13 bits in the class table
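The four key ranges above can be decoded with a few comparisons and shifts. This is a minimal sketch, assuming 16-bit keys whose top three bits hold the dimension field and whose low 13 bits hold the base key; the enum and function names are hypothetical.

```c
enum demoKeyKind {
    DEMO_PRIMITIVE,        /* 0x0000 - 0x00ff */
    DEMO_INSTANCE_CLASS,   /* 0x0100 - 0x1fff */
    DEMO_ARRAY_SMALL,      /* 0x2000 - 0xdfff: dimension 1-6 stored in the key */
    DEMO_ARRAY_BIG         /* 0xe000 - 0xffff: dimension >= 7, needs a table lookup */
};

/* Classify a 16-bit class key by range. */
static enum demoKeyKind demo_key_kind(unsigned key) {
    if (key <= 0x00ff) return DEMO_PRIMITIVE;
    if (key <= 0x1fff) return DEMO_INSTANCE_CLASS;
    if (key <= 0xdfff) return DEMO_ARRAY_SMALL;
    return DEMO_ARRAY_BIG;
}

/* High three bits: 0 = not an array, 1-6 = dimension, 7 = look it up. */
static unsigned demo_key_dimension_bits(unsigned key) {
    return (key >> 13) & 0x7;
}

/* Low 13 bits: the primitive letter, instance class key, or table index. */
static unsigned demo_key_base(unsigned key) {
    return key & 0x1fff;
}
```

For example, the key 0x2042 decodes to a one-dimensional array whose base is ‘B’ (0x42), which matches the `byte[]` argument in the signature encoding example later in the talk.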

Page 20: Inside the KVM


Name Keys

1. Look up name in UTFStringTable

2. Get the key

getUTFString(name)->key

Page 21: Inside the KVM

Method Signature

1. Encode signature into a string
  – Algorithm
    • 1 byte gives number of arguments
    • 1-3 bytes for each argument
    • 1-3 bytes for the return value
  – Encoding
    • Primitive types encoded as ‘A’ - ‘Z’
    • Non-primitives encoded as high/low unless high in range ‘A’ - ‘Z’, then use ‘L’
2. getUTFStringX(str, len)->key

Page 22: Inside the KVM

Signature Encoding

class com.sun.cldc.io.j2me.datagram.Protocol {
    Datagram newDatagram(byte[], int, String);
}

([BILjava/lang/String;)Ljavax/microedition/io/Datagram;

3 0x20 ‘B’ ‘I’ 0x01 0x19 0x01 0x70

[Figure labels: 0x20 ‘B’ is byte[]; 0x01 0x19 is java.lang.String; 0x01 0x70 is javax.microedition.io.Datagram]
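The byte string in this example can be reproduced from the class keys with a short encoder. This is a minimal sketch of the scheme as the slides describe it, not the KVM routine: the function names are hypothetical, the class keys 0x0119 and 0x0170 are simply read off the example, and placing the ‘L’ escape byte before the high/low pair is an assumption.

```c
#include <stddef.h>

/* Append one type key to out; returns the number of bytes written (1-3). */
static size_t demo_put_key(unsigned char *out, unsigned key) {
    unsigned high = (key >> 8) & 0xff;
    unsigned low = key & 0xff;
    size_t n = 0;

    if (key <= 0xff) {                 /* primitive type: a single letter */
        out[0] = (unsigned char)key;
        return 1;
    }
    if (high >= 'A' && high <= 'Z')    /* high byte would look like a primitive */
        out[n++] = 'L';                /* assumed position of the escape byte */
    out[n++] = (unsigned char)high;
    out[n++] = (unsigned char)low;
    return n;
}

/* Encode argument count, argument keys, and return key; returns total length. */
static size_t demo_encode_signature(unsigned char *out,
                                    const unsigned *args, unsigned argc,
                                    unsigned ret) {
    size_t n = 0;
    unsigned i;

    out[n++] = (unsigned char)argc;
    for (i = 0; i < argc; i++)
        n += demo_put_key(out + n, args[i]);
    n += demo_put_key(out + n, ret);
    return n;
}
```

Calling `demo_encode_signature` with the argument keys 0x2042, ‘I’, 0x0119 and the return key 0x0170 yields the eight bytes shown above. The caller must supply a buffer of at least 1 + 3 * (argc + 1) bytes.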

Page 23: Inside the KVM

Why Keys?

• Saves space for C strings
• Makes comparisons faster
• Much less space
But. . . .
• Makes debugging more complicated
• Debug functions exist to convert keys to strings, and vice versa

Page 24: Inside the KVM

Memory Model

• Flat address space
• 32 kB to 64 MB (but 2 MB, realistically)
• Special code for the Palm
• Every heap-allocated object has a header
• No explicit use of malloc() or free()

Page 25: Inside the KVM

Object Header

[Figure: the 32-bit header word, with the fields Type, Size (in words, 24 bits), Reserved, and Marked]
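A header word like this is usually handled with shift and mask macros. The sketch below assumes one particular bit layout (size in bits 8-31, a 6-bit type in bits 2-7, the reserved bit at bit 1, the mark bit at bit 0); the slide does not state the bit positions, so both the layout and the `DEMO_`/`demo_` names are hypothetical.

```c
#include <stdint.h>

/* Assumed layout, not confirmed by the slides:
 * bits 8-31 size in words, bits 2-7 type, bit 1 reserved, bit 0 mark. */
#define DEMO_MARK_BIT    0x00000001u
#define DEMO_TYPE_MASK   0x000000FCu
#define DEMO_TYPE_SHIFT  2
#define DEMO_SIZE_SHIFT  8

/* Build an unmarked header for an object of the given size and type. */
static uint32_t demo_make_header(uint32_t size_in_words, uint32_t type) {
    return (size_in_words << DEMO_SIZE_SHIFT)
         | ((type << DEMO_TYPE_SHIFT) & DEMO_TYPE_MASK);
}

static uint32_t demo_header_size(uint32_t header) {
    return header >> DEMO_SIZE_SHIFT;
}

static uint32_t demo_header_type(uint32_t header) {
    return (header & DEMO_TYPE_MASK) >> DEMO_TYPE_SHIFT;
}

static int demo_header_is_marked(uint32_t header) {
    return (header & DEMO_MARK_BIT) != 0;
}
```

A 24-bit size field measured in 32-bit words covers 64 MB, which agrees with the upper heap limit on the Memory Model slide, and a 6-bit type field leaves room for 64 type codes.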

Page 26: Inside the KVM

Header Types

• Objects that are visible to the user
  – GCT_INSTANCE
  – GCT_ARRAY
  – GCT_OBJECTARRAY
  – GCT_INSTANCE_CLASS
  – GCT_ARRAY_CLASS

Page 27: Inside the KVM

More Header Types

• Internal objects allocated in heap
  – GCT_FIELDTABLE
  – GCT_METHODTABLE
  – GCT_MONITOR
  – GCT_GLOBAL_ROOTS
  – GCT_POINTERLIST
  – GCT_FREE
  – GCT_HASHTABLE

• 20 in total; we have space for 64

Page 28: Inside the KVM

Memory Layout

[Figure: heap memory layout showing FreePointer and several Free chunks]

Page 29: Inside the KVM

Creating Roots

• Global roots (permanent)
• Temporary roots (stack discipline)
• Transient roots (non-stack discipline)

Page 30: Inside the KVM

Global Roots

• Created using makeGlobalRoot
• Cannot be undone
• Native code can create new roots

makeGlobalRoot(&globalVariable)

Page 31: Inside the KVM

Temporary Roots

• Roots used in a stack-like manner
• Can be nested

START_TEMPORARY_ROOTS
    MAKE_TEMPORARY_ROOT(x)
    ...
    MAKE_TEMPORARY_ROOT(y)
    ...
END_TEMPORARY_ROOTS

START_TEMPORARY_ROOT(x)
    code
END_TEMPORARY_ROOT
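One way to get this stack discipline is for the start macro to remember the current length of a root array and for the end macro to restore it. The following is a minimal sketch under that assumption; the `DEMO_` macros and the root array are hypothetical and may differ from the real KVM definitions, and the sketch does no overflow checking.

```c
#define DEMO_MAX_TEMP_ROOTS 16

/* Addresses of local variables that the collector should treat as roots. */
static void **demo_temp_roots[DEMO_MAX_TEMP_ROOTS];
static int demo_temp_root_count = 0;

/* Open a scope and remember how many roots existed on entry. */
#define DEMO_START_TEMPORARY_ROOTS \
    { int demo_saved_count = demo_temp_root_count;

/* Register one variable as a root for the rest of the scope. */
#define DEMO_MAKE_TEMPORARY_ROOT(var) \
    demo_temp_roots[demo_temp_root_count++] = (void **)&(var);

/* Drop every root registered since the matching start, then close the scope. */
#define DEMO_END_TEMPORARY_ROOTS \
    demo_temp_root_count = demo_saved_count; }

/* Example: two nested scopes; returns the root count seen in the inner one. */
static int demo_nested_depth(void) {
    void *x = 0, *y = 0;
    int depth = 0;

    DEMO_START_TEMPORARY_ROOTS
        DEMO_MAKE_TEMPORARY_ROOT(x)
        DEMO_START_TEMPORARY_ROOTS
            DEMO_MAKE_TEMPORARY_ROOT(y)
            depth = demo_temp_root_count;
        DEMO_END_TEMPORARY_ROOTS
    DEMO_END_TEMPORARY_ROOTS
    return depth;
}
```

Because the end macro restores a saved count rather than popping one entry, a scope can register any number of roots and nesting works without extra bookkeeping. Leaving a scope early with `return` or `goto` would skip the restore, which is the usual hazard of this pattern.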

Page 32: Inside the KVM

Transient Roots

• Non-stack-like behavior
• Special handling of NULL

i = makeTransientRoot(x)
makeTransientRoot(y)
. . . . .
removeTransientRootByIndex(i);
or
removeTransientRootByValue(y);

Page 33: Inside the KVM


Allocating Objects

mallocBytes()

mallocHeapObject(size, type)

mallocObject(size, type)

callocObject(size, type)

instantiate(instance_class)

instantiateArray(arrayclass, count)

instantiateMultiArray(class, dims, count)

instantiateString(string, length)

Page 34: Inside the KVM


Garbage Collecting

• Can happen any time an object is allocated

• EXCESSIVE_GARBAGE_COLLECTION

Page 35: Inside the KVM

KVM Design

• Garbage collector
  – Based on a non-copying collector
  – Small and simple
  – Non-moving, non-incremental, single-space
  – Mark-and-sweep algorithm
  – Optimized for small heaps (32 - 512 kB)
  – Designed to limit recursion

• Will have an alternative, more advanced collector later
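A non-moving mark-and-sweep collector can avoid recursion by keeping an explicit work list during the mark phase. The sketch below illustrates that idea on a toy heap of fixed-size objects; it is not the KVM collector, and the heap representation, the `demoObject` type, and the function names are all hypothetical.

```c
#define DEMO_HEAP_OBJECTS 8
#define DEMO_MAX_REFS 2

struct demoObject {
    int in_use;                  /* slot currently holds a live allocation */
    int marked;                  /* set during the mark phase */
    int refs[DEMO_MAX_REFS];     /* heap indices of referenced objects, -1 = none */
};

static struct demoObject demo_heap[DEMO_HEAP_OBJECTS];

/* Place an object in the given slot with up to two outgoing references. */
static void demo_set_object(int index, int ref0, int ref1) {
    demo_heap[index].in_use = 1;
    demo_heap[index].marked = 0;
    demo_heap[index].refs[0] = ref0;
    demo_heap[index].refs[1] = ref1;
}

/* Mark an object and queue it for scanning, unless already done. */
static void demo_mark(int index, int *stack, int *top) {
    if (index >= 0 && demo_heap[index].in_use && !demo_heap[index].marked) {
        demo_heap[index].marked = 1;
        stack[(*top)++] = index;
    }
}

/* Mark everything reachable from the roots, then free the rest in place.
 * Returns the number of objects freed. */
static int demo_collect(const int *roots, int root_count) {
    int stack[DEMO_HEAP_OBJECTS];   /* each object is pushed at most once */
    int top = 0, freed = 0, i, j;

    for (i = 0; i < root_count; i++)
        demo_mark(roots[i], stack, &top);

    while (top > 0) {               /* iterative marking, no recursion */
        struct demoObject *obj = &demo_heap[stack[--top]];
        for (j = 0; j < DEMO_MAX_REFS; j++)
            demo_mark(obj->refs[j], stack, &top);
    }

    for (i = 0; i < DEMO_HEAP_OBJECTS; i++) {   /* sweep: objects never move */
        if (demo_heap[i].in_use && !demo_heap[i].marked) {
            demo_heap[i].in_use = 0;
            freed++;
        }
        demo_heap[i].marked = 0;    /* clear marks for the next cycle */
    }
    return freed;
}
```

Since an object is marked before it is pushed, the work list never needs more entries than there are objects, so its size is known in advance, which matters on devices with very small C stacks.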

Page 36: Inside the KVM

Non-Compacting Garbage Collectors

• Advantages
  – Does not move objects => simple and clean codebase
  – Single space => uses less memory
• Disadvantages
  – Object allocation not so fast
  – Memory fragmentation can cause it to run out of heap
  – Non-incremental => long GC pauses when using large heaps

Page 37: Inside the KVM

Interpreter

Straightforward bytecode interpreter with five VM registers:
  UP  thread pointer (current thread)
  IP  instruction pointer
  SP  stack pointer (top of stack)
  FP  frame pointer (current frame)
  LP  locals pointer (locals of the current frame)

[Figure: the Runnable Threads queue and CurrentThread; SP, FP, and LP point into the Java stack of the current thread, and IP is the instruction pointer.]
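The role of the IP, SP, and LP registers can be seen in a toy dispatch loop. This is a minimal sketch handling five real JVM opcodes (`bipush`, `iload_0`, `iload_1`, `iadd`, `ireturn`) for a single frame; the function name, the fixed 16-slot operand stack, and the error convention are hypothetical, and the real interpreter also maintains FP and UP and handles the full instruction set.

```c
/* Opcode values as defined by the Java virtual machine specification. */
enum {
    DEMO_BIPUSH  = 0x10,
    DEMO_ILOAD_0 = 0x1a,
    DEMO_ILOAD_1 = 0x1b,
    DEMO_IADD    = 0x60,
    DEMO_IRETURN = 0xac
};

/* Interpret one method body; returns the value passed to ireturn,
 * or -1 if an opcode outside this sketch is encountered. */
static int demo_interpret(const unsigned char *code, int *locals) {
    int stack[16];
    const unsigned char *ip = code;   /* IP: next bytecode to execute */
    int *sp = stack;                  /* SP: next free operand stack slot */
    int *lp = locals;                 /* LP: locals of the current frame */

    for (;;) {
        switch (*ip++) {
        case DEMO_BIPUSH:             /* push the signed byte operand */
            *sp++ = (signed char)*ip++;
            break;
        case DEMO_ILOAD_0:
            *sp++ = lp[0];
            break;
        case DEMO_ILOAD_1:
            *sp++ = lp[1];
            break;
        case DEMO_IADD:               /* replace the top two values by their sum */
            sp--;
            sp[-1] += sp[0];
            break;
        case DEMO_IRETURN:
            return *--sp;
        default:
            return -1;
        }
    }
}
```

Running the byte sequence `iload_0, iload_1, iadd, bipush 10, iadd, ireturn` with locals 2 and 3 returns 15.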

Page 38: Inside the KVM

Optimizations

• Space optimizations
  – System class preloading (“romizing”) using JavaCodeCompact
  – Runtime “simonizing” of immutable structures
  – Chunky stacks and segmented heap

• Performance optimizations
  – Quick byte codes
  – Monomorphic inline caching

• Optimizations configurable via conditional compilation, same as with debugging support

Page 39: Inside the KVM

Class Loading

• Early interning of Strings
• Early binding of Classes to Class structure
• Removal of all UTF Strings from constant pool

Page 40: Inside the KVM

Stack Frames

LP (parameters + local variables)
    ‘this’
    … params …
    … locals …
FP (frame pointer)
    constPool      Pointer to the current constant pool
    thisMethod
    returnCode     What to do upon returning from method
    previousFp     Pointer to previous stack frame
    previousIp     Previous instruction pointer
    syncObject     Pointer to locked object (if synchronized call)
    locals         Pointer to local variables inside the frame
    … operands …   Operand stack starts here
SP (top of Java stack)

Page 41: Inside the KVM

Thread Design

• KVM supports platform-independent multithreading (“green threads”)
  – Fully deterministic; no complex mutual exclusion issues
  – This makes porting the KVM to new consumer devices easier and faster
  – Substantially more portable and much cleaner code base than with native threads
  – All active threads kept in a simple linked queue, and given execution time based on priority
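A linked queue with priority-based time slices needs very little code. The sketch below assumes a circular singly linked queue and a time slice proportional to the thread priority; the names, the `DEMO_BASE_TIMESLICE` constant, and the proportional rule are hypothetical, and bytecode execution is simulated by a counter.

```c
#include <stddef.h>

#define DEMO_BASE_TIMESLICE 100          /* assumed bytecodes per priority unit */

struct demoThread {
    struct demoThread *next;
    int priority;                        /* 1..10, as in java.lang.Thread */
    long executed;                       /* bytecodes run so far (simulated) */
};

/* Circular queue: the tail's next pointer is the head. */
static struct demoThread *demo_queue_tail = NULL;

/* Append a thread at the end of the runnable queue. */
static void demo_enqueue(struct demoThread *t) {
    if (demo_queue_tail == NULL) {
        t->next = t;
    } else {
        t->next = demo_queue_tail->next;
        demo_queue_tail->next = t;
    }
    demo_queue_tail = t;
}

/* Remove and return the thread at the head of the queue, or NULL. */
static struct demoThread *demo_dequeue(void) {
    struct demoThread *head;

    if (demo_queue_tail == NULL)
        return NULL;
    head = demo_queue_tail->next;
    if (head == demo_queue_tail)
        demo_queue_tail = NULL;
    else
        demo_queue_tail->next = head->next;
    head->next = NULL;
    return head;
}

/* Perform a number of thread switches, round robin. */
static void demo_run(int switches) {
    while (switches-- > 0) {
        struct demoThread *t = demo_dequeue();
        if (t == NULL)
            return;
        t->executed += (long)t->priority * DEMO_BASE_TIMESLICE;
        demo_enqueue(t);
    }
}
```

Because switching happens only at points the interpreter chooses, two runs with the same inputs interleave threads identically, which is the determinism the slide refers to.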

Page 42: Inside the KVM

Threads and Monitors

[Figure: a Monitor with its Waiters, Condvars, and Owner, together with the Timer Queue]

Page 43: Inside the KVM

File Loading

• Loaderfile.c
  – Generic code for reading from classpath
  – Reads classes from directories or zip files
  – Supports “resources” from the classpath
  – Uses standard IO
  – Zip files read using standard IO

Page 44: Inside the KVM

Generic File Loading

struct filePointerStruct;
typedef struct filePointerStruct *FILEPOINTER;

FILEPOINTER openClassfile(const char* className);
loadByte(FILEPOINTER);
loadShort(FILEPOINTER);
loadCell(FILEPOINTER);
loadBytes(FILEPOINTER, char *buffer, int len);
skipBytes(FILEPOINTER, unsigned int len);
closeClassfile(FILEPOINTER);

initializeClassPath()
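Class files store multi-byte values in big-endian order, so loaders of this shape assemble shorts and 32-bit cells from individual bytes. The sketch below shows that over an in-memory buffer; `demoFilePointer` and the `demo_load_*` functions are hypothetical stand-ins and do not reproduce the KVM implementation, which reads from directories or zip files.

```c
#include <stddef.h>

struct demoFilePointer {                 /* hypothetical in-memory byte source */
    const unsigned char *data;
    size_t length;
    size_t position;
};

/* Next byte as 0..255, or -1 once the data is exhausted. */
static int demo_load_byte(struct demoFilePointer *fp) {
    if (fp->position >= fp->length)
        return -1;
    return fp->data[fp->position++];
}

/* 16-bit big-endian value. The caller must ensure two bytes remain. */
static unsigned demo_load_short(struct demoFilePointer *fp) {
    unsigned high = (unsigned)demo_load_byte(fp) & 0xffu;
    unsigned low = (unsigned)demo_load_byte(fp) & 0xffu;
    return (high << 8) | low;
}

/* 32-bit big-endian value. The caller must ensure four bytes remain. */
static unsigned long demo_load_cell(struct demoFilePointer *fp) {
    unsigned long high = demo_load_short(fp);
    unsigned long low = demo_load_short(fp);
    return (high << 16) | low;
}
```

Reading the first eight bytes of a class file this way gives the magic number 0xCAFEBABE followed by the minor and major version numbers. Assembling values byte by byte keeps the loader independent of the host's endianness and alignment rules.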

Page 45: Inside the KVM


Reading Zip Files

• findJARdirectories()

• loadJARfile()

• inflate()

Page 46: Inside the KVM

Zip Inflater

• Old implementation
• Two new implementations
  – FILE* containing compressed bytes
  – char* containing compressed bytes
• Inflater can be used outside the KVM

Page 47: Inside the KVM

64-Bit Support

• #define COMPILER_SUPPORTS_LONGS 1
  – long64, ulong64
  – NEED_LONG_ALIGNMENT
  – NEED_DOUBLE_ALIGNMENT

• #define COMPILER_SUPPORTS_LONGS 0
  – ll_mul, ll_div, ll_rem
  – ll_shl, ll_shr, ll_ushr
  – BIG_ENDIAN, LITTLE_ENDIAN
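When the compiler has no 64-bit integer type, a long can be represented as two 32-bit halves and the shift helpers written by hand. The following is a minimal sketch of a left shift and an unsigned right shift on such a pair; the `demo_long64` type and the `demo_ll_*` names are hypothetical and only illustrate the kind of helper a port would have to supply.

```c
#include <stdint.h>

typedef struct {
    uint32_t high;                       /* most significant 32 bits */
    uint32_t low;                        /* least significant 32 bits */
} demo_long64;

/* Left shift; Java uses only the low six bits of the shift count. */
static demo_long64 demo_ll_shl(demo_long64 a, int count) {
    demo_long64 r;

    count &= 63;
    if (count == 0)
        return a;                        /* avoids shifting a word by 32 below */
    if (count >= 32) {
        r.high = a.low << (count - 32);
        r.low = 0;
    } else {
        r.high = (a.high << count) | (a.low >> (32 - count));
        r.low = a.low << count;
    }
    return r;
}

/* Unsigned (logical) right shift. */
static demo_long64 demo_ll_ushr(demo_long64 a, int count) {
    demo_long64 r;

    count &= 63;
    if (count == 0)
        return a;
    if (count >= 32) {
        r.low = a.high >> (count - 32);
        r.high = 0;
    } else {
        r.low = (a.low >> count) | (a.high << (32 - count));
        r.high = a.high >> count;
    }
    return r;
}
```

The endianness setting matters in this configuration because the VM has to know which of the two words of a long comes first in memory.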

Page 48: Inside the KVM

Security

• Cannot support full J2SE™ platform security model in CLDC target devices:
  – J2SE platform security model is much larger than the entire CLDC implementation
• CLDC security discussion consists of two parts:
  1. Low-level virtual machine security
    • An application running in the VM must not be able to harm the device in any way
    • Guaranteed by the bytecode verifier (discussed later)
  2. Application-level security
    • Sandbox model

Page 49: Inside the KVM

Security

• CLDC sandbox model requires that:
  – Classfiles have been properly verified and guaranteed to be valid
  – Only a limited, predefined set of APIs is available to the application programmer (as defined by the CLDC, profiles and licensee open classes)
  – Downloading and management of applications takes place at the native code level, and the programmer cannot override the standard class loading mechanisms or the system classes of the virtual machine

Page 50: Inside the KVM

Security

• Sandbox requirements continued
  – The set of native functions accessible to the virtual machine is closed, meaning that the application programmer cannot download any new libraries containing native functionality, or access any native functions that are not part of the APIs provided by CLDC, profiles and licensee open classes

Page 51: Inside the KVM

Class File Verification

• Standard JVM class file verifier is too large for a typical CLDC target device
  – The size of the JVM verifier is larger than KVM itself
  – Dynamic memory consumption is excessive (>100 kb for typical applications)

• CLDC/KVM introduces a new, two-pass class file verifier

Page 52: Inside the KVM

Class File Verification

[Figure: on the development workstation, javac compiles MyApp.java into MyApp.class, and the preverifier turns that into a preverified MyApp.class. After download to the target device, the KVM runtime passes the class through the verifier and then the interpreter.]

Page 53: Inside the KVM

Class File Verification

• Features of the new verifier
  – Space-intensive processing (stack map generation, removal of JSR-RET byte codes) performed off-device
  – Results checked for correctness on device; cannot be spoofed
  – Code signing not required
  – On-device footprint ~10 kb; constant space (<100 bytes) at runtime; linear time (one pass, no recursion)
  – 5% overhead in class file size

Page 54: Inside the KVM

Old Verifier

• Theorem prover
  – Consistent set of values for the stack
  – Consistent set of values for each register
  – Consistent use of jsr/ret instruction

Page 55: Inside the KVM

Old Verifier

• Required no information outside the class
• Complex data flow analysis
• Can require multiple passes
• Can require large amount of space
• jsr and ret instructions cause massive difficulties

Page 56: Inside the KVM

New Verifier

• Theorem verifier
• Single pass through the code
• Very little space needed
• Theorem proving occurs off line
• May improve garbage collection in the future!

Page 57: Inside the KVM

Old vs. New Verifier

static void test(Long x) {
    Number y = x;
    while (y.intValue() != 0) {
        y = nextValue(y);
    }
    return;
}

Page 58: Inside the KVM

Old Verifier

Types before each instruction. “Long -> Number” marks an entry that starts as Long and is revised to Number on a later pass.

     instruction                       local 0  local 1         stack
 0.  aload_0                           Long                     <>
 1.  astore_1                          Long                     Long
 2.  goto 10                           Long     Long            <>
 5.  aload_1                           Long     Long -> Number  <>
 6.  invokestatic nextValue(Number)    Long     Long -> Number  Long -> Number
 9.  astore_1                          Long     Long -> Number  Number
10.  aload_1                           Long     Long -> Number  <>
11.  invokevirtual intValue()          Long     Long -> Number  Long -> Number
14.  ifne 5                            Long     Long -> Number  int
17.  return                            Long     Long -> Number  <>

Done!

Page 59: Inside the KVM

New Verifier

Types before each instruction.

     instruction                       local 0  local 1  stack
 0.  aload_0                           Long              <>
 1.  astore_1                          Long              Long
 2.  goto 10                           Long     Long     <>
 5.  aload_1                           Long     Number   <>
 6.  invokestatic nextValue(Number)    Long     Number   Number
 9.  astore_1                          Long     Number   Number
10.  aload_1                           Long     Number   <>
11.  invokevirtual intValue()          Long     Number   Number
14.  ifne 5                            Long     Number   int
17.  return                            Long     Number   <>

Done!

Page 60: Inside the KVM

Native Methods

void <JNI name for function>(void) {
    pop arguments off stack;
    pop <this> off stack (if instance);
    perform calculations;
    push result (if any) onto stack;
    return;
}
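The pop, compute, push pattern above looks like this for a concrete method. This is a minimal sketch of a static native `int max(int a, int b)`, assuming a simple array-backed operand stack; `demo_stack`, `DEMO_PUSH`, `DEMO_POP`, and the function name are hypothetical and are not the KVM stack macros.

```c
#define DEMO_STACK_SIZE 32

static long demo_stack[DEMO_STACK_SIZE];
static long *demo_sp = demo_stack;       /* next free operand stack slot */

#define DEMO_PUSH(value) (*demo_sp++ = (long)(value))
#define DEMO_POP()       (*--demo_sp)

/* Hypothetical native implementation of: static native int max(int a, int b);
 * It takes no C parameters and returns nothing; all data moves via the stack. */
static void demo_Java_Demo_max(void) {
    long b = DEMO_POP();                 /* last argument is on top */
    long a = DEMO_POP();
    /* static method: there is no 'this' to pop */
    DEMO_PUSH(a > b ? a : b);            /* the result replaces the arguments */
}
```

Arguments come off in reverse order, and the function must pop exactly as many values as the method declares, otherwise the caller's operand stack is left corrupted.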

Page 61: Inside the KVM

Native Function Table

• Automatically generated
  – Romized build
  – Non-romized build
• Full documentation in porting guide
• Catches missing implementations
• No support for runtime loading of object code

Page 62: Inside the KVM

JNI Caveats

• Object allocation functions
• Using the right push/pop macro
  – Don’t use type coercion!
• Popping the right number of values
• Garbage collection issues

Page 63: Inside the KVM

Asynchronous JNI

• Readers and writers
• Two implementations
  – Call back with continuation
  – Start a separate thread to perform the task
• Allows KVM to use the most wanted features of OS-level threads, without the hassle

Page 64: Inside the KVM

Event Handling

• Synchronous notification
• Polling in Java programming language code
• Polling in the interpreter
• Asynchronous notification

Page 65: Inside the KVM


Romizer

• Based on romizer used in PersonalJava™ API implementation

• Generates data structures that JVM would have generated

• Classes are “loaded” but not initialized

Page 66: Inside the KVM

Palm Romizer

• Problems
  – 64K limit per “resource”
  – Resources are relocatable
  – Native code is independently relocatable
• Solutions
  – Multiple small resources
  – Expandable resources
  – Runtime relocation
  – “Reset” checking

Page 67: Inside the KVM

Restartability

• Required for embedded systems
• Romizer issues
• Global variable issues
• Other issues

Page 68: Inside the KVM

Starting a Port

• machine_md.h
• runtime_md.c
• main.c

Page 69: Inside the KVM

Porting: Functions

• AlertUser()
• allocateHeap(), freeHeap()
• InitializeNativeCode(), FinalizeNativeCode()
• InitializeVM(), FinalizeVM()
• CurrentTime_md()
• ulong64, long64
• BIG_ENDIAN or LITTLE_ENDIAN

Page 70: Inside the KVM

Porting: More Functions

• C library calls
• Minimal support for stdio
• Native support for any generic connections needed by your port
• Asynchronous support

Page 71: Inside the KVM

Port-Specific Changes

• Where class files are located
• Where memory is located
• What to do in case of severe errors
• Window system startup/bringdown
• Startup linking

Page 72: Inside the KVM

Compiler Flags

• Tracing flags
• Floating point
• Alignment issues
• Romizing
• Generic networking
• Generic storage

Page 73: Inside the KVM

Directory Structure

• api
• bin
• build
• doc
• jam
• samples
• tools
• kvm

Page 74: Inside the KVM

• Subdirectories of tools/
  – tools/jcc
  – tools/palm

• Subdirectories of kvm/
  – kvm/VMCommon/h, kvm/VMCommon/src
  – kvm/VMExtra/h, kvm/VMExtra/src
  – kvm/VM<port>/h, kvm/VM<port>/src, kvm/VM<port>/build

Page 75: Inside the KVM

Porting

• Create machine_md.h
• Create runtime_md.c
• Determine values of compiler flags

Page 76: Inside the KVM

Application Representation

• Public representation of applications
  – Whenever applications intended for a CLDC device are stored publicly, compressed JAR format must be used
  – Class files must have been preverified and contain the preverification information

• At the implementation or network transport level
  – Alternative formats can be used, as long as the observable semantics of applications remain the same as with standard class files

Page 77: Inside the KVM

Java™ Application Manager

• KVM/CLDC includes an optional component known as Java™ application manager
  – Helps in integrating KVM with a microbrowser
  – Allows the downloading of applications from the internet via HTTP
  – Allows the management (installing, launching, deletion) of applications on devices without a file system
  – Intended to facilitate porting efforts
