new approaches to mobile code: reconciling execution efficiency with provable security michael franz...

New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security

Michael Franz

University of California at Irvine

UC Irvine – project transprose: transporting programs securely

Technical Objective (1)

design the third line of defense in a mobile-code system

first line of defense: access control (physical, logical)

second line of defense: authentication

intrusion

false authentication

malicious mobile program

new third line of defense

prevent execution unless provably secure

Technical Objective (2)

make this “third line of defense” a pervasive property of every computer system, not just a luxury good afforded by only a few expensive ultra-secure high-end installations

rather than simply demonstrating the viability of mobile-code security, also make it practical across a wide spectrum of applications

in this context, practical means scalable to large applications, with excellent final code quality, at reasonable just-in-time compilation speed and cost

Existing Practice: Java

“Java” is the de-facto standard format for distributing mobile programs

when we speak of “distributing mobile programs using Java”, we in fact usually mean “using the Java Virtual Machine”

the JVM has an instruction set that has been designed specifically for representing Java programs

– interestingly enough, there still are JVM programs for which no legal equivalent Java source program exists

Existing Practice: Java Security

although the Java programming environment is type-safe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit

class MyLibrary { public void NoSecret(); private void ASecret();}

class MyClient { import MyLibrary; {...MyLibrary.NoSecret();…}}

call MyLibrary.NoSecret ...

JVM-code stream



call MyLibrary.ASecret ...

corrupted JVM-code stream


Java’s byte-code security model requires time-consuming static verification and/or dynamic checking while the code is being executed

systematic study of security issues is still in its infancy

IF ... THEN ... ELSE MyLibrary.ASecret()
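The re-check on arrival described above can be sketched as follows. This is a minimal illustration, not the actual JVM verifier (which also checks types, stack discipline, and much more). The class `AccessVerifier`, the visibility table `IS_PUBLIC`, and the representation of a code stream as a list of call-target names are all assumptions made for this sketch; only the names `MyLibrary.NoSecret` and `MyLibrary.ASecret` come from the slides.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the real JVM verifier: re-checks on arrival that every
// call in a received code stream targets a method the client is allowed to call.
final class AccessVerifier {
    // Declared visibility of each library method, as known to the host (assumed table).
    static final Map<String, Boolean> IS_PUBLIC = Map.of(
            "MyLibrary.NoSecret", true,
            "MyLibrary.ASecret", false);

    // Returns true only if every call target exists and is public.
    static boolean verify(List<String> callTargets) {
        for (String target : callTargets) {
            Boolean isPublic = IS_PUBLIC.get(target);
            if (isPublic == null || !isPublic) {
                return false; // unknown or private target: reject the whole stream
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(verify(List.of("MyLibrary.NoSecret"))); // intact stream: true
        System.out.println(verify(List.of("MyLibrary.ASecret")));  // corrupted stream: false
    }
}
```

The source compiler would never emit the second stream, but the receiving host cannot know that, which is why the check has to be repeated on the consumer's side.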

Existing Practice: JVM Performance

upon arrival at a target machine, most JVM code is translated into the appropriate native code “just-in-time”

performance resulting from “just-in-time” compilation is not competitive with off-line compilers

– compilation systems such as Sun’s HotSpot are incredibly complex and haven’t delivered on their promise

JVM approach is unlikely to scale to large programs requiring top-level performance

Raising JVM Performance

raising the performance of JVM-code has been addressed by “annotating” the byte-code stream with compiler back-end related information

“annotated” class-files run much faster if an annotation-aware byte-code compiler is available on the target platform

security is lost: the “annotations” are not optional to the annotation-aware compiler; if an adversary falsifies them, the compiler will create a program that may be unsafe!
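A small sketch of why a falsified annotation is dangerous, assuming a hypothetical consumer that skips its range check whenever the code stream is annotated as "index known to be in range". The flat `MEMORY` array, the `load` method, and the secret value are invented for illustration; real annotation formats carry different information.

```java
// Hypothetical sketch: a code consumer that trusts a "no range check needed" annotation.
final class AnnotationTrust {
    // Simulated flat memory: cells 0..3 belong to the mobile program's array,
    // cell 4 holds a host secret that the program must never read.
    static final int[] MEMORY = {10, 20, 30, 40, 9999};
    static final int ARRAY_LENGTH = 4;

    // Loads element `index`; the range check is skipped when the annotation says so.
    static int load(int index, boolean annotatedInRange) {
        if (!annotatedInRange && (index < 0 || index >= ARRAY_LENGTH)) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return MEMORY[index]; // no check against ARRAY_LENGTH if the annotation was trusted
    }

    public static void main(String[] args) {
        System.out.println(load(2, true)); // honest annotation: prints 30
        System.out.println(load(4, true)); // falsified annotation: prints 9999, the host secret
    }
}
```

With the annotation absent, `load(4, false)` is rejected by the range check; the leak only occurs because the consumer treats the annotation as fact rather than as a claim to be verified.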

Emerging Practice: PCC

ship a native program along with a “proof” that it doesn’t violate a given security policy

although more general security policies are imaginable, current PCC systems essentially use type safety (and concomitant memory safety) as their security policy (our approach does the same)

PCC drastically reduces the size of the trusted computing base

Emerging Practice: PCC - Problems

PCC is based on native code

– (otherwise the trusted computing base would become larger again, defeating the main advantage of PCC)

PCC has the performance advantages of fully optimized code, but requires multiple versions for multiple platforms

also, in the long run, dynamically generated code (using feedback from dynamic profiling) will generally outperform native code

Our Technical Approach

study the interaction of security-related information, optimization-enhancing information, and compression, rather than considering them separately

– use syntax-directed compression as a means of obtaining guaranteed referential integrity

– transport compiler-related annotations to obtain top-level performance on the eventual target machine

– use a proof-based approach to guard the compiler-related annotations from falsification in transit

Our Technical Approach (2)

no single focus on security, code-quality, or encoding density, but attempt to study their interaction and make progress along all three dimensions

preliminary evidence suggests that these three topics are strongly interrelated and that representations based on adaptive compression of syntax trees are ideally suited for transporting mobile programs

this research is orthogonal and complementary to work on authentication and security policies

Our Policy Assumptions

type safety using the typing model of the source language

– all of the host’s library routines are guaranteed to be called with parameters of the correct types

– capabilities (object pointers) owned by the host can be manipulated by the mobile client application only as specified in the host’s interface definition (private, protected, …) and cannot be forged

type safety is guaranteed by our mobile code transportation scheme

Compression vs. Security

code compression and security may often be complementary

idea: choose an encoding that can express only legal programs

example:

int i, j, k, l;
float r, s, t, u;
{ i = j }

encoding steps:

operator: :=

first operand (1 out of 8): := i

second operand (1 out of 3 or 4!): := i j

higher-level encodings: enumerate all legal assignments = at most $2 \cdot \binom{n}{2}$ possibilities
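The idea of an encoding that can express only legal programs can be sketched for the assignment example above. The sketch makes several assumptions that are not in the slides: the class name `LegalAssignmentCodec`, the rule that only variables of the same declared type are legal sources, and that self-assignment is allowed (which gives 4 rather than 3 choices for the second operand).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of an encoding that can express only type-correct assignments.
final class LegalAssignmentCodec {
    static final String[] NAMES = {"i", "j", "k", "l", "r", "s", "t", "u"};
    static final String[] TYPES = {"int", "int", "int", "int", "float", "float", "float", "float"};

    // Variables that may appear on the right-hand side for a given target
    // (assumption: same declared type only, self-assignment allowed).
    static List<Integer> candidates(int target) {
        List<Integer> result = new ArrayList<>();
        for (int v = 0; v < NAMES.length; v++) {
            if (TYPES[v].equals(TYPES[target])) result.add(v);
        }
        return result;
    }

    // Encodes "target = source" as {target index (1 of 8), rank of source among candidates}.
    static int[] encode(int target, int source) {
        int rank = candidates(target).indexOf(source);
        if (rank < 0) throw new IllegalArgumentException("ill-typed assignment");
        return new int[] {target, rank};
    }

    // Decodes a symbol pair; both symbols are reduced into range, so every input
    // decodes to some legal assignment and none decodes to an ill-typed one.
    static String decode(int first, int second) {
        int target = Math.floorMod(first, NAMES.length);
        List<Integer> legal = candidates(target);
        int source = legal.get(Math.floorMod(second, legal.size()));
        return NAMES[target] + " = " + NAMES[source];
    }

    public static void main(String[] args) {
        int[] code = encode(0, 1);
        System.out.println(decode(code[0], code[1])); // prints "i = j"
    }
}
```

Because the second symbol only ranks the type-compatible candidates, it needs fewer bits than a free choice among all 8 variables, and a corrupted symbol can at worst select a different legal assignment. This is the sense in which compression and security pull in the same direction.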

Virtual Machines vs. Graphs

information is lost when compiling to the “flat” representation of virtual machines

many native code optimizations require this information to be re-discovered

[figure: virtual machine representation (flat code with relative branches such as “bra +2”) vs. graph-based representation]
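To illustrate the information loss, the sketch below flattens a small expression tree into stack-machine code. The classes `Var` and `BinOp`, the instruction names, and the restriction to `+` and `*` are assumptions chosen for brevity; they do not reproduce JVM bytecode or the project's graph format.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the same expression as a tree and as flat stack code.
final class FlatVsTree {
    interface Expr {}

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class BinOp implements Expr {
        final char op;
        final Expr left, right;
        BinOp(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    }

    // Post-order walk emitting stack-machine instructions (only '+' and '*' are handled).
    static void flatten(Expr e, List<String> out) {
        if (e instanceof Var) {
            out.add("load " + ((Var) e).name);
        } else {
            BinOp b = (BinOp) e;
            flatten(b.left, out);
            flatten(b.right, out);
            out.add(b.op == '+' ? "add" : "mul");
        }
    }

    public static void main(String[] args) {
        // a + (b * c)
        Expr tree = new BinOp('+', new Var("a"), new BinOp('*', new Var("b"), new Var("c")));
        List<String> code = new ArrayList<>();
        flatten(tree, code);
        System.out.println(code); // [load a, load b, load c, mul, add]
    }
}
```

In the tree, each operator points directly at its operands. In the flat list that relationship is implicit in the stack discipline, so an optimizing code generator at the consumer's site has to simulate the stack to recover it.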

Performance-Enhancing Information

compiler-related information intended for improving code quality re-introduces redundancy that can be exploited by an adversary

for example, a program can be encoded with guaranteed referential integrity using a grammar close to the semantics of the source language

but in order to allow optimizations, the grammar needs to be relaxed

the “holes” in the relaxed grammar need to be guarded by other means based on proof-carrying code concepts

General Approach Taken

use encoding-inherent security wherever possible (a well-formedness property of the encoding itself)

use proof-based security where necessary to support optimizations

– transporting results of alias analysis

– removing range or type checks

this approach applies regardless of the semantic level on which the program is being transported

but the correct choice of such a semantic level must also be considered!

Highest-Level Encoding

simple and easily understood security policy based on type-safety

ultra-compact representation using grammar-based compression

guaranteed referential integrity provided essentially “for free” by the encoding

– relatively small amount of proof-based security required only for additional performance-enhancing annotations

– e.g., exceptions, alias analysis, escape analysis, dynamic type safety

time required for dynamic compilation may be a problem

Project Workflow “High Level Thread”

[workflow diagram; recoverable labels only]

stages: Compression, Encoding, Proofs

components: arithmetic encoding, dictionary encoding, combination heuristics, P2K, Java abstract grammar, JAG, annotated JAG, efficient annotated JAG, theorem prover, (enhanced) static semantics, well-formedness

1. guarantee complete static semantics through encoding

2. compression of Java programs

3. reduced verification effort due to abstract grammar encoding

Lowest-Level Encoding

compiler-oriented intermediate representation

goal is to provide much better code quality with far less effort at the code consumer’s site

requires more proof-based security than the “high-level” approach, but still far less than the “original PCC idea” where the goal is to reduce the TCB

more voluminous transportation format

could be more difficult to reason about safety because further removed from the source language

Project Workflow “Low Level Thread”

[workflow diagram; recoverable labels only]

stages: Compression, Encoding, Proofs

components: SSA-directed encoding, annotation encoding, typed SSA, annotated typed SSA, secure annotated typed SSA, theorem prover

1. universal (source-language neutral) abstract syntax tree representation

2. UAST after performing all target-machine independent optimizations

3. encoding for the proofs required to guard the TASSA

4. provably secure target-machine independent low-level representation

Third Way: Core Calculus

two-stage mapping of the mobile code

– source constructs are mapped to the core calculus

– mapping may be transported as well, or assumed global shared knowledge

simple and easily understood security policy

only approach that is easily extensible even by third parties

not clear if this approach will yield adequate native code quality at the consumer’s site

the relative trade-offs are as of yet unknown

Current Status and Rationale

developed a comprehensive library of stream compressors in Java

“high-level” encoding prototype is up and running

– working on a contribution to PLDI 2001 on Java compression

“low-level” encoding and “core calculus” prototypes will be operational over the summer

the relative trade-offs (encoding density vs. decoding/dynamic compilation speed vs. code quality) can only be determined by collecting experience with actual prototypes

Quantitative Metrics

security

– publish complete design specification and rationale and open the design to public scrutiny and external validation

efficiency

– measure by comparing generated code quality with that of existing on-the-fly compilers

code density

– measure by comparing with competing proof-carrying code and mobile-code distribution formats

Expected Major Achievements

demonstrate that graph-based encoding formats are superior to virtual machines

explore the relative trade-offs that can only be determined by building an actual prototype

– encoding density/network transfer speed vs.

– decoding/dynamic compilation speed vs.

– code quality, especially when using the core-calculus approach

publish a design rationale that can form the basis of a subsequent standardization effort

Long-Term Impact

enable an educated choice of a replacement technology at the end of the Java Virtual Machine’s life-cycle

royalty-free and free of particular proprietary intellectual property claims

developed under the scrutiny of and in dialogue with the security community

Task Schedule

1999 2000 2001 2002

Y1 Milestones:
• source-level representation => Java compression
• low-level representation
• core calculus representation

Y2 Milestones:
• 3 system prototypes
• trade-off analysis
• encoding format comprehensive definition

End of Project:
• system deliverable
• comprehensive documentation

investigate:
• multiple source languages
• graph-based encoding schemes
• proof-carrying code

investigate:
• requirements of optimizing code generators
• integration of security vs. compiler-related data

investigate:
• mutual interaction of security, efficiency, and compression density
• security of system

Transition of Technology

the final design rationale document will provide enough detail that unrelated third parties will be able to replicate our code-transportation scheme(s)

our prototype implementation(s) will be made available in source form

the graduate students involved in this work are likely to transfer into the industrial sector

Thank You
