New Approaches to Mobile Code: Reconciling Execution Efficiency with Provable Security
Michael Franz
University of California at Irvine
UC Irvine – project transprose: transporting programs securely
Technical Objective (1)
design the third line of defense in a mobile-code system
first line of defense: access control (physical, logical)
second line of defense: authentication
threats shown in the diagram: intrusion, false authentication, malicious mobile program
new third line of defense
prevent execution unless provably secure
Technical Objective (2)
make this “third line of defense” a pervasive property of every computer system, not just a luxury good afforded by only a few expensive ultra-secure high-end installations
rather than simply demonstrating the viability of mobile-code security, also make it practical across a wide spectrum of applications
in this context, practical means scalable to large applications, with excellent final code quality, at reasonable just-in-time compilation speed and cost
Existing Practice: Java
“Java” is the de-facto standard format for distributing mobile programs
when we speak of “distributing mobile programs using Java”, we in fact usually mean “using the Java Virtual Machine”
the JVM has an instruction set that has been designed specifically for representing Java programs
– interestingly enough, there still are JVM programs for which no legal equivalent Java source program exists
Existing Practice: Java Security
although the Java programming environment is type-safe, programs compiled from Java into JVM-code must be re-checked upon arrival because they may have been corrupted in transit
class MyLibrary { public void NoSecret(); private void ASecret();}
class MyClient { import MyLibrary; {...MyLibrary.NoSecret();…}}
call MyLibrary.NoSecret ...
JVM-code stream
call MyLibrary.ASecret ...
corrupted JVM-code stream
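The re-check on arrival can be pictured with a minimal sketch. The `LIBRARY` table, the `(opcode, class, method)` instruction tuples, and `verify_stream` are hypothetical illustrations and do not reflect the actual JVM class-file format or verifier; the sketch only shows why the consumer has to re-establish the access rules itself when the stream may have been altered after compilation.

```python
# Hypothetical model: visibility of each library method as declared by the host.
LIBRARY = {
    ("MyLibrary", "NoSecret"): "public",
    ("MyLibrary", "ASecret"): "private",
}


def verify_stream(stream, client_class):
    """Return a list of access violations found in a code stream.

    Each instruction is a tuple (opcode, target_class, method_name).
    Only "call" instructions are inspected in this sketch.
    """
    violations = []
    for opcode, target_class, method in stream:
        if opcode != "call":
            continue
        visibility = LIBRARY.get((target_class, method))
        if visibility is None:
            violations.append(f"unknown method {target_class}.{method}")
        elif visibility == "private" and target_class != client_class:
            violations.append(f"illegal access to {target_class}.{method}")
    return violations


# The intact stream calls the public method; the corrupted one was altered in transit.
intact = [("call", "MyLibrary", "NoSecret")]
corrupted = [("call", "MyLibrary", "ASecret")]

print(verify_stream(intact, "MyClient"))
print(verify_stream(corrupted, "MyClient"))
```

The source compiler would never have emitted the second stream, yet nothing in a flat instruction stream prevents it from being expressed, so the check has to be repeated by the code consumer.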
Existing Practice: Java Security
Java’s byte-code security model requires time-consuming static verification and/or dynamic checking while the code is being executed
systematic study of security issues is still in its infancy
IF ... THEN ... ELSE MyLibrary.ASecret()
Existing Practice: JVM Performance
upon arrival at a target machine, most JVM code is translated into the appropriate native code “just-in-time”
performance resulting from “just-in-time” compilation is not competitive with off-line compilers
– compilation systems such as Sun’s HotSpot are incredibly complex and haven’t delivered on their promise
JVM approach is unlikely to scale to large programs requiring top-level performance
Raising JVM Performance
raising the performance of JVM-code has been addressed by “annotating” the byte-code stream with compiler back-end related information
“annotated” class-files run much faster if an annotation-aware byte-code compiler is available on the target platform
security is lost: the “annotations” are not optional to the annotation-aware compiler; if an adversary falsifies them, the compiler will create a program that may be unsafe!
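The risk of trusting annotations can be illustrated with a small sketch. The annotation key `index_in_range`, the function `compile_load`, and the flat `memory` list are assumptions made for this illustration only; they do not correspond to any real annotation format or compiler.

```python
def compile_load(annotation):
    """Return a load routine, as an annotation-aware compiler might generate it.

    If the (untrusted) annotation claims the index is always in range,
    the range check is omitted from the generated routine.
    """
    if annotation.get("index_in_range"):
        def load(memory, base, length, index):
            # No check: the generated code trusts the annotation.
            return memory[base + index]
    else:
        def load(memory, base, length, index):
            if not 0 <= index < length:
                raise IndexError("index out of range")
            return memory[base + index]
    return load


# A two-element array at base 0, followed by data that belongs to someone else.
memory = ["a0", "a1", "SECRET"]

checked = compile_load({})
unchecked = compile_load({"index_in_range": True})  # falsified annotation

print(unchecked(memory, 0, 2, 2))  # reads past the end of the array
```

With the honest (empty) annotation the same access raises `IndexError`; with the falsified one it silently reads the neighbouring value, which is the kind of unsafe program the slide warns about.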
Emerging Practice: PCC
ship a native program along with a “proof” that it doesn’t violate a given security policy
although more general security policies are imaginable, current PCC systems essentially use type safety (and concomitant memory safety) as their security policy (our approach does the same)
PCC drastically reduces the size of the trusted computing base
Emerging Practice: PCC - Problems
PCC is based on native code
– (otherwise the trusted computing base would become larger again, defeating the main advantage of PCC)
PCC has the performance advantages of fully optimized code, but requires multiple versions for multiple platforms
also, in the long run, dynamically generated code (using feedback from dynamic profiling) will generally outperform native code
Our Technical Approach
study the interaction of security-related information, optimization-enhancing information, and compression, rather than considering them separately
– use syntax-directed compression as a means of obtaining guaranteed referential integrity
– transport compiler-related annotations to obtain top-level performance on the eventual target machine
– use a proof-based approach to guard the compiler-related annotations from falsification in transit
Our Technical Approach (2)
no single focus on security, code-quality, or encoding density, but attempt to study their interaction and make progress along all three dimensions
preliminary evidence suggests that these three topics are strongly interrelated and that representations based on adaptive compression of syntax trees are ideally suited for transporting mobile programs
this research is orthogonal and complementary to work on authentication and security policies
Our Policy Assumptions
type safety using the typing model of the source language
– all of the host’s library routines are guaranteed to be called with parameters of the correct types
– capabilities (object pointers) owned by the host can be manipulated by the mobile client application only as specified in the host’s interface definition (private, protected, …) and cannot be forged
type safety is guaranteed by our mobile code transportation scheme
Compression vs. Security
code compression and security may often be complementary
idea: choose an encoding that can express only legal programs
example: int i, j, k, l; float r, s, t, u; { i = j }
operator: :=
first operand (1 out of 8): i
second operand (1 out of 3 or 4!): j
higher-level encodings: enumerate all legal assignments = at most 2 * (n choose 2) possibilities
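The operand-selection idea can be sketched as follows. This is a minimal illustration, not the project's actual encoder: the `SYMBOLS` table mirrors the declarations in the example, while `candidates`, `encode_assignment`, `decode_assignment`, and the rule that source and target must have the same type (with self-assignment allowed, giving 4 candidates rather than 3) are assumptions made for this sketch.

```python
from math import ceil, log2

# Declarations from the example: four ints and four floats.
SYMBOLS = [("i", "int"), ("j", "int"), ("k", "int"), ("l", "int"),
           ("r", "float"), ("s", "float"), ("t", "float"), ("u", "float")]


def candidates(target_type):
    """Names that may be the source of an assignment to a target of this type.

    Assumption for this sketch: source and target must have the same type.
    """
    return [name for name, typ in SYMBOLS if typ == target_type]


def encode_assignment(target, source):
    """Encode `target := source` as two indices.

    The second index counts only type-compatible names, so an ill-typed
    assignment has no encoding at all (ValueError).
    """
    names = [name for name, _ in SYMBOLS]
    types = dict(SYMBOLS)
    first = names.index(target)
    second = candidates(types[target]).index(source)
    return first, second


def decode_assignment(first, second):
    """Decode two indices back into (target, source)."""
    target, target_type = SYMBOLS[first]
    return target, candidates(target_type)[second]


print(encode_assignment("i", "j"))
print(decode_assignment(0, 1))
# Bits for the second operand: typed choice among 4 vs. untyped choice among 8.
print(ceil(log2(len(candidates("int")))), ceil(log2(len(SYMBOLS))))
```

Every in-range pair of indices decodes to a well-typed assignment, so the encoding is shorter and, at the same time, leaves an adversary no way to express `i := r`.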
Virtual Machines vs. Graphs
information is lost when compiling to the “flat” representation of virtual machines
many native code optimizations require this information to be re-discovered
[figure: Virtual Machine Representation (flat instruction sequence with relative branches such as “bra +2”) vs. Graph-Based Representation]
Performance-Enhancing Information
compiler-related information intended for improving code quality re-introduces redundancy that can be exploited by an adversary
for example, a program can be encoded with guaranteed referential integrity using a grammar close to the semantics of the source language
but in order to allow optimizations, the grammar needs to be relaxed
the “holes” in the relaxed grammar need to be guarded by other means based on proof-carrying code concepts
General Approach Taken
use encoding-inherent security wherever possible (a well-formedness property of the encoding itself)
use proof-based security where necessary to support optimizations
– transporting results of alias analysis
– removing range or type checks
this approach applies regardless of the semantic level on which the program is being transported
but the correct choice of such a semantic level must also be considered!
Highest-Level Encoding
simple and easily understood security policy based on type-safety
ultra-compact representation using grammar-based compression
guaranteed referential integrity provided essentially “for free” by the encoding
– relatively small amount of proof-based security required only for additional performance-enhancing annotations
– e.g., exceptions, alias analysis, escape analysis, dynamic type safety
time required for dynamic compilation may be a problem
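How a grammar-based encoding yields well-formedness “for free” can be sketched with a toy example. The one-nonterminal `GRAMMAR`, the tuple representation of trees, and the `encode`/`decode` helpers are hypothetical and far simpler than a Java abstract grammar; the sketch emits plain indices and leaves out the arithmetic or dictionary coding that would compress them.

```python
# Toy abstract grammar: each nonterminal maps to a list of alternatives.
# An alternative is (node_label, child_nonterminal, ...).
GRAMMAR = {
    "Expr": [("Const",), ("Var",), ("Add", "Expr", "Expr"), ("Neg", "Expr")],
}


def encode(tree, nonterminal="Expr"):
    """Encode a tree as the pre-order list of chosen alternative indices."""
    label, *children = tree
    for index, (alt_label, *child_nts) in enumerate(GRAMMAR[nonterminal]):
        if alt_label == label and len(child_nts) == len(children):
            out = [index]
            for child, child_nt in zip(children, child_nts):
                out += encode(child, child_nt)
            return out
    raise ValueError(f"{label!r} is not a legal {nonterminal}")


def decode(indices, nonterminal="Expr"):
    """Rebuild a tree from alternative indices.

    The decoder can only ever build nodes that the grammar allows.
    """
    stream = iter(indices)

    def build(nt):
        alt_label, *child_nts = GRAMMAR[nt][next(stream)]
        return (alt_label, *[build(child_nt) for child_nt in child_nts])

    return build(nonterminal)


tree = ("Add", ("Var",), ("Neg", ("Const",)))
code = encode(tree)
print(code)
print(decode(code) == tree)
```

Because each index is interpreted relative to the alternatives legal at that point, a tampered index can select a different legal construct but cannot produce a malformed tree; an index outside the list of alternatives is rejected during decoding.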
Project Workflow “High Level Thread”
[workflow diagram with three columns: Encoding, Compression, Proofs]
diagram components: P2K; Java abstract grammar (JAG); annotated JAG; efficient annotated JAG; arithmetic encoding; dictionary encoding; combination heuristics; theorem prover; well-formedness
1. guarantee complete (enhanced) static semantics through encoding
2. compression of Java programs
3. reduced verification effort due to abstract grammar encoding
Lowest-Level Encoding
compiler-oriented intermediate representation
goal is to provide much better code quality with far less effort at the code consumer’s site
requires more proof-based security than the “high-level” approach, but still far less than the “original PCC idea” where the goal is to reduce the TCB
more voluminous transportation format
could be more difficult to reason about safety because further removed from the source language
Project Workflow “Low Level Thread”
[workflow diagram with three columns: Encoding, Compression, Proofs]
diagram components: SSA-directed encoding; typed SSA; annotated typed SSA; secure annotated typed SSA; annotation encoding; theorem prover
1. universal (source-language neutral) abstract syntax tree representation
2. UAST after performing all target-machine independent optimizations
3. encoding for the proofs required to guard the TASSA
4. provably secure target-machine independent low-level representation
Third Way: Core Calculus
two-stage mapping of the mobile code
– source constructs are mapped to the core calculus
– mapping may be transported as well, or assumed global shared knowledge
simple and easily understood security policy
only approach that is easily extensible even by third parties
not clear if this approach will yield adequate native code quality at the consumer’s site
the relative trade-offs are as of yet unknown
Current Status and Rationale
developed a comprehensive library of stream compressors in Java
“high-level” encoding prototype is up and running
– working on a contribution to PLDI 2001 on Java compression
“low-level” encoding and “core calculus” prototypes will be operational over the summer
the relative trade-offs (encoding density vs. decoding/dynamic compilation speed vs. code quality) can only be determined by collecting experience with actual prototypes
Quantitative Metrics
security
– publish complete design specification and rationale and open the design to public scrutiny and external validation
efficiency
– measure by comparing generated code quality with that of existing on-the-fly compilers
code density
– measure by comparing with competing proof-carrying code and mobile-code distribution formats
Expected Major Achievements
demonstrate that graph-based encoding formats are superior to virtual machines
explore the relative trade-offs that can only be determined by building an actual prototype
– encoding density/network transfer speed vs.
– decoding/dynamic compilation speed vs.
– code quality, especially when using the core-calculus approach
publish a design rationale that can form the basis of a subsequent standardization effort
Long-Term Impact
enable an educated choice of a replacement technology at the end of the Java Virtual Machine’s life-cycle
royalty-free and free of particular proprietary intellectual property claims
developed under the scrutiny of and in dialogue with the security community
Task Schedule
1999 2000 2001 2002
Y1 Milestones: source-level representation => Java compression; low-level representation; core calculus representation
Y2 Milestones: 3 system prototypes; trade-off analysis; encoding format comprehensive definition
End of Project: system deliverable; comprehensive documentation
investigate: multiple source languages; graph-based encoding schemes; proof-carrying code
investigate: requirements of optimizing code generators; integration of security vs. compiler-related data
investigate: mutual interaction of security, efficiency, and compression density; security of system
Transition of Technology
the final design rationale document will provide enough detail that unrelated third parties will be able to replicate our code-transportation scheme(s)
our prototype implementation(s) will be made available in source form
the graduate students involved in this work are likely to transfer into the industrial sector
Thank You