Secure Compiler Seminar 4/11
Visions toward a Secure Compiler
Toshihiro YOSHINO
(D1, Yonezawa Lab.)
Talk Agenda
  Brief Introduction about TAL and PCC
  Introduction of my Master Thesis
  Visions toward a Secure Compiler
Brief Introduction about TAL and PCC
Background
Program verification = Mathematically assure that a program has certain properties
  Useful for security
    • Memory access safety, information flow analysis, …
Verifying low-level code directly reduces the TCB
  TCB: Trusted Computing Base
  High-level code must be compiled after it is verified
    ⇒ We must trust the compiler
  Assemblers are much simpler than compilers
Current Techniques and Problems
Code signing
  Based on public key cryptography
  Can prove the genuineness of code
  Cannot prove safety by itself
Signature matching
  Use a dictionary of malicious patterns and match target programs against it
  Employed in many antivirus systems
  A pass does NOT mean safety
    • Often unable to detect very new viruses
Proof-Carrying Code [Necula et al. 1997]
Technique for safe execution of untrusted code
  Code consumer does not need to trust the producer
Code is distributed with a proof of its safety
  Producer creates a proof
  Consumer verifies the proof against his security policy
Proof-Carrying Code [Necula et al. 1997]
Low cost for the consumer
  Consumer has only to verify the proof
    • For example, by typechecking
Tamper-proof
  If it passes the check, code does NOT cause harm even if modified
    • If a modification makes the code fail the check, the code will not run, so it is safe
    • Otherwise the code still obeys the consumer’s security policy
Typed Assembly Language [Morrisett et al. 1999]
Extends a conventional assembly language with static type checking
  An instance of Proof-Carrying Code
By type checking, it can guarantee
  Memory access safety
    • Program never accesses outside the memory area allocated for it
  Interface consistency
    • Type agreement of arguments / return values of functions
  etc.
TAL System Illustrated
[Figure: the TAL system (Type Checker, Assembler, Linker); labels: Code with type information, Code, Consumer]
A Brief Example of TAL Program

fact:                      {eax: B4}
    movl %eax, %ecx
    movl $1, %eax
loop:                      {eax: B4, ecx: B4}
    mull %ecx
    decl %ecx
    cmpl $0, %ecx
    jg loop
end:                       {eax: B4}

Program Code (same as conventional assembly languages)
Type Information (used to typecheck a program)
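The role of the type annotations can be sketched in Python. This is only an illustration, not the TAL type system: it assumes the three annotations on the slide belong to the labels fact, loop, and end in that order, and the names `label_types`, `satisfies`, and `check_jump` are hypothetical. The idea shown is that a jump is accepted only when the current register typing provides everything the target label requires.

```python
# Hypothetical illustration of label preconditions; not the actual TAL checker.
B4 = "B4"  # a 4-byte value, as written in the slide's annotations

# Register typings attached to labels (assumed pairing of labels and annotations).
label_types = {
    "fact": {"eax": B4},
    "loop": {"eax": B4, "ecx": B4},
    "end": {"eax": B4},
}

def satisfies(current, required):
    """True if every register the target requires has the same type in `current`."""
    return all(current.get(reg) == ty for reg, ty in required.items())

def check_jump(current, target):
    """Accept a jump only if the current typing meets the target's precondition."""
    return satisfies(current, label_types[target])

# At `jg loop`, both eax and ecx are known to hold 4-byte values.
at_jg = {"eax": B4, "ecx": B4}
jump_ok = check_jump(at_jg, "loop")
# With ecx unknown, the same jump is rejected.
jump_rejected = check_jump({"eax": B4}, "loop")
```

A real checker would also track how each instruction changes the register typing between labels; that part is omitted here.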
Related Work: TALK, TOS [Maeda, 2005]
TALK: TAL for Kernel
  Morrisett et al. use a garbage collector for memory management in TAL
  For an OS, GC cannot be assumed
    • Must implement memory management (malloc/free)
TOS: Typed Operating System
  An experimental OS written in TALK
Introduction of My Master Thesis
My Work for Master Thesis
“A Framework Using a Common Language to Build Program Verifiers for Low-Level Languages”
  To help developers of program verifiers
  To be a common basis for verification of low-level programs
    • Such as assembly and machine languages
Motivation: Verifiers are Hard to Develop
Especially in low-level languages…
Complex semantics
  Semantics of each instruction is complex
  There are many instructions in a language
Low portability
  Low-level languages heavily depend on the underlying architecture
  Accordingly, the entire verifier also depends on the underlying architecture
Our Idea
Split a verifier into three parts
  1. Design a common language,
  2. Translate the target program into that language, and
  3. Verify the translated program
These parts are explicitly independent of each other
  Thus we can replace them easily
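The three-part split can be sketched in Python as two interchangeable functions composed into a verifier. This is a toy under stated assumptions: programs are plain lists of strings, and `make_verifier`, `translate_a`, `translate_b`, and `no_error_command` are hypothetical names that do not come from the thesis implementation.

```python
from typing import Callable, List

# Assumption for illustration: a program is just a list of instruction strings.
SourceProgram = List[str]
CommonProgram = List[str]

Translator = Callable[[SourceProgram], CommonProgram]
VerificationLogic = Callable[[CommonProgram], bool]

def make_verifier(translate: Translator, logic: VerificationLogic):
    """Compose a translator and a verification logic into one verifier."""
    def verify(program: SourceProgram) -> bool:
        return logic(translate(program))
    return verify

# Two toy translators for two made-up source syntaxes.
def translate_a(program):
    return [line.strip().lower() for line in program]

def translate_b(program):
    return [line.strip().lower().replace("jmp", "goto") for line in program]

# One toy verification logic, written once against the common language.
def no_error_command(program):
    return "error" not in program

# The same logic is reused; only the translator is swapped.
verify_a = make_verifier(translate_a, no_error_command)
verify_b = make_verifier(translate_b, no_error_command)
```

Here `verify_a` and `verify_b` share `no_error_command` unchanged, which is the kind of replacement the independence of the parts is meant to allow.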
Our Idea
[Figure: Target Program → Translator → Translated Program → Verifier → Result (Success / Fail); labels: (1) Semantics of Common Language, (2) Translator, (3) Verification Logic]
How Do We Solve the Problems?
Coping with complex semantics
  Only translators care about the semantics of the source language
  Translator is reusable
    • Once the description is done, we can reuse it
Improving portability
  Verification logic is also reusable
    • Once implemented, it can be used for other architectures simply by replacing translators
How Do We Solve the Problems?
[Figure: framework overview with a second input, Program in Another Language, alongside Target Program; labels: Translator, Translated Program, Semantics of Common Language, Verifier, Verification Logic, Result (Success / Fail)]
Overview of the Work
Designed a framework to build program verifiers
  Designed a common language ADL
  Discussed the correctness of translators
  Proved that the properties assured are preserved throughout translation
Implemented the framework using Java
ADL: A Common Language
[Figure: framework overview; labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verifier, Verification Logic, Result (Success / Fail)]
ADL: A Common Language
Design Concept
ADL: Architecture Description Language
From observation of many architectures
  Data is stored in registers and memory, and is manipulated according to the program
  Only jumps are sufficient for control flow structure
Expressiveness
  Arithmetic, logical operations, …
  C-like expressions
Conservative semantics
  No need to describe indecent programs
  To simplify semantics
ADL: A Common Language
Overview of the Language
Imperative language which manipulates registers and memory
  5 kinds of commands
    • nop, error, assignment, goto, if-then-else
  More like C than assembly
    • Infix operators, parenthesized formulae
    • Conditional execution by an arbitrary condition using the if command
  Only goto modifies control flow
    • Unconditional branch
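The five kinds of commands could be represented as a small set of data types. The following Python sketch is an assumption about one possible shape; the class and field names are hypothetical, and expressions are kept as plain strings to stay short. It is not the ADL data structures of the Java implementation.

```python
from dataclasses import dataclass
from typing import Union

@dataclass(frozen=True)
class Nop:
    pass

@dataclass(frozen=True)
class Error:
    pass

@dataclass(frozen=True)
class Assign:
    target: str  # e.g. "%eax"
    expr: str    # expression kept as text in this sketch

@dataclass(frozen=True)
class Goto:
    target: str  # label name

@dataclass(frozen=True)
class IfThenElse:
    cond: str
    then: "Command"
    orelse: "Command"

Command = Union[Nop, Error, Assign, Goto, IfThenElse]

def modifies_control_flow(cmd):
    """Only goto transfers control; an if does so only through a nested goto."""
    if isinstance(cmd, Goto):
        return True
    if isinstance(cmd, IfThenElse):
        return modifies_control_flow(cmd.then) or modifies_control_flow(cmd.orelse)
    return False

# A conditional branch is an if command whose arms are gotos.
branch = IfThenElse("%ebx == &null", Goto("end"), Goto("lp"))
```

Keeping goto as the only control transfer means a conditional jump is not a separate command, just an if wrapped around gotos.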
ADL: A Common Language
A Brief Example

ADL:
data: ...
main:
    %ebx = &data;
    %eax = 0;
    goto &lp;
lp:
    %eax = %eax + *[4](%ebx);
    %ebx = *[4](%ebx + 4);
    if %ebx == &null then goto &end
    else goto &lp;
end:
    goto &end;

x86:
data: ...
main:
    movl $data, %ebx
    movl $0, %eax
lp:
    addl 0(%ebx), %eax
    movl 4(%ebx), %ebx
    cmpl $0, %ebx
    je end
    jmp lp
end:
    jmp end
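For readers less used to assembly, the example reads like a walk over a chain of cells, each holding a 4-byte value followed by a 4-byte pointer to the next cell. The slide elides the contents of `data`, so that layout is an assumption. A rough Python rendering, with a hypothetical `sum_list` function and a dictionary standing in for memory:

```python
def sum_list(memory, start):
    """Sum the values of a chain of (value, next) cells.

    `memory` maps a cell name to (value, next_cell_name); None stands for &null.
    This memory model is an assumption made for illustration.
    """
    ebx = start              # %ebx = &data
    eax = 0                  # %eax = 0
    while True:              # lp:
        value, nxt = memory[ebx]
        eax += value         # %eax = %eax + *[4](%ebx)
        ebx = nxt            # %ebx = *[4](%ebx + 4)
        if ebx is None:      # if %ebx == &null then goto &end
            return eax       # the slide's `end` loops forever; we return instead

# Hypothetical data: three cells chained together.
cells = {"data": (1, "n1"), "n1": (2, "n2"), "n2": (3, None)}
total = sum_list(cells, "data")
```

As in the slide's code, the first cell is read before any null check, so the chain must contain at least one cell.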
ADL: A Common Language
Restrictions
ADL has a few restrictions by design
  Code and data are completely separated
  We assume NOTHING about the memory layout of a program
To simplify the semantics
  Some programs cannot be expressed
    • However, most decent programs can be written even under these restrictions
    • To be discussed in the next slides
ADL: A Common Language > Restrictions
Separation of Code and Data
Do not treat code as data
  ADL programs cannot read / write code
We cannot express programs which use dynamic code generation
  But the patterns of the generated code are fixed in many cases
    ⇒ Other solutions are possible
    • For example, prepare a function for each pattern of code
ADL: A Common Language > Restrictions
Not Assume Memory Layout
Casting is prohibited
  ADL distinguishes integers and pointers
    • In real architectures, pointers are not distinguished from integers
Pointer arithmetic is restricted
  Only pointer+integer and pointer-pointer are defined
    • Other operations return ‘undetermined’
  Sufficient for array/structure operations and offset calculation
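A value domain with these restrictions might look like the following Python sketch. Several details are assumptions not stated on the slide: a pointer is modelled as a (block, offset) pair, pointer-pointer is defined only within the same block, and integer+pointer with the operands swapped is treated as undetermined. The names `Ptr`, `UNDETERMINED`, `add`, and `sub` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ptr:
    block: str   # which statically allocated block the pointer refers to (assumed model)
    offset: int  # byte offset inside that block

UNDETERMINED = "undetermined"

def add(a, b):
    """integer+integer and pointer+integer are defined; anything else is not."""
    if isinstance(a, int) and isinstance(b, int):
        return a + b
    if isinstance(a, Ptr) and isinstance(b, int):
        return Ptr(a.block, a.offset + b)
    return UNDETERMINED

def sub(a, b):
    """integer-integer and pointer-pointer are defined; anything else is not."""
    if isinstance(a, int) and isinstance(b, int):
        return a - b
    if isinstance(a, Ptr) and isinstance(b, Ptr):
        # Assumption: a difference is meaningful only inside one block.
        return a.offset - b.offset if a.block == b.block else UNDETERMINED
    return UNDETERMINED

# Field access and offset calculation stay expressible.
field = add(Ptr("data", 0), 4)
distance = sub(Ptr("data", 8), Ptr("data", 0))
```

Because a pointer never becomes a bare integer, nothing in this model depends on where a block actually sits in memory.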
Program Translator
[Figure: framework overview; labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verifier, Verification Logic, Result (Success / Fail)]
Program Translator
Translates low-level programs into ADL
We must assure that program translators are correct
  Otherwise, we cannot trust the entire verifier
  Correctness is defined in the following discussion
Program Translator
What Is Correctness of Program Translation?
Instruction = Function over machine states
Correctness = Correspondence between the states of the two machines is preserved in translation
[Figure: the Original Program and the Translated Program each take State to State’; labels: State, State’]
Program Translator
How to Confirm Correctness of Translation
Any program results in corresponding states for any input ⇒ Correctness
  Total inspection is NOT realistic
  A theorem prover would be useful
    • Automatic proving is one item of future work
    • But how to confirm the correctness of the description of the source language?
At this time, we take an empirical approach
  Test several cases using an interpreter
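The empirical approach can be pictured as differential testing: run the original and the translated program from corresponding states and compare the results. The Python sketch below uses two toy machines rather than real x86 or ADL interpreters; every name in it (`run`, `agree_on`, `original_step`, `translated_step`, `buggy_step`, `corresponds`) is hypothetical.

```python
def run(step, state, max_steps=1000):
    """Apply `step` until it returns None (halt) or the step budget runs out."""
    for _ in range(max_steps):
        nxt = step(state)
        if nxt is None:
            return state
        state = nxt
    raise RuntimeError("step budget exhausted")

def original_step(s):
    # Toy "source" machine: state is (acc, n); multiply acc by n while n > 1.
    acc, n = s
    return None if n <= 1 else (acc * n, n - 1)

def translated_step(s):
    # Toy "translated" machine: same computation over a dict-based state.
    if s["n"] <= 1:
        return None
    return {"acc": s["acc"] * s["n"], "n": s["n"] - 1}

def buggy_step(s):
    # A deliberately wrong translation: adds instead of multiplying.
    if s["n"] <= 1:
        return None
    return {"acc": s["acc"] + s["n"], "n": s["n"] - 1}

def corresponds(src, dst):
    """State correspondence between the two toy machines."""
    return src == (dst["acc"], dst["n"])

def agree_on(src_step, dst_step, inputs):
    """True if both machines end in corresponding states for every tested input."""
    for n in inputs:
        src = run(src_step, (1, n))
        dst = run(dst_step, {"acc": 1, "n": n})
        if not corresponds(src, dst):
            return False
    return True

good = agree_on(original_step, translated_step, range(8))
bad = agree_on(original_step, buggy_step, range(8))
```

Passing such tests only raises confidence for the inputs tried; it is not a proof of correctness for all programs and inputs.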
Verification Logic
[Figure: framework overview; labels: Target Program, Translator, Translated Program, Semantics of Common Language, Verifier, Verification Logic, Result (Success / Fail)]
Verification Logic
Verifies the properties of translated programs
  Function that takes a program and returns success or fail
  Soundness must be assured
    • This is the task of the creator of a verification logic
    • Here we do not discuss it any further
Definition: Soundness of a verification logic
  Verification logic V: State → Bool
  The set {S | V(S)} is closed under step execution
    • If V(S), execution never falls into the error state, and
    • If V(S) and S→T (→ means step execution), then V(T)
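On a small finite state space the definition can be checked by brute force. The Python sketch below is only an illustration with a made-up six-state machine; `is_sound`, `step`, and `is_error` are hypothetical names. It checks that no accepted state is an error state and that every step from an accepted state lands in an accepted state; together these imply that execution starting from an accepted state never reaches the error state.

```python
def is_sound(states, step, is_error, V):
    """Check that {S | V(S)} excludes error states and is closed under `step`."""
    for s in states:
        if not V(s):
            continue
        if is_error(s):
            return False          # an accepted state is already an error
        t = step(s)
        if t is not None and not V(t):
            return False          # a step leaves the accepted set
    return True

# Toy machine over states 0..5: even states cycle 0 → 2 → 4 → 0,
# odd states run 1 → 3 → 5, and state 5 is the error state.
def step(s):
    if s % 2 == 0:
        return (s + 2) % 6
    return s + 2 if s < 5 else None

def is_error(s):
    return s == 5

sound = is_sound(range(6), step, is_error, lambda s: s % 2 == 0)
unsound = is_sound(range(6), step, is_error, lambda s: s <= 3)
```

Accepting the even states is sound because they only step to each other. Accepting states up to 3 is not, since for example state 2 steps to state 4, which lies outside the accepted set.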
Verification Logic
Soundness of Verification Logic
[Figure: within the set of machine states, the subset of S such that V(S)]
Soundness = if V(S) ∧ S→T then V(T)
Verification Logic
Program Translation and Verification
We proved the following theorem
  If the program translator is correct, and
  the verification logic is sound, then
  ⇒ Verification of the original program and of the translated program are equivalent
    A closed subset can be defined on the states of the translation source language
Implementation
Framework
  ADL data structures
  ADL interpreter
    • Used to confirm the correctness of translators
  Translator, verification logic interfaces
  Translation rule compiler
    • Compiles translation rules into a Java implementation of a translator
And for proof of concept,
  Translators from Intel x86 and SPARC
  A simple type checker
Related Works
Foundational TAL [Crary, 2003]
TAL type checker is still large
  TALx86 type checker consists of approx. 23k LoC in O’Caml (!)
TCB is reduced by using a logical framework
  Designed a language called TALT on the Twelf logical framework [Pfenning et al., 1999]
  Proved GC safety of TALT by machine
Correspondence between TALT and realistic architectures is not discussed
TALT type system is fixed
  Our work allows replacement of verification logics
Future Work
Automatically confirm the correctness of translation
  Automatic testing
    • Cooperating with emulators or debuggers
  Or, build a model and use a theorem prover
Support dynamic memory allocation
  Currently all memory must be allocated statically
Support concurrent programs
  Concurrency is not taken into consideration
  To apply this to OSes, etc., concurrency plays an important role
Visions toward a Secure Compiler
What Is Secure Compiler?
A compiler which produces certified code
  For example, TAL code as output
  Like the Popcorn compiler in TALx86
    • Safe dialect of C → TALx86
A compiler which assures correct compilation (optionally)
  Like the credible compiler [Rinard, 1999]
  Reduces TCB
Motivation
Infrastructure has been built
  TALK, TOS [Maeda, 2005]
  Verifier framework [Yoshino, 2006]
Next we have to build a house on it!
  Most people do not want to write low-level code directly
⇒ Secure Compiler
Toward Secure World
If we built a secure compiler…
Memory-error-free systems
  Prevent memory-error-based attacks
    • OS kernel, core libraries, network server…
Writing secure code
  Vulnerable code will result in verification failure
  So code security will be improved
Rest to be discovered…
Tasks to Do
Determine what properties to assure
  Memory access safety? Information flow?
  Must be mechanically checkable
Design the verification logic
  Use verifier framework?
Design the language
  Target: TAL-base? ADL?
    • ADL can be used as certified language
    • Register allocation is done, so simple mapping will be possible…
  Source: ???