![Page 1: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/1.jpg)
Advanced Topics: Reasoning About Code Pointers & Self-Modifying Code
Zhong Shao, Yale University
July 25, 2012
![Page 2: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/2.jpg)
Building fully certified systems
• One logic for all code
– Consider all possible interactions.
– Very difficult!
• Reality
– Only limited combinations of features are used.
– It’s simpler to use a specialized logic for each combination.
– Interoperability between logics
[Figure: four logics labeled L1, L2, L3, L4]
![Page 3: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/3.jpg)
Our solution
[Figure: layered diagram with labels C1 … Cn, OS, L1 … Ln, OCAP, Modeling of the machine, Mechanized Meta-Logic (CiC), TCB]
![Page 4: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/4.jpg)
A toy machine
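(program) P ::= (C, S, pc)
(code heap) C ::= {f ↝ I}*
(state) S ::= (H, R)
(data heap) H: words at addresses 0, 1, 2, …
(register file) R: registers r1, r2, r3, …, rn
(instr. seq.) I ::= addu … | lw … | sw … | … | j f
[Figure: code heap C with blocks f1 : I1, f2 : I2, f3 : I3, …; pc points into C]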
![Page 5: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/5.jpg)
Program specifications
[Figure: the toy machine from the previous slide, with specifications a1, a2, a3 attached to code blocks f1, f2, f3]
(spec) Ψ ::= {f ↝ a}*
![Page 6: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/6.jpg)
Invariant-based verification
Initial condition: Inv(P0)
P0 --c1--> P1 --c2--> P2 --c3--> … --cn--> Pn
Progress: if Inv(P), then ∃P’. P --c--> P’.
Preservation: if Inv(P) and P --c--> P’, then Inv(P’).
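The progress and preservation conditions can be checked exhaustively on a small finite transition system. The sketch below is only an illustration and not part of the original slides: the state space (eight integer states), the `step` function, and the invariant `inv` are all hypothetical choices made for brevity.

```c
#include <stdbool.h>

/* Hypothetical toy "program": a state is an integer in 0..N_STATES-1. */
enum { N_STATES = 8, STUCK = -1 };

/* One machine step; returns STUCK when no step is possible.
   Odd states are deliberately stuck to model "bad" programs. */
int step(int p) {
    if (p % 2 != 0) return STUCK;
    return (p + 2) % N_STATES;
}

/* Candidate invariant Inv(P): the state is even. */
bool inv(int p) { return p % 2 == 0; }

/* Progress: Inv(P) implies that some P' with P -> P' exists. */
bool check_progress(void) {
    for (int p = 0; p < N_STATES; p++)
        if (inv(p) && step(p) == STUCK) return false;
    return true;
}

/* Preservation: Inv(P) and P -> P' imply Inv(P'). */
bool check_preservation(void) {
    for (int p = 0; p < N_STATES; p++) {
        int next = step(p);
        if (inv(p) && next != STUCK && !inv(next)) return false;
    }
    return true;
}

/* Runs n_steps steps from p0; false if the machine gets stuck on the way. */
bool runs_safely(int p0, int n_steps) {
    int p = p0;
    for (int i = 0; i < n_steps; i++) {
        p = step(p);
        if (p == STUCK) return false;
    }
    return true;
}
```

When `inv(p0)`, `check_progress()`, and `check_preservation()` all hold, `runs_safely(p0, n)` holds for every `n`, which is the induction argument the slide relies on.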
![Page 7: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/7.jpg)
[Figure: modules C1 … Cn certified in “Domain-specific” logics L1 … Ln, layered over OCAP Rules, the Modeling of the machine, and the Mechanized Meta-Logic (CiC)]
Modules may use different logics (L1 … Ln). How to link modules?
![Page 8: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/8.jpg)
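The OCAP framework [TLDI'07]
[Figure: modules C1 … Cn certified in logics L1 … Ln (XCAP, SCAP, TAL, …, AIM); each logic is marked Sound through its interpretation ( )L1 … ( )Ln into OCAP Rules; OCAP Soundness rests on the Modeling of the machine in the Mechanized Meta-Logic (CiC)]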
![Page 9: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/9.jpg)
Verification of low-level code
• Motivation
– everything eventually runs as machine-level binaries
– some code must be written at the low level (e.g., context switch, interaction w. devices)
– most compilers cannot be trusted
– some high-level language features are not well-understood (very complex semantic models)
• Challenges
– arbitrary control-flow, aliasing, stored programs
– no type system (any code is fine)
• Objectives
– certifying both user- and system-level code
– modular specification & certification is crucial
– embedding of high-level systems (to reuse high-level proofs)
![Page 10: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/10.jpg)
A toy target machine TM
![Page 11: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/11.jpg)
Syntax of TM
![Page 12: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/12.jpg)
Operational semantics of TM
![Page 13: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/13.jpg)
Certified Assembly Programming (CAP) [Yu03, Hamid04, Yu04, Feng05]
• Hoare logic in CPS
• Mechanized in a proof assistant (Coq) with a very rich meta logic plus inductive definitions
• The meta logic also serves as the assertion language!
example: (shown in the slide image)
• Entailment between two assertions (P |= Q) is “semantic”, i.e., just implication in the meta logic
• Program language syntax, semantics, and correctness theorem are all represented and reasoned about using the same meta logic
• Like Lamport’s TLA except our meta logic is mechanized
• Hoare-style assertions & inference rules enforce both the correctness & type safety properties
• No need for a separate type system; not a “refinement”
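The idea that the meta logic doubles as the assertion language can be sketched in a few lines. The slides use Coq; the block below uses Lean 4 purely as an illustration, and the `State` record, `Assertion`, `entails`, and `r1Even` are hypothetical names, not definitions taken from CAP.

```lean
-- Hypothetical toy state: a register file and a data heap, both as functions.
structure State where
  regs : Nat → Nat
  heap : Nat → Option Nat

-- Shallow embedding: an assertion is just a meta-logic predicate on states.
abbrev Assertion := State → Prop

-- Semantic entailment P |= Q is plain implication in the meta logic.
def entails (P Q : Assertion) : Prop := ∀ s, P s → Q s

-- Example assertion: register 1 holds an even number.
def r1Even : Assertion := fun s => s.regs 1 % 2 = 0

-- Entailment is reflexive and transitive simply because implication is.
theorem entails_refl (P : Assertion) : entails P P := by
  intro s h
  exact h

theorem entails_trans {P Q R : Assertion}
    (hPQ : entails P Q) (hQR : entails Q R) : entails P R := by
  intro s h
  exact hQR s (hPQ s h)
```

Because assertions are ordinary predicates, no separate proof theory for the assertion language is needed; every lemma of the meta logic is available when reasoning about them.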
![Page 14: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/14.jpg)
CAP inference rules for instructions
![Page 15: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/15.jpg)
CAP inference rules (cont’d)
![Page 16: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/16.jpg)
How CAP works
…
Can be used to prove simple safety and partial correctness properties.
![Page 17: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/17.jpg)
ECP problem w. Hoare logic
• Embedded code pointers (ECP)
Examples: computed GOTOs, higher-order functions, indirect jumps, continuations, return addresses
• Previous approaches
– Ignore ECP [Necula98, Yu04]
– Limit ECP specifications to types [Hamid04]
– Sacrifice modularity [Yu03]
– Use complex indexed semantic models [Appel01]
![Page 18: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/18.jpg)
User-level code: list append
Adapted from [Reynolds02]
[Figure: linked lists with cells 1, 2, …, n-2, n-1]
![Page 19: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/19.jpg)
![Page 20: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/20.jpg)
User-level code: list append
Adapted from [Reynolds02]

| | Type-based | Logic-based |
|---|---|---|
| Inductive definitions (correctness of list append) | - | + |
| Strong update (Separation logic) (allocation, de-allocation, mutation) | - | + |
| Embedded code pointers (continuation) | + | - |
| Impredicative polymorphisms (closure) | + | - |
![Page 21: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/21.jpg)
The ECP problem
cptr(f, a) = ?
![Page 22: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/22.jpg)
Previous approach
• Internalize Hoare-derivation for ECP
Circularity!
• Stratification [OHearn97, Naumann01]
– Works for simple case
– Hard for assembly
– Hard for polymorphism
• Step-Indexing [Appel01, Appel02, Schneck03]
– Works for polymorphism
– Heavyweight
– Not standard Hoare logic
![Page 23: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/23.jpg)
CAP’s approach
• Specify ECP by checking against code spec
• Verify all code specs are indeed valid
• Modularity problem
![Page 24: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/24.jpg)
The XCAP approach
• Specify ECP independent of code spec
• Check ECP against global code spec
• Verify global code spec is indeed valid
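One simple reading of these three steps: `cptr(f, a)` is written down as uninterpreted syntax, and only later given meaning by looking `f` up in the global code spec. The C sketch below is a hypothetical toy model, not XCAP itself. Assertions are C predicates, the spec is a label table, and assertion equality is approximated by function-pointer identity; all type and function names are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical machine state: two registers only. */
typedef struct { int r1; int r2; } State;

/* An assertion is a predicate on states. */
typedef bool (*Assertion)(const State *);

/* Global code specification Psi: code label -> precondition. */
typedef struct { int label; Assertion pre; } SpecEntry;
typedef struct { const SpecEntry *entries; size_t len; } CodeSpec;

/* cptr(f, a): a purely syntactic claim, stated without reference to any spec. */
typedef struct { int label; Assertion pre; } CPtr;

/* Interpretation of cptr(f, a) under Psi: f is in dom(Psi) and Psi(f) = a.
   Assertion equality is approximated here by function-pointer identity. */
bool interp_cptr(const CodeSpec *psi, CPtr c) {
    for (size_t i = 0; i < psi->len; i++)
        if (psi->entries[i].label == c.label)
            return psi->entries[i].pre == c.pre;
    return false;
}

/* Two example assertions. */
bool r1_positive(const State *s) { return s->r1 > 0; }
bool r2_zero(const State *s)     { return s->r2 == 0; }

/* An example global spec with code blocks at (made-up) labels 100 and 200. */
static const SpecEntry example_entries[] = {
    { 100, r1_positive },
    { 200, r2_zero },
};
const CodeSpec example_psi = { example_entries, 2 };
```

The `CPtr` value can be built and passed around by a module that never sees `example_psi`; only the final check needs the global table, which is where the modularity comes from.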
![Page 25: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/25.jpg)
Extended propositions
![Page 26: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/26.jpg)
XCAP rules
![Page 27: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/27.jpg)
How XCAP supports ECP
(SEQ)
(ECP)
(JMP)
(JD)
![Page 28: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/28.jpg)
Verification of append()
![Page 29: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/29.jpg)
Impredicative polymorphisms
• Important for ECP
• Naïve interpretation function fails
![Page 30: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/30.jpg)
New interpretation [POPL’06]
Soundness of interpretation
Interpretation
Consistency
![Page 31: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/31.jpg)
Soundness of XCAP
…
![Page 32: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/32.jpg)
Case study: context switch
swapcontext:
• Runs thousands of times per second
• Used by assembly, C, MSIL, JVML, etc.
• Basis of multi-tasking, OS, and software
• Safety and correctness taken for granted
![Page 33: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/33.jpg)
Context switch on x86 (cont’d)
swapcontext:
; store old context
mov eax, [esp+4]
mov [eax+0], OK
mov [eax+4], ebx
mov [eax+8], ecx
mov [eax+12], edx
mov [eax+16], esi
mov [eax+20], edi
mov [eax+24], ebp
mov [eax+28], esp
; load new context
mov eax, [esp+8]
mov esp, [eax+28]
mov ebp, [eax+24]
mov edi, [eax+20]
mov esi, [eax+16]
mov edx, [eax+12]
mov ecx, [eax+8]
mov ebx, [eax+4]
mov eax, [eax+0]
ret
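To follow what the routine does to registers and memory, it can help to replay it on a modeled machine. The C sketch below mirrors the instruction sequence one statement per `mov`. It is a hypothetical model, not the verified artifact: the `Machine` layout, the numeric value of `OK`, and every address in `demo_machine` are made-up assumptions.

```c
#include <stdint.h>

enum { EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, N_REGS };
enum { MEM_WORDS = 64, OK = 1 };   /* the value of OK is an assumption */

typedef struct {
    uint32_t reg[N_REGS];
    uint32_t mem[MEM_WORDS];       /* indexed by byte address / 4 */
    uint32_t pc;
} Machine;

static uint32_t load(const Machine *m, uint32_t addr) { return m->mem[addr / 4]; }
static void store(Machine *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }

/* Replays swapcontext: save into *old, load from *new, then ret. */
void swapcontext_model(Machine *m) {
    uint32_t *r = m->reg;
    /* store old context */
    r[EAX] = load(m, r[ESP] + 4);
    store(m, r[EAX] + 0, OK);
    store(m, r[EAX] + 4, r[EBX]);
    store(m, r[EAX] + 8, r[ECX]);
    store(m, r[EAX] + 12, r[EDX]);
    store(m, r[EAX] + 16, r[ESI]);
    store(m, r[EAX] + 20, r[EDI]);
    store(m, r[EAX] + 24, r[EBP]);
    store(m, r[EAX] + 28, r[ESP]);
    /* load new context */
    r[EAX] = load(m, r[ESP] + 8);
    r[ESP] = load(m, r[EAX] + 28);
    r[EBP] = load(m, r[EAX] + 24);
    r[EDI] = load(m, r[EAX] + 20);
    r[ESI] = load(m, r[EAX] + 16);
    r[EDX] = load(m, r[EAX] + 12);
    r[ECX] = load(m, r[EAX] + 8);
    r[EBX] = load(m, r[EAX] + 4);
    r[EAX] = load(m, r[EAX] + 0);
    /* ret: pop the return address from the (new) stack */
    m->pc = load(m, r[ESP]);
    r[ESP] += 4;
}

/* Made-up initial state: caller stack at 0x40, old context at 0x80,
   new context at 0xA0, new stack at 0xC0. */
Machine demo_machine(void) {
    Machine m = {0};
    m.reg[EBX] = 11; m.reg[ECX] = 12; m.reg[EDX] = 13;
    m.reg[ESI] = 14; m.reg[EDI] = 15; m.reg[EBP] = 16;
    m.reg[ESP] = 0x40;
    store(&m, 0x40, 0x1000);   /* return address pushed by "call swapcontext" */
    store(&m, 0x44, 0x80);     /* argument old */
    store(&m, 0x48, 0xA0);     /* argument new */
    for (uint32_t i = 0; i < 7; i++)
        store(&m, 0xA0 + 4 * i, 21 + i);   /* new eax..ebp = 21..27 */
    store(&m, 0xA0 + 28, 0xC0);            /* new esp */
    store(&m, 0xC0, 0x2000);               /* return address on the new stack */
    return m;
}
```

Note that the saved `esp` still points at the caller's return address, so a later switch back to the old context resumes right after the original `call swapcontext`, with `OK` in `eax`.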
![Page 34: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/34.jpg)
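Context switch (cont’d)
[Figure: swapcontext: saves the registers eax, ebx, ecx, edx, esi, edi, ebp, esp (values a1 … a8) into the old context, with OK stored in the eax slot, and loads values b1 … b8 from the new context; each stack holds a return pointer (retp, retp’) left by "call swapcontext"]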
![Page 35: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/35.jpg)
Context switch (cont’d)
swapcontext:
• Simple code, complex reasoning!
– stack / heap / memory mutation
– procedure call / first-class code pointer
– protection / polymorphism
• Lack specification and verification that are
– formal (machine checkable in sound logic)
– general (allows all possible usage of context)
– realistic (usable from assembly and C level)
![Page 36: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/36.jpg)
Buggy context code today
![Page 37: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/37.jpg)
Certifying context-switch [Ni et al TPHOLs 2007]
• The first to verify machine-context code
– realistic, no rewriting, no performance penalty
• Based on realistic hardware
– variable length instruction decoding, finite word, etc.
• Uses language-based techniques
– modular specification and proof
• Fully mechanized
– code, machine, meta theory, specification, proof
![Page 38: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/38.jpg)
The code
typedef struct mctx_st *mctx_t;
struct mctx_st {int eax; int ebx; int ecx; int edx;
                int esi; int edi; int ebp; int esp;};

void swapcontext (mctx_t old, mctx_t new);

void loadcontext (mctx_t mctx);
mov eax, [esp+8]      // load address of the new context
mov esp, [eax+_esp]   // load the new stack pointer
mov ebp, [eax+_ebp]   // load the new registers
mov edi, [eax+_edi]
mov esi, [eax+_esi]
mov edx, [eax+_edx]
mov ecx, [eax+_ecx]
mov ebx, [eax+_ebx]
mov eax, [eax+_eax]
ret                   // invoke the new context
![Page 39: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/39.jpg)
The code (continued)
void makecontext (mctx_t mctx, char *sp, void *lnk,
                  void *func, void *arg);
mov eax, [esp+4]      // load address of the context
mov ecx, [esp+8]      // load stack top pointer
                      // for the new stack frame
mov edx, [esp+20]     // load the function's argument
mov [ecx-4], edx      // push it onto new stack
mov edx, [esp+12]     // load the function's return link
mov [ecx-8], edx      // push it onto new stack
mov edx, [esp+16]     // load the function address
mov [ecx-12], edx     // push it as return IP onto new stack
sub ecx, 12
mov [eax+_esp], ecx   // all useful info for fresh context
                      // is on new stack
ret
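The frame that makecontext builds can be replayed on a modeled memory. The C sketch below is a hypothetical model of the effect of the routine, not the verified code: `Memory`, `makecontext_model`, and the assumption that `_esp` is the eighth word of the context (following the field order of `mctx_st`) are all illustrative.

```c
#include <stdint.h>

enum { MEM_WORDS = 64, ESP_SLOT = 7 };  /* assumes _esp is the 8th field */

/* Modeled memory, indexed by byte address / 4. */
typedef struct { uint32_t mem[MEM_WORDS]; } Memory;

static void store(Memory *m, uint32_t addr, uint32_t v) { m->mem[addr / 4] = v; }

/* Builds a three-word frame just below sp and records its address in the
   context's esp field. All arguments are addresses or words in the modeled
   memory, not host pointers. */
void makecontext_model(Memory *m, uint32_t mctx, uint32_t sp,
                       uint32_t lnk, uint32_t func, uint32_t arg) {
    store(m, sp - 4, arg);     /* the function's argument */
    store(m, sp - 8, lnk);     /* the function's return link */
    store(m, sp - 12, func);   /* return IP for the ret that loads the context */
    store(m, mctx + 4 * ESP_SLOT, sp - 12);
}
```

When a context prepared this way is loaded, the final `ret` pops `func` into the instruction pointer, which leaves `lnk` on top of the stack as the return address and `arg` just above it, the layout a called function expects.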
![Page 40: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/40.jpg)
Challenges
1. Polymorphism over arbitrary shape of data
2. Multiple explicit stacks and flexible handling
3. Strong-update (separation logic)
4. General embedded code pointers
   Higher-order functions and continuations
5. Partial correctness

| | TAL | Hoare Logic | Index-based FPCC |
|---|---|---|---|
| Problem | 1, 2, 3, 5 | 4 | 5 |
![Page 41: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/41.jpg)
Applying XCAP to x86
• Finite machine word
• Stack push / pop
• Function call / return
• Variable-length instruction
• Word-aligned memory
![Page 42: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/42.jpg)
Reasoning about memory
• Strong updates (separation logic)
• Shallow embedding in PropX, e.g.,
![Page 43: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/43.jpg)
Reasoning about control flow
• Direct jump
• Indirect jump
• Function call
• Return
![Page 44: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/44.jpg)
Stack and calling convention
[Figure: stack layout with labels: excess space, esp, local storage, return address, argument 1, argument 2, …, argument n, caller frames]
![Page 45: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/45.jpg)
What is a machine context?
[Figure: a machine context mctx with slots retv, bx, cx, dx, si, di, bp, sp; further labels: cs, public, private, ret]
typedef struct mctx_st *mctx_t;
struct mctx_st { int eax; int ebx; int ecx; int edx;
                 int esi; int edi; int ebp; int esp; };
![Page 46: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/46.jpg)
swapcontext()
void swapcontext (mctx_t old, mctx_t new);

mov eax, [esp+4]
mov [eax+ 0], OK
mov [eax+ 4], ebx
mov [eax+ 8], ecx
mov [eax+12], edx
mov [eax+16], esi
mov [eax+20], edi
mov [eax+24], ebp
mov [eax+28], esp
mov eax, [esp+8]
mov esp, [eax+28]
mov ebp, [eax+24]
mov edi, [eax+20]
mov esi, [eax+16]
mov edx, [eax+12]
mov ecx, [eax+ 8]
mov ebx, [eax+ 4]
mov eax, [eax+ 0]
ret
![Page 47: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/47.jpg)
First half of the proof
![Page 48: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/48.jpg)
Second half of the proof
![Page 49: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/49.jpg)
Other routines
void loadcontext (mctx_t mctx);
void makecontext (mctx_t mctx, char *sp, void *lnk, void *func, void *arg);
![Page 50: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/50.jpg)
Coq implementation
![Page 51: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/51.jpg)
Self‐Modifying Code (SMC)
Definition: Any program that loads, generates, or mutates code at runtime.
SMC is important to verify
– Intrinsically natural under von Neumann architecture
– Many applications, including
  Runtime code generation
  • Commonly used for improving performance
  • Java Library: gnu.bytecode, org.apache.bcel
  • C# (.NET) Library: System.Reflection.Emit
  Runtime code modification
  • Code obfuscation
  • Malicious software
  • Shellcode
![Page 52: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/52.jpg)
Example ‐ A Typical OS Bootloader
[Figure: Disk and Memory. Sector 1 holds the bootloader, copied by BIOS before start-up to memory at 0x7c00. Sector 2 holds the kernel, copied by the bootloader to memory at 0x1000. The bootloader code reads "Load kernel", then "jmp 0x1000" into the kernel (in mem). Memory starts at 0x0000.]
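The two copies in the figure can be replayed on a modeled machine to show the kernel bytes being handled first as data and then as the jump target. The C sketch below is a hypothetical model: the 512-byte sector size, the memory size, and all names are assumptions, and no instructions are actually decoded.

```c
#include <stdint.h>
#include <string.h>

enum { SECTOR_SIZE = 512,      /* assumed sector size */
       MEM_SIZE = 0x8000,      /* just large enough for this sketch */
       BOOT_ADDR = 0x7c00, KERNEL_ADDR = 0x1000 };

typedef struct {
    uint8_t disk[2][SECTOR_SIZE];  /* disk[0] = sector 1, disk[1] = sector 2 */
    uint8_t mem[MEM_SIZE];
    uint32_t pc;
} Machine;

/* What the BIOS does before start-up: copy sector 1 to 0x7c00 and jump there. */
void bios_start(Machine *m) {
    memcpy(&m->mem[BOOT_ADDR], m->disk[0], SECTOR_SIZE);
    m->pc = BOOT_ADDR;
}

/* What the bootloader does: copy the kernel as plain data, then jump to it. */
void bootloader_run(Machine *m) {
    memcpy(&m->mem[KERNEL_ADDR], m->disk[1], SECTOR_SIZE);  /* kernel as data */
    m->pc = KERNEL_ADDR;                                     /* kernel as code */
}

/* True when the bytes at KERNEL_ADDR match the kernel sector on disk. */
int kernel_in_memory(const Machine *m) {
    return memcmp(&m->mem[KERNEL_ADDR], m->disk[1], SECTOR_SIZE) == 0;
}
```

The predicate `kernel_in_memory` is the kind of fact a verifier has to establish before the jump: the code about to run is exactly the code that was on disk.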
![Page 53: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/53.jpg)
Verifying SMC ‐ the Challenge
For OS bootloader (runtime code generation)
– The kernel is accessed as regular data
– But also executes as program code
For general SMC scenario
– Code and data are stored in the same memory
– Program code alters at runtime
– Code may be modified an unbounded number of times
– Control flow is difficult to represent
All the existing verification techniques:
– have to assume a fixed code heap
– stop working in the presence of SMC!
[Figure: Memory from 0x0000, with the bootloader at 0x7c00 (Load kernel, then jmp 0x1000) and the kernel (in mem) at 0x1000]
![Page 54: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/54.jpg)
Our New Idea ‐ Machine model
Machine model used before
Generalized model
Extend the assertion language: expressing code body
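The difference between the two machine models can be made concrete with a tiny interpreter that fetches every instruction from the same memory it writes to. The C sketch below is a hypothetical toy, not GTM: the two-byte instruction encoding, the opcodes, and the demo program are invented for illustration.

```c
#include <stdint.h>

/* Made-up encoding: one opcode byte followed by one operand byte. */
enum { OP_HALT = 0, OP_ADDI = 1, OP_STB = 2, OP_JMP = 3 };
enum { MEM_SIZE = 256 };

typedef struct {
    uint8_t mem[MEM_SIZE];  /* code and data share this one array */
    uint8_t pc;
    uint8_t acc;
    int halted;
} Machine;

/* Decode and execute whatever instruction is currently stored at pc. */
void step(Machine *m) {
    uint8_t op = m->mem[m->pc];
    uint8_t arg = m->mem[(uint8_t)(m->pc + 1)];
    switch (op) {
    case OP_ADDI:                       /* acc += arg */
        m->acc = (uint8_t)(m->acc + arg);
        m->pc = (uint8_t)(m->pc + 2);
        break;
    case OP_STB:                        /* mem[arg] = acc */
        m->mem[arg] = m->acc;
        m->pc = (uint8_t)(m->pc + 2);
        break;
    case OP_JMP:                        /* pc = arg */
        m->pc = arg;
        break;
    default:                            /* OP_HALT or an undecodable byte */
        m->halted = 1;
        break;
    }
}

/* Run until halt or until the step budget is exhausted. */
void run(Machine *m, int max_steps) {
    for (int i = 0; i < max_steps && !m->halted; i++) step(m);
}

/* A program that rewrites the operand of its own third instruction. */
Machine smc_demo(void) {
    Machine m = {0};
    const uint8_t prog[] = {
        OP_ADDI, 5,   /* 0: acc = 5 */
        OP_STB,  5,   /* 2: mem[5] = acc, overwriting the operand below */
        OP_ADDI, 1,   /* 4: executes as ADDI 5 by the time it is reached */
        OP_HALT, 0    /* 6: stop */
    };
    for (unsigned i = 0; i < sizeof prog; i++) m.mem[i] = prog[i];
    return m;
}
```

With a fixed code heap the program text would predict a final `acc` of 6; on this machine the store at address 2 changes the instruction at address 4, so reasoning about the program has to track the code stored in memory at each point.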
![Page 55: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/55.jpg)
Intuition of CAP
The program is stored in the code heap
Code Heap
![Page 56: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/56.jpg)
Intuition of CAP
The program is stored in the code heap
Code blocks and control flow
For every code block,
– Assign a precondition
– Obtain intermediate conditions
– Reason through the code body
Each code block’s export condition must derive target preconditions
[Figure: Code Heap with code blocks f1, f2, f3 annotated with preconditions {a1}, {a2}, {a3}]
![Page 57: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/57.jpg)
Intuition of CAP
Code specification
– f1 : {a1}
– f2 : {a2}
– f3 : {a3}
Partial correctness
– Whenever fi is reached, ai is satisfied
![Page 58: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/58.jpg)
Intuition of GCAP
[Figure: Unified Heap]
![Page 59: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/59.jpg)
Intuition of GCAP (cont’d)
Generalized code blocks
– Executing sequences of code
– Can have overlap in memory
For every code block,
– Precondition and intermediate conditions carry program code
– Export condition must derive the disjunction of the target conditions
Parametric code
– Solves the verification of unbounded code modification
Local reasoning
– Eliminate irrelevant code
[Figure: Unified Heap with overlapping code blocks annotated with conditions a1 … a5]
![Page 60: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/60.jpg)
Example ‐ Certifying the bootloader
• Two code modules
• Control Flow
• Assign the pre‐conditions
[Figure: Memory from 0x0000. B1 is the bootloader at 0x7c00 (Load kernel, then jmp 0x1000); B2 is the kernel at 0x1000. Pre‐condition of B1: B2 in disk & B1 in memory. Pre‐condition of B2: B2 in memory.]
![Page 61: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/61.jpg)
Formalization
Assertion Logic
A higher‐order logic
We use the Calculus of Inductive Constructions (CiC)
Parametric code is expressed with existential quantifiers
Axiomatic inference rules for judgments
Well‐formed world
Well‐formed code heaps
Well‐formed code blocks
Three‐level systems
GCAP0: Verifying non‐self‐modifying code
GCAP1: Verifying runtime code generation
GCAP2: Verifying general self‐modifying code
![Page 62: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/62.jpg)
Soundness & Expressiveness
• Soundness and partial correctness (GCAP0,1,2)
– Theorem: any well‐formed world is safe to execute for an arbitrary number of steps without violating its specification
• Expressiveness (GCAP2)
– Theorem: Any invariant‐based proof can be translated into GCAP2.
– Expressiveness: GCAP2 > GCAP1 > GCAP0
![Page 63: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/63.jpg)
What we have verified [PLDI’07]
| Basic SMC Constructs | Important Applications |
|---|---|
| opcode modification | self‐growing code |
| control flow modification | polymorphic code |
| unbounded code rewriting | code optimization |
| runtime code checking | code compression |
| runtime code generation | code obfuscation |
| multilevel RCG | code encryption |
| self‐mutating code block | OS bootloaders |
| mutual modification | shellcode |
![Page 64: Advanced Topics: Reasoning About Code Pointers & Self](https://reader031.vdocuments.us/reader031/viewer/2022012423/61781afdee9a360e60343932/html5/thumbnails/64.jpg)
Implementation
• Implementation (Under Coq)
– General machine model GTM
– Encoding of x86 and MIPS architectures
– Assertion language and separation logic
– GCAPs with the complete proof of soundness
– Certified examples
  • A real OS bootloader (under Bochs)
  • MIPS code examples (under SPIM)