Enabling Trusted Software Integrity
Darko Kirovski
Microsoft Research
Milenko Drinić, Miodrag Potkonjak
Computer Science Department, University of California, Los Angeles
Problem Description
[Figure: two stack diagrams, each showing a HIGH PRIORITY PROCESS and a LOW PRIORITY PROCESS, with memory drawn from HIGH to LOW addresses: RETURN ADDRESS, LOCAL VARIABLES, BUFFER. The second diagram adds incoming DATA and ATTACK CODE.]
Buffer Overrun
Goal
– Exploit improperly implemented I/O
– Divert execution to attack code
Simplest variant
– Stack smashing
– "Smashing The Stack For Fun And Profit" by Aleph One, Phrack 49, 1996.
Numerous variants explore different vulnerabilities
– Tutorials on the Web with bug descriptions
– setuid() – Chen, Wagner, Dean, 2002.
What Can Be Done?
StackGuard – Cowan et al., 1998
– Dummy value next to return address
Bounds checking for all pointers – Jones, Kelly, 1995
– Slow in pointer-intensive software
Static analysis – Wagner, 2000
– Verify all buffers – promising idea
– Too many false alarms
– Need to be resolved manually
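The "dummy value next to return address" idea can be illustrated with a toy model. The sketch below is not StackGuard itself: it simulates a stack frame as a Python bytearray, and the layout, the sizes, and all function names are assumptions made for illustration.

```python
BUF_SIZE = 8   # bytes reserved for the local buffer (illustrative)
WORD = 4       # size of the dummy value (canary) and of the return address


def make_frame(canary, return_addr):
    """Toy frame, low to high addresses: [buffer | canary | return address]."""
    assert len(canary) == WORD and len(return_addr) == WORD
    return bytearray(BUF_SIZE) + bytearray(canary) + bytearray(return_addr)


def unchecked_copy(frame, data):
    """Copy input into the buffer with no bounds check on BUF_SIZE."""
    n = min(len(data), len(frame))  # only stops at the end of the toy frame
    frame[:n] = data[:n]


def canary_intact(frame, canary):
    """Check made before returning: is the dummy value still in place?"""
    return bytes(frame[BUF_SIZE:BUF_SIZE + WORD]) == canary


def return_address(frame):
    """Read the saved return address from the toy frame."""
    return bytes(frame[BUF_SIZE + WORD:BUF_SIZE + 2 * WORD])
```

An input that fits the buffer leaves the dummy value untouched. An input long enough to reach the return address has to cross the dummy value first, so the check fails before the diverted address is used.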
Intrusion Prevention
Current approaches
– Intrusion detection
It is easier to PREVENT than to DETECT
Intrusion prevention system
– Adversary must solve a computationally difficult task to run programs in high priority
Two types of binaries
– Ordinary
– Touched with a security wand
Run-time verification
Outline
How does the system work?
Software installation
Example of constraint embedding
Run-time verification
How to break the system?
Effect on performance
An Intrusion Prevention System
PUBLIC MODE
– Executes any code.
– Restricted access to resources.
– Script interpreters, distrusted programs, P2P networking, etc.
INSTALLATION MODE
– Single process. Interrupts disabled.
– Input = software.
– Output = software with additional CPUID-dependent constraints.
– Atomic execution unit.
– Keyed MAC
TRUSTED MODE
– Runs only trusted processes: OS + user defined.
– Full or controlled access.
– Keyed MAC: Abort or Run
CPUID
– Burnt-in. Not a privacy issue, because it is never revealed externally.
[Figure: software enters INSTALLATION MODE and leaves as trusted software for TRUSTED MODE.]
Software Installation
Installer is on-chip or on an EPROM with verified contents
Single process
I/O – memory mapped
Interrupts disabled
Used registers, memory overwritten
~ BOOT on PCs
GOAL: embed constraints w/o revealing CPUID.
[Figure: SPEF Installation. Labels: CPUID; software master-copy, I-block; TI hash; Encrypt (3DES); random bitstream; domain ordering; constraint embedding; I-block, software working-copy.]
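A minimal sketch of how a CPUID-dependent bitstream could be derived from an I-block, under stated assumptions. The slides name a TI hash followed by 3DES encryption; this sketch swaps in SHA-256 and HMAC-SHA256 from the Python standard library, and models transformation invariance by hashing the sorted instruction strings. The function names and that notion of invariance are illustrative, not the authors' design.

```python
import hashlib
import hmac


def ti_hash(i_block):
    """Toy transformation-invariant hash: sorting the instructions first
    makes the digest independent of instruction order (an assumption)."""
    canonical = "\n".join(sorted(i_block)).encode()
    return hashlib.sha256(canonical).digest()


def derive_bitstream(cpuid, i_block, n_bits=64):
    """Keyed step: HMAC-SHA256 stands in for the 3DES encryption on the
    slide. The CPUID is the key and never appears in the output."""
    mac = hmac.new(cpuid, ti_hash(i_block), hashlib.sha256).digest()
    bits = "".join(f"{byte:08b}" for byte in mac)
    return bits[:n_bits]  # at most 256 bits are available from one MAC
```

Because the hash ignores instruction order, the installer and a later verifier obtain the same bitstream from the master-copy and from the reordered working-copy, while a different CPUID yields an unrelated bitstream.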
Example: Instruction Scheduling
SPEF - Instruction Rescheduling
[Figure: an I-block of the software master-copy (SUB, MOV, MOV, DIV, MULT, XOR, SUB, JUMP, ADD, MOV) shown in several instruction orders. Labels: CPUID; TI hash; Encrypt (3DES); random bitstream; domain ordering; constraint embedding - verification.]
How the Bitstream Reorders Ops?
a) Initial order of instructions and their possible positions
(1) 0x0080e0 LDR r1,[r8,#0]
(2) 0x0080e4 LDR r0,[r9,#0]
(3) 0x0080e8 MOV r3,r5
(4) 0x0080ec MUL r2,r0,r1
(5) 0x0080f0 MOV r1,#1
(6) 0x0080f4 LDR r0,[r6,#0]
[Grid of instructions (1)-(6) against possible positions (1)-(6). Legend: initial position, possible position, conditional possible position.]
b) Dependency graph
[Figure: graph over instructions (1)-(6).]
c) Sample bit-stream
1010...0110
d) Instruction ordering procedure
Control step | Available instructions (encoding 00 01 10 11) | Part of bit-stream used | Selected instruction
1            | (1) (2) (3) -                                 | 10                      | (3)
2            | (1) (2) - -                                   | 1                       | (2)
3            | (1)* (4)* - -                                 | -                       | (1)
4            | (4) (5)* (6)* -                               | -                       | (4)
5            | (5) (6) - -                                   | 0                       | (5)
e) Final order of instructions
[Listing: the instructions in the order selected in d).]
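The ordering procedure in the figure can be sketched as a bit-stream-driven list scheduler. This is a sketch under assumptions: the dependency sets are read off the register usage in the listing (the graph itself did not survive extraction), available instructions are ordered by their label, and an unused code such as 11 is mapped back with a modulo. None of these details is confirmed by the slides.

```python
def reorder(instructions, deps, bits):
    """Pick one available instruction per control step, using as many
    bits of the bit-stream as needed to encode the choice."""
    scheduled, remaining, pos = [], set(instructions), 0
    while remaining:
        done = set(scheduled)
        available = sorted(i for i in remaining if deps.get(i, set()) <= done)
        if not available:
            raise ValueError("cyclic dependencies")
        if len(available) == 1:
            choice = available[0]                     # no bits consumed
        else:
            width = (len(available) - 1).bit_length()  # bits to encode a choice
            chunk = bits[pos:pos + width]
            if len(chunk) < width:
                raise ValueError("bit-stream exhausted")
            pos += width
            choice = available[int(chunk, 2) % len(available)]  # assumption
        scheduled.append(choice)
        remaining.remove(choice)
    return scheduled


def verify(order, deps, bits):
    """Run-time check: re-derive the order and compare with what is stored."""
    return reorder(sorted(order), deps, bits) == list(order)


# Assumed dependencies: (4) reads r0 and r1 loaded by (1) and (2);
# (5) and (6) overwrite r1 and r0, so they must follow (4).
DEPS = {4: {1, 2}, 5: {4}, 6: {4}}
```

With the sample bit-stream prefix `1010`, `reorder([1, 2, 3, 4, 5, 6], DEPS, "1010")` selects (3), (2), (1), (4), (5), (6), which matches the selected-instruction column of the table.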
Constraint Embedding Techniques
Entropy of program representation is high
Reduce entropy w/ constraints for 50+ bits with preserved performance
Exact entropy reduction unique for each CPUID
Constraint types
– Requirements
• High entropy
• Functional transparency
• Transformation invariance
• Effective implementation
• Low performance overhead
– Examples
• Instruction rescheduling
• Register assignment
• Basic block reordering
• Conditional branch selection
• Filling unused opcode fields
• Toggling signs of operands
CPU + SPEF Verifier
[Figure: traditional CPU architecture extended with a verifier. Labels: CPU ID; I-block buffer; software working-copy, I-block; TI hash; Encrypt (3DES); random bitstream; domain ordering; constraint verification; ABORT or RUN.]
Run-time Code Verification
ARM instruction set and simulated system
50 cycles
20K gates
HW support?
[Figure label: cache line]
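A sketch of the ABORT-or-RUN decision, under several assumptions. The constraint type is "filling unused opcode fields", modelled as one spare bit per instruction. HMAC-SHA256 stands in for the TI hash plus 3DES step. The cache memoizes the keyed derivation per TI hash, which is only one possible reading of the TI-hash cache mentioned on the performance slide. All names are hypothetical.

```python
import hashlib
import hmac


def ti_hash(opcodes):
    """Hash only the opcodes: the spare bits are the embedded constraint
    and must not influence the digest."""
    return hashlib.sha256("\n".join(opcodes).encode()).digest()


def derive_bits(cpuid, digest, n):
    """Keyed derivation of n expected spare bits (n <= 256)."""
    mac = hmac.new(cpuid, digest, hashlib.sha256).digest()
    return [int(c) for c in "".join(f"{b:08b}" for b in mac)[:n]]


def install(cpuid, opcodes):
    """Installation mode: fill each spare bit from the bitstream."""
    bits = derive_bits(cpuid, ti_hash(opcodes), len(opcodes))
    return list(zip(opcodes, bits))


class Verifier:
    def __init__(self, cpuid):
        self._cpuid = cpuid
        self._cache = {}  # TI hash -> expected spare bits (assumed design)

    def check(self, i_block):
        opcodes = [op for op, _ in i_block]
        digest = ti_hash(opcodes)
        expected = self._cache.get(digest)
        if expected is None:  # cache miss: pay for the keyed step once
            expected = derive_bits(self._cpuid, digest, len(opcodes))
            self._cache[digest] = expected
        actual = [bit for _, bit in i_block]
        return "RUN" if actual == expected else "ABORT"
```

A block installed on one CPUID returns RUN on a verifier holding the same CPUID. Flipping a spare bit, or presenting the block to a verifier with a different CPUID, returns ABORT.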
How to Break the System?
Cryptographically secure keyed MAC
– Hard to extract CPUID from working-copies
– Hard to create an I-block with CPUID constraints satisfied w/o the CPUID
Patch low entropy instruction blocks
– I-block with low entropy? Example:
• I-block = one instruction and all other NOPS
– Hardware must detect I-blocks with low entropy
• Count and limit domain cardinality
• Done during domain ordering
Patch I-blocks from working copies
– Difficult? Hard to evaluate w/o a lot of software
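The "count and limit domain cardinality" check can be sketched as follows. The sketch assumes that domain ordering yields, for each embedding decision in an I-block, the number of admissible choices, and that a block is rejected when the total falls below a threshold. The 50-bit default echoes the "50+ bits" figure on the constraint embedding slide. The function names and the floor-of-log2 accounting are illustrative assumptions.

```python
def embeddable_bits(cardinalities):
    """Whole bits that can be embedded, given the number of admissible
    choices (domain cardinality) at each decision point."""
    total = 0
    for c in cardinalities:
        if c < 1:
            raise ValueError("each domain needs at least one choice")
        total += c.bit_length() - 1  # floor(log2(c)); a forced choice adds 0
    return total


def accept_i_block(cardinalities, min_bits=50):
    """Reject I-blocks that cannot carry enough CPUID-dependent entropy."""
    return embeddable_bits(cardinalities) >= min_bits
```

An I-block made of one instruction and NOPs leaves a single choice at nearly every step, so it embeds close to zero bits and is rejected. A block with 30 four-way choices embeds 60 bits and passes.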
Performance
Embedded bits of entropy
[Figure: Cumulative Degrees of Freedom For All Constraint Types. Five histograms (Mpeg Encode, Mpeg Decode, Jpeg Encode, Jpeg Decode, Pegwit); x-axis ticks 0 to 300; y-axis: 64-Instruction Block Count. Panel annotations, in extracted order: 343 blocks, mean 136; 380 blocks, mean 146; 242 blocks, mean 142; 330 blocks, mean 152; 216 blocks, mean 140.]
Performance effect
– 13-25% overhead
– 7-17% with a cache that logs TI-hashes
[Figure: Effective CPI against cache size (1K FA, 1K DM, 2K FA, 2K DM, 4K FA, 4K DM) for three configurations: No Verification, Verification without TIH cache, Verification with TIH cache.]
Simulated w/ ARMulator
ARM instruction set
MediaBench suite
Summary
Intrusion prevention
On-line software verification for authenticity
Keyed message authentication code
– Stored as footer
– Stored as constraints
• 50% decrease in code size overhead
Public and trusted execution mode
Relatively hi/lo performance overhead
– No hardware acceleration
– 20% - sets back Moore's Law 4.5 months