Efficient Software-Based Fault Isolation

By Robert Wahbe, Steven Lucco, Thomas E. Anderson, and Susan L. Graham

Presented by Pehr Collins

Background: Tanenbaum-Torvalds Debate

[Photos: Linus Torvalds and Andrew Tanenbaum]

Monolithic versus Microkernel

• Monolithic OS
– Co-locates modules in the same address space
– Faults in extension code could bring down the whole OS or corrupt data
– Not safe!
– Many developers choose performance over safety
• Microkernel OS
– Core functions handled by microkernel
– Additional functionality added by means of modules in separate address spaces
– Faults are isolated
– Safe but has a performance cost
– Calls between modules require a full context switch
– Three orders of magnitude more expensive than a normal procedure call within the same address space

Resolving Conflict Between Safety and Performance

• Last week we looked at “Improving IPC by Kernel Design” by J. Liedtke
– Optimization techniques could reduce the context-switch penalty for microkernel IPC to two orders of magnitude
– Contemporaneous with this paper, but still not enough to tip the balance
• Enter software-based fault isolation
– No more conflict: OS extension code can be both safe and efficient

How to Resolve Conflict? Sandboxing

• Fault domains are contiguous memory segments used for untrusted modules
– Distinguished by unique identifiers

• All modules share one address space, and protection is enforced in software

Isolating the Fault Domain

• Distrusted module code in a fault domain is modified to prevent writing and jumping to outside addresses

• This prevents distrusted module from harming other domains

• Two ways to accomplish this
– Segment matching, which pinpoints fault locations
– Address sandboxing, which provides no data on the source of faults

Segment Matching

• Most control transfer instructions can be statically verified, as the target address is known at compile time
• Checks are added to all other potentially unsafe instructions
– Jumps to a register address
– Stores to a register address
• Illegal addresses are prevented via segment matching
– Check whether the unsafe instruction’s target address has the correct segment identifier
– If the check fails, trap to a system error routine outside the distrusted module’s fault domain

[Figure: the segment ID is compared for equality with the upper bits of the target address]

Segment Matching

• Requires four dedicated registers
1. Holds addresses in the code segment
2. Holds addresses in the data segment
3. Holds the segment shift amount
4. Holds the segment identifier
• These registers are used only by inserted code, never modified by distrusted module code
• Dedicated registers are used to perform the checks on untrusted code
• Performance impact of dedicating these registers is minimal on a RISC system

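The check that segment matching inserts before an unsafe store can be sketched as below. This is a minimal simulation, not the authors' inserted instruction sequence: the names `SEGMENT_SHIFT`, `DATA_SEGMENT_ID`, `FaultIsolationTrap`, and `checked_store`, the 24-bit offset width, and the dictionary standing in for memory are all assumptions made for illustration.

```python
# Hypothetical layout: the low 24 bits of an address are the offset inside a
# segment, and the bits above them are the segment identifier.
SEGMENT_SHIFT = 24       # plays the role of the "segment shift amount" register
DATA_SEGMENT_ID = 0x12   # plays the role of the "segment identifier" register


class FaultIsolationTrap(Exception):
    """Stands in for the trap to the system error routine."""


def checked_store(memory, target_address, value):
    """Store value at target_address only if it lies in the data segment."""
    # Copy the target into a "dedicated register" so that the address that is
    # checked is the same address that is used by the store.
    dedicated_address = target_address
    # Shift right to isolate the upper bits of the address.
    segment = dedicated_address >> SEGMENT_SHIFT
    # Compare against the fault domain's segment identifier; trap on mismatch.
    if segment != DATA_SEGMENT_ID:
        raise FaultIsolationTrap(hex(target_address))
    memory[dedicated_address] = value
```

Because the faulting address is still available when the trap fires, this variant can report where the fault occurred, which is the property the slides credit to segment matching.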

Address Sandboxing

• Even better performance than segment matching
• Cost: lose the information about the source of the faults
• Before each unsafe instruction, insert code that sets the upper bits of the target address to the correct segment identifier
• Does not catch illegal addresses
• Prevents illegal addresses from affecting any other fault domain
• But what happens when there is an illegal address?
– It just jumps/writes to a garbage location within the fault domain

[Figure: the segment ID overwrites the upper bits of the target address]

Address Sandboxing

• Requires five dedicated registers
1. Holds the segment mask
2. Holds the code segment identifier
3. Holds the data segment identifier
4. Holds the sandboxed code address
5. Holds the sandboxed data address

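Address sandboxing replaces the compare-and-trap with two bit operations: clear the segment bits, then set them to the fault domain's identifier. The sketch below simulates that under assumed values; `OFFSET_BITS`, `SEGMENT_MASK`, `DATA_SEGMENT_BITS`, `sandbox`, and `sandboxed_store` are hypothetical names, and a dictionary stands in for memory.

```python
OFFSET_BITS = 24                             # hypothetical offset width
SEGMENT_MASK = (1 << OFFSET_BITS) - 1        # keeps offset bits, clears segment bits
DATA_SEGMENT_BITS = 0x12 << OFFSET_BITS      # segment identifier, already in position


def sandbox(address):
    """Force address into the data segment by overwriting its upper bits."""
    return (address & SEGMENT_MASK) | DATA_SEGMENT_BITS


def sandboxed_store(memory, target_address, value):
    """Store through the sandboxed address and return where the write landed."""
    dedicated_address = sandbox(target_address)   # the "dedicated register"
    memory[dedicated_address] = value
    return dedicated_address
```

A legal address passes through unchanged, while an illegal one is silently redirected to some location inside the fault domain. No trap is raised, which is why the source of a fault is lost.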

Both Techniques Require Dedicated Registers

• Segment matching: 4 dedicated registers
• Address sandboxing: 5 dedicated registers
• What happens if all registers are already allocated by the compiler?

Trust/Performance Tradeoff

• Only distrusted modules incur performance penalty

• Trusted modules can run at full speed

• We have covered write and jump, but what about load?

• Security can be ramped up to prevent distrusted modules from reading data outside their fault domain
– Increases execution time overhead (by quite a bit)!

Resource Protection

• Fault domains share the same virtual address space
• Problem: if a fault domain makes system calls, it can close or delete files needed by other code in the address space
– Could cause a crash
• Potential solution: modify the OS to know about fault domains
– Not portable
• Their solution: resource arbitration

Resource Arbitration

• Require distrusted modules to access resources through cross-fault-domain RPC
• Reserve a fault domain to hold trusted arbitration code
• Arbiter determines whether system calls made by other fault domains are safe
• System calls in the object code of distrusted modules are transformed to use the arbiter RPC call
• Trusted modules make system calls as normal and share a fault domain with the arbiter
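One way to picture the arbiter is as a table that records which fault domain opened each handle and refuses requests from anyone else. The slides do not say what policy the arbiter applies, so the ownership rule below is an assumption, as are the `Arbiter` class and its method names. No real system calls are made; the handles are just integers.

```python
class Arbiter:
    """Toy arbiter: a fault domain may only close handles it opened itself."""

    def __init__(self):
        self._owner = {}        # handle -> identifier of the owning fault domain
        self._next_handle = 3   # arbitrary starting value for simulated handles

    def open(self, domain, path):
        """Simulate an open request routed through the arbiter."""
        handle = self._next_handle
        self._next_handle += 1
        self._owner[handle] = domain
        return handle

    def close(self, domain, handle):
        """Allow the close only if the requesting domain owns the handle."""
        if self._owner.get(handle) != domain:
            return False        # request rejected; other domains are unaffected
        del self._owner[handle]
        return True
```

With this rule, a distrusted module cannot close a file that code in another fault domain still depends on, which is the crash scenario described under Resource Protection.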

How Do Modules Communicate? Cross-fault-domain RPC

• Since the whole idea of fault domains is to provide better IPC performance, this is essential
• Trusted stubs are used for fault domains to call outside their domain
• Stubs run unprotected outside caller and callee domains
• Stubs copy cross-domain arguments (marshal) and manage machine state
• Trustworthiness of the stubs allows caller and callee to communicate via a shared buffer
• This creates an LRPC, as only a single shared copy of the data is necessary
• Stubs are created manually for now

Cross-fault-domain RPC

Jump Table
• Allows the untrusted module to call into a stub outside its fault domain
• Each entry in the jump table is a legal entry point to a stub outside the untrusted fault domain
• Is read-only to the untrusted module
• Is written to by trusted modules to set the entry point addresses
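The jump table and stubs can be sketched together as follows. This is only an analogy in a high-level language: a tuple stands in for the read-only table, a deep copy stands in for argument marshalling, and every name here (`make_stub`, `build_jump_table`, `call_via_jump_table`, `append_total`) is hypothetical.

```python
import copy


def make_stub(callee):
    """Wrap a callee in a trusted stub that copies arguments across domains."""
    def stub(*args):
        copied_args = copy.deepcopy(args)   # callee never sees the caller's objects
        return callee(*copied_args)
    return stub


def build_jump_table(stubs):
    """Trusted code fills the table once; a tuple cannot be modified afterwards."""
    return tuple(stubs)


def call_via_jump_table(jump_table, index, *args):
    """Untrusted code may only leave its domain through an existing entry."""
    if not 0 <= index < len(jump_table):
        raise IndexError("no such jump table entry")
    return jump_table[index](*args)


def append_total(values):
    """Example callee living in another fault domain."""
    values.append(sum(values))
    return values


table = build_jump_table([make_stub(append_total)])
caller_data = [1, 2, 3]
result = call_via_jump_table(table, 0, caller_data)
```

The caller's list is untouched after the call because the stub handed the callee a copy. Trying to assign to an entry of `table` raises `TypeError`, mirroring the read-only property.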

Performance Testing

• Prototype running on DEC-MIPS and DEC-ALPHA

• Considered:
1. How much overhead is incurred by software encapsulation?
2. How fast is cross-fault-domain RPC?
3. What is the performance impact of using software-enforced fault isolation on an application?

Encapsulation Overhead

RPC and Fault Isolation Costs

[Tables: Fault Domain RPC Cost; Fault Isolation Overhead in POSTGRES]

Results Analysis

• Savings can be represented by the following formula

• Function of:
– Time spent in distrusted code (td)
– Percentage of time spent crossing fault domains (tc)
– Overhead of encapsulation (h)
– Ratio (r) of fault domain crossing time to the crossing time of competing hardware-based RPC
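The formula itself did not survive in this copy of the slides. From the four quantities listed, a consistent reconstruction is savings = (1 - r)·tc - h·td: crossing time shrinks from tc to r·tc, while distrusted code pays an extra h·td. The sketch below assumes that form. The function names and the sample value r = 0.1 are illustrative only; 4.3% is the average encapsulation overhead quoted in the conclusion.

```python
def savings(t_c, t_d, h, r):
    """Fraction of execution time saved by software fault isolation.

    t_c: fraction of time spent crossing fault domains (hardware-based RPC)
    t_d: fraction of time spent in distrusted code
    h:   encapsulation overhead on distrusted code (0.043 means 4.3%)
    r:   software crossing time divided by hardware RPC crossing time
    """
    return (1 - r) * t_c - h * t_d


def break_even_crossing_time(t_d, h, r):
    """Value of t_c at which the savings are exactly zero (requires r < 1)."""
    return h * t_d / (1 - r)


# Entire application encapsulated (t_d = 1.0), hypothetical r = 0.1.
example = savings(t_c=0.30, t_d=1.0, h=0.043, r=0.1)
threshold = break_even_crossing_time(t_d=1.0, h=0.043, r=0.1)
```

Under these assumed inputs, software isolation wins whenever the application spends more than `threshold` of its time crossing fault domains.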

Performance Analysis with Entire Application Encapsulated

Performance Analysis with 50% of Application Encapsulated

Conclusion

• Results are impressive at first glance
• Suggest that software-based fault isolation is the way to go in many cases where crossing time is sufficiently quicker than standard RPC
• However, security, security, security!
• When security for reads is desired, overhead shoots way up
– From 4.3% on average to 21.8%!
• Errors from address sandboxing are difficult to track
– Could generate a garbage address inside the fault domain
• Stubs are manually generated
• Requirement to dedicate 4 or 5 registers could be problematic
– Solution is geared towards RISC architectures
– Authors mention that CISC systems like 8086 would suffer performance penalties due to dedicated register requirements

Thanks for your attention!

• Diagrams on Monolithic/Microkernel from Wikipedia
• Photos of Linus Torvalds and Andrew Tanenbaum from Wikipedia
• Segment Matching and Address Sandboxing figures from Tony Bock’s presentation on the same paper (Winter 2006)

Quick Recap
Fault Isolation in cooperating modules: what is the problem?

• Existing schemes place each module in its own address space
• This isolates faults
• Major context switch overhead for tightly-coupled modules

Quick Recap
A Solution in Two Parts

1. Load code and data for the distrusted module into its own fault domain
2. Modify the object code of this module to prevent writing or jumping to addresses outside its fault domain

• Portable and language-agnostic solution
• Cost is a slight increase in execution time for distrusted modules
• Yields a significant boost in inter-fault-domain performance and hence overall performance
