cs711: reference monitors part 1: os & sfi

CS711: Reference MonitorsPart 1: OS & SFIGreg MorrisettCornell University

Lang. Based Security

A Reference MonitorObserves the execution of a program and halts the program if its going to violate the security policy.

Common Examples:operating system (hardware-based)interpreters (software-based)firewalls

Claim: majority of todays enforcement mechanisms are instances of reference monitors.


Reference Monitors OutlineAnalysis of the power and limitations.What is a security policy?What policies can reference monitors enforce?Traditional Operating Systems.Policies and practical issuesHardware-enforcement of OS policies.Software-enforcement of OS policies.Why?Software-Based Fault Isolation Java and CLR Stack InspectionInlined Reference Monitors


Requirements for a MonitorMust have (reliable) access to information about what the program is about to do.e.g., what instruction is it about to execute? Must have the ability to stop the programcant stop a program running on another machine that you dont own.really, stopping isnt necessary, but transition to a good state.Must protect the monitors state and code from tampering.key reason why a kernels data structures and code arent accessible by user code. In practice, must have low overhead.


What Policies?Well see that under quite liberal assumptions:theres a nice class of policies that reference monitors can enforce (safety properties).there are desirable policies that no reference monitor can enforce precisely.rejects a program if and only if it violates the policyAssumptions:monitor can have access to entire state of computation.monitor can have infinite state.but monitor cant guess the future the predicate it uses to determine whether to halt a program must be computable.


Schneider's FormalismA reference monitor only sees one execution sequence of a program.So we can only enforce policies P s.t.:(1) P(S) = S.P ()where P is a predicate on individual sequences.A set of execution sequences S is a property if membership is determined solely by the sequence and not the other members in the set.


More Constraints on MonitorsShouldnt be able to see the future.Assumption: must make decisions in finite time.Suppose P () is true but P ([..i]) is false for some prefix [..i] of . When the monitor sees [..i] it cant tell whether or not the execution will yield or some other sequence, so the best it can do is rule out all sequences involving [..i] including .So in some sense, P must be continuous: (2) .P () (i.P([..i]))


Safety PropertiesA predicate P on sets of sequences s.t. (1) P(S) = S.P () (2) .P () (i.P([..i]))is a safety property: no bad thing will happen.

Conclusion: a reference monitor cant enforce a policy P unless its a safety property. In fact, Schneider shows that reference monitors can (in theory) implement any safety property.


Safety vs. SecuritySafety is what we can implement, but is it what we want?lack of info. flow isnt a property. Safety ensures something bad wont happen, but it doesnt ensure something good will eventually happen:program will terminateprogram will eventually release the lockuser will eventually make paymentThese are examples of liveness properties. policies involving availability arent safety prop. so a ref. monitor cant handle denial-of-service?


Safety Is NiceSafety does have its benefits:They compose: if P and Q are safety properties, then P & Q is a safety property (just the intersection of allowed traces.)Safety properties can approximate liveness by setting limits. e.g., we can determine that a program terminates within k steps.We can also approximate many other security policies (e.g., info. flow) by simply choosing a stronger safety property.


Practical IssuesIn theory, a monitor could:examine the entire history and the entire machine state to decide whether or not to allow a transition.perform an arbitrary computation to decide whether or not to allow a transition.In practice, most systems:keep a small piece of state to track historyonly look at labels on the transitionshave small labelsperform simple testsOtherwise, the overheads would be overwhelming.so policies are practically limited by the vocabulary of labels, the complexity of the tests, and the state maintained by the monitor.


Reference Monitors OutlineAnalysis of the power and limitations.What is a security policy?What policies can reference monitors enforce?Traditional Operating Systems.Policies and practical issuesHardware-enforcement of OS policies.Software-enforcement of OS policies.Why?Software-Based Fault IsolationInlined Reference Monitors


Operating Systems circa 75Simple Model: system is a collection of running processes and files.processes perform actions on behalf of a user.open, read, write filesread, write, execute memory, etc.files have access control lists dictating which users can read/write/execute/etc. the file.(Some) High-Level Policy Goals:Integrity: one users processes shouldnt be able to corrupt the code, data, or files of another user. Availability: processes should eventually gain access to resources such as the CPU or disk.Secrecy? Confidentiality? Access control?


What Can go Wrong?read/write/execute or change ACL of a file for which process doesnt have proper access.check file access against ACLprocess writes into memory of another processisolate memory of each process (& the OS!) process pretends it is the OS and execute its codemaintain process ID and keep certain operations privileged --- need some way to transition.process never gives up the CPUforce process to yield in some finite timeprocess uses up all the memory or diskenforce quotasOS or hardware is buggy...


Key Mechanisms in HardwareTranslation Lookaside Buffer (TLB)provides an inexpensive check for each memory access.maps virtual address to physical addresssmall, fully associative cache (8-10 entries)cache miss triggers a trap (see below)granularity of map is a page (4-8KB)Distinct user and supervisor modescertain operations (e.g., reload TLB, device access) require supervisor bit is set.Invalid operations cause a trapset supervisor bit and transfer control to OS routine.Timer triggers a trap for preemption.


Steps in a System CallTimecalls f=fopen(foo)User Processlibrary executes breakKerneltrapsaves context, flushes TLB, etc.checks UID against ACL, sets up IO buffers & file context, pushes ptr to context on users stack, etc.restores context, clears supervisor bitcalls fread(f,n,&buf)library executes breaksaves context, flushes TLB, etc.checks f is a valid file context, doesdisk access into local buffer, copiesresults into users buffer, etc.restores context, clears supervisor bit


Hardware TrendsThe functionality provided by the hardware hasnt changed much over the years. Clearly, the raw performance in terms of throughput has.Certain trends are clear:small => large # of registers: 8 16-bit =>128 64-bitsmall => large pages: 4 KB => 16 KBflushing TLB, caches is increasingly expensivecomputed jumps are increasingly expensivecopying data to/from memory is increasingly expensive So a trap into a kernel is costing more over time.


OS TrendsIn the 1980s, a big push for microkernels:Mach, Spring, etc.Only put the bare minimum into the kernel.context switching code, TLB managementtrap and interrupt handlingdevice accessRun everything else as a process.file system(s)networking protocolspage replacement algorithmSub-systems communicate via remote procedure call (RPC)Reasons: Increase Flexibility, Minimize the TCB


A System Call in MachTimef=fopen(foo)User ProcessbreakKernelsaves contextchecks capabilities, copies argumentsswitches to Unix server contextUnix Serverchecks ACL, sets up buffers, etc. returns to user. saves contextchecks capabilities,copies resultsrestores userscontext


MicrokernelsClaim was that flexibility and increased assurance would win out. But performance overheads were non-trivial Many PhDs on minimizing overheads of communicationEven highly optimized implementations of RPC cost 2-3 orders of magnitude more than a procedure call.

Result: a backlash against the approach.Windows, Linux, Solaris continue the monolithic tradition. and continue to grow for performance reasons (e.g., GUI) and for functionality gains (e.g., specialized file systems.) Mac OS X, some embedded or specialized kernels (e.g., Exokernel) are exceptions. VMware achieves multiple personalities but has monolithic personalities sitting on top.


Performance MattersThe hit of crossing the kernel boundary:Original Apache forked a process to run each CGI: could attenuate file access for sub-processprotected memory/data of server from rogue scripti.e., closer to least privilegeToo expensive for a small script: fork, exec, copy data to/from the server, etc. So current push is to run the scripts in the server. i.e., throw out least privilegeSimilar situation with databases, web browsers, file systems, etc.


The Big Question?From a least privilege perspective, many systems should be decomposed into separate processes. But if the overheads of communication (i.e., traps, copying, flushing TLB) are too great, programmers wont do it.

Can we achieve isolation and cheap communication?


Reference Monitors OutlineAnalysis of the power and limitations.What is a security policy?What policies can reference monitors enforce?Traditional Operating Systems.Policies and practical issuesHardware-enforcement of OS policies.Software-enforcement of OS policies.Why?Software-Based Fault IsolationJava Stack InspectionInlined Reference Monitors


Software Fault Isolation (SFI)Wahbe et al. (SOSP93)Keep software components in same hardware-based address space.Use a software-based reference monitor to isolate components into logical address spaces.conceptually: check each read, write, & jump to make sure its within the components logical address space.hope: communication as cheap as procedure call.worry: overheads of checking will swamp the benefits of communication.Note: doesnt deal with other policy issuese.g., availability of CPU


One Way to SFIvoid interp(int pc, reg[], mem[], code[], memsz, codesz) { while (true) { if (pc >= codesz) exit(1); int inst = code[pc], rd = RD(inst), rs1 = RS1(inst), rs2 = RS2(inst), immed = IMMED(inst); switch (opcode(inst)) { case ADD: reg[rd] = reg[rs1] + reg[rs2]; break; case LD: int addr = reg[rs1] + immed; if (addr >= memsz) exit(1); reg[rd] = mem[addr]; break; case JMP: pc = reg[rd]; continue; ... } pc++;}}0: add r1,r2,r31: ld r4,r3(12)2: jmp r4


Pros & Cons of InterpreterPros:easy to implement (small TCB.)works with binaries (high-level language-independent.)easy to enforce other aspects of OS policyCons:terribly execution overhead (x25? x70?)but its a start.


Partial Evaluation (PE)A technique for speeding up interpreters.we know what the code is.specialize the interpreter to the code.unroll the loop one copy for each instructionspecialize the switch to the instructioncompile the resulting codeFor a cool example of this, see Fred Smith's thesis (hanging off my web page.)


Example PESpecialized interpreter:

reg[1] = reg[2] + reg[3]; addr = reg[3] + 12; if (addr >= memsz) exit(1); reg[4] = mem[addr]; pc = reg[4] 0: add r1,r2,r31: ld r4,r3(12)2: jmp r4 ...Original Binary:while (true) { if (pc >= codesz) exit(1); int inst = code[pc]; ...}Interpreter0: add r1,r2,r31: addi r5,r3,122: subi r6,r5,memsz3: jab _exit4: ld r4,r5(0) ...Resulting Compiled Code


SFI in PracticeUsed a hand-written specializer or rewriter.Code and data for a domain in one contiguous segment.upper bits are all the same and form a segment id.separate code space to ensure code is not modified.Inserts code to ensure stores [optionally loads] are in the logical address space.force the upper bits in the address to be the segment idno branch penalty just mask the addressmay have to re-allocate registers and adjust PC-relative offsets in code.simple analysis used to eliminate unnecessary masksInserts code to ensure jump is to a valid targetmust be in the code segment for the domainmust be the beginning of the translation of a source instructionin practice, limited to instructions with labels.


More on JumpsPC-relative jumps are easy:just adjust to the new instructions offset.Computed jumps are not:must ensure code doesnt jump into or around a check or else that its safe for code to do the jump.for this paper, they ensured the latter:a dedicated register is used to hold the address thats going to be written so all writes are done using this register.only inserted code changes this value, and its always changed (atomically) with a value thats in the data segment.so at all times, the address is valid for writing.works with little overhead for almost all computed jumps.


More SFI DetailsProtection vs. Sandboxing:Protection is fail-stop:stronger security guarantees (e.g., reads)required 5 dedicated registers, 4 instruction sequence20% overhead on 1993 RISC machinesSandboxing covers only storesrequires only 2 registers, 2 instruction sequence5% overheadRemote Procedure Call:10x cost of a procedure call10x faster than a really good OS RPCSequoia DB benchmarks: 2-7% overhead for SFI compared to 18-40% overhead for OS.


QuestionsWhat happens on the x86?small # of registersvariable-length instruction encodingWhat happens with discontiguous hunks of memory? What would happen if we really didnt trust the extension?i.e., check the arguments to an RPC?timeouts on upcalls?Does this really scale to secure systems?


cs711: reference monitors part 1: os & sfi

Documents

policies p

policy p

os policies

reference monitorspart

safety property

based securityrequirements

monitors state

desirable policies