Transcript
Page 1: CLR Reliability under Memory Exhaustion

07/09/04 Windows Reliability Team 1

CLR Reliability under Memory Exhaustion

Solomon Boulos

Page 2: CLR Reliability under Memory Exhaustion

Temporary Memory Exhaustion causes failures

• Out of Memory (OOM) is temporary

• Shouldn’t cause failure
  – Just wait for memory to become available
  – System takes action to free up memory

• All managed code depends on CLR

• Testing is difficult
  – Exceptions are objects
  – Boxing (casting value type to object)
  – JIT compilation
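The points under "Testing is difficult" name allocations that happen without an explicit `new` in user code. A minimal C# sketch of the boxing case follows; the class and method names are illustrative and not from the talk. It shows that each conversion of a value type to `object` yields a separate object, so each one is normally an allocation that could fail under memory pressure.

```csharp
using System;

public static class BoxingDemo
{
    // Returns true when two boxes of the same int are distinct objects,
    // which shows that each boxing conversion creates its own box.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;   // implicit boxing: normally a managed heap allocation
        object second = value;  // a second, independent box
        return !ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct(42));
    }
}
```

Exception objects and JIT compilation allocate in the same implicit way, which is why a handler that looks allocation-free can still run out of memory.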

Page 3: CLR Reliability under Memory Exhaustion

Overview

• Previous Work
  – Reliability Working Group
  – Improvements for Whidbey

• OOM behavior
  – Everett (CLR v1.1)
  – Whidbey (CLR v2.0)
  – WinFX

• Solutions
  – Transactions
  – Recovery

Page 4: CLR Reliability under Memory Exhaustion

Reliability Working Group

• Discussion of CLR reliability issues

• Interaction with Yukon and Avalon teams

• FailFast Behavior

• Controversial Decisions

• Fault Injection

Page 5: CLR Reliability under Memory Exhaustion

Improvements for Whidbey

• CLR hardened to Out of Memory (OOM)

• Constrained Execution Regions (CERs)
  – Eagerly Prepared (No JIT Compiling)
  – Blocks ThreadAbort

• Reliability Contracts
  – Describes reliability attributes of code
  – Allows for function calls within CER

• Unhandled Exception Policy
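As an illustration of how a CER and a reliability contract fit together, here is a minimal sketch that assumes the .NET Framework 2.0 or later API surface (`RuntimeHelpers.PrepareConstrainedRegions` and `ReliabilityContractAttribute`). The class, field, and method names are hypothetical. On modern .NET these APIs are marked obsolete and have no effect, so treat this as a picture of the Whidbey-era pattern rather than current guidance.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.ConstrainedExecution;

public static class CerSketch
{
    private static int balance;

    // Reliability contract: the author promises this method will not corrupt
    // state and will succeed inside a CER. The runtime does not verify it.
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)]
    private static void Restore(int saved)
    {
        balance = saved;
    }

    public static int UpdateWithRollback(int delta, bool fail)
    {
        int saved = balance;
        bool committed = false;

        // Must immediately precede the try; the catch and finally blocks
        // become the CER and are prepared ahead of time on .NET Framework.
        RuntimeHelpers.PrepareConstrainedRegions();
        try
        {
            balance += delta;
            if (fail) throw new InvalidOperationException("simulated failure");
            committed = true;
        }
        catch (InvalidOperationException)
        {
            // Swallowed here so the sketch can return the restored value.
        }
        finally
        {
            if (!committed) Restore(saved);
        }
        return balance;
    }
}
```

Calling `UpdateWithRollback(5, false)` and then `UpdateWithRollback(3, true)` leaves the balance at 5, because the second update is undone in the finally block. The contract on `Restore` is what allows it to be called from inside the region.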

Page 6: CLR Reliability under Memory Exhaustion

My Approach

• Exhaust Memory (Not fault injection)

• Find failure points

• Consistently reproduce results

• Examine underlying causes

• Develop solutions
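The talk does not show the tool used to exhaust memory. The sketch below is one hypothetical way to create reproducible managed-heap pressure in C#: it allocates fixed-size chunks until a caller-supplied cap is reached or the runtime throws `OutOfMemoryException`. The names `MemoryPressure` and `Consume` and the cap parameter are assumptions added for illustration, and the cap exists only so the example is safe to run.

```csharp
using System;
using System.Collections.Generic;

public static class MemoryPressure
{
    // Allocates chunks until either maxBytes is held or the runtime reports
    // OutOfMemoryException. The caller keeps the returned list alive to
    // maintain the pressure and drops it to release the memory.
    public static List<byte[]> Consume(long maxBytes, int chunkBytes)
    {
        var held = new List<byte[]>();
        long total = 0;
        try
        {
            while (total + chunkBytes <= maxBytes)
            {
                held.Add(new byte[chunkBytes]);
                total += chunkBytes;
            }
        }
        catch (OutOfMemoryException)
        {
            // Real exhaustion was reached before the cap; stop allocating.
        }
        return held;
    }

    public static void Main()
    {
        // Hold 4 MB in 1 MB chunks, then report how many chunks were allocated.
        List<byte[]> held = Consume(4L * 1024 * 1024, 1024 * 1024);
        Console.WriteLine(held.Count);
        GC.KeepAlive(held);
    }
}
```

Running the code under test while the list is still referenced, and varying the cap between runs, is one way to locate failure points consistently.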

Page 7: CLR Reliability under Memory Exhaustion

Everett OOM Behavior

• Different classes of failures
  – Catchable Out of Memory (OOM) Exception
  – Type Initialization Exception
  – Invalid Program exception from JIT compiler
  – Fatal OOM Error
  – Fatal Execution Engine error
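Only the first three classes above reach managed code as exceptions; the two fatal classes end the process before any handler runs. A minimal sketch of how a test harness might bucket the catchable ones follows. The `OomClassifier` name and the label strings are hypothetical. It relies on the documented behaviour that a failing static constructor surfaces as `TypeInitializationException` with the original exception in `InnerException`.

```csharp
using System;

public static class OomClassifier
{
    public static string Classify(Exception ex)
    {
        if (ex is OutOfMemoryException) return "Catchable OOM";

        // Walk the InnerException chain: the OOM may be hidden inside a
        // wrapper such as TypeInitializationException.
        for (Exception inner = ex.InnerException; inner != null; inner = inner.InnerException)
        {
            if (inner is OutOfMemoryException)
                return "OOM wrapped in " + ex.GetType().Name;
        }

        if (ex is TypeInitializationException) return "Type Initialization";
        if (ex is InvalidProgramException) return "Invalid Program";
        return "Other";
    }
}
```

A harness would call `Classify` from a general `catch (Exception ex)` block around the code under test and record the label together with the amount of memory that was available.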

Page 8: CLR Reliability under Memory Exhaustion

Supporting Data

void ManagedFunction()
{
    Regex* myReg = new Regex("*");
}

Available Memory    Observed Behavior
0-5860K             Fatal Error
5892-5912K          InvalidProgram
5924-5960K          TypeInit
5890-Above          Success

Page 9: CLR Reliability under Memory Exhaustion

Fault Injection Example

static void Main(string[] args)
{
    try
    {
        // operations in here
    }
    catch (OutOfMemoryException)
    {
        Console.WriteLine("Nothing should get past me.");
    }
}

Page 10: CLR Reliability under Memory Exhaustion

Whidbey OOM Behavior

• See OOM Exception instead of
  – TypeInit
  – InvalidProgram

• Exception to Native host is COMPlusException
  – Not very helpful

• Fatal OOM only during initialization
  – Initialization can be large though (e.g. 10MB)

• CERs provide defense, but dangerous
  – CER { for (;;) } cannot be stopped

• Reliability Contracts = Honor System

Page 11: CLR Reliability under Memory Exhaustion

WinFX Case Studies

• Swallows exceptions

• Shell
  – Crashes and restarts

• WinFS
  – Silent Process Failure

• Indigo
  – False Completion

[Diagram labels: Base OS, Whidbey, WinFX]

Page 12: CLR Reliability under Memory Exhaustion

Shell Failure

• Exhaust System Memory

• CLR throws OOM Exception

• Shell doesn’t catch

• Escalates to unhandled Win32 exception

• Shell crashes and restarts
  – Major disruption to user

Page 13: CLR Reliability under Memory Exhaustion

WinFS Test

• Simple Contact Store Functions
  – AddContact
  – RenameContact
  – RemoveContact
  – ListContacts
  – ReachMemory

Page 14: CLR Reliability under Memory Exhaustion

WinFS Test Normal Execution

• ListContacts(): “No Contacts Found”

• AddContact(“Shane”): Shane is added

• ListContacts(): “Shane”

• RenameContact(“Shane”, “Bob”): Shane is now Bob

• ListContacts(): “Bob”

• RemoveContact(“Bob”): Bob is now deleted

• ListContacts(): “No Contacts Found”

Page 15: CLR Reliability under Memory Exhaustion

WinFS Test Stressed Execution

• ListContacts() : “No Contacts Found”

• ReachMemory(8MB): 8MB Available

• AddContact(“Shane”) : Shane should be added

• ListContacts(): “No Contacts Found”

• Process Exits

Page 16: CLR Reliability under Memory Exhaustion

Indigo Test Specifications

• Client::SendMessage():
  – Sends message to server and prints confirmation of sending.

• Client::ReceiveMessage():
  – Prints received message.

• Server::SendMessage():
  – Sends message to client and prints confirmation of sending.

• Server::ReceiveMessage():
  – Prints message and responds with SendMessage()

Page 17: CLR Reliability under Memory Exhaustion

Indigo Test Behavior

• Normal Execution
  – Client::SendMessage()
  – Server::ReceiveMessage()
  – Server::SendMessage()
  – Client::ReceiveMessage()

• Execution with Memory Pressure
  – Client::SendMessage()
  – Server::ReceiveMessage()
  – Server::ExhaustMemory()
  – Server::SendMessage()
  – Client never receives message

Page 18: CLR Reliability under Memory Exhaustion

Solutions

• Transactions
  – In Memory
  – Durable (backed by disk)

• Recovery
  – Creates Recovery Log
  – Allows state restore

Page 19: CLR Reliability under Memory Exhaustion

Transaction Participant

public TransactionParticipant(String _originalValue)
{
    originalValue = _originalValue;
    result = originalValue;
}

public void Prepare(IPreparingEnlistment pe)
{
    // do work for transaction
    result = "New Value";
    // all is well, vote prepared
    pe.Prepared();
}

Page 20: CLR Reliability under Memory Exhaustion

Transaction Participant Continued

public void Commit(IEnlistment e)
{
    // no work to do, vote done
    e.EnlistmentDone();
}

public void Rollback(IEnlistment e)
{
    // restore originalValue
    result = originalValue;
    if ( null != e ) e.EnlistmentDone();
}

Page 21: CLR Reliability under Memory Exhaustion

Simple Transaction Example

TransactionParticipant tp = new TransactionParticipant(txtInput.Text);

try
{
    using (TransactionScope s = new TransactionScope())
    {
        Transaction.Current.VolatileEnlist(tp, false);
        s.Consistent = true;
    }
}
catch (TransactionAbortedException) {}

txtInput.Text = tp.Result;
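The slides use pre-release interface names (`IPreparingEnlistment`, `VolatileEnlist`, `Consistent`). For readers working with the `System.Transactions` API as it later shipped, the following sketch expresses the same participant using `IEnlistmentNotification`, `EnlistVolatile`, and `TransactionScope.Complete`. The names `ValueParticipant` and `TransactionDemo` are hypothetical, and the mapping from the slide's names to the shipped ones is an assumption, not something the talk states. On .NET Framework it needs a reference to System.Transactions.dll.

```csharp
using System;
using System.Transactions;

public sealed class ValueParticipant : IEnlistmentNotification
{
    private readonly string originalValue;
    public string Result { get; private set; }

    public ValueParticipant(string originalValue)
    {
        this.originalValue = originalValue;
        Result = originalValue;
    }

    public void Prepare(PreparingEnlistment preparingEnlistment)
    {
        Result = "New Value";            // do the work for the transaction
        preparingEnlistment.Prepared();  // vote to commit
    }

    public void Commit(Enlistment enlistment)
    {
        enlistment.Done();               // nothing further to do
    }

    public void Rollback(Enlistment enlistment)
    {
        Result = originalValue;          // restore the original value
        enlistment.Done();
    }

    public void InDoubt(Enlistment enlistment)
    {
        enlistment.Done();
    }
}

public static class TransactionDemo
{
    // Runs one transaction and returns the participant's final value.
    public static string Run(string input, bool complete)
    {
        var participant = new ValueParticipant(input);
        try
        {
            using (var scope = new TransactionScope())
            {
                Transaction.Current.EnlistVolatile(participant, EnlistmentOptions.None);
                if (complete) scope.Complete();
            }
        }
        catch (TransactionAbortedException)
        {
            // Rollback has already restored the original value.
        }
        return participant.Result;
    }
}
```

`TransactionDemo.Run("abc", true)` returns "New Value", while `TransactionDemo.Run("abc", false)` returns "abc", because a scope disposed without `Complete` rolls back.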

Page 22: CLR Reliability under Memory Exhaustion

rNotepad Techniques

• Log user work
  – KeyPressed Records
  – Resize Records

• Write work to log file every second

• Write checkpoint every 30 seconds

• Upon startup, recover
  – Checkpoint speeds up recovery
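The checkpoint-plus-log idea above can be sketched in a few lines. This is an in-memory illustration only: the `RecoveryLog` class, its record encoding ("C" for a checkpoint, "K" for a key press), and the omission of resize records and file I/O are all assumptions made for the example, not details of rNotepad.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class RecoveryLog
{
    // Each record is either a checkpoint ("C" + full text) or a key press ("K" + char).
    private readonly List<string> records = new List<string>();

    public void LogKey(char key)
    {
        records.Add("K" + key);
    }

    public void Checkpoint(string fullText)
    {
        records.Add("C" + fullText);
    }

    // Rebuilds the text: start from the latest checkpoint, then replay
    // only the key records written after it.
    public string Recover()
    {
        int start = records.FindLastIndex(r => r.StartsWith("C", StringComparison.Ordinal));
        var text = new StringBuilder(start >= 0 ? records[start].Substring(1) : "");
        for (int i = start + 1; i < records.Count; i++)
        {
            if (records[i][0] == 'K') text.Append(records[i][1]);
        }
        return text.ToString();
    }
}
```

Recovery replays only the records after the latest checkpoint, so its cost depends on the work done since that checkpoint rather than on the whole editing session. This is the sense in which a checkpoint speeds up recovery.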

Page 23: CLR Reliability under Memory Exhaustion

Conclusion

• Testing is difficult but possible

• Temporary memory pressure shouldn’t cause failures

• Transactions and Recovery can provide resilient and recoverable solutions

Page 24: CLR Reliability under Memory Exhaustion

Questions?

• More info at http://windows/sites/reliavuls/CLR/default.aspx

