Post on 19-Dec-2015
Atomicity: A powerful concept for analyzing concurrent software
Shaz Qadeer
Microsoft Research
Concurrent programs
[Figure: Processor 1 and Processor 2; Thread 1, Thread 2, Thread 3, Thread 4]
• Operating systems, databases, web servers,
browsers, GUIs, web services
• Modern languages: Java, C#
• Cost of multiprocessing desktop < $2000
Reliable concurrent software?
• Correctness problem
– does the program behave correctly for all inputs and all interleavings?
– very hard to ensure with testing
• Bugs due to concurrency are insidious
– non-deterministic, timing dependent
– data corruption, crashes
– difficult to detect, reproduce, eliminate
Multithreaded program execution
Thread 1 ... int t1 = hits; hits = t1 + 1 ...
Thread 2 ... int t2 = hits; hits = t2 + 1 ...
[Figure: three interleavings of t1=hits, hits=t1+1, t2=hits, hits=t2+1, each starting from hits=0; one ends with hits=2, the other two end with hits=1]
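The outcomes on this slide can be reproduced without real threads. The sketch below enumerates every schedule of the two read and write steps and records the final value of hits. The class and method names are illustrative and not from the talk.

```java
import java.util.ArrayList;
import java.util.List;

// Enumerates every interleaving of two threads that each run
//   t = hits; hits = t + 1;
// and records the final value of hits for each schedule.
class HitsInterleavings {

    // Returns one final value of hits per possible schedule.
    static List<Integer> finalValues() {
        List<Integer> results = new ArrayList<>();
        explore(0, 0, 0, 0, 0, results);
        return results;
    }

    // pc1/pc2: how many of its two steps each thread has executed.
    // t1/t2: each thread's local copy of hits.
    private static void explore(int hits, int pc1, int t1, int pc2, int t2,
                                List<Integer> results) {
        if (pc1 == 2 && pc2 == 2) {
            results.add(hits);
            return;
        }
        if (pc1 == 0) explore(hits, 1, hits, pc2, t2, results);   // t1 = hits
        if (pc1 == 1) explore(t1 + 1, 2, t1, pc2, t2, results);   // hits = t1 + 1
        if (pc2 == 0) explore(hits, pc1, t1, 1, hits, results);   // t2 = hits
        if (pc2 == 1) explore(t2 + 1, pc1, t1, 2, t2, results);   // hits = t2 + 1
    }

    public static void main(String[] args) {
        System.out.println(finalValues());
    }
}
```

Of the six schedules, only the two in which one thread finishes before the other starts end with hits=2. The four schedules in which both reads happen before either write end with hits=1.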
Races in action
• Power outage in northeastern grid in 2003
• Affected millions of people
• Race in Alarm and Event Processing code
• “We had in excess of three million online operational hours in which nothing had ever exercised that bug. I'm not sure that more testing would have revealed it.” -- GE Energy's Mike Unum
Race conditions
A race condition occurs if two threads access a shared variable at the same time, and at least one of the accesses is a write
Thread 1 ... int t1 = hits; hits = t1 + 1 ...
Thread 2 ... int t2 = hits; hits = t2 + 1 ...
Preventing race conditions using locks
• Lock can be held by at most one thread
• Race conditions are prevented using locks
– associate a lock with each shared variable
– acquire lock before accessing variable
Thread 1 synchronized(lock) { int t1 = hits; hits = t1 + 1}
Thread 2 synchronized(lock) { int t2 = hits; hits = t2 + 1}
[Figure: starting from hits=0, the execution acq t1=hits hits=t1+1 rel acq t2=hits hits=t2+1 rel ends with hits=2]
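A runnable version of the locked counter might look like the sketch below. The class LockedHits and its run helper are illustrative names. The only point is that every access to hits happens while holding the same lock.

```java
// Two threads increment a shared counter; every access to hits is
// guarded by the same lock object.
class LockedHits {
    private final Object lock = new Object();
    private int hits = 0;

    void hit() {
        synchronized (lock) {   // acq
            int t = hits;
            hits = t + 1;
        }                       // rel
    }

    int get() {
        synchronized (lock) {
            return hits;
        }
    }

    // Starts two threads that each call hit() perThread times and
    // returns the final count once both have finished.
    static int run(int perThread) {
        LockedHits counter = new LockedHits();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) counter.hit();
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }
}
```

With the lock in place, run(1) returns 2 under every schedule, and run(n) returns 2n.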
Race detection
• Static:
– Sterling 93, Aiken-Gay 98, Flanagan-Abadi 99, Flanagan-Freund 00, Boyapati-Rinard 01, von Praun-Gross 01, Boyapati-Lee-Rinard 02, Grossman 03
• Dynamic:
– Savage et al. 97 (Eraser tool)
– Cheng et al. 98
– Choi et al. 02
Race-free bank account
int balance;
void deposit (int n) {
  synchronized (this) {
    balance = balance + n;
  }
}
int read( ) {
  int r;
  synchronized (this) {
    r = balance;
  }
  return r;
}
void withdraw(int n) {
  int r = read( );
  synchronized (this) {
    balance = r - n;
  }
}
Race-freedom not sufficient!
Thread 1
deposit(10);
Thread 2
withdraw(10);
balance = 10
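One way to see the problem is to replay a single bad schedule by hand. The sketch below splits the race-free withdraw into its two synchronized halves and runs deposit between them. The class name, the writeBack helper, and the choice of initial balance are assumptions made for illustration.

```java
// Replays one schedule of the race-free but non-atomic withdraw.
// Every access to balance holds the lock, so there is no race, yet
// the deposit is lost.
class RaceFreeAccount {
    private int balance;

    RaceFreeAccount(int initial) {
        balance = initial;
    }

    synchronized void deposit(int n) {
        balance = balance + n;
    }

    synchronized int read() {
        return balance;
    }

    // Second half of withdraw: writes a value computed from an earlier read.
    synchronized void writeBack(int r, int n) {
        balance = r - n;
    }

    // Schedule: withdraw reads, deposit runs to completion, withdraw writes.
    static int interleaved(int initial) {
        RaceFreeAccount acct = new RaceFreeAccount(initial);
        int r = acct.read();        // Thread 2: int r = read();
        acct.deposit(10);           // Thread 1: deposit(10);
        acct.writeBack(r, 10);      // Thread 2: balance = r - n;
        return acct.read();
    }

    // Schedule: deposit runs to completion, then withdraw runs to completion.
    static int serial(int initial) {
        RaceFreeAccount acct = new RaceFreeAccount(initial);
        acct.deposit(10);
        int r = acct.read();
        acct.writeBack(r, 10);
        return acct.read();
    }
}
```

Assuming an initial balance of 10, serial(10) returns 10 while interleaved(10) returns 0: the value written by deposit is overwritten using the stale r.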
Atomic bank account
int balance;
void deposit (int n) {
  synchronized (this) {
    balance = balance + n;
  }
}
int read( ) {
  int r;
  synchronized (this) {
    r = balance;
  }
  return r;
}
void withdraw(int n) {
  synchronized (this) {
    balance = balance - n;
  }
}
java.lang.StringBuffer (jdk 1.4)
“String buffers are safe for use by multiple threads. The methods are synchronized so that all the operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the individual threads involved.”
java.lang.StringBuffer is buggy!
public final class StringBuffer {
  private int count;
  private char[ ] value;
  . .
  public synchronized StringBuffer append (StringBuffer sb) {
    if (sb == null) sb = NULL;
    int len = sb.length( );
    int newcount = count + len;
    if (newcount > value.length) expandCapacity(newcount);
    sb.getChars(0, len, value, count); // use of stale len !!
    count = newcount;
    return this;
  }
  public synchronized int length( ) { return count; }
  public synchronized void getChars(. . .) { . . . }
}
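The stale len can be demonstrated deterministically with a stripped-down stand-in. TinyBuffer below is not the JDK class. It only mirrors the shape of the excerpt, and the betweenCalls hook is a hypothetical device that plays the role of another thread running after sb.length( ) returns and before sb.getChars(...) is called.

```java
import java.util.Arrays;

// Simplified stand-in for the StringBuffer excerpt (not the JDK class).
class TinyBuffer {
    private char[] value;
    private int count;

    TinyBuffer(String s) {
        value = Arrays.copyOf(s.toCharArray(), Math.max(16, s.length()));
        count = s.length();
    }

    synchronized int length() {
        return count;
    }

    // Shrinks the buffer to at most n characters.
    synchronized void truncate(int n) {
        count = Math.min(count, n);
    }

    synchronized void getChars(int srcBegin, int srcEnd, char[] dst, int dstBegin) {
        if (srcEnd > count) throw new StringIndexOutOfBoundsException(srcEnd);
        System.arraycopy(value, srcBegin, dst, dstBegin, srcEnd - srcBegin);
    }

    // Same shape as append in the excerpt. The lock on this buffer is held
    // throughout, but the lock on sb is released between the two calls.
    synchronized TinyBuffer append(TinyBuffer sb, Runnable betweenCalls) {
        int len = sb.length();
        int newcount = count + len;
        if (newcount > value.length) value = Arrays.copyOf(value, newcount);
        betweenCalls.run();                   // another thread may change sb here
        sb.getChars(0, len, value, count);    // len may be stale
        count = newcount;
        return this;
    }

    // Returns true if shrinking sb between the two calls makes append fail.
    static boolean staleLenFails() {
        TinyBuffer target = new TinyBuffer("ab");
        TinyBuffer source = new TinyBuffer("cde");
        try {
            target.append(source, () -> source.truncate(0));
            return false;
        } catch (StringIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

Each method is synchronized, so there is no data race, yet append is not atomic: it observes two different states of sb within one call.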
Atomicity
• A method is atomic if it seems to execute “in one step” even in presence of concurrently executing threads
• Common concept
– “(strict) serializability” in databases
– “linearizability” in concurrent objects
– “thread-safe” multithreaded libraries
• “String buffers are safe for use by multiple threads. …”
• Fundamental semantic correctness property
Definition of atomicity
• deposit is atomic if for every non-serialized execution, there is a serialized execution with the same behavior
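[Figure: serialized execution of deposit, in which acq(this), r=bal, bal=r+n, rel(this) run back to back and the other threads' operations x, y, z fall outside them]
[Figure: non-serialized executions of deposit, in which x, y, z are interleaved between acq(this), r=bal, bal=r+n, rel(this)]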
Reduction (Lipton 75)
S0 --acq(this)--> S1 --x--> S2 --r=bal--> S3 --y--> S4 --bal=r+n--> S5 --z--> S6 --rel(this)--> S7
S0 --acq(this)--> S1 --x--> S2 --y--> T3 --r=bal--> S4 --bal=r+n--> S5 --z--> S6 --rel(this)--> S7
S0 --x--> T1 --acq(this)--> S2 --y--> T3 --r=bal--> S4 --bal=r+n--> S5 --z--> S6 --rel(this)--> S7
S0 --x--> T1 --y--> T2 --acq(this)--> T3 --r=bal--> S4 --bal=r+n--> S5 --z--> S6 --rel(this)--> S7
S0 --x--> T1 --y--> T2 --acq(this)--> T3 --r=bal--> S4 --bal=r+n--> S5 --rel(this)--> T6 --z--> S7
• Swapping r=bal and y: blue thread holds lock, red thread does not hold lock, operation y does not access balance, operations commute
• Swapping acq(this) and x: blue thread holds lock after acquire, operation x does not modify lock, operations commute
Four atomicities
• R: right commutes
– lock acquire
• L: left commutes
– lock release
• B: both right + left commutes
– variable access holding lock
• N: atomic action, non-commuting
– access unprotected variable
[Figure: commuting steps. R: S0 acq(this) S1 x S2 becomes S0 x T1 acq(this) S2. L: S5 z S6 rel(this) S7 becomes S5 rel(this) T6 z S7. B: r=bal swaps with an adjacent x or y in either direction, between S2 S3 S4 and S2 T3 S4.]
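Sequential composition
• Use atomicities to perform reduction
• Lipton: sequence (R+B)*; (N+ε); (L+B)* is atomic
[Figure: an execution from S0 to S5 in which one thread's steps R* N L* are interleaved with steps x, Y, . . . of other threads, and the equivalent execution in which R* N L* run contiguously]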
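Composition table (row: first operand, column: second operand)
;   B   R   L   N   C
B   B   R   L   N   C
R   R   R   N   N   C
L   L   C   L   C   C
N   N   C   N   C   C
C   C   C   C   C   C
[Figure: worked examples of composing atomicity sequences, such as R;N;L ; R;N;L ; N, using the table]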
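The composition table can be transcribed directly into code. The sketch below stores it as a 5x5 array and folds a sequence of statement atomicities from left to right. The enum and method names are illustrative, and C is read here as "compound, not atomic", which is an assumption about the table's fifth value.

```java
import java.util.List;

// The five atomicities and their sequential composition ";".
enum Atomicity {
    B, R, L, N, C;

    // TABLE[first][second], rows and columns in the order B, R, L, N, C.
    private static final Atomicity[][] TABLE = {
        /* B */ {B, R, L, N, C},
        /* R */ {R, R, N, N, C},
        /* L */ {L, C, L, C, C},
        /* N */ {N, C, N, C, C},
        /* C */ {C, C, C, C, C},
    };

    // Atomicity of "this; next".
    Atomicity then(Atomicity next) {
        return TABLE[ordinal()][next.ordinal()];
    }

    // Atomicity of a whole statement sequence, composed left to right.
    static Atomicity ofSequence(List<Atomicity> steps) {
        Atomicity result = B;   // B; a = a for every a, so B is a neutral start
        for (Atomicity step : steps) {
            result = result.then(step);
        }
        return result;
    }
}
```

For example, the sequence R, B, B, L composes to N, while N followed by R composes to C.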
Bank account
/*# guarded_by this */
int balance;
/*# atomicity N */
void deposit (int x) {
  acquire(this);          // R
  int r = balance;        // B
  balance = r + x;        // B
  release(this);          // L
}
/*# atomicity N */
int read( ) {
  int r;
  acquire(this);          // R
  r = balance;            // B
  release(this);          // L
  return r;               // B
}
/*# atomicity N */
void withdraw(int x) {
  int r = read( );        // N
  acquire(this);          // R
  balance = r - x;        // B
  release(this);          // L
}
Bank account
/*# guarded_by this */
int balance;
/*# atomicity N */
void deposit (int x) {
  acquire(this);          // R
  int r = balance;        // B
  balance = r + x;        // B
  release(this);          // L
}
/*# atomicity N */
int read( ) {
  int r;
  acquire(this);          // R
  r = balance;            // B
  release(this);          // L
  return r;               // B
}
/*# atomicity N */
void withdraw(int x) {
  acquire(this);          // R
  int r = balance;        // B
  balance = r - x;        // B
  release(this);          // L
}
Soundness theorem
• Suppose a non-serialized execution of a well-typed program reaches state S in which no thread is executing an atomic method
• Then there is a serialized execution of the program that also reaches S
Atomicity checker for Java
• Leverage Race Condition Checker to check that protecting lock is held when variables accessed
• Found several atomicity violations
– java.lang.StringBuffer
– java.lang.String
– java.net.URL
“String buffers are safe for use by multiple threads. The methods are synchronized so that all the operations on any particular instance behave as if they occur in some serial order that is consistent with the order of the method calls made by each of the individual threads involved.”
“String buffers are atomic”
More work on atomicity checking
• Dynamic analysis (Wang-Stoller 03, Flanagan-Freund 04)
• Model checking (Robby et al. 04)
So far…
• Atomicity as a lightweight and checkable specification
Now…
• Atomicity for precise and efficient analysis of concurrent programs
Why is precise analysis of concurrent programs difficult?
The problem
• Given a concurrent boolean program with assertions, does the program ever go wrong by failing an assertion?
Abstract interpretation
• Cousot-Cousot 77, Graf-Saidi 97
• unbounded data → boolean data
Concurrent boolean program without procedures
• k = size of CFG of the program
• n = number of threads
• Need to analyze all interleavings of various threads
– complexity proportional to k^n
Concurrent boolean program with procedures
• Ramalingam 00: The problem is undecidable, even with only two threads
– two unbounded stacks
– reduction from the undecidable problem “Is the intersection of two context-free languages empty?”
Atomic blocks to the rescue!
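[Figure: an execution from S0 to S5 in which one thread's steps R* N L* are interleaved with steps x, Y, . . . of other threads, and the equivalent execution in which R* N L* run contiguously]
Other threads need not be scheduled in the middle of an atomic block
Lipton: any sequence (R+B)*; (N+ε); (L+B)* is an atomic block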
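Lipton's pattern is a regular expression over the atomicity letters, so a candidate block can be checked with an ordinary regex. The sketch below assumes each statement has already been labeled with one of R, L, B, N, and it reads the middle term as "at most one N". The class and method names are illustrative.

```java
import java.util.regex.Pattern;

// Checks whether a string of statement atomicities matches
// (R+B)* ; (N+ε) ; (L+B)*, with one letter per statement.
class LiptonPattern {
    private static final Pattern ATOMIC = Pattern.compile("[RB]*N?[LB]*");

    static boolean isAtomicBlock(String atomicities) {
        return ATOMIC.matcher(atomicities).matches();
    }
}
```

Under this check, "RBBL" and "RBLB" are atomic blocks, while "NRBL" is not, because an R follows the N.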
First idea
• Infer maximal atomic blocks
Concurrent boolean program without procedures
• k = size of CFG of the program
• n = number of threads
• a = size of CFG of inferred atomic blocks
• Need to analyze all interleavings of various threads but only at atomic block boundaries
– complexity proportional to (k/a)^n
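To get a feel for the size of the saving, take hypothetical values k = 1000, n = 3 and a = 10. These numbers are chosen only for illustration and do not come from the talk.

```latex
\[
k^{n} = 1000^{3} = 10^{9},
\qquad
\left(\frac{k}{a}\right)^{n} = 100^{3} = 10^{6},
\qquad
\frac{k^{n}}{(k/a)^{n}} = a^{n} = 10^{3}.
\]
```

The reduction factor a^n grows with both the size of the atomic blocks and the number of threads.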
Second idea
• Summarize inferred atomic blocks
• Inspired by summarization of procedures in sequential programs
Summarization for sequential programs (Sharir-Pnueli 81, Reps-Horwitz-Sagiv 95)
• Bebop, ESP, Moped, MC, Prefix, …
int x;
void incr_by_2() {
  x++;
  x++;
}
void main() {
  …
  x = 0;
  incr_by_2();
  …
  x = 0;
  incr_by_2();
  …
  x = 1;
  incr_by_2();
  …
}
Summary of incr_by_2:
x    x’
0    2
1    3
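The idea behind the summary table can be sketched as memoization: the body of incr_by_2 is analyzed once per distinct entry value of x, and later calls with the same entry value reuse the recorded exit value. The class below is a toy illustration with hypothetical names, not the algorithm of any of the tools listed above.

```java
import java.util.HashMap;
import java.util.Map;

// Toy procedure summary for incr_by_2: maps the entry value of x
// to the exit value x'.
class IncrSummaries {
    private final Map<Integer, Integer> summary = new HashMap<>();
    private int bodyAnalyses = 0;

    // Analyzes the body of incr_by_2 for one entry value of x.
    private int analyzeBody(int x) {
        bodyAnalyses++;
        x++;
        x++;
        return x;
    }

    // Handles a call site: reuses the summary entry if one exists,
    // otherwise analyzes the body and records the result.
    int call(int x) {
        Integer cached = summary.get(x);
        if (cached != null) {
            return cached;
        }
        int out = analyzeBody(x);
        summary.put(x, out);
        return out;
    }

    int bodyAnalyses() {
        return bodyAnalyses;
    }

    Map<Integer, Integer> summary() {
        return summary;
    }
}
```

For the three call sites in main (x = 0, x = 0, x = 1), the body is analyzed twice and the summary ends up holding 0 → 2 and 1 → 3.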
Assertion checking for sequential programs
• Given a sequential boolean program with assertions, does the program ever go wrong by failing an assertion?
• Boolean program with:
– g = number of global vars
– l = max. number of local vars in any scope
– k = size of the CFG of the program
• Complexity is O(k · 2^O(g+l)), linear in the size of CFG
• Summarization enables termination in the presence of recursion
Summarization in concurrent programs
• Unarticulated so far
• Naïve extension of summaries for sequential programs does not work
Call P Return P
Second idea
• Summarize inferred atomic blocks
• Summary of procedure = summary of constituent atomic blocks
• Often procedure is single atomic block
– In Atomizer benchmarks (Flanagan-Freund 04), majority of procedures are atomic
Concurrent boolean program with procedures
• Ramalingam 00: The problem is undecidable, even with only two threads.
• Qadeer-Rajamani-Rehof (POPL 04): The problem is decidable, if all recursive procedures are atomic.
– For a sequential program, the whole execution is an atomic block
– Algorithm behaves exactly like classic interprocedural dataflow analysis (Sharir-Pnueli 81)
• Model checker for concurrent software
• Implementation of atomic block inference and summarization
• Applications
– Concurrent systems code, e.g., device drivers
– Web services
– Spec# (C# + specifications)
Atomicity as a language primitive
• First proposed in the 70s
– Tony Hoare
– David Lomet
• Hardware implementation
– Rajwar-Goodman 02
• Software implementation
– Herlihy-Luchangco-Moir-Scherer 03
– Harris-Fraser 03
– Welc-Jagannathan-Hosking 04
– MIT (Martin Rinard’s group)
Conclusions
• Atomicity is a useful concept for analyzing concurrent programs
– Lightweight specification
– Simplifies formal and informal reasoning
– Enables precise and efficient analysis
• Perhaps the right synchronization primitive for future concurrent languages?