STM in Managed Runtimes: High-Level Language Semantics
(MICRO 07 Tutorial)
Dan Grossman, University of Washington
2 December 2007
2 December 2007 Dan Grossman, MICRO Tutorial (STM Semantics) 2
So…
Hopefully you’re convinced high-level language semantics is needed for transactions to succeed
First session: focus on various notions of isolation (3-slide review follows)
• A taxonomy of ways weak isolation can surprise you
• Ways to avoid surprises
  – Strong isolation
  – Restrictive type systems
Second session:
• Formal model for high-level definitions & correctness proofs
• Memory-model problems
• Integrating exceptions, I/O, and multithreaded transactions
Notions of isolation
• Strong isolation: A transaction executes as though no other computation is interleaved
• Weak isolation?
  – Single-lock (“weak-sla”): A transaction executes as though no other transaction is interleaved
  – Single-lock + abort (“weak undo”): Like weak-sla, but a transaction can abort/retry, undoing changes
  – Single-lock + lazy update (“weak on-commit”): Like weak-sla, but buffer updates until commit
  – Real contention: Like “weak undo” or “weak on-commit”, but multiple transactions can run at once
  – Catch-fire: Anything can happen if there’s a race
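The difference between "weak undo" and "weak on-commit" can be sketched in a few lines of Python. This is a single-threaded toy model, not any real STM: the class names, the dict-as-heap encoding, and `nontransactional_read` are all assumptions made for illustration. It only shows what a thread outside the transaction could observe in the middle of a transaction under each update strategy.

```python
class UndoTransaction:
    """Eager update ("weak undo" style): write in place, log old values."""

    def __init__(self, heap):
        self.heap = heap
        self.undo_log = []          # (location, previous value) pairs

    def write(self, loc, value):
        self.undo_log.append((loc, self.heap[loc]))
        self.heap[loc] = value      # visible to other threads immediately

    def read(self, loc):
        return self.heap[loc]

    def commit(self):
        self.undo_log.clear()

    def abort(self):
        # Undo in reverse order so the oldest logged value is restored last.
        for loc, old in reversed(self.undo_log):
            self.heap[loc] = old
        self.undo_log.clear()


class OnCommitTransaction:
    """Lazy update ("weak on-commit" style): buffer writes until commit."""

    def __init__(self, heap):
        self.heap = heap
        self.buffer = {}

    def write(self, loc, value):
        self.buffer[loc] = value    # invisible to other threads until commit

    def read(self, loc):
        return self.buffer[loc] if loc in self.buffer else self.heap[loc]

    def commit(self):
        self.heap.update(self.buffer)
        self.buffer.clear()

    def abort(self):
        self.buffer.clear()


def nontransactional_read(heap, loc):
    """What a thread outside any transaction sees under weak isolation."""
    return heap[loc]
```

With a heap `{"x": 0}`, a write of 1 through `UndoTransaction` is visible to `nontransactional_read` before commit and disappears again on abort, while the same write through `OnCommitTransaction` stays invisible until `commit()`.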
Partition
Surprises arose from the same mutable locations being used inside & outside transactions by different threads
Hopefully sufficient to forbid that
– But unnecessary and probably too restrictive
  • Bans publication and privatization
– cf. STM Haskell [PPoPP05]
For each allocated object (or word), require one of:
1. Never mutated
2. Only accessed by one thread
3. Only accessed inside transactions
4. Only accessed outside transactions
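The four conditions can be read as a check over an access trace. The sketch below is a dynamic checker written for illustration only; the tutorial enforces the partition statically, and the trace encoding (`(thread, location, kind, in_transaction)` tuples) and the function name are assumptions. Initializing writes are treated like any other write, which is stricter than a real definition of "never mutated" would likely be.

```python
from collections import defaultdict


def partition_violations(trace):
    """Return the locations that satisfy none of the four partition conditions.

    trace: list of (thread_id, location, kind, in_transaction) tuples,
    where kind is "read" or "write". This encoding is hypothetical.
    """
    accesses = defaultdict(list)
    for thread, loc, kind, in_txn in trace:
        accesses[loc].append((thread, kind, in_txn))

    bad = []
    for loc, events in accesses.items():
        never_mutated = all(kind == "read" for _, kind, _ in events)
        one_thread = len({thread for thread, _, _ in events}) == 1
        only_inside = all(in_txn for _, _, in_txn in events)
        only_outside = not any(in_txn for _, _, in_txn in events)
        if not (never_mutated or one_thread or only_inside or only_outside):
            bad.append(loc)
    return sorted(bad)
```

A location written inside a transaction by one thread and read outside a transaction by another is reported; a location that a single thread uses both inside and outside transactions is not.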
Static partition
Recall our “what is a race” problem:
initially x=0, y=0, z=0
Thread 1:
  atomic { if(x<y) ++z; }
Thread 2:
  atomic { ++x; ++y; }
Thread 3:
  r = z; //race?
  assert(z==0);
So “accessed on valid control paths” is not enough
– Use a type system that conservatively assumes all paths are possible
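A small Python sketch of the contrast, under the assumption that the two atomic blocks run one after the other in some order: in every such execution `z` is never touched inside a transaction, yet a syntactic check that ignores the condition still classifies `z` as transactional. The function and table names are made up for this illustration.

```python
def run_serial(order):
    """Run the two transactions serially; return (final heap, whether z was touched)."""
    heap = {"x": 0, "y": 0, "z": 0}
    touched_z = False
    for name in order:
        if name == "T1":                  # atomic { if (x < y) ++z; }
            if heap["x"] < heap["y"]:
                heap["z"] += 1
                touched_z = True
        else:                             # atomic { ++x; ++y; }
            heap["x"] += 1
            heap["y"] += 1
    return heap, touched_z


# Locations mentioned anywhere in each atomic block, ignoring conditions.
SYNTACTIC_TXN_LOCATIONS = {"T1": {"x", "y", "z"}, "T2": {"x", "y"}}


def conservatively_transactional(loc):
    """True if loc appears on any path of any atomic block."""
    return any(loc in locs for locs in SYNTACTIC_TXN_LOCATIONS.values())
```

Both serial orders leave `z` untouched, so a path-sensitive definition would let the third thread read `z` outside a transaction; the conservative view rejects that access because `z` appears in the first block.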
Why formal models
Some really smart people didn’t anticipate the surprises
So maybe there are other surprises even with the partitioning type system
Increase our confidence by modeling various forms of isolation (as mini-languages) and proving them equivalent given the type system
– So far: weak-sla, weak undo
– Future work: weak on-commit, real contention, thread-local, immutable
A formal program state
$a;\ H;\ e_1 \parallel \dots \parallel e_n$
$e$: a thread (an expression that runs and terminates)
$H$: a heap (maps mutable labels to values)
$a$: either $\circ$ or $\bullet$
– $\bullet$ means one thread is in a transaction
– $\circ$ means no thread is in a transaction
• A high-level model for programmers & compiler-writers
  – No TM implementation details!
Operational semantics
Execution is a series of steps from one state to another
– At each step, one thread runs some “instruction”
$a;\ H;\ e_1 \parallel \dots \parallel e_n \;\rightarrow\; a';\ H';\ e_1' \parallel \dots \parallel e_n'$
Isolation amounts to using the $a$ to restrict interleavings:
• strong: If $a = \bullet$, only the transaction can touch $H$
• weak-sla: If $a = \bullet$, no other thread can start a transaction
• weak undo: Like weak-sla, but transactions log updates and can abort/retry by undoing them
  – $a$ returns to $\circ$ after the abort is complete
A family of languages
So “strong”, “weak-sla”, and “weak undo” are similar languages with different semantic rules
– The AtomsFamily
– Lots of Greek letters in the paper
Theorem: If $e_1, \dots, e_n$ type-check with our partition rules, then the set of states reachable from $a;\ H;\ e_1 \parallel \dots \parallel e_n$ is the same for strong, weak-sla, and weak undo.
– Not quite: weak undo has more transient states and can produce more garbage
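The flavor of the theorem can be explored with a tiny interleaving enumerator. The Python sketch below is not the formal model from the paper: the instruction set (`const`, `copy`), the block encoding, and the function names are assumptions, weak undo is not modeled, and only final heaps are compared rather than all reachable states. The `owner` variable plays the role of the transaction flag in the program state.

```python
def flatten(thread):
    """Turn blocks into (instruction, in_txn, ends_txn) triples."""
    steps = []
    for kind, instrs in thread:
        in_txn = kind == "atomic"
        for i, instr in enumerate(instrs):
            steps.append((instr, in_txn, in_txn and i == len(instrs) - 1))
    return steps


def execute(instr, heap):
    """("const", dst, v) stores v; ("copy", dst, src) stores heap[src]."""
    op, dst, src = instr
    new = dict(heap)
    new[dst] = src if op == "const" else heap[src]
    return new


def final_heaps(threads, heap, mode):
    """Collect every final heap reachable under mode "strong" or "weak-sla"."""
    progs = [flatten(t) for t in threads]
    results = set()

    def go(heap, pcs, owner):
        # owner: index of the thread currently inside a transaction, or None.
        if all(pc == len(p) for pc, p in zip(pcs, progs)):
            results.add(tuple(sorted(heap.items())))
            return
        for i, prog in enumerate(progs):
            if pcs[i] == len(prog):
                continue
            instr, in_txn, ends = prog[pcs[i]]
            if owner is not None and owner != i:
                if mode == "strong":
                    continue      # only the transaction may touch the heap
                if in_txn:
                    continue      # weak-sla: no other transaction may run
            next_owner = owner
            if in_txn:
                next_owner = None if ends else i
            next_pcs = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            go(execute(instr, heap), next_pcs, next_owner)

    go(dict(heap), (0,) * len(progs), None)
    return results
```

For a two-step transaction that writes `x` twice, a second thread that reads `x` inside its own atomic block yields the same final heaps in both modes, while a second thread that reads `x` outside any transaction can see the intermediate value under weak-sla only.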
Type-checking
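Code can be used inside transactions, outside transactions, or both
Each memory location can be accessed only inside transactions or only outside transactions
Form of type-checking:
$\varepsilon ::= \mathsf{ot} \mid \mathsf{wt} \mid \mathsf{both}$
$\tau ::= \mathsf{int} \mid \tau *^{\varepsilon} \mid \dots$
$\Gamma ::= \cdot \mid \Gamma, x{:}\tau$
$\Gamma; \varepsilon \vdash e : \tau$
“Assuming variables in $\Gamma$ have those types, $e$ has type $\tau$ and stays on the side of the partition required by $\varepsilon$”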
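Three example rules (C-style syntax), specialized slightly to emphasize the partition:
$$\frac{\Gamma(x) = \tau *^{\varepsilon'}}{\Gamma; \varepsilon \vdash x : \tau *^{\varepsilon'}} \qquad \frac{\Gamma; \varepsilon \vdash e : \tau *^{\varepsilon}}{\Gamma; \varepsilon \vdash {*e} : \tau} \qquad \frac{\Gamma; \mathsf{wt} \vdash e : \tau}{\Gamma; \varepsilon \vdash \mathsf{atomic}\{e\} : \tau}$$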
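The three example rules can be sketched as a small checker. This Python version rests on several assumptions: that `wt` marks code running within a transaction and `ot` code running outside one, that a pointer type carries the side of the partition its target lives on, and that a dereference requires that side to equal the current one. The tuple encodings of types and expressions, and the function names, are hypothetical.

```python
def check(gamma, eps, expr):
    """Return the type of expr under context gamma and side eps, or raise TypeError.

    Types: "int" or ("ptr", pointee_type, side).
    Expressions: ("var", name), ("deref", e), ("atomic", e).
    """
    kind = expr[0]
    if kind == "var":
        if expr[1] not in gamma:
            raise TypeError(f"unbound variable {expr[1]!r}")
        return gamma[expr[1]]             # naming a pointer is fine on either side
    if kind == "deref":
        ty = check(gamma, eps, expr[1])
        if ty == "int" or ty[0] != "ptr":
            raise TypeError("dereference of a non-pointer")
        _, pointee, side = ty
        if side != eps:
            raise TypeError(f"{side} memory dereferenced under {eps}")
        return pointee
    if kind == "atomic":
        return check(gamma, "wt", expr[1])    # body is checked within a transaction
    raise TypeError(f"unknown expression {kind!r}")


def well_typed(gamma, eps, expr):
    """Convenience wrapper: True if expr type-checks, False otherwise."""
    try:
        check(gamma, eps, expr)
        return True
    except TypeError:
        return False
```

With `p` bound to a pointer into transactional memory, `*p` is rejected outside a transaction but `atomic{*p}` is accepted, which is the partition discipline the rules are meant to enforce.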
The proof
The proofs are dozens of pages and a few person-months (lest a skipped step hold a surprise)
But the high-level picture is illuminating…
If possible in strong, then possible in weak-sla
– trivial: don’t ever violate isolation
[Diagram: implications among strong, weak-sla, and weak undo]
If possible in weak-sla, then possible in weak undo
– trivial: don’t ever abort
If possible in weak-sla, then possible in strong
– Current transaction is serializable thanks to the type system (can permute with other threads)
– Earlier transactions serializable by induction
If possible in weak undo, then possible in weak-sla?
– Really need that abort is correct
– And that’s hard to show, especially with interleavings from weak isolation…
– Define strong undo for sake of the proof
– Can show abort is correct without interleavings
[Diagram: strong, weak-sla, and weak undo, with strong undo added]
Why we formalize, redux
Thanks to the formal semantics, we:
• Had to make precise definitions
• Know we did not skip cases (at least in the model)
• Learned the essence of why the languages are equivalent under partition
  – Weak interleavings are serializable
  – Abort is correct
  – And these two arguments compose
Relaxed memory models
Modern languages don’t provide sequential consistency
1. Lack of hardware support
2. Prevents otherwise sensible & ubiquitous compiler transformations (e.g., copy propagation)
So safe languages need two complicated definitions
1. What is “properly synchronized”?
2. What can compiler and hardware do with “bad code”?
(Unsafe languages need (1))
A flavor of simplistic ideas and the consequences…
Ordering
Can get “strange results” for bad code
– Need rules for what is “good code”
initially x==0 and y==0
Thread 1:
  x = 1;
  y = 1;
Thread 2:
  r = y;
  s = x;
  assert(s>=r); //invalid
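To see why the assertion is invalid without synchronization, the outcomes can be enumerated. The Python sketch below uses a deliberately crude model, assumed for illustration: a relaxed memory model is approximated by letting each thread's two independent accesses execute in either order, and every interleaving of the resulting sequences is then run on a shared dictionary. The function names are made up.

```python
from itertools import permutations


def interleavings(a, b):
    """Yield every merge of lists a and b that keeps each list's internal order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def outcomes(allow_reordering):
    """Return the set of (r, s) pairs the two threads can produce."""
    writer = [("x", 1), ("y", 1)]          # thread 1: x = 1; y = 1;
    reader = [("r", "y"), ("s", "x")]      # thread 2: r = y; s = x;
    writer_orders = set(permutations(writer)) if allow_reordering else {tuple(writer)}
    reader_orders = set(permutations(reader)) if allow_reordering else {tuple(reader)}
    results = set()
    for w in writer_orders:
        for rd in reader_orders:
            for schedule in interleavings(list(w), list(rd)):
                mem = {"x": 0, "y": 0, "r": 0, "s": 0}
                for dst, src in schedule:
                    # A string source names a location to read; an int is a constant.
                    mem[dst] = mem[src] if isinstance(src, str) else src
                results.add((mem["r"], mem["s"]))
    return results
```

Under program order, `outcomes(False)` never contains a pair with `s < r`; once reordering is allowed, the pair `(1, 0)` appears, which is the "strange result" the assertion rules out.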
initially x==0 and y==0
Thread 1:
  x = 1;
  sync(lk){}
  y = 1;
Thread 2:
  r = y;
  sync(lk){} //same lock
  s = x;
  assert(s>=r); //valid
initially x==0 and y==0
Thread 1:
  x = 1;
  atomic{}
  y = 1;
Thread 2:
  r = y;
  atomic{}
  s = x;
  assert(s>=r); //???
If this is good code, existing STMs are wrong
initially x==0 and y==0
Thread 1:
  x = 1;
  atomic{z=1;}
  y = 1;
Thread 2:
  r = y;
  atomic{tmp=0*z;}
  s = x;
  assert(s>=r); //???
“Conflicting memory” is a slippery, ill-defined slope
Lesson
It is not clear when transactions are ordered, but languages need memory models
Corollary: This could/should delay adoption of transactions in well-specified languages
I wish I had more answers.
Other operations
So far, every atomic block we have considered only:
• read/wrote/allocated memory
• called functions
What about:
1. I/O
2. Exceptions (or first-class continuations)
3. Spawn a thread
I/O
Can’t have irreversible actions in transactions that might abort
Need pragmatic, partial solutions such as:
• Forbid irreversible actions in transactions
  – Trivial extension of our partition type system
• Have unabortable transactions
• Make actions reversible
  – Buffer output
  – Buffer (idempotent) input
I After O
The real problem is input after output in a transaction
atomic { write_to_file(); read_from_file(); }
If the write is buffered, the contents read cannot depend on how the external world sees the write
Native mechanism
Can generalize: Require native code to have 2 versions
– Runtime calls one version in transactions, the other outside transactions
– Native code is responsible for keeping the 2 versions “the same”
Transactional versions also need callbacks for pre-commit, post-commit, and pre-abort
– Sufficient for buffering input and output
– Sufficient for external transaction systems
If the “in transaction” version causes an abort, that just encodes “safe dynamic failure/retry”
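A minimal Python sketch of the two-version idea with commit and abort callbacks, used here to buffer output. Everything in it is hypothetical: the `Transaction` class, its three callback lists, and `write_line` stand in for whatever interface a real runtime would expose, and the memory commit and rollback steps are left as comments.

```python
class Transaction:
    """Toy transaction that lets native code register commit/abort callbacks."""

    def __init__(self):
        self.pre_commit = []
        self.post_commit = []
        self.pre_abort = []

    def commit(self):
        for callback in self.pre_commit:
            callback()
        # A real runtime would publish the transaction's memory updates here.
        for callback in self.post_commit:
            callback()

    def abort(self):
        for callback in self.pre_abort:
            callback()
        # A real runtime would roll back the transaction's memory updates here.


def write_line(sink, line, txn=None):
    """Hypothetical native output routine with two versions in one function.

    Outside a transaction (txn is None) the line is written immediately.
    Inside one it is buffered and released only after commit.
    """
    if txn is None:
        sink.append(line)
        return
    pending = [line]
    txn.post_commit.append(lambda: sink.extend(pending))
    txn.pre_abort.append(pending.clear)
```

Lines written through a transaction reach the sink in order when `commit()` runs and are discarded by `abort()`, so the irreversible action is postponed until it can no longer be undone.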
Exceptions
If code in atomic throws exception to outside atomic:
A. Does the transaction commit or abort?
B. Where does control transfer to?
Three “obvious” answers:
1. Commit transaction, transfer to exception handler
   – My preference; exceptions in most HLLs are semantically just “non-local jumps”
   – Preserves design goal that atomic has no effect on single-threaded programs
2. Abort transaction, transfer to retry the transaction
   – Turns exceptions into aborts
   – Useful if exceptions are due to shared-memory state
   – But the programmer can encode this:
     atomic { try { s } catch (Throwable e) { abort; } }
3. Abort transaction, transfer to exception handler
   – But the transaction never happened?!
   – What if the exception value uses memory allocated/written by the aborted transaction?!
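The first two answers can be compared in a single-threaded Python sketch. The names `Abort`, `atomic`, and `abort_on_exception` are invented for this illustration, the heap is a plain dictionary, and where a real system would retry an aborted transaction this toy version simply returns `None` after rolling back.

```python
class Abort(Exception):
    """Raised by user code to request that the transaction be rolled back."""


def atomic(heap, body):
    """Run body(heap) as a toy transaction.

    Answer 1: an ordinary exception leaves the updates made so far in place
    (the transaction commits) and then propagates to the caller's handler.
    Only Abort rolls the heap back.
    """
    snapshot = dict(heap)
    try:
        return body(heap)
    except Abort:
        heap.clear()
        heap.update(snapshot)    # undo every update made by body
        return None              # a real STM would retry the transaction here


def abort_on_exception(body):
    """Answer 2, encoded by the programmer: turn any exception into an abort."""
    def wrapped(heap):
        try:
            return body(heap)
        except Exception as exc:
            raise Abort() from exc
    return wrapped
```

A body that adds 10 to `heap["balance"]` and then raises `ValueError` leaves the balance at 10 when run directly under `atomic`, and leaves it unchanged when first wrapped with `abort_on_exception`.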
Beyond exceptions
Other non-local jumps are even harder to deal with
Example: Perhaps a coroutine jumps out of an atomic and then jumps back in
– Then probably the jump out should continue the transaction (commit or abort) later
It depends on “what you’re trying to do”, which is a problem if the same language feature (exceptions, continuations, etc.) is used for multiple idioms.
– Tough policy questions; mechanism pretty easy
Multithreaded transactions
What if code in atomic creates a new thread?
Easy answers:
1. Dynamic failure
2. Thread not runnable unless/until transaction commits
More interesting:
3. Parallelism within transaction
   – Isolation and concurrency are orthogonal
   – Controversial(?) claim: Necessary due to Amdahl’s Law as core-count increases
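The second easy answer, where a spawned thread becomes runnable only if the transaction commits, can be sketched with Python's standard `threading` module. The class name and its methods are assumptions for illustration; memory isolation is not modeled at all, only the deferral of thread start.

```python
import threading


class SpawnDeferringTransaction:
    """Toy transaction: threads created inside it start only if it commits."""

    def __init__(self):
        self.pending = []

    def spawn(self, target, *args):
        thread = threading.Thread(target=target, args=args)
        self.pending.append(thread)      # created, but not yet runnable
        return thread

    def commit(self):
        started, self.pending = self.pending, []
        for thread in started:
            thread.start()
        return started

    def abort(self):
        self.pending = []                # the spawns never happened
```

`commit()` returns the threads it started so a caller can `join()` them; after `abort()` the pending threads are dropped without ever running.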
(Semantics done; implementation is work in progress)
When does multithreaded transaction commit?
– After all spawned threads terminate
What is hard for programmers?
– Nested transactions now crucial for isolating parallel computations inside a larger transaction
What is hard for implementors?
– Transactional bookkeeping must be parallel
– Unclear how hardware could best help
If I had another 2 hours
Plenty more semantics to consider:
• Open-nesting semantics
• Message-passing within transactions
  – See recent work from Oregon, Purdue, UW
• atomic {s1} orelse {s2}
  – Try s2 if s1 aborts
• Fairness guarantees
• Obstruction-freedom
• …
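The `orelse` item can be illustrated with a short single-threaded Python sketch: run the first alternative, and if it aborts, discard its updates and run the second. The `Aborted` exception, the `or_else` function, and the snapshot-based rollback are assumptions of this toy model, not a description of any particular STM.

```python
class Aborted(Exception):
    """Raised by an alternative to signal that it aborts."""


def or_else(heap, first, second):
    """Model of atomic {s1} orelse {s2}: try second only if first aborts."""
    snapshot = dict(heap)
    try:
        return first(heap)
    except Aborted:
        heap.clear()
        heap.update(snapshot)    # first's partial updates must not be visible
        return second(heap)


def take_two(heap):
    """Example first alternative: aborts if fewer than two items remain."""
    heap["items"] -= 1
    if heap["items"] < 1:
        raise Aborted()
    heap["items"] -= 1
    return "took two"


def record_wait(heap):
    """Example second alternative: note that the caller had to wait."""
    heap["waits"] += 1
    return "waited"
```

With one item in the heap, `or_else(heap, take_two, record_wait)` rolls back the partial decrement and returns `"waited"`; with two items it returns `"took two"` and the second alternative never runs.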
Conclusions
• Weak isolation without type restrictions is surprising
• Interaction with other language features non-trivial
• PL-style semantics has a huge role to play in bringing transactions to high-level languages
  – An essential complement to the core algorithms, compiler, hardware work
  – Need “cross-cultural understanding” of the issues
wasp.cs.washington.edu