Correctness-Preserving Derivation of Concurrent Garbage Collection Algorithms
Martin T. Vechev Eran Yahav David F. Bacon
University of Cambridge IBM T.J. Watson Research Center
PLDI – June 2006
Why Concurrent Garbage Collection?
• Java and C#: garbage-collected languages are prevalent
• Multicore: concurrency is becoming prevalent
• Cheap RAM: large heaps are becoming prevalent
• Real-time systems: more widely used
Existing Way to Create a Concurrent GC
ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
REQUIREMENTS: Throughput, Memory Consumption, Pause Time
TECHNIQUES: Tracing / reference counting, moving; Allocate White / Black; Dijkstra / Steele / Yuasa Barrier; Atomic / Incremental Stack Snapshot; Write Barrier Atomic / Non-atomic; Color toggle, stacklets, etc.
?? → Implementation
• Hard to verify/test
• Often buggy
• Did the monkey choose well??
Concurrent GC algorithms and proofs are hard
[Figure: chart of published ALGORITHMS and PROOFS, each entry marked Incorrect, Correct, or (C) Corrected]
Entries: Ben-Ari Base ’84, Dijkstra(C) ’78, Doligez(C) ’93, Azatchi ’03, Domani ’03, Yuasa ’90, Pixley ’88, Doligez ’94, Ben-Ari Extended ’84, Steele(C) ’75, Boehm ’91, Barabash ’03
Our Research Vision
ENVIRONMENT (Declarative Specification): Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
REQUIREMENTS: Throughput, Memory Consumption, Pause Time
Formally Defined Techniques: FAMILY, FAMILY
Automated System
THEOREM PROVING
Optimal Correct Implementation
In This Work
FIXED ENVIRONMENT: Memory Model, Thread Model, Concurrency Primitives, CPU Primitives
REQUIREMENTS: Throughput, Pause Time, Memory Consumption
Formally Defined Techniques for Tracing Non-Moving GC
Automated System
Algorithm 1 < Algorithm 2 < Algorithm 3 < … < Algorithm N
Problem: Interference
SYSTEM = MUTATOR || GC
[Figure: objects A, B and C, shown after each step; legend: Traced / Not Traced]
1. GC traced B
2. Mutator links C to B
3. Mutator unlinks C from A
4. GC traced A
C LOST
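The four steps can be replayed on a toy heap to see why C is lost. This is a minimal sketch, not the authors' model: the dictionary heap, the `trace` and `reachable` helpers, and the assumption that A and B are both roots with A initially pointing to C are illustrative choices.

```python
# Toy heap: each object maps to the set of objects it points to.
# Assumption for illustration: A and B are both roots, and A initially points to C.
heap = {"A": {"C"}, "B": set(), "C": set()}
roots = ["B", "A"]              # the collector happens to visit B first
marked = set()

def trace(obj):
    """Collector step: mark obj and everything reachable from it right now."""
    stack = [obj]
    while stack:
        o = stack.pop()
        if o not in marked:
            marked.add(o)
            stack.extend(heap[o])

def reachable():
    """Objects the mutator can still reach from the roots."""
    seen, stack = set(), list(roots)
    while stack:
        o = stack.pop()
        if o not in seen:
            seen.add(o)
            stack.extend(heap[o])
    return seen

trace("B")                      # 1. GC traced B
heap["B"].add("C")              # 2. Mutator links C to B
heap["A"].discard("C")          # 3. Mutator unlinks C from A
trace("A")                      # 4. GC traced A

lost = reachable() - marked     # reachable objects the collector never marked
print(sorted(marked))           # ['A', 'B']
print(sorted(lost))             # ['C']
```

C is still reachable through B, but the collector visited B before the link existed and visited A after the link was removed, so a sweep would free a live object.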
The 3 Families of Concurrent GC Algorithms
[Figure: one A, B, C example per family]
1. Mark C when C is linked to B (DIJKSTRA)
2. Mark C when the link to C is removed (YUASA)
3. Rescan B when C is linked to B (STEELE)
• Solutions are applied uniformly for all objects
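The three families can be compared by replaying the same interleaving (trace B, link C to B, unlink C from A, trace A) under each barrier. The sketch below is an illustration under simplifying assumptions: the `run` function, the string barrier names, and the single `pending` work list are hypothetical, and real collectors add colors, phases and synchronization that are omitted here.

```python
def run(barrier):
    """Replay the interference scenario with one write barrier; return the marked set."""
    heap = {"A": {"C"}, "B": set(), "C": set()}
    marked, pending = set(), []        # pending: objects the collector must still scan

    def shade(o):
        if o not in marked:
            marked.add(o)
            pending.append(o)

    def drain():
        while pending:
            o = pending.pop()
            for child in heap[o]:
                shade(child)

    def link(src, dst):
        heap[src].add(dst)
        if barrier == "dijkstra":      # mark the new target
            shade(dst)
        elif barrier == "steele" and src in marked:
            pending.append(src)        # schedule the source for rescanning

    def unlink(src, dst):
        heap[src].discard(dst)
        if barrier == "yuasa":         # mark the target whose link is removed
            shade(dst)

    shade("B"); drain()                # 1. GC traced B
    link("B", "C")                     # 2. Mutator links C to B
    unlink("A", "C")                   # 3. Mutator unlinks C from A
    shade("A"); drain()                # 4. GC traced A
    return marked

for name in ("none", "dijkstra", "yuasa", "steele"):
    print(name, sorted(run(name)))
# none ['A', 'B']
# dijkstra ['A', 'B', 'C']
# yuasa ['A', 'B', 'C']
# steele ['A', 'B', 'C']
```

Each barrier rescues C at a different moment: Dijkstra at the link, Yuasa at the unlink, and Steele when B is scanned again.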
Contributions
• Systematic Exploration: a new parametric model of concurrent GC; better understanding; new algorithms, potentially useful
• Formal Relationship between algorithms: Space - Relative precision between algorithms
• Sharing Proof Burden: correctness-preserving “transformations”
A Parametric Concurrent GC Skeleton
• Intuition: factor out as much common structure as possible
• Record the interaction history between collector and mutator during tracing
• The collector exposes “hidden objects” based on the entire interaction history
[Diagram labels: COLLECTOR: mark, Expose(L,D), …, reclaim (Complete Garbage Collection); MUTATOR: Change Heap]
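The skeleton can be sketched as a collector loop that marks, asks an expose step for hidden objects found in the recorded history, and repeats until nothing new turns up. Everything named here is an assumption for illustration: the `Heap` class, the `(src, old, new)` log entries, and in particular the `expose` rule, which is a deliberately conservative stand-in rather than the Expose(L,D) of the talk. Concurrency is imitated by running a mutator step between collector steps.

```python
class Heap:
    def __init__(self, edges, roots):
        self.edges = {o: set(ts) for o, ts in edges.items()}
        self.roots = list(roots)
        self.log = []               # interaction history, recorded only while tracing
        self.tracing = False

    def write(self, src, old, new):
        """Mutator action (Change Heap): replace pointer old with new in src."""
        if old is not None:
            self.edges[src].discard(old)
        if new is not None:
            self.edges[src].add(new)
        if self.tracing:
            self.log.append((src, old, new))

def mark(heap, marked, work):
    """Collector action: trace from the work list over the current heap."""
    while work:
        o = work.pop()
        if o not in marked:
            marked.add(o)
            work.extend(heap.edges[o])

def expose(log, marked):
    """Hypothetical conservative rule: any logged pointer target that is
    still unmarked is treated as a possibly hidden object."""
    hidden = set()
    for _src, old, new in log:
        hidden.update(t for t in (old, new) if t is not None and t not in marked)
    return hidden

def collect(heap, mutator_steps=()):
    steps = iter(mutator_steps)
    marked = set()
    heap.tracing = True
    for root in heap.roots:
        mark(heap, marked, [root])
        step = next(steps, None)    # hand-written interleaving stands in for a real thread
        if step is not None:
            step(heap)
    while True:                     # mark ... expose ... until no hidden objects remain
        hidden = expose(heap.log, marked)
        heap.log.clear()
        if not hidden:
            break
        mark(heap, marked, list(hidden))
    heap.tracing = False
    reclaimed = set(heap.edges) - marked
    for o in reclaimed:             # reclaim
        del heap.edges[o]
    return marked, reclaimed

def mutator(heap):
    heap.write("B", None, "C")      # link C to B
    heap.write("A", "C", None)      # unlink C from A

heap = Heap({"A": {"C"}, "B": set(), "C": set(), "G": set()}, roots=["B", "A"])
marked, reclaimed = collect(heap, [mutator])
print(sorted(marked), sorted(reclaimed))   # ['A', 'B', 'C'] ['G']
```

Swapping in a different `expose` rule, or logging fewer mutator actions, changes which algorithm the loop behaves like while the surrounding code stays the same.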
Dimensions: an intuition
The effect of each Mutator/GC action is controlled by a dimension
• Collector scans pointer: Wavefront Granularity
• Mutator allocates object: Allocation Color
• Mutator creates pointer: Counting
• Mutator overwrites pointer: Snapshot
Implementation Choice: Wavefront
Per-Field Wavefront
• Exact information
• One bit per field
• More expensive
• More synchronization
• More garbage collected
Per-Object Wavefront
• Approximate information
• One bit per object
• Less expensive
• Less synchronization
• Less garbage collected
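The difference in granularity can be illustrated with a predicate that answers "has the collector already passed this field?". This is a sketch under assumptions: the `Obj` record, the `scanned_upto` counter, and the rule that a partially scanned object counts as entirely passed under the per-object setting are illustrative choices, picked so that the approximation errs on the safe side.

```python
from dataclasses import dataclass

@dataclass
class Obj:
    fields: list            # pointer slots
    scanned_upto: int = 0   # the collector has scanned fields[0:scanned_upto]

def behind_wavefront(obj, index, granularity):
    """True if a write to obj.fields[index] would go unseen by the collector."""
    if granularity == "per-field":
        # Exact: only fields the collector has really passed.
        return index < obj.scanned_upto
    if granularity == "per-object":
        # Approximate: consults only whether scanning has started, which one
        # bit per object could store; every field is then treated as passed.
        return obj.scanned_upto > 0
    raise ValueError(granularity)

b = Obj(fields=[None, None, None], scanned_upto=1)   # collector is partway through B
for g in ("per-field", "per-object"):
    print(g, [behind_wavefront(b, i, g) for i in range(3)])
# per-field [True, False, False]
# per-object [True, True, True]
```

With the per-object setting, writes to fields 1 and 2 are treated as hidden from the collector even though it will still scan them, which is one way the approximate wavefront ends up retaining more objects.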
Choice: Record on Link or Unlink
Record on Link
• More synchronization
• More garbage collected
Record on Unlink
• Less synchronization
• Less garbage collected
Combined Choices
[Figure: grid of examples on objects A and B. Columns: Record on Link, Record on Unlink. Rows: Per-Field WF, Per-Object WF.]
Combined Choices Per Object
[Figure: grid of examples on objects A and B, where each object makes its own choice.]
Record choices: Rec. Link A, Rec. Link B · Rec. Link A, Unlink B · Rec. Unlink A, Rec. Link B · Rec. Unlink A, Rec. Unlink B
Wavefront choices: Per-Field A, Per-Field B · Per-Field A, Per-Obj B · Per-Obj A, Per-Field B · Per-Obj A, Per-Obj B
Correctness
• Transformations = Proof Steps
[Figure: derivation diagram over the algorithm space. Labels: START WITH A CORRECT ALGORITHM, RETAIN LESS GARBAGE, RETAIN MORE GARBAGE.]
Algorithms shown: APEX (U, U, U, U, {}), STEELE, DIJKSTRA(stacks,U,{},U,{}), STEELE-D, STEELE-YC, STEELE-D-YC, DIJKSTRA-OLD, DIJKSTRA-YC, STEELE-BC, HYBRID-YC(stacks,A,{},{},{}), STEELE-D-BC, DIJKSTRA-BC, YUASA (stacks, A, {}, {}, U)
Relative Precision
• Intuition: an algorithm is more precise than another if it collects more garbage
• An algorithm that is less precise (more conservative) than a correct algorithm is guaranteed to be correct
• Should be a reference point for practical comparisons: no ad-hoc methods
• Hard to do manually: need a tool to provide insights
• Finding the “right” definition was harder than proving safety, yet simpler than “relative concurrency”
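The intuition can be made concrete by comparing the sets of objects two algorithms retain on the same execution. This is only a sketch of the idea, not the formal definition from the talk: the function names and the retained sets for the hypothetical algorithms X, Y and Z are invented for illustration.

```python
def at_least_as_precise(retained_x, retained_y):
    """On one execution, X is at least as precise as Y if X retains a subset of what Y retains."""
    return retained_x <= retained_y

def is_safe(retained, live):
    """A collector is safe on this execution if it retains every live object."""
    return live <= retained

live = {"A", "B"}
retained = {                       # hypothetical outcomes on the same execution
    "X": {"A", "B"},               # retains exactly the live objects
    "Y": {"A", "B", "C"},          # also keeps garbage object C
    "Z": {"A", "B", "D"},          # also keeps garbage object D
}

print(at_least_as_precise(retained["X"], retained["Y"]))   # True
print(at_least_as_precise(retained["Y"], retained["Z"]),
      at_least_as_precise(retained["Z"], retained["Y"]))   # False False
print(is_safe(retained["X"], live), is_safe(retained["Y"], live))   # True True
```

Because Y retains a superset of what the safe algorithm X retains, Y is safe as well, which mirrors the claim that a less precise variant of a correct algorithm stays correct. Y and Z are incomparable, so the relation is a partial order rather than a ranking.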
Precision
[Figure: the same algorithm diagram, ordered by precision. Labels: MORE PRECISE, LESS PRECISE.]
Algorithms shown: APEX (U, U, U, U, {}), STEELE, DIJKSTRA(stacks,U,{},U,{}), STEELE-D, STEELE-YC, STEELE-D-YC, DIJKSTRA-OLD, DIJKSTRA-YC, STEELE-BC, HYBRID-YC(stacks,A,{},{},{}), STEELE-D-BC, DIJKSTRA-BC, YUASA (stacks, A, {}, {}, U)
Conclusions
• Systematic exploration of an algorithm space
• Useful new algorithms
• Formal definition of relative precision between algorithms
• A first step towards automatic derivation of concurrent garbage collectors