
A Parallel, Real-Time Garbage Collector

Authors: Perry Cheng, Guy E. Blelloch

Presenter: Jun Tao

Outline

• Introduction
• Background and definitions
• Theoretical algorithm
• Extended algorithm
• Evaluation
• Conclusion

Introduction

• First garbage collectors
  – Non-incremental, non-parallel
• Recent collectors
  – Incremental
  – Concurrent
  – Parallel

Introduction

• Scalably parallel and real-time collector
  – All aspects of the collector are incremental
  – Parallel
    • Arbitrary number of application and collector threads
  – Tight theoretical bounds on
    • Pause time for any application
    • Total memory usage
  – Asymptotically but not practically efficient

Introduction

• Extended collector algorithm
  – Work with generations
  – Increase the granularity of the incremental steps
  – Separately handle global variables
  – Delay the copy on write
  – Reduce the synchronization cost of copying small objects
  – Parallelize the processing of large objects
  – Reduce double allocation during collection
  – Allow program stacks

Background and Definitions

• A semispace stop-copy collector
  – Divide heap memory into two equally sized halves
    • From-space and to-space
  – When from-space is full, suspend the mutator and copy reachable objects to to-space
  – Update the root values and reverse the roles of from-space and to-space
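The stop-copy cycle can be sketched as a Cheney-style breadth-first copy. This is only an illustration under assumptions: the `Obj` class, its `forward` slot and the `collect` function are hypothetical names, and Python lists stand in for the two semispaces.

```python
class Obj:
    """Hypothetical heap object: a payload plus references to other objects."""
    def __init__(self, value, fields=None):
        self.value = value
        self.fields = fields if fields is not None else []
        self.forward = None  # forwarding pointer, set once the object is copied


def collect(roots):
    """Copy everything reachable from `roots` into a fresh to-space.

    Returns the updated roots and the to-space. The caller would then
    discard from-space and swap the roles of the two spaces.
    """
    to_space = []

    def copy(obj):
        if obj.forward is None:
            replica = Obj(obj.value, list(obj.fields))
            obj.forward = replica
            to_space.append(replica)  # copied but unscanned: gray
        return obj.forward

    new_roots = [copy(r) for r in roots]
    scan = 0
    while scan < len(to_space):  # objects before `scan` are fully scanned
        obj = to_space[scan]
        obj.fields = [copy(f) for f in obj.fields]
        scan += 1
    return new_roots, to_space
```

Objects not reachable from the roots are never visited, so the work is proportional to the live data rather than to the heap size.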

Background and Definitions

• Types of Garbage Collectors

Background and Definitions

• Types of Garbage Collectors (continued)

Background and Definitions

• Real-time collector
  – Maximum pause time
  – Utilization
    • The fraction of time that the mutator executes
  – Minimum mutator utilization (MMU)
    • A function of window size
    • The minimum utilization over all windows of that size
    • Equals 0 when window size <= maximum pause time
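As a rough illustration of minimum mutator utilization, the sketch below computes it from a pause trace. The function names and the trace format (a list of non-overlapping `(start, end)` pause intervals inside `[0, total_time]`) are assumptions made for this example, not something the slides define.

```python
def pause_time_in(pauses, start, end):
    """Total pause time that overlaps the window [start, end]."""
    return sum(max(0.0, min(e, end) - max(s, start)) for s, e in pauses)


def mmu(pauses, total_time, window):
    """Minimum mutator utilization over every window of the given size.

    The pause time inside a sliding window is piecewise linear in the
    window's start, so it is enough to test the breakpoints where a window
    edge meets a pause edge, plus the two extreme window positions.
    """
    if not 0 < window <= total_time:
        raise ValueError("window must be in (0, total_time]")
    last_start = total_time - window
    candidates = {0.0, last_start}
    for s, e in pauses:
        for c in (s, e, s - window, e - window):
            candidates.add(min(max(c, 0.0), last_start))
    worst = max(pause_time_in(pauses, c, c + window) for c in candidates)
    return 1.0 - worst / window
```

With pauses at `(2, 3)` and `(5, 7)` in a run of length 10, any window no longer than the 2-unit pause fits entirely inside it, so the utilization for those window sizes is 0; larger windows recover some utilization.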

Theoretical Algorithm

• A parallel, incremental and concurrent collector
  – Based on Cheney’s simple copying collector
  – All objects are stored in a shared global pool of memory
  – Two atomic instructions
    • FetchAndAdd
    • CompareAndSwap
  – The collector interfaces with the application when
    • Allocating space for a new object
    • Initializing the fields of a new object
    • Modifying a field of an existing object

Theoretical Algorithm

• Scalable parallelism
  – Maintain the set of gray objects
  – Cheney’s technique
    • Keep them in contiguous locations in to-space
    • Pros
      – Simple
    • Cons
      – Restricts the traversal order to breadth-first
      – Difficult to implement in a parallel setting

Theoretical Algorithm

• Scalable parallelism (continued)
  – Explicitly managed local stacks
    • Each processor maintains a local stack
    • A shared stack of gray objects
    • Periodically transfer gray objects between the local and shared stacks
    • Avoids idleness
  – Pushes (or pops) can proceed in parallel
    • Reserve a target region before the transfer
    • Pushes and pops are never concurrent with each other
    • Room synchronization
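One way to picture the region reservation is a shared stack whose cursor is advanced with FetchAndAdd, so that each thread claims a disjoint slice before it copies anything. The sketch below is hypothetical: `AtomicCounter` emulates the hardware instruction with a lock, the class and method names are invented, and the handling of an over-reserving pop is just one plausible choice. It also assumes the caller already guarantees that pushes and pops never overlap in time.

```python
import threading


class AtomicCounter:
    """Stand-in for a hardware FetchAndAdd (emulated here with a lock)."""
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_and_add(self, delta):
        with self._lock:
            old = self._value
            self._value += delta
            return old


class SharedStack:
    """Shared stack of gray objects with a FetchAndAdd-managed cursor."""
    def __init__(self, capacity):
        self.slots = [None] * capacity
        self.cursor = AtomicCounter(0)

    def push_batch(self, items):
        """Call only while no pops are running."""
        start = self.cursor.fetch_and_add(len(items))  # reserve a region
        if start + len(items) > len(self.slots):
            raise OverflowError("shared stack is full")
        for offset, item in enumerate(items):
            self.slots[start + offset] = item  # fill the private region

    def pop_batch(self, n):
        """Call only while no pushes are running; returns up to n items."""
        old = self.cursor.fetch_and_add(-n)
        got = max(0, min(n, old))
        if got < n:
            self.cursor.fetch_and_add(n - got)  # hand back the over-reservation
        return self.slots[old - got:old]
```

Because each push owns its reserved slice, the pushing threads never need to synchronize with each other beyond the single atomic add.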

Theoretical Algorithm

• Scalable parallelism (continued)
  – Avoid white objects being copied twice
    • Exclusive access by atomic instructions
    • Copy-copy synchronization
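Copy-copy synchronization can be illustrated with a CompareAndSwap on the object's forwarding slot: only the thread whose swap succeeds makes the copy. The sketch is an assumption-laden toy: `Header`, `CLAIMED` and `copy_once` are hypothetical names, the CAS is emulated with a lock, and letting the losing threads spin until the replica is published is a choice made for this example.

```python
import threading
import time


class Header:
    """Hypothetical object header with a CAS-able forwarding slot."""
    def __init__(self):
        self.forward = None
        self._lock = threading.Lock()  # emulates the atomicity of a hardware CAS

    def compare_and_swap(self, expected, new):
        with self._lock:
            if self.forward is expected:
                self.forward = new
                return True
            return False


CLAIMED = object()  # sentinel: some thread is currently copying this object


def copy_once(header, make_replica):
    """Return (replica, won); exactly one caller per object has won == True."""
    if header.compare_and_swap(None, CLAIMED):
        replica = make_replica()   # only the winner allocates and copies
        header.forward = replica   # publish the forwarding pointer
        return replica, True
    while header.forward is CLAIMED:
        time.sleep(0)              # another thread is mid-copy; wait for it
    return header.forward, False
```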

Theoretical Algorithm

• Incremental and replicating collection
  – Baker’s incremental collector
    • Copy k units of data whenever one unit of data is allocated
      – Bounds the pause time
    • Mutator can only see copied objects in to-space
      – A read barrier is needed
  – Modification to avoid the read barrier
    • Mutator can only see the original objects in from-space
      – A write barrier is needed
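The idea of tying collection work to allocation can be sketched with a toy model in which every allocation pays for at most k units of scanning. Everything here is an assumption for illustration: the heap is a dict from object id to the ids it references, a "unit" is one scanned object rather than one word, and barriers and the placement of newly allocated objects are left out.

```python
class IncrementalCollector:
    """Toy model of allocation-paced collection (not the paper's algorithm)."""
    def __init__(self, heap, roots, k):
        self.heap = heap
        self.k = k
        self.copied = set()  # objects that already have a replica
        self.gray = []       # copied but not yet scanned
        self.work_log = []   # collection work done by each allocation
        for r in roots:
            self._copy(r)

    def _copy(self, obj):
        if obj not in self.copied:
            self.copied.add(obj)
            self.gray.append(obj)

    def allocate(self, units=1):
        """Allocate `units` and do at most k * units of collection work."""
        budget = self.k * units
        done = 0
        while self.gray and done < budget:
            obj = self.gray.pop()
            for child in self.heap[obj]:
                self._copy(child)
            done += 1
        self.work_log.append(done)
        return done

    @property
    def finished(self):
        return not self.gray
```

Since no single allocation performs more than k units of work, the pause seen by the mutator is bounded by a function of k instead of by the size of the live data.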

Theoretical Algorithm

• Concurrency
  – Program and collector execute simultaneously
  – Program manipulates the primary memory graph
  – Collector manipulates the replica graph
  – A copy-write synchronization is needed
    • Replica objects must be modified correspondingly
    • Avoid race conditions
      – Mark objects being copied
      – The mutator’s updates to the replica should be delayed
  – A write-write synchronization is needed
    • Prohibit different mutator threads from modifying the same memory location concurrently

Theoretical Algorithm

• Space and time bounds
  – Time bound on each memory operation
    • ck
      – c: a constant
      – k: the number of words collected per word allocated
  – Space bound
    • 2(R(1+1.5/k)+N+5PD) ≈ 2R(1+1.5/k)
      – R: reachable space
      – N: maximum object count
      – P: number of processors (P-way multiprocessor)
      – D: maximum memory graph depth
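To get a feel for the space bound, the formula can be evaluated directly. The function name and the sample numbers below are made up for illustration, and the result is assumed to be in the same unit as R.

```python
def space_bound(R, N, P, D, k):
    """Evaluate 2(R(1 + 1.5/k) + N + 5PD) from the slide."""
    return 2 * (R * (1 + 1.5 / k) + N + 5 * P * D)


# Hypothetical workload: 1,000,000 units reachable, 10,000 objects,
# 8 processors, memory graph depth 100, k = 2.
full = space_bound(R=1_000_000, N=10_000, P=8, D=100, k=2)
approx = 2 * 1_000_000 * (1 + 1.5 / 2)
```

For these numbers the full bound is 3,528,000 and the approximation is 3,500,000, a difference under 1%. The N and 5PD terms only matter when objects are very small or the graph is very deep relative to R. Raising k tightens the space bound toward 2R but raises the per-operation time bound ck.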

Extended Algorithm

• Globals, stacks and stacklets
  – Globals
    • Updated when collection ends
    • Arbitrarily many globals -> unbounded time
    • Replicate globals like other heap objects
    • Every global has two locations
    • A single flag is used for all globals
  – Stacks and stacklets
    • Divide stacks into fixed-size stacklets
    • At most one stacklet is active, and the others can be replicated safely
    • Also bounds the wasted space per stack

Extended Algorithm

• Granularity
  – Block allocation and free initialization
    • Avoid calling FetchAndAdd for every memory allocation
    • Each processor maintains a local pool in from-space, and a local pool in to-space when the collector is on
    • Use a single FetchAndAdd when allocating a local pool
  – Write barrier
    • Avoid updating copied objects on every write
    • Record a triple <x, i, y> in a write log and defer the update
    • Invoke the collector when the write log is full
    • Eliminates frequent context switches
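The deferred write barrier can be sketched as a log that is only processed when it fills up. The slide does not say what each element of the triple means; this sketch assumes x is the object, i is the field index, and y is the value the field held before the write. The class and method names are hypothetical, and the collector is modeled as simply re-reading the primary field when it processes an entry.

```python
class ReplicatingHeap:
    """Toy primary/replica heap with a deferred write log."""
    def __init__(self, log_capacity):
        self.primary = {}   # object id -> list of field values (mutator's view)
        self.replica = {}   # object id -> list of field values (collector's copy)
        self.log = []
        self.log_capacity = log_capacity
        self.flushes = 0

    def write(self, x, i, value):
        """Mutator write: update the primary now, patch the replica later."""
        old = self.primary[x][i]
        self.primary[x][i] = value
        self.log.append((x, i, old))  # assumed meaning of the triple <x, i, y>
        if len(self.log) >= self.log_capacity:
            self.process_log()        # hand control to the collector

    def process_log(self):
        """Collector side: bring replicas up to date with logged writes."""
        for x, i, _old in self.log:
            if x in self.replica:     # only objects that were already copied
                self.replica[x][i] = self.primary[x][i]
        self.log.clear()
        self.flushes += 1
```

Batching the replica updates this way means the mutator pays one append per write and switches to the collector once per full log rather than once per write.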

Extended Algorithm

• Small and large objects
  – Original algorithm
    • One field at a time
      – Reinterpretation of the tag word
      – Transferring the object to and from the local stack
  – Extended algorithm
    • Small objects
      – Locked down and copied all at once
    • Large objects
      – Divided into segments
      – Copied one segment at a time

Extended Algorithm

• Algorithmic modifications
  – Reducing double allocation
    • One allocation by the mutator and one by the collector
    • Defer the double allocation
  – Rooms and better rooms
    • A push room and a pop room
    • Only one room can be non-empty at a time
    • Rooms
      – Enter the pop room, fetch work and perform it, transition to the push room, push objects back to the shared stack
      – Graying objects is time-consuming
      – Other threads wait to enter the push room
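The rule that only one room may be occupied at a time can be sketched with a condition variable. This is a simplified, hypothetical version: the `Rooms` class and its methods are invented for illustration, it makes no fairness guarantee, and it does not model the "better rooms" refinement or the borrow counter.

```python
import threading


class Rooms:
    """Any number of threads may share a room, but only one room is open."""
    def __init__(self):
        self._cond = threading.Condition()
        self._active = None  # name of the currently occupied room, if any
        self._count = 0      # number of threads inside that room

    def enter(self, room):
        with self._cond:
            # Wait while a different room still has threads inside it.
            while self._count > 0 and self._active != room:
                self._cond.wait()
            self._active = room
            self._count += 1

    def leave(self, room):
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._active = None
                self._cond.notify_all()  # let waiters for the other room in
```

A collector thread would call `enter("pop")`, take work from the shared stack, call `leave("pop")`, and later bracket its pushes with `enter("push")` and `leave("push")`. The less time a thread spends inside a room, the less the threads waiting for the other room are delayed.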

Extended Algorithm

• Algorithmic modifications
  – Rooms and better rooms (continued)
    • Better rooms
      – Leave the pop room right after fetching work from the shared stack
      – Detect that the shared stack is empty by maintaining a borrow counter
  – Generational collection
    • Nursery and tenured space
    • Trigger a minor collection when the nursery is full
    • Trigger a major collection when the tenured space is full
    • Tenured references might not be modified during collection
    • Hold two fields for each mutable pointer
      – One for the mutator to use, the other for the collector to update

Evaluation

[Slides 22-28: evaluation figures, not preserved in this transcript]


Conclusion

• Implements a scalably parallel, concurrent, real-time garbage collector

• Thread synchronization is minimized