GARBAGE COLLECTION IN AN UNCOOPERATIVE ENVIRONMENT Hans-Juergen Boehm Computer Science Dept. Rice University, Houston Mark Weiser Xerox Corporation, Palo Alto Presented by Srilakshmi Swati Pendyala

Post on 18-Jan-2016

GARBAGE COLLECTION IN AN UNCOOPERATIVE ENVIRONMENT

Hans-Juergen Boehm, Computer Science Dept., Rice University, Houston

Mark Weiser, Xerox Corporation, Palo Alto

Presented by Srilakshmi Swati Pendyala

Outline

Introduction
  Garbage collection in different languages
Problem domain
  Need for conservative garbage collection in uncooperative environments
Overview of the proposed garbage collector
Use of the proposed GC as a debugging tool
Implementation
Results
Conclusion

Introduction

Garbage collection in different languages?
  Java

http://www.folgmann.com/en/gc.html

Introduction

Garbage collection in different languages:
  .NET, VB, C#?
  Perl, Python?
  C, C++?

Introduction

Garbage collection in different languages:
  .NET, VB, C#: mark and sweep, generational
  Perl, Python: reference counting
  C, C++: no garbage collection; managed options available
  Ada, Modula-3: manual and automated garbage collection

Introduction

Java, .NET, etc.
  Automatic garbage collection
  No memory management effort for the programmer
  At run time, the program must tell the GC which memory objects are still in use
C, C++, etc.
  The program must “free” the memory it allocates
  Prone to memory leaks, etc.
Both cases demand additional effort from the program/compiler.
GC affects the performance of the program.
Better performance can be achieved (in some cases) when the program doesn’t worry about GC at all.

What is the need to avoid cooperation?

Programmers don’t want to pay for GC unless needed
Disadvantages of tagging integers
  Reduces the number of bits available
  Makes it difficult to manipulate the standard machine representation of data
  Creates a need for interfacing routines
To implement specific programming languages such as Russell
To enable garbage collection in conjunction with C, Pascal, etc.
It is difficult to design compilers that always preserve garbage collection invariants

Need for a garbage collector that expects less from the program/compiler
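
To make the cost of tagging concrete, here is a minimal Python sketch of one common scheme, assumed purely for illustration: the lowest bit of a 32-bit word is the tag (1 = integer, 0 = pointer). The names `WORD_BITS`, `tag_int`, `untag_int`, and `is_pointer` are hypothetical and do not come from the slides.

```python
WORD_BITS = 32  # assumed machine word size for this sketch

def tag_int(value):
    """Encode a signed integer in a word whose lowest bit is the tag 1."""
    limit = 1 << (WORD_BITS - 2)  # one bit is lost to the tag, one holds the sign
    if not -limit <= value < limit:
        raise OverflowError("value does not fit in the 31 untagged bits")
    return ((value << 1) | 1) & ((1 << WORD_BITS) - 1)

def untag_int(word):
    """Decode a tagged word back into a signed integer."""
    assert (word & 1) == 1, "not a tagged integer"
    value = word >> 1  # drop the tag bit
    if value >= 1 << (WORD_BITS - 2):  # restore the sign
        value -= 1 << (WORD_BITS - 1)
    return value

def is_pointer(word):
    """Under this scheme, any word with tag bit 0 is taken to be a pointer."""
    return (word & 1) == 0
```

Every arithmetic operation has to go through `tag_int`/`untag_int`, and integers passed to untagged code (for example a C library) need conversion, which is the interfacing cost the slide refers to.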

Uncooperative Environment

Program/compiler does not provide information to recognize pointers
  Every register/word is a potential pointer
Not all storage accessible from the stack, registers, etc. is still needed by the program
  Compiled code may fail to destroy references (for performance reasons or because of bugs)
  Particular run-time representations may involve unnecessary references not intended by the programmer
It is difficult to tell whether an object is actually required by the program
  Deleting a necessary object can lead to program failure
Need for CONSERVATIVE garbage collection

Conservative Garbage Collection

Imagine doing a mark-and-sweep GC without knowing for sure whether a cell holds a pointer or some other data.

If it looks like a pointer (that is, it is a valid word-aligned address within the heap memory bounds), assume that it IS a pointer, and trace it and the other pointers in that record too.

Any heap data that is not marked in this way is garbage and can be collected. (There are no pointers to it.)
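
The "looks like a pointer" test can be sketched in a few lines of Python. This is a simulation under stated assumptions, not the collector's code: machine words are plain integers, the word size is assumed to be 4 bytes, and `looks_like_pointer` and `candidate_pointers` are hypothetical names.

```python
WORD_SIZE = 4  # assumed word size in bytes

def looks_like_pointer(word, heap_start, heap_end):
    """True if word is a word-aligned address inside [heap_start, heap_end)."""
    return heap_start <= word < heap_end and word % WORD_SIZE == 0

def candidate_pointers(words, heap_start, heap_end):
    """Filter a sequence of raw words down to those that must be traced."""
    return [w for w in words if looks_like_pointer(w, heap_start, heap_end)]
```

An integer whose value happens to equal an aligned heap address passes this test too. That is the source of the false pointers a conservative collector has to tolerate.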

Discussion

Is conservative garbage collection needed in cooperative systems?
Disadvantages of conservative garbage collection?
  Some amount of inaccessible memory is not reclaimed.
How can we reduce the memory lost because of conservative garbage collection?
  Better checks to detect false pointers

How does the Garbage Collector work?

Uses a mark-sweep, stop-the-world garbage collection algorithm
Procedure:
  Scan all objects referenced directly by pointer variables (roots) from the stack and registers
  Verify that the pointers actually point to the intended objects (validity check) and mark the objects referenced by validated pointers
  Mark objects directly reachable from newly marked objects, repeating until no new objects are marked
  Finally, identify unmarked objects and free them (sweep), e.g. put them on free lists and reuse them to satisfy allocation requests
Objects are not moved.
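
The procedure above can be sketched as a small Python simulation. The model is an assumption made for illustration: the heap is a dictionary from object address to the list of words stored in that object, roots are a list of raw words, and the validity check is reduced to "is this word the address of an allocated object". The names `mark`, `sweep`, and `collect` are hypothetical.

```python
def mark(heap, roots):
    """Return the set of object addresses reachable from the root words."""
    marked = set()
    # Validity check on the roots: keep only words that address an object.
    mark_stack = [w for w in roots if w in heap]
    while mark_stack:
        addr = mark_stack.pop()
        if addr in marked:
            continue
        marked.add(addr)
        # Every word stored in a marked object is itself a potential pointer.
        mark_stack.extend(w for w in heap[addr] if w in heap)
    return marked

def sweep(heap, marked, free_list):
    """Put unmarked objects on the free list; survivors keep their addresses."""
    for addr in sorted(heap):
        if addr not in marked:
            del heap[addr]
            free_list.append(addr)

def collect(heap, roots, free_list):
    """One stop-the-world collection: mark from the roots, then sweep."""
    sweep(heap, mark(heap, roots), free_list)
```

For example, with `heap = {0x100: [0x200, 7], 0x200: [0], 0x300: [0x100], 0x400: [0x400]}` and `roots = [0x100, 12345]`, a call to `collect` keeps `0x100` and `0x200` and puts `0x300` and `0x400` on the free list. The self-referencing object at `0x400` is reclaimed, which reference counting alone would not do.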

Mark/Sweep illustration

[Figure: stack with pointer variables]

Mark/Sweep illustration (2)

[Figure: stack with pointer variables]

Allocator design

The allocation scheme obtains “chunks” of memory.
  Chunks are always multiples of 4k in size.
  Separate free lists for each object size.
Characteristics:
  No per-object space overhead (except mark bits)
  Partial sweeps are possible.
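
A minimal Python sketch of this design, under simplifying assumptions: addresses are plain integers, every chunk is exactly 4k, and the chunk header is ignored so the whole chunk is carved into objects. The class `Allocator` and its members are hypothetical names.

```python
CHUNK_SIZE = 4096  # the slides say multiples of 4k; this sketch uses exactly 4k

class Allocator:
    def __init__(self, heap_base=0x10000):
        self.next_chunk = heap_base  # address of the next fresh chunk
        self.free_lists = {}         # object size -> list of free addresses

    def _add_chunk(self, size):
        """Carve a fresh chunk into equal-sized objects for one free list."""
        base = self.next_chunk
        self.next_chunk += CHUNK_SIZE
        count = CHUNK_SIZE // size
        # Reversed so that pop() hands out the lowest address first.
        self.free_lists.setdefault(size, []).extend(
            base + i * size for i in reversed(range(count)))

    def allocate(self, size):
        """Pop an object of the requested size, adding a chunk if needed."""
        if not 0 < size <= CHUNK_SIZE:
            raise ValueError("size must be between 1 and CHUNK_SIZE")
        if not self.free_lists.get(size):
            self._add_chunk(size)
        return self.free_lists[size].pop()
```

Because every object in a chunk has the same size, no per-object header is needed: the size can be stored once per chunk.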

Heap layout

[Figure: heap layout, with labels "Freelists", "Heap Data", and "4k size chunks"]

Data Structure for Chunks

A list of allocated chunks contains pointers to the beginning of each chunk
Contents of a chunk C:
  Size of the objects in the chunk
  A pointer to the entry for C in the list of allocated chunks
  An area reserved for mark bits corresponding to the objects in the chunk
  Data objects
Is it better than “tagging” integers?
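
The chunk layout described on this slide might be modelled as follows. This is a sketch, not the collector's actual structure: `HEADER_SIZE` is an assumed figure, the back-pointer is modelled as an index into a Python list, and `Chunk`, `allocated_chunks`, and `new_chunk` are hypothetical names.

```python
from dataclasses import dataclass, field

CHUNK_SIZE = 4096
HEADER_SIZE = 64  # assumed space taken by the header fields and mark bits

allocated_chunks = []  # list of chunk base addresses

@dataclass
class Chunk:
    base: int         # address of the beginning of the chunk
    object_size: int  # size of every object stored in this chunk
    list_index: int   # position of this chunk's entry in allocated_chunks
    mark_bits: list = field(init=False)

    def __post_init__(self):
        capacity = (CHUNK_SIZE - HEADER_SIZE) // self.object_size
        self.mark_bits = [False] * capacity  # one mark bit per object slot

    def object_address(self, slot):
        """Address of the object stored in the given slot."""
        return self.base + HEADER_SIZE + slot * self.object_size

def new_chunk(base, object_size):
    """Register a chunk in the allocated-chunk list and build its header."""
    allocated_chunks.append(base)
    return Chunk(base, object_size, len(allocated_chunks) - 1)
```

The mark bits live in the chunk header rather than in the objects, so object words keep their standard machine representation, unlike with tagged integers.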

Finding Roots & Pointers

Possible roots: registers, stack, static areas
No cooperation from compiler
  treat every word as a potential pointer
  ignore interior pointers (standard)
  prefer marking from false pointers over ignoring valid pointers
Conservative pointer identification: given a word p,
  does p refer to the collected heap?
  does it point into a heap block allocated by the collector?
  does it point to the beginning of an object in that block?
  if yes, mark the object in the block header and push the object onto the mark stack
Sweep:
  If a chunk is completely empty, return it to the chunk allocator

Pointer Validity Check

Goal: to minimize marking from false pointers

The pointer “p” must fall within the proper heap address range for it to correspond to an object

If it does, the pointer in the header of the supposed chunk must lead to an entry in the list of allocated chunks that points back to that chunk

The offset of the supposed object from the chunk header must be a multiple of the object size given by the chunk header, and the object must end within the chunk
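
The three checks can be sketched in Python as follows. This is a simulation under assumptions: chunk headers are looked up in a dictionary instead of being read from memory, chunks are 4k and aligned relative to `heap_start`, objects start after an assumed `HEADER_SIZE`, and `is_valid_pointer` is a hypothetical name.

```python
CHUNK_SIZE = 4096
HEADER_SIZE = 64  # assumed header size; objects start right after it

def is_valid_pointer(p, heap_start, heap_end, headers, allocated_chunks):
    """headers maps a chunk base address to (object_size, list_index)."""
    # 1. p must fall inside the heap address range.
    if not heap_start <= p < heap_end:
        return False
    # 2. The supposed chunk header must lead to a list entry that points back.
    base = p - (p - heap_start) % CHUNK_SIZE
    header = headers.get(base)
    if header is None:
        return False
    object_size, list_index = header
    if not 0 <= list_index < len(allocated_chunks):
        return False
    if allocated_chunks[list_index] != base:
        return False
    # 3. p must sit on an object boundary and the object must fit in the chunk.
    offset = p - base - HEADER_SIZE
    if offset < 0 or offset % object_size != 0:
        return False
    return p + object_size <= base + CHUNK_SIZE
```

Step 2 is what rejects memory that only resembles a chunk header: a stale or random header is very unlikely to name a list entry that points back to its own address.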

Garbage Collector as a Debugging Tool

Use the GC to identify allocated memory that is no longer needed by the program but has not yet been freed by it.
Use a tracer to track memory leaks back to the subroutine responsible for them.
Procedure:
  Run an allocation-and-free tracer.
  The subroutine names on the stack are recorded with every call to “malloc”.
  Storage is marked as freed when ‘free’ calls are made.
  When the collector runs, storage that has no pointers to it and was never explicitly deallocated with a ‘free’ call is a likely storage leak.
The collector running with the tracer can find most of the storage that is unmarked by the collector but has never been explicitly “free”d.
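
The tracer idea can be sketched in Python as follows. It is a simulation with hypothetical names (`Tracer`, `likely_leaks`): addresses are integers handed out by a bump counter, the caller passes the subroutine names explicitly, and `marked` stands for the set of addresses the collector found reachable.

```python
class Tracer:
    def __init__(self, heap_base=0x10000):
        self.next_addr = heap_base
        self.blocks = {}  # address -> {"callers": [...], "freed": bool}

    def malloc(self, size, call_stack):
        """Allocate a block and record the subroutine names on the stack."""
        addr = self.next_addr
        self.next_addr += size
        self.blocks[addr] = {"callers": list(call_stack), "freed": False}
        return addr

    def free(self, addr):
        """Record that the program explicitly released this block."""
        self.blocks[addr]["freed"] = True

    def likely_leaks(self, marked):
        """Blocks the collector found unreachable that were never freed."""
        return {addr: info["callers"]
                for addr, info in self.blocks.items()
                if addr not in marked and not info["freed"]}
```

A block that is unreachable yet never freed can no longer be released by the program, so it is reported together with the call stack that allocated it.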

Experimental Results

The mark phase of the Russell collector took 1.9 seconds per megabyte of accessible memory in the heap. The sweep phase took 0.4 seconds per megabyte.

Garbage collection was added to TimberWolf and SDI.

The systems were re-linked so that calls to the Unix allocation routines called the collecting allocator instead.

SunView presented problems because of its remapping of dynamically allocated memory and its ‘notifier’.

Programming styles involving disguised pointers will not work with this collection method.

Use of the proposed GC as a debugging tool has also been demonstrated on the SunView system.

Conclusions

GC is effective for traditional imperative languages with minimal cooperation from the program/compiler
Realistic alternative to explicit memory management for most applications
May not be suitable for real-time applications
No big constraints on coding style, except the hidden pointer problem
GC’ing allocators are competitive even with code not written for GC
The same GC can be used as a debugging tool for programs that manage memory manually
An implementation of this garbage collector can be downloaded online