net gc internals

48
.NET GC Internals Mark phase @konradkokosa / @dotnetosorg 1 / 36

Upload: others

Post on 26-May-2022

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: NET GC Internals

.NET GC Internals

Mark phase@konradkokosa / @dotnetosorg

1 / 36

Page 2: NET GC Internals

.NET GC Internals

(Non-Concurrent) Mark phase

2 / 36

Page 3: NET GC Internals

.NET GC Internals Agenda

Introduction - roadmap and fundamentals, source code, ...Mark phase - roots, object graph traversal, mark stack, mark/pinned �ag, marklist, ...Concurrent Mark phase - mark array/mark word, concurrent visiting, �oatinggarbage, write watch list, ...Plan phase - gap, plug, plug tree, brick table, pinned plug, pre/post plug, ...Sweep phase - free list threading, concurrent sweep, ...Compact phase - relocate references, compact, ...Generations - physical organization, card tables, ...Allocations - bump pointer allocator, free list allocator, allocation context, ...

3 / 36

Page 4: NET GC Internals

.NET GC Internals Agenda

Introduction - roadmap and fundamentals, source code, ...Mark phase - roots, object graph traversal, mark stack, mark/pinned �ag, marklist, ...Concurrent Mark phase - mark array/mark word, concurrent visiting, �oatinggarbage, write watch list, ...Plan phase - gap, plug, plug tree, brick table, pinned plug, pre/post plug, ...Sweep phase - free list threading, concurrent sweep, ...Compact phase - relocate references, compact, ...Generations - physical organization, card tables, ...Allocations - bump pointer allocator, free list allocator, allocation context, ...Roots internals - stack roots, GCInfo, partially/full interruptible methods, statics,Thread-local Statics (TLS), ...

3 / 36

Page 5: NET GC Internals

.NET GC Internals Agenda

Introduction - roadmap and fundamentals, source code, ...Mark phase - roots, object graph traversal, mark stack, mark/pinned �ag, marklist, ...Concurrent Mark phase - mark array/mark word, concurrent visiting, �oatinggarbage, write watch list, ...Plan phase - gap, plug, plug tree, brick table, pinned plug, pre/post plug, ...Sweep phase - free list threading, concurrent sweep, ...Compact phase - relocate references, compact, ...Generations - physical organization, card tables, ...Allocations - bump pointer allocator, free list allocator, allocation context, ...Roots internals - stack roots, GCInfo, partially/full interruptible methods, statics,Thread-local Statics (TLS), ...Q&A - "but why can't I manually delete an object?", ...

3 / 36

Page 6: NET GC Internals

02. .NET GC Internals - Mark phase

This module agenda:

introductionobject graphobject graph traversal

implementationtraversalpin/mark �agmark stack & mark listvectorized mark list sorting "story"

inside code .NET runtime 😍

4 / 36

Page 7: NET GC Internals

Mark phase

We need to know which objects are "live"...

5 / 36

Page 8: NET GC Internals

Object graph

In memory:

6 / 36

Page 9: NET GC Internals

Type data:

record A(B b, D d);record B(int X);record C(B b, F f);record D(E e);record E(G g);record F(int X);record G(int Z);

Object graph

In memory:

7 / 36

Page 10: NET GC Internals

Type data:

record A(B b, D d);record B(int X);record C(B b, F f);record D(E e);record E(G g);record F(int X);record G(int Z);

Current "state":

var a = new A(..., ...);var d = new D(...);...we are here...

Object graph

In memory:

8 / 36

Page 11: NET GC Internals

Type data:

record A(B b, D d);record B(int X);record C(B b, F f);record D(E e);record E(G g);record F(int X);record G(int Z);

Current "state":

var a = new A(..., ...);var d = new D(...);...we are here...

Object graph:

Object graph

In memory:

9 / 36

Page 12: NET GC Internals

Object graph traversal

10 / 36

Page 13: NET GC Internals

Object graph traversal

11 / 36

Page 14: NET GC Internals

Object graph traversal

12 / 36

Page 15: NET GC Internals

Object graph traversal

13 / 36

Page 16: NET GC Internals

Object graph traversal

14 / 36

Page 17: NET GC Internals

Object graph traversal

15 / 36

Page 18: NET GC Internals

Object graph traversal

16 / 36

Page 19: NET GC Internals

Object graph traversal

17 / 36

Page 20: NET GC Internals

Object graph traversal

18 / 36

Page 21: NET GC Internals

Object graph traversal

19 / 36

Page 22: NET GC Internals

Object graph traversal

20 / 36

Page 23: NET GC Internals

Object graph traversal

21 / 36

Page 24: NET GC Internals

Object graph traversal

22 / 36

Page 25: NET GC Internals

Object graph traversal

22 / 36

Page 26: NET GC Internals

Object graph traversal

We have just discovered reachability of the objects (from at least one root) bymarking algorithm.

22 / 36

Page 27: NET GC Internals

Object graph traversal

We have just discovered reachability of the objects (from at least one root) bymarking algorithm.

Reachability is the closest we can get to true "usability" - we don't know the future.22 / 36

Page 28: NET GC Internals

Object graph traversal - roots

stackregistersstatic/thread-local static data�nalization queueinter-generational references ("cards", "card tables") - we will return to that......

23 / 36

Page 29: NET GC Internals

Mark phase implementation

Sequentially for every root type (like stack, �nalization, ...):

1. Collect the roots into the "to visit list" (the mark stack)2. For each given target address addr from the mark stack:

set pinning �ag (in the Header) - if the runtime says sostart traversal:

skip already visited objectmark an object (in the MT)add outgoing references to the mark stack

24 / 36

Page 30: NET GC Internals

Mark phase implementation

Sequentially for every root type (like stack, �nalization, ...):

1. Collect the roots into the "to visit list" (the mark stack)2. For each given target address addr from the mark stack:

translate it to the proper address of a managed object - we will return to that...set pinning �ag (in the Header) - if the runtime says sostart traversal:

skip already visited objectmark an object (in the MT)add outgoing references to the mark stack

25 / 36

Page 31: NET GC Internals

Mark phase implementation

Sequentially for every root type (like stack, �nalization, ...):

1. Collect the roots into the "to visit list" (the mark stack)2. For each given target address addr from the mark stack:

translate it to the proper address of a managed object - we will return to that...set pinning �ag (in the Header) - if the runtime says sostart traversal:

skip already visited objectmark an object (in the MT)add its address to the mark list (if not over�owed)add outgoing references to the mark stack

26 / 36

Page 32: NET GC Internals

Mark stack: Mark list:

Mark phase - let's draw!

27 / 36

Page 33: NET GC Internals

Mark phase

Findings:

mark and pinned �ags are added only during the GC - and cleared afterwardsat Plan phase

ie. by looking at a regular memory dump in-between GCs, we won't see thosebits setdiagnostics tool needs to traverse the graph from roots to notice an object ispinned

mark stack is used as sa�er approach that recursionmark list will help us later, if sortedit is pretty a lot of work to do (and non-sequential memory access...)!

28 / 36

Page 34: NET GC Internals

Dan Schechter - .NET Intrinsics inCoreCLR 3.0

Peter Sollich - The .NET GarbageCollector

Mark phase - "mark list sorting story"

29 / 36

Page 35: NET GC Internals

Dan Schechter - .NET Intrinsics inCoreCLR 3.0

Peter Sollich - The .NET GarbageCollector

Mark phase - "mark list sorting story"

Oct 2019 - "Dan, we could improve our mark list sorting with that..." - Peter

29 / 36

Page 36: NET GC Internals

Dan Schechter - .NET Intrinsics inCoreCLR 3.0

Peter Sollich - The .NET GarbageCollector

Mark phase - "mark list sorting story"

Oct 2019 - "Dan, we could improve our mark list sorting with that..." - Peter

Jul 2020 - https://github.com/dotnet/runtime/pull/37159 - "Vxsort"faster sorting code from Dan Shechter, and bigger mark list, used forMarking, using AVX2/AVX512F👉 shorter GC pauses 😍 29 / 36

Page 37: NET GC Internals

Mark phase - sidenote #1

30 / 36

Page 38: NET GC Internals

Mark phase - sidenote #1

"translate it to the proper address of a managed object" - aka interior pointers

30 / 36

Page 39: NET GC Internals

Mark phase - sidenote #1

"translate it to the proper address of a managed object" - aka interior pointers

30 / 36

Page 40: NET GC Internals

Mark phase - sidenote #1

"translate it to the proper address of a managed object" - aka interior pointers

void GCHeap::Promote(Object** ppObject, ..., uint32_t flags){ // ... if (flags & GC_CALL_INTERIOR) { if ((o = hp->find_object (o)) == 0) { return; } 30 / 36

Page 41: NET GC Internals

Mark phase - sidenote #1

"translate it to the proper address of a managed object" - aka interior pointers

void GCHeap::Promote(Object** ppObject, ..., uint32_t flags){ // ... if (flags & GC_CALL_INTERIOR) { if ((o = hp->find_object (o)) == 0) // based on bricks - although not yet available at Mark phase { return; } 31 / 36

Page 42: NET GC Internals

Mark phase - sidenote #2

32 / 36

Page 43: NET GC Internals

Mark phase - sidenote #2

32 / 36

Page 44: NET GC Internals

Mark phase - sidenote #2

shortest root pathdependency subgraph – total sizeretained subgraph – retained size

32 / 36

Page 45: NET GC Internals

Mark phase - inside code

Mark phase starts in the gc_heap::mark_phase and has calls to:

GCScan::GcScanRoots that calls Thread::StackWalkFrames with GCHeap::PromoteCFinalize::GcScanRoots using GCHeap::PromoteGCScan::GcScanHandles (with GCHeap::Promote callback) methods that callsRef_TracePinningRoots (for types HNDTYPE_PINNED and HNDTYPE_ASYNCPINNED),Ref_TraceNormalRoots (fe. for type HNDTYPE_STRONG) etc.gc_heap::mark_through_cards_for_segments (for SOH) andgc_heap::mark_through_cards_for_large_objects (for LOH)GCScan::GcDhInitialScan and scan_dependent_handles handle scanning"dependent handles"

GCHeap::Promote method calls the go_through_object_cl macro that triggerstraversal through objects’ references. The main work is done ingc_heap::mark_object_simple1 that realizes depth-�rst object graph traversalusing "mark stack" called mark_stack_array (with mark_stack_bos andmark_stack_tos indexes pointing to the bottom and the top of the stack).Setting "mark bit" happens in gc_mark(o)/gc_mark1(o) methods.

33 / 36

Page 46: NET GC Internals

Mark phase - inside code

Additionally, gc_heap::mark_object_simple/gc_heap::mark_object_simple1methods, while traversal, are using m_boundary macro to populare mark_list.

Mark list is maintained/sorted only for "non concurrent" ephemeral GC("multiple segments are more complex to handle and the list is likely toover�ow"). Sorting happens in:

mark_phase (only iff PARALLEL_MARK_LIST_SORT & MULTIPLE_HEAPS) callinggc_heap::sort_mark_list (using do_vxsort if USE_VXSORT)plan_phase (only iff !MULTIPLE_HEAPS) using do_vxsort (if USE_VXSORT)

If mark list over�ows, we expand it up to maximum in gc_heap::grow_mark_list:

// with vectorized sorting, we can use bigger mark lists#ifdef USE_VXSORT#ifdef MULTIPLE_HEAPS const size_t MAX_MARK_LIST_SIZE = IsSupportedInstructionSet (InstructionSet::AVX2) ? 1000 * 1024 : 200 * 102#else //MULTIPLE_HEAPS const size_t MAX_MARK_LIST_SIZE = IsSupportedInstructionSet (InstructionSet::AVX2) ? 32 * 1024 : 16 * 1024;#endif //MULTIPLE_HEAPS#else... 34 / 36

Page 47: NET GC Internals

Thank you! Any questions?!

35 / 36

Page 48: NET GC Internals

36 / 36