1 gc advantage: improving program locality xianglong huang, zhenlin wang, stephen m blackburn,...

Post on 18-Jan-2018


1

GC Advantage: Improving Program Locality

Xianglong Huang, Zhenlin Wang, Stephen M Blackburn, Kathryn S McKinley, J Eliot B Moss, Perry Cheng

2

Motivation

Memory gap
How are Java programs affected?

3

Marksweep vs. Copying

pseudojbb

4

Motivation

javac with perfect L1 and L2 caches
16K L1, 256K L2; Appel collector, GCTk; breadth-first traversal

[Figure: _213_javac, cycles (10^9), y-axis from 0 to 25; bars for original, perfect L2, perfect L1]

5

Motivation

Copying collectors can reorder objects
Goal: take advantage of copying collectors and reorder objects to improve locality

6

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structures?

7

Different Root Traversal Policies

Two different types of roots
  Stack, global variables
  Remembered sets (for generational collectors)
Different traversal orders
  Copy all roots before traversing any children
  Copy each root and its children (root-by-root)
  Split roots
    Stack first and the children
    Remset first and the children
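The root traversal policies listed on this slide can be illustrated with a small simulation of copy order. This is only a sketch under stated assumptions: the object graph is a plain dictionary, children are scanned breadth first within each policy (the slide does not say which order is used), and the function names `copy_all_roots_first`, `copy_root_by_root` and `copy_split_roots` are hypothetical, not JikesRVM or JMTk APIs.

```python
from collections import deque

def _scan_breadth_first(queue, children, order, seen):
    """Scan queued objects in FIFO order, copying each unseen child on discovery."""
    while queue:
        obj = queue.popleft()
        for child in children.get(obj, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)

def copy_all_roots_first(roots, children):
    """Copy every root before traversing any children."""
    order, seen, queue = [], set(), deque()
    for root in roots:
        if root not in seen:
            seen.add(root)
            order.append(root)
            queue.append(root)
    _scan_breadth_first(queue, children, order, seen)
    return order

def copy_root_by_root(roots, children):
    """Copy each root together with everything reachable from it, one root at a time."""
    order, seen = [], set()
    for root in roots:
        if root in seen:
            continue
        seen.add(root)
        order.append(root)
        _scan_breadth_first(deque([root]), children, order, seen)
    return order

def copy_split_roots(first_group, second_group, children):
    """Split roots: finish one group of roots and its children before the other group.

    Pass (stack_roots, remset_roots) for "stack first" or the reverse for "remset first".
    """
    order, seen = [], set()
    for group in (first_group, second_group):
        queue = deque()
        for root in group:
            if root not in seen:
                seen.add(root)
                order.append(root)
                queue.append(root)
        _scan_breadth_first(queue, children, order, seen)
    return order

# Hypothetical example graph: two roots and a few heap objects.
example_children = {"s1": ["a", "b"], "s2": ["c"], "a": ["d"]}
print(copy_all_roots_first(["s1", "s2"], example_children))
print(copy_root_by_root(["s1", "s2"], example_children))
```

In this toy graph, root-by-root places each root next to the objects reachable from it, while copying all roots first places the roots together and pushes their children further away.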

8

Experiment Setup

JikesRVM, JMTk
Generational copying collector with bounded nursery size of 4MB
PseudoAdaptive, 2nd iteration

9

Different Root Traversal Policies

• RxR (root-by-root) has the best mutator locality

10

Different Root Traversal Policies

• Total execution time

11

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structures?

12

Different Traversal Orders

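Breadth first: 1,2,3,4,5,6,7
Pure depth first: 1,2,6,3,4,7,5
Pure depth first, LIFO: 1,5,4,7,3,2,6

[Figure: example object tree with nodes 1 to 7; the orders above imply that 1 points to 2, 3, 4 and 5, 2 points to 6, and 4 points to 7]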

13

Different Traversal Orders

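Breadth first: 1,2,3,4,5,6,7
Pure depth first: 1,2,6,3,4,7,5
Pure depth first, LIFO: 1,5,4,7,3,2,6
Partial depth first, 2 children: 1,2,6,3,4,5,7

[Figure: example object tree with nodes 1 to 7; the orders above imply that 1 points to 2, 3, 4 and 5, 2 points to 6, and 4 points to 7]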
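The four orders on this slide can be reproduced with a short simulation. The sketch below makes two assumptions that are not stated in the slides: the tree edges are inferred from the listed orders (1 points to 2, 3, 4, 5; 2 points to 6; 4 points to 7), and "partial depth first, 2 children" is read as "scan the first two children of each object immediately, copy the remaining children but defer scanning them in a FIFO queue". That reading is one interpretation that happens to yield the listed order; all function names are hypothetical.

```python
from collections import deque

# Edges inferred from the traversal orders on the slide (an assumption).
TREE = {1: [2, 3, 4, 5], 2: [6], 4: [7]}

def breadth_first(root, children):
    """FIFO scan: copy each child on discovery, scan objects in copy order."""
    order, seen, queue = [root], {root}, deque([root])
    while queue:
        for child in children.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order

def depth_first(root, children):
    """Recursive preorder: copy an object, then descend into each child in field order."""
    order, seen = [], set()
    def visit(obj):
        seen.add(obj)
        order.append(obj)
        for child in children.get(obj, []):
            if child not in seen:
                visit(child)
    visit(root)
    return order

def depth_first_lifo(root, children):
    """Explicit stack: children are pushed in field order and copied when popped."""
    order, seen, stack = [], {root}, [root]
    while stack:
        obj = stack.pop()
        order.append(obj)
        for child in children.get(obj, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return order

def partial_depth_first(root, children, k=2):
    """Scan the first k children of each object immediately; defer scanning the rest."""
    order, seen, pending = [root], {root}, deque()
    def scan(obj):
        for index, child in enumerate(children.get(obj, [])):
            if child in seen:
                continue
            seen.add(child)
            order.append(child)        # every child is copied on discovery
            if index < k:
                scan(child)            # first k children: scanned right away
            else:
                pending.append(child)  # remaining children: scanned later, FIFO
    scan(root)
    while pending:
        scan(pending.popleft())
    return order

for traversal in (breadth_first, depth_first, depth_first_lifo, partial_depth_first):
    print(traversal.__name__, traversal(1, TREE))
```

The recursive helpers are fine for a seven-node example; a real collector would avoid recursion on deep object graphs.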

14

Class Oblivious Type

Different traversal policies
  Partial depth first (DF) is the best

15

Exploring The Space

Different policies for traversing roots
Class-oblivious traversal orders
  Which traversal order is the best?
Class-based traversal orders
  How to find the “important” data structures?

16

Class-based Traversal

Class-oblivious traversal orders are inflexible
Class-based object traversal
  Static profiling
  Dynamic sampling

17

Static Profiling

Profile object accesses
Find hot pairs with strong correlation
Example
  (1,4), (4,7) and (2,6) have strong correlation
  Order: 1,4,7,2,6,3,5

[Figure: example object tree with nodes 1 to 7]
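The example order can be reproduced with a small sketch in which hot pairs are followed as chains and all other children are deferred. The assumptions here are mine, not the slides': the tree edges are inferred from the traversal orders given on the earlier slides, hot pairs are expressed directly as (parent, child) object pairs, and cold children are copied later in FIFO order. The name `hot_pair_order` is hypothetical.

```python
from collections import deque

# Edges inferred from the earlier traversal-order slides (an assumption).
TREE = {1: [2, 3, 4, 5], 2: [6], 4: [7]}
HOT_PAIRS = {(1, 4), (4, 7), (2, 6)}

def hot_pair_order(root, children, hot_pairs):
    """Copy each object followed immediately by its hot children; defer cold children."""
    order, seen, pending = [], set(), deque([root])

    def copy_chain(obj):
        seen.add(obj)
        order.append(obj)
        kids = children.get(obj, [])
        # Follow strongly correlated pairs first so they end up adjacent.
        for child in kids:
            if child not in seen and (obj, child) in hot_pairs:
                copy_chain(child)
        # Everything else waits in a FIFO queue and is copied later.
        for child in kids:
            if child not in seen:
                pending.append(child)

    while pending:
        obj = pending.popleft()
        if obj not in seen:
            copy_chain(obj)
    return order

print(hot_pair_order(1, TREE, HOT_PAIRS))
```

With an empty set of hot pairs the function copies objects level by level, so in this sketch the advice only changes the layout where a correlation was actually profiled.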

18

Online Profiling

Use the adaptive compiler's sampling
  Hot methods
  Hot basic blocks
Use field accesses to indicate hot fields
Example (in a hot method):

{
  Class A a;
  a.b = …;
  …
}

[Figure: object A whose field b points to object B]
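The idea of turning sampled field accesses into copying advice can be sketched as follows. This is an illustration only: the sample format, the threshold value, and the names `hot_fields` and `scan_order` are assumptions, and the code does not reflect how the JikesRVM adaptive compiler actually records samples.

```python
from collections import Counter

def hot_fields(samples, threshold):
    """Build per-class advice from sampled (class_name, field_name) accesses.

    A field is treated as hot when it was seen at least `threshold` times
    in code that the sampler already identified as hot.
    """
    advice = {}
    for (cls, field), count in Counter(samples).items():
        if count >= threshold:
            advice.setdefault(cls, []).append(field)
    return advice

def scan_order(cls, fields, advice):
    """Order an object's pointer fields so that advised hot fields are followed first."""
    hot = [f for f in fields if f in advice.get(cls, [])]
    cold = [f for f in fields if f not in hot]
    return hot + cold

# Hypothetical samples: field b of class A is accessed often in a hot method.
samples = [("A", "b"), ("A", "b"), ("A", "b"), ("A", "c")]
advice = hot_fields(samples, threshold=2)
print(advice)
print(scan_order("A", ["c", "b", "d"], advice))
```

A collector using such advice would copy the referent of a hot field (B in the example) right after the object that holds it (A), and fall back to a class-oblivious order for objects without advice.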

19

Online Profiling

Micro benchmark results

20

Online Profiling

Geometric mean

21

Reasons

No advice for most of the objects copied
  For jess, db and raytrace, we only pick <<1% of the objects as hot objects
  5% for javac
The hot fields are within the first 2 pointers
  90% of the advised objects for javac

22

Online Profiling

PseudoJBB mutator results
  Generate advice for 23% of the copied objects
  75% of the objects have advised hot fields other than the first 2

23

Questions

Have we found all the hot objects?
  Not all hot objects are connected?
Is class-based good enough?
  For pseudojbb, do we need instance-based?
Locality for the nursery objects?

24

Future Work

Sampling technique
  Catch more hot object accesses
    Lower the threshold
  Hot objects that are not connected
  Dynamically change the advice for phase changes
Nursery locality
Different traversal orders for cold objects
Instance-based

25

Conclusion

Reordering objects during copying collection can improve locality

Among class-oblivious traversal orders, partial depth first is the best

Class-based traversal with online profiling is more flexible: up to 50% better, with very low overhead (~0%)

Still mysteries

26

Questions?

27

Answers?

Lower the threshold of the sampling, not only the hot methods

For objects with only 1 or 2 pointers, it may be easier to just use depth first

Maybe the nursery locality is more important

Instance-based advice

28

Online Profiling

Execution overhead

[Figure: overhead, axis from -6.00% to 5.00%]

29

Online Profiling

Micro benchmark results for mutator time

30

Different Root Traversal Policies

_227_mtrt

31

Static Profiling

Results

32

Answers?

Most objects have only one pointer
Percentage of objects copied by advice (whether it is really hot?)
  For pseudojbb ~50%, for jess <<1%, for our micro benchmark ~16%
Change! Half of the pairs do not form chains longer than 2
Maybe the nursery locality is more important

33

Class Oblivious Orderings

Different traversal policies
  Partial DF is better

pseudoJBB

34

Motivation

MarkSweep vs. Copying Collector

Mutator time of _213_javac

35

Motivation

Mutator L2 misses of _213_javac
