No Bit Left Behind: The Limits of Heap Data Compression
Jennifer B. Sartor*, Martin Hirzel†, Kathryn S. McKinley*
*U Texas at Austin, †IBM Watson
Department of Computer Sciences
ISMM 2008



Page 2:

Current State

Managed languages ubiquitous
Embedded devices
Multicore
Need memory efficiency!

[Diagram: one CPU with L1 and L2 caches; two CPUs, each with its own L1, over one L2]

Page 3:

Memory Efficiency of Managed Languages

COST
  8-94% information content in heap in 37 benchmarks [Mitchell & Sevitsky, OOPSLA 07]
  Boxed objects
  Trailing zeros in arrays
  Redundant objects
  Extra bit-width
  Data structure back-bones

bzip2: 86%

OPPORTUNITY
  Memory layout abstraction
  (Location + size) != identity

Page 4:

Related Work

Ananian & Rinard, LCTES 03: dominant-value field hashing
Appel & Goncalves, Tech Report 93: equal object sharing, constant field elision, bit-width reduction
Chen, Kandemir & Irwin, VEE 05: dominant-value field elision
Chen, et al., OOPSLA 03: zero compression, trailing zero trimming
Cooprider & Regehr, PLDI 07: value set indirection
Marinov & O'Callahan, OOPSLA 03: equal object sharing
Stephenson, Babb & Amarasinghe, PLDI 00: constant field elision, bit-width reduction
Titzer, et al., PLDI 07: value set indirection
Zilles, ISMM 07: bit-width reduction

Page 5:

Limit Study

Quantitatively compare heap data compression
  Surveyed literature
  Savings equations
  Methodology for evaluation
  Apples-to-apples comparison
  Future work: implementation
Hybrid techniques
Findings: array & hybrid compression

58%

Page 6:

Hybrid Array Compression

x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001

Redundancy
  Equal array sharing

x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001

Page 7:

Equal Object Sharing
Marinov & O'Callahan, OOPSLA 03; Appel & Goncalves, Tech Report 93

Two objects are equal if both:
  Same class & all fields have same value
  Strictly-equal: pointer fields identical
  Deep: objects' pointer targets are equal
JVM stores only 1 copy in hashtable

Class C, N objects, D distinct; save:

$$(N - D) \times \mathit{sizeof}(C) - \mathit{hashTableSize}(D, \mathit{pntrSize})$$

14%
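As a rough illustration of the savings expression on this slide, (N − D) × sizeof(C) − hashTableSize(D, pntrSize), the Python sketch below counts distinct objects of one class. It assumes each object is modeled as a tuple of field values, so tuple equality stands in for strict equality, and that the hash table costs one pointer per distinct object. The names `equal_sharing_savings` and `hash_table_size` are hypothetical, and the slides do not define the hash table cost model.

```python
def hash_table_size(distinct, pntr_size):
    # Assumed cost model (not from the slides): one pointer per distinct object.
    return distinct * pntr_size


def equal_sharing_savings(objects, sizeof_c, pntr_size):
    """Bytes saved for one class C: (N - D) * sizeof(C) - hashTableSize(D, pntrSize)."""
    n = len(objects)        # N: all objects of class C
    d = len(set(objects))   # D: distinct objects (field tuples compared by value)
    return (n - d) * sizeof_c - hash_table_size(d, pntr_size)


# Four 16-byte objects, two of them distinct, with 4-byte pointers.
objs = [(1, 2), (1, 2), (3, 4), (1, 2)]
print(equal_sharing_savings(objs, sizeof_c=16, pntr_size=4))
```

A negative result would mean the hash table costs more than the duplicates it removes.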

Page 8:

Hybrid Array Compression

x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001

Redundancy
  Equal array sharing
  Value set indirection

x0001 x0001 x0058 x0001 x0004 x0001 x0000 x0001

Dictionary: x0001 x0058 x0004 x0000
Indices:    0 0 1 0 2 0 3 0
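The dictionary and index row on this slide can be reproduced with a small Python sketch. It assumes dictionary entries are assigned in order of first appearance, which happens to match the slide's example; the function names `encode` and `decode` are illustrative only.

```python
def encode(values):
    """Replace each value by a small index into a dictionary of distinct values."""
    dictionary = []   # distinct values, in order of first appearance (an assumption)
    index_of = {}     # value -> position in dictionary
    indices = []
    for v in values:
        if v not in index_of:
            index_of[v] = len(dictionary)
            dictionary.append(v)
        indices.append(index_of[v])
    return dictionary, indices


def decode(dictionary, indices):
    """Recover the original array from the dictionary and the index array."""
    return [dictionary[i] for i in indices]


array = [0x0001, 0x0001, 0x0058, 0x0001, 0x0004, 0x0001, 0x0000, 0x0001]
dictionary, indices = encode(array)
print([hex(v) for v in dictionary], indices)
```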

Page 9:

Value Set Indirection & Caching
Cooprider & Regehr / Titzer, et al., PLDI 07

For object field or array elements with large range of values
Dictionary (or cache) of 256 most frequent values; instance stores small 1-byte indices
If > 256 values: 255 in dictionary, 256th says to store rest (M) in hashtable with objectID

Savings:

$$\sum_{a \in T[]} a.\mathit{length} \times (\mathit{sizeof}(T) - 1) - \mathit{arrayHdrSize} - 256 \times \mathit{sizeof}(T) - \mathit{hashTableSize}'(M, \mathit{sizeof}(T))$$

14%
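To make the arithmetic in this savings expression concrete, here is a minimal Python sketch. It assumes the sum applies only to the per-element term and that the dictionary overhead (array header plus 256 entries) is charged once per element type T. Because the slides do not define hashTableSize′, the overflow table cost is passed in as a plain byte count; `indirection_savings` and its parameter names are hypothetical.

```python
def indirection_savings(array_lengths, elem_size, array_hdr_size, overflow_table_bytes=0):
    """Estimated bytes saved when every T[] element shrinks to a 1-byte index."""
    # Each element shrinks from sizeof(T) bytes to a 1-byte index.
    per_element = sum(n * (elem_size - 1) for n in array_lengths)
    # Dictionary overhead: one array header plus 256 entries of sizeof(T) bytes.
    dictionary = array_hdr_size + 256 * elem_size
    # Overflow hash table for the M values that do not fit in the dictionary.
    return per_element - dictionary - overflow_table_bytes


# Two int arrays (4-byte elements) of 1000 and 500 elements, 12-byte array header.
print(indirection_savings([1000, 500], elem_size=4, array_hdr_size=12))
```

With small arrays the fixed dictionary cost dominates and the result goes negative, which is why the technique only pays off for types with many elements.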

Page 10:

Hybrid Array Compression 2

x00A0 x0073 x0002 x0001 x0101 x0000 x0000 x0000

Remove zeros
  Trim trailing zeros
  Bit width reduce
  Zero compress

After trim trailing zeros:  x00A0 x0073 x0002 x0001 x0101   8 5
After bit width reduce:     x0A0 x073 x002 x001 x101   8 5
After zero compress:        x0A x73 x2 x001 x101   8 5   10101111 = xAF
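The first two steps of this example, trimming trailing zeros and reducing bit width, can be sketched in Python as follows. The sketch assumes non-negative element values and reports the minimum number of bits needed for the largest remaining element; the slide itself shows three hex digits per element, so the exact width it uses may differ. The function names are illustrative.

```python
def trim_trailing_zeros(values):
    """Drop the run of zeros at the end; keep the original length for recovery."""
    end = len(values)
    while end > 0 and values[end - 1] == 0:
        end -= 1
    return values[:end], len(values)


def min_bit_width(values):
    """Smallest bit width that holds every (non-negative) value."""
    return max((v.bit_length() for v in values), default=0)


array = [0x00A0, 0x0073, 0x0002, 0x0001, 0x0101, 0x0000, 0x0000, 0x0000]
trimmed, original_length = trim_trailing_zeros(array)
print(original_length, len(trimmed), min_bit_width(trimmed))
```

For the slide's array this yields an original length of 8 and a trimmed length of 5, the same pair of numbers shown next to the compressed rows.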

Page 11:

Zero-based Object Compression
Chen, et al., OOPSLA 03

Remove bytes that are entirely zero
Per-object bit-map: 1 bit per byte
Store only non-zero bytes

Savings:

$$\sum_{o \in \mathit{objects}} \left( \mathit{zeroBytes}(o) - \left\lceil \frac{\mathit{totalBytes}(o)}{8} \right\rceil \right)$$

45%
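A minimal Python sketch of the scheme on this slide: a per-object bit-map with one bit per byte, plus only the non-zero bytes. It assumes the bit-map is rounded up to whole bytes and that bit i of the bit-map marks byte i as non-zero; both the bit ordering and the function names are choices made for this illustration, not details given in the slides.

```python
def zero_compress(data: bytes):
    """Return (bit-map, non-zero bytes); bit i is set when data[i] is non-zero."""
    bitmap = bytearray((len(data) + 7) // 8)  # 1 bit per byte, rounded up
    nonzero = bytearray()
    for i, b in enumerate(data):
        if b != 0:
            bitmap[i // 8] |= 1 << (i % 8)
            nonzero.append(b)
    return bytes(bitmap), bytes(nonzero)


def zero_decompress(bitmap: bytes, nonzero: bytes, total: int) -> bytes:
    """Rebuild the original bytes from the bit-map and the stored non-zero bytes."""
    out = bytearray(total)
    it = iter(nonzero)
    for i in range(total):
        if (bitmap[i // 8] >> (i % 8)) & 1:
            out[i] = next(it)
    return bytes(out)


def zero_savings(data: bytes) -> int:
    """Per-object savings: zero bytes removed minus the bit-map bytes added."""
    return data.count(0) - (len(data) + 7) // 8


obj = bytes([0, 0, 0, 0xA0, 0, 0, 0, 0x73, 0, 0, 0, 0, 0, 0, 0, 1])
bitmap, nonzero = zero_compress(obj)
print(len(bitmap), len(nonzero), zero_savings(obj))
```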

Page 12:

Hybrid Array Compression 2

x00A0 x0073 x0002 x0001 x0101 x0000 x0000 x0000

Remove zeros
  Trim trailing zeros
  Bit width reduce
  Zero compress

After trim trailing zeros:  x00A0 x0073 x0002 x0001 x0101   8 5
After bit width reduce:     x0A0 x073 x002 x001 x101   8 5
After zero compress:        x0A x73 x2 x001 x101   8 5   xAF

Page 13:

Methodology

[Diagram: Program run -> heap dump series (garbage collection snapshots over time t) -> analysis representation -> Model 1 ... Model n -> limit savings (s)]

Page 14:

Experimental Details

Jikes Research Virtual Machine
  Java-in-Java
DaCapo benchmarks + pseudojbb
20-25 heap snapshots per benchmark
MarkSweep with 2x min heap
Analysis
  Per class
  Objects and arrays separated
  JVM+app vs application (separated in paper)
  Per heap snapshot, and over all snapshots

Page 15:

Technique                        Class   Array   GC/Run
Lempel-Ziv compression               X           GC
Strictly-equal object sharing    Obj     Type    GC
Deep-equal object sharing        Obj     Type    GC
Zero-based object compression    Obj     Inst    GC
Trailing zero array trimming             Inst    GC
Bit-width reduction              Fld     Inst    GC/Run
Dominant-value field hashing     Fld             GC
Lazy invariant computation       Fld             GC
Value set indirection            Fld     Type    GC
Value set caching                Fld     Type    GC
Constant field elision           Fld             Run
Dominant-value field elision     Fld             Run

Page 16:

[Figure: results chart labeled Value Indirection & Cache, Deep Equal Sharing, Zero Compression, Hybrid Compression]

Page 17:

Stability of Savings

[Figure: savings (axis 0% to 25%) for fop, snapshots over time. Series: combArr, maxArr, zeroArr, combCls, maxCls, bitwArr, zeroCls, hashFld, indirFld, cacheArr, strEqArr, bitwFld, charArr, trailArr, strEqCls, lazyFld, cacheFld, indirArr, boolArr]

Page 18:

Conclusions

Limit study compares heap data compression techniques apples-to-apples
Potential to reduce memory inefficiencies in managed languages
  Arrays
  Hybrids
Future: save space
Challenge: efficient detection & recovery

Thank you!