Post on 29-Dec-2015

Improving Productivity With Fine-grain Compiler-based Checkpointing

Chuck (Chengyan) Zhao, Prof. Greg Steffan, Prof. Cristiana Amza, Allan Kielstra*

Dept. of Electrical and Computer Engineering, University of Toronto

IBM Toronto Lab*

Nov. 10, 2011

Productivity and Compilers

Programmer's productivity: important
- computers: fast, cheap
- programmers: slow (relatively), expensive

New way for a compiler to help?
- automatic fine-grain checkpointing (CKPT)
- optimizations to reduce checkpoint overhead

Applications of checkpointing
- accelerate the bug-finding process
- automated support for backtracking algorithms

A compiler can improve programmer's productivity via automatic CKPT

Compiler Checkpointing (CKPT) Framework

[Figure: compilation pipeline. Source code (C/C++) -> annotated source -> LLVM frontend -> LLVM IR -> Enable Checkpointing -> Optimize Checkpointing -> Backend Process -> x86, x64, …, POWER, C/C++]

Enable Checkpointing
- Callsite Analysis
- Intra-procedural Transformations
- Inter-procedural Transformations
- Special Cases Handling

Optimize Checkpointing
1. CKPT Inlining
2. Pre Optimize
3. Redundancy Eliminations
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. Heap Optimize
8. Array Optimize
9. Post Optimize

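Compiler-based checkpointing basics

[Figure: a main program executing "… a = 5; b = 7;", main memory holding a and b, and a checkpoint buffer holding the entries (&a, 0) and (&b, 0). The buffer records each variable's address and old value (0); failure recovery uses these entries to restore main memory.]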
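The buffer in the figure can be sketched as a small undo log. This is a minimal illustration, not the framework's runtime: the fixed capacities, the entry layout, and the reading of the `stop_ckpt` argument as "nonzero means roll back" are all assumptions made for the sketch. Only the names `start_ckpt`, `backup`, and `stop_ckpt` come from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOG_CAPACITY   64   /* hypothetical limit on buffered entries */
#define MAX_ENTRY_SIZE 16   /* hypothetical limit on bytes per entry  */

/* One checkpoint-buffer entry: where the data lives and what it held
   before the store, i.e. the (&a, 0) pair in the figure. */
typedef struct {
    void         *addr;
    size_t        size;
    unsigned char old[MAX_ENTRY_SIZE];
} ckpt_entry;

static ckpt_entry ckpt_log[LOG_CAPACITY];
static size_t     ckpt_len = 0;

/* Begin a checkpoint region with an empty buffer. */
void start_ckpt(void) { ckpt_len = 0; }

/* Save the current contents of [addr, addr + size) before a store. */
void backup(void *addr, size_t size) {
    assert(ckpt_len < LOG_CAPACITY && size <= MAX_ENTRY_SIZE);
    ckpt_entry *e = &ckpt_log[ckpt_len++];
    e->addr = addr;
    e->size = size;
    memcpy(e->old, addr, size);
}

/* End the region. Assumption: a nonzero argument means failure, so the
   saved values are copied back newest-first; zero commits the stores. */
void stop_ckpt(int failed) {
    if (failed) {
        while (ckpt_len > 0) {
            ckpt_entry *e = &ckpt_log[--ckpt_len];
            memcpy(e->addr, e->old, e->size);
        }
    }
    ckpt_len = 0;
}

/* Mirrors the slide: a and b start at 0, become 5 and 7, then recover.
   Returns 1 when both variables hold their old values again. */
int demo_basic_rollback(void) {
    int a = 0, b = 0;
    start_ckpt();
    backup(&a, sizeof(a)); a = 5;
    backup(&b, sizeof(b)); b = 7;
    stop_ckpt(1);
    return a == 0 && b == 0;
}
```

Calling `demo_basic_rollback()` from a `main` exercises the recovery path; passing 0 to `stop_ckpt` instead keeps the new values. Restoring newest-first matters once an address is backed up more than once, because the oldest saved value must be the last one written back.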

Transformations to Enable Checkpointing

3 steps:
1. Callsite analysis
2. Intra-procedural transformation
3. Inter-procedural transformation

    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    handleMemcpy(…);
    memcpy(d, s, len);
    foo_ckpt();
    foo();
    stop_ckpt(cond);

    foo(…){ /* body of foo() */ }
    foo_ckpt(…){ /* body of foo_ckpt() */ }
    …

Optimize Checkpointing

Checkpointing Optimization Framework
1. CKPT Inlining
2. Pre Optimization
3. Redundancy Eliminations (3 REs)
4. Hoisting
5. Aggregation
6. Non Rollback Exposed Store Elimination
7. DynMem (Heap) Optimization
8. Array Optimization
9. Post Optimization

Redundancy Elimination Optimization

    start_ckpt();
    …
    if (C){
      backup(&a, sizeof(a));
      a = …;
    }
    …
    backup(&a, sizeof(a));
    a = …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

Algorithm
- establish dominating relationship among backup calls
- promote leading backup call
- re-establish dominating relationship
- eliminate all non-leading backup call(s)

[Figure labels: "dom" edges between backup calls; stop_ckpt() marker]

RE1: remove all non-leading backup call(s)
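The effect of RE1 on this region can be sketched by writing the code before and after the optimization by hand and counting backup calls. The int-only log, the `backup_calls` counter, and the two `region_*` functions are hypothetical names introduced for illustration. The second function shows what the region might look like once the conditional backup has been promoted above the `if`, which is an assumed reading of "promote leading backup call".

```c
#include <assert.h>
#include <stddef.h>

/* Minimal int-only undo log; the real buffer layout is not given in the slides. */
typedef struct { int *addr; int old; } entry;
static entry  log_buf[16];
static size_t log_len = 0;
static size_t backup_calls = 0;   /* hypothetical counter for the comparison */

void start_ckpt(void) { log_len = 0; backup_calls = 0; }

void backup(int *addr) {
    assert(log_len < 16);
    log_buf[log_len].addr = addr;
    log_buf[log_len].old  = *addr;
    log_len++;
    backup_calls++;
}

/* Undo all stores, newest entry first. */
void rollback(void) {
    while (log_len > 0) {
        log_len--;
        *log_buf[log_len].addr = log_buf[log_len].old;
    }
}

/* Region as first instrumented: every store to *a has its own backup. */
size_t region_before_re1(int *a, int c) {
    start_ckpt();
    if (c) { backup(a); *a = 1; }
    backup(a); *a = 2;
    backup(a); *a = 3;
    rollback();
    return backup_calls;
}

/* Region after RE1 (assumed shape): the leading backup is promoted above
   the branch so it dominates every store, and the others are removed. */
size_t region_after_re1(int *a, int c) {
    start_ckpt();
    backup(a);
    if (c) { *a = 1; }
    *a = 2;
    *a = 3;
    rollback();
    return backup_calls;
}
```

With `c` nonzero the first version makes three backup calls and the second makes one, and both restore `*a` to its value at `start_ckpt()`. Only the oldest saved value is ever needed on rollback, which is why the dominated backups are redundant.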

Definition: Rollback Exposed Store

    int a, b;
    …
    start_ckpt();
    …
    b = … a op …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

Rollback Exposed Store: a store to a location with a possible previous load of that location

Must back up 'a' because the prior load of 'a' must access the "old" value on rollback, i.e., 'a' is "rollback exposed"

Rollback Exposed Store needs backup

Non-Rollback Exposed Store Elimination (NRESE)

    int a, b;
    …
    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    …
    stop_ckpt(cond);

No prior use of 'a', hence it is non-rollback-exposed; we can eliminate the backup of 'a'

Algorithm Description
- no use of the address (&a) on any path
- the backup address (&a) isn't aliased to anything (empty points-to set)

NRESE is a new, checkpoint-specific optimization
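A small sketch of why the backup can be dropped: if a location is always stored before it is loaded inside the region, re-running the region after a rollback regenerates it without needing the old value. The variables `a`, `b`, `count` and the function `region` are hypothetical, and the sketch assumes that execution resumes at the start of the region after recovery.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal int-only undo log, an assumption made for the sketch. */
typedef struct { int *addr; int old; } entry;
static entry  log_buf[16];
static size_t log_len = 0;

void start_ckpt(void) { log_len = 0; }

void backup(int *addr) {
    assert(log_len < 16);
    log_buf[log_len].addr = addr;
    log_buf[log_len].old  = *addr;
    log_len++;
}

/* Undo all logged stores, newest entry first. */
void rollback(void) {
    while (log_len > 0) {
        log_len--;
        *log_buf[log_len].addr = log_buf[log_len].old;
    }
}

static int a, b, count;

/* Body of a checkpoint region after NRESE has been applied by hand. */
void region(int input) {
    a = input * 2;       /* no load of a before this store: backup omitted  */
    b = a + 1;           /* b is also stored before any load of b           */
    backup(&count);      /* count is loaded below before it is overwritten, */
    count = count + 1;   /* so this store is rollback exposed               */
}

/* Fail once, retry, and check that the retry matches a clean single run. */
int demo_nrese(void) {
    a = 100; b = 200; count = 0;
    start_ckpt();
    region(3);
    rollback();          /* restores count only; a and b keep stale values */
    int only_count_restored = (a == 6 && b == 7 && count == 0);
    start_ckpt();
    region(5);           /* retry overwrites a and b before reading them */
    return only_count_restored && a == 10 && b == 11 && count == 1;
}
```

`demo_nrese()` returns 1: the stale values left in `a` and `b` after rollback are never observed, while `count` would end at 2 instead of 1 if its backup were removed as well.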

Applications

App1: CKPT enabled debugging

CKPT Region
- T: safe point, earlier than P, that the program can reach through checkpoint recovery
- P: root cause of a bug
- Q: place where the bug manifests (a user or programmer notices the bug at this point)

Key benefits
- execution rewinding
- arbitrarily large region
- unlimited # of retries
- no restart from beginning

App2: CKPT enabled backtracking

CKPT Region
- T: pick a pair of blocks to swap
- Q: keep swap if improvement, discard otherwise

Proceed with VPR's random/simulated-annealing based algorithm

Key benefits
- automate support for backtracking
  - backup actions
  - abort
  - commit
- cover arbitrarily complex algorithm
- cleaner code, simplify programming
  - programmer focus on algorithm
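The swap-and-evaluate step can be sketched as follows. This is not VPR code: the placement array, the toy `cost` function, and `try_swap` are hypothetical, and the backup calls are written by hand here to show the kind of code the compiler would insert automatically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal int-only undo log, an assumption made for the sketch. */
typedef struct { int *addr; int old; } entry;
static entry  log_buf[16];
static size_t log_len = 0;

void start_ckpt(void) { log_len = 0; }

void backup(int *addr) {
    assert(log_len < 16);
    log_buf[log_len].addr = addr;
    log_buf[log_len].old  = *addr;
    log_len++;
}

/* Commit: keep the new values and drop the saved ones. */
void commit(void) { log_len = 0; }

/* Abort: restore the saved values, newest entry first. */
void rollback(void) {
    while (log_len > 0) {
        log_len--;
        *log_buf[log_len].addr = log_buf[log_len].old;
    }
}

/* Toy cost: total distance of each block from the slot matching its id. */
int cost(const int *place, size_t n) {
    int c = 0;
    for (size_t i = 0; i < n; i++) c += abs(place[i] - (int)i);
    return c;
}

/* T: swap blocks i and j inside a checkpoint region.
   Q: keep the swap if the cost improved, discard it otherwise.
   Returns 1 when the swap was committed, 0 when it was rolled back. */
int try_swap(int *place, size_t n, size_t i, size_t j) {
    int before = cost(place, n);
    start_ckpt();
    backup(&place[i]);
    backup(&place[j]);
    int tmp = place[i];
    place[i] = place[j];
    place[j] = tmp;
    if (cost(place, n) < before) { commit(); return 1; }
    rollback();
    return 0;
}
```

Starting from `{3, 1, 2, 0}`, `try_swap(place, 4, 0, 3)` lowers the cost from 6 to 0 and is committed, while a following `try_swap(place, 4, 0, 1)` raises it and is undone. The algorithm code never has to track which elements a trial move touched.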

Evaluation

Platform and Benchmarks

Evaluation Platform
- Core i7 920, 12GB DDR3, 200GB SATA
- Debian6-i386, gcc/g++-4.4.5
- LLVM-2.9

Benchmarks
- BugBench: 1.2.0
  - 5 programs with buffer-overflow bugs
  - 3 CKPT regions per program: Small, Medium, Large
- VPR: 5.0.2
  - FPGA CAD tool, 1 CKPT region

CKPT Comparison
- libCKPT: U. Tennessee
- ICCSTM: Intel ICC based STM

Compare with Coarse-grain Scheme: libCKPT

HUGE gain over coarse-grain libCKPT

Compare with Fine-grain Scheme: ICCSTM

better than best-known fine-grain ICCSTM

RE1 Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 120%), series INLINE and +RE1]

RE1 is the single most-effective optimization

Post RE1 Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 9%), series +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti]

Other optimizations also contribute

Conclusion

CKPT Optimization Framework
- compiler-driven
- automatic
- software-only
- compiler analysis and optimizations
- 100-1000X less overhead: over coarse-grain scheme
- 4-50X improvement: over fine-grain scheme

CKPT-supported Apps
- debugger: execution rewind in time
  - up to: 98% of CKPT buffer size reduction
  - up to: 95% of backup call reduction
- VPR: automatic software backtracking
  - only 15% CKPT overhead

Questions and Answers

?

Algorithm: Redundancy Elimination 1

1. Build dominating relationship (DOM) among backup calls

2. Identify leading backup call

3. Promote suitable leading backup call

4. Remove non-leading backup call(s)
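Steps 1, 2 and 4 can be sketched on a tiny control-flow graph. The sketch uses the standard iterative dominator computation over bitsets, which is a stand-in and not necessarily what the framework does inside LLVM. The `cfg` struct and function names are hypothetical, all backups are assumed to target the same unaliased address, and step 3 (promotion) is left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_BLOCKS 32

/* Hypothetical CFG encoding: block 0 is the region entry (start_ckpt). */
typedef struct {
    int      nblocks;
    uint32_t preds[MAX_BLOCKS];  /* bit p set in preds[b]: edge p -> b      */
    uint32_t has_backup;         /* bit b set: block b calls backup(&a, ..) */
} cfg;

/* Step 1: dom[b] = {b} united with the intersection of dom[p] over all
   predecessors p, iterated until nothing changes. */
void compute_dominators(const cfg *g, uint32_t dom[MAX_BLOCKS]) {
    uint32_t all = (g->nblocks >= 32) ? 0xFFFFFFFFu
                                      : ((1u << g->nblocks) - 1u);
    dom[0] = 1u;
    for (int b = 1; b < g->nblocks; b++) dom[b] = all;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int b = 1; b < g->nblocks; b++) {
            uint32_t meet = all;
            for (int p = 0; p < g->nblocks; p++)
                if (g->preds[b] & (1u << p)) meet &= dom[p];
            uint32_t next = meet | (1u << b);
            if (next != dom[b]) { dom[b] = next; changed = 1; }
        }
    }
}

/* Steps 2 and 4: a backup is non-leading when another block that also
   backs up the same address strictly dominates it. Returns those blocks. */
uint32_t non_leading_backups(const cfg *g) {
    uint32_t dom[MAX_BLOCKS];
    compute_dominators(g, dom);
    uint32_t redundant = 0;
    for (int b = 0; b < g->nblocks; b++) {
        if (!(g->has_backup & (1u << b))) continue;
        uint32_t strict = dom[b] & ~(1u << b);
        if (strict & g->has_backup) redundant |= 1u << b;
    }
    return redundant;
}

/* Builds "entry; if (C) { body } join": edges 0->1, 0->2, 1->2. */
cfg make_if_cfg(uint32_t has_backup) {
    cfg g = { 3, {0}, has_backup };
    g.preds[1] = 1u << 0;
    g.preds[2] = (1u << 0) | (1u << 1);
    return g;
}
```

With backups in the entry block and the `if` body, the body's backup is reported as non-leading. With backups in the `if` body and the join block, neither dominates the other and nothing can be removed, which is the situation the promotion step is meant to resolve.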

Algorithm: NRESE

Backup address is NOT aliased to anything
- points-to set is empty

AND

On any path from the beginning of the CKPT region to the respective write, there is no use of the backup address
- the value can be regenerated independently, without needing its own old value

1D array vs. Hash Tables Buffer Schemes

Compare with Coarse-grain Scheme: libCKPT

[Figure: comparison chart with axis ticks at 10X, 100X, 1KX, 10KX, 100KX]

HUGE gain over coarse-grain libCKPT

Redundancy Elimination Optimization 1

    start_ckpt();
    …
    backup(&a, sizeof(a));
    a = …;
    …
    backup(&a, sizeof(a));
    a = …;
    …
    if (C){
      backup(&a, sizeof(a));
      a = …;
      …
    }
    …
    stop_ckpt(c);

Algorithm
- establish dominating relationship among backup calls
- promote leading backup call
- eliminate all non-leading backup call(s)

RE1: keep only dominating backup call

CKPT Support for Automatic Backtracking (VPR)

[Figure: flowchart. initial guess -> obtain a new result (manual CKPT) -> check result; good: commit and continue; bad: abort and try next]

CKPT automates the process, regardless of backtracking complexity

App2: CKPT enabled backtracking

[Figure: flowchart. Initial Guess -> Evaluate (manual CKPT); bad: Reset Data; good: Commit Data; stop condition reached: Finish]

How Can A Compiler Help Checkpointing?

Enable CKPT
- compiler transformations

Optimize CKPT
- do standard optimizations apply?
- support CKPT-specific optimizations?

CKPT Uses
- debugging
- backtracking

Optimization: buffer size reduction

[Figure: bar chart of % of buffer size reduction (axis 0 to 120%), series INLINE, +RE1, +RE2, +RE3, +Hoist, +Aggr, +NRESE, +HeapOpti, +ArrayOpti]

up to 98% of CKPT buffer size reduction