AADEBUG 2000 - MUNCHEN
Non-intrusive on-the-fly data race detection using execution replay
Michiel Ronsse - Koen De Bosschere
Ghent University - Belgium


Contents

• Introduction
• Non-determinism & data races
• RecPlay
  • Method
  • Implementation
• Example
• Experimental Evaluation
• Conclusions

Introduction

• Developing parallel programs for multiprocessors with shared memory is considered difficult:
  • a number of threads run simultaneously
  • the threads co-operate and synchronise through shared memory:
    • too much synchronisation: deadlock
    • too little synchronisation: race condition
• Cyclic debugging is impossible due to the non-deterministic nature of most parallel programs: program execution is not repeatable

Causes of non-determinism

• Sequential programs: input (keyboard, disk, network), signals, interrupts, certain system calls (gettimeofday(), …)
• Parallel programs: race conditions: two threads access the same shared variable (memory location) in an unsynchronised way, and at least one thread modifies the variable

Example code

#include <pthread.h>
#include <stdio.h>

unsigned global = 5;

/* Both threads update global without any synchronisation: a data race. */
void *thread1(void *arg) { global = global + 6; return NULL; }
void *thread2(void *arg) { global = global + 7; return NULL; }

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, thread1, NULL);
    pthread_create(&t2, NULL, thread2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("global=%u\n", global);
    return 0;
}

Possible executions

Figure: three possible interleavings of the loads (L) and stores (S) of global by the two threads, each ending in a different result:

• global=18: L(5) S(11) L(11) S(18) (the second thread loads after the first has stored)
• global=12: L(5) L(5) S(11) S(12) (both threads load 5; the store of 12 comes last)
• global=11: L(5) L(5) S(12) S(11) (both threads load 5; the store of 11 comes last)
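The three outcomes can be reproduced deterministically by splitting each update of global into an explicit load and store and replaying a fixed interleaving. This is only an illustrative sketch: the type step_t and the function run_interleaving are hypothetical names, not part of the slides or of RecPlay.

```c
/* One step of an interleaving: which thread (0 or 1) performs a load ('L')
   or an add-and-store ('S'). Hypothetical helper type for illustration. */
typedef struct { int thread; char op; } step_t;

/* Replay a fixed interleaving of "global = global + inc" for two threads,
   with thread 0 adding 6 and thread 1 adding 7, starting from global = 5.
   Returns the final value of global. */
unsigned run_interleaving(const step_t *steps, int n) {
    const unsigned inc[2] = {6, 7};
    unsigned global = 5;
    unsigned reg[2] = {0, 0};            /* per-thread register */
    for (int i = 0; i < n; i++) {
        int t = steps[i].thread;
        if (steps[i].op == 'L')
            reg[t] = global;             /* load the shared variable */
        else
            global = reg[t] + inc[t];    /* add, then store the result */
    }
    return global;
}
```

With the order L, S of thread 0 followed by L, S of thread 1 the result is 18; when both loads come first, the result is whichever store happens last (12 or 11).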

Race conditions

• Two types:
  • synchronisation races:
    • don't allow us to use cyclic debugging
    • are not a bug: desired non-determinism
  • data races:
    • don't allow us to use cyclic debugging
    • are a bug: undesired non-determinism
• The distinction is a matter of abstraction
• Automatic detection of data races is possible:
  • collect all memory references
  • check parallel references

Detecting data races

• Static methods: checking the source code for all possible executions with all possible input
  • NP-complete: not feasible
• Dynamic methods: detection during an actual execution, so only the data races that occur in this execution are detected
• Removal requires cyclic debugging

Dynamic data race detection

• A segment is the piece of code between two consecutive synchronisation operations
• We collect two sets for every segment i of every thread: L(i) and S(i), with the addresses of all load and store operations
• For all parallel segments i and j,

  $$\bigl((L(i) \cup S(i)) \cap S(j)\bigr) \cup \bigl((L(j) \cup S(j)) \cap S(i)\bigr)$$

  gives the list of conflicting addresses.
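The conflict check for two parallel segments can be sketched with bit sets. This is a minimal illustration that assumes a toy address space of 64 slots, one bit per slot; segment_t and conflicts are hypothetical names, not RecPlay's data structures.

```c
#include <stdint.h>

/* Accesses of one segment over a toy 64-slot address space:
   bit k of loads/stores is set if slot k was loaded/stored. */
typedef struct { uint64_t loads; uint64_t stores; } segment_t;

/* Addresses touched by both segments where at least one access is a store:
   ((L(i) | S(i)) & S(j)) | ((L(j) | S(j)) & S(i)). */
uint64_t conflicts(segment_t i, segment_t j) {
    return ((i.loads | i.stores) & j.stores) |
           ((j.loads | j.stores) & i.stores);
}
```

Two segments that only load the same address yield an empty result, which matches the definition of a race: at least one of the accesses must modify the variable.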

Existing race detection methods

• Huge overhead, causing the probe effect and Heisenbugs
• Only detect the existence of a data race (and the variable), not the instructions involved
• A data race is a bug, so we need cyclic debugging!

RecPlay

• Synchronisation races: execution replay
• Data races:
  • detect
  • also enables cyclic debugging
• Allows you to detect/remove the first data race
• Three phases:
  • record the order of the synchronisation operations
  • replay the synchronisation operations and check for data races
  • normal replay, without checking for data races

Overview

Figure: flow chart of the RecPlay debugging cycle, with the steps Choose input, Record, Replay+detect, Replay+ident., Replay+debug, Choose new input and The end. The legend distinguishes automatic steps from steps that require user intervention.

Instrumentation

• JiTI (Just in Time Instrumentation) was developed especially for RecPlay, but it is a generic instrumentation tool
• Instruments memory and synchronisation operations
• Deals correctly with data in code, code in data and self-modifying code
• Clones processes: the original process is used for the data and the instrumented clone is used for the code
• No need for recompilation, relinking or instrumentation of files

Execution replay

• ROLT (Reconstruction of Lamport Timestamps) is used for tracing/replaying the synchronisation operations
• Attaches a scalar Lamport timestamp to each synchronisation operation
• Delaying a synchronisation operation until all operations with a smaller timestamp have been executed suffices for a correct replay
• We only need to log a small subset of all operations
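The timestamping and the replay rule can be sketched with a generic scalar Lamport clock. This is not ROLT's actual implementation (in particular, it does not show how ROLT reduces the log to a small subset); lamport_stamp and may_proceed are hypothetical names, and the sketch assumes one clock per thread and one per synchronisation object.

```c
/* Stamp a synchronisation operation: the new timestamp is larger than both
   the thread's clock and the clock of the synchronisation object used. */
unsigned lamport_stamp(unsigned *thread_clock, unsigned *object_clock) {
    unsigned ts = (*thread_clock > *object_clock ? *thread_clock
                                                 : *object_clock) + 1;
    *thread_clock = ts;
    *object_clock = ts;
    return ts;
}

/* Replay rule: thread `self` may perform its pending operation only if no
   other thread still has a pending operation with a smaller timestamp.
   pending_ts[t] is the timestamp of thread t's next operation. */
int may_proceed(const unsigned *pending_ts, int nthreads, int self) {
    for (int t = 0; t < nthreads; t++)
        if (t != self && pending_ts[t] < pending_ts[self])
            return 0;
    return 1;
}
```

Operations with equal timestamps are allowed to proceed together in this sketch, since the update rule never gives the same timestamp to two operations on the same object.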

Collecting memory operations

• We need two lists of addresses per segment i: L(i) and S(i)
• A multilevel bitmap is used:
  • low memory consumption
  • comparing two bitmaps is easy
• We lose information: two accesses to the same variable are counted once. This is, however, no problem for data race detection

Memory bitmap

Figure: an address is split into three fields of 9 bits, 9 bits and 14 bits, one field per level of the bitmap.
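A three-level bitmap with a 9/9/14 split can be sketched as follows. The layout is an assumption made for illustration (32-bit addresses, one bit per address, lower levels allocated lazily); the type and function names are hypothetical and the real RecPlay structure may differ.

```c
#include <stdint.h>
#include <stdlib.h>

#define L1_BITS 9
#define L2_BITS 9
#define L3_BITS 14

typedef struct { uint8_t bits[(1u << L3_BITS) / 8]; } leaf_t;  /* 2^14 bits */
typedef struct { leaf_t *leaf[1u << L2_BITS]; } mid_t;
typedef struct { mid_t *mid[1u << L1_BITS]; } bitmap_t;

/* Mark addr as accessed. Lower levels are allocated on first use, so
   untouched regions cost no memory. Returns 0 on success, -1 on failure. */
int bitmap_set(bitmap_t *bm, uint32_t addr) {
    uint32_t i1 = addr >> (L2_BITS + L3_BITS);
    uint32_t i2 = (addr >> L3_BITS) & ((1u << L2_BITS) - 1);
    uint32_t i3 = addr & ((1u << L3_BITS) - 1);
    if (!bm->mid[i1] && !(bm->mid[i1] = calloc(1, sizeof(mid_t))))
        return -1;
    mid_t *m = bm->mid[i1];
    if (!m->leaf[i2] && !(m->leaf[i2] = calloc(1, sizeof(leaf_t))))
        return -1;
    m->leaf[i2]->bits[i3 / 8] |= (uint8_t)(1u << (i3 % 8));
    return 0;
}

/* Return 1 if addr was marked, 0 otherwise. */
int bitmap_test(const bitmap_t *bm, uint32_t addr) {
    const mid_t *m = bm->mid[addr >> (L2_BITS + L3_BITS)];
    if (!m) return 0;
    const leaf_t *lf = m->leaf[(addr >> L3_BITS) & ((1u << L2_BITS) - 1)];
    if (!lf) return 0;
    uint32_t i3 = addr & ((1u << L3_BITS) - 1);
    return (lf->bits[i3 / 8] >> (i3 % 8)) & 1;
}
```

Setting the same address twice leaves the bitmap unchanged, which is the loss of information mentioned on the previous slide: repeated accesses are counted once.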

Detecting parallel segments

• A vector clock is attached to each segment
• All segment information (two bitmaps + vector timestamp) is kept on a list L
• Each new segment is compared against the segments on list L
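Whether two segments are parallel can be decided from their vector timestamps with the standard vector clock comparison: two segments are parallel when neither timestamp is component-wise less than or equal to the other. The sketch below uses hypothetical names (vc_leq, vc_parallel) and plain arrays of n counters.

```c
/* Return 1 if a[k] <= b[k] for every component k, 0 otherwise. */
int vc_leq(const unsigned *a, const unsigned *b, int n) {
    for (int k = 0; k < n; k++)
        if (a[k] > b[k])
            return 0;
    return 1;
}

/* Two segments are parallel if their timestamps are not ordered
   in either direction. */
int vc_parallel(const unsigned *a, const unsigned *b, int n) {
    return !vc_leq(a, b, n) && !vc_leq(b, a, n);
}
```

Only pairs for which vc_parallel returns 1 need their load and store bitmaps compared.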

Detecting obsolete segments

• Obsolete segments should be removed from list L
• We use a snooped matrix clock to detect these segments

Figure: the segments of the threads relative to the current point of execution. The legend distinguishes segments on list L, obsolete segments, segments in execution and the future.
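One way to read the obsolescence test is the following sketch, which rests on an assumption not spelled out on the slides: a stored segment of thread `owner` can be dropped once every other thread is known to have already observed that point of the owner's clock, because no future segment can then be parallel with it. The flat n-by-n array `known` stands in for the matrix clock, and segment_obsolete is a hypothetical name.

```c
/* known[j * n + t]: latest clock value of thread t that thread j is known
   to have observed (row j of a matrix clock, stored as a flat array).
   seg_ts is the owner's own clock component for the stored segment.
   Returns 1 if every other thread has already observed seg_ts. */
int segment_obsolete(int owner, unsigned seg_ts,
                     const unsigned *known, int n) {
    for (int j = 0; j < n; j++)
        if (j != owner && known[j * n + owner] < seg_ts)
            return 0;   /* thread j may still run a parallel segment */
    return 1;
}
```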

Identification phase

• If a data race is detected, we know:
  • the address involved
  • the type of operations involved (load or store)
  • the threads involved
  • the segments containing the racing instructions
• We need another replayed execution to find the racing instructions themselves (+ call stack, …)
• This replay executes at full speed until the racing segments start executing

An Example

Figure (built up step by step over slides 21 to 40): an example execution with operations on the shared variables A, B and C (extracted labels: B2, A1, C4, CA+B, A3) and the semaphore operations P(S1), V(S1), P(S2), V(S2), P(S3) and V(S3).

Experimental Evaluation

• RecPlay has been implemented for Solaris running on SPARC multiprocessors
• Tested on a SUN SparcServer 1000 with 4 processors
• SPLASH-2 was used as a benchmark: a number of multithreaded numeric applications, such as a fast Fourier transform, a raytracer, ...
• Several data races were found, including in SPLASH-2

Basic performance of RecPlay

program       normal        record                     replay+detect
              runtime (s)   runtime (s)   slowdown     runtime (s)   slowdown
cholesky         8.67          8.88        1.024         721.4         83.2
fft              8.76          8.83        1.008          82.8          8.3
LU               6.36          6.40        1.006         144.5         22.7
radix            6.03          6.20        1.028         182.8         30.3
ocean            4.96          5.06        1.020         107.7         21.7
raytrace         9.89         10.19        1.030         675.9         68.3
water-Nsq.       9.46          9.71        1.026         321.5         34.0
water-spat.      8.12          8.33        1.026         258.8         31.9
radiosity       21.13         21.50        1.018         data race found
average                                    1.021                       30.6

Segments with memory accesses

program        created    max. stored       compared
cholesky         13983    1915 (13.7%)        968154
fft                181      37 (20.5%)          2347
LU                1285      42 (3.3%)          18891
radix              303      36 (11.9%)          4601
ocean            14150      47 (0.3%)         272037
raytrace         97598      62 (0.1%)         337743
water-Nsq.         637      48 (7.5%)           7717
water-spat.        639      45 (7.0%)           7962
radiosity       438763    6834 (2.0%)      188323337

Efficiency of the ROLT mechanism

program        number of    trace size    bandwidth
               sync op.     (bytes)       bytes/s    bits/op
cholesky          13857        1132         127.5      0.65
fft                 177          65           7.4      2.94
LU                 1275         134          20.9      0.84
radix               273         108          17.4      3.16
ocean             22987        6458        1276.3      2.25
raytrace         150960       41416        4064.4      2.19
water-Nsq.          631         336          34.6      4.26
water-spat.         625         332          39.9      4.25
radiosity        524667       24578        1143.2      0.37
average                                     748.0      2.30
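The two bandwidth columns appear to follow directly from the trace size, the number of synchronisation operations and the record run time in the performance table. For cholesky: 1132 × 8 / 13857 ≈ 0.65 bits/op and 1132 / 8.88 ≈ 127.5 bytes/s. The sketch below assumes the trace size is in bytes and the run time in seconds; the function names are hypothetical.

```c
/* Average number of trace bits logged per synchronisation operation. */
double bits_per_op(unsigned long trace_bytes, unsigned long sync_ops) {
    return 8.0 * (double)trace_bytes / (double)sync_ops;
}

/* Trace bandwidth in bytes per second of recorded run time. */
double bytes_per_second(unsigned long trace_bytes, double record_runtime_s) {
    return (double)trace_bytes / record_runtime_s;
}
```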

Conclusions

• RecPlay is a practical and efficient tool for detecting and removing data races
• RecPlay also makes cyclic debugging possible
• Three types of clocks (scalar, vector and matrix) are used to enable a fast and memory-efficient implementation
• Data races have been found