
SSGRR 2003 1

A Taxonomy of Execution Replay Systems

Frank Cornelis, Andy Georges, Mark Christiaens, Michiel Ronsse, Tom Ghesquiere, Koen De Bosschere

Dept. ELIS, Ghent University

July 30, 2003 SSGRR 2003 2

The Debugging Problem

The debugging process is hard to automate

Current tools are inadequate for debugging large-scale, interactive, multi-threaded, and event-driven applications

Hard-to-find bugs:

- Synchronization errors
- Memory leaks
- Data races
- Dangling pointers


Inadequate Tools

Most common debugging technique: cyclic debugging

Problem: many applications are non-deterministic, so there is no guarantee that the same behavior is observed during subsequent runs

Ideal situation: reverse execution…


Solution: Execution Replay

[Figure: Execution 1 is recorded into a trace file; Execution 2 is replayed from that trace file]


Requirements

Record must have low intrusion

Replay must be accurate

Record phase must be space efficient

Replay phase must be time efficient


[Figure: replay systems and techniques named on the slide: Tornado, RecPlay, JaRec, jRapture, Interrupt Replay, Scheduling Replay, Compressed differences, Instant Replay, Input Replay, Output Replay, RSA, DejaVu, Igor, Recap]


Outline

- Introduction
- Content-based vs. ordering-based
- Dealing with input
- Dealing with timing
- Dealing with other processors
- Conclusion


Content-based

Record input for every instruction

…
add r1,1 → r1         recorded input: r1 = 10
load 8(r1) → r2       recorded input: r1 = 11
store r2 → 12(r1)     recorded input: r2 = 401, r1 = 11
…

+ Instruction can be executed in isolation
– Huge trace files
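The trade-off above can be sketched in Python with the slide's three instructions. This is only an illustration: the helper names `execute` and `record`, the dictionary-based machine state, and the choice to log the loaded memory word as an input of `load` are assumptions, not part of any system named in the talk.

```python
def execute(op, inputs):
    """Execute one instruction from its recorded inputs alone; return what it writes."""
    if op == "add":      # add r1,1 -> r1
        return {"r1": inputs["r1"] + 1}
    if op == "load":     # load 8(r1) -> r2
        return {"r2": inputs["loaded"]}
    if op == "store":    # store r2 -> 12(r1)
        return {("mem", 12 + inputs["r1"]): inputs["r2"]}
    raise ValueError(f"unknown instruction: {op}")


def record(program, regs, mem):
    """Run the program and log the inputs of every instruction."""
    trace = []
    for op in program:
        if op == "add":
            inputs = {"r1": regs["r1"]}
        elif op == "load":
            # Assumption: the loaded memory word is logged as an input too.
            inputs = {"r1": regs["r1"], "loaded": mem[8 + regs["r1"]]}
        else:
            inputs = {"r1": regs["r1"], "r2": regs["r2"]}
        trace.append((op, inputs))
        # Apply the instruction's effect to registers or memory.
        for target, value in execute(op, inputs).items():
            if isinstance(target, tuple):
                mem[target[1]] = value
            else:
                regs[target] = value
    return trace


regs = {"r1": 10, "r2": 0}
mem = {19: 401}
trace = record(["add", "load", "store"], regs, mem)

# Replay only the third instruction, without running the first two.
op, inputs = trace[2]
isolated = execute(op, inputs)
```

Every instruction adds an entry to the trace, which is why the trace grows so quickly, but any entry can be re-executed without the ones before it.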


Ordering-based

Record control flow of program from a given initial state

C1; C2

+ Smaller trace files
– Re-execution required
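A minimal Python sketch of the ordering-based idea, under stated assumptions: two simulated threads `A` and `B` update one shared value, the "scheduler" is a random choice, and the trace stores only which thread ran at each step. The names `OPS` and `run` are hypothetical.

```python
import random

# Each simulated thread applies its own operation to the shared value.
OPS = {"A": lambda x: x * 2, "B": lambda x: x + 3}


def run(schedule=None, steps_per_thread=3):
    """Execute from a fixed initial state; record the order, or follow a given one."""
    x = 1                                          # the given initial state
    remaining = {name: steps_per_thread for name in OPS}
    order = []
    while any(remaining.values()):
        if schedule is None:                       # record: the scheduler decides
            name = random.choice([n for n in sorted(remaining) if remaining[n]])
        else:                                      # replay: the trace decides
            name = schedule[len(order)]
        x = OPS[name](x)
        remaining[name] -= 1
        order.append(name)
    return x, order


result_1, trace = run()             # record phase: the trace holds thread names only
result_2, _ = run(schedule=trace)   # replay phase: re-execute in the recorded order
```

The trace contains six thread names and no data values, yet `result_2` equals `result_1` because the whole computation is re-executed in the recorded order.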


Sources of non-determinism

Input (e.g. a database, time, pixel coordinates)

Timing (e.g. interrupts, scheduler actions)

Interaction with other processors (processor, DMA, coprocessor)


Input instructions

[Figure: application and kernel layers; input arrives through IO-instructions and system calls. Labels in the figure: content-based, ordering-based, content-based. Systems named: Tornado, jRapture]
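A minimal sketch, in Python, of content-based replay of input at a call boundary. `InputRecorder` and `program` are hypothetical names chosen for illustration; the sketch does not describe how Tornado or jRapture are implemented. The clock stands in for any input source.

```python
import time


class InputRecorder:
    """Wraps an input call: logs its results when recording, feeds them back on replay."""

    def __init__(self, source, trace=None):
        self.source = source
        self.replaying = trace is not None
        self.trace = list(trace) if self.replaying else []
        self.cursor = 0

    def __call__(self):
        if self.replaying:
            value = self.trace[self.cursor]   # return the recorded content
            self.cursor += 1
            return value
        value = self.source()                 # perform the real input call
        self.trace.append(value)              # and log what it returned
        return value


def program(now):
    """Application code whose result depends on input (here: the clock)."""
    start = now()
    end = now()
    return (start, end, end - start)


recorder = InputRecorder(time.time)
first = program(recorder)                                  # record phase
replayer = InputRecorder(time.time, trace=recorder.trace)
second = program(replayer)                                 # replay phase
```

During replay the real clock is never consulted, so `second` is identical to `first`.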


Dealing with timing

- Interrupts
- Input/output (timing aspect; not input aspect)
- Scheduling

[Figure: application and other code; label: ordering-based]


How to determine the ordering

PC is not enough

Need an extra counter: SIC (Software Instruction Counter)

- Number of instructions executed

- Number of backward jumps

Systems: Interrupt replay, Repeatable scheduling, DejaVu
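Why the PC alone is not enough can be sketched with a toy interpreter in Python. Everything here is an assumption made for illustration: the three-instruction loop, the `run` function, and the interrupt handler that adds `100 * i` to the accumulator. The counter `sic` counts backward jumps taken, one of the two options listed above.

```python
def run(interrupt_at_step=None, replay_point=None, iterations=4):
    """Toy loop: pc 0 adds 1, pc 1 decrements i, pc 2 jumps back while i > 0."""
    pc, sic, step = 0, 0, 0
    i, acc = iterations, 0
    delivered_at = None
    while pc != 3:
        # Record: the interrupt arrives at some step chosen from outside.
        # Replay: it is delivered when (pc, sic) matches the recorded point.
        recording_hit = interrupt_at_step is not None and step == interrupt_at_step
        replay_hit = replay_point is not None and (pc, sic) == replay_point
        if recording_hit or replay_hit:
            delivered_at = (pc, sic)
            acc += 100 * i          # handler effect depends on when it runs
        if pc == 0:
            acc += 1
            pc = 1
        elif pc == 1:
            i -= 1
            pc = 2
        elif i > 0:                 # pc == 2: backward jump taken
            sic += 1
            pc = 0
        else:                       # pc == 2: loop finished
            pc = 3
        step += 1
    return acc, delivered_at


acc_recorded, point = run(interrupt_at_step=7)   # record phase logs (pc, sic)
acc_replayed, _ = run(replay_point=point)        # replay phase uses only the log
```

In this loop the PC value 1 is reached once per iteration, so it matches four different moments; the pair `(pc, sic)` matches exactly one.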


Dealing with other processors

Multi-threading (multiple threads in one address space)

Multi-processing (multiple processes sharing a common block of memory)

A coprocessor (video, DMA, …)

[Figure: Code 1 and Code 2 operating on shared data; labels: c1 c2, c1 c2]


Many systems

RecPlay: ordering-based up to the first data race

JaRec: ordering-based up to the first data race

IGOR: content-based: checkpointing

Recap: content-based: reverse execution

Instant Replay: ordering-based: version numbers

Netzer's approach: ordering-based: also replays data races
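A loose Python sketch of ordering by version numbers, the idea the slide attributes to Instant Replay. The details are simplifying assumptions and not a description of the actual system: there is a single shared object, every access is a write that increments the version, and a polling loop stands in for threads that block until their turn. The names `PROGRAMS`, `record`, and `replay` are hypothetical.

```python
import random

# Two simulated threads, each with three updates to one shared object.
PROGRAMS = {
    "A": [lambda v: v * 2] * 3,
    "B": [lambda v: v + 3] * 3,
}


def record(programs, rng):
    """Run in an arbitrary order; each thread logs the version it saw at each access."""
    value, version = 1, 0
    logs = {name: [] for name in programs}
    pending = {name: list(ops) for name, ops in programs.items()}
    while any(pending.values()):
        name = rng.choice([n for n in sorted(pending) if pending[n]])
        logs[name].append(version)          # per-thread log, no global schedule
        value = pending[name].pop(0)(value)
        version += 1
    return value, logs


def replay(programs, logs):
    """A thread performs its next access only when the object has the logged version."""
    value, version = 1, 0
    pending = {name: list(ops) for name, ops in programs.items()}
    cursor = {name: 0 for name in programs}
    while any(pending.values()):
        for name in sorted(pending):
            if pending[name] and logs[name][cursor[name]] == version:
                value = pending[name].pop(0)(value)
                cursor[name] += 1
                version += 1
    return value


recorded_value, logs = record(PROGRAMS, random.Random())
replayed_value = replay(PROGRAMS, logs)
```

Each thread keeps its own short log of version numbers, and together the logs determine a single order of accesses, so `replayed_value` equals `recorded_value`.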


Overview

[Table: columns are the systems DejaVu 1, DejaVu 2, IGOR, Instant Replay, Interrupt Replay, JaRec, jRapture, Recap, RecPlay, RSA, Tornado; rows are Input, System Calls, Interrupts, SM (content-based), SM (ordering-based)]


Conclusion

No execution replay system deals with all forms of non-determinism

The more accurate a system gets, the more resources (time, space) it needs, and hence the less useful it becomes

There is a need for stable and platform-independent tools to further support debugging
