SSGRR 2003
A Taxonomy of Execution Replay Systems
Frank Cornelis, Andy Georges, Mark Christiaens, Michiel Ronsse, Tom Ghesquiere, Koen De Bosschere
Dept. ELIS, Ghent University
July 30, 2003 SSGRR 2003 2
The Debugging Problem
- The debugging process is hard to automate
- Current tools are inadequate for debugging large-scale, interactive, multi-threaded, and event-driven applications
- Hard to find bugs: synchronization errors, memory leaks, data races, dangling pointers
Inadequate Tools
Most common debugging technique: cyclic debugging
Problem: many applications are non-deterministic, so there is no guarantee that the same behavior is observed during subsequent runs
Ideal situation: reverse execution…
Solution: Execution Replay
[Diagram: Execution 1 is recorded into a trace file; Execution 2 is replayed from that trace file]
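The record/replay cycle can be sketched in a few lines. This is a minimal illustration, not the mechanism of any system named in this talk: the `Recorder`, `Replayer`, and `application` names are hypothetical, and a random number stands in for any non-deterministic event.

```python
import json
import os
import random
import tempfile


class Recorder:
    """Record mode: produce real events and log each one."""

    def __init__(self):
        self.trace = []

    def next_value(self):
        value = random.randint(1, 6)  # stand-in for a non-deterministic event
        self.trace.append(value)
        return value


class Replayer:
    """Replay mode: return the logged events instead of new ones."""

    def __init__(self, trace):
        self._values = iter(trace)

    def next_value(self):
        return next(self._values)


def application(source):
    # Hypothetical program under test: its result depends on three events.
    return [source.next_value() for _ in range(3)]


# Execution 1: record into a trace file.
recorder = Recorder()
first_run = application(recorder)
trace_path = os.path.join(tempfile.mkdtemp(), "trace.json")
with open(trace_path, "w") as f:
    json.dump(recorder.trace, f)

# Execution 2: replay from the trace file.
with open(trace_path) as f:
    second_run = application(Replayer(json.load(f)))
```

Because the second execution consumes the logged events, it behaves the same as the first, which is what makes cyclic debugging usable again.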
Requirements
- Record must have low intrusion
- Replay must be accurate
- Record phase must be space efficient
- Replay phase must be time efficient
[Figure: existing systems and techniques: Tornado, RecPlay, JaRec, jRapture, Interrupt Replay, Scheduling Replay, Compressed differences, Instant Replay, Input Replay, Output Replay, RSA, DejaVu, Igor, Recap]
Outline
- Introduction
- Content-based vs. ordering-based
- Dealing with input
- Dealing with timing
- Dealing with other processors
- Conclusion
Content-based
Record input for every instruction
…
add r1,1 → r1          r1 = 10
load 8(r1) → r2        r1 = 11
store r2 → 12(r1)      r2 = 401, r1 = 11
…
+ Instruction can be executed in isolation
– Huge trace files
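A content-based trace can be sketched as a list that pairs each instruction with the input values it read. The trace layout and the `execute` helper below are assumptions for illustration; the register values come from the slide, while the recorded memory word for the load (401) is assumed, since the slide only shows it as the later value of r2.

```python
# One trace entry per instruction: (instruction, recorded inputs).
trace = [
    ("add r1,1 -> r1", {"r1": 10}),
    ("load 8(r1) -> r2", {"r1": 11, "mem": 401}),  # "mem" value is assumed
    ("store r2 -> 12(r1)", {"r2": 401, "r1": 11}),
]


def execute(op, inputs):
    """Execute one instruction using only its recorded inputs."""
    if op == "add r1,1 -> r1":
        return {"r1": inputs["r1"] + 1}
    if op == "load 8(r1) -> r2":
        return {"r2": inputs["mem"]}  # the loaded word itself was recorded
    if op == "store r2 -> 12(r1)":
        address = 12 + inputs["r1"]
        return {f"mem[{address}]": inputs["r2"]}
    raise ValueError(f"unknown instruction: {op}")


# Any instruction can be replayed in isolation, without running the others.
store_effect = execute(*trace[2])
add_effect = execute(*trace[0])
```

The cost is visible in the trace itself: every instruction carries its own inputs, so the trace grows with every executed instruction.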
Ordering-based
Record control flow of program from a given initial state
C1; C2
+ Smaller trace files
– Reexecution required
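An ordering-based trace can be sketched as a log of control-flow decisions only. In this hypothetical example the program picks C1 or C2 non-deterministically at each step; the `run` function, the two code blocks, and the coin flip are all assumptions made for illustration.

```python
import random


def run(initial_state, choose_branch):
    """Hypothetical program: at each step either C1 or C2 executes."""
    state = initial_state
    for _ in range(5):
        if choose_branch():
            state = state * 2  # C1
        else:
            state = state + 3  # C2
    return state


# Record: log only which branch was taken, not the data it produced.
trace = []


def recording_choice():
    taken = random.random() < 0.5  # stand-in for a non-deterministic decision
    trace.append(taken)
    return taken


recorded_result = run(1, recording_choice)

# Replay: re-execute from the same initial state, steered by the log.
log = iter(trace)
replayed_result = run(1, lambda: next(log))
```

The trace holds one boolean per decision, but no intermediate state can be obtained without re-executing the program from the initial state up to that point.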
Sources of non-determinism
Input (e.g. a database, time, pixel coordinates)
Timing (e.g. interrupts, scheduler actions)
Interaction with other processors (processor, DMA, coprocessor)
Input instructions
[Diagram: application, kernel, IO-instructions, system calls; labels: content-based, ordering-based, content-based; systems shown: Tornado, jRapture]
Dealing with timing
- Interrupts
- Input/output (timing aspect; not input aspect)
- Scheduling
[Diagram: application, other code; label: ordering-based]
How to determine the ordering
PC is not enough
Need extra counter: SIC (Software Instruction Counter)
- Instructions executed
- No. of backward jumps
Interrupt replay, Repeatable scheduling, DejaVu
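Why the PC alone is not enough can be sketched with a small loop. The toy interpreter below is an assumption made for illustration, not the implementation of any system listed here; it uses the backward-jump variant of the counter and logs the pair (pc, sic) at which an interrupt arrived.

```python
import random


def run(interrupt_at_step=None, interrupt_at_point=None):
    """Sum 0..3 in a loop; deliver one 'interrupt' and return what it saw."""
    pc, sic, step = 0, 0, 0  # sic: software instruction counter (backward jumps)
    i, acc = 0, 0
    seen, point = None, None
    while True:
        # Record mode fires at an arbitrary step; replay mode at a logged (pc, sic).
        if step == interrupt_at_step or (pc, sic) == interrupt_at_point:
            point = (pc, sic)  # what the record phase would write to the trace
            seen = acc         # the hypothetical handler reads program state
        if pc == 0:            # loop body; pc 0 is reached four times
            acc += i
            i += 1
            pc = 1
        elif i < 4:            # pc 1: backward jump to the loop body
            pc = 0
            sic += 1
        else:
            break
        step += 1
    return seen, point


# Record: the interrupt arrives at a random moment (steps 0..7 exist).
recorded_seen, recorded_point = run(interrupt_at_step=random.randrange(8))

# Replay: deliver it at the same (pc, sic), so the handler sees the same state.
replayed_seen, replayed_point = run(interrupt_at_point=recorded_point)
```

Each PC value occurs four times in this run, so a logged PC would match four different moments; the pair (pc, sic) occurs only once.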
Dealing with other processors
Multi-threading (multiple threads in one address space)
Multi-processing (multiple processes sharing a common block of memory)
A coprocessor (video, DMA, …)
[Diagram: Code 1 and Code 2 access the same data; interleavings of c1 and c2]
Many systems
- RecPlay: ordering-based, up to the first data race
- JaRec: ordering-based, up to the first data race
- IGOR: content-based, checkpointing
- Recap: content-based, reverse execution
- Instant Replay: ordering-based, version numbers
- Netzer's approach: ordering-based, also replays data races
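Ordering-based replay of shared-memory accesses can be sketched with threads. This is a deliberately simplified illustration that logs one total order of accesses; it is not the version-number protocol of Instant Replay or the implementation of any other system above, and the `run` function and its parameters are hypothetical.

```python
import threading


def run(num_threads, ops_per_thread, order=None):
    """Run threads that append to shared data; return (data, access order)."""
    data, observed = [], []
    cond = threading.Condition()

    def worker(tid):
        for k in range(ops_per_thread):
            with cond:
                if order is not None:
                    # Replay: wait until the log says it is this thread's turn.
                    cond.wait_for(lambda: order[len(observed)] == tid)
                data.append((tid, k))  # the shared-memory access
                observed.append(tid)   # record: log which thread went next
                cond.notify_all()

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return data, observed


# Record: let the scheduler decide, and keep only the order of accesses.
recorded_data, order = run(3, 4)

# Replay: force the same order; the shared data ends up identical.
replayed_data, replayed_order = run(3, 4, order)
```

Only thread identifiers are logged, not the data written, so the trace stays small; the price is that replay has to stall threads until their turn comes up.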
Overview
[Table. Columns: DejaVu 1, DejaVu 2, IGOR, Instant Replay, Interrupt Replay, JaRec, jRapture, Recap, RecPlay, RSA, Tornado. Rows: Input, System Calls, Interrupts, SM (content-based), SM (ordering-based)]
Conclusion
No execution replay system deals with all forms of non-determinism
The more accurate a system gets, the more resources (time, space) it needs, and hence the less useful it becomes
There is a need for stable and platform-independent tools to further support debugging