Execution Replay for Multiprocessor Virtual Machines
George W. Dunlap, Dominic Lucchetti, Michael A. Fetterman, Peter M. Chen
Big ideas: detection and replay of memory races is possible on commodity hardware. Overhead is high for some workloads, but surprisingly low for others.
![Page 1: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/1.jpg)
Execution Replay for Multiprocessor Virtual Machines
George W. Dunlap, Dominic Lucchetti, Michael A. Fetterman, Peter M. Chen
![Page 2: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/2.jpg)
Big ideas
• Detection and replay of memory races is possible on commodity hardware
• Overhead high for some workloads
• …but surprisingly low for other workloads
![Page 3: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/3.jpg)
Execution Replay
CPU
Memory
Disk
Network
Keyboard, mouse
Interrupts
![Page 4: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/4.jpg)
Uses of Execution Replay
• Reconstructing state
  – Fault tolerance
• Reconstructing execution
  – Debugging
  – Realistic trace generation
• Both
  – Intrusion analysis
![Page 5: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/5.jpg)
Single-processor Replay
• Basic principles well understood
  – Log all non-deterministic inputs
  – Timing of asynchronous events
• Minimal overhead (Dunlap02)
  – 13% worst case
  – Log for months or years
• Available commercially
  – VMware: Record/Replay
![Page 6: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/6.jpg)
Replay for Multiprocessors
• Memory races in multiprocessor VMs
• The Ordering Requirement
• The CREW Protocol
  – Implementing with page protections
  – Relation to the Ordering Requirement
  – Generating constraints from CREW events
• DMA-capable devices and CREW
• Performance
![Page 7: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/7.jpg)
The Multiprocessor Challenge
• Interleaved reads and writes
  – Fine-grained non-determinism
  – Much more difficult
• Existing solutions
  – Hardware modification
  – Software instrumentation
• SMP-ReVirt
  – Hardware MMU to detect sharing
![Page 8: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/8.jpg)
Multiprocessor Replay
[Figure: processors P1 and P2 sharing memory; labels n=3, n=5, and if (n<4)]
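
The labels on this slide (n=3, n=5, if (n<4)) suggest a race on a shared variable n. A minimal sketch of why the interleaving matters, assuming two writes to n followed by the test; which processor performs which operation is not recoverable from the slide, so the function name and the ordering below are purely illustrative:

```python
def branch_taken(write_order):
    """Apply the writes to shared variable n in the given order,
    then evaluate the test `n < 4` seen on the slide."""
    n = 0  # hypothetical initial value
    for value in write_order:
        n = value  # the last write wins
    return n < 4

# The same two writes, interleaved differently, steer the branch differently.
outcome_a = branch_taken([3, 5])  # n ends as 5, so the test fails
outcome_b = branch_taken([5, 3])  # n ends as 3, so the test succeeds
```

Replay has to reproduce the same interleaving that occurred while logging, otherwise the replayed guest may take the other branch.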
![Page 9: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/9.jpg)
Ordering Memory Accesses
• Preserving order will reproduce execution
  – a→b: “a happens-before b”
  – Ordering is transitive: a→b, b→c means a→c
• Two instructions must be ordered if:
  – they both access the same memory, and
  – one of them is a write
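
The ordering rule above can be written as a small predicate. This is a sketch only; the `Access` record and its field names are assumptions made for illustration and do not come from the talk:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Access:
    cpu: int        # processor issuing the access (hypothetical field)
    addr: int       # memory location touched
    is_write: bool  # True for a write, False for a read

def must_be_ordered(x: Access, y: Access) -> bool:
    """Two accesses need an ordering if they touch the same memory
    and at least one of them is a write."""
    return x.addr == y.addr and (x.is_write or y.is_write)
```

Two reads of the same location never need an ordering, which is why a concurrent-read state can exist without logging anything.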
![Page 10: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/10.jpg)
Constraints: Enforcing order
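• To guarantee a→d, any one of these constraints suffices:
  – a→d
  – b→d
  – a→c
  – b→c
• Suppose we need b→c
  – b→c is necessary
  – a→d is redundant
[Figure: timelines for P1 and P2 with instructions a, b, c, d; label: overconstrained]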
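
Redundancy of a constraint can be checked with a transitive closure. The sketch below assumes that a and b run in that order on one processor and c and d in that order on the other (an assumption read off the slide's argument, not stated explicitly); all names are illustrative:

```python
from itertools import product

def transitive_closure(edges):
    """Return every ordering implied by `edges` under transitivity."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

# Assumed program order on each processor.
PROGRAM_ORDER = {("a", "b"), ("c", "d")}

def implied(constraint, logged):
    """True if `constraint` already follows from program order plus `logged`."""
    return constraint in transitive_closure(PROGRAM_ORDER | set(logged))
```

With b→c logged, a→d follows for free (a→b, b→c, c→d), so logging a→d as well would be redundant. The reverse does not hold: logging only a→d does not imply b→c.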
![Page 11: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/11.jpg)
CREW Protocol
• Each shared object is in one of two states:
  – Concurrent-Read: all processors can read, none can write
  – Exclusive-Write: one processor (the owner) can read and write; others have no access
![Page 12: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/12.jpg)
CREW protocol, cont’d
• Enforced with hardware MMU
  – Read/write
  – Read-only
  – None
• Change CREW states on demand
  – Fault, fixup, re-execute
• CREW event
  – Increasing or reducing permission due to CREW state changes
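
The fault, fixup, re-execute cycle can be simulated for a single page. This is a toy model under stated assumptions, not the SMP-ReVirt implementation: the class and method names are hypothetical, and a real hypervisor would change page-table entries rather than a Python list.

```python
from enum import IntEnum

class Perm(IntEnum):
    NONE = 0
    READ = 1
    READ_WRITE = 2

class CrewPage:
    """One shared page tracked under CREW (illustrative model only)."""

    def __init__(self, num_cpus):
        # Start in concurrent-read: every processor may read, none may write.
        self.perms = [Perm.READ] * num_cpus
        self.events = []  # CREW events as (cpu, old_perm, new_perm)

    def _set(self, cpu, new):
        old = self.perms[cpu]
        if old != new:
            self.perms[cpu] = new
            self.events.append((cpu, old, new))

    def access(self, cpu, is_write):
        """Return True if the access faulted and the state was fixed up."""
        needed = Perm.READ_WRITE if is_write else Perm.READ
        if self.perms[cpu] >= needed:
            return False  # permission already sufficient, no fault
        cpus = range(len(self.perms))
        if is_write:
            # Move to exclusive-write: reduce the others, then raise the owner.
            for other in cpus:
                if other != cpu:
                    self._set(other, Perm.NONE)
            self._set(cpu, Perm.READ_WRITE)
        else:
            # Move to concurrent-read: reduce the owner, then raise the rest.
            for c in cpus:
                if self.perms[c] == Perm.READ_WRITE:
                    self._set(c, Perm.READ)
            for c in cpus:
                if self.perms[c] == Perm.NONE:
                    self._set(c, Perm.READ)
        return True  # the faulting instruction would now be re-executed
```

Each entry appended to `events` corresponds to one CREW event in the sense of the slide: a permission increase or reduction caused by a state change.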
![Page 13: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/13.jpg)
CREW Property
• If two instructions on different processors:
  – access the same page,
  – and one of them is a write,
  – there will be a CREW event on each processor between them.
![Page 14: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/14.jpg)
Generating Constraints
• State: Concurrent Read
  – All processors read-only
• d*: CREW fault
• New state: P2 Exclusive
• r: privilege reduction
  – Read to None
• i: privilege increase
  – Read to Read/write
• Log timing of r and i
• Constraint:
  – r → i
[Figure: P1 and P2 timelines with events a, d, r, i, and d*]
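
A logged r → i constraint could be enforced during replay roughly as follows. The slide only says that the timing of r and i is logged; representing that timing as a per-processor progress counter is an assumption here, and `Point`, `Constraint`, and `may_advance` are hypothetical names:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Point:
    cpu: int    # processor on which the event occurred
    count: int  # assumed per-processor progress counter at the event

@dataclass(frozen=True)
class Constraint:
    before: Point  # the privilege reduction r
    after: Point   # the privilege increase i

def may_advance(cpu, next_count, progress, constraints):
    """During replay, `cpu` may run up to `next_count` only if every
    constraint whose `after` point it would reach or pass has its
    `before` point already reached on the other processor."""
    for c in constraints:
        if c.after.cpu == cpu and c.after.count <= next_count:
            if progress[c.before.cpu] < c.before.count:
                return False
    return True
```

For example, with r at count 100 on processor 0 and i at count 40 on processor 1, processor 1 can replay freely up to count 39 but has to wait at 40 until processor 0 has reached 100.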
![Page 15: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/15.jpg)
Direct Memory Access
• Device accesses memory directly
• Logically another processor
  – Reads and writes need to be ordered
  – IOMMU: can’t fault/fixup/re-execute
• Observation: Transaction model
• Device: non-preemptible actor
![Page 16: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/16.jpg)
Prototype: SMP-ReVirt
• Modified Xen hypervisor
• Implement logging, CREW protocol
• Details in paper
![Page 17: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/17.jpg)
Evaluation questions
• What is the overhead?
• What affects performance?
  – In paper
• When might I want to use MP?
  – Log with 1, 2, or N cpus?
![Page 18: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/18.jpg)
Evaluation Workloads
• SPLASH2 parallel application suite
  – FMM, LU, ocean, radix, water-spatial, radiosity
• Kernel-build
• Dbench
![Page 19: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/19.jpg)
Predicting results
• Key changes in sharing attributes
  – 4096-byte sharing granularity
  – “Miss” is very expensive
• SPLASH2
  – Good: high spatial locality / low false sharing
  – Bad: random access patterns / high false sharing
• The Linux kernel
  – Tuned to 16-byte cacheline
  – Involving the kernel may be expensive
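
The effect of the coarser sharing granularity can be illustrated with address arithmetic. The two sizes come from the slide; the helper name and the example addresses are made up for illustration:

```python
PAGE_SIZE = 4096       # CREW sharing granularity from the slide
CACHELINE_SIZE = 16    # cacheline size the slide says the kernel is tuned to

def same_unit(addr_a, addr_b, unit_size):
    """True if both addresses fall in the same aligned unit of `unit_size` bytes."""
    return addr_a // unit_size == addr_b // unit_size

# Two variables 2 KB apart: different cachelines, but the same page.
x_addr, y_addr = 0x1000, 0x1800
shares_cacheline = same_unit(x_addr, y_addr, CACHELINE_SIZE)
shares_page = same_unit(x_addr, y_addr, PAGE_SIZE)
```

Data laid out to avoid false sharing at cacheline granularity can still collide at page granularity, so processors writing to x and y would trade ownership of the page even though they never touch the same cacheline.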
![Page 20: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/20.jpg)
Single-processor Xen guests
[Chart: normalized runtime for FMM, LU, ocean, radix, water-spatial, kernel-build, radiosity, and dbench. Series: unmodified 1-cpu guest, logging 1-cpu guest. Bar labels as extracted: 1.00, 1.04, 1.01, 1.00, 1.03, 1.13, 1.00, 1.05]
![Page 21: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/21.jpg)
Log Growth Rate

| Workload | Log growth (GB/day) | Days to fill 300 GB |
|---|---|---|
| FMM | 0.234 | 1280 |
| LU | 0.237 | 1261 |
| Ocean | 0.232 | 1295 |
| Radix | 0.292 | 1025 |
| Water-spatial | 0.232 | 1296 |
| Kernel-build | 0.564 | 531 |
| Radiosity | 0.231 | 1295 |
| Dbench | 0.557 | 538 |
![Page 22: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/22.jpg)
2-processor Xen guests
[Chart: normalized runtime for FMM, LU, ocean, radix, water-spatial, and kernel-build. Series: unmodified 2-cpu guest, logging 2-cpu guest, logging 1-cpu guest. Bar labels as extracted: 1.51, 1.00, 1.08, 1.60, 1.48, 2.10, 1.90, 1.76, 1.96, 1.74, 1.83, 1.99]
![Page 23: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/23.jpg)
2-processor, cont’d
[Chart: normalized runtime for radiosity and dbench. Series: unmodified 2-cpu guest, logging 2-cpu guest, logging 1-cpu guest. Bar labels as extracted: 8.70, 7.21, 1.85, 1.88]
![Page 24: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/24.jpg)
Log Growth Rate

| Workload | Log growth (GB/day) | Days to fill 300 GB |
|---|---|---|
| FMM | 34.5 | 8.7 |
| LU | 3.2 | 92.7 |
| Ocean | 4.3 | 69.1 |
| Radix | 39.8 | 7.5 |
| Water-spatial | 36.3 | 8.25 |
| Kernel-build | 43.3 | 6.9 |
| Radiosity | 88.4 | 3.4 |
| Dbench | 77.0 | 3.9 |
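
The last column is the 300 GB disk size divided by the daily growth rate. A short sketch of that arithmetic, with a hypothetical helper name; a few rows in the tables differ slightly from this calculation, presumably because the printed growth rates are rounded:

```python
DISK_GB = 300  # disk size used in the table heading

def days_to_fill(growth_gb_per_day, disk_gb=DISK_GB):
    """Days until a log growing at the given rate fills the disk."""
    return disk_gb / growth_gb_per_day

fmm_days = days_to_fill(34.5)        # FMM row of the table above
radiosity_days = days_to_fill(88.4)  # Radiosity row of the table above
```

At these rates a 300 GB disk holds days of log rather than the years available in the single-processor case.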
![Page 25: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/25.jpg)
4-processor Xen guests
[Chart: normalized runtime for FMM, LU, ocean, radix, water-spatial, and kernel-build. Series: unmodified domain (4 cpus), CREW logging (4 cpus), CREW logging (2 cpus*), CREW logging (1 cpu). Bar labels as extracted: 7.36, 1.12, 1.28, 4.20, 1.72, 9.03]
![Page 26: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/26.jpg)
Recap
• Memory races in multiprocessor VMs
• The Ordering Requirement
• The CREW Protocol
  – Implementing with page protections
  – Relation to the Ordering Requirement
  – Generating constraints from CREW events
• DMA-capable devices and CREW
• Performance
![Page 27: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/27.jpg)
Big ideas
• Detection and replay of memory races is possible on commodity hardware
• Overhead high for some workloads
• …but surprisingly low for other workloads
![Page 28: Execution Replay for Multiprocessor Virtual Machines](https://reader036.vdocuments.us/reader036/viewer/2022062500/56814edf550346895dbc72cb/html5/thumbnails/28.jpg)
Questions