JetStream: Cluster-scale Parallelization of Information Flow Queries
Andrew Quinn, David Devecsery, Peter Chen and Jason Flinn
Dynamic Information Flow Tracking
• DIFT instruments execution to track causality
• Also known as “Taint-Tracking”
Sources (Inputs)
Sinks (Outputs)
DIFT for Debugging
o1
Outputs
Inputs
Server
DIFT – limitations
• Arnold ’14: overheads ~100x
• X-Ray ’12: long queries
• TaintDroid ’10: no native code
• Backtracker ’03: coarse-grained causality
Parallelize DIFT
Parallelizing DIFT is HARD
A = read()
B = read()
C = A + B
D = X + Y
E = C
B = 0
Z = A[D]
F = E
write(F)
Sequential Dependencies
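To make the sequential dependencies concrete, here is a minimal Python sketch of conventional single-threaded taint tracking over the slide's example program. It is an illustration only, not JetStream's code: the tuple encoding, the `run_dift` name, and the `read#N` labels are hypothetical, and it assumes that `X` and `Y` start untainted and that an array read `A[D]` depends on both the array and the index.

```python
# Hypothetical encoding of the slide's program: (destination, operands).
# "read" marks a source and "write" marks a sink; these names are illustrative.
PROGRAM = [
    ("A", ["read"]),
    ("B", ["read"]),
    ("C", ["A", "B"]),
    ("D", ["X", "Y"]),   # X and Y are assumed untainted
    ("E", ["C"]),
    ("B", []),           # B = 0 overwrites B with a constant, clearing its taint
    ("Z", ["A", "D"]),   # Z = A[D]: assumed to depend on array and index
    ("F", ["E"]),
    ("write", ["F"]),
]

def run_dift(program):
    """Propagate source labels one instruction at a time, in program order."""
    taint = {}     # location -> set of source labels it derives from
    sinks = []     # one set of source labels per write()
    n_reads = 0
    for dest, operands in program:
        if operands == ["read"]:
            n_reads += 1
            taint[dest] = {f"read#{n_reads}"}
            continue
        # Union of the operands' current taint; unknown locations are untainted.
        incoming = set().union(*(taint.get(op, set()) for op in operands))
        if dest == "write":
            sinks.append(incoming)
        else:
            taint[dest] = incoming
    return taint, sinks

taint, sinks = run_dift(PROGRAM)
print(sinks)
```

Each iteration reads taint state written by earlier iterations (for example, `F` needs `E`, which needs `C`), which is why the loop cannot simply be cut into independent pieces.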
Speck (ASPLOS ‘08)
Parallel Lifeguards (ASPLOS ‘08)
“Embarrassingly Sequential” - Ruwase et al.
JetStream
Aggregation – pipeline parallelism
Local DIFT – epoch parallelism 2x → 21x
Faster than original execution!
Outline
• Design of JetStream
  • Local DIFT
  • Aggregation
• Evaluation
Debugging Query
Outputs
Inputs
Local DIFT
Outputs
Inputs
Time slice execution into Epochs
Leverage Record and Replay to calculate DIFT in parallel
Track mapping between all intermediate locations
Mapping is too expensive to calculate:
• log operations
• defer calculating relationships until aggregation
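The Local DIFT steps above (time-slice into epochs, analyze epochs in parallel, log operations instead of resolving them) can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not JetStream's implementation: the trace is assumed to be already in memory as hypothetical `(destination, operands)` tuples, the function names are invented, and a thread pool stands in for the separate replay processes a real system would use.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_epochs(trace, n_epochs):
    """Time-slice a trace into contiguous epochs of near-equal length."""
    size = max(1, -(-len(trace) // n_epochs))  # ceiling division
    return [trace[i:i + size] for i in range(0, len(trace), size)]

def log_epoch(epoch):
    """Log one (operand, destination) edge per dependency; resolve nothing.

    The worker does not know which locations were tainted when the epoch
    began, so it only records what happened and leaves the rest for later.
    """
    return [(op, dest) for dest, operands in epoch for op in operands]

def local_dift(trace, n_epochs):
    """Log every epoch independently; results come back in epoch order."""
    epochs = split_into_epochs(trace, n_epochs)
    with ThreadPoolExecutor(max_workers=max(1, len(epochs))) as pool:
        return list(pool.map(log_epoch, epochs))

# Hypothetical four-instruction trace, split into two epochs.
trace = [("C", ["A", "B"]), ("E", ["C"]), ("F", ["E"]), ("G", ["F"])]
logs = local_dift(trace, 2)
print(logs)
```

The point of the design is that `log_epoch` needs no input from any other epoch, so all epochs can run at once; connecting the per-epoch logs is left to the aggregation stage.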
DIFT
D = B + C
C = A[D]
• Fast Forward: replay execution until start of epoch
• Analysis: log operations using a graph
[Figure: Fast Forward and Analysis phases, with a graph over locations A, B, C, D]
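One way to picture the graph logged during the Analysis phase for the two instructions on this slide is the sketch below. The versioned-node representation and the `build_graph` name are assumptions made for illustration; the slides do not specify how JetStream encodes its graph. Versioning matters here because `C` is read by the first instruction and overwritten by the second.

```python
def build_graph(ops):
    """Return edges between versioned locations, such as ("C", 0) -> ("D", 1).

    Version 0 of a location is its value at the start of the epoch; each
    write inside the epoch creates the next version.
    """
    version = {}   # location -> latest version written in this epoch
    edges = []
    for dest, operands in ops:
        # Read operands at their current version before bumping the destination.
        srcs = [(op, version.get(op, 0)) for op in operands]
        version[dest] = version.get(dest, 0) + 1
        node = (dest, version[dest])
        edges.extend((src, node) for src in srcs)
    return edges

# D = B + C, then C = A[D] (array read assumed to depend on A and D).
OPS = [("D", ["B", "C"]), ("C", ["A", "D"])]
for src, dst in build_graph(OPS):
    print(src, "->", dst)
```

With this encoding the new value of `C` depends on the old value of `C` only indirectly, through `D`, and the two versions stay distinct nodes in the graph.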
Local DIFT output
Outputs
Inputs
JetStream
Local DIFT – epoch parallelism
Aggregation – pipeline parallelism
Aggregation
Outputs
Inputs
Calculate paths between sources and sinks
• Many nodes are not on a path between a source and a sink
• Use sequential information to prune work
Forward Pass
Outputs
Inputs
Pass locations which are derived from a source
derived locations
Backward Pass
Outputs
Inputs
used locations
Pass locations which are used by a sink
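The forward and backward passes can be sketched as two sweeps over the per-epoch operation logs. This is a sequential toy model under assumptions, not JetStream's aggregation code: the `(destination, operands)` encoding and all function names are hypothetical, sources and sinks are given as plain location names, and the pipelining between epochs is not modeled.

```python
def forward_pass(epochs, sources):
    """For each epoch, the locations derived from a source at its end."""
    derived, per_epoch = set(sources), []
    for ops in epochs:
        for dest, operands in ops:
            if any(op in derived for op in operands):
                derived.add(dest)
            else:
                derived.discard(dest)  # overwritten with underived data
        per_epoch.append(set(derived))  # handed to the next epoch
    return per_epoch

def backward_pass(epochs, sinks):
    """For each epoch, the locations at its start that a sink uses."""
    used, per_epoch = set(sinks), []
    for ops in reversed(epochs):
        for dest, operands in reversed(ops):
            if dest in used:
                used.discard(dest)       # this write produced the used value
                used.update(operands)    # so its operands are used instead
        per_epoch.append(set(used))      # handed to the previous epoch
    return per_epoch[::-1]

def live_boundaries(fwd, bwd):
    """Locations at each epoch boundary that lie on a source-to-sink path."""
    return [fwd[i] & bwd[i + 1] for i in range(len(fwd) - 1)]

# Three hypothetical epochs; sources are A and B, the sink reads F.
EPOCHS = [
    [("C", ["A", "B"]), ("D", ["X", "Y"])],
    [("E", ["C"]), ("B", []), ("Z", ["A", "D"])],
    [("F", ["E"])],
]
fwd = forward_pass(EPOCHS, {"A", "B"})
bwd = backward_pass(EPOCHS, {"F"})
print(live_boundaries(fwd, bwd))
```

In this example only `C` and then `E` survive at the epoch boundaries, so locations such as `D` and `Z` never need to be resolved, which is the kind of pruning the aggregation slides describe.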
JetStream
In the paper:
• Insights about why naïve approaches fail
• Partitioning – challenging to predict the local DIFT time
• Pre-pruning – garbage collection of the graph
Outline
• Evaluation
Experimental Setup
• CloudLab cluster of 32 machines, 1–128 cores

| Benchmark | Sequential DIFT Time (minutes) | Sources (millions) | Sinks (millions) |
|---|---|---|---|
| Ghostscript | 1.3 | 3 | 0.2 |
| Gzip | 1.8 | 64 | 488 |
| Evince | 3.9 | 10 | 104 |
| Nginx | 3.3 | 10 | 35 |
| Mongodb | 5.2 | 9 | 117 |
| OpenOffice | 7.0 | 10 | 32 |
| Firefox | 30.6 | 0.9 | 2 |
sources: Cookies, sinks: suspicious connections
sources: home directory, sinks: all
Two Different Scenarios
• Unexpected analysis:
  • prioritize low record overhead
• Expected analysis:
  • periodic checkpoint
  • gather partitioning stats
Scalability of Unexpected Analysis
[Figure: Normalized Speedup vs. Number of Cores, both axes from 1 to 100; series: Gzip, Ghostscript, Evince, Mongodb, Nginx, OpenOffice, Firefox, Ideal; mean: 13x]
Scalability of Expected Analysis
[Figure: Normalized Speedup vs. Number of Cores, both axes from 1 to 100; series: Gzip, Ghostscript, Evince, Mongodb, Nginx, OpenOffice, Firefox, Ideal; mean: 21x]
JetStream
Aggregation – pipeline parallelism
Local DIFT – epoch parallelism 2x → 21x
Faster than original execution!
Questions
Related Work
Epoch Parallelism
• Wallace and Hazelwood, “SuperPin: Parallelizing Dynamic Instrumentation for Real-Time Performance”
Local DIFT
• Ruwase et al., “Parallelizing Dynamic Information Flow Tracking”
• Nightingale et al., “Parallelizing security checks on commodity hardware”
Benchmarks
| Benchmark | Replay Time (seconds) | JetStream Time (seconds) |
|---|---|---|
| Ghostscript | 1.0 | 5.6 |
| Gzip | 3.0 | 2.3 |
| Evince | 13.5 | 19.5 |
| Nginx | 4.8 | 5.6 |
| Mongodb | 22.8 | 13.8 |
| OpenOffice | 7.6 | 25.0 |
| Firefox | 67.4 | 94.4 |
Aggregation Results
[Figure: Backwards Pass (seconds) and Both Passes Time (seconds); axis ticks from 0 to 12]
Gzip
[Figure: two panels, Gzip First Query and Gzip Second Query; horizontal axis 0 to 120, vertical axis 0 to 4; legend: Fast Forward, Instrumentation, Analysis, Pre-prune, Forward Pass, Prune, Backward Pass]
OpenOffice
[Figure: two panels, OpenOffice First Query and OpenOffice Second Query; horizontal axis 0 to 120, vertical axis 0 to 40; legend: Fast Forward, Instrumentation, Analysis, Pre-prune, Forward Pass, Prune, Backward Pass]