stroboscope: declarative network monitoring on a budget · 2019. 5. 13. · stroboscope:...
TRANSCRIPT
Stroboscope: Declarative Network Monitoring on a Budget
Olivier TilmansUniversité catholique de Louvain
USENIX NSDI’18April 11, 2018
Joint work withT. Bühler (ETH Zürich), I. Poese (BENOCS), S. Vissicchio (UCL) and L. Vanbever (ETH Zürich)Adapted from original picture © Michael Magg, 2007, CC-BY-SA 3.0
Consider this example ISP network topology
A B C D
E U
VZWX Y
1.2.3.0/24
Packet
Border router
RouterLink
Customer peering
2
What is the ingress router for this packet arriving at router D?
A B C D
E U
VZWX Y
1.2.3.0/24
Packet
Border router
RouterLink
Customer peering
2
Which paths does the traffic follow?
A B C D
E U
VZWX Y
1.2.3.0/24
Packet
Border router
RouterLink
Customer peering
2
Which paths does the traffic follow?
A B C D
E U
VZWX Y
1.2.3.0/24
Packet
Border router
RouterLink
Customer peering
Tracking flows network-wide requires tomatch measurements across multiple vantage points
NetFlow, ProgME [ToN’11], FlowRadar [NSDI’16]
2
Which paths does the traffic follow?
A B C D
E U
VZWX Y
1.2.3.0/24
Packet
Border router
RouterLink
Customer peering
Tracking flows network-wide requires tomatch measurements across multiple vantage points
NetFlow, ProgME [ToN’11], FlowRadar [NSDI’16]
2
Is traffic load-balanced as expected?
A B C D
E U
VZWX Y
1.2.3.0/24
3
Is the latency acceptable?
A B C D
E U
VZWX Y
1.2.3.0/24123
Packets Time(t)
A B C D
E U
VZWX Y
1.2.3.0/24
Time(t + 25ms)
321
3
Are there losses?
A B C D
E U
VZWX Y
1.2.3.0/24123
Packets
Time(t)
A B C D
E U
VZWX Y
1.2.3.0/24
Time(t + 25ms)
31
3
Are there losses?
A B C D
E U
VZWX Y
1.2.3.0/24123
Packets
Time(t)
A B C D
E U
VZWX Y
1.2.3.0/24
Time(t + 25ms)
31
Fine-grained data-plane performance metrics requirepacket-level visibility over individual flows
3
Fined-grained network monitoring is widely researched
� No control over end hosts
� Limited data-plane flexibility� Limited monitoring bandwidth
Gigascope [SIGMOD’03]Planck [SIGCOMM’14]Everflow [SIGCOMM’15]Compiling Path Queries [NSDI’16]Trumpet [SIGCOMM’16]Marple [SIGCOMM’17]
4
Fined-grained ISP network monitoring posesunique and unmet challenges
� No control over end hosts
� Limited data-plane flexibility� Limited monitoring bandwidth
Gigascope [SIGMOD’03]Planck [SIGCOMM’14]Everflow [SIGCOMM’15]Compiling Path Queries [NSDI’16]Trumpet [SIGCOMM’16]Marple [SIGCOMM’17]
4
Fined-grained ISP network monitoring posesunique and unmet challenges
� No control over end hosts� Limited data-plane flexibility
� Limited monitoring bandwidth
Gigascope [SIGMOD’03]Planck [SIGCOMM’14]Everflow [SIGCOMM’15]Compiling Path Queries [NSDI’16]Trumpet [SIGCOMM’16]Marple [SIGCOMM’17]
4
Fined-grained ISP network monitoring posesunique and unmet challenges
� No control over end hosts� Limited data-plane flexibility� Limited monitoring bandwidth
Gigascope [SIGMOD’03]Planck [SIGCOMM’14]Everflow [SIGCOMM’15]Compiling Path Queries [NSDI’16]Trumpet [SIGCOMM’16]Marple [SIGCOMM’17]
4
Stroboscope: Declarative Network Monitoring on a Budget
� Collecting traffic slices to monitor networks� Adhering to a monitoring budget� Using Stroboscope today
Consider the following flow of packets
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
Collector
12345
123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
Consider the following flow of packets
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
Collector12345
123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
Stroboscope activates mirroring for the flow
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
CollectorA
12345
123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
Packets are copied and encapsulated towards the collector
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
CollectorA
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
The mirroring rule is deactivated after a preset delay
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
Collector
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
Stroboscope stores the traffic slice for analysis
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
Collector
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011 891011
8 9 10
6
Stroboscope periodically toggles the mirroring rule
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
CollectorA
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011
67891011 891011
8 9 10
6
Stroboscope periodically toggles the mirroring rule
2 3 4
Mirroredpacket
8 9 10A B C D
E U
VZWX Y
1.2.3.0/24
CollectorA
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011
67891011
891011
8 9 10
6
Stroboscope periodically toggles the mirroring rule
2 3 4
Mirroredpacket
8 9 10A B C D
E U
VZWX Y
1.2.3.0/24
Collector
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011
67891011
891011
8 9 10
6
Stroboscope collects multiples traffic slices over time
2 3 4
Mirroredpacket
8 9 10
A B C D
E U
VZWX Y
1.2.3.0/24
Collector
12345 123456
Nodemirroring rule
123456789
2 43
1234567891011
Collected traffic slice
234567891011 67891011
891011
8 9 10
6
Stroboscope works with currently deployed routers
� Most vendors provide traffic mirroring and encapsulation primitives� The collector activates mirroring for a flow by updating one ACL� Routers autonomously deactivate mirroring rules using timers� Traffic slices can be as small as 23ms on our routers (Cisco C7018)
7
Consider the following forwarding path
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
Stroboscope activates mirroring rules along a path
MIRROR 1.2.3.0/24 ON [A B C D]
CONFINE 1.2.3.0/24 ON [A B E C D]
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
Traffic slices are collected
4 5 6 73214 5 6 X 834 X 6 7 8 934 X 6 7 8 9 10 11
∼25ms
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
A CONFINE query mirrors any packet leaving a region
MIRROR 1.2.3.0/24 ON [A B C D]CONFINE 1.2.3.0/24 ON [A B E C D]
Confinementregion
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
A CONFINE query mirrors any packet leaving a region
MIRROR 1.2.3.0/24 ON [A B C D]CONFINE 1.2.3.0/24 ON [A B E C D]
Confinementregion
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
Counting packets missing in all last hops of a pathestimates loss rates
4 5 6 73214 5 6 X 834 X 6 7 8 934 X 6 7 8 9 10 11
∼25ms
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
Counting packets partially following the pathestimates load-balancing ratios
4 5 6 73214 5 6 X 834 X 6 7 8 934 X 6 7 8 9 10 11
∼25ms
A B C D
E U
VZWX Y
1.2.3.0/24
W
A B C D
8
Analyzing matching packets across traffic slicesenables fine-grained measurements at scale
Forwarding paths discovery, timestamp reconstruction, payload inspection, . . .
Analyzing matching packets across traffic slicesenables fine-grained measurements at scaleForwarding paths discovery, timestamp reconstruction, payload inspection, . . .
Stroboscope: Declarative Network Monitoring on a Budget
� Collecting traffic slices to monitor networks� Adhering to a monitoring budget� Using Stroboscope today
Stroboscope defines two types of queries
MIRROR CONFINE
11
Stroboscope defines two types of queries
MIRROR CONFINE
11
MIRROR queries reconstruct the path taken by packets
MIRROR 1.2.3.0/24 ON [A B C D]
A B C D
E U
VZWX Y
1.2.3.0/24A DCB
12
Fewer mirroring rules reduces bandwidth usage
MIRROR 1.2.3.0/24 ON [A B C D]
A B C D
E U
VZWX Y
1.2.3.0/24A DB
12
Too few mirroring rules creates ambiguity
MIRROR 1.2.3.0/24 ON [A B C D]
A B C D
E U
VZWX Y
1.2.3.0/24A D
12
Too few mirroring rules creates ambiguity
MIRROR 1.2.3.0/24 ON [A B C D]
A B C D
E U
VZWX Y
1.2.3.0/24A D
The Key-Points Sampling algorithm minimizes mirroring rulesand guarantees non-ambiguous reconstructed paths
12
Stroboscope defines two types of queries
MIRROR CONFINE
13
CONFINE queries mirror packets leaving a confinement region
CONFINE 1.2.3.0/24 ON [A B E C D]
A B C D
E U
VZWX Y
1.2.3.0/24
U
X Y
V
U
Z
Edge Mirroring rule
14
Fewer mirroring rules minimizes control-plane overhead
CONFINE 1.2.3.0/24 ON [A B E C D]
A B C D
E U
VZWX Y
1.2.3.0/24U
X Y
V
U
Z
Edge Mirroring rule
14
The lower bound is a multi-terminal node cut
CONFINE 1.2.3.0/24 ON [A B E C D]
A B C D
E U
VZWX Y
1.2.3.0/24
U
X Y
V
U
Z
Edge Mirroring rule
14
The lower bound is a multi-terminal node cut
CONFINE 1.2.3.0/24 ON [A B E C D]
A B C D
E U
VZWX Y
1.2.3.0/24
U
X Y
V
U
Z
Edge Mirroring rule
The Surrounding algorithm minimizes mirroring rules andguarantees to mirror any packet leaving the confinement region
14
Query activations must be scheduled to meet the budget
Monitoringqueries Where?
Key-Points SamplingSurrounding
When?Measurementcampaign
15
Stroboscope divides the monitoring budget in timeslots
Timeslot (∼30ms)
150ms
40Mbps
Q4?Mbps Q5Q1?Mbps Q2 Q3
Q6?Mbps
16
Stroboscope requires traffic demand estimations
Timeslot (∼30ms)
150ms
40MbpsQ4?Mbps Q5Q1?Mbps Q2 Q3
Q6?Mbps
16
Stroboscope conservatively estimates traffic demands
Prefix measured inprior iterations?
Peak observed demandyes
Prefix with enoughNetFlow records?no
NetFlow estimationyes
Assume the query requiresthe full budgetno
17
Stroboscope conservatively estimates traffic demands
Prefix measured inprior iterations? Peak observed demandyes
Prefix with enoughNetFlow records?no
NetFlow estimationyes
Assume the query requiresthe full budgetno
17
Stroboscope conservatively estimates traffic demands
Prefix measured inprior iterations? Peak observed demandyes
Prefix with enoughNetFlow records?no
NetFlow estimationyes
Assume the query requiresthe full budgetno
17
Stroboscope conservatively estimates traffic demands
Prefix measured inprior iterations? Peak observed demandyes
Prefix with enoughNetFlow records?no
NetFlow estimationyes
Assume the query requiresthe full budgetno
17
Stroboscope conservatively estimates traffic demands
Prefix measured inprior iterations? Peak observed demandyes
Prefix with enoughNetFlow records?no
NetFlow estimationyes
Assume the query requiresthe full budgetno
17
Confine queries are scheduled in all timeslots
40MbpsQ415Mbps Q5Q110Mbps Q2 Q3
Q620Mbps0*Mbps
Confinequery
18
Stroboscope first approximates a minimal sub-schedule
,optionally optimizing for the optimal bin-packing solution
Q6Q4
Q5Q1Q2
Q3Minimalsub-schedule
Q6 Q4Q5Q1
Q2 Q3Minimalsub-schedule
19
Stroboscope first approximates a minimal sub-schedule,optionally optimizing for the optimal bin-packing solution
Q6Q4
Q5Q1Q2
Q3Minimalsub-schedule
Q6 Q4Q5Q1
Q2 Q3Minimalsub-schedule
19
Stroboscope replicates the sub-schedule
,and minimizes budget leftovers
Q6 Q4Q5Q1
Q2 Q3
Q6 Q4Q5Q1
Q2 Q3
Minimalsub-schedule
Q6 Q4Q5Q1
Q2 Q3
Q6 Q4Q5Q1
Q2 Q3
Q6Q1Q3
Budget usagemaximization
20
Stroboscope replicates the sub-schedule,and minimizes budget leftovers
Q6 Q4Q5Q1
Q2 Q3
Q6 Q4Q5Q1
Q2 Q3
Minimalsub-schedule
Q6 Q4Q5Q1
Q2 Q3
Q6 Q4Q5Q1
Q2 Q3
Q6Q1Q3
Budget usagemaximization
20
Stroboscope achieves deterministic sampling
Stroboscope: Declarative Network Monitoring on a Budget
� Collecting traffic slices to monitor networks� Adhering to a monitoring budget� Using Stroboscope today
Selecting mirroring locations in realistic ISP topologies is fast
3 5 7 9 11 13 15Input path length
0.01
0.1
1
10
100200
Tim
e[m
s]
Key-Points Sampling
15 20 25 30 40 50 60Region size
0.050.1
1
51550
100
Tim
e[m
s]
Minimal surrounding
Node surrounding
23
Schedules can be quickly approximated
10 50 100 200 500 1000Input query count
0.00010.001
0.01
110
1001000
Tim
e[s
]
Approximation
Optimized schedule
24
Stroboscope tracks the rate of mirrored traffic in real time
1 000Kb/sA
B
Destination
Source
CollectorMIRROR �ON [A B]
0 1 10 12 20Time [s]
10002000
5000
Traffi
c ra
te [K
b/s]
realmirrored
17
budget
25
Measurement campaigns are stopped earlyif the estimated demand are exceeded
10 000Kb/sA
B
Destination
Source
CollectorMIRROR �ON [A B]
0 1 10 12 20Time [s]
10002000
5000
Traffi
c ra
te [K
b/s]
realmirrored
17
budget
25
Exceeding the total budget schedules the queryonce per measurement campaign
1 000Kb/sA
B
Destination
Source
CollectorMIRROR �ON [A B]
0 1 10 12 20Time [s]
10002000
5000
Traffi
c ra
te [K
b/s]
realmirrored
17
budget
25
Stable recorded traffic rates are used for future estimations
1 000Kb/sA
B
Destination
Source
CollectorMIRROR �ON [A B]
0 1 10 12 20Time [s]
10002000
5000
Traffi
c ra
te [K
b/s]
realmirrored
17
budget
25
Stroboscope exceeds the monitoring budget forat most one timeslot
A
B
Destination
Source
CollectorMIRROR �ON [A B]
0 1 10 12 20Time [s]
10002000
5000
Traffi
c ra
te [K
b/s]
realmirrored
11.225 11.25
17
budget
25
Stroboscope: Declarative Network Monitoring on a Budget
� Traffic slicing as a first-classdata-plane primitive� Strong guarantees on budget complianceand measurement accuracy� Measurement analysis decoupledfrom measurement collection
https://stroboscope.ethz.ch
Backup slides
NetFlow brings a poor visibility over traffic in ISP networks
0 1 2 10 100Maximal number of observations
0.71
0.74
0.86
0.95
1.0
Fract
ion o
f B
GP p
refixe
s
30
0.91
0.64
29
Stroboscope defines a declarative requirement language
MIRROR 1.2.3.0/24 ON [A B C D], [A E C D]MIRROR 1.2.3.0/24 ON [A ->D]CONFINE 1.2.3.0/24 ON [A B E C D]CONFINE 1.2.3.0/24 [A ->D]MIRROR 1.2.3.0/24 ON [ ->D]CONFINE 1.2.3.0/24 ON [ ->D]USING 15Mbps DURING 500ms EVERY 5 s
A B C D
E U
VZWX Y
1.2.3.0/24
30
The placement algorithms minimize the mirroring rules
10 30 50 70 90Optimization gain
0.0
0.2
0.4
0.6
0.8
1.0
Frac
tion
ofex
perim
ents
Key-Points Sampling
10 30 50 70Optimization gain
0.0
0.2
0.4
0.6
0.8
1.0
Frac
tion
ofex
perim
ents Minimal surrounding
Node surrounding
31
Schedules can be computed by two pipelines
10 50 100 200 500 1000Input query count
0.00010.001
0.01
110
1001000
Tim
e[s
]
Approximation
Optimized schedule
0 10 20 30 40 50 60 70 80Optimization gain
0.2
0.4
0.6
0.8
Frac
tion
ofex
perim
ents
32