low-latency network monitoring jeff rasley
TRANSCRIPT
Low-latency Network Monitoring via Oversubscribed Port Mirroring
Jeff Rasley, Brent Stephens, Colin Dixon, Eric Rozner, Wes Felter, Kanak Agarwal, John Carter, Rodrigo Fonseca
Self-Tuning Networks
[Diagram: control loop: Measurement → Decision → Control, with per-stage latencies: Measurement 100 ms – 1 sec+, Decision ~100 µs, Control ~10 ms]
Control Loop Examples
• Traffic Engineering
• Failure Detection
How fast can we do this?
Motivation for Faster Control Loops
Background TCP flows, Microsoft data center (DCTCP, Alizadeh et al., SIGCOMM '10)
[Figure: CDF of flow duration at 1 Gbps; x-axis: flow duration (ms), 0–600; y-axis: % of flows shorter than x]
At 100 ms, 90% of flows are over before you see them!
This gets much worse as you go from 1 Gbps to 10 Gbps.
Why is Measurement Slow?
❖ Traditionally, this didn't need to be fast
❖ Control plane CPUs are typically slow
❖ Sampling or port counter polling
❖ Is this likely to get better? Maybe
❖ Faster control plane CPUs could help, but there is still a big gap between CPUs and ASICs
Our Solution: Abuse Port Mirroring
❖ Modern switches support port mirroring
❖ Copies all packets (e.g. all packets going out a given port) to a designated mirror port
❖ We abuse port mirroring to radically increase the number of samples/sec we get from a switch
❖ We mirror all ports to a single mirror port
❖ Oversubscription approximates sampling (in the data plane) at much higher rates
[Diagram: production traffic flowing through a switch, with every port mirrored to a single mirror port]
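To build intuition for the last bullet, here is a toy Python model (ours, not from the talk) of why oversubscribing a mirror port behaves like sampling. It idealizes the switch by assuming drops on the mirror port are uniform; as the next slides note, the real drop policy is unknown.

```python
import random

def mirror_sample(packets_per_port: int, num_ports: int,
                  mirror_capacity: int) -> list[int]:
    """Toy model of an oversubscribed mirror port.  Each of `num_ports`
    ports offers `packets_per_port` packets per interval; the mirror
    port can only carry `mirror_capacity` of them, so the rest are
    dropped.  Assuming uniform drops, the survivors approximate a
    uniform sample of all traffic at rate mirror_capacity / offered."""
    offered = num_ports * packets_per_port
    rate = min(1.0, mirror_capacity / offered)
    return [pkt for pkt in range(offered) if random.random() < rate]

# 48 ports of equal speed funneled into one mirror port: the collector
# sees roughly a 1-in-48 sample of every packet in the switch.
sample = mirror_sample(packets_per_port=10_000, num_ports=48,
                       mirror_capacity=10_000)
print(f"effective sampling rate ~ {len(sample) / 480_000:.3f}")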
Architecture
❖ A set of collectors receives a stream of samples from mirror ports
❖ Netmap or Intel DPDK for fast packet processing
❖ Reconstruct flow information across all flows in the network
❖ e.g. flow throughput and port congestion
❖ Collectors can interact with an SDN controller to implement various applications
❖ e.g. traffic engineering
[Diagram: hosts (H) attached to a network of switches (S); mirror streams feed the collector(s), which communicate with the controller]
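As a rough illustration of the per-packet work a collector does, here is a minimal Python sketch (not the authors' netmap/DPDK implementation) that parses untagged IPv4/TCP frames and accumulates per-flow byte counts. The capture backend that delivers `frame` is assumed.

```python
import struct
from collections import defaultdict

# Per-flow byte counters, keyed by the IPv4/TCP tuple.
flow_bytes = defaultdict(int)

def handle_mirrored_frame(frame: bytes) -> None:
    """Parse one mirrored Ethernet frame and update per-flow state.
    Only untagged IPv4 + TCP is handled; everything else is skipped.
    A fast capture backend (e.g. netmap or DPDK) is assumed to hand
    raw frames to this function in batches."""
    if len(frame) < 34 or struct.unpack_from("!H", frame, 12)[0] != 0x0800:
        return                                   # not IPv4
    ihl = (frame[14] & 0x0F) * 4                 # IPv4 header length
    if frame[23] != 6:                           # not TCP
        return
    tcp_off = 14 + ihl
    if len(frame) < tcp_off + 4:
        return                                   # truncated frame
    total_len = struct.unpack_from("!H", frame, 16)[0]
    src, dst = struct.unpack_from("!4s4s", frame, 26)
    sport, dport = struct.unpack_from("!HH", frame, tcp_off)
    flow_bytes[(src, dst, sport, dport)] += total_len
```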
What Can Go Wrong?
❖ We lose input/output port information for mirrored packets
❖ Recover metadata about packets by sharing topology state from the controller
❖ When the mirror port fills, its drop policy is unknown, making it hard to calculate throughput
❖ Rate estimation via TCP sequence numbers
❖ The oversubscribed port may occupy switch buffer space, taking it away from production traffic
❖ Indeed, buffers were reduced. Latency of production traffic decreased. Negligible increase in packet loss (~0.1%).
"What are you crazy?! By oversubscribing a mirror port there's a possibility of degrading production traffic." — Cisco
[Diagram: hosts, switches, collector(s), and controller, as on the architecture slide]
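The sequence-number trick works because TCP sequence numbers count bytes, so any two samples of a flow bound how much it sent between them, regardless of which mirrored packets were dropped. A minimal Python sketch of the idea (ours; it ignores 32-bit wraparound and retransmissions, which a real collector must handle):

```python
def estimate_throughput(samples):
    """Estimate a TCP flow's sending rate from an arbitrary subset of
    its packets.  `samples` is a list of (time_sec, seq, payload_len)
    tuples; two samples suffice, and drops on the oversubscribed
    mirror port do not bias the estimate."""
    (t0, seq0, _), (t1, seq1, len1) = min(samples), max(samples)
    if t1 == t0:
        return 0.0
    return (seq1 + len1 - seq0) * 8 / (t1 - t0)   # bits per second

# Two samples 1 ms apart covering ~125 KB of sequence space:
# the flow is sending at roughly 1 Gbps.
print(estimate_throughput([(0.000, 1_000, 1448),
                           (0.001, 126_000, 1448)]))
```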
Results: Sample Latency (high congestion)
[Figure: CDF of measurement latency (ms), 0–6 ms, for the IBM G8264 (10 Gb) and Pronto 3290 (1 Gb) switches]
Low congestion sample latency: 75–150 µs
[Diagram: a sender transmits to a receiver through the switch; the packet leaves the sender at t_1 and its mirrored copy reaches the collector at t_2]
Latency = t_2 - t_1
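One way to realize this measurement, sketched in Python; the probe format and the clock-sync assumption are ours, not from the talk:

```python
import struct, time

def make_probe() -> bytes:
    """Sender: stamp the departure time t_1 into the packet payload."""
    return struct.pack("!d", time.time())

def sample_latency(mirrored_payload: bytes) -> float:
    """Collector: on receiving the mirrored copy at t_2, report
    latency = t_2 - t_1.  Assumes the sender's and collector's clocks
    are synchronized (or that both roles run on the same machine)."""
    (t_1,) = struct.unpack_from("!d", mirrored_payload)
    return time.time() - t_1
```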
What Can You Do With This? Traffic Engineering
Stride(8) 100 MiB Workload: CDF of Flow Throughput
[Figure: CDF of flow throughput, comparing Optimal, ECMP, Poll 100ms, and Our TE]
Setup
• 16 hosts
• k=4 fat tree
• Stride(8): permutation of hosts such that each flow traverses the core
• IBM G8264 (10 Gb) hardware
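To make the workload concrete, here is a small sketch of a stride(s) traffic pattern; the function is ours, written to match the setup described above.

```python
def stride(hosts: list[int], s: int = 8) -> list[tuple[int, int]]:
    """Stride(s) workload: host i sends its flow to host (i + s) % N.
    With 16 hosts in a k=4 fat tree and s=8, source and destination
    always sit in different pods, so every flow crosses the core and
    ECMP hash collisions directly cost throughput."""
    n = len(hosts)
    return [(hosts[i], hosts[(i + s) % n]) for i in range(n)]

print(stride(list(range(16))))   # [(0, 8), (1, 9), ..., (15, 7)]
```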
Conclusion
❖ Using oversubscribed port mirroring, we get ~1 million samples/sec
❖ We get sampling latencies between 100 µs and 6 ms on real hardware, today
❖ This improves on the state of the art (100 ms – 1 sec+) by 3–4 orders of magnitude
Production Traffic Throughput
[Figure: average per-flow throughput (Gbps), 0–6, vs. number of congested output ports (1–10), with and without mirroring]
Production Traffic Latency
[Figure: average latency (ms), 0–3, vs. number of congested output ports (1–10), with and without mirroring]
Production Traffic Packet Loss
[Figure: percent of packets lost, 0%–0.16%, vs. number of congested output ports (1–9), with and without mirroring]