Simplifying Datacenter Network Debugging with PathDump · 2016. 11. 3.
Simplifying Datacenter Network Debugging with PathDump
Praveen Tammana (University of Edinburgh), Rachit Agarwal (Cornell University), Myungjin Lee (University of Edinburgh)


Page 4: Datacenter networks are complex

• Increasingly larger scale
  • Over 10k switches, 100k servers
  • Each server with 10 to 40 Gbps
  • Aggregate traffic > 100 Tbps
• Stringent performance requirements (e.g., Amazon and Google studies)
• Complex policies: security, isolation, etc.
• Network programmability: too many possible configurations

(image source: TechRepublic.com)

Page 10: Network problems are inevitable

• Result: mismatch between network behavior and operator intent
  • Failures, bugs → loops
  • Faulty interface → silent random packet drops
  • Human errors → black holes
• Lots of research effort in building network debuggers

[Figures: small topologies of switches 1, 2, 3 illustrating a loop, a faulty interface silently dropping packets, and a black hole]

Page 12: Existing designs: in-network techniques

Idea: use the programmability of network switches to capture debugging information.

Network debuggers are even more complex.

Page 16: Network debuggers are even more complex

Static analysis of data plane snapshots (FIBs, ACLs)
E.g., HSA [NSDI'12], Anteater [SIGCOMM'11]

• Capturing consistent network state is a hard problem

Page 19: Network debuggers are even more complex

Per-switch per-packet logging
E.g., NetSight [NSDI'14]

• High bandwidth and processing overhead

Page 23: Network debuggers are even more complex

Selective packet sampling and mirroring
E.g., Everflow [SIGCOMM'15], Planck [SIGCOMM'14], sFlow

• Identifying which packets to sample for a given debugging problem is complex

Page 26: Network debuggers are even more complex

SQL-like queries on switches
E.g., Pathquery [NSDI'16]

• Requires dynamic installation of switch rules

Page 27: Summary: complex networks and debuggers

Networks are complex (image source: TechRepublic.com), and network debuggers are even more complex:
• Data plane snapshots
• Per-switch per-packet logs
• Packet mirroring
• Packet sampling
• Dynamic rule installation

Page 29: PathDump: (simple) in-network + end-hosts

• Use end-hosts for most debugging problems
• Keep in-network functionality for only a small number of debugging problems

[Figure: debugging functionality moved out of the network elements and onto servers, with only limited in-network debugging remaining]

Page 33: PathDump in a nutshell

Switch: before forwarding a packet, checks a condition; if the condition is met, embeds its ID into the packet header.
• No data plane snapshots
• No per-switch per-packet logs
• No packet sampling
• No packet mirroring
• No dynamic rule installation

Server: captures each and every packet header; stores and updates flow-level statistics; exposes an API for debugging purposes.

Aggregator: enables slicing-and-dicing of statistics across flows (potentially stored at various end-hosts).

Page 37: PathDump: three challenges

• Coverage: how to support a large class of debugging functionality?
  → Debugs more than 85% of reported network problems

• Packets not reaching the destination: how to handle packet drops and loops caused by network problems?
  → Exploit load balancing (e.g., ECMP) and identify spurious packet drops

• Limited data plane and end-host resources: switch resources and packet header space are limited, and PathDump should not hog user applications' resources at the end-host
  → CherryPick, a per-packet path tracing technique: tens of flow rules per switch, two VLAN tags per packet, and 25% of one core / 100 MB of memory at the end-host

Page 44: PathDump architecture

1. Switch embeds a unique ID (e.g., link ID) into the user packet as it travels from Src to Dst across the ToR, Aggregate, and Core layers.

• Packet header space is limited, so not every link can be recorded
• CherryPick [SOSR'15] for current deployments: pick only the links needed to reconstruct the path (e.g., there is only one shortest path from a Core switch down to Dst, so the picked uphill links suffice)
• Scales: 72-port fat-tree with 90K servers; 62-port VL2 with 20K servers

More details in our paper.
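The link-picking idea can be sketched as follows. This is a toy illustration, not the actual CherryPick algorithm: it assumes each hop knows how many next-hop choices it had, and embeds a link ID only when that choice was ambiguous (forced downhill links in a fat-tree need not be recorded). All names are hypothetical.

```python
# Toy sketch of CherryPick-style path tracing in a 3-layer fat-tree.
# Uphill hops choose among several links (ECMP), so their link IDs must be
# embedded (as VLAN tags in PathDump); downhill hops have exactly one
# shortest path to the destination ToR, so they can be skipped.

def cherry_pick(hops):
    """hops: list of (link_id, num_alternatives) along the packet's path.
    Embed a link ID only when the switch had more than one choice."""
    tags = []
    for link_id, num_alternatives in hops:
        if num_alternatives > 1:  # ambiguous choice: must record it
            tags.append(link_id)
    return tags

# Example path: two ambiguous up-links, two forced down-links.
path = [("tor1->agg2", 2), ("agg2->core1", 4),
        ("core1->agg7", 1), ("agg7->tor9", 1)]
print(cherry_pick(path))  # ['tor1->agg2', 'agg2->core1']
```

Note that only two link IDs survive, matching the slide's "two VLAN tags in the packet" budget.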

Page 48: PathDump architecture

2. End-host captures the packet path and updates flow-level statistics.

Packet stream → OVS → StoreAgent → Trajectory Information Base (TIB)

TIB entry, keyed by <5-tuple flow ID>:
• path: set of switch IDs
• start, end, #pkts, #bytes
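A minimal sketch of the TIB update at the end-host, assuming an in-memory dict keyed by the 5-tuple. The record fields mirror the slide; the dict layout, the function name `update_tib`, and the timestamp handling are illustrative assumptions, not PathDump's actual implementation.

```python
# Minimal sketch of the StoreAgent updating the Trajectory Information
# Base (TIB) as packet headers arrive from OVS.
import time

TIB = {}  # <5-tuple flow id> -> flow record

def update_tib(five_tuple, path, pkt_len, now=None):
    """Record one captured packet header for the given flow."""
    now = time.time() if now is None else now
    rec = TIB.get(five_tuple)
    if rec is None:
        TIB[five_tuple] = {"paths": {tuple(path)},  # set of switch-ID paths
                           "start": now, "end": now,
                           "pkts": 1, "bytes": pkt_len}
    else:
        rec["paths"].add(tuple(path))  # a flow may traverse several paths
        rec["end"] = now
        rec["pkts"] += 1
        rec["bytes"] += pkt_len

# Two packets of the same flow, both on path S1-S4-S3:
flow = ("10.0.0.1", "10.0.1.2", 6, 34512, 80)
update_tib(flow, ["S1", "S4", "S3"], 1500, now=1.0)
update_tib(flow, ["S1", "S4", "S3"], 1500, now=2.0)
print(TIB[flow]["pkts"], TIB[flow]["bytes"])  # 2 3000
```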

Page 52: PathDump architecture

3. Aggregator runs debugging applications (on-demand vs. event-driven).

• On-demand applications: aggregator sends request/reply queries to servers
  • Congested link, traffic matrix, load imbalance
• Event-driven applications: end-hosts post alarms to the aggregator
  • Path conformance, black hole, silent packet drop

Page 58: PathDump interface

• Aggregator API: install(), execute(), and uninstall()
• End-host API: getFlows(), getPaths(), getCount(), getPoorTCPFlows(), Alarm(), etc.

A small set of simple APIs enables a variety of debugging applications.

Example (switches A, B, C; Flow 1 takes path A-B-C with 5 pkts, Flow 2 takes path A-C with 2 pkts):
• getFlows(B-C) → Flow 1
• getPaths(Flow 2) → A-C
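The two example queries can be modelled in a few lines. The function names follow the slide's API; the in-memory `tib` layout is an assumption made for illustration only.

```python
# Toy model of getFlows/getPaths over the slide's example TIB contents
# (Flow 1: A-B-C, 5 pkts; Flow 2: A-C, 2 pkts).
tib = {
    "Flow 1": {"path": ["A", "B", "C"], "pkts": 5},
    "Flow 2": {"path": ["A", "C"], "pkts": 2},
}

def get_flows(link):
    """Flows whose recorded path traverses the given link (switch pair)."""
    a, b = link
    return [fid for fid, rec in tib.items()
            if (a, b) in zip(rec["path"], rec["path"][1:])]

def get_paths(flow_id):
    """The recorded path of one flow."""
    return tib[flow_id]["path"]

print(get_flows(("B", "C")))  # ['Flow 1']
print(get_paths("Flow 2"))    # ['A', 'C']
```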

Page 62: Example 1: Path conformance

Check whether the actual forwarding path violates network policy.
• Violations may occur due to switch faults or network state changes

Topology: switches 1-6. Policy: packets must avoid switch 4.
Actual path: 1 - 4 - 3 → violation.

Query (pseudocode), given flowID, paths, switchID:

    for path in paths:
        if switchID in path:
            Alarm(flowID, PC_FAIL, path)
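A runnable version of the pseudocode, with `Alarm()` stubbed to record events locally (in PathDump it would post the alarm to the aggregator; the stub and the constant value are illustrative):

```python
# Runnable path-conformance check; Alarm() is a local stub.
PC_FAIL = "PC_FAIL"
alarms = []

def Alarm(flow_id, reason, path):
    """Stub: in PathDump this posts an event to the aggregator."""
    alarms.append((flow_id, reason, path))

def check_path_conformance(flow_id, paths, forbidden_switch):
    """Raise an alarm for every observed path crossing the forbidden switch."""
    for path in paths:
        if forbidden_switch in path:
            Alarm(flow_id, PC_FAIL, path)

# The slide's scenario: the flow actually took 1 - 4 - 3, violating the policy.
check_path_conformance("flowX", [[1, 4, 3]], forbidden_switch=4)
print(alarms)  # [('flowX', 'PC_FAIL', [1, 4, 3])]
```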

Page 69: Example 2: Silent random packet drop diagnosis

A faulty interface drops packets silently (no drop hint; caused by a software/hardware bug).

Topology: A - B, with B connected to C and D; flows take paths A-B-C and A-B-D.

1. Aggregator installs a query at the end-hosts: install(getPoorTCPFlows())
2. End-hosts run getPoorTCPFlows() and post Alarm() for poorly performing TCP flows
3. Aggregator calls getPaths() for the flagged flows
4. Max-Coverage algorithm counts how often each link appears in the flagged paths (A-B: 2, B-C: 1, B-D: 1) → link A-B is the most likely culprit
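The scoring step on the slide's numbers can be sketched as below. The paper's Max-Coverage algorithm does more than this; the snippet shows only the per-link counting over flagged paths, with illustrative names.

```python
# Count how many poorly-performing flows traverse each link, and rank links;
# the most-covered link is the prime suspect for the silent drops.
from collections import Counter

def score_links(flagged_paths):
    counts = Counter()
    for path in flagged_paths:
        for link in zip(path, path[1:]):  # consecutive switch pairs
            counts[link] += 1
    return counts.most_common()

# The slide's example: two flagged flows, A-B-C and A-B-D.
paths = [["A", "B", "C"], ["A", "B", "D"]]
print(score_links(paths))
# [(('A', 'B'), 2), (('B', 'C'), 1), (('B', 'D'), 1)]
```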

Page 74: Example 2: Silent random packet drop diagnosis (evaluation)

Lab setup: 4-ary fat-tree topology, web-traffic model, network load = 70%.

[Figures: diagnosis accuracy vs. number of faulty interfaces; time taken to reach 100% accuracy]

As the loss rate increases, the algorithm takes less time to reach 100% accuracy.

Page 81: Example 3: Load imbalance diagnosis

Switch A forwards over Link 1 and Link 2 under the policy: if flow >= 1 MB → Link 1; if flow < 1 MB → Link 2.

1. Aggregator issues execute(query) to the end-hosts
2. Each end-host replies with its local flow size distribution per link
3. Aggregator combines the local distributions to check whether the observed split matches the policy
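A sketch of this check, assuming each end-host reports a per-link list of flow sizes and the aggregator merges them before testing the slide's 1 MB split. All function names and the data layout are illustrative.

```python
# Merge per-host flow-size distributions and flag flows that landed on the
# wrong link under the policy: >= 1 MB -> Link 1, < 1 MB -> Link 2.
MB = 1_000_000

def local_distribution(flows):
    """flows: list of (link, flow_bytes) observed at one end-host."""
    dist = {}
    for link, size in flows:
        dist.setdefault(link, []).append(size)
    return dist

def merge(dists):
    """Aggregator-side merge of the hosts' local distributions."""
    total = {}
    for d in dists:
        for link, sizes in d.items():
            total.setdefault(link, []).extend(sizes)
    return total

def violations(total):
    """Flow sizes that violate the split policy."""
    bad = [s for s in total.get("Link 1", []) if s < MB]
    bad += [s for s in total.get("Link 2", []) if s >= MB]
    return bad

h1 = local_distribution([("Link 1", 5 * MB), ("Link 2", 200_000)])
h2 = local_distribution([("Link 2", 3 * MB)])  # big flow on the small-flow link
print(violations(merge([h1, h2])))  # [3000000]
```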

Page 82: Other debugging applications

• Real-time routing loop detection
• Blackhole diagnosis
• TCP performance anomaly diagnosis (TCP incast and outcast)
• Traffic measurement (traffic matrix, heavy-hitter detection, etc.)

More details in our paper.

Page 84: Packet processing overhead at end-host

Minimal packet processing overhead atop Open vSwitch: at most 4% throughput loss.

Page 85: More details (in the paper)

• Distributed query mechanism
• Supported network debugging problems
• Implementation details
• Evaluation over real testbed(s)

Page 86: Conclusion

• DCNs are complex, and their debuggers are even more complex
• We design and implement PathDump, a simple debugger that:
  • Keeps network switches simple (no complex operations in network switches)
  • Executes debugging queries in a distributed manner
  • Consumes a small amount of data plane and end-host resources
  • Debugs a large class of network problems

https://github.com/PathDump

Page 89: PathDump interface (reference)

Host API:
• getFlows(linkID, timeRange)
• getPaths(flowID, linkID, timeRange)
• getCount(flow, timeRange)
• getDuration(flow, timeRange)
• getPoorTCPFlows(threshold)
• Alarm(flowID, reason, paths)

Aggregator API:
• execute(list<hostID>, query)
• install(list<hostID>, query, period)
• uninstall(list<hostID>, query)

✓ Write a query using the host API
✓ install(), execute(), or uninstall() it with the aggregator API

Just nine simple APIs enable a variety of debugging applications.
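How execute() might drive a host-API query can be sketched as below. The in-process `hosts` table stands in for networked StoreAgents, and `flows_via_B` is a hypothetical query; in PathDump the calls go over the network and hosts run against their real TIBs.

```python
# Toy aggregator: run a query function against each host's (modelled)
# local state and collect the per-host replies.
hosts = {
    "h1": {"Flow 1": ["A", "B", "C"]},
    "h2": {"Flow 2": ["A", "C"]},
}

def execute(host_ids, query):
    """execute(list<hostID>, query): one-shot request/reply to each host."""
    return {h: query(hosts[h]) for h in host_ids}

# A query written against the host state: flows crossing switch B.
def flows_via_B(tib):
    return [fid for fid, path in tib.items() if "B" in path]

print(execute(["h1", "h2"], flows_via_B))  # {'h1': ['Flow 1'], 'h2': []}
```

install() would differ only in running the query periodically (the `period` argument) instead of once.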

Page 95: Query processing

Two query modes:
• Direct query: the controller queries every end-host itself
• Multi-level query: end-hosts aggregate results up a tree before they reach the controller

Flow size distribution query:
• Direct query is better when the number of servers is small
• Multi-level query is better when the number of servers is large

Top-k flows query: whenever aggregation reduces the response data, the multi-level query is more efficient.
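The difference between the two modes can be sketched with a flow-size histogram query. The tree shape and the `Counter`-based merge are illustrative assumptions; both modes produce the same answer, they only differ in where the merging happens.

```python
# Direct vs. multi-level processing of a flow size distribution query.
from collections import Counter

def local_result(host):
    """Each host's local histogram: {size_bucket: flow_count}."""
    return Counter(host)

def direct_query(hosts):
    """Controller merges every host's reply itself."""
    return sum((local_result(h) for h in hosts), Counter())

def multi_level(tree):
    """tree: (host, [subtrees]); each node merges its children's results,
    so the controller receives one already-aggregated reply."""
    host, children = tree
    out = local_result(host)
    for child in children:
        out += multi_level(child)
    return out

hosts = [{"<1MB": 3}, {"<1MB": 1, ">=1MB": 2}, {">=1MB": 1}]
tree = (hosts[0], [(hosts[1], []), (hosts[2], [])])
assert direct_query(hosts) == multi_level(tree)  # same answer either way
print(dict(direct_query(hosts)))  # {'<1MB': 4, '>=1MB': 3}
```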