self-healing networking with flow label

Self-healing Networking with Flow Label Alexander Azimov [email protected]

Post on 12-Jun-2022


Page 1: Self-healing Networking with Flow Label

Self-healing Networking with Flow Label

Alexander Azimov [email protected]

Page 2: Self-healing Networking with Flow Label

ToR + 2xPlanes + ToR

[Diagram: servers attached to ToR1 and ToR2; the ToRs are connected through switches S11, S12, S21, S22 and super spines X11, X12, X21, X22]

hash(proto, src_ip, dst_ip, src_port, dst_port)

Page 3: Self-healing Networking with Flow Label

Theory DC: Many-Many Paths

N_PLANES: Number of planes in DC;

N_X_SPINES: Number of super spines (X) in each plane;

• Inside ToR: 1

• Inside PoD: N_PLANES

• Between PoDs: N_PLANES x N_X_SPINES

Page 4: Self-healing Networking with Flow Label

Real DC: Many-Many Paths

N_PLANES: Number of planes in DC (8);

N_X_SPINES: Number of super spines (X) in each plane (32);

• Inside ToR: 1

• Inside PoD: N_PLANES = 8

• Between PoDs: N_PLANES x N_X_SPINES = 256
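The path counts above follow directly from the two parameters. A minimal sketch of the arithmetic; the function name and dictionary keys are illustrative, not from the talk:

```python
def path_counts(n_planes, n_x_spines):
    """Number of distinct paths between two servers, by locality."""
    return {
        "inside_tor": 1,
        "inside_pod": n_planes,
        "between_pods": n_planes * n_x_spines,
    }

# Values from the slide: 8 planes, 32 super spines per plane.
counts = path_counts(n_planes=8, n_x_spines=32)
print(counts["inside_pod"])    # 8
print(counts["between_pods"])  # 256
```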

Page 5: Self-healing Networking with Flow Label

X11 is Broken: Constant Loss

[Diagram: same topology as on page 2, with X11 broken]

hash(proto, src_ip, dst_ip, src_port, dst_port)

Page 6: Self-healing Networking with Flow Label

Unhappy TCP Flow

[Diagram: same topology as on page 2; the flow between the servers is marked with RTO]

hash(proto, src_ip, dst_ip, src_port, dst_port)
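Why the flow stays unhappy can be sketched in a few lines: the hash inputs on this slide never change, so every retransmission is mapped to the same path. The sketch below assumes SHA-256 as a stand-in hash (real switches use their own hash functions) and a hypothetical flow with documentation addresses.

```python
import hashlib

def ecmp_path(five_tuple, n_paths):
    """Map a flow's 5-tuple to one of n_paths next hops (illustrative hash)."""
    key = "|".join(str(field) for field in five_tuple).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

# Hypothetical flow: (proto, src_ip, dst_ip, src_port, dst_port)
flow = ("tcp", "2001:db8::1", "2001:db8::2", 40000, 443)

# Ten retransmissions carry the same 5-tuple, so they all pick the same path.
paths = {ecmp_path(flow, 4) for _ in range(10)}
print(len(paths))  # 1
```

If that one path crosses the broken device, no number of retransmissions will move the flow away from it.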

Page 7: Self-healing Networking with Flow Label

RTO & SYN_RTO Timeouts

[Chart: Timeout in Seconds (axis from 0 to 140) for the 1st through 7th retry; series: SYN, DATA]

Timeouts: RTO_MIN = 200ms, SYN_RTO = 1s

Real RTT: 1ms

RTO = MAX(RTO_MIN, RTT)
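A rough sketch of how these timeouts grow over seven retries. The doubling of the timer after every retry is an assumption here (standard TCP exponential backoff), not a value read from the chart, and any upper bound on the timer is ignored.

```python
def retry_timeouts_ms(initial_ms, retries):
    """Timeout before each retry, assuming the timer doubles every time."""
    return [initial_ms * 2 ** i for i in range(retries)]

RTO_MIN_MS = 200    # from the slide
SYN_RTO_MS = 1000   # from the slide
RTT_MS = 1          # from the slide

rto_ms = max(RTO_MIN_MS, RTT_MS)  # RTO = MAX(RTO_MIN, RTT), i.e. 200 ms

data = retry_timeouts_ms(rto_ms, 7)
syn = retry_timeouts_ms(SYN_RTO_MS, 7)

print(data)  # [200, 400, 800, 1600, 3200, 6400, 12800]
print(syn)   # [1000, 2000, 4000, 8000, 16000, 32000, 64000]
print(sum(data) / 1000, sum(syn) / 1000)  # 25.4 127.0
```

With a real RTT of 1ms, RTO_MIN dominates: the first retransmission waits 200 times longer than a round trip.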

Page 8: Self-healing Networking with Flow Label

Linux Kernel

2014

Page 9: Self-healing Networking with Flow Label

Linux Kernel

2015

Page 10: Self-healing Networking with Flow Label

Linux Kernel

2016

Page 11: Self-healing Networking with Flow Label

TCP RTO & skb->hash

skb->hash

RTO

IP6 Flow Label

GRE Encap: KEY

UDP Encap: SRC Port

IP6 Encap: Flow Label

Page 12: Self-healing Networking with Flow Label

net.ipv6.auto_flowlabels

0: automatic flow labels are completely disabled

1: automatic flow labels are enabled by default, they can be disabled on a per socket basis using the IPV6_AUTOFLOWLABEL socket option

2: automatic flow labels are allowed, they may be enabled on a per socket basis using the IPV6_AUTOFLOWLABEL socket option

3: automatic flow labels are enabled and enforced, they cannot be disabled by the socket option

Default: 1
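A small sketch for checking which mode a host is in. It assumes the sysctl is exposed at the usual procfs location, `/proc/sys/net/ipv6/auto_flowlabels`; the helper names and the short mode descriptions are illustrative.

```python
from pathlib import Path

# Assumed procfs path for the net.ipv6.auto_flowlabels sysctl.
AUTO_FLOWLABELS = Path("/proc/sys/net/ipv6/auto_flowlabels")

# Short summaries of the four modes listed on the slide.
MODES = {
    0: "disabled",
    1: "enabled by default, per-socket opt-out",
    2: "allowed, per-socket opt-in",
    3: "enabled and enforced",
}

def describe_auto_flowlabels(value):
    """Return a short description of a mode, or 'unknown'."""
    return MODES.get(value, "unknown")

def read_auto_flowlabels(path=AUTO_FLOWLABELS):
    """Return the current mode, or None if the sysctl file is not readable."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None

mode = read_auto_flowlabels()
print(mode, describe_auto_flowlabels(mode))
```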

Page 13: Self-healing Networking with Flow Label

Unhappy TCP Flow Becomes Happier

[Diagram: same topology as on page 2; the flow between the servers is marked with RTO]

hash(proto, src_ip, dst_ip, src_port, dst_port, flow label)
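Once the flow label is part of the hash, changing the label after an RTO gives the flow a new chance to land on a healthy path. The simulation below is a simplified sketch: it assumes SHA-256 as a stand-in hash, draws a fresh random 20-bit label on every RTO (the kernel derives the label from the socket hash instead), and uses a hypothetical flow and four paths.

```python
import hashlib
import random

def pick_path(five_tuple, flow_label, n_paths):
    """Illustrative ECMP hash over the 5-tuple plus the flow label."""
    key = "|".join(str(f) for f in (*five_tuple, flow_label)).encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % n_paths

def escape_broken_path(five_tuple, flow_label, n_paths, broken_path, rng, max_rtos=64):
    """Return (path, rto_count); a path of None means the flow never escaped."""
    for rto_count in range(max_rtos + 1):
        path = pick_path(five_tuple, flow_label, n_paths)
        if path != broken_path:
            return path, rto_count
        flow_label = rng.getrandbits(20)  # RTO fired: rehash, new 20-bit label
    return None, max_rtos

rng = random.Random(0)
flow = ("tcp", "2001:db8::1", "2001:db8::2", 40000, 443)
label = rng.getrandbits(20)
broken = pick_path(flow, label, n_paths=4)  # break the path the flow starts on

path, rtos = escape_broken_path(flow, label, 4, broken, rng)
print(f"escaped to path {path} after {rtos} RTO(s)")
```

If the hash spreads labels evenly over four paths, the chance of still being on the broken one after k rehashes is (1/4)^k.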

Page 14: Self-healing Networking with Flow Label

Evaluation: Without Flow Label

One of four ToR uplinks drops packets, significant service degradation

75%

Page 15: Self-healing Networking with Flow Label

Evaluation: Flow Label + eBPF

One of four ToR uplinks drops packets, no effect on the service!

75%

Page 16: Self-healing Networking with Flow Label

Self-healing Datacenter: Cookbook

• Does it scale? Yes!

• Does it have many paths? Yes!

• Does it have fault tolerance? Use IPv6! Use flow label!

• How do I change RTO? eBPF is the answer!

• Without documentation!

Page 17: Self-healing Networking with Flow Label

Theory Internet: Many-Many Paths

Multihomed at the edge;

Multiple connections between peers;

Multiple connections with upstreams;

Page 18: Self-healing Networking with Flow Label

Real Internet: Many-Many Paths

Average number of best paths: 3.8

Maximum number of best paths: 44

>60% of prefixes have more than 1 path

Page 19: Self-healing Networking with Flow Label

A Real Outage

Page 20: Self-healing Networking with Flow Label
Page 21: Self-healing Networking with Flow Label

RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

Packet: Src IP 1, Dst IP 2, FL=X1; Src Port 1, Dst Port 2; Ack=A, Seq=S

Page 22: Self-healing Networking with Flow Label

RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP; RTO]

Packet: Src IP 1, Dst IP 2, FL=X2; Src Port 1, Dst Port 2; Ack=A, Seq=S

Page 23: Self-healing Networking with Flow Label

SYN RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

SYN: Src IP 1, Dst IP 2, FL=X1; Src Port 1, Dst Port 2; Ack=0, Seq=S1

Page 24: Self-healing Networking with Flow Label

SYN RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

SYN/ACK: Src IP 2, Dst IP 1, FL=Y1; Src Port 2, Dst Port 1; Ack=S1+1, Seq=S2

Page 25: Self-healing Networking with Flow Label

SYN RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

SYN: Src IP 1, Dst IP 2, FL=X2; Src Port 1, Dst Port 2; Ack=0, Seq=S1

Page 26: Self-healing Networking with Flow Label

SYN RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

SYN/ACK: Src IP 2, Dst IP 1, FL=Y1; Src Port 2, Dst Port 1; Ack=S1+1, Seq=S2

SYN/ACK: Src IP 2, Dst IP 1, FL=Z1; Src Port 2, Dst Port 1; Ack=S1+1, Seq=S3

Page 27: Self-healing Networking with Flow Label

SYN RTO & Anycast

[Diagram: TCP Proxy 1 and TCP Proxy 2, each labeled Anycast IP]

ACK: Src IP 1, Dst IP 2, FL=X2; Src Port 1, Dst Port 2; Ack=S2+1, Seq=S1+1
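The handshake on the last five slides can be replayed as a toy model. Everything in it is an assumption made for illustration: the `Proxy` class, routing by `flow_label % 2`, the concrete sequence numbers, and the RST reply to an ACK that does not match the proxy's own SYN/ACK.

```python
class Proxy:
    """Stateful TCP proxy behind an anycast IP (toy model)."""

    def __init__(self, name, isn):
        self.name = name
        self.isn = isn        # this proxy's initial sequence number
        self.half_open = {}   # (client, port) -> client ISN seen in the SYN

    def on_syn(self, conn, client_isn):
        self.half_open[conn] = client_isn
        return {"from": self.name, "seq": self.isn, "ack": client_isn + 1}

    def on_ack(self, conn, ack):
        if conn in self.half_open and ack == self.isn + 1:
            return "ESTABLISHED"
        return "RST"

def route(flow_label, proxies):
    """Hypothetical anycast routing: the flow label selects the proxy."""
    return proxies[flow_label % len(proxies)]

proxies = [Proxy("proxy1", isn=2000), Proxy("proxy2", isn=3000)]  # S2, S3
conn = ("client", 40000)
S1 = 1000
X1, X2 = 0, 1  # labels chosen so that they map to different proxies

syn_ack_1 = route(X1, proxies).on_syn(conn, S1)  # first SYN, FL=X1
syn_ack_2 = route(X2, proxies).on_syn(conn, S1)  # SYN after SYN_RTO, FL=X2

# The client accepts the first SYN/ACK (Seq=S2) but keeps sending with FL=X2.
result = route(X2, proxies).on_ack(conn, ack=syn_ack_1["seq"] + 1)
print(result)  # RST
```

Had the client kept FL=X1, the same ACK would have reached the proxy that issued S2 and the connection would have been established.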

Page 28: Self-healing Networking with Flow Label

Flow Label: Safe Mode

Client – sends SYN, Server – responds with SYN&ACK

• In case of SYN_RTO or RTO events Server SHOULD recalculate its TCP socket hash, thus change Flow Label. This behavior MAY be switched on by default;

• In case of SYN_RTO or RTO events Client MAY recalculate its TCP socket hash, thus change Flow Label. This behavior MUST be switched off by default;
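The two rules above can be written down as a small policy function. This is a sketch, not an existing kernel interface: the names are hypothetical, and the server default is set to on as one permitted choice under "MAY be switched on by default".

```python
from enum import Enum

class Role(Enum):
    CLIENT = "client"  # sent the SYN
    SERVER = "server"  # responded with SYN&ACK

# Defaults under the proposal: server on (an allowed choice), client off (required).
DEFAULT_REHASH = {Role.SERVER: True, Role.CLIENT: False}

def rehash_on_timeout(role, override=None):
    """Decide whether an RTO or SYN_RTO event changes the Flow Label.

    override is an explicit per-host or per-socket setting; None means default.
    """
    if override is not None:
        return override
    return DEFAULT_REHASH[role]

print(rehash_on_timeout(Role.SERVER))                 # True
print(rehash_on_timeout(Role.CLIENT))                 # False
print(rehash_on_timeout(Role.CLIENT, override=True))  # True
```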

Page 29: Self-healing Networking with Flow Label

Self-healing Datacenter: Cookbook

• Flow label provides a way to ‘jump’ from a failing path;

• Already works in controlled environment;

• Can disrupt TCP connection with stateful anycast services;

• We need to change Linux defaults!

• This time we need to document it!
