
Traffic Engineering with Equal-Cost-MultiPath:

An Algorithmic Perspective

Marco Chiesa

joint work with

Guy Kindler, Michael Schapira

Israel Networking Day 2014 April 24th

Infocom 2014

traffic engineering (TE)

network operators' goal:
● provide the best possible service
● minimize costs

how?
● fully exploit network resources
→ route flows of traffic along the “best” paths

traditional TE tools

ECMP (Equal-Cost-MultiPath)
● the most widely deployed TE mechanism
● load-balancing tool
● very simple mechanism

contributions

ECMP and arbitrary topologies :(
→ no reasonable approximation is possible

ECMP and datacenter topologies:
● hypercubes vs folded Clos networks :)
● large flows in folded Clos networks :)


ECMP (Equal-Cost-MultiPath)
● operators set link weights
● traffic is routed along shortest paths

[figure: example network with source s and target t, annotated with link weights]


when multiple shortest paths are available:
● per-packet level: equal split
● per-flow level: hash-based split
→ equal split for many small flows

[figure: examples of a flow f_st = 4 from s to t being split over the shortest paths under different link weights]
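The equal-split rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function names `dijkstra_to_target` and `ecmp_loads`, the directed-link dictionary format, and the example network are all assumptions. It assumes strictly positive link weights and splits traffic equally among the shortest-path next hops at every node, which is how ECMP is usually realized hop by hop.

```python
import heapq
from collections import defaultdict
from fractions import Fraction


def dijkstra_to_target(weights, target):
    """Shortest distance from every node to `target` (directed links, positive weights)."""
    reverse = defaultdict(list)
    for (u, v), w in weights.items():
        reverse[v].append((u, w))
    dist = {target: 0}
    heap = [(0, target)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for u, w in reverse[v]:
            if d + w < dist.get(u, float("inf")):
                dist[u] = d + w
                heapq.heappush(heap, (d + w, u))
    return dist


def ecmp_loads(weights, source, target, demand):
    """Split `demand` equally among shortest-path next hops at every node."""
    dist = dijkstra_to_target(weights, target)
    next_hops = defaultdict(list)
    for (u, v), w in weights.items():
        # A link lies on a shortest path iff it is tight w.r.t. the distances.
        if u in dist and v in dist and dist[u] == w + dist[v]:
            next_hops[u].append(v)
    inflow = defaultdict(Fraction)
    inflow[source] = Fraction(demand)
    loads = {}
    # Decreasing distance to the target is a topological order of the
    # shortest-path DAG when all weights are positive.
    for u in sorted(dist, key=dist.get, reverse=True):
        if u == target or not inflow[u] or not next_hops[u]:
            continue
        share = inflow[u] / len(next_hops[u])
        for v in next_hops[u]:
            loads[(u, v)] = share
            inflow[v] += share
    return loads
```

For instance, with links `s→a`, `s→b`, `a→t`, `b→t` of weight 1 and a direct link `s→t` of weight 2, all three routes are shortest and a demand of 6 puts 2 units on each; raising the weight of `s→t` to 199 removes it from the shortest paths and the demand splits 3 and 3.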


OSPF + ECMP (Equal-Cost-MultiPath)
● operators set link weights

how? heuristic approaches:
● local search [Fortz, Thorup, 2000] [Sundaresan et al, 2010]
● memetic algorithms [Buriol et al, 2002]
● genetic algorithms [Ericsson et al, 2002]
● branch-and-cut for mixed ILP [Palmar et al, 2006]

wanted: an algorithm with provable guarantees

TE model: multi-commodity flow

input:
● capacitated graph G=(V,E)
● demand matrix D={d_ij}

constraints:
● flows cannot exceed link capacities
● flows are equally split among all shortest paths

optimization functions:
● maximize total throughput (flow)
● minimize congestion
● minimize sum of link costs

known results: inapproximability for max-flow

Theorem [Fortz, Thorup, 2000]. Given an instance I=(G,D), it is NP-hard to distinguish whether:

OPT(I) = 1 or OPT(I) = 2/3

→ no efficient algorithm can provably route at least a fraction 2/3 + ε of OPT(I), for any ε > 0

our first results: inapproximability for max-flow

Theorem. Given an instance I=(G,D), with a single entry in D, it is NP-hard to distinguish whether:

OPT(I) = 1 or OPT(I) = 2/3

our first results: any-constant inapproximability for max-flow

Theorem. Given an instance I=(G,D), with a single entry in D, it is NP-hard to distinguish whether:

OPT(I) = 1 or OPT(I) ≤ q,

for any constant q > 0

→ no efficient algorithm can provably route at least a fraction q of OPT(I)

key tool: amplification operator X

operator X: instance I → instance I_new

such that

OPT(I_new) = OPT(I)^2

OPT(X(X(I))) = OPT(I)^4

...

amplifying the gap

OPT(I) = 1 or OPT(I) = 2/3: it is NP-hard to distinguish between 1 and ~0.67

OPT(X(I)) = 1 or OPT(X(I)) = 4/9: it is NP-hard to distinguish between 1 and ~0.44

OPT(X^2(I)) = 1 or OPT(X^2(I)) = 16/81: it is NP-hard to distinguish between 1 and ~0.20

…
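The arithmetic behind the shrinking gap can be checked directly. The sketch below is illustrative only; the helper names `amplified_gap` and `applications_needed` are assumptions, and the base gap 2/3 is the value from the slides. Each application of X squares the gap, so a fixed number of applications suffices for any fixed target q.

```python
from fractions import Fraction


def amplified_gap(base_gap, applications):
    """Gap after applying the squaring operator X `applications` times."""
    gap = Fraction(base_gap)
    for _ in range(applications):
        gap = gap * gap
    return gap


def applications_needed(base_gap, q):
    """Smallest number of applications of X that pushes the gap to at most q.

    Assumes 0 < base_gap < 1 and q > 0; otherwise the loop would not terminate.
    """
    gap, count = Fraction(base_gap), 0
    while gap > q:
        gap, count = gap * gap, count + 1
    return count
```

With a base gap of 2/3, one application gives 4/9 and two give 16/81, matching the sequence above; pushing the gap below 1/100 takes four applications.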

amplification gap technique: graph G

● source s
● target t
● capacitated edges (capacities c1, ..., c5)

[figure: graph G from s to t with five edges of capacities c1, ..., c5]

max-flow(s → t) = OPT

amplification gap technique: recursive replacement

[figure: the edge of capacity c2 in G is replaced by a copy of G with source s', target t', and capacities c2·c1, c2·c2, c2·c3, c2·c4, c2·c5]

max-flow(s' → t') = c2·OPT

[figure: after every edge is replaced, the edge of capacity ci behaves like an edge of capacity ci·OPT, for i = 1, ..., 5]


amplification gap technique: graph X(G)

[figure: graph X(G) from s to t, whose five edges have effective capacities c1·OPT, ..., c5·OPT]

max-flow(s → t, X(G)) = OPT · OPT = OPT^2
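The multiplicative effect of the recursive replacement can be tried out on ordinary max-flow. The sketch below is a simplification under stated assumptions: it uses plain max-flow (Edmonds-Karp) rather than ECMP flow, the function names `max_flow` and `amplify` and the example graph are hypothetical, and the graph is assumed to be a DAG in which s has no incoming and t no outgoing edges. It only illustrates why the optimum gets squared, not the operator X as defined for ECMP instances in the talk.

```python
from collections import defaultdict, deque


def max_flow(edges, s, t):
    """Edmonds-Karp max-flow; `edges` is a list of (u, v, capacity)."""
    cap = defaultdict(int)
    adj = defaultdict(set)
    for u, v, c in edges:
        cap[(u, v)] += c
        adj[u].add(v)
        adj[v].add(u)  # residual arc
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return total
        # Recover the path, find its bottleneck, and push flow along it.
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= push
            cap[(v, u)] += push
        total += push


def amplify(edges, s, t):
    """Replace every edge (u, v, c) by a copy of the whole graph scaled by c."""
    new_edges = []
    for j, (u, v, c) in enumerate(edges):
        def rename(x):
            # The copy's source and target are glued onto the endpoints of
            # edge j; its internal nodes get fresh names (j, x).
            return u if x == s else v if x == t else (j, x)
        for a, b, ci in edges:
            new_edges.append((rename(a), rename(b), c * ci))
    return new_edges
```

On the five-edge example `[("s","a",2), ("s","b",1), ("a","b",1), ("a","t",1), ("b","t",2)]` the max-flow is 3, and after one call to `amplify` it is 9.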

datacenter topologies and ECMP

topology constraints:
● d-dimensional hypercubes (e.g., BCube-like) → NP-hard
● l-layer folded Clos networks (e.g., VL2-like) → easy
● random regular graphs (e.g., Jellyfish) → future work


topology constraints:
● d-dimensional hypercubes (e.g., BCube)

[figure: hypercube network with d=4]

computing the best weight assignment is computationally intractable


topology constraints:
● l-layer folded Clos networks (e.g., VL2)

[figure: folded Clos networks with l=2, l=3, and l=4 layers above the servers]

setting all link weights to 1 is optimal

optimality proof sketch

[figure: folded Clos networks with l=2 and l=4 layers above the servers]

ECMP and large flows

per-flow level hash-based split
→ when there are a few large flows, traffic may not be properly load-balanced

[figure: folded Clos network with l=4 layers above the servers]

severe performance degradation [Al-Fares et al, 2010]: 30% of the bandwidth in a datacenter is wasted
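A back-of-the-envelope calculation shows why a few large flows are easily unbalanced by hashing. The sketch below assumes an idealized hash that places each flow on one of the equal-cost paths uniformly and independently; the function name `prob_no_collision` is hypothetical, and the model is an illustration rather than the analysis from the talk or from [Al-Fares et al, 2010].

```python
from fractions import Fraction


def prob_no_collision(num_flows, num_paths):
    """Probability that uniformly hashed flows all land on distinct paths."""
    p = Fraction(1)
    for i in range(num_flows):
        # The (i+1)-th flow must avoid the i paths already taken.
        p *= Fraction(num_paths - i, num_paths)
    return p
```

Under this model, four large flows hashed onto four equal-cost paths end up on four distinct paths with probability only 3/32, so in most cases at least two of them share a path while another path stays idle.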

ECMP, Clos networks, and large flows

proposed solution [Al-Fares et al, 2010]:
● route small flows (mice) using ECMP
● route large flows (elephants) using a greedy algorithm

our results:
● 2-inapproximability (i.e., ratio 1/2)
● greedy is a 1/5-approximation algorithm
● 1/4-approximation if all flows have equal size
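To make the idea of greedily routing elephants concrete, here is a minimal sketch on a two-tier leaf-spine abstraction of a folded Clos network. The slides do not spell out the greedy rule, so this is a generic least-loaded-path heuristic and not necessarily the algorithm whose approximation factor is stated above; the function name `greedy_place`, the flow tuple format, and the node names are assumptions.

```python
from collections import defaultdict


def greedy_place(flows, spines):
    """Assign each large flow to a spine, largest flows first.

    flows:  list of (src_leaf, dst_leaf, size)
    spines: list of spine switch names (one candidate path per spine)
    Returns (assignment, link_load).
    """
    load = defaultdict(int)
    assignment = []
    for src, dst, size in sorted(flows, key=lambda f: -f[2]):
        # Choose the spine whose busier link on the path src -> spine -> dst
        # currently carries the least traffic (ties go to the first spine).
        best = min(spines, key=lambda sp: max(load[(src, sp)], load[(sp, dst)]))
        load[(src, best)] += size
        load[(best, dst)] += size
        assignment.append((src, dst, size, best))
    return assignment, dict(load)
```

For three flows of size 10 (`A→B`, `A→C`, `D→B`) and two spines, this rule places the flows so that no link carries more than one of them, whereas an unlucky hash could put two of them on the same link.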

conclusions

● in general, no efficient algorithm exists to assign the best, or even reasonably good, link weights

● datacenter topologies:
  ● hypercubes → hard to find the best weights
  ● folded Clos networks → set all weights to 1
    ● greedy algorithm for routing large flows is a 1/5-approximation in a 3-layer folded Clos network (VL2-like)

thank you
