Post on 21-Jul-2020

Split/Merge: System Support for Elastic Execution in Virtual Middleboxes

Shriram RAJAGOPALAN (IBM Research & UBC), Dan WILLIAMS (IBM Research), Hani JAMJOOM (IBM Research), Andrew WARFIELD (UBC)

The Problem

Elastic Applications Need Elastic Middleboxes

Figure labels: Clients; Edge Switch; Software Defined Network; Flows; Virtual Middleboxes (IDS/IPS, Firewall, LB, VPN, Accelerator, ...); Elastic Application Tier.

Hotspots Cannot be Alleviated Quickly

Figure: Apps behind virtual middleboxes M1 and M2.

When M1 is overloaded, more middleboxes are provisioned, but they serve only new flows.

Scaling Inefficiencies Lead to Poor Utilization

Figure: Apps behind virtual middleboxes M1, M2, M3, and M4.

In the worst case, active flows in each instance delay scale-in.

The Insight

Flow State is Naturally Partitioned

Figure: Apps behind virtual middlebox M1 and its flow table.

Enabling Elasticity in Virtual Middleboxes

Figure: Apps behind virtual middlebox replicas (M), each with its own flow table.

Dynamic partitioning of flow states among “replicas” enables elastic execution.

Understanding the State Inside a Middlebox

Figure: A middlebox VM. Input (flows) enters the application logic, which produces the output.

State inside the middlebox VM:

• Flow table (key: 5-tuple; value: flow state): partitionable among replicas
• Threshold counters and non-critical statistics: may be shared among replicas (coherent)
• Caches, other processes, ...: internal to a replica (ephemeral)

Our Contribution

Split/Merge: A State-Centric Approach to Elasticity

Figure: A middlebox VM behind a virtual network interface, holding three kinds of state: ephemeral, coherent, and partitionable (flow states).

Split: the VM becomes Replica 1, Replica 2, and Replica 3 (VM 1, VM 2, VM 3). The interfaces are unchanged.

Merge: Replica 2 and Replica 3 are combined, leaving Replica 1 and Replica 2+3 (VM 1, VM 2). Coherency is maintained.
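The split and merge operations can be sketched in Python. This is only an illustration, not FreeFlow's implementation: the `Replica` class, the round-robin assignment of flows, and the rule of summing counters on merge are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical replica holding the three kinds of state from the slide."""
    flows: dict = field(default_factory=dict)     # partitionable: 5-tuple -> flow state
    counters: dict = field(default_factory=dict)  # coherent: shared counters
    cache: dict = field(default_factory=dict)     # ephemeral: never transferred


def split(replica, n):
    """Partition a replica's flow table across n replicas (round-robin, an assumption)."""
    parts = [Replica() for _ in range(n)]
    for i, key in enumerate(sorted(replica.flows)):
        parts[i % n].flows[key] = replica.flows[key]
    # Assumption: the first replica keeps the counter totals; the others
    # start from zero and are combined again on merge.
    parts[0].counters = dict(replica.counters)
    return parts


def merge(a, b):
    """Combine two replicas: union the flow tables, sum the coherent counters."""
    merged = Replica(flows={**a.flows, **b.flows})
    for name in a.counters.keys() | b.counters.keys():
        merged.counters[name] = a.counters.get(name, 0) + b.counters.get(name, 0)
    return merged  # ephemeral caches are dropped and rebuilt on demand
```

Because each flow's state lives in exactly one replica, splitting and then merging returns the original flow table, which is the property the slides rely on.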

Implementation

FreeFlow

• A VMM-based runtime that provides the Split/Merge abstraction to applications
• Developers modify application code to annotate flow state
• FreeFlow takes care of the rest!

FreeFlow: A Split/Merge Implementation

Figure: Traffic to the middlebox (Flow 1, Flow 2, ...) is divided between Replica 1 and Replica 2, each a VM running on its own VMM and holding ephemeral, coherent, and partitionable (flow) state.

• Need to manage application state: a FreeFlow library in each replica
• Need to ensure flows are routed to the correct replica: a FreeFlow module and a controller
• Need to decide when to split or merge a replica: an orchestrator

Annotating State using FreeFlow API

Partitioned State API

create_flow(flow_key, size)
delete_flow(flow_key)
flow_state get_flow(flow_key)
put_flow(flow_key)
flow_timer(flow_key, timeout, callback)

Coherent State API

create_shared(key, size, cb)
delete_shared(key)
state get_shared(key, flags) // synch | pull | local
put_shared(key, flags) // synch | push | local
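A sketch of how an application might use the partitioned state calls. The slide gives only names and arguments, so the in-memory `FlowStore` below, the meaning given to get and put (checking state out and releasing it), the helper methods `has_flow` and `movable`, and the omission of `flow_timer` are all assumptions for illustration.

```python
class FlowStore:
    """Hypothetical in-memory stand-in for the partitioned state API."""

    def __init__(self):
        self._flows = {}            # flow_key -> bytearray holding the flow state
        self._checked_out = set()   # flows the application is currently using

    def create_flow(self, flow_key, size):
        self._flows[flow_key] = bytearray(size)

    def delete_flow(self, flow_key):
        self._flows.pop(flow_key, None)
        self._checked_out.discard(flow_key)

    def get_flow(self, flow_key):
        # Assumption: get marks the state as in use, so the runtime would
        # not move it while the application is working on it.
        self._checked_out.add(flow_key)
        return self._flows[flow_key]

    def put_flow(self, flow_key):
        # Assumption: put releases the state, making the flow movable again.
        self._checked_out.discard(flow_key)

    def has_flow(self, flow_key):
        return flow_key in self._flows

    def movable(self, flow_key):
        return flow_key in self._flows and flow_key not in self._checked_out


def count_packet(store, flow_key):
    """Per-packet handler: bump a one-byte packet counter kept in flow state."""
    if not store.has_flow(flow_key):
        store.create_flow(flow_key, 1)
    state = store.get_flow(flow_key)
    state[0] = (state[0] + 1) % 256
    store.put_flow(flow_key)
    return state[0]
```

Bracketing every access with get and put is what would let a runtime know when a flow's state is safe to transfer to another replica.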

Forwarding Flows Correctly using OpenFlow

OpenFlow Switch, OpenFlow Table: <a> → port 1; <b> → port 2; <c> → port 2; ...

Port 1: Virtual Switch, OpenFlow Table: <a> → port 1; ...
    Middlebox Replica 1, Flow Table: <a>, ...

Port 2: Virtual Switch, OpenFlow Table: <b> → port 1; <c> → port 1; ...
    Middlebox Replica 2, Flow Table: <b>, <c>, ...

Both replicas use the same address: 1.1.1.1 (de:ad:be:ef:ca:fe).
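The two-level lookup on this slide can be sketched as a pair of table lookups. The table contents are copied from the slide; the `route` function and the mapping from switch port to replica name are assumptions added for illustration.

```python
# "<a>", "<b>", "<c>" stand for the flow match rules shown on the slide.
SWITCH_TABLE = {"<a>": 1, "<b>": 2, "<c>": 2}   # flow -> OpenFlow switch port
VSWITCH_TABLES = {
    1: {"<a>": 1},               # virtual switch behind switch port 1
    2: {"<b>": 1, "<c>": 1},     # virtual switch behind switch port 2
}
REPLICA_BEHIND_PORT = {1: "replica 1", 2: "replica 2"}  # hypothetical naming


def route(flow):
    """Return the replica that receives a flow, or None if no rule matches."""
    port = SWITCH_TABLE.get(flow)
    if port is None:
        return None
    if flow not in VSWITCH_TABLES[port]:
        return None
    return REPLICA_BEHIND_PORT[port]
```

Since both replicas share one IP and MAC address, only these per-flow rules decide which replica sees a packet.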

Flow Migration

Migrating <b> from replica 2 to replica 1:

1. Initial state. OpenFlow switch table: <a> → port 1; <b> → port 2; <c> → port 2. Virtual switch at port 1: <a> → port 1; replica 1 flow table: <a>. Virtual switch at port 2: <b> → port 1; <c> → port 1; replica 2 flow table: <b>, <c>.
2. Suspend flow & buffer packets. The OpenFlow switch entry for <b> now points to the controller, and <b> is removed from the virtual switch table at port 2.
3. Move flow state to target. <b> moves from replica 2's flow table to replica 1's flow table.
4. Release buffer & resume flow. The OpenFlow switch entry becomes <b> → port 1, and the virtual switch at port 1 gains <b> → port 1.

Both replicas use the same address throughout: 1.1.1.1 (de:ad:be:ef:ca:fe).
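The migration sequence (suspend and buffer, move state, release and resume) can be sketched as follows. The `Network` class is a hypothetical model: it folds the virtual switch into the replica, uses port numbers as replica numbers, and takes an `in_flight` argument to stand for packets that arrive while the flow is suspended.

```python
from collections import deque

CONTROLLER = "controller"


class Network:
    """Hypothetical model of the switch table, replica flow tables, and buffer."""

    def __init__(self):
        self.switch = {"<a>": 1, "<b>": 2, "<c>": 2}   # flow -> port or CONTROLLER
        self.replicas = {1: {"<a>": "state-a"},
                         2: {"<b>": "state-b", "<c>": "state-c"}}
        self.buffer = {}       # flow -> packets held while the flow is suspended
        self.delivered = []    # (replica, packet) pairs, in delivery order

    def send(self, flow, packet):
        target = self.switch[flow]
        if target == CONTROLLER:
            self.buffer.setdefault(flow, deque()).append(packet)
        else:
            self.delivered.append((target, packet))

    def migrate(self, flow, src, dst, in_flight=()):
        # Step 1: suspend the flow; its packets now go to the controller.
        self.switch[flow] = CONTROLLER
        for packet in in_flight:
            self.send(flow, packet)
        # Step 2: move the flow state to the target replica.
        self.replicas[dst][flow] = self.replicas[src].pop(flow)
        # Step 3: point the switch at the target, then release the buffered
        # packets in arrival order before any new packet is forwarded.
        self.switch[flow] = dst
        for packet in self.buffer.pop(flow, ()):
            self.delivered.append((dst, packet))
```

In this model, buffering during the move means no packet reaches a replica that does not hold the flow's state, and packet order is preserved.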

Managing Coherent State

create_shared(key, size, cb)
delete_shared(key)
state get_shared(key, flags) // synch | pull | local
put_shared(key, flags) // synch | push | local

Strong Consistency

create_shared("foo", 4, NULL)
while (1)
    process_packet()
    p_foo = get_shared("foo", synch)
    val = (*p_foo)++
    put_shared("foo", synch)

    if (val > threshold)
        bar()

Distributed lock for every update.

Middlebox applications rarely need strong consistency!
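To make the cost of the strongly consistent mode concrete, here is a small simulation. It is a sketch under loud assumptions: threads stand in for replicas, a `threading.Lock` stands in for the distributed lock, and `SharedCounter` is a hypothetical name rather than part of FreeFlow.

```python
import threading


class SharedCounter:
    """Hypothetical shared value; the lock stands in for a distributed lock."""

    def __init__(self):
        self.value = 0
        self.lock_acquisitions = 0
        self._lock = threading.Lock()

    def increment(self):
        # Mirrors get_shared(..., synch) followed by put_shared(..., synch):
        # every single update takes the lock.
        with self._lock:
            self.lock_acquisitions += 1
            self.value += 1
            return self.value


def run_replicas(counter, replicas, packets_each):
    """Simulate replicas as threads, each updating the counter once per packet."""
    def work():
        for _ in range(packets_each):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(replicas)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

The count is always exact, but the number of lock acquisitions equals the number of packets, which is the overhead the slide points to.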


Eventual Consistency

create_shared("foo", 4, merge_fn)
while (1)
    process_packet()
    p_foo = get_shared("foo", local)
    val = (*p_foo)++
    put_shared("foo", local)

    if (val > threshold)
        bar()
        put_shared("foo", push)

High-frequency local updates, periodic global updates.
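The eventually consistent pattern can be sketched as local counters that push their accumulated delta through a merge function. `GlobalStore`, `LocalCounter`, and the choice to push once the unpushed count exceeds a threshold are assumptions for illustration; the slide only shows that a merge callback is registered and that pushes are occasional.

```python
class GlobalStore:
    """Hypothetical shared store; merge_fn folds a pushed delta into the global value."""

    def __init__(self, merge_fn):
        self.value = 0
        self.merge_fn = merge_fn
        self.pushes = 0

    def push(self, delta):
        self.value = self.merge_fn(self.value, delta)
        self.pushes += 1


class LocalCounter:
    """Per-replica copy: cheap local updates, occasional push of the delta."""

    def __init__(self, store, threshold):
        self.store = store
        self.threshold = threshold
        self.pending = 0          # local updates not yet pushed

    def on_packet(self):
        self.pending += 1         # corresponds to get/put with the local flag
        if self.pending > self.threshold:
            self.flush()          # corresponds to put with the push flag

    def flush(self):
        if self.pending:
            self.store.push(self.pending)
            self.pending = 0
```

With a threshold of 9, thirty-five packets spread over two replicas need only a handful of global updates instead of thirty-five, at the price of the global value lagging behind between pushes.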

Evaluation

Evaluation Overview

• Eliminating hotspots during scale-out
• Fast and efficient scale-in
• Split/Merge Bro during a load burst

Eliminating Hotspots by Shedding Load

Figure: Maximum latency (ms) over time (s), with and without FreeFlow.

Figure: Average latency (ms) over time (s), with and without FreeFlow.

Scaling-In a Deployment : Best Case Scenario

Figure: Apps behind virtual middleboxes M1, M2, M3, and M4; after scale-in, M1, M2, and M3 remain.

Figure: Average system utilization (%) over time (s) in the best case.

Scaling-In using kill : Worst Case Scenario

Figure: Apps behind virtual middleboxes M1, M2, M3, and M4.

In the worst case, active flows in each instance delay scale-in.

Figure: Average system utilization (%) over time (s): best case vs. worst case (no FreeFlow).

Scaling-in using kill is slow and inefficient.

Scaling-In using merge : Worst Case Scenario

Figure: Apps behind four virtual middlebox replicas (M) managed by FreeFlow; the replicas are merged step by step.

FreeFlow consolidates state and flows from multiple replicas into one (M 1+2+3+4).

Figure: Average system utilization (%) over time (s): best case, worst case (no FreeFlow), and worst case with FreeFlow.

Scaling-in using merge is fast and efficient.

Splitting & Merging Bro IDS

• Ported the Event Engine to FreeFlow
• Support for UDP, TCP/HTTP protocols
• SQL Injection Detection plugin

Figure: SQL injection attacks directed at web servers pass through a Bro middlebox (M) running on FreeFlow.

Handling a Load Burst

Figure: Attacks detected (%) over time (s) across a load burst, for three configurations: unscaled, pre-scaled, and dynamically scaled with FreeFlow. In each setup, SQL injection attacks directed at web servers pass through Bro middleboxes.

No Scaling: Without enough capacity to handle the load burst, system performance degrades severely.

Pre-Scaled: Two instances (M1, M2) are provisioned a priori, enough to handle a load burst, if any. The load burst has no impact on system performance, as there is enough capacity to handle the load.

Split/Merge:

• One replica handles the load well before the load burst.
• Split: when the load burst starts, the Orchestrator splits the replica and rebalances the load.
• With the load rebalanced, performance returns to normal.
• Merge: when system utilization drops after the load burst, the Orchestrator merges the two replicas.
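The slides do not say how the Orchestrator decides when to act. One plausible policy, shown purely as an assumption, is a pair of utilization thresholds; the function name `decide` and the values 0.8 and 0.3 are illustrative and do not come from the talk.

```python
def decide(utilizations, high=0.8, low=0.3):
    """Return 'split', 'merge', or 'hold' for a list of per-replica utilizations.

    The thresholds are illustrative assumptions, not values from the talk.
    """
    if not utilizations:
        raise ValueError("at least one replica is required")
    if max(utilizations) > high:
        return "split"                  # some replica is a hotspot
    if len(utilizations) > 1:
        two_lowest = sorted(utilizations)[:2]
        # Merge only if the combined replica would stay below the split
        # threshold, so a merge does not immediately trigger a new split.
        if all(u < low for u in two_lowest) and sum(two_lowest) < high:
            return "merge"
    return "hold"
```

Keeping a gap between the two thresholds is a common way to avoid oscillating between split and merge when load hovers near a single cutoff.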


Summary

Flow Migration Overhead - TCP

Figure: TCP throughput (Mbits/s) over time (s), baseline vs. FreeFlow, with the flow migration marked.

Flow Migration Overhead - UDP

Figure: UDP packet latency (ms) over time (s) at 150 Mbps, 200 Mbps, and 250 Mbps.

Figure: UDP throughput (Mbits per 50 ms window) over time (s) at 150 Mbps, 200 Mbps, and 250 Mbps.
