
Design principles in parser design

Glen Gibb

Dept. of Electrical Engineering

Advisor: Prof. Nick McKeown

Tuesday, May 14, 13

2

Header parsing?

Identify headers & extract fields

> 1 billion packets / second: new packet every ns

[Figure: a packet of unidentified bytes ("??") is resolved into headers A, B, C. The extracted fields (Source, Dest., Proto.) feed a next-hop lookup (ports 1, 2, 3, 4) and a firewall rule, "Host X can talk to host Y except via HTTP" (ALLOW / DENY).]

3

Almost no prior work

4

Leaping Multiple Headers in a Single Bound: Wire-Speed Parsing Using the Kangaroo System
C. Kozanitis, J. Huber, S. Singh, & G. Varghese
INFOCOM 2010

Programmable parser
Parses multiple headers per cycle
Receives all headers before parsing → high latency

5

400 Gb/s Programmable Packet Parsing on a Single FPGA
M. Attig & G. Brebner
ANCS 2011

Language to describe header sequences
Compile into efficient designs on FPGA

FPGA-centric: commercial switches are ASICs
Extremely deep pipeline (100+ stages)

6

Neither paper analyzes design trade-offs or presents design principles

7

Outline

1. Packet parsing
2. Understanding parser design
3. Providing flexibility

8

Packet parsing

Network review
Parsing process

9

Internet


10

[Figure: a switch with output ports 1, 2, 3, 4. A table maps Packet Color to Output Port; each of four colors (⬤) maps to one of the ports 1-4.]

11

Packet: Header 1 (Ethernet), Header 2 (VLAN), Header 3 (IPv4), Payload

Header: Field 1 (Source Address), Field 2 (Destination Address), Field 3, ..., Field n

Destination   Port
A             1
B             2
C             3
D             4

12

[Figure: switch pipeline. Packets In → Parser → Header fields → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Out.]

Fields extracted per header:
Ethernet: Src MAC, Dst MAC, Eth Type
VLAN: VLAN ID, Priority
IP: Src IP, Dst IP, Protocol
TCP: Src Port, Dst Port

Fields delivered to the match tables, in slide order:
Ethernet Forwarding: Src MAC, Dst MAC, Eth Type, VLAN ID
IP Routing: Eth Type, Dst IP
Access Control List: Src MAC, Dst MAC, Eth Type, VLAN ID, Src IP, Dst IP, Protocol, Priority, Src Port, Dst Port

13

Packet parsing

Network review
Parsing process

14

Parsing: identify headers & extract fields

Example header sequences: A B C; A D; A B B

[Figure: parsing walk-through. The packet starts as unidentified bytes ("??"). The first header is A, and a field in A gives "Next: B". Header B gives "Len: 20B" and "Next: C"; header C's "Next:" field is read in the same way. Fields are extracted as each header is identified, and the extracted fields feed a next-hop lookup (ports 1, 2, 3, 4).]
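The walk-through above can be sketched in a few lines of Python. This is only an illustration, not the parser from the talk: the toy header layout (a 1-byte next-header code followed by a 1-byte length), the code table `NEXT_CODES`, and the helpers `parse` and `build` are all assumptions made for the example.

```python
# Toy header format (an assumption for illustration): every header starts with
# a 1-byte "next header" code and a 1-byte total header length in bytes.
NEXT_CODES = {0: None, 1: "A", 2: "B", 3: "C", 4: "D"}  # 0 = no further header


def parse(packet: bytes, first: str = "A"):
    """Identify headers one after another; return (type, offset, length) tuples."""
    headers = []
    offset, current = 0, first
    while current is not None:
        if offset + 2 > len(packet):
            raise ValueError("truncated header")
        next_code, length = packet[offset], packet[offset + 1]
        if length < 2 or offset + length > len(packet):
            raise ValueError("bad header length")
        headers.append((current, offset, length))
        # Only now are the next header's type and location known: the type
        # comes from a field in this header, the location from its length.
        current = NEXT_CODES[next_code]
        offset += length
    return headers


def build(next_code: int, length: int) -> bytes:
    """Build one toy header, zero-padded to `length` bytes."""
    return bytes([next_code, length]) + bytes(length - 2)


# A (4 B) -> B (20 B) -> C (8 B), followed by a payload.
packet = build(2, 4) + build(3, 20) + build(0, 8) + b"payload"
result = parse(packet)
print(result)
```

The loop makes the serial dependency visible: header B cannot be located or identified until header A has been read.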

15

Parse graphs

[Figure: parse graph with nodes A, B, C, D, E, F. The packet A C D F corresponds to one path through the graph.]

A: Extract fields: 1, 2
B: Extract fields: 2
C: Extract fields: 1
D: Extract fields: 2, 4
E: Extract fields: 2
F: Extract fields: 1, 2

Parse graph is the state machine
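One way to read "parse graph is the state machine" is sketched below. The per-node field lists come from this slide; the edges (A→B, A→C, C→D, C→E, D→F, E→F) are the ones listed on the parser-generator slide later in the deck. The function name `walk` and the dictionary representation are assumptions for illustration.

```python
# Per-node fields to extract (from the slide) and allowed next-header
# transitions (edges as listed on the later parser-generator slide).
EXTRACT = {"A": [1, 2], "B": [2], "C": [1], "D": [2, 4], "E": [2], "F": [1, 2]}
EDGES = {"A": {"B", "C"}, "B": set(), "C": {"D", "E"},
         "D": {"F"}, "E": {"F"}, "F": set()}


def walk(path, start="A"):
    """Run the parse graph as a state machine over a sequence of header types.

    Returns the (header, field) pairs extracted along the way, or raises
    ValueError if the sequence is not a path in the graph.
    """
    if not path or path[0] != start:
        raise ValueError("packet must start with header " + start)
    state = path[0]
    extracted = [(state, f) for f in EXTRACT[state]]
    for header in path[1:]:
        if header not in EDGES[state]:
            raise ValueError(f"no edge {state}->{header}")
        state = header
        extracted += [(state, f) for f in EXTRACT[state]]
    return extracted


fields = walk(["A", "C", "D", "F"])
print(fields)
```

A sequence such as A, D is rejected because the graph has no edge A→D.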

16

Parse graphs in the field

[Figure: four parse graphs, listed here by the header types they contain.]

Enterprise: Ethernet, VLAN, VLAN, IPv4, IPv6, TCP, UDP, ICMP, ARP/RARP
Data center: Ethernet, VLAN, VLAN, IPv4, GRE, NVGRE, Ethernet, ARP/RARP, TCP, UDP, VXLAN
Service provider: Ethernet, IPv4, IPv6, MPLS (x5)
Enterprise edge: Ethernet, IPv4, IPv6, ARP, RARP, TCP, UDP, GRE, IPsec ESP, IPsec AH, SCTP

[Figure: combined parse graph containing Ethernet, VLAN (802.1Q) (x2), VLAN (802.1ad), PBB (802.1ah), MPLS (x5), EoMPLS, Ethernet, IPv4, IPv6, ARP, RARP, ICMP, ICMPv6, TCP, UDP, GRE, IPsec ESP, IPsec AH, SCTP, VXLAN, NVGRE, IPv4, IPv6.]

17

What makes parsing hard?

• Many headers
• Many paths
• Variable path lengths
• Variable header lengths
• Header identified by previous
• Line rate
• Aggressive latency
• Area & power constrained

[Figure: the combined parse graph from the previous slide.]

[Figure: example packet. Ethernet (Len: 14B, Next: IPv4), IPv4 (Len: 20-60B; here Len: 20B, Next: TCP), TCP (Len: 20-60B; here Len: 20B), Payload.]

64 x 10 Gb/s switch:
• 1 billion pkts/sec
• 250ns port-to-port
• 40W
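The "1 billion pkts/sec" figure can be checked with a short calculation. The sketch below assumes worst-case traffic of minimum-size Ethernet frames (64 bytes plus 8 bytes of preamble and a 12-byte inter-frame gap); the slide does not state which packet size it assumes, so treat the frame size as an assumption.

```python
PORTS = 64
LINE_RATE_BPS = 10e9          # 10 Gb/s per port
MIN_FRAME_BYTES = 64          # minimum Ethernet frame (assumed worst case)
OVERHEAD_BYTES = 8 + 12       # preamble/SFD + inter-frame gap

bits_per_packet = (MIN_FRAME_BYTES + OVERHEAD_BYTES) * 8   # 672 bits on the wire
pps_per_port = LINE_RATE_BPS / bits_per_packet             # packets/s on one port
total_pps = PORTS * pps_per_port                           # packets/s for the switch
ns_per_packet = 1e9 / total_pps                            # time budget per packet

print(f"{pps_per_port / 1e6:.2f} Mpps per port")
print(f"{total_pps / 1e6:.0f} Mpps total, one packet every {ns_per_packet:.2f} ns")
```

Under these assumptions the switch sees roughly 950 million packets per second, which matches the slide's "1 billion pkts/sec" and the earlier "new packet every ns".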

18

Implementing a parser

[Figure: parser block diagram. Packet data enters Header Identification (which follows the parse graph A-F) and Field Extraction. Header Identification passes header types & locations to Field Extraction, which writes extracted fields into the Extracted Field Buffer. In the animation, headers A, B, C are identified in turn, their Source, Dest, and Proto fields are copied into the Extracted Field Buffer, and the buffer feeds an Access Control lookup (ALLOW / DENY).]

19

Implementing a parser

[Figure: the same block diagram. Packet data → Header Identification and Field Extraction → Extracted Field Buffer → extracted fields, with header types & locations passed from Header Identification to Field Extraction.]

Header Identification: state machine (the parse graph A-F)

Field Extraction: table of fields to extract per header

Header   Extract Fields
A        A1, A2
B        B1
C        C2, C4
⋯        ⋯

20

Data processing width?

[Figure: the same packet (positions 0, 4, 8, 12 B) parsed against the parse graph A-F at two processing widths.]

4 cycles, 1 decision/cycle
2 cycles, 2 decisions/cycle
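The trade-off on this slide can be reproduced with a small calculation. The sketch assumes four 4-byte headers starting at byte positions 0, 4, 8, and 12 (the header sizes are an assumption; the slide only shows the positions) and counts a "decision" as identifying one header. The function name `cycles_and_decisions` is hypothetical.

```python
import math


def cycles_and_decisions(header_lengths, width):
    """Return (cycles needed to consume all headers at `width` bytes per cycle,
    largest number of headers that start within a single cycle)."""
    total = sum(header_lengths)
    cycles = math.ceil(total / width)
    per_cycle = [0] * cycles
    offset = 0
    for length in header_lengths:
        per_cycle[offset // width] += 1   # this header starts in that cycle
        offset += length
    return cycles, max(per_cycle)


headers = [4, 4, 4, 4]                    # assumed: headers at 0, 4, 8, 12
narrow = cycles_and_decisions(headers, 4)
wide = cycles_and_decisions(headers, 8)
print(narrow, wide)
```

Doubling the width halves the cycle count but requires the parser to identify two headers in one cycle, which is where the extra logic comes from.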

21

[Figure: a collection of parser variants, labeled with processing widths (1B, 2B, 3B, 16B) and rates (10 Gb/s, 20 Gb/s, 100 Gb/s).]

Parser construction
Prototype: 2 months

22

Understanding parser design

Parser generator
Trade-offs in parser design

23

[Figure: design flow. Inputs (parse graph, processing width, clock, parsers per chip, …) → Parser Generator → Parser (Verilog, .v) → Synthesis → Netlist, Layout, Reports: area, power, timing.]

24

[Figure: the same design flow. Inputs (parse graph, processing width, clock, parsers per chip, …) → Parser Generator → Parser (Verilog, .v) → Synthesis → Netlist, Layout, Reports: area, power, timing.]

Parser Generator is built on Genesis [Shacham et al., IEEE Micro ’10]:
Architectural Template + Per-Application Configuration (A = 1, B = 12) → Design Instance

Parser architectural template: mixed Perl/Verilog

    //; foreach my $header (@headers) {
    //;   my $hdrParser = generate('hdr_parser',
    //;                            "hdr_parser_" . $n++,
    //;                            Header => $header);
      `$hdrParser->instantiate()` (
        .pkt_data (pkt_data),

25

Parser design

Parser Generator inputs: parse graph, processing width
Parser Generator output: parser (.v)

Parse Graph & Header Formats:

    header {
      name: ____
      fields: ____
      extract: ____
      next-header: ____
    }
    ...

[Figure: the generator walks the parse graph A-F and emits logic per node and per edge: A (A→B, A→C), C (C→D, C→E), D (D→F), E (E→F).]
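A header description of this shape carries enough information to derive the parse graph and the size of the extracted-field buffer. The sketch below is a hypothetical in-memory form of the description, not the generator's actual input syntax: the helper `header`, the field names, the field widths, and the next-header codes are all invented for illustration. Only the graph shape (A→B, A→C, C→D, C→E, D→F, E→F) comes from the slide.

```python
def header(name, fields, extract, next_field=None, next_map=None):
    """One header description: field widths in bits, fields to extract,
    and an optional next-header field with its code -> header mapping."""
    return {"name": name, "fields": fields, "extract": extract,
            "next_field": next_field, "next_map": next_map or {}}


HEADERS = [
    header("A", {"kind": 8, "a1": 16}, ["a1"], "kind", {1: "B", 2: "C"}),
    header("B", {"b1": 32}, ["b1"]),
    header("C", {"kind": 8, "c1": 16}, ["c1"], "kind", {1: "D", 2: "E"}),
    header("D", {"kind": 8, "d1": 8}, ["d1"], "kind", {1: "F"}),
    header("E", {"kind": 8, "e1": 8}, ["e1"], "kind", {1: "F"}),
    header("F", {"f1": 16}, ["f1"]),
]


def edges(headers):
    """Derive parse-graph edges from the next-header mappings."""
    names = {h["name"] for h in headers}
    out = []
    for h in headers:
        for _, target in sorted(h["next_map"].items()):
            if target not in names:
                raise ValueError(f"{h['name']} refers to undefined header {target}")
            out.append((h["name"], target))
    return out


def extracted_bits(headers):
    """Total width of all fields marked for extraction."""
    return sum(h["fields"][f] for h in headers for f in h["extract"])


graph = edges(HEADERS)
print(graph)
print(extracted_bits(HEADERS), "bits of extracted fields")
```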

26

[Figure: two views of a packet with headers A, B, C, D and the Next Header field.]

Requires buffering to delay processing

Process all data by packet end ⇒ more data some cycles

27

Meeting throughput needs

r = f・w
(r: throughput (rate), f: frequency, w: data width)

[Figure: one parser of width w, or n parsers (Parser 1 … Parser n), each of width w/n.]

r = n・f・w/n
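The relation r = f・w (and r = n・f・w for n parsers of width w each, as used on the following slides) fixes the clock once rate and width are chosen. A minimal sketch, with the function name `required_clock_hz` and the example operating points chosen here for illustration:

```python
def required_clock_hz(rate_bps, width_bytes, instances=1):
    """Clock needed so that `instances` parsers, each `width_bytes` wide,
    together reach `rate_bps`. Follows r = n * f * w with w in bits/cycle."""
    return rate_bps / (instances * width_bytes * 8)


# A single 10 Gb/s parser at different processing widths.
for width in (1, 2, 4, 8, 16):
    f = required_clock_hz(10e9, width)
    print(f"{width:2d} B wide: {f / 1e6:7.1f} MHz for 10 Gb/s")

# 640 Gb/s aggregate: one 16 B parser versus sixteen 16 B parsers.
single = required_clock_hz(640e9, 16)
many = required_clock_hz(640e9, 16, instances=16)
print(single / 1e9, "GHz for one instance")
print(many / 1e6, "MHz for each of sixteen instances")
```

A 1-byte datapath needs a 1.25 GHz clock for 10 Gb/s, while a single 16-byte parser would need 5 GHz for 640 Gb/s, which is why high aggregate rates are met by combining wider datapaths with multiple instances.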

28

Understanding parser design

Parser generator
Trade-offs in parser design

29

Data processing width?

r = n・f・w
(n parsers, Parser 1 … Parser n, each of width w; r is fixed for the switch)

Single instance: build a single parser of rate r
(r = const, n = 1 ⇒ f ∝ 1/w)

Multiple instances: build multiple parsers with total rate r
(r = const, f = const ⇒ n ∝ 1/w)

30

Single parser instance

[Plot: gates (0-8 M) and power (0-600 mW) versus processing width (2, 4, 8, 16 B). 10 Gb/s, big parse graph.]

Area: narrow width
Power: slow clock

31

Aggregating parsers

[Plot: size in gates (0-2 M) and power (0-600 mW) versus rate per instance (10-80 Gb/s). 640 Gb/s, big parse graph.]

Area: independent of instance rate and count
Power: prefer fewer fast parsers

32

Parse graph impacts area

[Plot: gates (0-2 M) versus rate per instance (10, 20, 40, 80 Gb/s) for the Enterprise, Enterprise Edge, Service Provider, and Big parse graphs. 640 Gb/s aggregate.]

Why?

33

Extracted fields dominate area

[Plot: gates (0-2 M) for the Enterprise, Enterprise Edge, Service Provider, and Composite parse graphs, split into Field Result Buffer, Field Extraction, and Header Identification. 640 Gb/s, 40 Gb/s per instance. Bar annotations, in order: 672 b, 888 b, 688 b, 1664 b.]

34

[Plot: gates (0-2 M) versus Field Result Buffer Width (0-2000 b). 640 Gb/s, 40 Gb/s per instance. Points labeled 672b, 888b, 688b, and 1672b, plus a point for a graph with 3 headers and extracted fields of 1672b.]

Area determined by extracted field buffer size

35

Design principles

Single parser instances
  area → minimize by reducing width
  power → minimize by reducing clock

Aggregating instances for throughput
  area → independent of instance rate & count
  power → minimize using few fast instances

Extracted field buffer dominates area
  area → determined by total extracted field size

36

Providing flexibility

RMT model
Programmable parser
Generating parse table entries

37

Parser specific to one parse graph

[Figure: a switch containing a fixed parser, labeled S1.]

Switch = S1

38

[Figure: switch pipeline. Packets In → Parser → Header fields → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Out.]

• CPU
• GPU
• FPGA
• OpenFlow/SDN?

39

Multiple Match Table (MMT)

[Figure: Packets In → Parser → Header fields → Match Tables (Ethernet Forwarding, IP Routing, Access Control List) → Action Processing → Queues → Out.]

Reconfigurable Multiple Table (RMT)

[Figure: Packets In → Programmable Parser → Reconfigurable Match + Action Tables → Recombine → Queues → Out.]

40

RMT architecture

[Figure: a packet entering at IN is split into HEADER and DATA. The header (H) passes through Stage 1 … Stage n, each a Match Table followed by an Action, and is then merged with the Data at Recombine before the Output Queues and OUT.]

41

RMT Match Tables

[Figure: Logical Tables 1-6 mapped onto Physical Stage 1, Physical Stage 2, …, Physical Stage n.]

42

Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN
P. Bosshart, G. Gibb, H.S. Kim, G. Varghese, N. McKeown, M. Izzard, F. Mujica & M. Horowitz
SIGCOMM 2013 [to appear]

43

Providing flexibility

RMT model
Programmable parser
Generating parse table entries

44

Providing programmability

[Figure: fixed parser. Header Identification hard-codes the parse graph A-F (for example C: C→D, C→E; D: D→F; E: E→F). Field Extraction hard-codes the per-header table (A: A1, A2; B: B1; C: C2, C4; ⋯). Packet data in, header types & locations passed between the blocks, extracted fields out through the Extracted Field Buffer.]

Replace hard-coded logic with programmable logic

Curr. State   Match Values   Next State
A             A1, A2         B
B             B1             --
C             C2, C4         D
⋯             ⋯              ⋯

45

Current State   Match Values   Next State
A               11 (A→B)       B
A               A→C            C
C               C→D, D→F       F
C               C→E            E

[Figure: parse graph A-F, stepped through one table entry at a time.]

46

Parser state table

Columns, in the order they are introduced:
Current State
Match Values
Next State
Header Length (next header location)
Next Match Offsets (next match locations)
Next Lookup Mask
Extract Fields

Storage labels on the slide: TCAM or RAM; RAM; Optional
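A software model of such a state table can make the lookup concrete. The sketch below is a simplification under stated assumptions: it keeps only the Current State, Match Values, Next State, Header Length, and Extract Fields columns, models the TCAM as a first-match list with a value/mask pair per entry, and omits Next Match Offsets and Next Lookup Mask. The class `Entry`, the function `lookup`, the state names, and the field names are hypothetical; the EtherType values (0x0800 for IPv4, 0x8100 for 802.1Q VLAN) and the 14-byte Ethernet and 4-byte VLAN tag lengths are standard.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    state: str          # current state
    value: int          # match value
    mask: int           # 1 bits are compared, 0 bits are "don't care"
    next_state: str
    header_length: int  # bytes to advance to reach the next header
    extract: tuple      # names of fields copied to the field buffer


def lookup(table, state, key):
    """TCAM-style lookup: first entry whose state matches and whose masked
    value equals the masked key; None if nothing matches."""
    for entry in table:
        if entry.state == state and (key & entry.mask) == (entry.value & entry.mask):
            return entry
    return None


TABLE = [
    Entry("ethernet", 0x0800, 0xFFFF, "ipv4", 14, ("dst_mac", "src_mac")),
    Entry("ethernet", 0x8100, 0xFFFF, "vlan", 14, ("dst_mac", "src_mac")),
    Entry("ethernet", 0x0000, 0x0000, "done", 14, ("dst_mac", "src_mac")),  # wildcard
    Entry("vlan",     0x0800, 0xFFFF, "ipv4", 4,  ("vlan_id",)),
]

hit = lookup(TABLE, "ethernet", 0x8100)
print(hit.next_state, hit.header_length)
print(lookup(TABLE, "ethernet", 0x1234).next_state)   # falls through to wildcard
```

Reprogramming the parser for a new parse graph then means rewriting `TABLE`, not changing the lookup logic.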

47

Cost of programmability

[Plot: gates (0-7 M) for the Fixed and Programmable parsers, split into Extracted Field Buffer, Hdr Ident/Field Extract, TCAM (State Table), and RAM (State Table). Area annotations: 4.4 mm² and 2.6 mm².]

Programmability costs 1.5-3x
State table size determines area increase

48

Take-aways

Cost of programmability
  1.5-3x fixed parser area

State table dominates additional area
  area → minimize TCAM and RAM
  Parse graph edge count determines table size

49

Providing flexibility

RMT model
Programmable parser
Generating parse table entries

50

Naïve generation of state table entries

[Plot: TCAM table size (0-150 Kb) versus processing width (1-16 B).]

51

State table entry generation

Current State   Match Values   Next State
A               11 (A→B)       B
A               A→C            C
C               C→D, D→F       F
C               C→E            E

[Figure: parse graph A-F.]

Merge nodes to minimize edges

Problem: graph clustering is NP-hard
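The table on this slide can be regenerated from the parse graph once a clustering is chosen. The sketch below does not attempt the NP-hard minimization; it takes a hand-picked clustering ({A} and {C, D}, inferred from the table rows) and enumerates one entry per path that leaves a cluster. The function name `cluster_entries` and the tuple layout are assumptions for illustration.

```python
# Parse graph edges as listed earlier in the deck.
EDGES = {"A": ["B", "C"], "B": [], "C": ["D", "E"],
         "D": ["F"], "E": ["F"], "F": []}


def cluster_entries(root, members):
    """Enumerate table entries for one cluster: every path that starts at
    `root`, stays inside `members`, and ends at the first node outside it.
    Each entry is (current state, matched edges, next state)."""
    entries = []

    def extend(node, path):
        for nxt in EDGES[node]:
            step = path + [f"{node}→{nxt}"]
            if nxt in members:
                extend(nxt, step)     # still inside the cluster: keep matching
            else:
                entries.append((root, ", ".join(step), nxt))

    extend(root, [])
    return entries


table = cluster_entries("A", {"A"}) + cluster_entries("C", {"C", "D"})
for row in table:
    print(row)
```

Merging D into C's cluster lets one entry match C→D and D→F together, so the parser reaches F in a single lookup; the cost is that the number of entries depends on how many paths cross each cluster.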

52

Kangaroo

Intuition: iteratively identify minimal edge clustering starting at leaves

Kangaroo’s algorithm:
• access to data anywhere in header region
• non-minimal solutions for non-trees

Improving solution for non-trees

[Figure: non-tree parse graph; label: two independent solutions]

Solution: solve shared regions independently

Streaming-aware algorithm

[Figure: Kangaroo vs. streaming; label in the streaming panel: Next Hdr]

Streaming-aware algorithm

Kangaroo:

$$\mathrm{OPT}(n, b) = \min_{c \in \mathrm{Clusters}(n)} \left( \mathrm{entries}(c) + \sum_{j \in \mathrm{Fringe}(c)} \mathrm{OPT}(j, \ldots) \right)$$

Streaming:

$$\mathrm{OPT}(n, b, w) = \min_{c \in \mathrm{Clusters}(n, w)} \left( \mathrm{entries}(c) + \sum_{j \in \mathrm{Fringe}(c)} \mathrm{OPT}(j, \ldots, \mathrm{NewLoc}(w, j, c)) \right)$$

New parameter: window location (w)

Node clusters restricted by:
• window location
• window size

Updated location for subgraphs: NewLoc(w, j, c)
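The shape of the Kangaroo-style recurrence can be sketched as a small memoized recursion. This is a toy model under loudly stated assumptions, not the algorithm from the talk: it handles trees only, it models entries(c) as the number of fringe nodes of c, it replaces the slide's b argument and the elided ". . ." arguments with a hypothetical cap `max_nodes` on cluster size, and it omits the streaming window w entirely. The tree reuses the edges from the state table example.

```python
from functools import lru_cache

# Toy parse tree using the edges from the earlier state table example.
CHILDREN = {
    "A": ("B", "C"), "B": (), "C": ("D", "E"),
    "D": ("F",), "E": (), "F": (),
}


def clusters(root, max_nodes):
    """Yield each connected set of non-leaf nodes that contains `root`."""
    seen, stack = set(), [frozenset([root])]
    while stack:
        c = stack.pop()
        if c in seen:
            continue
        seen.add(c)
        yield c
        if len(c) < max_nodes:
            for node in c:
                for child in CHILDREN[node]:
                    # Grow only through non-leaf children (toy assumption).
                    if child not in c and CHILDREN[child]:
                        stack.append(c | {child})


def fringe(c):
    """Nodes just outside cluster `c`: children of members that are not members."""
    return [ch for node in sorted(c) for ch in CHILDREN[node] if ch not in c]


@lru_cache(maxsize=None)
def opt(n, max_nodes):
    """Fewest table entries for the subtree rooted at `n` in this toy model."""
    if not CHILDREN[n]:
        return 0  # leaf: parsing ends, no further entries
    return min(
        len(fringe(c)) + sum(opt(j, max_nodes) for j in fringe(c))
        for c in clusters(n, max_nodes)
    )


for k in (1, 2, 3):
    print(k, opt("A", k))
```

In this toy model, a cap of one node per cluster costs five entries (one per edge), a cap of two costs four, and a cap of three costs three.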

Algorithm performance

O(|E||V|dk)

Method                                 40b TCAM                        56b TCAM
                                       (8b state + 2 x 16b inputs)     (8b state + 3 x 16b inputs)
Naive                                  342 entries, 0.48s              641 entries, 0.48s
Algorithm (excluding non-tree logic)   177 entries, 2.6s               170 entries, 5.5s
Algorithm                              112 entries, 128.7s             106 entries, 207.6s
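As a rough worked example, the entry counts above can be converted to total TCAM bits by multiplying by the entry width. The sketch below assumes total bits = entries x width and ignores the RAM side of the state table; `RESULTS` and `tcam_bits` are illustrative names only.

```python
# (method, entry width in bits) -> entry count, copied from the table above.
RESULTS = {
    ("Naive", 40): 342,
    ("Naive", 56): 641,
    ("Algorithm (excluding non-tree logic)", 40): 177,
    ("Algorithm (excluding non-tree logic)", 56): 170,
    ("Algorithm", 40): 112,
    ("Algorithm", 56): 106,
}


def tcam_bits(entries, width_bits):
    """Total TCAM bits, assuming every entry occupies the full entry width."""
    return entries * width_bits


for (method, width), entries in RESULTS.items():
    print(f"{method:38s} {width}b x {entries:3d} = {tcam_bits(entries, width)} bits")
```

Under this assumption the full algorithm needs 4480 bits with 40b entries and 5936 bits with 56b entries, so the third 16b input saves six entries but costs more bits overall.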

Benefits of parallel lookups?

[Figure: table entries required (0 to 120) and TCAM bits required (0 to 8000) vs. data arrival rate (32 and 64 bits/cycle), for 1, 2, 3, and 4 lookups; annotation: unable to process at arrival rate]

Minimize parallel lookups for single instance

Contributions

Parser generator

Parser design trade-off analysis & principles

Fixed parsers
• Single parser instances
    area → minimize by reducing width
    power → minimize by reducing clock
• Aggregating instances for throughput
    area → independent of instance rate & count
    power → minimize using few fast instances
• Extracted field buffer dominates area

Programmable parsers
• Cost of programmability is low (1.5-3x)
• State table dominates area increase

RMT model

State table generation algorithm

Publications

Forwarding Metamorphosis: Fast Programmable Match-Action Processing in Hardware for SDN
Bosshart, P., Gibb, G., et al. SIGCOMM 2013 [to appear]

Outsourcing network functionality
Gibb, G., Zeng, H., and McKeown, N. HotSDN '12.

Initial Thoughts on the Waypoint Service
Gibb, G., Zeng, H., and McKeown, N. WISH '11.

Can the Production Network be the Testbed?
Sherwood, R., Gibb, G., et al. OSDI '10.

A Packet Generator on the NetFPGA platform
Covington, G.A., Gibb, G., et al. FCCM '09.

NetFPGA – An Open Platform for Teaching How to Build Gigabit-rate Network Switches and Routers
Gibb, G., et al. IEEE Transactions on Education ’08.

NetFPGA: Reusable Router Architecture for Experimental Research
Naous, J., Gibb, G., et al. PRESTO '08.

Building a RCP (Rate Control Protocol) Test Network
Dukkipati, N., Gibb, G., et al. Hot Interconnects ’07.

NetFPGA—An Open Platform for Gigabit-Rate Network Switching and Routing
Lockwood, J., et al. MSE '07.
