Hardware-Accelerated Signaling: Design, Implementation and Implications
Haobo Wang, November 15, 2004


Page 1: Hardware-Accelerated Signaling

Hardware-Accelerated Signaling

Design, Implementation and Implications

Haobo Wang
November 15, 2004

Page 2: Hardware-Accelerated Signaling

Outline

- Background and problem statement
- OCSP: a performance-oriented signaling protocol and its hardware implementation
- A subset of the RSVP-TE signaling protocol and its hardware implementation
- Comparison of signaling transport options
- Implications of hardware-accelerated signaling
- Conclusions and future work


Page 4: Hardware-Accelerated Signaling

Background

- Signaling protocol
  - Sets up and tears down connections in connection-oriented networks
  - Control-plane protocol
- Signaling protocols are primarily implemented in software
  - Two reasons: complexity and the requirement for flexibility
  - Price paid: poor performance
- RSVP-TE for GMPLS
  - Supports a wide range of connection-oriented networks
  - Being implemented by switch vendors

Page 5: Hardware-Accelerated Signaling

Network and node views

[Figure: network view showing the control plane and the user plane. Node view showing line cards with input and output interfaces connected through a switch fabric, and a control processor running the routing, signaling, and LMP processes, with a hardware signaling accelerator attached through NICs to the input and output signaling interfaces.]

Page 6: Hardware-Accelerated Signaling

Two questions

- Question 1: Why connection-oriented (CO)?
  - Inherent support for QoS
  - Can connectionless networks provide QoS? Yes, but only through over-provisioning, which leads to low utilization
- Question 2: What are the drawbacks of CO?
  - Call setup overhead: signaling message propagation delays, processing delays, and transmission delays
  - Call handling capacities of today's switches are limited

Page 7: Hardware-Accelerated Signaling

Problem statement

- How to overcome the drawbacks of CO: hardware-accelerated signaling?
  - Determine whether signaling protocols can be implemented in hardware, and demonstrate it with an actual implementation
  - Study how to reduce signaling message transmission delays
  - Explore the impact of hardware-accelerated signaling protocol implementations

Page 8: Hardware-Accelerated Signaling

Related work

- How to achieve fast signaling?
  - New, simplified signaling protocols: YESSIR, PCC
  - Hardware implementation: FRP (ASIC, not flexible)
  - A simplified version of RSVP-TE intended for hardware implementation
    - "Keep It Simple" Signaling: still at the blueprint stage
- Other comparable protocols implemented in hardware
  - TOE: TCP/IP Offload Engine
  - TCP switching


Page 10: Hardware-Accelerated Signaling

Optical Circuit-switching Signaling Protocol (OCSP)

- Performance-oriented, optimized for hardware implementation
- Specifically designed for SONET switches
- Implemented on a WILDFORCE FPGA board

Page 11: Hardware-Accelerated Signaling

Hardware platform of the implementation

[Figure: block diagram of the WILDFORCE board. The host connects through the PCI bus and PCI chip to a local bus. FIFO 0, FIFO 1, and FIFO 4 (512 by 36), an SRAM (32k by 32), and a crossbar link the processing elements CPE0 and PE1 to PE4, each with a mezzanine card and DPMC. Inset: the signal engine in CPE0 with FIFO 0 and FIFO 1 controllers (from host, to host) and a memory controller for the mezzanine memory, alongside PE1. The memories hold the routing, CAC, and connectivity tables and the state and switch mapping tables.]

Page 12: Hardware-Accelerated Signaling

Simulation and implementation results for OCSP

- Assuming a 25 MHz clock
  - Total setup and teardown time: 5.9 to 6.8 μs
  - Call handling capacity of 150,000 calls/sec

Message        Setup    Setup Success   Release   Release Confirm
Clock cycles   77-101   9               51        10

        Device      Resource   Eq. gates
CPE0    XC4036XLA   62%        22,000
PE1     XC4013XLA   8%         1,000
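The timing figures on this slide can be checked from the cycle counts. The following is a minimal sketch, assuming the total setup and teardown time is simply the sum of the four per-message cycle counts at the stated 25 MHz clock. The names `CYCLES` and `total_time_us` are illustrative, not from the original work.

```python
CLOCK_HZ = 25_000_000  # 25 MHz clock, as assumed on the slide

# Clock cycles per message type from the slide: (best case, worst case).
CYCLES = {
    "setup": (77, 101),
    "setup_success": (9, 9),
    "release": (51, 51),
    "release_confirm": (10, 10),
}

def total_time_us(case: int) -> float:
    """Sum the cycle counts for one call (0 = best, 1 = worst) and convert to microseconds."""
    cycles = sum(bounds[case] for bounds in CYCLES.values())
    return cycles / CLOCK_HZ * 1e6

best_us = total_time_us(0)    # 147 cycles
worst_us = total_time_us(1)   # 171 cycles
calls_per_sec_worst = 1e6 / worst_us

print(f"{best_us:.2f} to {worst_us:.2f} us per call")
print(f"about {calls_per_sec_worst:,.0f} calls/sec in the worst case")
```

Under this assumption the totals come to 5.88 μs and 6.84 μs, in line with the quoted 5.9 to 6.8 μs, and the worst-case rate is close to the quoted 150,000 calls/sec.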


Page 14: Hardware-Accelerated Signaling

Challenges for hardware implementation of RSVP-TE

- A large number of messages and objects
- Maintaining state information
- Many data tables
- Support for timers
- Global connection reference
- Flexible TLV-style (Type, Length, Value) objects
- ...

A SESSION object defined in RFC 2205 (RSVP):
  Length (12) | Class-Num (1) | C-Type (1)
  IPv4 Dest Address
  Protocol ID | Flags | Dst Port

A SESSION object defined in RFC 3209 (RSVP-TE):
  Length (16) | Class-Num (1) | C-Type (7)
  IPv4 tunnel end point address
  Must be zero | Tunnel ID
  Extended Tunnel ID

These challenges can be overcome by defining a subset of RSVP-TE.
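To make the TLV challenge concrete, here is a software sketch of decoding the two SESSION layouts shown on this slide. It assumes the standard RSVP object header (16-bit Length, 8-bit Class-Num, 8-bit C-Type) and the body layouts from RFC 2205 and RFC 3209. The function names and the sample field values are illustrative only, and this is not the hardware dispatcher described in the talk.

```python
import socket
import struct

def parse_rsvp_object(buf: bytes, offset: int = 0):
    """Parse one object: 16-bit Length, 8-bit Class-Num, 8-bit C-Type, then the body."""
    length, class_num, c_type = struct.unpack_from("!HBB", buf, offset)
    if length < 4 or length % 4 != 0 or offset + length > len(buf):
        raise ValueError("malformed RSVP object")
    body = buf[offset + 4 : offset + length]
    return class_num, c_type, body, offset + length

def decode_session(c_type: int, body: bytes) -> dict:
    """Decode a SESSION object body (Class-Num 1) for the two C-Types on the slide."""
    if c_type == 1:  # RFC 2205: IPv4 dest address, protocol ID, flags, dest port
        addr, proto, flags, port = struct.unpack("!4sBBH", body)
        return {"dest": socket.inet_ntoa(addr), "protocol": proto,
                "flags": flags, "dst_port": port}
    if c_type == 7:  # RFC 3209: tunnel end point, zero, tunnel ID, extended tunnel ID
        addr, _zero, tunnel_id, ext_id = struct.unpack("!4sHH4s", body)
        return {"endpoint": socket.inet_ntoa(addr), "tunnel_id": tunnel_id,
                "extended_tunnel_id": socket.inet_ntoa(ext_id)}
    raise ValueError(f"unsupported SESSION C-Type {c_type}")

# Build a sample RSVP-TE SESSION object (values are made up for illustration).
te_obj = struct.pack("!HBB4sHH4s", 16, 1, 7,
                     socket.inet_aton("7.4.1.4"), 0, 0x0BCD,
                     socket.inet_aton("5.7.1.1"))
class_num, c_type, body, next_offset = parse_rsvp_object(te_obj)
session = decode_session(c_type, body)
print(session)
```

The same Class-Num maps to different body layouts depending on C-Type, which is why a fixed-function hardware parser benefits from restricting the protocol to a subset.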

Page 15: Hardware-Accelerated Signaling

Processing of Path message

[Figure: example topology with nodes at IP 5.7.1.1, 5.7.1.3, 7.4.1.2, 7.4.1.4, and 4.8.1.1 and interfaces Int.#1, Int.#3, Int.#5, and Int.#10. A Path message is processed by a sequence of table lookups, each mapping an index to a return value:]

Table                        Index                                  Return
Routing table                Dest_IP_Addr = 7.4.1.4                 Next_Hop_Addr_User = 7.4.1.2
Incoming connectivity table  Prev_IP_Addr_User = 5.7.1.3,           In. I/F ID = 5
                             Prev. I/F ID = 1
Outgoing connectivity table  Next_IP_Addr_User = 7.4.1.2,           Out. I/F ID = 3
                             Seq.# = 1
CAC table                    Out. I/F ID = 3                        Avail. BW = 0101 0011 1111
User/control mapping table   Next_IP_Addr_User = 7.4.1.2            Next_IP_Addr_Ctrl = 7.4.2.2
State table                  Global Conn Ref                        Ctrl plane info, user plane info, traffic, state
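The lookup chain for a Path message can be sketched in software with each table modeled as a dictionary. This is an illustration only. The lookup order, the reading of the available-bandwidth field as a bitmap of free time slots, the state name, and the function name `process_path` are all assumptions, and the sample entries are the values visible in the figure.

```python
# Each table is modeled as a dict: key = index (TCAM side), value = return value (SRAM side).
ROUTING = {"7.4.1.4": "7.4.1.2"}            # Dest_IP_Addr -> Next_Hop_Addr_User
INCOMING_CONN = {("5.7.1.3", 1): 5}         # (Prev_IP_Addr_User, Prev. I/F ID) -> In. I/F ID
OUTGOING_CONN = {("7.4.1.2", 1): 3}         # (Next_IP_Addr_User, Seq.#) -> Out. I/F ID
CAC = {3: 0b0101_0011_1111}                 # Out. I/F ID -> availability bitmap (assumed meaning)
USER_CTRL_MAP = {"7.4.1.2": "7.4.2.2"}      # Next_IP_Addr_User -> Next_IP_Addr_Ctrl
STATE = {}                                  # Global connection reference -> connection state

def process_path(conn_ref, dest_ip, prev_ip, prev_if, seq=1):
    """Run the table lookups for one Path message; return the next control-plane hop or None."""
    next_hop = ROUTING[dest_ip]
    in_if = INCOMING_CONN[(prev_ip, prev_if)]
    out_if = OUTGOING_CONN[(next_hop, seq)]
    free = CAC[out_if]
    if free == 0:
        return None  # no free slot on the outgoing interface: reject the call
    # Assumption: a set bit marks a free time slot; reserve the lowest one.
    slot = (free & -free).bit_length() - 1
    CAC[out_if] = free & ~(1 << slot)
    next_ctrl = USER_CTRL_MAP[next_hop]
    STATE[conn_ref] = {"in_if": in_if, "out_if": out_if, "slot": slot,
                       "next_ctrl": next_ctrl, "state": "PATH_RECEIVED"}
    return next_ctrl

forward_to = process_path(conn_ref=1, dest_ip="7.4.1.4", prev_ip="5.7.1.3", prev_if=1)
print(forward_to, STATE[1])
```

In hardware these lookups are served by the TCAM and SRAM rather than by hash tables, but the data dependencies between the tables are the same.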

Page 16: Hardware-Accelerated Signaling

Architecture of the hardware signaling accelerator (FPGA)

[Figure: block diagram of the FPGA design. Modules: incoming message buffer, object dispatcher, message assembler, resource management with CAC table, data table management, retransmission management, and register bank. Interfaces: GbE interface, FIFO interface, PCI local bus interface, cross-connect interface, and TCAM and SRAM interfaces. Pipeline stages: message parsing, message processing, message assembling.]

Page 17: Hardware-Accelerated Signaling

Functional modules of the hardware signaling accelerator

- Incoming message buffer
  - Two-level message buffering and FIFO interface
- Object dispatcher
  - Two-level dispatching and distributed decoding, to meet the TLV challenge
- Data table management
  - Table access arbiter and TCAM/SRAM interfaces
- Resource management
  - Hierarchical resource allocator and CAC table
- Retransmission management
  - Retransmission buffers, timers, exponential back-off
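The exponential back-off used by the retransmission module can be illustrated with a short sketch. It assumes the time-out doubles after every unanswered transmission, starting from an initial value `t0`. The function name and the retry limit are hypothetical, not taken from the design.

```python
def retransmission_schedule(t0: float, max_retries: int) -> list[float]:
    """Return the elapsed times at which retransmissions fire, doubling the time-out each time."""
    times = []
    elapsed = 0.0
    timeout = t0
    for _ in range(max_retries):
        elapsed += timeout      # wait out the current time-out
        times.append(elapsed)   # retransmit at this instant
        timeout *= 2            # exponential back-off
    return times

schedule = retransmission_schedule(t0=1.0, max_retries=3)
print(schedule)  # [1.0, 3.0, 7.0]
```

The cumulative waits of 1, 3, and 7 time-out units are the same coefficients that multiply the initial time-out in the set-up delay analysis later in the talk.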

Page 18: Hardware-Accelerated Signaling

Architecture of the prototype board

[Figure: board-level schematic with pin-level signal names. The FPGA (XC2V3000) connects to a TCAM (IDT75P52100) over command, request data, and index buses; to SRAM (IDT71V2556) over a 36-bit data bus; to a FIFO (IDT72V36110) used as the message buffer; to a GbE MAC (L8104) followed by a SerDes (HDMP-1636A) and an optical transceiver with a duplex SC connector; to a switch fabric (VSC9182) over differential RxD/TxD lines; to a clock oscillator (CO43S); and to the PCI connector.]

Page 19: Hardware-Accelerated Signaling

Main on-board modules

- Hardware signaling accelerator: FPGA
  - 957-pin BGA, 6 separate clock signals, 3 I/O levels
  - High-speed (100 MHz)
- 1 Gbps signaling channel: GbE MAC + SerDes + optical transceiver
  - Demonstrates a call handling rate of 250,000 calls/sec
  - High-speed interface: 125 MHz
- Incoming message buffer: FIFO
  - Hardware/software interface
- Data tables: TCAM and SRAM
- User plane device: switch fabric
  - High-speed LVDS signals

Page 20: Hardware-Accelerated Signaling

Organization of the data tables

[Figure: each table pairs a TCAM index with an SRAM return value. The TCAM is 72 bits wide with 64K entries. Tables shown: CAC table (in the FPGA), routing table (index: Dest_IP_Addr), incoming connectivity table (index: Pre_IP_Addr + Interface_ID), outgoing connectivity table (index: Next_IP_Addr + Interface_ID), user/control mapping table (index: Next_IP_Addr_User, return: Next_IP_Addr_Ctrl), and state table. The state table index is Src_IP_Addr (32-bit) + LSP ID (16-bit) + Dest_IP_Addr (32-bit) + Tunnel ID (16-bit) + Extended Tunnel ID (32-bit), 128 bits in total, and returns 228 bits of state information. Example values shown: 128.238.34.117, 128.143.74.193, 0x0BCD. Unused regions of the memories are marked.]
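The 128-bit state table index listed on this slide can be illustrated by packing its five fields into one integer. This is a sketch under assumptions: the fields are concatenated in the order listed, most significant first, and the assignment of the sample addresses and the value 0x0BCD to particular fields is a guess made only for the example.

```python
import ipaddress

def state_table_key(src_ip, lsp_id, dest_ip, tunnel_id, ext_tunnel_id) -> int:
    """Concatenate the five index fields, most significant first, into a 128-bit integer."""
    fields = [
        (int(ipaddress.IPv4Address(src_ip)), 32),         # Src_IP_Addr
        (lsp_id, 16),                                     # LSP ID
        (int(ipaddress.IPv4Address(dest_ip)), 32),        # Dest_IP_Addr
        (tunnel_id, 16),                                  # Tunnel ID
        (int(ipaddress.IPv4Address(ext_tunnel_id)), 32),  # Extended Tunnel ID
    ]
    key = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError("field does not fit its width")
        key = (key << width) | value
    return key

key = state_table_key("128.238.34.117", 0x0000, "128.143.74.193", 0x0BCD, "128.238.34.117")
print(f"{key:032x}")  # 32 hex digits = 128 bits
```

The field widths add up to 32 + 16 + 32 + 16 + 32 = 128 bits, matching the index width given in the figure.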

Page 21: Hardware-Accelerated Signaling

Clock and power distribution schemes

[Figure: clock distribution scheme. Clocks at 100 MHz, 78 MHz, 50 MHz, 125 MHz, and 33 MHz (from the PCI bus) are distributed through the FPGA's DCMs and delay lines to the TCAM, the two SRAMs, the FIFO, the switch fabric, the SerDes, and the GbE MAC.]

[Figure: power distribution scheme, with two extra power supplies.]

Page 22: Hardware-Accelerated Signaling

Processing of signaling message: simulation results

[Figure: simulation waveforms for three message types, each starting at "previous stage ready" and finishing at "end of processing", with activity on the TCAM and SRAM interfaces.
Path message: accesses to the CAC table, routing table, incoming connectivity table, outgoing connectivity table, U2C mapping table, and state table.
Resv message: read the state table, then program the switch fabric through the switch fabric interface.
PathTear message: read the state table, then release the allocated time slot in the CAC table.]

Page 23: Hardware-Accelerated Signaling

Implementation and simulation results

Implementation results

Device     PCI core   Resource   Eq. gates   Max freq.
XC2V3000   w/o PCI    12%        360,000     90 MHz
XC2V3000   w/ PCI     21%        630,000     50 MHz

Simulation results (@ 50 MHz)

Message        Path   Resv   PathTear/ResvTear
Clock cycles   40     32     19


Page 25: Hardware-Accelerated Signaling

In-band signaling and out-of-band signaling

[Figure: (a) In-band signaling: switches SW1 to SW6 carry signaling and user traffic over the same links. (b) Out-of-band signaling: switches SW1 to SW6 carry user traffic, while signaling travels over a separate IP network of routers R1 to R6.]

Page 26: Hardware-Accelerated Signaling

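Models for in-band signaling and out-of-band signaling

[Figure: queueing models. (a) In-band signaling: a signaling processor with service rate $\mu_{proc}$ feeds $n$ transmitters with service rates $\mu_{tx,1}^{IB}, \mu_{tx,2}^{IB}, \ldots, \mu_{tx,n}^{IB}$, selected with probabilities $q_1, q_2, \ldots, q_n$, where $\sum_{i=1}^{n} q_i = 1$. (b) Out-of-band signaling: a signaling processor with service rate $\mu_{proc}$ feeds a single transmitter with service rate $\mu_{tx}^{OOB}$.]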

Page 27: Hardware-Accelerated Signaling

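Set-up delay analysis

Total delay = processing delay + network delay + transmission delay (including retransmissions)

$$E[T] = \frac{1}{\mu_{proc}-\lambda} + \left(T_n + E[T_{tx}]\right)\left((1-p) + 2p(1-p) + 3p^2(1-p) + \cdots\right) + T_0\left(p(1-p) + 3p^2(1-p) + 7p^3(1-p) + \cdots\right)$$

$$E[T] = \frac{1}{\mu_{proc}-\lambda} + \frac{T_n + E[T_{tx}]}{1-p} + \frac{p\,T_0}{1-2p}$$

Assuming $T_0 = 3T_n$ and an M/D/1 queue for $E[T_{tx}]$, we have

$$E[T] = \frac{1}{\mu_{proc}-\lambda} + \frac{1}{1-p}\cdot\frac{1}{\mu_{tx}}\left(1 + \frac{\rho_{tx}}{2(1-\rho_{tx})}\right) + T_n\left(\frac{1}{1-p} + \frac{3p}{1-2p}\right), \qquad \rho_{tx} = \frac{\lambda}{\mu_{tx}}$$

where
$\lambda$: aggregate signaling message arrival rate
$\mu_{proc}$: service rate of the signaling processor
$\mu_{tx}$: service rate of the signaling transmitter
$p$: packet loss rate
$T_{tx}$: transmission time
$T_n$: one-way network delay
$T_0$: initial time-out value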
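The delay model can be evaluated numerically. The sketch below assumes the total expected delay is an M/M/1-style processing term 1/(μ_proc − λ), plus (T_n + E[T_tx])/(1 − p) for the expected number of transmissions, plus p·T_0/(1 − 2p) for time spent in doubling time-outs, with E[T_tx] taken from the M/D/1 mean time in system. The processing term in particular is an assumed reading of the slide, and all function and parameter names are illustrative.

```python
def expected_setup_delay(lam, mu_proc, mu_tx, p, t_n, t0=None):
    """Expected one-hop signaling delay under the assumed model (times in seconds, rates per second)."""
    if t0 is None:
        t0 = 3 * t_n  # the slide's assumption T0 = 3 Tn
    if not (0 <= p < 0.5) or lam >= mu_proc or lam >= mu_tx:
        raise ValueError("need p < 0.5 and stable queues (lam < mu_proc, lam < mu_tx)")
    rho = lam / mu_tx
    proc = 1.0 / (mu_proc - lam)                          # assumed M/M/1 processing delay
    e_ttx = (1.0 / mu_tx) * (1 + rho / (2 * (1 - rho)))   # M/D/1 mean time in system
    return proc + (t_n + e_ttx) / (1 - p) + t0 * p / (1 - 2 * p)

def series_check(p, terms=200):
    """Numerically sum the two series that the closed form replaces."""
    attempts = sum((k + 1) * p**k * (1 - p) for k in range(terms))
    timeouts = sum((2**k - 1) * p**k * (1 - p) for k in range(terms))
    return attempts, timeouts

attempts, timeouts = series_check(0.1)
print(attempts, timeouts)  # close to 1/(1-p) and p/(1-2p)

# Hypothetical operating point: 1000 msgs/s, fast processor, 1 ms one-way delay, 1% loss.
print(expected_setup_delay(lam=1e3, mu_proc=2.5e5, mu_tx=1e4, p=0.01, t_n=1e-3))
```

The series only converge for p < 1/2, because the time-out doubles with each loss, so the closed form should not be used beyond that range.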

Page 28: Hardware-Accelerated Signaling

Numerical results

[Figures: plots comparing in-band and out-of-band signaling in six scenarios:]
- Hardware signaling, metro area, μtx = μproc
- Hardware signaling, wide area, μtx = μproc
- Hardware signaling, metro area, μtx << μproc
- Hardware signaling, wide area, μtx << μproc
- Software signaling, metro area
- Software signaling, wide area

Page 29: Hardware-Accelerated Signaling

A sum-up of the comparison

- With hardware signaling accelerators
  - In-band signaling is the way to go
  - Network delays dominate the total delay
- With software signaling processors
  - In wide area -> in-band signaling
  - In metro area -> out-of-band signaling is a good choice


Page 31: Hardware-Accelerated Signaling

End-to-end circuits for large file transfers

- End-to-end circuits are only justifiable for large files
- Define a crossover file size $\chi$; the per-circuit utilization is given by

$$u_c = \frac{E[X \mid X \ge \chi] / r_c}{E[T_{setup}] + E[X \mid X \ge \chi] / r_c}$$

  $E[X]$: average file size
  $E[T_{setup}]$: average call setup delay
  $\chi$: crossover file size
  $r_c$: circuit rate

- Assuming a 10 Mbps signaling link, a 100 Mbps data link, and 20 switches, the crossover file size needed to achieve 90% per-circuit utilization is:

Crossover file size   HW sig. (4 μs)   SW sig. (200 ms)
Metro area (0.1 ms)   40 KB            80 MB
Wide area (50 ms)     330 KB           80 MB
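The utilization relation on this slide can be inverted to find how large the transferred files must be on average for a target utilization. This sketch assumes utilization is transfer time divided by setup time plus transfer time. The 1 ms setup delay in the example is a hypothetical value, not one of the slide's scenarios, and the function names are illustrative.

```python
def per_circuit_utilization(mean_file_bits: float, rate_bps: float, setup_s: float) -> float:
    """Utilization = transfer time / (setup time + transfer time)."""
    transfer_s = mean_file_bits / rate_bps
    return transfer_s / (setup_s + transfer_s)

def required_mean_file_bits(target_u: float, rate_bps: float, setup_s: float) -> float:
    """Invert the formula: E[X | X >= crossover] = u / (1 - u) * r_c * E[T_setup]."""
    if not 0 < target_u < 1:
        raise ValueError("target utilization must be strictly between 0 and 1")
    return target_u / (1 - target_u) * rate_bps * setup_s

# Hypothetical example: 100 Mbps circuit, 1 ms total setup delay, 90% target utilization.
need_bits = required_mean_file_bits(0.9, 100e6, 1e-3)
print(need_bits / 8 / 1e3, "kB mean size of files sent over circuits")
print(per_circuit_utilization(need_bits, 100e6, 1e-3))
```

At 90% utilization the transfer must last nine times as long as the setup, which is why a shorter setup delay lowers the crossover file size so sharply.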

Page 32: Hardware-Accelerated Signaling

Fractional offered load

- Assuming file size follows a Pareto distribution
- Define the fractional offered load

$$\rho' = \frac{P(X \ge \chi)\, E[X \mid X \ge \chi]}{E[X]} = \left(\frac{k}{\chi}\right)^{\alpha - 1}$$

  ($k$: Pareto scale parameter; $\alpha$: Pareto shape parameter)

χ        ρ'
40 KB    81%
330 KB   71%
80 MB    51%

- With hardware-accelerated signaling
  - In metro area, 81% of the offered load can be transferred through end-to-end circuits
  - In wide area, 71% of the offered load can be transferred through end-to-end circuits
- With software signaling, this number is 51%
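For a Pareto file-size distribution, the fraction of offered load carried by files above a crossover size has a closed form. The sketch below assumes a Pareto distribution with scale k and shape α > 1, for which the fraction reduces to (k/χ)^(α−1), and checks that against the product of its parts. The parameter values are hypothetical and are not the ones used for the slide's table.

```python
def fractional_offered_load(chi: float, k: float, alpha: float) -> float:
    """Closed form (k / chi) ** (alpha - 1) for a Pareto(k, alpha) file size, chi >= k."""
    if chi < k or alpha <= 1:
        raise ValueError("need chi >= k and alpha > 1 (finite mean)")
    return (k / chi) ** (alpha - 1)

def fractional_offered_load_from_parts(chi: float, k: float, alpha: float) -> float:
    """Same quantity computed as P(X >= chi) * E[X | X >= chi] / E[X]."""
    tail_prob = (k / chi) ** alpha            # P(X >= chi)
    cond_mean = alpha * chi / (alpha - 1)     # E[X | X >= chi]
    mean = alpha * k / (alpha - 1)            # E[X]
    return tail_prob * cond_mean / mean

# Hypothetical parameters: 1 kB minimum file size, shape 1.1, 40 kB crossover.
closed = fractional_offered_load(chi=40e3, k=1e3, alpha=1.1)
parts = fractional_offered_load_from_parts(chi=40e3, k=1e3, alpha=1.1)
print(closed, parts)
```

With a shape parameter close to 1 the distribution is heavy-tailed, so most of the bytes sit in the few largest files and the fraction falls slowly as the crossover size grows.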

Page 33: Hardware-Accelerated Signaling

Hardware-accelerated signaling and network survivability

- Two approaches for network survivability
  - Protection requires pre-allocated resources and reacts to failure rapidly
    - SONET APS requires 100% resource redundancy (1+1) and can recover in 50 ms
  - Restoration dynamically sets up a secondary path after a failure: less resource redundancy but longer recovery delay
- Hardware-accelerated signaling
  - Less resource redundancy and acceptable recovery delay (< 200 ms)
  - A sample network with 13 nodes and 22 links
    - Recovers from one link failure in 200 ms with 9% extra resources


Page 35: Hardware-Accelerated Signaling

Conclusions and future work

- Hardware-accelerated signaling is feasible, and our implementation demonstrates a 100x-1000x speedup over software implementations
- Applications such as large file transfer and network restoration can benefit from hardware signaling and a well-devised signaling transport scheme
- Future work
  - Finish the design and testing of the prototype board
  - New architectures and applications that can fully utilize the benefit of hardware-accelerated signaling

Page 36: Hardware-Accelerated Signaling


Thank you!