The Concurrent Matching Switch Architecture

Bill Lin (University of California, San Diego)

Isaac Keslassy (Technion, Israel)

IEEE INFOCOM, Barcelona, April 23-29, 2006

Motivation

Traffic demands expected to grow, driven in part by increasing broadband adoption

10x increase in broadband subscription in just last 3 years, already over 100 million subscribers

1.25-2.4 Gbps fiber to homes emerging (GPON, GEPON, EPON, BPON …)

Larger routers needed for consolidation

Operators need scalable routers that provide good performance

Limitations of Previous Routers

Output-Queueing (OQ) Switch

Well known to provide good performance, but scalability hampered by need for an internal speedup of N

Crossbar Switches, using Input-Queueing (IQ) or Combined Input-Output Queueing (CIOQ)

Huge body of literature, but scalability hampered by need for centralized scheduling and arbitrary per-packet switch configurations

Limitations of Previous Routers

Load-Balanced Routers

No centralized scheduler

Scalable fixed configuration switch fabric in optics

Guarantees 100% throughput

100 Tb/s design with 160 Gb/s linecards shown

But packets may be delivered “out-of-order”

Basic Load-Balanced Router

[Figure: three stages of linecards (In, middle, Out). External links run at rate R; the fixed meshes between stages use channels at rate R/N. Packets labeled A1-A3, B1-B2, C1-C2 are shown at the linecards.]

Basic Load-Balanced Router

[Figure: three-stage load-balanced router with external links at rate R and mesh channels at rate R/N. Packets labeled A1-A3, B1-B2, C1-C2 are spread across the linecards.]

Many Fabric Options (any spreading device)

Space: Full uniform mesh

Wavelength: Static WDM

Time: Round-robin switches

Just need fixed uniform rate channels at R/N

No dynamic switch reconfigurations
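As a rough illustration of the "Time: Round-robin switches" option, the sketch below uses a fixed, traffic-independent connection pattern. The slides do not spell out the pattern; the rule `(input + slot) mod N` and the function name `mesh_connection` are assumptions made here for illustration only.

```python
def mesh_connection(input_port: int, time_slot: int, n: int) -> int:
    """Middle linecard that input_port is connected to in time_slot.

    Assumed round-robin rule: the pattern depends only on the slot number,
    never on the traffic, so no per-packet reconfiguration is needed.
    """
    return (input_port + time_slot) % n


n = 3
for t in range(n):
    # In every slot the N inputs reach N distinct middle linecards.
    print(t, [mesh_connection(i, t, n) for i in range(n)])
```

Over any N consecutive slots each input visits every middle linecard exactly once, which is what gives each input-middle pair an average rate of R/N.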

Basic Load-Balanced Router

[Figure: three-stage load-balanced router with external links at rate R and mesh channels at rate R/N. Packets labeled A1-A3, B1-B2, C1-C2 are queued at different linecards. Callout: "Out of order!"]

Packet Ordering Problem

Out-of-order packet delivery is undesirable (e.g. bad for TCP)

Previous techniques (e.g. EDF, UFS, FOFF)

Accumulate and delay packets at input/middle ports

And/or delay and re-order packets at middle/output ports

However, these techniques are unsatisfactory because they add substantial delays
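The reordering problem can be reproduced with a very small model. The sketch below is a toy under stated assumptions, not the routers evaluated in the talk: N = 3, one flow A from input 0 to output 0, a round-robin mesh pattern `(port + slot) mod N` on both stages, and a hypothetical `backlog` of unrelated packets (marked `"x"`) already waiting at each middle linecard.

```python
from collections import deque

N = 3


def simulate(backlog):
    """Return the order in which A1, A2, A3 leave for output 0.

    backlog[m] is the number of unrelated packets already queued at
    middle linecard m for output 0 (a hypothetical starting condition).
    """
    queues = [deque(["x"] * backlog[m]) for m in range(N)]
    departures = []
    next_seq = 1
    for t in range(40):
        # First mesh: input 0 is connected to middle (0 + t) % N and
        # forwards its next packet there, regardless of queue lengths.
        if next_seq <= 3:
            queues[t % N].append(f"A{next_seq}")
            next_seq += 1
        # Second mesh: middle m is connected to output (m + t) % N.
        for m in range(N):
            if (m + t) % N == 0 and queues[m]:
                pkt = queues[m].popleft()
                if pkt != "x":
                    departures.append(pkt)
    return departures


print(simulate([0, 0, 0]))
print(simulate([2, 0, 0]))
```

With empty middle queues the flow leaves in order; with two packets already waiting at middle linecard 0, A1 is stuck behind them while A2 and A3 overtake it through the other middle linecards.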

Impact on Avg. Delay (N = 128, uniform traffic)

[Plot: average delay curves for Basic Load-Balanced, UFS, and FOFF. Callout: "Significant Delay".]

Concurrent Matching Switch (CMS)

Basic idea

Retain load-balanced router structure and scalability of a fixed optical mesh, no dynamic reconfiguration

Instead of packets, load-balance “request tokens” to N parallel “schedulers”

Each scheduler independently solves its own matching

Packets delivered in order based on matching results

Goal is to provide much lower average delay than accumulation-based methods for ensuring packet order while retaining 100% throughput and scalability

Architecture

[Figure: three stages of linecards with links at rate R. Packets labeled A1-A4, B1-B2, C1-C2 are held at the linecards. Callout: "Retain fixed configuration meshes BUT move packet buffers to INPUT".]

Architecture

[Figure: three stages of linecards with links at rate R. Packets labeled A1-A4, B1-B2, C1-C2. Token counter values shown: 201, 101, 100, 001, 001, 011, 010, 000, 000. Callout: "Add N^2 token counters".]

Arrival Phase

[Figure: animation frame of the arrival phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 201, 101, 100, 001, 001, 011, 010, 000, 000.]

Arrival Phase

[Figure: animation frame of the arrival phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 201, 101, 100, 101, 001, 011, 110, 000, 100.]

Arrival Phase

[Figure: animation frame of the arrival phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 211, 101, 100, 101, 011, 011, 110, 010, 100.]

Arrival Phase

[Figure: animation frame of the arrival phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 211, 101, 100, 101, 011, 012, 111, 010, 101.]
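The arrival phase can be sketched as follows: the packet itself stays in a queue at its input, and only a request token is load-balanced to a middle linecard, where it increments one of that linecard's N^2 counters. The names `voq`, `tokens`, and `arrive`, and the round-robin rule `(input + slot) mod N`, are assumptions for illustration; the slides do not give an implementation.

```python
from collections import deque

N = 3

# Packets are buffered at the inputs: voq[i][j] is input i's FIFO for output j.
voq = [[deque() for _ in range(N)] for _ in range(N)]

# tokens[m][i][j] counts pending requests from input i to output j
# held at middle linecard m (N^2 counters per middle linecard).
tokens = [[[0] * N for _ in range(N)] for _ in range(N)]


def arrive(t, i, j, packet):
    """Buffer the packet at input i and send one request token.

    Returns the middle linecard that received the token.
    """
    voq[i][j].append(packet)
    m = (i + t) % N  # assumed fixed round-robin mesh pattern
    tokens[m][i][j] += 1
    return m


print(arrive(0, 0, 2, "A1"), arrive(1, 0, 2, "A2"))
```

Two consecutive packets of the same flow leave their tokens at different middle linecards, while both packets remain queued, in arrival order, at input 0.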

Matching Phase

[Figure: animation frame of the matching phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 211, 101, 100, 101, 011, 012, 111, 010, 101.]

Matching Phase

[Figure: animation frame of the matching phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 211, 101, 100, 101, 011, 012, 111, 010, 101.]

Matching Phase

[Figure: animation frame of the matching phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 111, 001, 000, 100, 001, 002, 110, 000, 100.]
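In the matching phase, each middle linecard computes a matching on its own token counters and consumes one token per matched input-output pair. The talk leaves the matching algorithm open, so the greedy "largest counter first" rule below is only a stand-in, and the 3x3 counter matrix is a hypothetical example.

```python
def greedy_matching(counters):
    """Match inputs to outputs on one middle linecard.

    counters[i][j] is the number of pending tokens from input i to
    output j. Returns {input: output} with each input and each output
    used at most once, and decrements the counters that were matched.
    """
    n = len(counters)
    requests = sorted(
        ((counters[i][j], i, j)
         for i in range(n) for j in range(n) if counters[i][j] > 0),
        reverse=True,  # try the largest counters first
    )
    match, used_outputs = {}, set()
    for _, i, j in requests:
        if i not in match and j not in used_outputs:
            match[i] = j
            used_outputs.add(j)
            counters[i][j] -= 1  # the matched token is consumed
    return match


counters = [[2, 1, 1], [1, 0, 1], [1, 0, 0]]  # hypothetical values
print(greedy_matching(counters), counters)
```

Any scheduler that returns a valid matching could be dropped in here, which is the sense in which each middle linecard can "work the way that it wants".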

Departure Phase

[Figure: animation frame of the departure phase. Packets labeled A1-A4, B1-B2, C1-C4. Token counter values shown: 111, 001, 000, 100, 001, 002, 110, 000, 100.]
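Putting the three phases together, the toy loop below hints at why packets stay in order: a grant for an input-output pair always releases the head-of-line packet of that input queue, whichever middle linecard issued the grant. This is a simplified sketch under assumptions (round-robin token spreading, a first-fit matching run once every N slots, and departures recorded directly instead of being carried through the meshes); `run_cms` is a hypothetical name, not code from the talk.

```python
from collections import deque


def run_cms(arrivals, n, slots):
    """Toy CMS loop. arrivals maps slot -> [(input, output, label), ...].

    Returns, for each output, the labels in departure order.
    """
    voq = [[deque() for _ in range(n)] for _ in range(n)]
    tokens = [[[0] * n for _ in range(n)] for _ in range(n)]
    departed = [[] for _ in range(n)]
    for t in range(slots):
        # Arrival phase: buffer at the input, send a token to a middle linecard.
        for i, j, label in arrivals.get(t, []):
            voq[i][j].append(label)
            tokens[(i + t) % n][i][j] += 1
        # Matching and departure phases, run once every n slots.
        if t % n == n - 1:
            for m in range(n):
                busy_inputs, busy_outputs = set(), set()
                for i in range(n):
                    for j in range(n):
                        if (tokens[m][i][j] and i not in busy_inputs
                                and j not in busy_outputs):
                            busy_inputs.add(i)
                            busy_outputs.add(j)
                            tokens[m][i][j] -= 1
                            # Grant: input i releases its oldest packet for j.
                            departed[j].append(voq[i][j].popleft())
    return departed


arrivals = {0: [(1, 0, "B1")], 1: [(0, 0, "A1"), (1, 0, "B2")]}
print(run_cms(arrivals, n=3, slots=6))
```

In this run the token created by B1 loses to A1 at middle linecard 1, yet B1 still departs before B2, because the grant from middle linecard 2 releases the head of the queue rather than the packet that created the token.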

Distributed Operation

All linecards operate in parallel in a fully distributed manner

Arrival, matching, and departure phases overlap in a pipelined manner

Main Ideas

Each middle linecard acts as a “micro-router” with 1/Nth of the arrival traffic

Therefore, it gets N time slots to think about the schedule, so the time complexity is amortized by a factor of N

If each micro-router can guarantee 100% throughput, so can the overall switch

Each micro-router can work the way that it wants, leveraging the huge body of existing work on scheduling

CMS provides a new way of aggregating routers together. Therefore, it provides a new way of thinking about scaling routers.
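The amortization argument above can be written as a one-line relation. It assumes a scheduler that needs O(N) sequential work per matching; the labels under the braces are added here for illustration.

```latex
\[
\frac{\overbrace{O(N)}^{\text{work per matching}}}
     {\underbrace{N}_{\text{time slots per matching}}}
\;=\; O(1) \ \text{work per time slot}
\]
```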

Practicality

Well-studied randomized approximations to Maximum Weighted Matching have been shown to achieve very good results [Tassiulas 1998] [Giaccone, Prabhakar & Shah, 2003]

These algorithms only require O(N) complexity using sequential hardware, but can provide 100% throughput guarantees with no speedup and good delay results

Amortized over N time slots, CMS with these scheduling algorithms can achieve

O(1) time complexity (independent of switch size)

100% throughput

Good delay results

Packet ordering
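To make the O(N) claim concrete, here is a minimal sketch of a randomized step in the spirit of [Tassiulas 1998]: draw one random matching, compare its weight with the matching used last time, and keep the heavier. It is an illustrative reading, not the exact algorithms cited; the names `weight` and `randomized_step`, and the use of queue lengths `q[i][j]` as weights, are assumptions.

```python
import random


def weight(matching, q):
    """Total weight of a matching, where matching[i] is input i's output."""
    return sum(q[i][j] for i, j in enumerate(matching))


def randomized_step(prev, q, rng):
    """One O(N) scheduling step.

    Draws a uniformly random permutation and keeps it only if it is
    heavier than the previous matching under the current weights q.
    """
    candidate = list(range(len(q)))
    rng.shuffle(candidate)
    return candidate if weight(candidate, q) > weight(prev, q) else prev


rng = random.Random(0)
q = [[3, 0, 1], [0, 2, 0], [1, 0, 4]]  # hypothetical queue lengths
matching = [1, 2, 0]                   # arbitrary starting matching
for _ in range(10):
    matching = randomized_step(matching, q, rng)
print(matching, weight(matching, q))
```

Each step touches every input once, so the work is linear in N, and the remembered matching never gets lighter under fixed weights. In CMS that linear cost would be spread over the N time slots available for each matching.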

Experimental Results (N = 128, uniform traffic)

[Plot: average delay curves for Basic Load-Balanced, UFS, FOFF, and CMS. Callout: "Difference of N time slots for matching phase".]

Conclusions

CMS is scalable

Leverages scalability of fixed optical meshes

Fully distributed

Can achieve O(1) time complexity

CMS achieves good performance

Guarantees 100% throughput

Guarantees packet ordering

Experimentally achieves low packet delays

CMS provides new way of thinking about scaling routers and connects huge body of existing literature on scheduling to load-balanced routers

Thank You

