lecture 1



  • 1

    Switch and Router Architectures

    Prof. David Hay Rothberg A-411 [email protected]

    Introduction 1-1

    Our Course

    Lectures twice a week
      Tuesdays, 12:00-13:45, Rothberg A-510
      Wednesdays, 10:00-11:45, Rothberg A-510

    50% - Home assignments
    50% - Final Project

    Calendar

    [Calendar figure: Sunday to Thursday grid for March (1-5, 8-12, 15-19) and April (12-16, 19-23, 26-30), with the Passover vacation marked]

    Syllabus

    Algorithms at the end of the wire
      Design principles
      Scheduling algorithms for packet switches
      Packet classification (algorithms/hardware)
      Active queue management (= buffer management)

    Switch and router architectures
    Middleboxes
    Software-Defined Networks / Network Function Virtualization

  • 2

    Syllabus

    Focuses (mostly) on a single device:
      Switches
      Routers
      Firewalls
      Network Intrusion Detection Systems (NIDS)

    Interdisciplinary subject:
      System design
      Hardware
      Algorithms
      Queuing
      Networking

    Course Material

    Mostly only in scientific papers (very recent stuff)

    Some material appears in:
      Computer Networking: A Top-Down Approach. Jim Kurose, Keith Ross, 2007.
      Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices. George Varghese, 2004.
      Lecture notes at Stanford University: http://www.stanford.edu/class/ee384x/

    Routers process headers

    [Figure: a packet for http://www.adalovelace.com (Ada Lovelace) shown as a bit string: data preceded by several headers]

  • 3

    Definitions

    [Figure: router chassis with linecards numbered 1 to N, each running at line rate R]

    N = number of linecards; typically 8-32 per chassis
    R = line rate: 1 Gb/s, 2.5 Gb/s, 10 Gb/s, 40 Gb/s, 100 Gb/s

    Capacity of router = N x R

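    For R = 40 Gb/s and a 40-bit packet:
      Packets per second = 10^9
      Time for each packet = 1 ns
    (For the 40-byte packets used later in the lecture, the same line rate gives 125 Mpkt/s, i.e. 8 ns per packet.)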
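    The arithmetic on this slide can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 16-linecard chassis in the usage lines is an assumed value inside the slide's "8-32" range.

```python
def router_capacity_bps(num_linecards: int, line_rate_bps: float) -> float:
    """Aggregate capacity N x R, in bits per second."""
    return num_linecards * line_rate_bps


def packets_per_second(line_rate_bps: float, packet_bits: int) -> float:
    """Back-to-back packets of the given size that one line delivers per second."""
    return line_rate_bps / packet_bits


def time_per_packet_ns(line_rate_bps: float, packet_bits: int) -> float:
    """Time budget for handling one packet, in nanoseconds."""
    return packet_bits / line_rate_bps * 1e9


if __name__ == "__main__":
    R = 40e9  # 40 Gb/s line rate
    print("capacity (b/s):", router_capacity_bps(16, R))  # 16 linecards: assumed
    for label, bits in (("40-bit packet", 40), ("40-byte packet", 40 * 8)):
        print(label, packets_per_second(R, bits), "pkt/s,",
              time_per_packet_ns(R, bits), "ns each")
```

    The second loop iteration uses 40-byte packets, the size the later "Packet Processing Rate" slide works with.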

    What a Big Router Looks Like

    Cisco GSR 12816: 1.8 m x 0.5 m x 0.6 m; capacity: 640 Gb/s; power: 5 kW
    Juniper T640: 0.9 m x 0.75 m x 0.5 m; capacity: 320 Gb/s; power: 3 kW

    What Do Multirack Routers Look Like?

    Cisco CRS-1
    Juniper T1600 + TX Matrix

    [Figure: a packet as a bit string, split into Header and Data]

    Header fields:
      1. Internet address
      2. Age
      3. Checksum to protect header

  • 4

    Barebones Router

      Look up internet address
      Check and update age
      Check and update checksum
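    The three barebones steps can be sketched against a real header layout. The sketch below assumes a plain 20-byte IPv4 header, where the slide's "age" is the TTL byte; `make_header` and the `lookup` callback are hypothetical helpers for the demonstration, not part of the lecture.

```python
import struct
from ipaddress import IPv4Address


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum, as used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def make_header(src: str, dst: str, ttl: int) -> bytes:
    """Hypothetical helper: minimal 20-byte IPv4 header with a valid checksum."""
    header = bytearray(struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 20, 0, 0,   # version/IHL, TOS, total length, id, flags/fragment
        ttl, 6, 0,           # TTL ("age"), protocol, checksum placeholder
        IPv4Address(src).packed, IPv4Address(dst).packed))
    struct.pack_into("!H", header, 10, internet_checksum(bytes(header)))
    return bytes(header)


def process_header(header: bytes, lookup):
    """Return (output_port, updated_header), or None if the packet is dropped."""
    if internet_checksum(header) != 0:   # a valid header checksums to zero
        return None
    ttl = header[8]
    if ttl <= 1:                         # age exhausted
        return None
    output_port = lookup(IPv4Address(header[16:20]))  # step 1: address lookup
    updated = bytearray(header)
    updated[8] = ttl - 1                 # step 2: update age
    updated[10:12] = b"\x00\x00"         # step 3: recompute checksum
    struct.pack_into("!H", updated, 10, internet_checksum(bytes(updated)))
    return output_port, bytes(updated)
```

    A call such as `process_header(make_header("10.0.0.1", "12.5.9.16", 64), lambda dst: 4)` returns the port chosen by the callback together with a header whose TTL is one lower and whose checksum is valid again.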

  • 5

    Early days: Modified Computer

    [Figure: line interfaces, each at rate R; the labelled component must run at rate N x R]

    Bottlenecks: memory, memory, ...

    2nd Generation Router

    [Figure: line interfaces, each at rate R]

    Early days: Modified Computer

    Function more important than speed
    1993 (WWW) changed everything
    We badly needed:
      Some new architecture
      Some theory
      Some higher performance

  • 6

    3rd Generation Router: Switch

    [Figure: linecards connected through a switch controlled by an arbiter; rate labels N x R and 1 x R]

  • 7

    4th Generation Router
    Multirack; optics inside

    [Figure: switch and linecards connected by optical links, 100s of metres]

    4th Generation Router
    Multirack; optics inside

    Alcatel 7670 RSP
    Juniper TX

    More 4th Generation Routers

    Avici TSR
    Cisco CRS-1

    Power consumption per chassis

    [Plot: power (kW), 0 to 16, versus year, 1990 to 2004]

    (Future) 5th Generation Router?
    Optical switch fabric

    [Figure: optical switch and linecards connected by optical links, 100s of metres]

  • 8

    Backbone Router Capacity

    [Plot: capacity from 10 Mb/s to 10 Tb/s versus year, 1987 to 2008; growth of x5 every 3 years]

    Trend Does Not Seem to Stop

    Larger Bandwidth

    Bandwidth-Hungry Applications

    Why is it difficult to build faster routers?

    Link speeds followed Moore's Law
    User demand doubled every year

    Router capacity is limited by memory speed
    DRAM is no faster now than 10 years ago
    => Routers should have fallen behind

    Routers are even more complex

    End of the Internet end-to-end model
    + Uncertainty
    + Lack of competition
    => Explosion of complexity in routers

    IPv6, multicast, ACLs, firewall, virtual routing, MPLS, DiffServ, IntServ, RSVP, ATM, IP tunnels, IPsec, VPNs, CALEA, ...

    => Power-hungry, expensive, unreliable
    => Many people predict significant change in the next few years (Software-Defined Networks)

  • 9

    Bottlenecks: memory, memory, ...

    Packet Processing Examples

    Address Lookup (IP/Ethernet)
      Where to send an incoming packet?
      "Use output-port 3 to send packets to MAC address 01:23:45:67:89:ab" (Exact Match)
      "Use output-port 4 to send packets to destination network 111.15/16" (Longest Prefix Match)

    Firewall, ACL
      Which packets to accept or deny?
      "Drop all packets from evil source network 66.66/16 on ports 6-666"
      Usually needs 5 fields: source-address, dest-address, source-port, dest-port, protocol
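    A minimal sketch of the firewall rule above, written as a check on the 5-tuple. Two assumptions are made here: the slide's "66.66/16" is read as 66.66.0.0/16, and "ports 6-666" is read as the destination port range (the slide does not say which port is meant). The `Packet` type is hypothetical.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple


class Packet(NamedTuple):
    """The 5 fields a classifier usually inspects."""
    src: str
    dst: str
    src_port: int
    dst_port: int
    protocol: str


EVIL_NETWORK = ip_network("66.66.0.0/16")
BLOCKED_PORTS = range(6, 667)  # 6..666 inclusive; assumed to mean destination ports


def accept(packet: Packet) -> bool:
    """Return False for packets the rule drops, True otherwise."""
    from_evil_network = ip_address(packet.src) in EVIL_NETWORK
    on_blocked_port = packet.dst_port in BLOCKED_PORTS
    return not (from_evil_network and on_blocked_port)
```

    A real classifier matches many such rules at once, with ranges and prefixes on all five fields, which is why packet classification gets its own treatment in the course.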

    Packet Processing Examples

    Intrusion Detection Schemes
      Deep Packet Inspection (DPI)
      "Drop all packets that contain the string EvilWorm anywhere within the packet"
      SNORT rule set

    Packet Processing Rate

    Line        40B packets (Mpkt/s)
    40 Gb/s     125
    10 Gb/s     31.25
    2.5 Gb/s    7.81
    622 Mb/s    1.94

    1. Lookup mechanism must be simple and easy to implement
    2. (Surprise?) Memory access time is the long-term bottleneck

  • 10

    Memory Technology (not up to date)

    Technology        Single chip density   $/chip ($/MByte)          Access speed   Watts/chip
    Networking DRAM   64 MB                 $30-$50 ($0.50-$0.75)     40-80 ns       0.5-2 W
    SRAM              4 MB                  $20-$30 ($5-$8)           4-8 ns         1-3 W
    TCAM              1 MB                  $200-$250 ($200-$250)     4-8 ns         15-30 W

    Note: Price, speed and power are manufacturer and market dependent. Numbers are a bit outdated but give the general idea
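    To see why memory access time is the bottleneck, one can divide the per-packet time budget by the access times in the table. The sketch below assumes back-to-back 40-byte packets and strictly sequential accesses to a single memory; the function names are illustrative.

```python
def packet_time_ns(line_rate_bps: float, packet_bytes: int = 40) -> float:
    """Arrival spacing of back-to-back packets, in nanoseconds."""
    return packet_bytes * 8 / line_rate_bps * 1e9


def accesses_per_packet(line_rate_bps: float, access_time_ns: float,
                        packet_bytes: int = 40) -> float:
    """How many sequential memory accesses fit into one packet time."""
    return packet_time_ns(line_rate_bps, packet_bytes) / access_time_ns


if __name__ == "__main__":
    # Best-case access times taken from the table (DRAM 40 ns, SRAM/TCAM 4 ns).
    for name, access_ns in (("DRAM", 40.0), ("SRAM", 4.0), ("TCAM", 4.0)):
        for rate in (10e9, 40e9):
            print(f"{name} at {rate / 1e9:.0f} Gb/s:",
                  accesses_per_packet(rate, access_ns), "accesses per packet")
```

    Under these assumptions, a 40 Gb/s line leaves 8 ns per packet: less than one DRAM access, and about two SRAM accesses at best.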

    Simplest Task: Exact Matching

    Mostly in Layer 2 (bridges/switches)
      Connects two Ethernet networks
    Wire-speed forwarding:
      Each time a packet arrives at a switch, forward it according to the destination MAC address
      Also store/update the source MAC address (learning)
      Should be done at wire speed

    [Figure: stations a, b, c, d]
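    The forward-and-learn behaviour can be sketched with an ordinary dictionary as the exact-match table. Returning `None` for an unknown destination (where a real bridge would typically flood) is an assumption; the slide does not cover that case. The class and method names are illustrative.

```python
class LearningSwitch:
    """Exact-match forwarding on destination MAC, learning from source MAC."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port it was last seen on

    def receive(self, src_mac: str, dst_mac: str, in_port: int):
        """Handle one frame; return the output port, or None if unknown."""
        self.mac_table[src_mac] = in_port    # learning: store/update the source
        return self.mac_table.get(dst_mac)   # exact match on the destination


if __name__ == "__main__":
    switch = LearningSwitch()
    print(switch.receive("aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb", in_port=1))
    print(switch.receive("bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa", in_port=2))
```

    The dictionary hides the hard part: the rest of this section is about making that single lookup fit into one packet time.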

    Solution 1: Binary Search

    MAC addresses have values that can be sorted. Keeping them in a sorted array, one can binary-search the array to find the right MAC address.

    However, each iteration is a memory access, so a lookup costs log N memory accesses. This works fine (even using DRAM) for low speeds and small N (around 10 Mb/s, 8K values), but does not scale to large N or higher speeds (not even to 100 Mb/s, 64K values).

    Using faster hardware (SRAM) won't really solve the problem (and it is more expensive).
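    A small sketch that makes the cost visible: a binary search over a sorted MAC table that counts one memory access per iteration. The parallel-array layout and the function names are assumptions made for the illustration.

```python
def mac_to_int(mac: str) -> int:
    """Turn 'aa:bb:cc:dd:ee:ff' into a 48-bit integer so it can be sorted."""
    return int(mac.replace(":", ""), 16)


def binary_search_port(sorted_macs, ports, mac):
    """Look up `mac` in a sorted table.

    `ports[i]` is the output port for `sorted_macs[i]`.
    Returns (port or None, number of table reads).
    """
    low, high, accesses = 0, len(sorted_macs), 0
    while low < high:
        mid = (low + high) // 2
        accesses += 1                 # each iteration reads one table entry
        if sorted_macs[mid] == mac:
            return ports[mid], accesses
        if sorted_macs[mid] < mac:
            low = mid + 1
        else:
            high = mid
    return None, accesses
```

    With 64K entries the loop runs up to about 16 times per packet, which is the figure the Gigaswitch example later in the lecture sets out to beat.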

    Scaling using Hashing

    Hashing is much faster than binary search on average, but much slower in the worst case (up to linear time)

    However, one can choose (pre-compute) good hash functions, so that the number of collisions is small and bounded
      Precomputation takes a lot of time, but addresses are not added at a rapid rate
      Applying the hash function is done at wire speed

    More sophisticated data structures/hashing techniques can also be applied (e.g. to reduce memory)
      Bloom filters, fingerprinting, etc.

  • 11

    Example (Gigaswitch, 1994)

    N = 64K; binary search takes 16 memory accesses
    For each 48-bit address addr, we first apply h(addr) to get a 48-bit value:
      The 16 LSBs are the hash-table entry index (64K entries)
      Each entry is a balanced binary tree of height at most 3, sorted by the remaining 32 MSBs

    The hash function should guarantee that no more than 8 addresses are in the same tree, and that we can disambiguate between addresses using the 32 MSBs

    Solve corner cases separately (CAM); rehashing
    4 memory accesses
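    The structure described above can be sketched as follows. Several things here are stand-ins rather than the Gigaswitch design: the hash is a hypothetical multiply-by-odd-constant (chosen only because it is a bijection on 48-bit values, so the 32 MSBs still disambiguate addresses within a bucket), a sorted list searched by bisection plays the role of the small balanced tree, and a plain dictionary plays the role of the CAM. Rehashing is not shown.

```python
from bisect import bisect_left

MASK48 = (1 << 48) - 1
ODD_MULTIPLIER = 0x9E3779B97F4B  # hypothetical; any odd constant is invertible mod 2**48


def h(addr: int) -> int:
    """Illustrative invertible hash on 48-bit values (not the real Gigaswitch hash)."""
    return (addr * ODD_MULTIPLIER) & MASK48


class HashedMacTable:
    MAX_PER_BUCKET = 8  # the bound the slide asks the hash function to guarantee

    def __init__(self):
        self.buckets = {}   # 16-bit index -> sorted list of (32-bit key, port)
        self.overflow = {}  # stand-in for the CAM that holds corner cases

    @staticmethod
    def _split(addr: int):
        hashed = h(addr)
        return hashed & 0xFFFF, hashed >> 16  # (16 LSBs: index, 32 MSBs: key)

    def insert(self, addr: int, port: int) -> None:
        index, key = self._split(addr)
        bucket = self.buckets.setdefault(index, [])
        pos = bisect_left(bucket, (key,))
        if pos < len(bucket) and bucket[pos][0] == key:
            bucket[pos] = (key, port)           # address already known: update
        elif len(bucket) < self.MAX_PER_BUCKET:
            bucket.insert(pos, (key, port))
        else:
            self.overflow[addr] = port          # a real design would rehash or use the CAM

    def lookup(self, addr: int):
        index, key = self._split(addr)
        bucket = self.buckets.get(index, [])    # 1 access for the table entry
        pos = bisect_left(bucket, (key,))       # at most ~3 more within the bucket
        if pos < len(bucket) and bucket[pos][0] == key:
            return bucket[pos][1]
        return self.overflow.get(addr)
```

    The point of the layout is the access count: one read to reach the bucket plus a bounded search inside it, instead of 16 reads for a binary search over all 64K addresses.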

    IP Addressing

    11111111 00010001 10000110 00000000

    255      17       134      0

    255.17.134.0 (dotted quad notation)

    IPv4 addresses have 32 bits

    From Griffin Tutorial in Sigcomm 2001
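    The conversion between the 32-bit form and dotted quad notation is mechanical; a short sketch with illustrative function names:

```python
def dotted_quad_to_bits(address: str) -> str:
    """'a.b.c.d' -> four space-separated 8-bit groups."""
    return " ".join(f"{int(octet):08b}" for octet in address.split("."))


def bits_to_dotted_quad(bits: str) -> str:
    """Four space-separated 8-bit groups -> 'a.b.c.d'."""
    return ".".join(str(int(group, 2)) for group in bits.split())


if __name__ == "__main__":
    print(dotted_quad_to_bits("255.17.134.0"))
    print(bits_to_dotted_quad("00001100 00000100 00000000 00000000"))
```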

    Classful Addresses

    Class A: 0nnnnnnn hhhhhhhh hhhhhhhh hhhhhhhh
    Class B: 10nnnnnn nnnnnnnn hhhhhhhh hhhhhhhh
    Class C: 110nnnnn nnnnnnnn nnnnnnnn hhhhhhhh

    n = network address bit
    h = host identifier bit

    Leads to a rigid, flat, inefficient use of address space
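    Under classful addressing the leading bits of the address alone fix the network length. A minimal sketch of that rule for classes A, B and C (other leading-bit patterns are not covered on the slide and return `None` here; the function name is illustrative):

```python
def address_class(address: str):
    """Return (class letter, network bits) for a dotted-quad address, or None."""
    first_octet = int(address.split(".")[0])
    if first_octet >> 7 == 0b0:      # 0nnnnnnn
        return "A", 8
    if first_octet >> 6 == 0b10:     # 10nnnnnn
        return "B", 16
    if first_octet >> 5 == 0b110:    # 110nnnnn
        return "C", 24
    return None


if __name__ == "__main__":
    for example in ("12.4.0.0", "130.1.2.3", "200.23.16.0"):
        print(example, address_class(example))
```

    Only three network sizes exist in this scheme, which is the rigidity the slide refers to: an organisation needing a few thousand addresses has to take a class B with 65,536.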

    Exponential Growth in Routing Table Sizes

    [Plot: number of BGP routes advertised]

  • 12

    CIDR: Hierarchical Address Allocation

    12.0.0.0/8
      12.0.0.0/16
      12.1.0.0/16
      12.2.0.0/16
      12.3.0.0/16
        12.3.0.0/24
        12.3.1.0/24
        :
        12.3.254.0/24
      :
        12.253.0.0/19
        12.253.32.0/19
        12.253.64.0/19
        12.253.96.0/19
        12.253.128.0/19
        12.253.160.0/19
        :
      12.254.0.0/16

    Prefixes are key to Internet scalability
      Addresses are allocated in contiguous chunks (prefixes)
      Routing protocols and packet forwarding are based on prefixes
      Today, routing tables contain ~150,000-200,000 prefixes

    Hierarchical addressing: route aggregation

    Hierarchical addressing allows efficient advertisement of routing information:

    Fly-By-Night-ISP
      Organization 0: 200.23.16.0/23
      Organization 1: 200.23.18.0/23
      Organization 2: 200.23.20.0/23
      . . .
      Organization 7: 200.23.30.0/23
      Advertises to the Internet: "Send me anything with addresses beginning 200.23.16.0/20"

    ISPs-R-Us
      . . .
      Advertises to the Internet: "Send me anything with addresses beginning 199.31.0.0/16"
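    The aggregation in this example can be checked with Python's standard `ipaddress` module. The sketch assumes that the organizations not shown on the slide (3 to 6) hold the intermediate /23 blocks, so that the eight blocks are contiguous.

```python
from ipaddress import collapse_addresses, ip_network

# Organization i is assumed to hold 200.23.(16 + 2i).0/23, for i = 0..7.
organization_prefixes = [ip_network(f"200.23.{16 + 2 * i}.0/23") for i in range(8)]

# collapse_addresses merges adjacent or overlapping networks into the fewest prefixes.
aggregate = list(collapse_addresses(organization_prefixes))

if __name__ == "__main__":
    print("customer prefixes:", [str(p) for p in organization_prefixes])
    print("advertised:", [str(p) for p in aggregate])
```

    Eight customer routes collapse into the single prefix the ISP advertises, which is what keeps the rest of the Internet's routing tables small.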

    RFC 1519: Classless Inter-Domain Routing (CIDR)

    IP Address: 12.4.0.0    IP Mask: 255.254.0.0

    Address: 00001100 00000100 00000000 00000000
    Mask:    11111111 11111110 00000000 00000000
             (first 15 bits: network prefix; remaining 17 bits: for hosts)

    Use two 32-bit numbers to represent a network: network number = IP address + mask (the pair; the prefix is the address bits where the mask is 1)

    Usually written as 12.4.0.0/15
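    A short sketch of how an address and a mask combine into the /n notation: AND the two 32-bit numbers and count the mask's 1 bits. It assumes the mask's 1 bits are contiguous from the left, as in the slide's example; the function name is illustrative.

```python
from ipaddress import IPv4Address


def network_number(address: str, mask: str) -> str:
    """Combine a dotted-quad address and mask into 'prefix/length' notation."""
    address_bits = int(IPv4Address(address))
    mask_bits = int(IPv4Address(mask))
    prefix_length = bin(mask_bits).count("1")   # assumes a contiguous mask
    return f"{IPv4Address(address_bits & mask_bits)}/{prefix_length}"


if __name__ == "__main__":
    print(network_number("12.4.0.0", "255.254.0.0"))
    print(network_number("12.5.9.16", "255.254.0.0"))  # a host inside the same network
```

    Both calls name the same network, because the AND discards the 17 host bits.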

    Classless Forwarding

    Packet: Destination = 12.5.9.16 | payload

    IP Forwarding Table

    Prefix         Next Hop       Interface        Match for 12.5.9.16
    0.0.0.0/0      10.14.11.33    Output-port 1    OK
    12.0.0.0/8     10.14.22.19    Output-port 2    better
    12.4.0.0/15    10.1.3.77      Output-port 3    even better
    12.5.8.0/23    10.1.3.23      Output-port 4    best!
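    The "OK / better / even better / best!" ranking is longest prefix match. A deliberately naive sketch that scans every entry is shown below. The prefixes come from the slide's forwarding table; pairing 12.4.0.0/15 with port 3 and 12.5.8.0/23 with port 4 is an assumption read off the slide's layout.

```python
from ipaddress import ip_address, ip_network

# prefix -> output port (port pairing for the last two rows is assumed)
FORWARDING_TABLE = {
    "0.0.0.0/0": 1,
    "12.0.0.0/8": 2,
    "12.4.0.0/15": 3,
    "12.5.8.0/23": 4,
}


def longest_prefix_match(destination: str, table=FORWARDING_TABLE):
    """Return (prefix, port) of the most specific matching entry, or None."""
    address = ip_address(destination)
    best = None
    for prefix, port in table.items():
        network = ip_network(prefix)
        if address in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, port)
    return None if best is None else (str(best[0]), best[1])


if __name__ == "__main__":
    print(longest_prefix_match("12.5.9.16"))
```

    All four entries match 12.5.9.16; the scan keeps the one with the longest prefix. A linear scan like this is only for illustration and is far too slow for wire-speed forwarding.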

  • 13

    Longest Prefix Match is Harder than Exact Match

    The destination address of an arriving packet does not carry with it the information to determine the length of the longest matching prefix

    Hence, one needs to search among the space of all prefix lengths, as well as the space of all prefixes of a given length

    Current Practical Data

    Caching works poorly in backbone routers
      250,000 concurrent flows
    Wire-speed lookup needed for 40-byte packets
      50% are TCP acks
      32 nsec/packet at 10 Gb/s and 8 nsec/packet at 40 Gb/s
    Lookup dominated by memory accesses
      Speed is measured by memory accesses
    Prefix lengths 8-32
    Today 150,000 prefixes, with growth toward 1 million prefixes
    Higher speeds need SRAM
      Worth minimizing memory

    Problem Definition

    Forwarding table:
      192.2.0/22  -> R2
      192.2.2/24  -> R3
      200.11.0/22 -> R4

    Example destination addresses: 200.11.0.33, 192.2.0.1, 192.2.2.100

    LPM: Find the most specific route, or the longest matching prefix among all the prefixes matching the destination address of an incoming packet

    LPM in IPv4

    Use 32 exact match algorithms for LPM!

    [Figure: the network address is fed to 32 exact-match units, one per prefix length (1, 2, ..., 32); a priority encoder picks the longest match and outputs the port]

    We can start with prefix length 8
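    A software sketch of the "32 exact matches" idea: one hash table per prefix length, keyed by the leading bits of the address. Hardware would probe all lengths in parallel and let a priority encoder pick the winner; the loop below approximates that by probing from the longest length down and stopping at the first hit. The class name is illustrative, and the prefixes are the ones from the Problem Definition slide, written out in full dotted-quad form.

```python
from ipaddress import IPv4Address, IPv4Network


class PerLengthLpm:
    """Longest prefix match built from one exact-match table per prefix length."""

    def __init__(self):
        self.tables = {length: {} for length in range(1, 33)}

    def add(self, prefix: str, next_hop: str) -> None:
        network = IPv4Network(prefix)
        leading_bits = int(network.network_address) >> (32 - network.prefixlen)
        self.tables[network.prefixlen][leading_bits] = next_hop

    def lookup(self, destination: str):
        address = int(IPv4Address(destination))
        for length in range(32, 0, -1):          # longest length has priority
            hit = self.tables[length].get(address >> (32 - length))
            if hit is not None:
                return hit
        return None


if __name__ == "__main__":
    lpm = PerLengthLpm()
    lpm.add("192.2.0.0/22", "R2")
    lpm.add("192.2.2.0/24", "R3")
    lpm.add("200.11.0.0/22", "R4")
    for destination in ("200.11.0.33", "192.2.0.1", "192.2.2.100"):
        print(destination, "->", lpm.lookup(destination))
```

    In the worst case this costs one exact-match probe per prefix length, which is why the slide notes that the search can start at length 8 rather than 1.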