ECE 526 – Network Processing Systems Design


Upload: mary-beard

Post on 04-Jan-2016


Page 1: ECE 526 – Network Processing Systems Design

ECE 526 – Network Processing Systems Design

Network Processor Tradeoffs and Examples

Chapter: D. E. Comer

Page 2: ECE 526 – Network Processing Systems Design

Ning Weng ECE 526 2

Outline
• Network Processor design tradeoffs
• Sample Network Processor

Page 3: ECE 526 – Network Processing Systems Design

NP Architecture
• Numerous different design goals
  ─ Performance
  ─ Cost
  ─ Functionality
  ─ Programmability
• Numerous different system choices
  ─ Use of parallelism
  ─ Types of memories
  ─ Types of interfaces
  ─ Etc.
• We consider
  ─ Design tradeoffs on a high level (qualitative tradeoffs)
  ─ Commercial Network Processors

Page 4: ECE 526 – Network Processing Systems Design

Processor Topologies
• How can processors be arranged on an NP?
  ─ Consider heterogeneity of processing resources and workload
• Multiprocessor
  ─ Parallel processors with shared interconnect
  ─ Problems?
• Pipeline
  ─ Multiple processors per data path
  ─ Problems?
• Data Flow Architecture
  ─ Extreme form of pipelining
  ─ Problems?
• Heterogeneous Architectures

Page 5: ECE 526 – Network Processing Systems Design

Design Tradeoffs (1)
• Low development cost vs. performance
  ─ ASICs give higher performance, but take time to develop
  ─ NPs allow faster development, but might give lower performance
• Programmability vs. processing speed
  ─ Similar to tradeoff between ASIC and NP
  ─ Co-processors pose the same tradeoffs
  ─ Complexity of instruction set
• Performance: packet rate, data rate, and bursts
  ─ Difficult to assess the performance of a system
  ─ Even more difficult to compare different systems
• Per-interface rate vs. aggregate data rate
  ─ NP usually limited to one port
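The distinction between packet rate and data rate can be illustrated with a little arithmetic. The sketch below is not from the slides: the 1 Gbps link rate and the three packet sizes are assumptions chosen only to show that the same data rate implies very different packet rates.

```python
def packets_per_second(link_bps: float, packet_bytes: int) -> float:
    """Packet rate needed to keep a link busy with fixed-size packets.

    Ignores framing overhead and inter-packet gaps (a simplification).
    """
    return link_bps / (8 * packet_bytes)


LINK_BPS = 1e9  # assumed link rate, for illustration only

for size in (64, 512, 1500):  # assumed packet sizes in bytes
    rate = packets_per_second(LINK_BPS, size)
    print(f"{size:5d} B packets -> {rate:12.0f} packets/s")
```

Small packets stress per-packet processing (header lookups, classification), while large packets stress per-byte work such as copying or checksumming, which is one reason a single performance number is hard to compare across systems.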

Page 6: ECE 526 – Network Processing Systems Design

Design Tradeoffs (2)
• NP speed vs. bandwidth
  ─ How much processing power per bandwidth is necessary?
  ─ Depends on application complexity
• Coprocessor design: look-aside vs. flow-through
  ─ Look-aside: "called" from main processor, needs state transfer
  ─ Flow-through: all traffic streams through coprocessor
• Pipelining: uniform vs. synchronized
  ─ Pipeline stages can take different times
  ─ Tradeoff between slowing down all stages and adding synchronization
• Explicit parallelism vs. cost and programmability
  ─ Hidden parallelism is easier to program
  ─ Explicit parallelism is cheaper to implement
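The pipelining tradeoff can be sketched with a simple timing model. This is an illustrative assumption rather than anything stated on the slide: stage times are fixed per packet, a lockstep pipeline advances every stage at the pace of the slowest one, and a pipeline with handoffs lets each stage finish at its own pace (queueing delay is ignored). The stage times are made up.

```python
def steady_state_throughput(stage_times):
    """Packets per time unit; the slowest stage is the bottleneck either way."""
    return 1.0 / max(stage_times)


def lockstep_latency(stage_times):
    """Every stage is slowed down to the slowest stage's time."""
    return len(stage_times) * max(stage_times)


def handoff_latency(stage_times):
    """Stages run at their own pace and hand off explicitly (no queueing modeled)."""
    return sum(stage_times)


stages = [2, 5, 3]  # hypothetical per-stage processing times
print("throughput:", steady_state_throughput(stages))
print("lockstep latency:", lockstep_latency(stages))
print("handoff latency:", handoff_latency(stages))
```

In this model the throughput is identical in both designs; what changes is per-packet latency and the hardware or software needed to synchronize handoffs between unequal stages.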

Page 7: ECE 526 – Network Processing Systems Design

Design Tradeoffs (3)
• Parallelism: scale vs. packet ordering
  ─ Why is packet order important?
  ─ Giving up the packet order constraint gives better throughput
• Parallelism: speed vs. stateful classification
  ─ Shared state requires synchronization
  ─ Limits parallelism
• Memory: speed vs. programmability
  ─ Different types of memories give better performance
  ─ Increases difficulty in programming
• I/O performance vs. pin count
  ─ Packaging can be a major cost factor
  ─ More pins give higher performance
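One common way to keep packet order while still using parallel processors is to pin each flow to one processor. The sketch below is an assumption for illustration, not a technique named on the slide: `pick_engine` is a hypothetical helper that hashes a flow's 5-tuple, so all packets of a flow land on the same engine and stay in order as long as each engine processes its packets first-in, first-out.

```python
import zlib


def pick_engine(src, dst, sport, dport, proto, n_engines):
    """Map a flow's 5-tuple to an engine index in [0, n_engines)."""
    key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
    return zlib.crc32(key) % n_engines


# Two packets of the same (made-up) flow always map to the same engine.
first = pick_engine("10.0.0.1", "10.0.0.2", 1234, 80, 6, n_engines=8)
second = pick_engine("10.0.0.1", "10.0.0.2", 1234, 80, 6, n_engines=8)
print(first, second)
```

The cost is the one the slide points at: a single heavy flow cannot be spread over several engines, so per-flow ordering limits how well the load scales.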

Page 8: ECE 526 – Network Processing Systems Design

Design Tradeoffs (4)
• Programming languages
  ─ Ease of programming vs. functionality vs. speed
• Multithreading: throughput vs. programmability
  ─ Threads improve performance
  ─ Threads require more complex programs and synchronization
• Traffic management vs. blind forwarding at low cost
  ─ Traffic management is desirable but requires processing
• Generality vs. specific architecture role
  ─ NPs can be specialized for access, edge, core
  ─ NPs can be specialized towards certain protocols
• Memory type: special-purpose vs. general-purpose
  ─ SRAM and DRAM vs. CAM
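Why threads improve performance can be sketched with a common back-of-the-envelope model. It is an assumption, not taken from the slides: each thread computes for `compute` cycles and then stalls for `stall` cycles on a memory access, and context switches are free. The cycle counts below are made up.

```python
def processor_utilization(threads: int, compute: int, stall: int) -> float:
    """Fraction of cycles doing useful work when threads overlap memory stalls."""
    return min(1.0, threads * compute / (compute + stall))


# Hypothetical numbers: 10 cycles of work per 90-cycle memory access.
for n in (1, 2, 4, 8, 16):
    print(n, "threads ->", processor_utilization(n, compute=10, stall=90))
```

Under this model, utilization grows linearly with the thread count until the stall time is fully hidden. The programmability cost on the slide does not appear in the formula: the threads still have to coordinate access to shared state.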

Page 9: ECE 526 – Network Processing Systems Design

Design Tradeoffs (5)
• Backward compatibility vs. architectural advances
  ─ On component level: e.g., memories (DDR DRAM)
  ─ On system level: NP needs to fit into overall router system
• Parallelism vs. pipelining
  ─ Depends on usage of NP
• Summary:
  ─ Lots of choices
  ─ Most decisions require some insight into expected NP usage
  ─ Tradeoffs are all qualitative
• Let's look at commercial designs

Page 10: ECE 526 – Network Processing Systems Design

Novel Areas of NP Use
• TCP/IP offloading on high-performance servers
• Security processing: SSL offloading
• Storage area networks
• Many others, e.g., intrusion detection systems (IDSs)

Page 11: ECE 526 – Network Processing Systems Design

Performance Bottlenecks
• Memory
  ─ Bandwidth available, but access time too slow
  ─ Increasing delay for off-chip memory
• I/O
  ─ High-speed interfaces available
  ─ Cost problem with optical interfaces
  ─ Otherwise no problem
• Processing power
  ─ Individual cores are getting more complex
  ─ Problems with access to shared resources
  ─ Control processor can become bottleneck

Page 12: ECE 526 – Network Processing Systems Design

Limitations on Scalability
• What are the limitations on how fast NPs need to get?
  ─ Link rates (optical bandwidth limits)
  ─ Application complexity (core vs. edge)
• What are the limitations on how fast NPs can get?
  ─ Parallelism in networks
  ─ Power consumption
  ─ Chip area

Page 13: ECE 526 – Network Processing Systems Design

Commercial Network Processors
• Commercial NPs
  ─ Large variety of architectures
  ─ Different applications and performance spaces
  ─ Lots of implementation details and practical issues
• General Themes
  ─ Type and number of processors
    • Homogeneous vs. heterogeneous
  ─ Type and size of memories
  ─ Internal and external communication channels
  ─ Mechanisms of scalability: parallelism and pipelining
  ─ Generality vs. specialization

Page 14: ECE 526 – Network Processing Systems Design

Intel IXP1200: external connection

Page 15: ECE 526 – Network Processing Systems Design

Intel IXP1200: internal architecture

Page 16: ECE 526 – Network Processing Systems Design

Cisco PXF

Page 17: ECE 526 – Network Processing Systems Design

Motorola C-Port: conceptual design

Page 18: ECE 526 – Network Processing Systems Design

Motorola C-Port: internal architecture

Page 19: ECE 526 – Network Processing Systems Design

Motorola C-Port: channel processor

Page 20: ECE 526 – Network Processing Systems Design

IXP2400
• XScale (ARM-compliant) embedded control processor
  ─ Instruction and data caches
• 8 microengines
  ─ 400 or 600 MHz
• 8 threads per microengine
• Multiple instruction stores with 4k instructions
• 256 general-purpose registers
• 512 transfer registers
• 2 GB addressable DDR-DRAM memory (19.2 Gbps)
• 32 MB addressable QDR-SRAM memory (12 Gbps r+w)
• 16 words of Next Neighbor Registers
• 16 kB scratchpad
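The figures on this slide can be turned into a rough per-packet processing budget, which relates back to the earlier question of how much processing power per bandwidth is needed. In the sketch below, the microengine count, clock rate, and DRAM bandwidth come from the slide; the 2.5 Gbps link rate and 64-byte packet size are assumptions added for illustration, and the helper names are hypothetical.

```python
MICROENGINES = 8      # from the slide
CLOCK_HZ = 600e6      # from the slide (the faster of the two listed clocks)
DRAM_BPS = 19.2e9     # from the slide
LINK_BPS = 2.5e9      # assumption: link rate to be sustained
PACKET_BYTES = 64     # assumption: small-packet worst case


def cycles_per_packet(engines, clock_hz, link_bps, packet_bytes):
    """Aggregate microengine cycles available for each arriving packet."""
    packet_rate = link_bps / (8 * packet_bytes)
    return engines * clock_hz / packet_rate


def memory_passes(mem_bps, link_bps):
    """How many times each packet bit could cross the memory interface."""
    return mem_bps / link_bps


print("cycles per packet:",
      cycles_per_packet(MICROENGINES, CLOCK_HZ, LINK_BPS, PACKET_BYTES))
print("DRAM passes per packet:", memory_passes(DRAM_BPS, LINK_BPS))
```

These are upper bounds under the stated assumptions. Memory stalls, contention for shared resources, and control-processor work all reduce what is actually available.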

Page 21: ECE 526 – Network Processing Systems Design

IXP2400
• Interconnects
  ─ Coprocessor bus added (incl. access to T-CAM)
  ─ Flow control bus for two-chip configurations (e.g., ingress and egress)
• Switch Fabrics
  ─ No IX bus
  ─ Utopia 1, 2, 3
  ─ CSIX-L1
  ─ SPI-3 (POS-PHY 2/3)

Page 22: ECE 526 – Network Processing Systems Design

Two-Chip Configurations
• Flow control needed between ingress and egress
  ─ 1 Gbps over flow control bus (not shown)

Page 23: ECE 526 – Network Processing Systems Design

IXP2400 Internal Architecture

Page 24: ECE 526 – Network Processing Systems Design

IXP2400 Microengine
• Enhancements over IXP1200 microengines:
  ─ Multiplier unit
  ─ Pseudo-random number generator
  ─ CRC calculator
  ─ Four 32-bit timers and timer signaling
  ─ 16-entry CAM for inter-thread communication
  ─ Time stamping unit
  ─ Generalized thread signaling
  ─ 640 words of local memory
  ─ Simultaneous access to packet queues without mutual exclusion
  ─ Functional units for ATM segmentation and reassembly
  ─ Automated byte-alignment
  ─ Microengines (uE) divided into two clusters with independent command and SRAM buses
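How a small CAM can help threads coordinate may be easier to see in a toy software model. The sketch below is an assumption for illustration only: the slides give just the entry count (16), so the `TinyCam` class, its methods, and the least-recently-used replacement policy are hypothetical and do not describe the hardware interface.

```python
from collections import OrderedDict


class TinyCam:
    """Toy content-addressable memory: look up by tag, evict the LRU entry."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self._slots = OrderedDict()  # tag -> state, least recently used first

    def lookup(self, tag):
        """Return (hit, state); a hit marks the entry as most recently used."""
        if tag in self._slots:
            self._slots.move_to_end(tag)
            return True, self._slots[tag]
        return False, None

    def insert(self, tag, state):
        """Store state under tag, evicting the LRU entry if the CAM is full."""
        if tag in self._slots:
            self._slots.move_to_end(tag)
        elif len(self._slots) >= self.entries:
            self._slots.popitem(last=False)
        self._slots[tag] = state


cam = TinyCam()
cam.insert("queue-7", "held by thread 3")  # made-up tag and state
print(cam.lookup("queue-7"))
print(cam.lookup("queue-9"))
```

A thread can check in one lookup whether another thread on the same microengine already holds a given piece of state, instead of scanning a table in memory.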

Page 25: ECE 526 – Network Processing Systems Design

Software
• Support for software pipelining
  ─ "Reflector Mode Pathways" for communication
  ─ Next Neighbor Registers as programming abstraction
• SDK 4.0
  ─ Simulator, debugger, profiler, traffic generator
  ─ Portable modules
  ─ Provides better infrastructure support
  ─ C compiler

Page 26: ECE 526 – Network Processing Systems Design

Summary
• Network Processor design space is big due to
  ─ Varying design goals
  ─ Varying implementation choices
• Qualitative tradeoffs
• Survey of commercial NPs
• Network processors are getting more features
• Main architectural characteristic is still parallelism
• Software support is becoming more important