IXP Lab 2012: Part 1 Network Processor Brief

Upload: carla-medina

Post on 04-Jan-2016

Page 1: IXP Lab 2012: Part 1

IXP Lab 2012: Part 1

Network Processor Brief

Page 2: IXP Lab 2012: Part 1

NCKU CSIE CIAL Lab 2

Outline

• Network Processor
• Intel IXP2400
  • Processing Element
  • Register
  • Memory Interface
• IXP Programming
  • Language
  • Programming Model
  • Programming Syntax

Page 3: IXP Lab 2012: Part 1

Router Development (1)

• Software Based
  • General Purpose Processor
  • Flexible
  • Poor Performance
  • …
• Hardware Based
  • ASIC
  • Best Performance
  • Long Development Time

Page 4: IXP Lab 2012: Part 1

Router Development (2)

• Network Processor (NPU) Based
  • Balance of both
• How?
  • Parallel processors
  • Multi-threaded cores
  • Programmable processors with non-programmable coprocessors

Page 5: IXP Lab 2012: Part 1

Network Processor Overview

• For high-speed packet processing
• Comprises
  • Multiple cores for parallel execution
  • Multi-threaded cores
  • Reduced instruction set
  • Multiple memory interfaces

Page 6: IXP Lab 2012: Part 1

Hierarchical Layer

• Data-Plane
  • Fast-Path
  • Slow-Path
• Control-Plane
  • Routing Protocol
• Management-Plane
  • Monitor Applications
  • User Interface

Page 7: IXP Lab 2012: Part 1

Data-Plane

• Fast-Path
  • General Packet Handling
  • As fast as possible
• Slow-Path
  • Exception Packet Handling
    • Packet with options
  • Local TCP/IP Stack
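The fast-path/slow-path split above can be sketched in plain C. This is a minimal host-side illustration, not IXP code: the `pkt_summary` struct, the `classify_path` function, and the choice to reduce a packet to an IPv4 header length plus a "destined to this router" flag are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of a received IPv4 packet (not an IXP structure). */
struct pkt_summary {
    uint8_t ihl;        /* IPv4 header length in 32-bit words; 5 means no options */
    bool    for_local;  /* true if the packet is addressed to the router itself */
};

enum path { FAST_PATH, SLOW_PATH };

/* Exception packets (IP options present, or traffic for the local TCP/IP
 * stack) go to the slow path; everything else stays on the fast path. */
enum path classify_path(const struct pkt_summary *p)
{
    assert(p != NULL);
    if (p->ihl > 5 || p->for_local)
        return SLOW_PATH;
    return FAST_PATH;
}
```

Keeping the test this small is the point of the split: the common case is decided with one or two comparisons, and anything unusual is handed to slower, more general code.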

Page 8: IXP Lab 2012: Part 1

Internet eXchange Processor

• First Generation
  • IXP1200, IXP1240, IXP1250
• Second Generation
  • IXP2400, IXP2800, IXP2850
  • IXP2805, IXP2855
• Others
  • IXP4XX

Page 9: IXP Lab 2012: Part 1

Network Flow Processor

• By Netronome
• From Intel IXP2XXX
• NFP-3240, NFP-3216

Page 10: IXP Lab 2012: Part 1

Intel IXP2400 Block Diagram

Page 11: IXP Lab 2012: Part 1

IXP2400 Overview

• Functional Block
  • Processing Element
  • Memory Interfaces
  • Coprocessors
  • Other Interfaces
• Hierarchical View

Page 12: IXP Lab 2012: Part 1

Processing Element

• Programmability
• Hierarchical Processing Elements
  • XScale
  • Microengine (ME)

Page 13: IXP Lab 2012: Part 1

XScale

• RISC-based processor (ARMv5TE)
• Real-time OS
  • MontaVista Linux
• ME Management
  • Control ME execution
  • Resource Management

Page 14: IXP Lab 2012: Part 1

MicroEngine (1)

• Eight MEs per IXP2400 (work in parallel)
• Eight threads per ME
• The ME instruction set is reduced, for packet processing only
  • Not as powerful as a general-purpose processor
  • No floating-point instructions
  • No divide instruction
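Because the ME has no divide instruction, a division that cannot be replaced by a shift has to be done in software. The sketch below shows one common way, restoring shift-and-subtract division, written as ordinary host C. The name `soft_udiv32` is hypothetical, and this is not taken from the Intel libraries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unsigned 32-bit division using only shifts,
 * compares and subtracts (restoring division, one quotient bit per step). */
uint32_t soft_udiv32(uint32_t dividend, uint32_t divisor, uint32_t *remainder)
{
    uint32_t quotient = 0;
    uint64_t rem = 0;   /* 64-bit in this host sketch so the shift cannot overflow */
    int bit;

    assert(divisor != 0);
    for (bit = 31; bit >= 0; bit--) {
        /* Bring down the next dividend bit, most significant first. */
        rem = (rem << 1) | ((dividend >> bit) & 1u);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint32_t)1 << bit;
        }
    }
    if (remainder != NULL)
        *remainder = (uint32_t)rem;
    return quotient;
}
```

The loop always runs 32 iterations, which is why fast-path code usually tries to use power-of-two sizes so that a shift or mask can stand in for the division.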

Page 15: IXP Lab 2012: Part 1

MicroEngine (2)

• No OS
  • Not interactive
  • Managed by XScale
• Code Store (4K instructions)
  • Executing

Page 16: IXP Lab 2012: Part 1

MicroEngine Threads

• Concurrent execution
• Non-preemptive
• Round-robin execution
• Each thread owns its private set of registers
• Zero-overhead context switching

Page 17: IXP Lab 2012: Part 1

Registers of ME

• 256 GPRs
• 256 SRAM Transfer Registers
  • 128 Read
  • 128 Write
• 256 DRAM Transfer Registers
  • 128 Read
  • 128 Write
• 128 Next Neighbor Registers
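As a rough worked example of what these totals mean per thread: if a register file is split evenly among the eight threads of an ME (an assumption of this sketch; the slide only gives the totals), each thread's private share is the total divided by eight. The constant and function names below are hypothetical.

```c
#include <assert.h>

/* Totals taken from the slide; the names are illustrative only. */
enum {
    ME_THREADS        = 8,
    ME_GPRS           = 256,
    ME_SRAM_XFER_READ = 128,
    ME_DRAM_XFER_READ = 128
};

/* Registers each thread gets when `total` registers are split evenly
 * among `threads` threads (assumes an even split is possible). */
unsigned regs_per_thread(unsigned total, unsigned threads)
{
    assert(threads != 0 && total % threads == 0);
    return total / threads;
}
```

Under that assumption a thread has 32 GPRs and 16 SRAM read transfer registers to itself, which is why register pressure shows up quickly in ME code.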

Page 18: IXP Lab 2012: Part 1

Context Switch

• Register contents need not be swapped out and back in during a context switch.
• With this mechanism, when a thread swaps out after issuing a memory request, another thread can swap in and do useful work to cover the long memory latency.
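The latency-hiding effect described above can be illustrated with a small host-side simulation. It is a simplified model under stated assumptions: every thread repeats the same pattern (compute for `work` cycles, issue one memory request, swap out until the data returns), switching costs nothing, and the next ready thread is picked round-robin. The function name and parameters are hypothetical.

```c
#include <assert.h>

#define MAX_THREADS 8

/* Simulate `threads` ME threads for about `total_cycles` cycles and return
 * the number of cycles the ME spent executing instructions. The final run
 * may finish slightly past `total_cycles`. */
unsigned simulate_busy_cycles(unsigned threads, unsigned work,
                              unsigned latency, unsigned total_cycles)
{
    unsigned ready_at[MAX_THREADS] = {0};  /* cycle at which each thread's data is back */
    unsigned now = 0, busy = 0, current = 0;

    assert(threads >= 1 && threads <= MAX_THREADS && work >= 1);
    while (now < total_cycles) {
        unsigned tried = 0;

        /* Round-robin search for a thread whose memory request has completed. */
        while (tried < threads && ready_at[current] > now) {
            current = (current + 1) % threads;
            tried++;
        }
        if (tried == threads) {   /* every thread is waiting: the ME idles */
            now++;
            continue;
        }
        /* Run the thread until it voluntarily swaps out (no preemption). */
        now += work;
        busy += work;
        ready_at[current] = now + latency;
        current = (current + 1) % threads;
    }
    return busy;
}
```

With 10 cycles of work per request and a 90-cycle memory latency, one thread keeps the ME busy for 100 of 1000 cycles, while eight threads keep it busy for 800; once the combined work of the other threads covers the latency, the ME never idles.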

Page 19: IXP Lab 2012: Part 1

Memory Interface of IXP2400

• Local Memory
  • Smallest and fastest
• Scratchpad
  • Passes packet handles
• SRAM
  • Holds data structures for packet processing
• DRAM
  • Largest and slowest
  • Holds packet contents

Page 20: IXP Lab 2012: Part 1

Local Memory

• Per ME
  • Private: not accessible by other MEs
  • Private: not accessible by XScale
• Size: 2560 bytes (640 LWs)
• Usage
  • Variable spilling
  • Caching
• Latency: 3 cycles

Page 21: IXP Lab 2012: Part 1

Scratchpad

• On-chip memory
• Shared by all MEs
• Size: 16 KB (fixed)
• Usage:
  • Scratchpad
  • Scratch Ring (hardware FIFO)
• Latency: ~60 cycles

Page 22: IXP Lab 2012: Part 1

SRAM

• Off-chip memory
• Shared by all MEs (2 channels)
• Size: 64 MB (per channel, at maximum)
• Usage:
  • Hardware FIFO
  • Holds data structures
  • Holds meta-data of packets
• Latency: ~90 cycles

Page 23: IXP Lab 2012: Part 1

DRAM

• Off-chip memory
• Shared by all MEs (1 channel)
• Size: 1 GB (at maximum)
• Usage:
  • Holds whole packet contents
  • Alternative space for data structures
• Latency: ~120 cycles
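The four memory levels described in the preceding slides can be collected into one table. The sketch below is host C for illustration only: the sizes and latencies are the figures quoted in the slides, while the `mem_level` struct and the `fastest_fit` helper are hypothetical. It picks a level by size alone; real placement also depends on who must share the data and how often it is accessed.

```c
#include <assert.h>
#include <stddef.h>

struct mem_level {
    const char   *name;
    unsigned long max_bytes;  /* capacity quoted in the slides (per ME, chip, or channel) */
    unsigned      latency;    /* approximate access latency in cycles */
};

/* Ordered fastest to slowest. */
static const struct mem_level IXP2400_MEM[] = {
    { "Local Memory", 2560UL,                 3 },
    { "Scratchpad",   16UL * 1024,           60 },
    { "SRAM",         64UL * 1024 * 1024,    90 },
    { "DRAM",         1024UL * 1024 * 1024, 120 },
};

/* Return the fastest level that can hold `bytes`, or NULL if none can. */
const struct mem_level *fastest_fit(unsigned long bytes)
{
    size_t i;

    assert(bytes > 0);
    for (i = 0; i < sizeof IXP2400_MEM / sizeof IXP2400_MEM[0]; i++)
        if (bytes <= IXP2400_MEM[i].max_bytes)
            return &IXP2400_MEM[i];
    return NULL;
}
```

For example, a 64-byte per-thread cache fits in Local Memory, a 4 KB table needs the Scratchpad, and a multi-megabyte lookup structure lands in SRAM.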

Page 24: IXP Lab 2012: Part 1

Coprocessor

• MSF (Media Switch Fabric)
  • Receive packet to DRAM
  • Transmit packet from DRAM
• SHaC
  • Scratchpad
  • Hash Unit
  • CAP

Page 25: IXP Lab 2012: Part 1

Packet META-DATA (1)

• Data for processing packets
• How to identify a packet?
  • Packet Handle
• Packet Temporal Information
  • Not related to packet content
• Meta-data
  • Input Port, Output Port
  • Info for Packet
  • Address in DRAM
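One way to picture the handle and meta-data described above is a small record per packet plus an index that identifies it. The layout below is a hypothetical sketch in host C, not the Intel framework's actual structures: the field names, the field widths, the `length` field, and the table size are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical meta-data record; field names and widths are illustrative. */
struct pkt_meta {
    uint32_t dram_addr;  /* where the packet contents were stored in DRAM */
    uint16_t length;     /* packet length in bytes (assumed field) */
    uint8_t  in_port;
    uint8_t  out_port;
};

/* Hypothetical handle: the index of the packet's record in a meta-data
 * table. On the IXP2400 such a table would typically live in SRAM. */
typedef uint32_t pkt_handle;

#define META_TABLE_SIZE 256
static struct pkt_meta meta_table[META_TABLE_SIZE];

/* Look up the meta-data record that a handle refers to. */
struct pkt_meta *meta_from_handle(pkt_handle h)
{
    assert(h < META_TABLE_SIZE);
    return &meta_table[h];
}
```

The design point is that only the one-longword handle moves between processing stages; the packet contents stay in DRAM and the meta-data stays in its table.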

Page 26: IXP Lab 2012: Part 1

Packet META-DATA (2)

• How to pass this info between MEs?
• Hardware FIFO
  • Scratch Ring
  • SRAM Ring
  • Next-Neighbor Ring
• Issues
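The rings listed above are FIFOs implemented in hardware. As a software model of the behaviour only, the sketch below implements a fixed-size ring of packet handles in host C. The names `ring`, `ring_put`, and `ring_get` and the ring size are hypothetical, and real scratch or SRAM rings are accessed through dedicated ME instructions rather than functions like these.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u   /* power of two so indices wrap with a mask */

struct ring {
    uint32_t slot[RING_SIZE];
    unsigned head;   /* count of handles read so far */
    unsigned tail;   /* count of handles written so far */
};

/* Producer side: returns false when the ring is full. */
bool ring_put(struct ring *r, uint32_t handle)
{
    assert(r != NULL);
    if (r->tail - r->head == RING_SIZE)
        return false;            /* full: the producer must retry or drop */
    r->slot[r->tail++ & (RING_SIZE - 1)] = handle;
    return true;
}

/* Consumer side: returns false when the ring is empty. */
bool ring_get(struct ring *r, uint32_t *handle)
{
    assert(r != NULL && handle != NULL);
    if (r->tail == r->head)
        return false;            /* empty: nothing to process yet */
    *handle = r->slot[r->head++ & (RING_SIZE - 1)];
    return true;
}
```

The full and empty cases are where the practical issues arise: a producer stage that outruns its consumer must wait or drop packets, and a consumer polling an empty ring wastes cycles.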

Page 27: IXP Lab 2012: Part 1

Hierarchical View (Setting #1)

• Only one IXP2400-based board
• Data-Plane
  • Fast-Path: Microengine
  • Slow-Path: XScale
• Control-Plane
  • XScale
• Management-Plane
  • XScale

Page 28: IXP Lab 2012: Part 1

Hierarchical View (Setting #2)

• Multiple IXP2400-based boards
• Data-Plane
  • Fast-Path: Microengine
  • Slow-Path: XScale
• Control-Plane
  • CPU
• Management-Plane
  • CPU

Page 29: IXP Lab 2012: Part 1

Programming IXP2400

• XScale
  • Programming with C
• Microengine
  • Programming with MicroC or Microcode
  • We will focus on this part!

Page 30: IXP Lab 2012: Part 1

IDE Tool--IXA SDK Workbench

Page 31: IXP Lab 2012: Part 1

ME Language

• MicroC
  • Subset of ANSI C
  • Only a limited part of the standard C libraries is implemented
  • Intrinsic library supporting IXP operations
• Microcode
  • High-level assembly

Page 32: IXP Lab 2012: Part 1

Programming Model (1)

• Receive – Processing – Transmit
• Intel provides sample code for receive and transmit.
• We focus only on the processing part.

[Diagram: RX → PROCESSING → TX]

Page 33: IXP Lab 2012: Part 1

Programming Model (2)

• Processing ME
  • Pipeline Model
  • Parallel Model
  • Mixed Model

[Diagram: RX → PROCESSING → TX]

Page 34: IXP Lab 2012: Part 1

Pipeline Model

[Diagram: RX → PROC #1 → PROC #2 → TX]

• Each stage controls all the resources of its ME
• Hard to balance between different stages

Page 35: IXP Lab 2012: Part 1

Parallel Model

[Diagram: RX, PROC #1, PROC #2, TX]

• Easy to balance
• Higher performance
• Resources are limited
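The balance argument for the pipeline and parallel models can be made concrete with a back-of-the-envelope throughput model. The sketch below is an idealised host-C calculation, not a measurement: it assumes fixed per-packet stage costs in cycles, one ME per pipeline stage, no cost for passing packets between MEs, and it ignores pipeline fill time and code-store limits. The function names are hypothetical.

```c
#include <assert.h>

/* Packets completed in `budget` cycles by a pipeline with one ME per stage:
 * in steady state the slowest stage sets the pace. */
unsigned long pipeline_packets(const unsigned *stage_cycles, unsigned stages,
                               unsigned long budget)
{
    unsigned slowest = 0, i;

    assert(stage_cycles != 0 && stages > 0);
    for (i = 0; i < stages; i++)
        if (stage_cycles[i] > slowest)
            slowest = stage_cycles[i];
    assert(slowest > 0);
    return budget / slowest;
}

/* Packets completed in `budget` cycles by `mes` MEs that each run every
 * stage back to back on their own packets. */
unsigned long parallel_packets(const unsigned *stage_cycles, unsigned stages,
                               unsigned mes, unsigned long budget)
{
    unsigned long total = 0;
    unsigned i;

    assert(stage_cycles != 0 && stages > 0);
    for (i = 0; i < stages; i++)
        total += stage_cycles[i];
    assert(total > 0);
    return mes * (budget / total);
}
```

With two MEs, a 3600-cycle budget, and unbalanced stages of 30 and 90 cycles, the pipeline completes 40 packets while the parallel arrangement completes 60. With balanced stages of 60 and 60 cycles both complete 60, which is why stage balance matters so much for the pipeline model.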

Page 36: IXP Lab 2012: Part 1

Mixed Model

[Diagram: RX, PROC #1, PROC #2, PROC #3, TX]

Page 37: IXP Lab 2012: Part 1

MicroC Example 1 (1)

    /* Defined in part (2). */
    void reverse_array(volatile int* old, volatile int* new, int size);

    void main () {
        /* Source and destination arrays, both placed in SRAM. */
        _declspec(shared sram) int old_array[] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
        _declspec(shared sram) int new_array[sizeof(old_array)/sizeof(int)];

        global_label("start_reverse");
        reverse_array(old_array, new_array, sizeof(old_array)/sizeof(int));
        global_label("end_reverse");
    }

Page 38: IXP Lab 2012: Part 1

MicroC Example 1 (2)

    void reverse_array(volatile int* old, volatile int* new, int size) {
        int index = 0;

        /* Copy old[] into new[] in reverse order. */
        for (index = 0; index < size; index++) {
            new[index] = old[size - index - 1];
        }
    }

Page 39: IXP Lab 2012: Part 1

MicroC Example 2

    /* Read the 2-longword node at index `current` (8 bytes per node) from SRAM. */
    sram_read(&sram_egt_dim1_2_node,
              (__declspec(sram) unsigned int *)(PACKET_CLASSIFICATION_SRAM_BASE1 + current*8),
              2, sig_done, &sram_read_sig_dim1_2);

    /* Wait until the read has completed, then use the returned node. */
    __wait_for_all(&sram_read_sig_dim1_2);
    temp = sram_egt_dim1_2_node.next_dim;
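The address arithmetic in Example 2 (table base plus `current*8`, reading 2 longwords) can be tried on a host machine with a stand-in for SRAM. Everything below is a hypothetical model in plain C: `fake_sram`, `fake_sram_read`, `read_node`, and the `payload` field are assumptions (only `next_dim` appears in the slide), and the real `sram_read` is asynchronous and completes via a signal, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SRAM_WORDS 64u
static uint32_t fake_sram[SRAM_WORDS];   /* host-side stand-in for SRAM */

/* Hypothetical two-longword node; only next_dim appears in the slide. */
struct node {
    uint32_t payload;
    uint32_t next_dim;
};

/* Copy `count` longwords starting at byte address `addr` into `dst`. */
void fake_sram_read(void *dst, uint32_t addr, unsigned count)
{
    assert(addr % 4 == 0 && addr / 4 + count <= SRAM_WORDS);
    memcpy(dst, &fake_sram[addr / 4], count * sizeof(uint32_t));
}

/* Fetch node number `current` from a table that starts at byte address `base`. */
struct node read_node(uint32_t base, uint32_t current)
{
    struct node n;

    fake_sram_read(&n, base + current * 8, 2);   /* 8 bytes = 2 longwords per node */
    return n;
}
```

Each node occupies two consecutive longwords, so node 3 of a table based at byte 16 starts at byte 40, which is longword 10 of the array.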

Page 40: IXP Lab 2012: Part 1

1. Copy IXA_SDK_3.51 and ixp_book to D:\, then reboot.

3. Press [Ctrl+Enter] to enter the recovery card administrator mode.

4. Password: davidchang

5. Extract ixasdk351cd1windows.zip, ixasdk351cd3.zip, and ixasdk351framework.zip, then install them in that order (reboot after installing cd1).

6. Copy the ixp_book directory to C:\.