ittc high-performance networking - university of kansas

High-Performance Networking The University of Kansas EECS 881 End Systems and Network Interface © 2004–2010 James P.G. Sterbenz 16 November 2010 James P.G. Sterbenz Department of Electrical Engineering & Computer Science Information Technology & Telecommunications Research Center The University of Kansas [email protected] http://www.ittc.ku.edu/~jpgs/courses/hsnets rev. 10.0


© James P.G. Sterbenz ITTC
High-Performance Networking

The University of Kansas EECS 881
End Systems and Network Interface

© 2004–2010 James P.G. Sterbenz
16 November 2010

James P.G. Sterbenz

Department of Electrical Engineering & Computer Science
Information Technology & Telecommunications Research Center

The University of Kansas

[email protected]

http://www.ittc.ku.edu/~jpgs/courses/hsnets

rev. 10.0

16 November 2010 KU EECS 881 – High-Speed Networking – End Systems HSN-ES-2

© James P.G. Sterbenz ITTC

End Systems and Network Interface: Outline

ES.1 End system components
ES.2 Protocol and OS software
ES.3 End system organisation
ES.4 Host–network interface architecture

End Systems and Network Interface: Outline

ES.1 End system components
ES.2 Protocols & OS software
ES.3 End system organisation
ES.4 Host–network interface

[Figure: two end systems, each with application, session, transport, network, and link layers, communicating across a network of nodes that each have network and link layers]

Ideal Network: End System Principle

[Figure: two end systems, each with CPU, memory (M), and application, connected by an ideal network with infinite bandwidth (R = ∞) and zero latency (D = 0)]

End System Principle E-II

The communicating end systems are a critical component in end-to-end communications and must provide a low-latency, high-bandwidth path between the network interface and application memory.

End Systems: Application Primacy

• End systems have limited resources to be used by
  – applications
  – inter-application communication
• Protocol processing
  – must not significantly interfere with applications themselves
  – protocol benchmarks must consider this

Application Primacy E-I

Optimisation of communications processing in the end system must not degrade the performance of the applications using the network.

End Systems and Network Interface: ES.1 End System Components

ES.1 End system components
  ES.1.1 End system hardware
  ES.1.2 End system software
  ES.1.3 End system bottlenecks
  ES.1.4 Traditional end system implementation
  ES.1.5 Ideal end system implementation
ES.2 Protocol and OS software
ES.3 End system organisation
ES.4 Host–network interface architecture

End System Components: Hardware

[Figure: end system hardware components: CPU, memory, interconnect, I/O control, user I/O interface, mass storage, and network interface attached to the network]

End System Components: Software

• Operating system
  – memory management
  – process scheduler
  – I/O subsystem
• Protocol stack
• Applications

End System Components: Traditional End System Implementation

• Communication
  – handled as I/O
• I/O mechanisms not optimised for communication
  – transfer per packet
    • multiple per ADU
• Protocol implementation
  – process per layer
  – multiple copies of data
  – many context switches

[Figure: time-sequence diagram across application program, transport protocol, network protocol, OS scheduler, IOP software, and network interface. A send request blocks the caller and causes a context switch at each layer; the sequence send data, initiate I/O, I/O request, process I/O, control setup, transmit packet, end I/O, service interrupt, I/O return repeats per packet before the application continues.]

End System Components: End System Bottlenecks

Systemic Elimination of End System Bottlenecks E-IV

The host organisation, processor–memory interconnect, memory subsystem, operating system, protocol stack, and host–network interface are all critical components in end system performance, and must be optimised in concert with one another.

• Systemic elimination of bottlenecks is necessary
  – host organisation
  – memory subsystem
  – processor–memory interconnect
  – operating system
  – protocol stack

End System Components: End System Bottlenecks

End System Layering Principle E-4A

Layered protocol architecture does not depend on a layered process implementation in the end system.

• More efficient protocol implementation
  – not process per layer
    • reduce context switches
    • reduce copies
  – don’t treat communications like I/O

End System Components: Importance of Networking

Importance of Networking in the End System E-I.4

Networking should be considered a first-class citizen of the end system computing architecture, on a par with memory or high-performance graphics subsystems.

• Importance of networking in the end system
  – networking should be considered a first-class citizen
    • in system design
    • in performance specifications
    • in purchase decisions
  – what do users do with their PCs? Web surf. P2P file sharing.

End System Components: Protocol Constraints

Optimise and Enhance Widely Deployed Protocols E-III.7

The practical difficulty in replacing protocols widely deployed on end systems indicates that it is important to optimise existing protocol implementations and add backward-compatible enhancements, rather than only trying to replace them with new protocols.

• Widely deployed protocols are difficult to replace
  – important to optimise existing protocols
  – add backward-compatible enhancements for interoperability
• Replace with new protocols only when necessary

Ideal End System Model: Bandwidth and Latency

• Data shifted directly between application memory
• But
  – non-trivial latency
    • processor can’t block
  – where to put data
  – channel not reliable

Consequence?

[Figure: end systems ES1 and ES2, each with CPU, VRAM, and application, joined by an ideal channel with D = 0 and R = ∞]

Ideal End System Model: Bandwidth and Latency

• Data shifted directly between application memory
• But
  – non-trivial latency
    • processor can’t block
  – where to put data
  – channel not reliable
• Need transport protocol

Copy Minimisation Principle E-II.3

Data copying, or any operation that involves a separate sequential per byte touch of the data, should be avoided. In the ideal case, a host–network interface should be zero copy.

[Figure: end systems ES1 and ES2, each with CPU, VRAM, and application, joined by an ideal channel with D = 0 and R = ∞]

End Systems and Network Interface: ES.2 Protocol and OS Software

ES.1 End system components
ES.2 Protocol and OS software
  ES.2.1 Protocol software
  ES.2.2 Operating systems
  ES.2.3 Protocol software optimisations
ES.3 End system organisation
ES.4 Host–network interface architecture

Protocol and OS Software: Critical Path

• Critical path
  – operations required for data transfer
    • bottlenecks
  – operations that happen frequently have greater overall impact

[Figure: instruction sequence containing a branch and a loop, with the critical path marked]

Critical Path Principle E-1B

Optimise end system critical path protocol processing software and hardware, consisting of normal data path movement and the control functions on which it depends.

Protocol and OS Software: Protocol Processing Classes

• Data transfer
  – data movement (to/from network and intra-host)
  – bit error detection and correction
  – buffering for retransmission
  – encryption/decryption
  – presentation formatting (e.g. ASN.1 or XDR)

+ These functions are part of the critical path

Protocol and OS Software: Protocol Processing Classes

• Transfer control
  – flow and congestion control
  – lost and mis-sequenced packet detection
  – acknowledgements
  – multiplexing/demultiplexing flows
  – time stamping and clock recovery of real-time packets
  – formatting
    • framing/delineation
    • encapsulation/decapsulation
    • fragmentation/reassembly

o These functions may be part of the critical path
  – analysis is needed to determine dependency

Protocol and OS Software: Protocol Processing Classes

• Asynchronous control
  – connection setup and modification
  – per connection granularity flow and congestion control
  – routing algorithms and link state updates
  – session control

– These functions are not part of the critical path

Protocol and OS Software: Context Switch Avoidance

• Context switches
  – transmission of packets
  – process per layer

Avoidance?

[Figure: process 1 with its data (data 1) and process control block (PCB 1a)]

Protocol and OS Software: Context Switch Avoidance

• Context switches
  – transmission of packets
  – process per layer
• Avoidance
  – thread per layer
  – ILP

Context Switch Avoidance E-II.6a

The number of context switches should be minimised, and approach one per application data unit.

[Figure: process 1 with data 1 and PCB 1a; process 2 with data 2 and two control blocks, PCB 2a and PCB 2b, for thread a and thread b]
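To make the "one per application data unit" target concrete, the following is a minimal counting sketch. It assumes a deliberately crude model in which a process-per-layer implementation incurs one context switch per layer for every packet; the ADU size, payload size, and layer count are illustrative values that do not come from the slides.

```python
import math

def context_switches(adu_bytes, payload_bytes, layers, process_per_layer=True):
    """Rough count of context switches needed to send one ADU.

    Crude illustrative model (an assumption, not a measurement):
    process per layer costs one switch per layer for every packet.
    """
    packets = math.ceil(adu_bytes / payload_bytes)
    if process_per_layer:
        return packets * layers
    # Target stated by the principle: approach one switch per ADU.
    return 1

# Assumed example values: 64 KiB ADU, 1460-byte payloads, 3 protocol processes.
adu, payload, layers = 64 * 1024, 1460, 3
print("process per layer:", context_switches(adu, payload, layers))
print("integrated path:  ", context_switches(adu, payload, layers, process_per_layer=False))
```

Even under this simple model the gap grows linearly with ADU size, which is why the principle counts switches per ADU rather than per packet.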

Protocol and OS Software: Polling vs. Interrupts

• Interrupts incur significant overhead
  – force context switch to OS
• Polling
  – avoids overhead of context switch
  – requires knowledge of when information arrives
    • polling interval critical to avoid wasted cycles

Interrupt vs. Polling E-4h

Interrupts provide the ability to react to asynchronous events, but are expensive operations. Polling can be used when a protocol has knowledge of when information arrives.

Protocol and OS Software: Kernel Crossing Avoidance

• User state
  – unprivileged
  – significant overhead
    • authorisation
    • parameter checks
    • context switch

[Figure: application, transport protocol, and network protocol in user space, above kernel functions: schedule, buffer manage, transmit/receive, multiplex/demux]

Protocol and OS Software: Kernel Crossing Avoidance

• User state
  – unprivileged
  – significant overhead
    • authorisation
    • parameter checks
    • context switch
• Kernel: trusted

User/Kernel Crossing Avoidance E-II.6k

The number of user space calls to the kernel should be minimised due to the overhead of authorisation and security checks, the copying of buffers, and the inability to directly invoke needed kernel functions.

[Figure: two placements of the protocol stack. In the first, the application, transport protocol, and network protocol run in user space above the kernel functions (schedule, buffer manage, transmit/receive, multiplex/demux). In the second, only the application runs in user space; the transport and network protocols run in the kernel with those functions.]

Protocol and OS Software: Memory Management and Remapping

• Virtual memory
  – translates addresses
  – avoids user ↔ kernel copy
    • map both to the same real address r

[Figure: kernel and user1 virtual address spaces, each with its own page table and PCB; virtual addresses vk and vu (page v.p, offset v.o) translate to the same location r in real memory]

Avoid Data Copies by Remapping E-II.3m

Use memory and buffer remapping techniques to avoid the overhead of copying and moving blocks of data.
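As a user-space analogy only, the difference between copying a buffer and referring to the same memory can be sketched in Python. This is not virtual-memory remapping (that is done by the operating system's page tables); it merely illustrates the effect of two names sharing one region instead of duplicating it. The buffer contents are made up for the example.

```python
# One buffer holding a header and a payload (illustrative contents).
buf = bytearray(b"header|payload")

copied = bytes(buf[7:])        # slicing a bytearray copies the payload bytes
shared = memoryview(buf)[7:]   # a memoryview slice refers to the same memory

buf[7] = ord("P")              # modify the original buffer in place

print(copied)                  # the copy still holds the old bytes
print(bytes(shared))           # the shared view sees the change
```

The copy costs a per-byte pass over the data; the view costs only a small fixed amount of bookkeeping, which is the property the remapping principle is after.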

Protocol and OS Software: Resource Reservation

• Application-to-application QOS requires
  – network over-provisioning or reservations
  – end system over-capacity or reservations
    • CPU cycles
    • memory
    • bus or interconnect bandwidth

Path Protection Corollary E-II.2

In a resource constrained host, mechanisms must exist to reserve processing and memory resources needed to provide the high-performance path between application memory and the network interface and to support the required rate of protocol processing.

Protocol and OS Software: Optimisations: Protocol Bypass

• Protocol bypass
  – critical path optimisation
  – receive and send bypass
    • data manipulation
    • critical transfer control
    • shared data with stack
  – normal protocol stack
    • non-critical path

[Figure: end system between application and network; send bypass and receive bypass paths run alongside the normal protocol stack and share data with it, with a send filter template and a receive filter template]

Protocol and OS Software: Optimisations: Integrated Layer Processing

• Operations in single ILP loop
  – software or hardware
• Avoids overhead
  – inter process or thread
  – eliminates copy

Side effects?

[Figure: conventional processing performs transport layer framing, payload checksum, end-to-end encryption, network layer framing, and DMA transfer as separate steps; the ILP loop combines transport layer framing, payload checksum, end-to-end encryption, and network layer framing, followed by the DMA copy]

Protocol and OS Software: Optimisations: Integrated Layer Processing

ILP Principle E-4E

All passes over the protocol data units (including layer encapsulations/decapsulations) that take place in a particular component of the end system (CPU, network processor, or network interface hardware) should be done at the same time.

• Operations in single ILP loop
  – software or hardware
• Avoids overhead
  – inter process or thread
  – eliminates copy
• Side effects
  – big cache miss penalty

[Figure: conventional processing performs transport layer framing, payload checksum, end-to-end encryption, network layer framing, and DMA transfer as separate steps; the ILP loop combines transport layer framing, payload checksum, end-to-end encryption, and network layer framing, followed by the DMA copy]
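The idea of a single ILP loop can be sketched by comparing two passes over the data with one integrated pass. The byte-sum checksum and the XOR "cipher" below are toy stand-ins chosen for brevity; they are assumptions for illustration and are neither a real transport checksum nor real encryption.

```python
def separate_passes(data: bytes, key: int):
    """Layered style: each function makes its own pass over every byte."""
    checksum = sum(data) & 0xFFFF              # pass 1: toy checksum
    cipher = bytes(b ^ key for b in data)      # pass 2: toy XOR cipher
    return checksum, cipher

def integrated_loop(data: bytes, key: int):
    """ILP style: checksum and cipher share one loop, touching each byte once."""
    checksum = 0
    cipher = bytearray(len(data))
    for i, b in enumerate(data):
        checksum = (checksum + b) & 0xFFFF
        cipher[i] = b ^ key
    return checksum, bytes(cipher)

payload = b"application data unit"
print(separate_passes(payload, 0x5A) == integrated_loop(payload, 0x5A))
```

Both functions produce the same result; the integrated version simply reads each byte once. The cache-miss side effect noted above is plausibly related to the larger combined working set of a fused loop, which this toy example is too small to show.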

End Systems and Network Interface: ES.3 End System Organisation

ES.1 End system components
ES.2 Protocol and OS software
ES.3 End system organisation
  ES.3.1 Host interconnects
  ES.3.2 Host–network interconnection alternatives
  ES.3.3 Host–network interface issues
ES.4 Host–network interface architecture

End System Organisation: Host Interconnects

• PIO
  – via CPU
  – 2 bus transfers
• DMA reduces contention
• Separate P–M and I/O bus helps isolate I/O effects

[Figure: two host organisations. One has CPU, memory (M), network DMA controller, and peripherals on a shared bus. The other has CPU and M on a processor–memory bus, with an I/O controller (IOC) connecting to an I/O bus that serves the peripherals and the network.]

End System Organisation: Bus Evolution

• Single bus (e.g. ISA)
  – early PCs and workstations
• Separate memory bus
  – isolates effects of slow I/O from CPU–memory and DMA
  – common in later high-performance workstations
• Richer interconnect
  – first used in 1960s and 1970s mainframes
  – finally emerging in PCs and workstations (e.g. InfiniBand)

End System Organisation: Nonblocking Host Interconnects

• Scalable host interconnects
  – when bus interconnects saturate
  – used in high-performance systems
  – crossbar: O(n²), good for small n
  – O(n log n) for large n

Nonblocking Host–Network Interconnect E-II.4

The interconnect between the end system memory and the network interface should be nonblocking, and should not interfere with peripheral I/O or CPU–memory data transfer.

[Figure: switched host interconnect joining CPUs with caches ($), memory modules (M), and I/O processors (IOP) for peripherals, mass storage, and the network]
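The scaling argument for crossbars versus larger switch fabrics can be tabulated with a short sketch. The functions below compare orders of growth only; the constant factor of 1 on the n log n term is an assumption, since the actual cost depends on the fabric design.

```python
import math

def crossbar_cost(n: int) -> int:
    """Crosspoints in an n x n crossbar: grows as n squared."""
    return n * n

def multistage_cost(n: int) -> float:
    """Order-of-growth estimate n * log2(n); constant factor assumed to be 1."""
    return n * math.log2(n)

for n in (4, 16, 64, 256):
    print(f"n={n:4d}  crossbar={crossbar_cost(n):6d}  n log n={multistage_cost(n):8.0f}")
```

For small n the two are close, which is consistent with the slide's note that a crossbar is a good choice there; the gap widens quickly as n grows.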

End System Organisation: Nonblocking Host–Network Interconnects

• MIA – NP access to back end of memory
• IIA – NP direct access to P–M interconnect

[Figure: two architectures, each with CPUs and caches ($), memory (M), and IOPs for peripherals and mass storage. In the IIA, the NP attaches directly to the processor–memory interconnect and to the network. In the MIA, the NP reaches the memory modules (M, CMM) from their back end and attaches to the network.]

End System Organisation: System Area Networks

• Unification of host interconnect and network:
  – bringing the network into the end system
  – spreading the end system across the network
• System area networks (SAN)
  – ideas based on 1960s/1970s mainframe architectures
  – switched inter-CPU and I/O communication
  – technologies
    • ESCON/FICON (enterprise systems / fiber connection)
      – originally extension of IBM sys/370, sys/390 channels
    • FC switching: fibre channel
    • IBA: InfiniBand architecture

End System Organisation: Example 6.1 InfiniBand Architecture

• InfiniBand SAN
  – 2.5–96 Gb/s
• Packet switched communications architecture for:
  – IPC: HCA – host channel adapter
  – I/O: TCA – target channel adapter
• Switched interconnection
  – intra-subnet switches
  – inter-subnet routers

End System Organisation: Example 6.1 InfiniBand Architecture

• Single processor node architecture
  – single InfiniBand switch to I/O

[InfiniBand 1.2.1 2007]

End System Organisation: Example 6.1 InfiniBand Architecture

[InfiniBand 1.2.1 2007]

End System Organisation: Example 6.1 InfiniBand Architecture

• Protocol architecture [InfiniBand 1.2.1 2007]

End System Organisation: Example 6.1 RDMA Architecture

• RDMA: remote direct memory access [RFC 5040]
  – copy to/from remote memory without intermediate copies
• DDP: direct data placement [RFC 5041]
  – info to place incoming data directly into receive buffer

End System Organisation: Parallel Host–Network Interfaces

• Limited value in uniprocessors
  – protocols don’t parallelise well
• Useful for NUMA systems
  – e.g. hypercubes

Nonuniform Memory Multiprocessor–Network Interconnect E-II.4m

Message passing multiprocessors need sufficient network interfaces to allow data to flow between the network and processor memory without interfering with the multiprocessing applications.

[Figure: multiprocessor attached to the network]

End Systems and Network Interface: ES.4 Host–Network Interface Architecture

ES.1 End system components
ES.2 Protocol and OS software
ES.3 End system organisation
ES.4 Host–network interface architecture
  ES.4.1 Offloading of communication processing
  ES.4.2 Network interface design
  ES.4.3 High-speed encryption

Host–Network Interface: Offloading Functionality

• Determine which functionality to implement in NI
  – trend in 1980’s research to offload everything to hardware
  – but systemic analysis required
• Candidate processing to offload
  – best done between NI and memory
  – done efficiently in specialised hardware (esp. commodity)
  – places significant burden on host (e.g. per bit/byte)

Host–Network Interface Functional Partitioning and Assignment E-4C

Carefully determine what functionality should be implemented on the network interface rather than in end system software.

Host–Network Interface: Offloading Functionality

• Determine which functionality to implement in host
  – implementing in hardware may not increase performance
  – some processing should take place in host
    • ALF
    • part of ILP loop

Application Layer to Network Interface Synergy and Functional Division E-4C

Application and lower-layer data unit formats and control mechanisms should not interfere with one another, and the division of functionality between host software and the network interface should minimise this interference.

Host–Network Interface: Offloading TCP/IP Functionality

• Partial datapath offload to network interface
  – TCP segmentation offload / large send offload
    • large VMTUs (jumbograms) moved host ↔ network interface
  – TCP checksum offload
• TOE: TCP offload engines
  – datapath (partial) TOE
    • reduced copies or RDMA (remote DMA) for zero copy
    • only beneficial for long flows
  – full TOE
    • control and datapath
• Many emerging products: jury still out
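Checksum offload is attractive because the checksum touches every byte of the segment. A minimal sketch of the 16-bit one's-complement Internet checksum (the algorithm described in RFC 1071) shows the per-byte work involved. It covers only the data passed in; a real TCP checksum also covers a pseudo-header, which is omitted here.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):         # every byte is read once
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

segment = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(segment)))
```

Appending the computed checksum to the data and checksumming again yields zero, which is how a receiver verifies it. The loop above is exactly the kind of per-byte operation that the offloading slides list as a candidate for the network interface.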

Host–Network Interface: Functional Partitioning

• Functional partitioning
  – hardware
    • custom, ASIC, gate array
  – software
    • network processor, embedded controller

Network Interface Hardware Functional Partitioning and Assignment E-1Ch

Carefully determine what functionality should be implemented in network interface custom hardware, rather than on an embedded controller. Packet interarrival time, driven by packet size, is a critical determinant of this decision.

[Figure: throughput versus number of instruction cycles per packet, with curves for small and large packets and thresholds s_small and s_large]

Host–Network Interface: NP Instruction Budgets

Instruction budget per packet by packet size and line rate
(t = packet interarrival time; i = instructions available per packet at the given clock rate)

            1 B                    32 B                   128 B                  1 KB
rate        t      100MHz  1GHz    t      100MHz  1GHz    t      100MHz  1GHz    t      100MHz  1GHz
1 Mb/s      8μs    800     8000    250μs  25k     250k    1ms    100k    1M      8ms    800k    8M
10 Mb/s     800ns  80      800     25μs   2500    25k     100μs  10k     100k    800μs  80k     800k
100 Mb/s    80ns   8       80      2.5μs  250     2500    10μs   1000    10k     80μs   8000    80k
1 Gb/s      8ns    0       8       250ns  25      250     1μs    100     1000    8μs    800     8000
10 Gb/s     800ps  0       0       25ns   2       25      100ns  10      100     800ns  80      800

Cell shading in the original classifies each budget by the functionality it allows: significant, must be optimised, or infeasible.
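The entries in the instruction budget table follow from simple arithmetic: the packet interarrival time is the packet length in bits divided by the line rate, and the budget is that time multiplied by the processor clock. The sketch below assumes back-to-back packets and one instruction per clock cycle; the function name is hypothetical.

```python
def instruction_budget(packet_bytes: int, rate_bps: int, clock_hz: int):
    """Return (interarrival time in seconds, whole instructions per packet).

    Assumes back-to-back packets and one instruction per clock cycle.
    """
    bits = packet_bytes * 8
    interarrival_s = bits / rate_bps
    budget = bits * clock_hz // rate_bps     # integer arithmetic avoids rounding error
    return interarrival_s, budget

for size in (1, 32, 128, 1024):
    t, i = instruction_budget(size, 10**9, 10**9)
    print(f"{size:5d} B at 1 Gb/s, 1 GHz: t = {t * 1e9:.0f} ns, budget = {i} instructions")
```

The exact values differ slightly from the table, which rounds (for example, 32 B is 256 bits, shown as 250 in the table), but the orders of magnitude match.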

Host–Network Interface: Design Parameters

• Bandwidth
  – line rate / 8 determines required clock frequency for a byte-wide datapath
• Latency
  – latency budget needed by application
    • interactive ≈ 100 ms
    • real time process control significantly lower
  – fraction of end-to-end latency
    • LAN ≈ 10 μs for 1 km diameter
• Granularity
  – pipeline major cycle and buffer size
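The bandwidth and latency figures above can be checked with a short calculation. The sketch assumes a byte-wide datapath (hence the division by 8) and a signal velocity of 2 × 10^8 m/s, a common approximation for fibre or copper that is not stated in the slides.

```python
def byte_clock_hz(line_rate_bps: float) -> float:
    """Clock needed to move one byte per cycle at the given line rate."""
    return line_rate_bps / 8

def propagation_delay_s(distance_m: float, velocity_mps: float = 2e8) -> float:
    """One-way propagation delay; the default velocity is an assumed value."""
    return distance_m / velocity_mps

print(f"10 Gb/s line rate -> {byte_clock_hz(10e9) / 1e9:.2f} GHz byte clock")
print(f"1 km one way      -> {propagation_delay_s(1000) * 1e6:.0f} us")
print(f"1 km round trip   -> {2 * propagation_delay_s(1000) * 1e6:.0f} us")
```

Under these assumptions the round trip across a 1 km LAN comes to about 10 μs, the same order as the figure quoted in the list, and small next to a 100 ms interactive budget.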

Host–Network Interface: Network Interface Design

[Figure: network interface block diagram: a receive pipeline and a transmit pipeline between the network and the host interconnect, with control and CMM blocks]

Host–Network Interface: Network Interface Design

• Receive pipeline
• Transmit pipeline

[Figure: transmit pipeline between memory and network, with stages labelled rate/sched, shift/delay, header/trailer, checksum, byte order, encrypt, byte → serial, and line coding]

[Figure: receive pipeline between network and memory, with stages labelled line coding, serial → byte, decrypt, byte order, checksum, header decode, shift/delay, and error control]

Host–Network Interface: High-Speed Encryption

• Cipher types
  – stream: bit stream
  – block
• Encryption modes
  – ECB: electronic codebook – single block
  – CBC: cipher block chaining – parallelisation possible with mux
  – CFB: cipher feedback
  – OFB: output feedback
  – CTR: counter – fully parallelisable since blocks independent

Host–Network Interface: High-Speed Encryption

• Desirable characteristics
  – pipelinable: no feedback dependencies
    • loop unrolling for multiple encryption rounds
  – parallelisable: no interblock dependencies
    • CTR mode only needs block id
• Challenges
  – maintaining cryptographic synchronisation
    • out-of-band block-id for CTR mode

Critical Path Optimisation of Security Operations T-6Dc

Encryption and per packet authentication operations must be optimised for the critical path.
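The reason counter mode parallelises can be shown with a toy sketch: each block's keystream depends only on the key and the block id, so blocks can be processed in any order or concurrently. The keystream function below uses SHA-256 as a stand-in for a block cipher purely for illustration; it is an assumption for this example, not AES, and it omits the nonce a real CTR construction requires.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

BLOCK = 16  # bytes per block

def keystream_block(key: bytes, block_id: int) -> bytes:
    # Stand-in for E_k(counter): SHA-256(key || block_id), truncated. Not AES.
    return hashlib.sha256(key + block_id.to_bytes(8, "big")).digest()[:BLOCK]

def ctr_block(key: bytes, block_id: int, block: bytes) -> bytes:
    """Encrypt or decrypt one block; needs only the key and the block id."""
    ks = keystream_block(key, block_id)
    return bytes(p ^ k for p, k in zip(block, ks))

def ctr_crypt(key: bytes, data: bytes, workers: int = 4) -> bytes:
    """Process all blocks independently; the same call decrypts."""
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        out = pool.map(lambda item: ctr_block(key, item[0], item[1]), enumerate(blocks))
        return b"".join(out)

key = b"k" * 16
message = bytes(range(50))
ciphertext = ctr_crypt(key, message)
print(ctr_crypt(key, ciphertext) == message)
```

Because a receiver can decrypt block 2 without having seen blocks 0 and 1, the only state to keep synchronised is the block id, which matches the "out-of-band block-id" challenge noted above.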

Host–Network Interface: High-Speed Encryption

• Encryption functions f
  – n pipeline stage delays over b blocks (parallel speedup)

[Figure: b parallel pipelines, each with n stages f_j1 … f_jn, encrypting plaintext blocks p_i … p_i+b into ciphertext blocks c_i … c_i+b under key k]
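A rough timing model illustrates the benefit of pipelining and parallel lanes. It assumes every stage takes one unit of time, blocks are independent (as in CTR mode), and work divides evenly across lanes; the function names and example numbers are illustrative assumptions.

```python
import math

def unpipelined_time(total_blocks: int, n_stages: int) -> int:
    """One unit processes each block through all n stages before the next starts."""
    return total_blocks * n_stages

def pipelined_time(total_blocks: int, n_stages: int, lanes: int = 1) -> int:
    """n-stage pipelines in `lanes` parallel lanes: fill latency plus one block per stage time."""
    per_lane = math.ceil(total_blocks / lanes)
    return n_stages + per_lane - 1

blocks, stages = 1000, 10   # e.g. ten rounds unrolled into ten stages (assumed)
print("unpipelined:       ", unpipelined_time(blocks, stages))
print("pipelined, 1 lane: ", pipelined_time(blocks, stages))
print("pipelined, 4 lanes:", pipelined_time(blocks, stages, lanes=4))
```

In this model the pipeline latency of n stage delays is paid once, after which each lane delivers one block per stage time.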

Host–Network Interface: Example 6.2 Advanced Encryption Standard

• AES: advanced encryption standard [NIST FIPS-197]
  – replacement for DES for commercial/consumer encryption
• Rijndael algorithm chosen by competition
  – high-speed implementation was one criterion
• Designed for high-performance implementation
  – pipelinable sequence of rounds (internally pipelinable)
  – parallelisable in CTR mode

Host–Network Interface: Example 6.2 AES Encryption Round

• S: substitute bytes (table)
• shift rows (permute)
• Ж: mix columns (matrix ×)
• ⊕: add (xor) round key w

[Figure: one AES encryption round on a 16-byte state: sixteen S-boxes, shift rows, four mix columns units, and sixteen XORs with the round key w]

Host–Network Interface: Example 6.2 AES Encryption

• 128b blocks
• 128/192/256b key k, expanded to 1408/1664/1920b round key w; 128b/round
• 10/12/14 reversible encryption rounds
• fully pipelinable (no feedback)

[Figure: encryption of plaintext P into ciphertext C through rounds 1 to 10 using S, Ж, and ⊕; decryption from C back to P through the inverse operations (S⁻¹, Ж⁻¹) in reverse round order; key k is expanded into round keys w]
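The expanded key sizes quoted above follow from the round counts: AES uses one 128-bit round key per round plus one for the initial key addition. The sketch below reproduces the arithmetic; the round counts are those given in FIPS-197.

```python
# Rounds per key size, as specified in FIPS-197.
ROUNDS = {128: 10, 192: 12, 256: 14}
ROUND_KEY_BITS = 128  # one round key is the size of the 128-bit block

def expanded_key_bits(key_bits: int) -> int:
    """Total round key material: one 128-bit key per round plus the initial one."""
    return ROUND_KEY_BITS * (ROUNDS[key_bits] + 1)

for k in (128, 192, 256):
    print(f"{k}-bit key: {ROUNDS[k]} rounds, {expanded_key_bits(k)} bits of round key")
```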

Host–Network Boundary Blurring: Distributed Storage Area Networks

• System area network for CPU access to disk storage
  – using SAN network architectures and protocols
• Remote access to storage over long distance
  – LAN, MAN, WAN access over IP
    • iSCSI: Internet SCSI
    • FCIP: fibre channel over TCP/IP

Storage Area Networks: Example 6.i iSCSI Background

• Internet distributed storage
  – based on SCSI (small computer systems interface)
    • T10 reference
    • standard interface for storage devices
• iSCSI (Internet small computer systems interface)
  – [RFC 3347, 3720]
• Session layer protocol

Storage Area Networks: Example 6.i iSCSI Protocol Stack

[Figure: a SCSI initiator and a SCSI device, each running an iSCSI session over TCP. The application's I/O requests and responses become SCSI requests and responses to the I/O device; these ADUs are carried between the two iSCSI session layers by the iSCSI protocol with an added header (ADU + H).]

Storage Area Networks: Example 6.i iSCSI PDU Format Overview

• BHS [48 B]
  – basic header segment
• AHS (optional)
  – additional header segment
  – requests only
• Header digest (optional)
• Data segment
• Data digest

[Figure: iSCSI PDU layout, 32 bits wide: basic header segment (48 B); additional header segment (optional; variable number); header digest (optional); data segment (optional); data digest (optional)]
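The PDU layout above implies that a receiver can compute the total PDU length from two fields in the basic header segment. The sketch below is written from a reading of RFC 3720 and should be checked against it: it assumes byte 4 of the BHS holds the total AHS length in 4-byte words, bytes 5 to 7 hold the data segment length in bytes, the data segment is padded to a 4-byte boundary, and each negotiated digest is 4 bytes. The function names are hypothetical.

```python
BHS_LEN = 48     # basic header segment, bytes
DIGEST_LEN = 4   # assumed digest size when a digest is negotiated

def pad4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3

def bhs_lengths(bhs: bytes):
    """Return (AHS length in bytes, data segment length in bytes) from a 48-byte BHS."""
    if len(bhs) != BHS_LEN:
        raise ValueError("BHS must be exactly 48 bytes")
    ahs_bytes = bhs[4] * 4                         # field is in 4-byte words
    data_bytes = int.from_bytes(bhs[5:8], "big")   # 24-bit length field
    return ahs_bytes, data_bytes

def pdu_length(bhs: bytes, header_digest: bool = False, data_digest: bool = False) -> int:
    """Total PDU length on the wire under the assumptions listed above."""
    ahs_bytes, data_bytes = bhs_lengths(bhs)
    total = BHS_LEN + ahs_bytes + pad4(data_bytes)
    if header_digest:
        total += DIGEST_LEN
    if data_digest and data_bytes:                 # no data segment, no data digest
        total += DIGEST_LEN
    return total

header = bytearray(BHS_LEN)
header[4] = 1                                      # one 4-byte word of AHS
header[5:8] = (10).to_bytes(3, "big")              # 10 bytes of data
print(pdu_length(bytes(header), header_digest=True, data_digest=True))
```

Because the lengths sit at fixed offsets in a fixed-size header, this framing step is simple enough to be a candidate for the network interface rather than host software.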

Network Processors: Additional References

• Comer, Network Systems Design using Network Processors: Intel IXP 2xxx Version, http://ww.npbook.cs.purdue.edu
• Lekkas, Network Processors: Architectures, Protocols, and Platforms, McGraw-Hill
• Carlson, Intel Internet Exchange Architecture and Applications: A Practical Guide to IXP2XXX Network Processors, http://www.intel.com/intelpress/sum_ixa.htm

Network Processors: Acknowledgements

Some material in these foils comes from the textbook supplementary materials:

• Sterbenz & Touch, High-Speed Networking: A Systematic Approach to High-Bandwidth Low-Latency Communication, http://hsn-book.sterbenz.org