andy gospodarek, broadcom simon horman, netronome tom

37
How NICs Work Today IETF 105, Montreal, Tuesday July 23, 2019 1 Tom Herbert, Intel Simon Horman, Netronome Andy Gospodarek, Broadcom

Upload: others

Post on 04-Feb-2022

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

How NICs Work Today

IETF 105, Montreal, Tuesday July 23, 20191

Tom Herbert, IntelSimon Horman, NetronomeAndy Gospodarek, Broadcom

Page 2: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Fundamentals

2

Page 3: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Terminology● Network Interface Card (NIC): Host’s interface to

physical network● Host Stack: Software stack that performs host side

processing of L2, L3, or L4 protocols● Kernel Stack: Host stack implemented in an OS kernel● Offload: Do something in NIC HW that could be done in

host SW stack● Acceleration: Offload for performance gains

3

Page 4: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Network Interface Cards

4

Page 5: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Evolution of Network Interface Cards● Fundamental support (1990s)

○ Transmit and receive packets○ Basic offloads (Ethernet Checksum Offload!)

● Data plane acceleration (early to mid 2000s)○ Optimization for multi-core CPUs○ Hardware data plane offload — mostly fixed function devices○ Tunneling, IPsec, QoS offloads

● Programmability (2010 onwards)○ FPGAs and NPUs with programmable data plane○ General purpose processor with programmable data and control planes

5

Page 6: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Offload: Motivation

6

● Free up CPU cycles for application● Specialized processing can be more efficient● Save host resources● Scaling performance (low latency/high throughput)● Power savings for some use cases

=> Reduced TCO (marketing slant!)

Page 7: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Less is More Principle● Protocol agnostic is better than protocol specific

○ Avoid protocol ossification○ New protocol support without needing completely new solutions

● Common open APIs are better than proprietary ones○ Avoid vendor lock in○ Differentiation by features, performance, implementation

● Programmability is (generally) good○ Be adaptable, don’t dictate to the user what they are allowed to do○ Aspiration: “write once, run anywhere” model across devices

7

Page 8: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Basic offloads

8

Page 9: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Offload Considerations● TX and RX ● Protocol agnostic versus protocol specific● Stateful versus stateless● Encapsulation● “Always on” versus “opportunistic”● IPv6 and IPv4● How to build protocols to be NIC offload friendly

9

Page 10: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Basic Offloads● Checksum offload● Segmentation offload● Multi-queue and packet steering

10

Page 11: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Checksum Offload

● TCP, UDP, GRE, etc...● NIC offload calculation over data● Checksum offload is ubiquitous● Encapsulation allows multiple checksums in same packet

11

Page 12: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

TX Checksum Offload● Device parses tranports and set checksum

○ Device parse packets and set TCP or UDP checksum

● Instruct device where to start and write checksum○ Init csum field, indicate start offset and offset to write csum○ Generic method

12

Page 13: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

RX Checksum Offload● Checksum unnecessary

○ Device parses packet and verifies UDP or TCP checksum

● Checksum complete○ Device return 1’s complement sum across words in the packet○ Generic method

13

Page 14: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Segmentation Offload● Stack operates more efficiently

on large packets● Combines with checksum offload to

minimize header processing and per packet overhead

14

Page 15: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Transmit Segmentation Offload● Split big packet into smaller one low in the stack● GSO, Generic Segmentation Offload: SW variants● LSO: HW variant

15

Page 16: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Receive Segmentation Offload

● Coalesce small packets into bigger ones low in stack ● Generic Receive Offload, GRO: SW variant● Large Receive Offload, LRO: HW variant● Difficult to make protocol agnostic!

16

Page 17: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Multi-Queue● Multiple queues exposed by NIC● Queues processed by CPUs● Queues can be accessed and processed in parallel,

technique for load balancing● Queues can also have different properties, e.g. priority● Avoid OOO packets, maintain flow to queue affinity

17

Page 18: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Transmit Queue Selection● XPS, Transmit packet steering

○ Send packets on queue associated with CPU or thread

● Driver selects queue○ Device driver operation○ ndo_select_queue in Linux○ Arbitrary properties (e.g. priority)

18

Page 19: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Receive Packet Steering

● Receive Packet Steering○ Steer to queue based on hash○ RPS is SW variant○ RSS, Receive Side Scaling, is HW

● Receive Flow Steering○ Flow to queue association○ RFS is SW variant○ aRFS, accelerated RFS, is HW variant

19

Page 20: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Data Plane in Hardware

20

Page 21: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Data Plane in Hardware● Fixed or minimally configurable pipeline

○ ASIC with TCAM tables used for configuring pipeline

● Programmable Pipeline○ Network Processing Unit/Network Flow Processor

■ Multi-threaded execution environment for data plane programs

○ FPGA■ Gate-Level Programmable

○ General Purpose Processor■ CPU Complex separate from host

Offload NIC

PHY0

PHY1

PCIeOffloaded Data Plane

PHY0

PHY1

ASIC, FPGA, NPU, or CPU

Page 22: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● Control plane stays in host software stack

● Offload data plane● Hardware Fallback/

Assist datapath in host software stack

● Host software stack implements features of offload data plane

Data Plane Acceleration

22

Offload NICPHY0

PHY1

Host

User Space

Applications

Kernel

Linux

Offloaded Data Plane

Control / DataplaneSoftware Datapath

PHY0

PHY1

PCIe

Page 23: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Data Plane Acceleration● Match/Action● Forwarding● QoS ● TLS and IPsec

23

Page 24: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● Match packet based on headers and metadata

○ e.g: input-device + 5-tuple

● Execute actions based on match○ Forward / Mirror○ Drop ○ Packet/metadata modification

● Stateful actions○ Policing○ Connection tracking

Host Linux

Offload NIC

Software DatapathMatch Tables Actions

Offload Data PlaneMatch Tables Actions

Match/Action

24

Page 25: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● L2 -> Ln● Between physical and logical

devices● HW datapath misses can fall

back to host● Optional tunnel encap/decap

○ VXLAN, GRE, Geneve, …

● And tagging: VLAN, MPLS

Forwarding

25

Host Linux

Offload NIC

Software DatapathMatch Tables Actions

Offload Data PlaneMatch Tables Actions

Tunnel

Tunnel

Page 26: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● Ingress○ No queue○ Police/Meter/Filter

● Egress○ Classifier selects priority○ Scheduler

■ Priority Scheduler, f.e. 802.1p■ Deficit round robin■ TSN■ Shaping: DCB, …

QoS

26

Host Linux

Offload NIC

Software DatapathClassifier Scheduler

Offload Data PlaneClassifier Scheduler

Egress QoS

Page 27: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

User Space

Host Linux

Offload SmartNIC

Fallback PathDevice

OffloadFast Path MQ

RED RED RED

Application

Kernel

ApplicationApplicationMQ + RED Offload

27

● Per-device RED in HW● May ECN mark or drop packets

Page 28: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

TLS Acceleration

Offload NIC

Host

User Space

Application

Kernel

Linux

FramingkTLS Module

Symmetric Crypto Offload

Encrypt/Decrypt

TLS HeaderRecord Payload

Ciphertext Auth Hash

Established TLS be passed to kTLS

TX Path:

● NIC driver marks packets for crypto offload based on packet socket

● NIC performs encrypt and TX

RX Path:

● NIC performs decrypt and auth● Notifies kTLS of queued data● kTLS skips decrypt of plaintext● Handle Out-Of-Order

TLS HeaderRecord Payload

Plaintext 0000 0000

Auth Hash

28

Page 29: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Crypto Offload

● HW: Encrypt/Decrypt/Integrity/LSO/Checksum● Kernel: Padding/Anti-replay/Counters/Security Policy DB ● User-Space: IKE

Full Offload

● HW: Replay/Encap/Decap/SPD/LSO/Checksum/LRO● Kernel: IP fragmentation/Counters/Configuration● User-Space: IKE

IPsec Acceleration

29

Page 30: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Programmability

30

Page 31: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Programmability● Facilitates rapid protocol development● Quickly fix bugs and security problems● Two main types used today:

○ FPGA/NPU

○ General Purpose Processors

● Emerging trend: What is niche today can be broad tomorrow○ IETF 104 “Forwarding Plane Realities”

31

Page 32: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● Control plane stays in host● Flexible offload data plane controlled

through kernel or user space● Data plane could be expressed by

P4, eBPF, NPL, or other native instruction set

● Dynamically programmed

Programmability with FPGA or NPU

32

User Programmable NICPHY0

PHY1

Host

User Space

Applications

Kernel

Linux

FPGA/NPU Data Plane

Control/Data PlaneSoftware Datapath?

PHY0

PHY1

PCIe

Page 33: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

General Purpose Processor

33

● Move host software stack down to the NIC● Dataplane offload to general purpose processor on NIC● Control plane offload

○ Useful in bare metal or multi-tenant deployments○ Network admin can control server networking

● No host resources consumed forwarding network traffic

Page 34: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

● Capable of running complete Operating System

● Forwarding functionality moved completely away from server cores down to NIC

Programmable NIC with General Purpose Processor

34

Programmable NICPHY0

PHY1

Host

User Space

Applications

Kernel

Linux

Software Datapath

Control/Data Plane

PHY0

PHY1

General Purpose Processor

Page 35: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Programmable NIC with General Purpose Processor

35

● Programmable NICs also have offload-capable devices

Programmable NIC

Host

User Space

Applications

Kernel

Linux

FPGA/NPU/ASIC

Offloaded Datapath

Control/Data Plane

PHY0

PHY1

Control/Data Plane

Software Datapath

General Purpose Processors

Page 36: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Conclusion and Futures● Networking trends

○ Insatiable need for more bandwidth and lower latency○ Deployment of forward looking IETF protocols

● NICs work with hosts to make this happen○ Offloads will be relevant for foreseeable future○ Programmability and flexibility spur innovation

36

Page 37: Andy Gospodarek, Broadcom Simon Horman, Netronome Tom

Thank You!

37