Page 1: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Enabling Fast, Dynamic Network Processing with ClickOS

Joao Martins*, Mohamed Ahmed*, Costin Raiciu§, Roberto Bifulco*, Vladimir Olteanu§, Michio Honda*, Felipe Huici*

* NEC Labs Europe, Heidelberg, Germany

§ University Politehnica of Bucharest


Page 2: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

The Idealized Network

[Figure: four protocol stacks. Two contain all five layers (Physical, Datalink, Network, Transport, Application), one contains Physical, Datalink and Network, and one contains Physical and Datalink.]

Page 3: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

A Middlebox World


carrier-grade NAT

load balancer

DPI

QoE monitor

ad insertion

BRAS

session border controller

transcoder

WAN accelerator

DDoS protection

firewall

IDS

Page 4: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Hardware Middleboxes - Drawbacks

▐ Middleboxes are useful, but…
– Expensive
– Difficult to add new features, lock-in
– Difficult to manage
– Cannot be scaled with demand
– Cannot share a device among different tenants
– Hard for new players to enter market

▐ Clearly shifting middlebox processing to a software-based, multi-tenant platform would address these issues
– But can it be built using commodity hardware while still achieving high performance?

▐ ClickOS: tiny Xen-based virtual machine that runs Click


Page 5: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Click Runtime

▐ Modular architecture for network processing

▐ Based around the concept of “elements”

▐ Elements are connected in a configuration file

▐ A configuration is installed via a command line executable (e.g., click-install router.click)

▐ An element
– Can be configured with parameters (e.g., Queue::length)
– Can expose read and write variables available via sockets or the /proc system under Linux (e.g., Counter::reset, Counter::count)

▐ Compiled 262/300 elements

▐ Programmers can write new ones to extend the Click runtime
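
To make the element and handler ideas above concrete, here is a minimal Python sketch. It is not Click's C++ API: the classes `Element`, `Counter` and `Sink` and the methods `connect`, `read` and `write` are hypothetical. Only the handler names `count` and `reset` come from the slide.

```python
class Element:
    """Toy stand-in for a Click element: handles packets and forwards them."""

    def __init__(self):
        self.next = None            # downstream element, set by connect()
        self.read_handlers = {}     # handler name -> zero-argument callable
        self.write_handlers = {}    # handler name -> one-argument callable

    def connect(self, other):
        """Loosely mimics `a -> b`; returns `other` so calls can be chained."""
        self.next = other
        return other

    def push(self, packet):
        """Default behaviour: pass the packet downstream unchanged."""
        if self.next is not None:
            self.next.push(packet)

    def read(self, name):
        return self.read_handlers[name]()

    def write(self, name, value=""):
        self.write_handlers[name](value)


class Counter(Element):
    """Counts packets; exposes a `count` read handler and a `reset` write handler."""

    def __init__(self):
        super().__init__()
        self._count = 0
        self.read_handlers["count"] = lambda: self._count
        self.write_handlers["reset"] = self._reset

    def _reset(self, _value):
        self._count = 0

    def push(self, packet):
        self._count += 1
        super().push(packet)


class Sink(Element):
    """Terminal element that simply remembers what it received."""

    def __init__(self):
        super().__init__()
        self.received = []

    def push(self, packet):
        self.received.append(packet)


def run_demo():
    counter, sink = Counter(), Sink()
    counter.connect(sink)
    for pkt in (b"a", b"b", b"c"):
        counter.push(pkt)
    before = counter.read("count")      # value seen through the read handler
    counter.write("reset")              # write handler clears the counter
    return before, counter.read("count"), len(sink.received)
```

In this sketch the handlers are plain dictionaries of callables, which keeps the control interface separate from the packet path, roughly the split the slide describes between configuration parameters and read/write variables.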


Page 6: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

A simple (click-based) firewall example


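// Receive from and transmit to the netfront device with this MAC address.
in :: FromNetFront(DEVMAC 00:11:22:33:44:55, BURST 1024);
out :: ToNetFront(DEVMAC 00:11:22:33:44:55, BURST 1);

// Allow UDP from 10.0.0.1 to 10.1.0.1; drop everything else.
filter :: IPFilter(
    allow src host 10.0.0.1 && dst host 10.1.0.1 && udp,
    drop all);

// Check the IP header, which starts after the 14-byte Ethernet header.
in -> CheckIPHeader(14) -> filter;
filter[0] -> Print("allow") -> out;
filter[1] -> Print("drop") -> Discard();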
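
The rule list in the configuration above can be read as a first-match classifier. The Python sketch below models only that reading. The `Packet` class, the `RULES` list and the `classify` function are hypothetical names for illustration and say nothing about how Click implements IPFilter.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Packet:
    src: str      # source IP address
    dst: str      # destination IP address
    proto: str    # transport protocol name, e.g. "udp"


Rule = Tuple[str, Callable[[Packet], bool]]

# Same two rules as the configuration: one specific allow, then drop all.
RULES: List[Rule] = [
    ("allow", lambda p: p.src == "10.0.0.1" and p.dst == "10.1.0.1" and p.proto == "udp"),
    ("drop", lambda p: True),
]


def classify(packet: Packet, rules: List[Rule] = RULES) -> str:
    """Return the action of the first rule whose predicate matches."""
    for action, matches in rules:
        if matches(packet):
            return action
    return "drop"  # assumed default if no rule matches
```

For example, `classify(Packet("10.0.0.1", "10.1.0.1", "udp"))` yields `"allow"`, while the same addresses with `"tcp"` fall through to the catch-all rule.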

Page 7: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

What's ClickOS ?

[Figure: a regular Xen domU (paravirt; apps on a guest OS) next to a ClickOS domain (paravirt; Click on MiniOS).]

▐ Work consisted of:
– Build system to create ClickOS images (5 MB in size)
– Emulating a Click control plane over MiniOS/Xen
– Reducing boot times (roughly 30 milliseconds)
– Optimizations to the data plane (10 Gb/s for almost all pkt sizes)

Page 8: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Performance analysis

[Figure: packet path between the Driver Domain (or Dom 0) and the ClickOS Domain. Driver domain: NW driver, Linux/OVS bridge, vif, netback. ClickOS domain: netfront, Click (FromNetfront, ToNetfront). The two domains are linked by the Xen bus/store, an event channel, and the Xen ring API (data).]

Rates marked on the figure: 300* Kp/s, 350 Kp/s, 225 Kp/s

* maximum-sized packets

pkt size (bytes)    10Gb rate
64                  14.8 Mp/s
128                 8.4 Mp/s
256                 4.5 Mp/s
512                 2.3 Mp/s
1024                1.2 Mp/s
1500                810 Kp/s
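
The 10Gb rate column can be approximated with a short calculation. The sketch below assumes that each frame carries 20 extra bytes on the wire (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap) and that the listed size is the full frame. The function names are hypothetical. Under these assumptions the 64 to 1024 byte rows come out close to the slide's values, while the 1500-byte row comes out near 822 Kp/s rather than 810 Kp/s, which suggests the slide counted additional header bytes for that row.

```python
def line_rate_pps(frame_bytes, link_bps=10_000_000_000, overhead_bytes=20):
    """Packets per second for back-to-back frames on the link.

    overhead_bytes is the assumed per-frame wire overhead:
    preamble (7) + start-of-frame delimiter (1) + inter-frame gap (12).
    """
    bits_on_wire = (frame_bytes + overhead_bytes) * 8
    return link_bps / bits_on_wire


def rate_table(sizes=(64, 128, 256, 512, 1024, 1500)):
    """Map each packet size to its theoretical maximum packet rate."""
    return {size: line_rate_pps(size) for size in sizes}


if __name__ == "__main__":
    for size, pps in rate_table().items():
        print(f"{size:>5} bytes  {pps / 1e6:6.2f} Mp/s")
```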

Page 9: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Main issues

▐ Backend switches (bridge / Open vSwitch) are slow

▐ Copying pages between domains (grant copy) greatly affects packet I/O
– These are done in batches, but still expensive

▐ Packet metadata (skb or mbufs) allocations

▐ MiniOS netfront not as good as Linux
– 225 Kpps vs. 430 Kpps Tx
– only 8 Kpps Rx

Page 10: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Backend Switch

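[Figure: packet path with VALE as the backend switch. Driver Domain (or Dom 0): NW driver (netmap mode), VALE, port, netback. ClickOS Domain: netfront, Click (FromNetfront, ToNetfront). The two domains are linked by the Xen bus/store, an event channel, and the Xen ring API (data).]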

▐ Introduce VALE as the backend switch

– NIC switches to netmap-mode

▐ Slight modifications to the netback driver only

▐ Batch more I/O requests through multi-page rings

▐ Removed packet metadata manipulation

▐ 625 Kpps for 1500-byte packets (2.7x improvement) and 1.2 Mpps for 64-byte packets (4.2x improvement)

Page 11: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Background - Netmap


▐ Fast packet I/O framework

– 14.88 Mpps on 1 core at 900 MHz

▐ Available in FreeBSD 9+

– Also runs on Linux

▐ Minimal device driver modifications

– Critical resources (NIC registers, physical buffer addresses, and descriptors) not exposed to the user

– NIC works in special mode, bypassing the host stack

▐ Amortize syscalls cost by using large batches

▐ Preallocated packet buffers, and memory mapped to userspace

Netmap – a novel framework for fast packet I/O
http://info.iet.unipi.it/~luigi/netmap/
Luigi Rizzo
Università di Pisa
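
The batching point on this slide can be expressed as a simple cost model: if one system call covers a whole batch, its fixed cost is divided across the packets in that batch. The sketch below is illustrative arithmetic only. The function names are hypothetical, and any costs passed in are made-up inputs, not measurements from netmap.

```python
def per_packet_cost_ns(batch_size, syscall_ns, work_ns):
    """Average cost per packet when one system call covers `batch_size` packets.

    syscall_ns: fixed cost of one system call (hypothetical input).
    work_ns:    per-packet processing cost (hypothetical input).
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return syscall_ns / batch_size + work_ns


def packets_per_second(batch_size, syscall_ns, work_ns):
    """Packet rate implied by the average per-packet cost."""
    return 1e9 / per_packet_cost_ns(batch_size, syscall_ns, work_ns)
```

With an assumed 1000 ns system call and 50 ns of per-packet work, a batch of 1 costs 1050 ns per packet, while a batch of 256 costs about 54 ns per packet, so the fixed cost almost disappears.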

Page 12: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Background - VALE Software Switch


▐ High performance switch based on netmap API (18 Mpps between virtual ports, one CPU core)

▐ Packet processing is “modular”

– Default as learning bridge

– Modules are independent kernel modules

▐ Applications use the netmap API

VALE, a Virtual Local Ethernet
http://info.iet.unipi.it/~luigi/vale/
Luigi Rizzo, Giuseppe Lettieri
Università di Pisa

Page 13: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O

[Figure: Driver Domain (or Dom 0): NW driver, VALE, netback. ClickOS Domain: netfront, Click (FromNetfront, ToNetfront). The two domains are linked by the Xen bus/store, TX/RX event channels, and the Netmap API (data).]

▐ No longer need the extra copy between domains

▐ Netmap rings (in the VALE switch) are mapped all the way to the guest

▐ An I/O request doesn't require a response to be consumed by the guest

▐ Event channels are used to proxy netmap operations from/to guest and VALE

▐ Breaks other (non-MiniOS) guests :(

– But we have implemented a netmap-based Linux netfront driver

Page 14: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Initialization and Memory usage

[Figure: initialization between netback in the Driver Domain (attached to VALE) and netfront plus the app (netmap API) in Mini-OS, with a netmap buffers pool holding buf slot [0], buf slot [1], buf slot [2].]

Initialization

1. opens netmap device
2. registers a VALE port
3. ring/bufs pages granted
4. ring grant refs read from the xenstore; buffer refs read from the mapped ring slot

slots    KB (per ring)    # grants (per ring)
64       135              33
128      266              65
256      528              130
512      1056             259
1024     2117             516
2048     4231             1033

▐ Netmap buffers are contiguous pages in guest memory

▐ Buffers are 2k in size, each page fits 2 buffers

▐ Ring fits 1 page for 64 and 128 slots; (2+ for 256+ slots)
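
The grant counts in the table can be cross-checked against the bullets on this slide. The sketch below assumes one grant per page and uses the stated two buffers per page to compute the buffer pages for each ring size. Whatever remains of the grant count is attributed to the ring itself. The names `SLIDE_TABLE`, `split_grants` and `ring_pages_by_slots` are hypothetical.

```python
# (slots, KB per ring, grants per ring), copied from the slide's table.
SLIDE_TABLE = [
    (64, 135, 33),
    (128, 266, 65),
    (256, 528, 130),
    (512, 1056, 259),
    (1024, 2117, 516),
    (2048, 4231, 1033),
]

BUFFERS_PER_PAGE = 2  # from the slide: 2k buffers, two per page


def split_grants(slots, grants):
    """Split a ring's grant count into (buffer pages, remaining pages).

    Assumes one grant per page and one buffer per slot.
    """
    buffer_pages = slots // BUFFERS_PER_PAGE
    return buffer_pages, grants - buffer_pages


def ring_pages_by_slots(table=SLIDE_TABLE):
    """Pages left over for the ring itself, keyed by slot count."""
    return {slots: split_grants(slots, grants)[1] for slots, _kb, grants in table}
```

Under these assumptions the leftover is 1 page for 64 and 128 slots and 2 or more pages from 256 slots upward, which agrees with the last bullet on the slide.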

Page 15: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Optimizing Network I/O – Synchronization

[Figure: netback in Domain-0 (attached to VALE) and netfront plus the app in the guest (Mini-OS) share a ring of buf slots 0, 1, 2, shown once on each side (mapped). The guest signals "Packets to transmit" and the backend signals "Backend finished" over the TX event channel.]

▐ In a netmap application, operations are done in the sender's context

▐ Backend/Frontend private copy not included in the shared ring page(s)

▐ Event channels used for synchronization
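
The synchronization scheme described above can be illustrated with a small single-threaded simulation. This is a toy model, not the Xen or netmap API: `EventChannel`, `SharedRing`, `Backend` and `Guest` are hypothetical classes, and the notification handler runs synchronously inside `notify()`, which is a simplification of real event delivery.

```python
from collections import deque


class EventChannel:
    """Toy notification line: counts notifications and calls a bound handler."""

    def __init__(self):
        self.handler = None
        self.notifications = 0

    def bind(self, handler):
        self.handler = handler

    def notify(self):
        self.notifications += 1
        if self.handler is not None:
            self.handler()


class SharedRing:
    """Fixed set of slots visible to both sides (stands in for mapped pages)."""

    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.head = 0   # next slot the backend will consume
        self.tail = 0   # next slot the guest will fill

    def free_slots(self):
        return len(self.slots) - (self.tail - self.head)


class Backend:
    def __init__(self, ring, done_channel):
        self.ring = ring
        self.done_channel = done_channel
        self.forwarded = []

    def on_tx_event(self):
        # Drain every published slot, then report completion once.
        while self.ring.head < self.ring.tail:
            idx = self.ring.head % len(self.ring.slots)
            self.forwarded.append(self.ring.slots[idx])
            self.ring.head += 1
        self.done_channel.notify()


class Guest:
    def __init__(self, ring, tx_channel):
        self.ring = ring
        self.tx_channel = tx_channel
        self.completions = 0

    def on_backend_finished(self):
        self.completions += 1

    def send_batch(self, packets):
        """Fill as many slots as fit, then notify the backend once per batch."""
        pending = deque(packets)
        while pending:
            while pending and self.ring.free_slots() > 0:
                idx = self.ring.tail % len(self.ring.slots)
                self.ring.slots[idx] = pending.popleft()
                self.ring.tail += 1
            self.tx_channel.notify()


def run_demo(num_slots=4, num_packets=10):
    ring = SharedRing(num_slots)
    tx, done = EventChannel(), EventChannel()
    backend = Backend(ring, done)
    guest = Guest(ring, tx)
    tx.bind(backend.on_tx_event)
    done.bind(guest.on_backend_finished)
    guest.send_batch([f"pkt{i}" for i in range(num_packets)])
    return backend.forwarded, tx.notifications, guest.completions
```

The point of the model is that packet data only ever lives in the shared slots, while the event channels carry nothing but "packets to transmit" and "backend finished" signals, one per batch rather than one per packet.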

Page 16: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

EVALUATION

Page 17: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS Base Performance

[Figure: two panels, RX and TX]

Intel Xeon E1220 4-core 3.2GHz, 16GB RAM, dual-port Intel x520 10Gb/s NIC. One CPU core assigned to VM, the rest to dom0

Page 18: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Scaling out – Multiple NICs/VMs

Intel Xeon E1650 6-core 3.2GHz, 16GB RAM, dual-port Intel x520 10Gb/s NIC. 3 cores assigned to VMs, 3 cores for dom0

Page 19: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Linux Guest Performance

Page 20: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS (virtualized) Middlebox Performance

Page 21: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS Delay vs. Other Systems

Page 22: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Conclusions

▐ Presented ClickOS:
– Tiny (5 MB) Xen VM tailored for network processing
– Can be booted (on demand) in 30 milliseconds
– Can achieve 10 Gb/s throughput using only a single core
– Can run a varied range of middleboxes with high throughput

▐ Future work:
– Improving performance on NUMA systems
– High consolidation of ClickOS VMs (thousands)
– Service chaining

Page 23: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC
Page 24: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

MiniOS (pkt-gen) Performance

[Figure: two panels, RX and TX]

Page 25: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

Scaling Out – Multiple VMs TX

Page 26: XPDS13: Enabling Fast, Dynamic Network Processing with ClickOS - Joao Martins, NEC

ClickOS VM and middlebox Boot time

30 milliseconds

220 milliseconds

