![Page 1: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/1.jpg)
Application-level Communication Services in Edge Routers
Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom
www.cercs.gatech.edu/projects
W. Lee, K. Mackenzie, S. Pande, D. Schimmel and many other GT researchers
CERCS, Georgia Tech
Intel IXA Meeting, Sept. 2003
![Page 2: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/2.jpg)
Context: Interactive Information Grids: GT Teragrid
[Figure labels: IHPCL clusters; TeraStream server cluster machine; simulation; storage; Access Grid nodes; engineering clients; science clients; real-time visualization; mobile sensors; wireless clients (ipaqs, 802.11a/b/g); remote collaborators; ETF; National Lightrail; planned GT 10GB backbone]
Application Services: capture, transport, filter, transform, intrusion detection, …
Data staging, caching, …
Graphics/Visualization and Sensor Services
![Page 3: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/3.jpg)
Edge Routers for Terastream Services - Cluster Machines
[Figure labels: TeraStream server cluster machine; Terastream Engine (X, M, P, P); Infiniband; gigE; IXP; Runtime Layer; Extension Layer; Stream Management; Stream Manipulation]
Attached Network Processors
Examples:
• Stream scheduling for real-time response
• Data mirroring for 24/7 operation
![Page 4: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/4.jpg)
Edge Routers for Terastream Services - Wireless Clients
[Figure labels: Display Engines; Wireless Clients: ipaqs, 802.11a/b/g; IXA Edge Routers; Graphics/Visualization and Sensor Services]
Future wired-wireless edge routers - 4xx:
• data reduction
• scalable client-specific operation
• personalization
![Page 5: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/5.jpg)
Programmable Edge Routers
• Focus on Attached Network Processors (ANPs):
– Real-time collaboration, delivering camera- or sensor-captured data, enterprise services (e.g., OIS)
– Application-specific stream customization occurs at nodes in overlay networks mapped to suitable host/NP (ANP) pairs
• Host/ANP services address dynamically changing application needs and platform resources with application-specific stream customization:
– Data mirroring, selection, downsampling
– Selectively lossy data exchange and stream scheduling
– Scalable, client-specific functionality
– New services:
  • Intrusion detection
  • Remote graphics
  • `XML’ support
![Page 6: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/6.jpg)
Why `Push’ Application Services into Network Infrastructure?
Cost/Performance
– NPs have optimized hardware:
  • Efficient access to and movement of network packets
– Services can be implemented on packets’ fast path, using available headroom
  • existing work provides network-centric services: routing, network monitoring, intrusion detection, differentiated services, …
  • our research focuses on application-specific functionality
This talk: New Services:
– Remote graphics, `XML’
![Page 7: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/7.jpg)
Technical Approach
Stream Handlers
Use Stream Handlers – computational units which implement application-level services on NPs
Split execution
Split execution of application-level services across stream handlers on ANPs and host kernel- or host user-level, based on resource needs
Dynamic configuration
Dynamically create, configure, and deploy stream handlers
![Page 8: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/8.jpg)
`Split’ Architecture
[Figure labels: Receive; Transmit; Access; protocol plane; host (user, kernel); ANP; from network; to network]
• IXP-level receive- and transmit- blocks fragment/re-assemble application-level messages and execute application-specific functions
• Additional functionality is implemented via data accesses at IXP or host level
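The fragment/re-assemble step performed by the receive and transmit blocks can be illustrated with a minimal sketch. It is not the authors' IXP code: the fragment layout `(message_id, index, total, chunk)`, the `fragment` function, and the `Reassembler` class are all hypothetical names chosen for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical fragment layout: (message_id, index, total, chunk).
Fragment = Tuple[int, int, int, bytes]


def fragment(message_id: int, payload: bytes, mtu: int) -> List[Fragment]:
    """Split one application-level message into chunks of at most mtu bytes."""
    chunks = [payload[i:i + mtu] for i in range(0, len(payload), mtu)] or [b""]
    return [(message_id, i, len(chunks), c) for i, c in enumerate(chunks)]


class Reassembler:
    """Collect fragments and return the whole message once all have arrived."""

    def __init__(self) -> None:
        self._partial: Dict[int, Dict[int, bytes]] = {}

    def accept(self, frag: Fragment) -> Optional[bytes]:
        message_id, index, total, chunk = frag
        parts = self._partial.setdefault(message_id, {})
        parts[index] = chunk
        if len(parts) < total:
            return None  # still waiting for more fragments
        del self._partial[message_id]
        return b"".join(parts[i] for i in range(total))
```

An application-specific function would run on the value returned by `accept`, that is, on the full message rather than on individual packets.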
![Page 9: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/9.jpg)
IXP-level Stream Handlers
• Lightweight, composable, parameterizable computational units, executed by the NPs; can access information ‘beyond’ packet headers, i.e., message headers and payloads
• Implementation utilizes:
– Efficient protocol to assemble application-level data (RUDP) - Future: utilize NP-resident UDP/TCP stacks
– Self-describing portable data formats (PBIO) that define payload structure
• Stream handler execution can be linked with host-based kernel or user actions
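To make "composable, parameterizable" concrete, the following sketch models a stream handler as a function from a message to a message (or `None` to drop it). The dict standing in for a PBIO-described record, and the names `select`, `downsample`, and `compose`, are assumptions for illustration and not part of the presented system.

```python
from typing import Callable, Optional

# A handler maps a message to a (possibly modified) message, or None to drop it.
# A plain dict stands in for a self-describing record here.
Handler = Callable[[dict], Optional[dict]]


def select(field: str, minimum: float) -> Handler:
    """Parameterized handler: keep only messages whose field is >= minimum."""
    def handler(msg: dict) -> Optional[dict]:
        return msg if msg[field] >= minimum else None
    return handler


def downsample(field: str, step: int) -> Handler:
    """Parameterized handler: keep every step-th element of a list field."""
    def handler(msg: dict) -> Optional[dict]:
        return {**msg, field: msg[field][::step]}
    return handler


def compose(*handlers: Handler) -> Handler:
    """Chain handlers; a drop by any handler ends processing of the message."""
    def handler(msg: dict) -> Optional[dict]:
        for h in handlers:
            msg = h(msg)
            if msg is None:
                return None
        return msg
    return handler
```

For example, `compose(select("priority", 2), downsample("samples", 2))` yields a single handler that first filters on the message header and then thins the payload.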
![Page 10: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/10.jpg)
`Split’ Operation
• IXP-side:
– At protocol receive- or transmit-side, or in IXP memory
– Using limited IXP resources
• Host-side:
– At kernel- or user-level
– Necessary to support functionality of arbitrary complexity under varying conditions
• Compositions of handlers can implement more complex services
[Figure: data path from network to network, marking possible locations for stream handler execution. Labels: µEngines, IXP memory, kernel, application]
![Page 11: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/11.jpg)
Experimental Evaluation
Viability:
– Low overheads of stream handler implementation in terms of latency and bandwidth - previous work
New services:
– Efficient implementations of services such as client-customized multicast
Performance benefits:
– Performance benefits include offloading the host CPUs, and load reduction on the underlying network and memory infrastructure
![Page 12: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/12.jpg)
Performance Benefits/Viability: Improved Message Latencies
• IXP-based forwarding improves end-to-end latency:
• Comparable to host-level performance for smaller messages
• Improvements more profound as message sizes increase (i.e., consider remote visualization)

| data size | Host-side | IXP-side |
|---|---|---|
| 100B | 32us | 28us |
| 1kB | 83us | 82us |
| 1.5kB | 132us | 131us |
| 10kB | 896us | 840us |
| 50kB | 6.8ms | 4.2ms |
| 100kB | 15.4ms | 8.4ms |
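The trend in the latencies listed on this slide is easier to see as a host/IXP ratio. The short calculation below uses only the measured values from the slide, converted to microseconds; the names `LATENCIES` and `speedup` are illustrative.

```python
# (data size, host-side latency in us, IXP-side latency in us), from the slide.
LATENCIES = [
    ("100B", 32, 28),
    ("1kB", 83, 82),
    ("1.5kB", 132, 131),
    ("10kB", 896, 840),
    ("50kB", 6800, 4200),
    ("100kB", 15400, 8400),
]


def speedup(host_us: float, ixp_us: float) -> float:
    """Ratio of host-side to IXP-side latency; values above 1 favor the IXP."""
    return host_us / ixp_us


for size, host_us, ixp_us in LATENCIES:
    print(f"{size:>6}: host/IXP = {speedup(host_us, ixp_us):.2f}")
```

The ratio stays at or below about 1.14 for messages up to 10kB and rises to about 1.62 at 50kB and 1.83 at 100kB, which matches the two bullet points.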
![Page 13: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/13.jpg)
Performance Effects: Application-level Services
[Figure labels: mirroring; multicast customized based on destination]
Mirroring & destination-specific multicast more efficient on ANP, as part of the Rx/Tx code
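The difference between the two services can be sketched as follows. This is an illustrative model only: `mirror`, `custom_multicast`, and the per-destination customizer functions are hypothetical, and the real services run inside the IXP Rx/Tx code rather than in Python.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical per-destination transformation of a payload.
Customizer = Callable[[bytes], bytes]


def mirror(packet: bytes, destinations: List[str]) -> List[Tuple[str, bytes]]:
    """Mirroring: every destination receives an identical copy."""
    return [(dest, packet) for dest in destinations]


def custom_multicast(
    packet: bytes, customizers: Dict[str, Customizer]
) -> List[Tuple[str, bytes]]:
    """Destination-specific multicast: each copy is customized for its receiver."""
    return [(dest, fn(packet)) for dest, fn in customizers.items()]
```

A low-bandwidth client might be registered with a truncating customizer such as `lambda p: p[:4]`, while a wired client receives the payload unchanged.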
![Page 14: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/14.jpg)
Need for ‘Split’ Handlers: Complex Handlers and ‘Headroom’
intensive computation
• Complexity of ‘format’ increases with data size, available headroom is exceeded, and performance degrades
• Need for intermediate threads/processing
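One way to picture the need for split handlers is a placement rule that keeps handlers on the IXP only while their cumulative cost fits in the available headroom and hands the rest of the chain to the host. The greedy policy, the cycle counts, and the name `split_chain` below are assumptions for illustration; the slides do not specify the actual placement mechanism.

```python
from typing import List, Tuple

# Hypothetical handler description: (name, estimated cycles per message).
HandlerCost = Tuple[str, int]


def split_chain(
    chain: List[HandlerCost], headroom: int
) -> Tuple[List[HandlerCost], List[HandlerCost]]:
    """Return (ixp_part, host_part): the longest prefix that fits in headroom."""
    used = 0
    for i, (_name, cycles) in enumerate(chain):
        if used + cycles > headroom:
            return chain[:i], chain[i:]
        used += cycles
    return chain, []
```

With a chain of `select` (100 cycles), `downsample` (300), and `format` (2000) and a headroom of 500 cycles, the first two handlers stay on the IXP and `format` moves to the host.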
![Page 15: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/15.jpg)
New Services: Client-specific OpenGL Image Cropping on the IXP
• Can perform computationally intensive tasks like image cropping efficiently
• Performance Benefits: CPU load when performed at host: 99.95%
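As a rough illustration of what client-specific cropping involves, the sketch below extracts a rectangle from a row-major pixel buffer. The buffer layout, the parameter names, and the `crop` function are assumptions; the slides do not describe the image format used on the IXP.

```python
def crop(
    pixels: bytes, width: int, bytes_per_pixel: int,
    x: int, y: int, w: int, h: int,
) -> bytes:
    """Copy a w-by-h rectangle at (x, y) out of a row-major image buffer."""
    row_bytes = width * bytes_per_pixel
    out = bytearray()
    for row in range(y, y + h):
        start = row * row_bytes + x * bytes_per_pixel
        out += pixels[start:start + w * bytes_per_pixel]
    return bytes(out)
```

Each client would supply its own `(x, y, w, h)` rectangle, so only the region that client displays is forwarded.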
![Page 16: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/16.jpg)
`Split’ Handlers and Additional Resources: NIDS System Design
A layered and pipelined architecture:
– Maximize performance by assigning tasks to the most appropriate device:
  • StrongArm/Xscale: configuration, control, I/O
  • Microengines: sequential, repetitive packet processing
  • FPGA: massively concurrent processing
– Prototype system developed for 1 Gbps networks using IXP1200 and Xilinx Virtex FPGA
– Moving to IXP2400 and Virtex2 to support faster networks
![Page 17: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/17.jpg)
Conclusions
• `Split’ Architecture:
– Use headroom to implement middleware- and application-level services on fast path through NPs
– Benefit from network-near execution of stream handlers and flexible mapping across host-ANP
• Deliver new functionality and performance gains to applications while meeting network performance requirements
• Issue: `Vertical’ system programming
![Page 18: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/18.jpg)
Ongoing and Future Work
• Dynamic deployment of complex services across ANP-host boundaries.
• Focus on Enterprise Applications: dynamic XML-format interpretation and code generation.
• Admission control
• Request: host/NP proximity: beyond PCI
[Figure: System Architecture. Labels: Rx, SH, SH, SH, Tx; Control Mgt; Data Mgt; Control; Data; Data Buffers; resource state; ANP-HOST INTERFACE; HOST; ANP; Resource Monitor; Admission Control; Application/Middleware]
![Page 19: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/19.jpg)
Research Overview
• `Split’ Services: K. Mackenzie, K. Schwan, S. Yalamanchili
• NIDS System: D. Contis, D. Schimmel, W. Lee
• Efficient Host/ANP Intrusion Detection - W. Lee
• Automatic Register Allocation for Micro-engine Code - S. Pande
![Page 20: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/20.jpg)
Support Tools: GT IXP Driver
kenmac@cc, austen@cc, ganev@cc
• User interfaces: 2 so far (host side)
– faux “ethernet” interface (in-kernel)
– DEC “CLF” message system (user)
• “Hacker’s Driver” (host side)
– exposes all ENP2505 card resources to host kernel and/or user
• Msg-over-PCI protocol (host & uEngine)
• Extensible NI (uEngine)
• IXP2400 operational soon
[Figure labels: ENP2505; host]
![Page 21: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/21.jpg)
IXP Driver - Some Detail
• Currently supports:
– IXP1200 boards (Radisys ENP-2505)
– IXP2400 boards (Radisys ENP-2611)
• Exports hardware resources to host kernel/user space code:
– PCI bridge config/status registers
– IXP chip config/status registers
– IXP SDRAM
• Provides physically contiguous host SDRAM to user/kernel space code
• Integrates Intel’s pciDg driver on top
– Completed for IXP1200 boards
– In progress for IXP2400 boards
![Page 22: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/22.jpg)
Related Work
• Extensible network architectures
– SPINE, VCM, WUGS/DHP, ANTS, CANEs…
– IXP1200: Princeton Vera, Columbia Netbind, microACE, IXP as NIC…
• Composable computation
– microprotocols, CANs, Protocol Boosters…
• Stream customization
– publish/subscribe (Echo/Jecho, Gryphon…) and peer-to-peer (Chord, Pastry…)
![Page 23: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/23.jpg)
Dual-bank Register Constraint
• Dual-bank Constraint
• Only for ALU instructions
• Two source operands must come from different banks
• Why: fetch them in parallel to achieve 1 cycle latency for all ALU instructions
ALU[dest_op, source_op_a, +, source_op_b]
source_op_a in Bank A, source_op_b in Bank B
OR
source_op_a in Bank B, source_op_b in Bank A
[Figure labels: 64 A-Bank GPRs; 64 B-Bank GPRs; Thread 1, Thread 2, Thread 3, Thread 4]
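The constraint can be stated as a simple check over a bank assignment. In this sketch an ALU instruction is reduced to its pair of source registers, and `bank_of` maps a register name to `"A"` or `"B"`; both representations and the function name are illustrative assumptions.

```python
from typing import Dict, Iterable, Tuple


def satisfies_dual_bank(
    bank_of: Dict[str, str], alu_sources: Iterable[Tuple[str, str]]
) -> bool:
    """True if the two source operands of every ALU instruction sit in different banks."""
    return all(bank_of[a] != bank_of[b] for a, b in alu_sources)
```

For instance, with `r1` in Bank A and `r2` in Bank B, an instruction reading `(r1, r2)` passes, while one reading `(r1, r1)` cannot pass under any assignment.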
![Page 24: Application-level Communication Services in Edge Routers · Application-level Communication Services in Edge Routers Ada Gavrilovska, Karsten Schwan, Hailemelekot Seifu, Ola Nordstrom](https://reader031.vdocuments.us/reader031/viewer/2022021802/5b5ad1f27f8b9a905c8cb9dd/html5/thumbnails/24.jpg)
Our Approaches
Two observations
Breaking smaller cycles may break bigger cycles as well. Most odd-cycles are small.
Problem modeling
Build Register Conflict subGraph (RCG), then detect and break all odd-cycles on the RCG.
Algorithm Complexity
Brute-force algorithm takes exponential time. Based on our algorithm, in most cases, it is polynomial-time solvable.
Combine with Register Allocation
We propose 3 algorithms: Pre-RA, Post-RA, Combined, depending on the phase-ordering of our algorithm and the register allocation. Current results show Post-RA is best, but more potential improvements are possible for the Combined approach.
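The link between odd cycles and bank assignment is that a conflict graph can be split across two banks exactly when it is bipartite, that is, when it has no odd cycle. The sketch below covers only the detection half, using a standard breadth-first 2-coloring; it is not the authors' algorithm, it does not break cycles, and `build_rcg` and `assign_banks` are hypothetical names.

```python
from collections import deque
from typing import Dict, Iterable, Optional, Set, Tuple


def build_rcg(alu_sources: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Add an edge between the two source registers of each ALU instruction."""
    graph: Dict[str, Set[str]] = {}
    for a, b in alu_sources:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def assign_banks(graph: Dict[str, Set[str]]) -> Optional[Dict[str, str]]:
    """2-color the graph into banks A and B; return None if an odd cycle exists."""
    bank: Dict[str, str] = {}
    for start in graph:
        if start in bank:
            continue
        bank[start] = "A"
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in bank:
                    bank[neighbor] = "B" if bank[node] == "A" else "A"
                    queue.append(neighbor)
                elif bank[neighbor] == bank[node]:
                    return None  # two conflicting registers forced into one bank
    return bank
```

When `assign_banks` returns `None`, some odd cycle has to be broken before a valid assignment exists; how that is done is the subject of the Pre-RA, Post-RA, and Combined variants and is not shown here.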