The Need for Complex Analytics from Forwarding Pipelines
Tom Tofigh, AT&T
Nic Viljoen, Netronome
Bryan Sullivan, AT&T
Agenda
• Problem Statement
• Gaps in Real Time Observability
• Proposed SDN Based Observability
• Importance of Real-time Programmable Analytics
• Data Plane Programmability for Complex Analytics
• Programmable NIC Cards
• Summary
Problem Statement
• Require real time observability at data plane and control plane level
• Require programmable granular systems without the unscalable approach of metering all the data all the time
Looking for the Call Drop Reason!
Gaps: Dynamic & Real-Time Programmable Analytics
• Achieve autonomous control through programmable data plane analytics
• Real time dynamic instrumentation: virtual probes that gather trend data
• Targets specific flows, SOC/SmartNICs, VMs or containers for observation
• Enables instant root cause analysis
• Provide scalable solutions for fine grained observation
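A virtual probe that targets specific flows or VMs, rather than metering all traffic, could be sketched roughly as below. This is only an illustration: the `Packet` and `VirtualProbe` classes, their fields, and the sample traffic are hypothetical and do not come from the slides or from any real probe API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical, simplified packet summary."""
    src_ip: str
    dst_ip: str
    dst_port: int
    vm_id: str
    length: int


@dataclass
class VirtualProbe:
    """Hypothetical probe: counts only packets that match its target."""
    vm_id: Optional[str] = None      # None means "any VM"
    dst_port: Optional[int] = None   # None means "any port"
    packets: int = 0
    byte_count: int = 0

    def matches(self, pkt: Packet) -> bool:
        return ((self.vm_id is None or pkt.vm_id == self.vm_id) and
                (self.dst_port is None or pkt.dst_port == self.dst_port))

    def observe(self, pkt: Packet) -> None:
        # Only matching packets are metered; everything else is ignored.
        if self.matches(pkt):
            self.packets += 1
            self.byte_count += pkt.length


# Assumed example: watch port 5060 traffic from a single VM.
probe = VirtualProbe(vm_id="vm-7", dst_port=5060)
traffic = [
    Packet("10.0.0.7", "10.0.1.1", 5060, "vm-7", 400),
    Packet("10.0.0.7", "10.0.1.1", 443, "vm-7", 1500),
    Packet("10.0.0.9", "10.0.1.1", 5060, "vm-9", 380),
]
for pkt in traffic:
    probe.observe(pkt)
print(probe.packets, probe.byte_count)  # 1 400
```

Only one of the three packets matches the target, so the probe keeps state for a single flow instead of all traffic.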
Proposed Evolution for Dynamic Probing
[Figure: Autonomous Control System Concept; labels: Measure, Analyze]
Dynamic Probe & Measurement Examples
QoE
• Flow jitter, latency measurement
• Packet drop rate
• Application analysis
Security
• DDoS detection
• Deep packet inspection
• Stateful flow monitor
Customer Care
• Custom statistics
• Flow tracing
• Root cause analysis
Optimization
• Load estimation
• Traffic matrix calculation
• Elephant flow identification
[Figure labels: compile, disseminate, configure, collect, analyze, present; dynamic P4 query models; complex analytics]
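Some of the measurement examples listed for dynamic probes (flow jitter, packet drop rate, elephant flow identification) can be illustrated with a minimal sketch. The definitions here are assumptions chosen for simplicity: jitter is taken as the mean absolute difference between consecutive latency samples, and an elephant flow is any flow carrying at least a given share of total bytes. The slides do not specify either definition.

```python
from statistics import mean


def jitter(latencies_ms):
    """Mean absolute difference between consecutive latency samples (assumed definition)."""
    if len(latencies_ms) < 2:
        return 0.0
    return mean(abs(b - a) for a, b in zip(latencies_ms, latencies_ms[1:]))


def drop_rate(sent, received):
    """Fraction of sent packets that were not received."""
    return 0.0 if sent == 0 else (sent - received) / sent


def elephant_flows(flow_bytes, share=0.5):
    """Flows that each carry at least `share` of all bytes (the threshold is an assumption)."""
    total = sum(flow_bytes.values())
    if total == 0:
        return []
    return sorted(f for f, b in flow_bytes.items() if b / total >= share)


print(round(jitter([10.0, 12.0, 11.0, 15.0]), 2))             # 2.33
print(drop_rate(1000, 990))                                   # 0.01
print(elephant_flows({"a": 9000, "b": 500, "c": 500}))        # ['a']
```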
ACORD Observability @ L0 – L7
[Figure: architecture diagram. Labels: ROADM (Core); Spine Routers; Leaf-Spine Fabric; Leaf Routers; VMs; OVS; GPON (Access); PON OLT MACs; Measurement Abstraction Interface; Analytics Platform (XOS + Services); Apps; Customer Care; Security; Diagnosis; ONOS + XOS; SmartNIC; 2.8 Tbps]
The SmartNIC
Nic Viljoen, Netronome Systems
The Programmable SmartNIC
Challenges with Fixed-Function NICs
• Networking applications have diverse requirements
• Fixed-function ASICs have “baked-in” functionality and lack flexibility
Programmable NIC Advantages
• Develop custom networking applications
• High performance at network
• Preserve CPU cycles
  • CPU OVS @ 40 Gbps: 12 cores
  • Offload OVS @ 40 Gbps: 1 core
• Dynamic analytics
• High-level languages: P4/C
• Examples of SmartNICs: Netronome’s Agilio, Cavium LiquidIO
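The core counts quoted above translate into freed CPU capacity as in the short calculation below. The two OVS core counts are taken from the slide; the 24-core server size is a hypothetical value used only to make the comparison concrete.

```python
CORES_SOFTWARE_OVS = 12   # from the slide: CPU OVS at 40 Gbps
CORES_OFFLOADED_OVS = 1   # from the slide: offloaded OVS at 40 Gbps


def cores_freed(software=CORES_SOFTWARE_OVS, offloaded=CORES_OFFLOADED_OVS):
    """Cores returned to applications when OVS is offloaded to the NIC."""
    return software - offloaded


def share_left_for_apps(total_cores, ovs_cores):
    """Fraction of a server's cores that remain for VMs and applications."""
    return (total_cores - ovs_cores) / total_cores


TOTAL_CORES = 24  # hypothetical server size, not from the slides
print(cores_freed())                                          # 11
print(share_left_for_apps(TOTAL_CORES, CORES_SOFTWARE_OVS))   # 0.5
print(round(share_left_for_apps(TOTAL_CORES, CORES_OFFLOADED_OVS), 3))
```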
Programmable NIC Architecture
• “Sea of Workers” for customized networking workloads
• Support for P4 and Match/Action structures
• Optimized memory architecture
vProbe Application
• Interpret flow stats and features
• Aggregate info to controllers (more on next slide)
Flow Cache
• Keep state for more than a million flows
• Programmable state based on vProbe application requirements
• 25G/40G line rate
• Programmable payload size/number of flows tradeoff
• Self-learning
Augmenting Netronome’s Agilio OVS Software for Virtual Probing
[Figure: compute node diagram. Labels: VMs; vProbe Application; OVS Userspace Processes (ovsdb-server, ovs-vswitchd); Controller; Linux Kernel OVS Datapath (Kernel Flow Table, Fallback Path Actions); Agilio-CX Adapter OVS Datapath (Match Tables, Actions, Action Arguments, Exact Match Flow Cache, Tunnels, Deliver to Host, Update Statistics); Offload; First Packet of Flow; Remaining Packets of Flow; Flow Stats and Features; Packet Rx/Tx]
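The figure labels suggest that the first packet of a flow takes the fallback path while remaining packets hit an exact match flow cache that also updates statistics. A minimal software sketch of that idea follows. It is not Netronome's or Open vSwitch's implementation: the `FlowCache` class, the `slow_path` policy, and the flow key layout are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    action: str
    packets: int = 0
    byte_count: int = 0


class FlowCache:
    """Hypothetical exact-match cache keyed on a 5-tuple."""

    def __init__(self, slow_path):
        self._entries = {}
        self._slow_path = slow_path  # callable standing in for the fallback path
        self.misses = 0

    def process(self, key, length):
        entry = self._entries.get(key)
        if entry is None:
            # First packet of a flow: ask the fallback path, then cache the result.
            self.misses += 1
            entry = FlowEntry(action=self._slow_path(key))
            self._entries[key] = entry
        # Every packet, first or not, updates the per-flow statistics.
        entry.packets += 1
        entry.byte_count += length
        return entry.action

    def stats(self):
        """Per-flow (packets, bytes), the kind of data a probe could consume."""
        return {k: (e.packets, e.byte_count) for k, e in self._entries.items()}


def slow_path(key):
    # Assumed policy for illustration: drop telnet, deliver everything else.
    return "drop" if key[4] == 23 else "deliver_to_host"


cache = FlowCache(slow_path)
web = ("10.0.0.1", "10.0.0.2", "tcp", 40000, 80)  # (src, dst, proto, sport, dport)
for size in (100, 200, 300):
    cache.process(web, size)
print(cache.misses, cache.stats()[web])  # 1 (3, 600)
```

Three packets of the same flow cause a single miss; the other two are served from the cache.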
vProbe Application
• Flow-based data and stat aggregation using techniques such as machine learning
• Enables powerful use cases through use of flow analytics:
  • Dynamic configuration for DDoS at VM level using high speed clustering/classification algorithms (next slide)
  • Network shaping based on predictive flow characteristics. Work with University of Arizona has shown 50% improvement in offload utilisation
  • Elastic VM resource provisioning
• Filtering and grouping for analysis at various levels of visibility
  • Rack, Data Center, Metro, Regional, National
[Figure: four-step cycle: 1) Classify, 2) Aggregate, 3) Analyze, 4) React and Configure. Cycle required in < 12 s]
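The four-step cycle and its time budget can be sketched as a simple pipeline. Only the step names and the 12 second budget come from the slide; the record format, the byte thresholds, and the "rate-limit" action string are assumptions made for illustration.

```python
import time
from collections import defaultdict

CYCLE_BUDGET_S = 12.0  # from the slide: cycle required in < 12 s


def classify(records, heavy_bytes=1_000_000):
    """Step 1: label each flow record (the threshold is an assumption)."""
    return [(r, "heavy" if r["bytes"] >= heavy_bytes else "normal") for r in records]


def aggregate(labelled):
    """Step 2: sum bytes per (VM, label)."""
    totals = defaultdict(int)
    for record, label in labelled:
        totals[(record["vm"], label)] += record["bytes"]
    return dict(totals)


def analyze(totals, limit_bytes=5_000_000):
    """Step 3: pick VMs whose heavy-flow volume exceeds a hypothetical limit."""
    return sorted(vm for (vm, label), total in totals.items()
                  if label == "heavy" and total > limit_bytes)


def react_and_configure(vms):
    """Step 4: emit placeholder configuration actions."""
    return [f"rate-limit {vm}" for vm in vms]


def run_cycle(records, clock=time.monotonic):
    """Run all four steps and report whether the cycle met its budget."""
    start = clock()
    actions = react_and_configure(analyze(aggregate(classify(records))))
    within_budget = (clock() - start) < CYCLE_BUDGET_S
    return actions, within_budget


records = [
    {"vm": "vm-1", "bytes": 4_000_000},
    {"vm": "vm-1", "bytes": 3_000_000},
    {"vm": "vm-2", "bytes": 500_000},
]
print(run_cycle(records))  # (['rate-limit vm-1'], True)
```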
East/West DDoS Use Case
• E/W DDoS attacks are prevalent
• Use vProbe to quickly identify infected VMs and react by modifying flow rules or VMs
• Policy dictated by higher-level orchestrator
• Aggregated data can be disseminated to multiple orchestration levels
• Enables distributed response at server/rack/DC/regional levels
[Figure: OVS and vProbe instances performing per-VM egress clustering. Steps: 1) Classify, 2) Aggregate, 3) Analyze, 4) Configure. Reactions: drop traffic (targeted/all), reduce VM resources, shut down VM]
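Per-VM egress clustering could look roughly like the sketch below, which uses a one-dimensional two-means clustering of egress packet rates as a stand-in. The slides do not say which clustering algorithm is used, so this choice, the `ratio` threshold, the sample rates, and the reaction names are assumptions; only the idea of flagging VMs and choosing a reaction by policy comes from the slide.

```python
def two_means(values, iterations=20):
    """Cluster 1-D values into a low and a high group; return the two centres."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not high:          # all values identical: nothing to separate
            break
        new_lo, new_hi = sum(low) / len(low), sum(high) / len(high)
        if (new_lo, new_hi) == (lo, hi):
            break             # converged
        lo, hi = new_lo, new_hi
    return lo, hi


def suspect_vms(egress_pps, ratio=10.0):
    """VMs in the high cluster, if that cluster is far above the low one."""
    lo, hi = two_means(list(egress_pps.values()))
    if hi == lo or hi < ratio * lo:   # `ratio` is a hypothetical sensitivity knob
        return []
    return sorted(vm for vm, rate in egress_pps.items()
                  if abs(rate - hi) < abs(rate - lo))


def react(vms, policy="drop_targeted"):
    """The reaction is chosen by policy, e.g. from a higher-level orchestrator."""
    allowed = {"drop_targeted", "drop_all", "reduce_resources", "shutdown"}
    if policy not in allowed:
        raise ValueError(f"unknown policy: {policy}")
    return [(vm, policy) for vm in vms]


rates = {"vm-a": 120, "vm-b": 150, "vm-c": 90, "vm-d": 48000}  # packets/s, made up
print(react(suspect_vms(rates)))  # [('vm-d', 'drop_targeted')]
```

The `ratio` guard keeps ordinary variation between VMs from being flagged when no VM stands far apart from the rest.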
Observability: Intelligence at the Edge
• An intelligent network would benefit from programmable switches, NICs and CPUs
• NIC based offload is essential as CPU power is not scaling at the rate of network traffic increase
• AT&T’s John Donovan estimated our traffic has increased by 150,000% since 2007
• This means offload is essential to negate cost and maintain performance
• Flexible offload opens up potential analytics use cases that have previously not been tenable
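Read literally, a 150,000% increase corresponds to roughly a 1,501-fold rise in traffic. The calculation below makes that conversion explicit. The annualised figure depends on the number of years elapsed, which the slide does not state, so the value 9 is purely an assumption.

```python
def growth_multiplier(percent_increase):
    """Convert 'increased by X%' into a multiplier on the starting value."""
    return 1 + percent_increase / 100


def annual_growth_rate(multiplier, years):
    """Compound annual growth rate that yields `multiplier` after `years`."""
    return multiplier ** (1 / years) - 1


multiplier = growth_multiplier(150_000)
print(multiplier)  # 1501.0

ASSUMED_YEARS = 9  # hypothetical; the slide only says "since 2007"
print(f"{annual_growth_rate(multiplier, ASSUMED_YEARS):.0%} per year")
```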
Overview: What Do You Need to Find a Needle?
• OBSERVABILITY: the ability to statefully observe connections
• COMPUTABILITY: the ability to monitor and aggregate complex data in real time
• FLEXIBILITY: the ability to create a real time feedback loop using dynamic data plane and control functions
With Dynamic Programmable vProbe
Call to Action: We Need Your Use Cases!
• We are looking to gather a list of use cases for a dynamic analytics platform currently being developed
• Email: Tom Tofigh ([email protected]) or Nic Viljoen ([email protected]); email address with a k!
• Join us for the next series of POCs
Thank You!