SDX: A Software-Defined Internet Exchange Point
Software Defined Networking
• Changing how we design & manage networks – Data centers, backbones, enterprises, …
• But, so far, mostly inside these networks – Network virtualization, traffic engineering, …
• Software-Defined Exchange (SDX):
– Fundamentally change interdomain traffic delivery – Starting at the boundaries between domains
The Interdomain Ecosystem is Evolving ...
Flatter and densely interconnected Internet*
*Labovitz et al., Internet Inter-Domain Traffic, SIGCOMM 2010
Interdomain Routing is Not Flexible Enough!
• Routing only on destination IP address blocks (No customization of routes by application or sender)
• Can only influence immediate neighbors (No ability to affect path selection remotely)
• Indirect control over packet forwarding (Indirect mechanisms to influence path selection)
• Enables only basic packet forwarding (Difficult to introduce new in-network services)
How to overcome BGP’s limitations?
Valuable Wide-Area Services
• Application-specific peering – Route video traffic one way, and non-video another
• Blocking denial-of-service traffic – Dropping unwanted traffic further upstream
• Server load balancing – Directing client requests to different data centers
• Steering through network functions – Transcoders, scrubbers, caches, crypto, …
• Inbound traffic engineering – Splitting incoming traffic over multiple peering links
Enter Software-Defined Networking
• Match packets on multiple header fields (not just destination IP address)
• Control entire networks with one program (not just immediate neighbors)
• Direct control over packet handling (not indirectly via routing protocol arcana)
• Perform a variety of actions on packets (beyond basic packet forwarding)
How to incrementally deploy SDN for Interdomain Routing?
Deploy SDN at Internet Exchanges
• Leverage: SDN deployment even at a single IXP can benefit tens to hundreds of providers – Without providers deploying new equipment!
• Innovation hotbed: Incentives to innovate, as IXPs are on the front line of peering disputes
• Growing in numbers: – 350-400 IXPs – ~100 new IXPs established in past few years
https://prefix.pch.net/applications/ixpdir/summary/growth/
“SDX: A Software-Defined Exchange Point” (SIGCOMM 2014)
Arpit Gupta, Nick Feamster, Laurent Vanbever, Muhammad Shahbaz, Sean Donovan, Brandon Schlinker, Scott Shenker, Russ Clark, Ethan Katz-Bassett
“An Industrial-scale Software Defined Internet Exchange Point” (NSDI 2016)
Arpit Gupta, Robert MacDavid, Rudiger Birkner, Marco Canini, Nick Feamster, Jennifer Rexford, Laurent Vanbever
Conventional IXPs
[Figure: routers of AS A, AS B, and AS C connect to the IXP switching fabric and hold BGP sessions with the route server]
SDX = SDN + IXP
[Figure: routers of AS A, AS B, and AS C connect to an SDN switch and hold BGP sessions with the SDX controller; switch and controller together form the SDX]
SDX Opens Up New Possibilities
• More flexible business relationships – Make peering decisions based on time of day, volume of traffic & nature of application
• More direct & flexible traffic control – Define fine-grained traffic engineering policies
• Better security – Prefer “more secure” routes – Automatically blackhole attack traffic
SDX Architecture
[Figure: SDX architecture showing the IXP fabric, central services, IXP controller, BGP relay, ARP relay, fabric manager, participant controller, ARP handler, BGP handler, RIBs, update handler, and policy compression library, exchanging BGP updates, ARP requests, and forwarding table entries]
Use Case: Prevent DDoS Attacks
[Figure: AS 1, AS 2, and AS 3 interconnected through SDX 1 and SDX 2, with an attacker sending traffic towards a victim]
AS1 can remotely block attack traffic at SDX(es)
SDX-based DDoS protection vs. Traditional Defenses/Blackholing
• Remote influence: Physical connectivity to SDX not required
• More specific: Drop rules based on multiple header fields (source address, destination address, port number, …)
• Coordinated: Drop rules can be coordinated across multiple IXPs
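The three properties above can be illustrated with a minimal Python sketch. It is not the SDX implementation: the `DropRule` class, the `install_everywhere` helper, and the example addresses are all hypothetical, chosen only to show a drop rule that matches several header fields and is installed at more than one exchange.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass(frozen=True)
class DropRule:
    """Hypothetical drop rule; a field left as None acts as a wildcard."""
    src: Optional[str] = None      # source prefix
    dst: Optional[str] = None      # destination prefix
    dstport: Optional[int] = None  # transport destination port

    def matches(self, pkt: dict) -> bool:
        if self.src and ip_address(pkt["src"]) not in ip_network(self.src):
            return False
        if self.dst and ip_address(pkt["dst"]) not in ip_network(self.dst):
            return False
        if self.dstport is not None and pkt["dstport"] != self.dstport:
            return False
        return True

def install_everywhere(rule, exchanges):
    """Coordinate the same drop rule across several exchanges."""
    for table in exchanges.values():
        table.append(rule)

def handle(pkt, table):
    """Drop the packet if any installed rule matches it."""
    return "drop" if any(r.matches(pkt) for r in table) else "forward"

# Assumed example: the victim's AS installs one rule at both exchanges.
exchanges = {"SDX 1": [], "SDX 2": []}
install_everywhere(
    DropRule(src="198.51.100.0/24", dst="203.0.113.10/32", dstport=53),
    exchanges,
)
attack = {"src": "198.51.100.7", "dst": "203.0.113.10", "dstport": 53}
benign = {"src": "192.0.2.9", "dst": "203.0.113.10", "dstport": 443}
```

With these rules, `handle(attack, exchanges["SDX 1"])` drops the packet while `handle(benign, exchanges["SDX 2"])` forwards it.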
Inbound Traffic Engineering
[Figure: the AS A and AS B routers send incoming data for 10.0.0.0/8 through the SDX (SDN switch and SDX controller) to the AS C routers on ports C1 and C2]
Incoming Traffic   Out Port   Using BGP   Using SDX
dstport = 80       C1         ?           match(dstport=80) → fwd(C1)
Enables fine-grained traffic engineering policies
SDX Design Scenarios
• Unoptimized – Data-plane policy in a single rule table
• SDX paper (SIGCOMM’14) – Encoding outbound neighbor in BGP next-hop – Single SDX rule table (OpenFlow 1.0)
• iSDX paper (NSDI’16) – Encoding BGP reachability in BGP next-hop – Multi-stage SDX rule table – Partitioning of forwarding equivalence computation
Building SDX is Challenging
• Programming abstractions – How do networks define SDX policies, and how are they combined together?
• Interoperation with BGP – How to provide flexibility w/o breaking global routing?
• Scalability – How to handle policies for hundreds of peers, half a million prefixes and matches on multiple header fields?
Directly Program the SDX Switch
[Figure: switching fabric with ports A1, B1, C1, and C2]
AS A & C directly program the SDX Switch:
match(dstport=80) → fwd(C1)
match(dstport=80) → drop
Conflicting Policies
match(dstport=80) → drop
match(dstport=80) → fwd(C1)
[Figure: a packet at the switching fabric (ports A1, B1, C1, C2) matches both rules: drop, or forward to C1?]
How to restrict each participant’s policy to the traffic it sends or receives?
Virtual Switch Abstraction
Each AS writes policies for its own virtual switch
[Figure: AS A, AS B, and AS C each see their own virtual switch on top of the shared switching fabric (ports A1, B1, C1, C2)]
match(dstport=80) → drop
match(dstport=80) → fwd(C1)
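One way to picture the virtual switch abstraction is that the exchange adds a match on each participant's own ports to every rule that participant writes, so a rule can never capture someone else's traffic. The Python sketch below assumes this reading. The helper names, the port map, the assignment of the rules to AS A and AS B, and the `bgp_default` fallback are all hypothetical.

```python
def isolate_outbound(participant, ports, rules):
    """Restrict a participant's rules to packets entering on its own ports."""
    own = frozenset(ports[participant])
    return [({**match, "inport": own}, action) for match, action in rules]

def matches(match, pkt):
    """Check every field of the match; 'inport' is a set of allowed ports."""
    for field, want in match.items():
        if field == "inport":
            if pkt["inport"] not in want:
                return False
        elif pkt.get(field) != want:
            return False
    return True

def lookup(rules, pkt, default="bgp_default"):
    """First matching rule wins; unmatched packets follow the default route."""
    for match, action in rules:
        if matches(match, pkt):
            return action
    return default

# Assumed port map and policies, for illustration only.
ports = {"A": ["A1"], "B": ["B1"], "C": ["C1", "C2"]}
pol_a = isolate_outbound("A", ports, [({"dstport": 80}, "fwd(C)")])
pol_b = isolate_outbound("B", ports, [({"dstport": 80}, "drop")])
table = pol_a + pol_b
```

Both participants match on `dstport=80`, yet the rules no longer conflict: each one applies only to packets that entered through its author's port.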
Combining Participant’s Policies
Policy(p) = PolA → PolC
PolA: match(dstport=80) → fwd(C)
PolC: match(dstport=80) → fwd(C1)
[Figure: packet p enters AS A’s virtual switch, is forwarded to AS C’s virtual switch, and leaves the switching fabric on port C1]
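A possible reading of Policy(p) = PolA → PolC is sequential composition: the packet produced by AS A's policy becomes the input of AS C's policy. The sketch below assumes that reading and models policies as plain Python functions. The `loc` field and the function names are hypothetical, and unmatched packets simply return `None` here rather than falling back to a default route.

```python
def pol_a(pkt):
    """AS A (outbound): send web traffic towards AS C's virtual switch."""
    if pkt.get("dstport") == 80:
        return {**pkt, "loc": "C"}
    return None  # no match in this simplified model

def pol_c(pkt):
    """AS C (inbound): deliver web traffic that reached its switch on port C1."""
    if pkt.get("loc") == "C" and pkt.get("dstport") == 80:
        return {**pkt, "loc": "C1"}
    return None

def sequential(*policies):
    """Feed each policy's output into the next; stop when one yields no packet."""
    def composed(pkt):
        for policy in policies:
            pkt = policy(pkt)
            if pkt is None:
                return None
        return pkt
    return composed

policy = sequential(pol_a, pol_c)
result = policy({"loc": "A1", "dstport": 80})
```

Here `result["loc"]` ends up as `"C1"`: AS A chose the neighbor, and AS C chose the port.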
SDX: Integration with BGP Interdomain Rules
The SDN policies of AS A can apply only to (destination) prefixes that a participant AS B has announced. In other words, SDN policies must comply with the routes announced by BGP, but may select different next hops for forwarding than the default BGP best-route selection.
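The rule can be sketched as a check made before a policy's next hop is honored. In this minimal Python sketch, the `announced` and `bgp_best` tables are assumed example data, and the function names are hypothetical rather than taken from the SDX code.

```python
from ipaddress import ip_address, ip_network

# Assumed view of the prefixes each neighbor announced to AS A via BGP.
announced = {
    "B": [ip_network("10.0.0.0/8")],
    "C": [ip_network("10.0.0.0/8"), ip_network("20.0.0.0/8")],
}
# Assumed BGP best route per prefix (the default forwarding behavior).
bgp_best = {ip_network("10.0.0.0/8"): "C", ip_network("20.0.0.0/8"): "C"}

def reachable_via(neighbor, dst):
    """True if the neighbor announced a prefix covering the destination."""
    return any(ip_address(dst) in p for p in announced.get(neighbor, []))

def default_next_hop(dst):
    """Longest-prefix match over the BGP best routes."""
    hits = [p for p in bgp_best if ip_address(dst) in p]
    return bgp_best[max(hits, key=lambda p: p.prefixlen)] if hits else None

def next_hop(dst, preferred):
    """Honor the SDN policy only when BGP says the preferred neighbor can reach dst."""
    return preferred if reachable_via(preferred, dst) else default_next_hop(dst)
```

For example, a policy preferring B is honored for 10.1.2.3, but for 20.1.2.3 (which B never announced) forwarding falls back to the BGP best route via C.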
SDX Rule Compositions
Rule Compositions: PA → PA′; PA′ + DefA → PA″
Final Combined Rule Compositions at SDX: SDX = (PA″ + PB″ + PC″) >> (PA″ + PB″ + PC″)
Scalability Challenges
• Reducing Data-Plane State: Support for all forwarding rules in (limited) switch memory (millions of flow rules possible)
• Reducing Control-Plane Computation: Faster policy compilation (initial policy compilation could take hours)
Reducing Data-Plane State: Observations
• Internet routing policies defined for groups of prefixes.*
• Edge routers can handle matches on hundreds of thousands of IP prefixes.
*Feamster et al.,Guidelines for Interdomain TE, CCR 2003
Reducing Data-Plane State: Solution
10/8 40/8 20/8
Group prefixes with similar forwarding behavior
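The grouping step can be sketched as bucketing prefixes by their complete forwarding behavior, so that one group (rather than each prefix) needs its own data-plane rules. The behavior labels below are assumed for illustration; the slide does not say which of the three prefixes share a group.

```python
from collections import defaultdict

def group_prefixes(behavior):
    """Group prefixes whose forwarding behavior is identical."""
    groups = defaultdict(list)
    for prefix, actions in behavior.items():
        groups[tuple(actions)].append(prefix)
    return sorted(sorted(g) for g in groups.values())

# Assumed behavior per prefix: 10/8 and 40/8 are treated alike, 20/8 differs.
behavior = {
    "10.0.0.0/8": ("web->C1", "default->B"),
    "40.0.0.0/8": ("web->C1", "default->B"),
    "20.0.0.0/8": ("default->C",),
}
groups = group_prefixes(behavior)
```

Three prefixes collapse into two groups, so the switch needs rules for two groups instead of three prefixes.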
SDX Controller
SDX Rule Computation, Composition and Compression
SDX Controller Implementation
“Naïve” or Existing Designs of SDX Do Not Scale
iSDX
• Partition policy & forwarding rules computations across IXP participants – partition the control-plane FEC computations across IXP participants – distribute forwarding rules and tags via four tables in the IXP fabric
• Decouple BGP and SDN Forwarding – encode BGP next-hop using “prefixes” in virtual MAC addresses – encode BGP reachability using bit-masks (hierarchically) in remainder of MAC addresses
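To make the encoding idea concrete, here is a simplified Python sketch that packs a next-hop identifier and a reachability bitmask into a 48-bit virtual MAC address. The field widths (16 and 32 bits), the flat (non-hierarchical) bitmask, and all function names are assumptions for illustration and do not reproduce the actual iSDX layout.

```python
NEXT_HOP_BITS = 16  # assumed width of the next-hop "prefix"
REACH_BITS = 32     # assumed width of the reachability bitmask

def encode_vmac(next_hop_id, reachable_ids):
    """Pack the BGP next hop and the set of reachable participants into 48 bits."""
    if not 0 <= next_hop_id < (1 << NEXT_HOP_BITS):
        raise ValueError("next-hop id does not fit")
    mask = 0
    for pid in reachable_ids:
        if not 0 <= pid < REACH_BITS:
            raise ValueError("participant id does not fit in the bitmask")
        mask |= 1 << pid
    return (next_hop_id << REACH_BITS) | mask

def fmt(vmac):
    """Render the 48-bit value in the usual colon-separated MAC notation."""
    raw = f"{vmac:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))

def can_forward_to(vmac, pid):
    """Single-bit masked match: did participant `pid` announce a route?"""
    return bool(vmac & (1 << pid))

def best_next_hop(vmac):
    """Recover the default BGP next hop from the upper bits."""
    return vmac >> REACH_BITS

vmac = encode_vmac(3, [1, 3, 5])
```

Because reachability travels in the packet's destination MAC, a participant's policy rule can match one bit under a mask instead of listing every prefix the target announced.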
iSDX: Partitioning Control-Plane Computations
iSDX: Distributing Forwarding Rules and Tags
iSDX: Reachability Encoding to Reduce Forwarding Table Sizes
iSDX Architecture
[Figure: iSDX architecture showing the IXP fabric, central services, IXP controller, BGP relay, ARP relay, fabric manager, participant controller, ARP handler, BGP handler, RIBs, update handler, and policy compression library, exchanging BGP updates, ARP requests, and forwarding table entries]
Reducing Data-Plane State: Solution
For hundreds of participants’ policies: a few million → < 35K flow rules
Reducing Control-Plane Computation
• Initial policy compilation time – Leveraged domain-specific knowledge of policies – Hundreds of participants require < 15 minutes
• Policy recompilation time – Leveraged bursty nature of BGP updates – Most recompilations after a BGP update take < 100 ms
Experimental Evaluation
• BGP RIBs & update traces from large EU IXP
• 511 IXP participants
• 96 million peering routes for 300K IP prefixes
• 25K BGP updates for 2-hour duration
We Can Do This at IXP-Scale!
[Figure: forwarding table entries (log scale, 10^3 to 10^9) versus number of participants (100 to 500) for the Unoptimized, MDS SDX-Central, iSDX, and Optimal designs]
BGP routes and updates for large EU IXP in a commodity hardware switch
“Naïve” or Existing Designs of SDX vs. iSDX
What’s Happening?
• Running code – GitHub code available from http://sdx.cs.princeton.edu – Used in Coursera course on SDN
• SDX testbeds – Transit Portal for “in the wild” experiments – Mininet for controller experiments
• Ongoing deployment efforts – Inter-agency exchange (NSA) – Large European IXP
What’s Next?
New Technologies at Each Part of the Control Loop
[Figure: control loop with four stages: Monitoring (more fine-grained and programmable, INT), Analytics (streaming capabilities, inference), Writing Rules and Actions (customizable actions), and Programmable Switches (fully programmable data planes)]
New Technologies on the Horizon
• Fully programmable data planes – Highly programmable, protocol-independent packet processing
• Better inference and decision-making, in real time – Scalable, high-speed, distributed stream processing
Fully-Programmable Data Planes: Protocol-Independent Packet Processing
[Figure: a compiler configures the target switch (parser, tables, and control flow) through the parser & table configuration; the SDN control plane populates the switch by installing and querying rules through a rule translator]
Switch Customization with P4
• Parser – Programmable parser: translate to state machine – Fixed parser: verify the description is consistent
• Control Flow – Target-independent: table graph of dependencies – Target-dependent: mapping to switch resources
• Actions – Specification of custom actions – Translation of actions into underlying source code
Compiling to Target Switches
• Software switches – Directly map the table graph to switch tables – Recompile switch with new parse/match/action
• Hardware switches with RAM and TCAM – RAM: hash table for tables with exact match – TCAM: for tables with wildcards in the match
• Switches with parallel tables – Analyze table graph for possible concurrency
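Two of the compilation decisions listed above can be sketched in a few lines of Python: choosing RAM or TCAM from a table's match kinds, and analyzing the table dependency graph to find tables that could run concurrently. The table names and the dependency graph are assumed examples, the graph is assumed to be acyclic, and this is not how any particular P4 compiler is implemented.

```python
def memory_for(match_kinds):
    """Exact-only tables fit a RAM hash table; wildcard matches need TCAM."""
    return "RAM" if all(kind == "exact" for kind in match_kinds) else "TCAM"

def parallel_stages(deps):
    """Group tables into stages; tables in one stage do not depend on each other."""
    stage = {}

    def depth(table):
        # A table runs one stage after its deepest dependency (stage 0 if none).
        if table not in stage:
            stage[table] = 1 + max((depth(d) for d in deps[table]), default=-1)
        return stage[table]

    for table in deps:
        depth(table)
    by_stage = {}
    for table, s in stage.items():
        by_stage.setdefault(s, []).append(table)
    return [sorted(by_stage[s]) for s in sorted(by_stage)]

# Assumed table graph: each table maps to the tables it depends on.
deps = {"l2": [], "acl": [], "ipv4": ["l2"], "nexthop": ["ipv4", "acl"]}
stages = parallel_stages(deps)
```

In this example `l2` and `acl` land in the same stage, so a target with parallel tables could evaluate them concurrently.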
In-Band Network Telemetry (INT)
• Network elements collect, report, and modify state in real time as data packets go through the switch.
• Writing state into packets – (switch, in, out) tuples – Latency – Link utilization
• Dynamic counters based on different hash buckets
• Dynamic actions based on counter thresholds – Dynamic rule creation – Reactive probing
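The idea of writing state into packets and reacting to thresholds can be sketched as follows. This is a toy Python model, not the INT header format: the `int` metadata list, the field names, and the 500-microsecond threshold are assumptions made for illustration.

```python
def int_hop(pkt, switch, in_port, out_port, latency_us, link_util):
    """Append this hop's state to the telemetry stack carried in the packet."""
    pkt.setdefault("int", []).append(
        {"hop": (switch, in_port, out_port),
         "latency_us": latency_us,
         "util": link_util}
    )
    return pkt

def slow_hops(pkt, threshold_us):
    """Switches whose recorded latency exceeds the threshold."""
    return [h["hop"][0] for h in pkt.get("int", [])
            if h["latency_us"] > threshold_us]

def react(pkt, threshold_us=500):
    """Dynamic action on a threshold: name the switches to probe."""
    return [("probe", switch) for switch in slow_hops(pkt, threshold_us)]

# Assumed two-hop path with one congested switch.
pkt = {"payload": "data"}
int_hop(pkt, "s1", 1, 2, latency_us=40, link_util=0.30)
int_hop(pkt, "s2", 3, 1, latency_us=900, link_util=0.95)
actions = react(pkt)
```

Each switch adds its own (switch, in, out) tuple as the packet passes, and the receiver can trigger reactive probing for `s2` from the packet alone.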
Better Monitoring #1: Congestion Localization
• Today, localizing the source of congestion is challenging, for a number of reasons – Difficult to measure from end hosts – ISPs have (mostly good) information, but do not want to divulge all of it (e.g., congestion to specific peers)
Ingress/Egress Latency With INT
• Could send a train of packets with accurate timing information affixed as those packets traverse the network.
Kim et al., “In-Band Network Telemetry via Programmable Data Planes”, ACM SIGCOMM (Demo Session) August 2015.
Better Monitoring #2: Security
• Example: DNS reflection attacks – Attacker spoofs source IP of DNS request to open resolver, often from many locations – (Larger) responses go to victim
• Deployment at IXP may provide useful choke point. Can monitor for: – Spikes in lookups from IPs, for domains – …remediation, rate limiting may be possible
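Monitoring for spikes in lookups could look roughly like the Python sketch below, which counts DNS queries per source IP in fixed time windows. In a reflection attack the spoofed source is the victim, so a spike points at the address being attacked. The window size, the threshold, and the record format `(timestamp, source_ip, query_name)` are assumptions for illustration.

```python
from collections import Counter

def spikes(queries, window_s, threshold):
    """Return (window_start, source_ip) pairs whose query count exceeds the threshold."""
    counts = Counter(
        (int(ts // window_s) * window_s, src) for ts, src, _qname in queries
    )
    return sorted(key for key, n in counts.items() if n > threshold)

# Assumed trace: 50 queries "from" the victim in 5 seconds, one ordinary query.
queries = [(t * 0.1, "203.0.113.10", "example.org") for t in range(50)]
queries.append((1.0, "192.0.2.5", "example.net"))
flagged = spikes(queries, window_s=10, threshold=20)
```

The flagged pairs could then feed a rate-limiting or drop rule at the exchange.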
What Else May Be Possible/Useful with INT at SDXes
• In-band traceroute/topology discovery
• Per-hop latency/loss/utilization recording
• Active probing based on counter thresholds
• Dynamic redirection of traffic flows
• …
Detecting Malicious ASes Through Rewiring
• Malicious ASes tend to change connectivity more aggressively than legit ASes
[Figure: AS connectivity graphs from two snapshots (2010.02.01 and 2010.04.01). ASes shown include AS5577 ROOT, AS50215 TROYAK, AS48172 OVERSUN-MERCURY, AS25478 IHOME, AS12383 PROFITLAN, AS8287 TABA, AS31366 SMALLSHOP, AS42229 MARIAM, AS50390 SMILA, AS44051 YA, and AS29632 NASSIST, each labeled as bulletproof, masking, or legitimate]
Snapshots taken 3 months apart
Konte et al. “ASwatch: A Reputation System to Expose Bulletproof Hosting ASes”, ACM SIGCOMM, August 2015.
BGP routing dynamics
• Malicious ASes’ routing dynamics are driven by illicit operations, e.g., short-lived announcements to perform malicious actions
• In contrast, legitimate ASes’ dynamics are driven by policy changes and traffic engineering decisions
[Figure: a malicious AS advertises 52.34.21.0/24, carries out malicious activity, then withdraws 52.34.21.0/24]
Fragmentation and churn of advertised prefixes
• Malicious ASes rotate their advertised prefixes, e.g., for evasion or to avoid blacklisting
• They advertise a large number of non-contiguous prefixes
[Figure: one malicious AS advertising 109.196.143/24, 62.123.14/24, 109.196.141/24, 85.124.23/16, 42.10.14/23, 131.124.14/23, and 92.112.112/23; another advertising 52.34.21/24 and 31.145.14/24]
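Fragmentation and churn could be turned into simple numeric features, as in the Python sketch below. The two definitions used here (number of contiguous blocks after merging, and the fraction of prefixes present in only one of two snapshots) are assumed simplifications, not the features defined in the ASwatch paper, and the prefixes are example values.

```python
from ipaddress import collapse_addresses, ip_network

def fragmentation(prefixes):
    """Number of contiguous blocks left after merging adjacent or overlapping prefixes."""
    networks = [ip_network(p) for p in prefixes]
    return len(list(collapse_addresses(networks)))

def churn(before, after):
    """Fraction of prefixes that appear in only one of two snapshots."""
    before, after = set(before), set(after)
    union = before | after
    return len(before ^ after) / len(union) if union else 0.0

# Assumed snapshots of one AS's advertised prefixes.
snapshot_1 = ["198.51.100.0/24", "203.0.113.0/24"]
snapshot_2 = ["203.0.113.0/24", "192.0.2.0/24"]
blocks = fragmentation(["10.0.0.0/24", "10.0.1.0/24", "192.0.2.0/24"])
rotation = churn(snapshot_1, snapshot_2)
```

An AS that advertises many small, scattered blocks scores high on `fragmentation`, and one that rotates its prefixes between snapshots scores high on `churn`.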
Idea: Real-Time Routing Decisions Based on Complex Inputs
[Figure: Kafka feeds inputs (BGP updates, NetFlow records, IPS alerts) to Storm spouts; Storm bolts run the FetchPath, BestPath, UpdatePath, and UpdateDP stages, grouped as input, local, and output; RIBs are kept in Cassandra]
Process 100K BGP update burst within 50 ms
SDX Platform
• Running code – GitHub code available from http://sdx.cs.princeton.edu – Used in Coursera course on SDN
• SDX testbeds – Transit Portal for “in the wild” experiments – Mininet for controller experiments
• Ongoing deployment efforts – Inter-agency exchange (NSA) – Large European IXP
Conclusion
• The Internet is changing – New challenges for content delivery – Increasing importance of IXPs
• SDN can speed innovation – New capabilities and abstractions
• Ongoing – Operational deployments
• Looking forward… – Protocol-independent switches – High-rate stream processing