New Directions in Enterprise Network Management
Aditya Akella
University of Wisconsin, Madison
MSR Networking Summit
June 2006
Enterprise Network Management
• Very broad topic…
  – Tuning performance and availability of network-attached services
  – Traffic sniffing for trouble-shooting
  – Monitoring utilization
  – Mapping network topology and resources, etc.
• Several tools (both commercial and free)
  – Tailored to enterprises of different sizes and requirements
Outline
• Enterprises desire specific management functionalities that current tools fundamentally cannot provide
  – Three examples
• Inability arises from how enterprises are designed and operated today (IP-based)
  – Decentralization and no control over routing
• Thoughts on enterprise network design principles
  – … Simplified management is a side-effect
So What’s Missing?
• Cumbersome or impossible to support
  – What-If analysis
  – Effective trouble-shooting
  – Fine-grained resource management
• Some tools may provide one of these
  – No tool provides all of them
1. What-If Analysis
• Decentralized config specification
  – Complex config/policy split across several devices/mechanisms
    • Firewalls, proxies, NATs, router ACLs, VLANs, port filtering
  – … And across different network layers
  – Hard to reason about cross-layer, cross-device interaction
• What will happen if I change X in my network?
  – Policy/control plane level
  – Reason about connectivity before installing changes
[Figure: candidate changes (new link/network upgrade; new policies for sales; alternate configuration) and the questions they raise: New config stable? Will bottleneck disappear? Will upgrade violate policy?]
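The "reason about connectivity before installing changes" idea can be sketched as a diff between what a current and a proposed policy allow. This is only an illustration: the rule format (ordered `(action, group, service)` tuples, first match wins, `"*"` as wildcard), the default-allow behaviour, and all names below are assumptions, not something the talk specifies.

```python
def decide(rules, group, service, default="allow"):
    """Return the action of the first rule matching (group, service)."""
    for action, g, s in rules:
        if g in ("*", group) and s in ("*", service):
            return action
    return default  # assumed default: all-to-all allowed


def reachable(rules, groups, services):
    """All (group, service) pairs the rule list allows."""
    return {(g, s) for g in groups for s in services
            if decide(rules, g, s) == "allow"}


def what_if(current, proposed, groups, services):
    """Return (newly_allowed, newly_blocked) pairs for a proposed change."""
    before = reachable(current, groups, services)
    after = reachable(proposed, groups, services)
    return after - before, before - after


# Hypothetical groups, services, and rules.
groups = ["sales", "guests"]
services = ["web", "payroll"]
current = [("deny", "guests", "payroll")]
proposed = current + [("deny", "sales", "payroll")]

newly_allowed, newly_blocked = what_if(current, proposed, groups, services)
```

Here the proposed rule set would cut off sales from payroll and change nothing else, which is visible before any device is touched.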
2. Trouble-Shooting
• What is the current “status” of my network?
  – Who is talking to who and how? Resource consumption?
  – Avoid overload; control plane trouble-shooting
• Information at arbitrary granularities
  – Users, machines, groups…
  – Ability to go back in time
  – Unexpected patterns of communication; protocol usage
[Figure: example queries: How many conns from sales? Who is using access link? How many connections from guests? Finance grp protocol usage last week?]
2. Trouble-Shooting
• Today…
  – SNMP for tracking resource consumption → coarse-grained
  – Monitoring key resources → application specific; not network-wide
  – Inference → rely on heuristics, error prone
  – Not fine-grained enough
• Distributed decision on whether to allow flows
  – Distributed and/or local to services and devices
  – By default all-to-all is allowed
    • Something is undesirable → local restrictions
    • Use appropriate mechanism (ACLs, port filters, firewalls, etc.)
  – Poll to figure out what’s going on, or infer
  – Hard to archive control-plane events
3. Resource Management
• Route around overloaded/failed switches and links
  – Connection latency
  – Availability
• Control levels of resource consumption
  – Prioritize applications or users
  – Restrict bandwidth consumption of “sales”
• Middle-boxes and proxies
  – Placed at network choke points
  – Ideally, deploy at diverse locations
  – Route different classes of flows via different middle-boxes
[Figure: per-group treatment of flows. Sales: virus-1 + image-filter + compression; Products: virus-2 + compression; Guests: restrict b/w]
3. Resource Management
• Limited or no support in enterprises today
  – SNMP-based/manual tuning, OSPF, load-balancing using DNS
• Lack of tight control over routing
  – Forwarding tables, hop-by-hop dst IP based routing → inflexible
    • Very little info used for routing
    • Additional info into forwarding tables → complexity; slow look-up
    • Aggregation → no control over flows or groups of flows
  – Need tighter, app flow-level control
    • Forwarding tables fundamentally insufficient
[Figure: A→B using HTTP; C→D using AIM via proxy; A→D using AIM via filter; …]
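A small sketch of why destination-only lookup cannot express the three example flows, while a per-flow lookup can. The table layout, the `Flow` type, and the `"direct"` label for the HTTP flow are hypothetical choices made for illustration only.

```python
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    src: str
    dst: str
    app: str


# Destination-based table: one entry per destination, so every flow
# to D has to share the same treatment.
dst_table = {"B": "direct", "D": "proxy"}

# Flow-level table: the key includes source and application.
flow_table = {
    Flow("A", "B", "HTTP"): "direct",
    Flow("C", "D", "AIM"): "proxy",
    Flow("A", "D", "AIM"): "filter",
}


def forward_by_dst(flow: Flow) -> Optional[str]:
    """Look up the treatment using only the destination."""
    return dst_table.get(flow.dst)


def forward_by_flow(flow: Flow) -> Optional[str]:
    """Look up the treatment using the full (src, dst, app) key."""
    return flow_table.get(flow)
```

With the destination-only table, `Flow("A", "D", "AIM")` and `Flow("C", "D", "AIM")` necessarily get the same answer; the flow-level table can send one via the filter and the other via the proxy.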
Desiderata
• Centralization:
  – Of config specification (who can access what and how)
  – Of enterprise-wide decision-making (should flow X be allowed)
  – What-if analysis of connectivity becomes trivial
    • (Offline) analysis of a central database of policies
  – Trouble-shooting and forensics are simple
    • Current set or complete log of accepted conn requests or active flows
[Figure: hosts A, B, C, and D. Should A→D be allowed?]
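As a rough sketch of the two centralization points on this slide, a single entity can both answer "should this flow be allowed?" and keep the log that later trouble-shooting queries run against. The `Controller` class, its default-deny policy set, the fallback to a `"guests"` group, and the integer timestamps are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Controller:
    policy: set        # allowed (group, service) pairs; default deny assumed
    group_of: dict     # host -> group
    log: list = field(default_factory=list)

    def request(self, time, src, service):
        """Decide one connection request and record the decision."""
        group = self.group_of.get(src, "guests")
        ok = (group, service) in self.policy
        self.log.append((time, src, group, service, ok))
        return ok

    def accepted_per_group(self, since=0):
        """Count accepted requests per group from time `since` onward."""
        return Counter(g for t, _, g, _, ok in self.log if ok and t >= since)


# Hypothetical hosts, groups, and services.
ctl = Controller(
    policy={("sales", "web"), ("finance", "payroll")},
    group_of={"A": "sales", "C": "finance"},
)
ctl.request(1, "A", "web")       # allowed
ctl.request(2, "A", "payroll")   # denied: sales may not reach payroll
ctl.request(3, "C", "payroll")   # allowed
ctl.request(4, "A", "web")       # allowed
```

A query such as "how many conns from sales?" then becomes `ctl.accepted_per_group()["sales"]`, and going back in time is a matter of filtering the same log by timestamp.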
Desiderata
• Tight control over routing:
  – Centrally pre-ordain the path of each flow
  – No more designing around choke-points
    • Easy to integrate arbitrary number/type of middle-boxes
  – Fine-grained resource control
  – Also aids trouble-shooting and what-if analysis
[Figure: hosts A, B, C, and D. Route A→D (AIM) through s1→p1→p2→s2; route A→D (HTTP) through s1→p1→s3→s2.]
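One way to picture "centrally pre-ordain the path of each flow" is a controller that stitches together shortest segments between required waypoints. The sketch below assumes a particular topology for the s1, p1, p2, s3, s2 nodes named in the figure (the link list is a guess, as is the idea that the controller uses breadth-first search); only the two resulting paths come from the slide.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search; returns a node list or None if unreachable."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in links.get(node, ()):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


def route_via(links, src, dst, waypoints):
    """Path from src to dst that visits the waypoints in the given order."""
    stops = [src, *waypoints, dst]
    path = [src]
    for a, b in zip(stops, stops[1:]):
        segment = shortest_path(links, a, b)
        if segment is None:
            return None
        path.extend(segment[1:])  # skip the node already on the path
    return path


# Assumed adjacency among the nodes named in the figure.
links = {
    "s1": ["p1"],
    "p1": ["s1", "p2", "s3"],
    "p2": ["p1", "s2"],
    "s3": ["p1", "s2"],
    "s2": ["p2", "s3"],
}

aim_path = route_via(links, "s1", "s2", ["p1", "p2"])
http_path = route_via(links, "s1", "s2", ["p1", "s3"])
```

Under these assumptions the two calls reproduce the AIM and HTTP routes from the figure; adding another middle-box to a class of flows is just another entry in its waypoint list.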
An Architectural View
• Take all configuration and decision-making out of switches, routers
  – Put all eggs in one basket
• Central entity tells switches how to forward packets
  – Wire a circuit for each new flow…
  – … Or hand out a source route
• Switches have no forwarding table
  – Dumb forwarding elements
  – Under the direct control of the central controller (via control channels)
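The source-route option can be sketched as a packet that carries its own list of output ports, so a switch needs no forwarding table at all. The `Switch` class, the dictionary packet format, and the port numbering are hypothetical; the talk does not describe an encoding.

```python
class Switch:
    """A dumb forwarding element: it only knows what each port connects to."""

    def __init__(self, name, ports):
        self.name = name
        self.ports = ports  # port number -> neighbouring Switch or host name

    def forward(self, packet):
        # Consume the next hop from the route carried in the packet itself.
        port = packet["route"].pop(0)
        return self.ports[port]


def deliver(first_switch, packet):
    """Follow the source route; return (final host, switches visited)."""
    hop = first_switch
    visited = []
    while isinstance(hop, Switch):
        visited.append(hop.name)
        hop = hop.forward(packet)
    return hop, visited


# Two switches in a line; the route [1, 1] was handed out centrally.
s2 = Switch("s2", {1: "D"})
s1 = Switch("s1", {1: s2, 2: "C"})
packet = {"route": [1, 1], "payload": "hello"}
host, visited = deliver(s1, packet)
```

All routing knowledge lives in whoever computed `[1, 1]`; the switches themselves hold only their wiring.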
Effect on Management
• Control-plane related management or monitoring is easy to do
  – How many connections per user?
  – Upgrades violate policy?
  – Who accessed service X?
  – Route different flows differently
  – React to failures/overload
• “Data-plane management” is harder to do
  – Bandwidth-related
  – E.g., restrictions on users; monitoring utilization
Data Plane Management
• Switches need to be slightly less dumb
  – Minimal management support to enable data plane management?
    • Counters per-flow?
    • Per-flow queuing?
    • Up-to-date link utilization?
    • Push vs pull based?
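To make the "counters per-flow" and "pull based" options concrete, here is a minimal sketch in which each switch keeps a byte counter per flow and a central entity polls them to get per-group utilization. It assumes every flow is counted at exactly one (ingress) switch so that bytes are not double-counted; the class and function names are invented for the example.

```python
from collections import defaultdict


class CountingSwitch:
    """A slightly less dumb switch: it keeps one byte counter per flow."""

    def __init__(self):
        self.bytes_per_flow = defaultdict(int)

    def on_packet(self, flow_id, size):
        self.bytes_per_flow[flow_id] += size

    def read_counters(self):
        # Pull model: the controller asks, the switch returns a snapshot.
        return dict(self.bytes_per_flow)


def usage_by_group(ingress_switches, group_of_flow):
    """Aggregate polled per-flow byte counts into per-group totals."""
    totals = defaultdict(int)
    for switch in ingress_switches:
        for flow_id, nbytes in switch.read_counters().items():
            totals[group_of_flow[flow_id]] += nbytes
    return dict(totals)


# Hypothetical flows and groups.
sw_a, sw_b = CountingSwitch(), CountingSwitch()
sw_a.on_packet("f1", 1500)
sw_a.on_packet("f1", 500)
sw_b.on_packet("f2", 1000)
sw_b.on_packet("f3", 200)
totals = usage_by_group([sw_a, sw_b], {"f1": "sales", "f2": "sales", "f3": "guests"})
```

A push-based variant would have the switch report when a counter crosses a threshold instead of waiting to be polled; which of the two is preferable is left open on the slide.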