new directions in enterprise network management aditya akella university of wisconsin, madison msr...

New Directions in Enterprise Network Management

Aditya Akella
University of Wisconsin, Madison

MSR Networking Summit
June 2006

Enterprise Network Management

• Very broad topic…
– Tuning performance and availability of network-attached services
– Traffic sniffing for trouble-shooting
– Monitoring utilization
– Mapping network topology and resources, etc.

• Several tools (both commercial and free)
– Tailored to enterprises of different sizes, requirements

Outline

• Enterprises desire specific management functionalities that current tools fundamentally cannot provide
– Three examples

• Inability arises from how enterprises are designed and operated today (IP-based)
– Decentralization and no control over routing

• Thoughts on enterprise network design principles
– … Simplified management is a side-effect

So What’s Missing?

• Cumbersome or impossible to support
– What-If analysis
– Effective trouble-shooting
– Fine-grained resource management

• Some tools may provide one of these
– No tool provides all of them

1. What-If Analysis

Decentralized config specification
– Complex config/policy split across several devices/mechanisms
  • Firewalls, proxies, NATs, router ACLs, VLANs, port filtering
– … And across different network layers
– Hard to reason about cross-layer, cross-device interaction

• What will happen if I change X in my network?
– Policy/control plane level
– Reason about connectivity before installing changes

[Figure callouts: New link/network upgrade; New policies for sales; Alternate configuration. Questions: New config stable? Will bottleneck disappear? Will upgrade violate policy?]

2. Trouble-Shooting

• What is the current “status” of my network?
– Who is talking to whom, and how? Resource consumption?
– Avoid overload; control plane trouble-shooting

• Information at arbitrary granularities
– Users, machines, groups…
– Ability to go back in time
– Unexpected patterns of communication; protocol usage

[Figure callouts: How many conns from sales? Who is using access link? How many connections from guests? Finance grp protocol usage last week?]
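Questions of this kind (connections per group, protocol usage over a past week) reduce to simple filters once connection records sit in one place. The sketch below is only an illustration under that assumption: the `ConnRecord` schema, the group names, and the sample records are all hypothetical and not taken from the talk.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ConnRecord:
    """One accepted connection (hypothetical schema, not from the talk)."""
    start: datetime
    user: str
    group: str      # e.g. "sales", "finance", "guests"
    dst: str
    protocol: str   # e.g. "HTTP", "AIM"


def conns_from_group(log, group):
    """How many connections originated from the given group?"""
    return sum(1 for r in log if r.group == group)


def protocol_usage(log, group, since, until):
    """Per-protocol connection counts for a group within [since, until)."""
    return Counter(
        r.protocol
        for r in log
        if r.group == group and since <= r.start < until
    )


# Made-up sample records, purely for illustration.
now = datetime(2006, 6, 15, 12, 0)
log = [
    ConnRecord(now - timedelta(days=1), "alice", "sales", "web1", "HTTP"),
    ConnRecord(now - timedelta(days=2), "bob", "finance", "db1", "SQL"),
    ConnRecord(now - timedelta(days=3), "bob", "finance", "web1", "HTTP"),
    ConnRecord(now - timedelta(days=20), "carol", "finance", "im1", "AIM"),
    ConnRecord(now - timedelta(hours=1), "guest7", "guests", "web1", "HTTP"),
]

print(conns_from_group(log, "sales"))
print(protocol_usage(log, "finance", now - timedelta(days=7), now))
```

Keeping timestamps on every record is what makes the "go back in time" queries possible; a tool that only polls current state cannot answer them.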

2. Trouble-Shooting

• Today…
– SNMP for tracking resource consumption → Coarse-grained
– Monitoring key resources → Application specific; not network-wide
– Inference → Relies on heuristics, error prone
– Not fine-grained enough

Distributed decision on whether to allow flows
– Distributed and/or local to services and devices
– By default all-to-all is allowed
  • Something is undesirable → local restrictions
  • Use appropriate mechanism (ACLs, port filters, firewalls, etc.)
– Poll to figure out what’s going on, or infer
– Hard to archive control-plane events

3. Resource Management

• Route around overloaded/failed switches and links
– Connection latency
– Availability

• Control levels of resource consumption
– Prioritize applications or users
– Restrict bandwidth consumption of “sales”

• Middle-boxes and proxies
– Placed at network choke points
– Ideally, deploy at diverse locations
– Route different classes of flows via different middleboxes

[Figure callouts: Sales: virus-1 + image-filter + compression; Products: virus-2 + compression; Guests: restrict b/w]

3. Resource Management

• Limited or no support in enterprises today
– SNMP-based/manual tuning, OSPF, load-balancing using DNS

Lack of tight control over routing
– Forwarding tables, hop-by-hop dst IP based routing → inflexible
  • Very little info used for routing
  • Additional info into forwarding tables → complexity; slow look-up
  • Aggregation → No control over flows or groups of flows
– Need tighter, app flow-level control
  • Forwarding tables fundamentally insufficient

[Figure callouts: A→B using HTTP; C→D using AIM via proxy; A→D using AIM via filter…]
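To make the limitation concrete, the sketch below contrasts a destination-only longest-prefix lookup with a table keyed on (source, destination, application). Everything in it is an assumption for illustration: the addresses, prefixes, port names, and the deny-by-default fallback are hypothetical, and only the three example flows come from the slide.

```python
import ipaddress

# Hypothetical addresses for the hosts named in the figure.
HOSTS = {"A": "10.0.1.10", "B": "10.0.2.20", "C": "10.0.3.30", "D": "10.0.4.40"}

# Destination-prefix forwarding table: (prefix, next hop). Illustrative only.
FIB = [
    (ipaddress.ip_network("10.0.2.0/24"), "port-to-B"),
    (ipaddress.ip_network("10.0.4.0/24"), "port-to-D"),
    (ipaddress.ip_network("0.0.0.0/0"), "default"),
]


def dst_lookup(dst_ip):
    """Longest-prefix match on the destination address only."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [(net, hop) for net, hop in FIB if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


# Flow-level table keyed on (src, dst, application), as in the three examples.
FLOW_TABLE = {
    ("A", "B", "HTTP"): "direct",
    ("C", "D", "AIM"): "via proxy",
    ("A", "D", "AIM"): "via filter",
}


def flow_lookup(src, dst, app):
    """Per-flow decision; unknown flows are denied (an assumption)."""
    return FLOW_TABLE.get((src, dst, app), "deny")


# Both AIM flows end at D, so a dst-only lookup cannot tell them apart...
print(dst_lookup(HOSTS["D"]))
# ...while the flow-level table treats them differently.
print(flow_lookup("C", "D", "AIM"), "|", flow_lookup("A", "D", "AIM"))
```

The destination-only table has a single answer for everything addressed to D, which is the sense in which aggregation leaves no control over individual flows.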

Desiderata

• Centralization:
– Of config specification (who can access what and how)
– Of enterprise-wide decision-making (should flow X be allowed)
– What-if analysis or connectivity becomes trivial
  • (Offline) Analysis of a central database of policies
– Troubleshooting and forensics is simple
  • Current set or complete log of accepted conn requests or active flows

[Figure: nodes A, C, D, B. Callout: Should A→D be allowed?]
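A minimal sketch of how a central policy database could answer "should this flow be allowed?" and support offline what-if analysis by comparing verdicts under the current and a proposed policy. The group assignments, the rule format, and the deny-by-default semantics are assumptions made for the example; the talk does not specify a policy language.

```python
from itertools import product

# Hypothetical mapping of hosts to groups.
GROUP_OF = {"A": "sales", "B": "products", "C": "guests", "D": "servers"}

# Central policy database: allowed (src_group, dst_group, app) triples.
# Anything not listed is denied (an assumption of this sketch).
POLICY = {
    ("sales", "servers", "HTTP"),
    ("sales", "servers", "AIM"),
    ("products", "servers", "HTTP"),
}


def allowed(policy, src, dst, app):
    """Central decision: should flow src -> dst using app be allowed?"""
    return (GROUP_OF[src], GROUP_OF[dst], app) in policy


def what_if(current, proposed, apps=("HTTP", "AIM")):
    """List every flow whose verdict changes if `proposed` replaces `current`."""
    changes = []
    hosts = sorted(GROUP_OF)
    for src, dst, app in product(hosts, hosts, apps):
        if src == dst:
            continue
        before = allowed(current, src, dst, app)
        after = allowed(proposed, src, dst, app)
        if before != after:
            changes.append((src, dst, app, before, after))
    return changes


# Proposed change: stop sales from using AIM, let guests reach servers over HTTP.
proposed = (POLICY - {("sales", "servers", "AIM")}) | {("guests", "servers", "HTTP")}

print(allowed(POLICY, "A", "D", "AIM"))
for change in what_if(POLICY, proposed):
    print(change)
```

Because the whole policy lives in one structure, the effect of a change can be enumerated before anything is installed, which is hard when rules are spread across firewalls, ACLs, and VLAN settings.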

Desiderata

• Tight control over routing:
– Centrally pre-ordain the path of each flow
– No more designing around choke-points
  • Easy to integrate arbitrary number/type of middle-boxes
– Fine-grained resource control
– Also aids trouble-shooting and what-if analysis

[Figure: nodes A, C, D, B. Callouts: Route A→D (AIM) through s1, p1, p2, s2; Route A→D (HTTP) through s1, p1, s3, s2]
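One way a central entity could pre-ordain such paths is to pin an ordered list of intermediate hops per application and stitch shortest segments between them. The sketch below assumes a hypothetical topology (the links are invented; only the node names and the two resulting paths mirror the slide) and uses plain breadth-first search, which is a stand-in rather than the method used in the talk.

```python
from collections import deque

# Hypothetical topology: s* are switches, p* are middleboxes/proxies.
TOPOLOGY = {
    "A": ["s1"],
    "s1": ["A", "p1"],
    "p1": ["s1", "s3", "p2"],
    "p2": ["p1", "s2"],
    "s3": ["p1", "s2"],
    "s2": ["p2", "s3", "D"],
    "D": ["s2"],
}


def shortest_path(graph, src, dst):
    """Breadth-first search; returns a hop-count-shortest path or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return None


def route_via(graph, src, dst, waypoints):
    """Stitch shortest segments src -> w1 -> ... -> wn -> dst."""
    stops = [src, *waypoints, dst]
    path = [src]
    for a, b in zip(stops, stops[1:]):
        segment = shortest_path(graph, a, b)
        if segment is None:
            return None
        path.extend(segment[1:])  # skip the repeated joint node
    return path


# Per-application pinned hops (illustrative).
PINNED = {"AIM": ["p1", "p2"], "HTTP": ["p1", "s3"]}

print(route_via(TOPOLOGY, "A", "D", PINNED["AIM"]))
print(route_via(TOPOLOGY, "A", "D", PINNED["HTTP"]))
```

Stitched segments can revisit a node in less regular topologies; a real controller would need to handle that, along with link capacity, which this sketch ignores.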

An Architectural View

• Take all configuration and decision-making out of switches, routers
– Put all eggs in one basket

• Central entity tells switches how to forward packets
– Wire a circuit for each new flow…
– … Or hand out a source route
→ Switches have no forwarding table
– Dumb forwarding elements
– Under the direct control of the central controller (via control channels)
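The source-route option can be sketched as follows: the controller hands the sender an explicit hop list, and each forwarding element only pops the next hop from the packet header. The `Controller`, `Packet`, and `deliver` names and the path table are hypothetical, and the per-hop behaviour is simulated in a single loop rather than across real devices.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    payload: str
    route: list  # remaining hops, handed out by the controller


class Controller:
    """Hypothetical central entity holding a per-flow path table."""

    def __init__(self, paths):
        self.paths = paths  # (src, dst, app) -> list of hops after src

    def source_route(self, src, dst, app):
        hops = self.paths.get((src, dst, app))
        if hops is None:
            raise PermissionError(f"flow {src}->{dst} ({app}) not allowed")
        return list(hops)  # copy, so packets never mutate the table


def next_hop(packet):
    """All a 'dumb' element does: pop the next hop from the header."""
    return packet.route.pop(0) if packet.route else None


def deliver(packet):
    """Simulate hop-by-hop forwarding; returns the hops visited in order."""
    visited = []
    hop = next_hop(packet)
    while hop is not None:
        visited.append(hop)
        hop = next_hop(packet)
    return visited


ctrl = Controller({("A", "D", "AIM"): ["s1", "p1", "p2", "s2", "D"]})
pkt = Packet("hello", ctrl.source_route("A", "D", "AIM"))
print(deliver(pkt))
```

No element consults a forwarding table here; the only state is in the controller and in the packet header. The circuit-per-flow alternative would instead install a small per-flow entry at each hop.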

Effect on Management

• Control-plane related management or monitoring easy to do
– How many connections per user?
– Upgrades violate policy?
– Who accessed service X?
– Route different flows differently
– React to failures/overload

• “Data-plane management” harder to do
– Bandwidth related
– E.g. restrictions on users; monitor utilization

Data Plane Management

• Switches need to be slightly less dumb
– Minimal management support to enable data plane management?
  • Counters per-flow?
  • Per-flow queuing?
  • Up-to-date link utilization?
  • Push vs pull based?
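As one possible reading of the "counters per-flow" and "pull" options, the sketch below gives each switch a byte counter per flow and lets the controller poll and aggregate by user group. The `Switch` class, the flow identifiers, and the traffic numbers are hypothetical; the talk leaves these design questions open.

```python
from collections import defaultdict


class Switch:
    """Forwarding element with minimal per-flow byte counters (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.byte_count = defaultdict(int)  # flow id -> bytes forwarded

    def forward(self, flow_id, nbytes):
        self.byte_count[flow_id] += nbytes

    def read_counters(self):
        """Pull interface: the controller polls for a snapshot."""
        return dict(self.byte_count)


def usage_per_group(ingress_switches, group_of_flow):
    """Controller-side aggregation of pulled counters by user group.

    Only ingress switches are polled, so a flow that crosses several
    switches is not counted more than once.
    """
    totals = defaultdict(int)
    for sw in ingress_switches:
        for flow_id, nbytes in sw.read_counters().items():
            totals[group_of_flow[flow_id]] += nbytes
    return dict(totals)


# Made-up traffic for illustration.
s1, s3 = Switch("s1"), Switch("s3")
s1.forward("f1", 1500)
s1.forward("f1", 500)
s1.forward("f2", 1000)
s3.forward("f3", 200)
group_of_flow = {"f1": "sales", "f2": "guests", "f3": "sales"}

print(usage_per_group([s1, s3], group_of_flow))
```

A push-based variant would have switches report counters on a timer or when a threshold is crossed, trading polling load on the controller for a little more logic in the switch.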
