Traffic Measurement for Network Operations
Jennifer Rexford
IP Network Management and Performance
AT&T Labs – Research; Florham Park, NJ
Outline of Tutorial
• Introduction (1.5 hours)
• Measurement techniques (3 hours)
– General terminology
– SNMP and RMON
– Packet monitoring
– Flow measurement
– Data interpretation
• Network-wide models (1.5 hours)
– Path matrix (trajectory sampling, IP traceback)
– Traffic matrix (network tomography, MPLS MIBs)
– Demand matrix (joining flow and routing data)
Introduction: Outline
• Example challenges for network operators– Detect, diagnose, and fix
• Internet Protocol (IP) background– Protocols, addressing, and design goals
• Internet Service Provider networks– ISP architecture and routing protocols
• Responsibilities of network operators– Challenges, timescales, and key tasks
• Network state– Topology, configuration, and routing
Network Operations: Detecting the Problem
overload!
Detecting the problem
• High utilization or loss statistics for the link?
• High delay or low throughput for probe traffic?
• Complaint from an angry customer (via the phone network)?

“Don’t IP networks manage themselves?”
• Doesn’t TCP adapt automatically to network congestion?
• Don’t the routing protocols automatically reroute after a failure?
Network Operations: Excess Traffic
Multi-homed customer
Two large flows of traffic
New egress point for first flow
Network Operations: DoS Attack
Denial-of-Service attack
Web server at its knees…
Install packet filter
Web server back to life…
Network Operations: Link Failure
Link failure
New route overloads a link
Routing change alleviates congestion
Summary of the Examples
• How to detect that a link is congested?
– Periodic polling of link statistics
– Active probes measuring performance
– Customer complaints
• How to diagnose the reason for the congestion?
– Change in user behavior
– Denial-of-service attack
– Router/link failure or policy change
• How to fix the problem?
– Interdomain routing change
– Installation of packet filters
– Intradomain routing change
Network measurement plays a key role in each step!
IP Protocol Background
Characteristics of the Internet
• The Internet is– Decentralized (loose confederation of peers)– Self-configuring (no global registry of topology)– Stateless (limited information in the routers)– Connectionless (no fixed connection between hosts)
• These attributes contribute– To the success of Internet– To the rapid growth of the Internet– … and the difficulty of controlling the Internet!
IP Connectionless Paradigm
• No error detection or correction for packet data– Higher-level protocol can provide error checking
• Successive packets may not follow the same path– Not a problem as long as packets reach the destination
• Packets can be delivered out-of-order– Receiver can put packets back in order (if necessary)
• Packets may be lost or arbitrarily delayed– Sender can send the packets again (if desired)
• No network congestion control (beyond “drop”)– Sender can slow down in response to loss or delay
Layering in the IP Protocols
Internet Protocol
Transmission ControlProtocol (TCP)
User Datagram Protocol (UDP)
TelnetHTTP
SONET ATMEthernet
RTPDNSFTP
IP Suite: End Hosts vs. Routers
[Figure: end hosts run the full protocol stack (HTTP over TCP over IP over an Ethernet interface); routers run only IP over their link interfaces (Ethernet, SONET). The HTTP message travels inside a TCP segment, which is carried by IP packets hop by hop.]
Example: HTTP Delay
Stages of a download: browser cache check, DNS resolution, TCP open, first byte of response, last byte of response

Sources of variability of delay
• Browser cache hit/miss, need for cache revalidation
• DNS cache hit/miss, multiple DNS servers, errors
• Packet loss, high RTT, server accept queue
• RTT, busy server, CPU overhead (e.g., CGI script)
• Response size, receive buffer size, congestion
• … downloading embedded image(s) on the page
IP Addressing
• 32-bit number in dotted-quad notation (12.34.158.5)
• Divided into network & host portions (left and right)
• 12.34.158.0/23 is a 23-bit prefix with 2^9 = 512 addresses
00001100 00100010 10011110 00000101
Network (23 bits) Host (9 bits)
12 34 158 5
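The network/host split above can be checked with a short sketch using Python's standard-library ipaddress module (the specific address values are the slide's own example):

```python
import ipaddress

# The /23 prefix from the slide: 23 network bits, 9 host bits.
net = ipaddress.ip_network("12.34.158.0/23")
print(net.num_addresses)   # 512 addresses (2^9)
print(net.netmask)         # 255.255.254.0 (23 one-bits)
print(ipaddress.ip_address("12.34.158.5") in net)   # True
print(ipaddress.ip_address("12.34.157.5") in net)   # False: differs in a network bit
```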
Classless InterDomain Routing (CIDR)
• Prefixes are key to Internet scalability
– Address allocation by ARIN/RIPE/APNIC and by ISPs
– Routing protocols and packet forwarding based on prefixes
– Today, routing tables contain ~150,000 prefixes
• Forwarding based on the longest prefix match– Destination-based forwarding of IP packets– Forwarding table maps prefix to next-hop link(s)– Router identifies the longest matching prefix
4.0.0.0/8, 4.83.128.0/17, 12.0.0.0/8, 12.34.158.0/23, 126.255.103.0/24
12.34.158.5
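A minimal sketch of longest-prefix matching over the slide's prefix list (a linear scan for clarity; real routers use specialized data structures such as tries):

```python
import ipaddress

# The slide's example prefixes.
prefixes = [ipaddress.ip_network(p) for p in [
    "4.0.0.0/8", "4.83.128.0/17", "12.0.0.0/8",
    "12.34.158.0/23", "126.255.103.0/24",
]]

def longest_prefix_match(addr, prefixes):
    """Return the matching prefix with the most network bits, or None."""
    addr = ipaddress.ip_address(addr)
    matches = [p for p in prefixes if addr in p]
    return max(matches, key=lambda p: p.prefixlen) if matches else None

# 12.34.158.5 matches both 12.0.0.0/8 and 12.34.158.0/23;
# the longer /23 prefix wins.
print(longest_prefix_match("12.34.158.5", prefixes))  # 12.34.158.0/23
```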
IP Design Philosophy: Main Goals [Clark’88]
• Effective multiplexed utilization of existing networks– Packet switching, not circuit switching
• Continued communication despite network failures– Routers don’t store state about ongoing transfers– End hosts provide key communication services
• Support for multiple types of communication service– Multiple transport protocols (e.g., TCP and UDP)
• Accommodation of a variety of different networks– Simple, best-effort packet delivery service– Packets may be lost, corrupted, or delivered out of order
• Distributed management of network resources– Multiple institutions managing the network– Intradomain and interdomain routing protocols
Operator Philosophy: Tension With IP
• Accountability of network resources– But, routers don’t maintain state about transfers– But, measurement isn’t part of the infrastructure
• Reliability/predictability of services– But, IP doesn’t provide performance guarantees– But, equipment is not very reliable (no “five-9s”)
• Fine-grain control over the network– But, routers don’t do fine-grain resource allocation– But, network self-configures after failures
• End-to-end control over communication– But, end hosts adapt to congestion– But, traffic may traverse multiple domains
The Role of Traffic Measurement
• Operations (control)
– Generating reports for customers and internal groups
– Diagnosing performance and reliability problems
– Tuning the configuration of the network to the traffic
– Planning outlay of new equipment (routers, proxies, links)
• Science (discovery)– End-to-end characteristics of delay, throughput, and loss– Verification of models of TCP congestion control– Workload models capturing the behavior of Web users– Understanding self-similarity/multi-fractal traffic
• We focus on helping operators run the network, and assume we have access to the network infrastructure
Measurement Challenges for Operators
• Network-wide view – Crucial for evaluating control actions – Multiple kinds of data from multiple locations
• Large scale– Large number of high-speed links and routers– Large volume of measurement data
• Poor state-of-the-art– Working within existing protocols and products– Technology not designed with measurement in mind
• The “do no harm” principle– Don’t degrade router performance – Don’t require disabling key router features– Don’t overload the network with measurement data
ISP Background andNetwork Operations
ISP Background: Outline
• Autonomous Systems (ASes)– Definition of an Autonomous System– Peer, provider, and customer relationships
• Internet Service Provider architecture– Example backbone network– Logical view of a backbone– Architecture of a high-end router
• Routing protocols– Border Gateway Protocol (BGP)– Interior Gateway Protocols (IGPs)
Internet Architecture
• Divided into Autonomous Systems– Distinct regions of administrative control (~15,000)
– Set of routers and links managed by a single “institution”
– Service provider, company, university, …
• Hierarchy of Autonomous Systems– Large, tier-1 provider with a nationwide backbone
– Medium-sized regional provider with smaller backbone
– Small network run by a single company or university
• Interaction between Autonomous Systems– Internal topology is not shared between ASes
– … but, neighboring ASes interact to coordinate routing
What is an “Institution”?
• Not equivalent to an AS– Many institutions span multiple autonomous systems
– Some institutions do not have their own AS number
– Ownership of an AS may be hard to pinpoint (whois)
• Not equivalent to a block of IP addresses (prefix)– Many institutions have multiple (non-contiguous) prefixes
– Some institutions are a small part of a larger address block
– Ownership of a prefix may be hard to pinpoint (whois)
• Not equivalent to a domain name (att.com)– Some sites may be hosted by other institutions
– Some institutions have multiple domain names (att.net)
Connections Between ASes

[Figure: ISP 1, ISP 2, and ISP 3 connected through a NAP and private peering; a commercial customer and dial-in access attach via access and gateway routers; interdomain protocols run between ASes, intradomain protocols within each AS, carrying traffic toward the destinations]
Internet Service Provider Backbone
modem banks, business customers, web/e-mail servers
neighboring providers
Gateway routers
Backbone routers
Access routers
Inside a High-End Router
[Figure: a route processor and switching fabric interconnecting multiple line cards]
Components of a High-End Router
• Route processor– Implementation of the various routing protocols– Creation of forwarding table for the line cards– Command-line interface for network operators– Handling of packets directed to the Loopback address– Handling of “special packets” (IP options, expired TTL)
• Switching fabric– Forwarding of packet from input to output interface
• Line cards
– Link-layer protocol to convert to/from IP packets
– Packet handling (filtering, route look-up, buffering, rate limiting, ToS marking, link scheduling, …)
– Transfer of packet to/from the switching fabric
Interdomain Routing (Between ASes)
[Figure: seven Autonomous Systems; a client reaches a Web server along the AS path 6, 5, 4, 3, 2, 1]
Border Gateway Protocol (BGP)
• ASes exchange info about who they can reach– IP prefix: block of destination IP addresses– AS path: sequence of ASes along the path
• Policies configured by the network operator– Path selection: which of the paths to use?– Path export: which neighbors to tell?
1 2 3
12.34.158.5
“I can reach 12.34.158.0/23”
“I can reach 12.34.158.0/23 via AS 1”
Intradomain Routing: OSPF or IS-IS
• Shortest path routing based on link weights– Routers flood link-state information to each other– Routers compute the “next hop” to reach others
• Weights configured by the network operator
– Simple heuristics: link capacity or physical distance
– Traffic engineering: tuning link weights to the traffic
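Link-state protocols compute these routes with Dijkstra's algorithm over the configured weights. A minimal sketch, assuming a hypothetical four-router topology (not the slide's figure):

```python
import heapq

def dijkstra(graph, source):
    """Shortest paths by summed link weight, as OSPF/IS-IS compute them.
    graph: {router: {neighbor: weight}}. Returns distances and, for each
    destination, the first hop on the shortest path (the "next hop")."""
    dist = {source: 0}
    next_hop = {}
    pq = [(0, source, source)]          # (distance, router, first hop)
    while pq:
        d, u, hop = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                    # stale queue entry
        next_hop[u] = hop
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v, v if u == source else hop))
    return dist, next_hop

# Hypothetical toy topology with operator-assigned weights.
graph = {
    "A": {"B": 3, "C": 2},
    "B": {"A": 3, "C": 2, "D": 1},
    "C": {"A": 2, "B": 2, "D": 5},
    "D": {"B": 1, "C": 5},
}
dist, hops = dijkstra(graph, "A")
print(dist["D"], hops["D"])  # 4 B  (A-B-D, cost 3+1, beats A-C-D at 2+5)
```

Tuning a single weight (say, raising B–D from 1 to 5) is enough to shift A's traffic to D onto a different path, which is exactly the traffic-engineering lever described above.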
Asymmetric Routes: Hot-Potato Routing
Web request and TCP ACKs
Web response
client
server
Network Operations: Outline
• Operating a network
– Control loop, timescales, and practical challenges
• Operator tasks
– Reporting, troubleshooting, traffic engineering, provisioning, capacity planning, architecture
• Network model
– Network state and data sources
• Conclusions
Responsibilities of Network Operators
Operating a Network
• Control loop– Detect: note the symptoms– Diagnose: identify the illness– Fix: select and dispense the medicine
• Key ingredients
– Measurement of the traffic and the network status
– Analysis and modeling of the measurement data
– Modeling of the network control mechanism (“what if”)
• Time scales– Minutes to hours– Days to weeks– Months to years
Practical Challenges
• Increase in the scale of the network– Link speeds, # of routers/links, # of peering points– Large network has 100s of routers and 1000s of links
• Significant traffic fluctuations– Time-of-day changes and addition of new customers/peers– Special events (Olympics) and new applications (Napster)– Difficult to forecast traffic load before designing topology
• Market demand for stringent network performance– Service level agreements (SLAs), high-quality voice-over-IP
• Increase in network capability & feature complexity – New services (Quality of Service, Virtual Private Networks) – New routing protocols (MPLS, multicast)
Network Operations Tasks
• Reporting of network-wide statistics– Generating basic information about usage and reliability
• Performance/reliability troubleshooting – Detecting and diagnosing anomalous events
• Traffic engineering– Adjusting network configuration to the prevailing traffic
• Capacity planning– Deciding where and when to install new equipment
• Provisioning of existing network– Process of adding new customers/peers, routers/links, etc.
• Selecting and testing new network architectures– MPLS routing, multicast, monitoring, quality-of-service, ...
Basic Reporting
• Producing basic statistics about the network– For business purposes, network planning, ad hoc studies
• Examples– Proportion of transit vs. customer-customer traffic– Total volume of traffic sent to/from each private peer– Mixture of traffic by application (Web, Napster, etc.)– Mixture of traffic to/from individual customers– Usage, loss, and reliability trends for each link
• Requirements
– Network-wide view of basic traffic and reliability statistics
– Ability to “slice and dice” measurements in different ways (e.g., by application, by customer, by peer, by link type)
Topology and Link Utilization
Utilization: link color (high to low)
Troubleshooting
• Detecting and diagnosing problems– Recognizing and explaining anomalous events
• Examples– Why a backbone link is suddenly overloaded– Why the route to a destination prefix is flapping– Why DNS queries are failing with high probability– Why a route processor has high CPU utilization– Why a customer cannot reach certain Web sites
• Requirements– Network-wide view of many protocols and systems– Diverse measurements at different protocol levels– Thresholds for isolating significant phenomena
Traffic Flow Through Backbone
Color/size of node: proportional to traffic to this router (high to low)
Color/size of link: proportional to traffic carried (high to low)
Peering point
Traffic Engineering
• Adjusting resource allocation policies– Path selection, buffer management, and link scheduling
• Examples
– Changing IGP weights to divert traffic from congested links
– Changing BGP policies to balance load on peering links
– Changing RED parameters to improve TCP throughput
– Changing WFQ weights to reduce delay for “gold” traffic
• Requirements
– Network-wide view of the traffic carried in the backbone
– Timely view of the network topology and configuration
– Accurate models to predict the impact of control operations (e.g., the impact of RED parameters on TCP throughput)
BGP Policy Change
Multi-homed customer
Two large flows of traffic
New egress point for the flow
Capacity Planning
• Deciding whether to buy/install new equipment– What? Where? When?
• Examples– Where to put the next backbone router– When to upgrade a peering link to higher capacity– Whether to add/remove a particular private peer– Whether the network can accommodate a new customer– Whether to install a caching proxy for cable modems
• Requirements
– Projections of future traffic patterns from measurements
– Cost estimates for buying/deploying the new equipment
– Model of the potential impact of the change (e.g., latency reduction and bandwidth savings from a caching proxy)
Network State
Network State: Not Just Traffic Measurement
• Topology– Routers and links, and their connectivity and capacity
– BGP sessions with neighbors and within the backbone
• Configuration– Path selection (e.g., OSPF weights, BGP policies)
– Link scheduling (e.g., FIFO or WFQ weights)
– Buffer management (e.g., drop-tail or RED parameters)
– Packet filters (e.g., ingress filters to prevent DoS)
• Interdomain routing– Reachability to neighboring domains (e.g., BGP updates)
Necessary for a network-wide view for the operator
Network State: Data Sources
• Router configuration files
– Router name, OS version, IP address, running processes
– Individual interfaces and their location in the router
– Set of commands applied against the router
• Polling/trapping of SNMP data– Up/down status of individual links, sessions, etc.
• Router forwarding tables– Next-hop link(s) for each destination prefix
• BGP routing tables or BGP monitors– Routing choices advertised by other domains
Tutorial focuses mainly on traffic measurement data
Example: Router Configuration File
• Language with hundreds of different commands
• Cisco IOS is a de facto standard config language
• Sections for interfaces, routing protocols, filters, etc.
version 12.0
hostname MyRouter
!
interface Loopback0
 ip address 12.123.37.250 255.255.255.255
!
interface Serial9/1/0/4:0
 description MyT1Customer
 bandwidth 1536
 ip address 12.125.133.89 255.255.255.252
 ip access-group 10 in
!
interface POS6/0
 description MyBackboneLink
 ip address 12.123.36.73 255.255.255.252
 ip ospf cost 1024
!
router ospf 2
 network 12.123.36.72 0.0.0.3 area 9
 network 12.123.37.250 0.0.0.0 area 9
!
access-list 10 permit 12.125.133.88 0.0.0.3
access-list 10 permit 135.205.0.0 0.0.255.255
ip route 135.205.0.0 255.255.0.0 Serial9/1/0/4:0
Example: Forwarding Table (“show ip cef”)
Prefix           Next Hop        Interface
4.20.90.120/29   12.123.28.134   POS7/0
                 12.123.28.130   POS6/0
4.20.90.128/29   12.123.28.130   POS6/0
4.24.7.104/30    12.123.28.134   POS7/0
4.36.100.0/23    192.205.32.126  ATM5/0.1
6.0.0.0/8        12.123.28.134   POS7/0
                 12.123.28.130   POS6/0
9.2.0.0/16       192.205.32.126  ATM5/0.1
9.3.4.0/24       12.123.28.130   POS6/0
9.3.5.0/24       12.123.28.130   POS6/0
9.20.0.0/17      192.205.32.178  POS0/3
Random or hash-based tie-break to select among multiple next-hops
Example: BGP Table (“show ip bgp” at RouteViews)
   Network          Next Hop        Metric LocPrf Weight Path
*  3.0.0.0          205.215.45.50                 0      4006 701 80 i
*                   167.142.3.6                   0      5056 701 80 i
*                   157.22.9.7                    0      715 1 701 80 i
*                   195.219.96.239                0      8297 6453 701 80 i
*                   195.211.29.254                0      5409 6667 6427 3356 701 80 i
*>                  12.127.0.249                  0      7018 701 80 i
*                   213.200.87.254  929           0      3257 701 80 i
*  9.184.112.0/20   205.215.45.50                 0      4006 6461 3786 i
*                   195.66.225.254                0      5459 6461 3786 i
*>                  203.62.248.4                  0      1221 3786 i
*                   167.142.3.6                   0      5056 6461 6461 3786 i
*                   195.219.96.239                0      8297 6461 3786 i
*                   195.211.29.254                0      5409 6461 3786 i
AS 80 is General Electric, AS 701 is UUNET, AS 7018 is AT&T, AS 3786 is DACOM (Korea), AS 1221 is Telstra
Conclusions
• Operating IP networks is hard
– Basic design philosophy of the IP protocols
– Division of the Internet into multiple (competing) ASes
• Measurement and models play a crucial role– Constructing a real-time, network-wide view– Detecting, diagnosing, and fixing problems
• Next two 1.5-hour parts of the tutorial– Overview of the key measurement techniques
• Final 1.5-hour part of the tutorial– Measurement and models for traffic engineering
Part 2
MeasurementTechniques
Part 2: Measurement Techniques
• General terminology– Measurements/metrics, filter/aggregate/sample
• SNMP and RMON– SNMP MIB-II and RMON1
• Packet monitoring– IP, TCP/UDP, and application-level headers
• Flow measurement– Grouping related packets close together in time
• Data interpretation– IP addresses, port numbers, different identifiers
Terminology
active measurements
packet and flow measurements, SNMP/RMON
topology, configuration, routing, SNMP
Terminology: Measurements vs Metrics
end-to-end performance
state traffic
average download time of a web page
link bit error rate, link utilization
end-to-end delay and loss
active topology, traffic matrix
demand matrix, active routes
TCP bulk throughput
Delivery of Measurement Data
• Need to transport measurement data– Produced and consumed in different systems
– Large number of measurement devices
– Small number of aggregation points
– Usually in-band transport of measurement data
• Reliable vs. unreliable transport
– Reliable
• Better data quality
• Measurement device needs to maintain state and be addressable
– Unreliable
• Additional measurement uncertainty due to lost measurement data
• Measurement device can “shoot and forget”
Controlling Measurement Overhead
• Measurement overhead– In some areas, could measure everything– Information processing not the bottleneck– Examples: geology, stock market,...– Networking: thinning is crucial!
• Three basic methods to reduce overhead– Filtering– Aggregation– Sampling– ... and combinations thereof
Filtering
• Examples: only record packets matching– Destination prefix (to a certain customer)– Service class (e.g., expedited forwarding)– Violating an ACL (access control list)– TCP SYN/RST (attacks, aborted Web download)
• Simple implementation– Configure the filtering criterion in advance– Filter on certain bits in the packet header
• Mask register indicates which bits matter• Match register indicates value these bits should have
– Create record only for the matching packets
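The mask/match mechanism above can be sketched in a few lines; the mask says which header bits matter and the match gives the value they must have (treating the header as a single integer for simplicity, and using an illustrative one-byte protocol field):

```python
def make_filter(mask, match):
    """Record a packet only if its masked header bits equal `match`."""
    return lambda header_bits: (header_bits & mask) == match

# Illustrative example: keep only packets whose 8-bit protocol field
# is 6 (TCP), i.e. mask covers the whole byte and match is 6.
is_tcp = make_filter(mask=0xFF, match=6)
print(is_tcp(6))    # True  -> create a record for this packet
print(is_tcp(17))   # False -> skip (17 = UDP)
```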
Aggregation
• Examples: compute total bytes/packets per
– IP addresses (between each pair of end hosts)
– Autonomous System (to a peer/customer)
– Destination prefix (to a certain customer prefix)
– Service class (e.g., expedited forwarding)

src      dest     # pkts  # bytes
a.b.c.d  m.n.o.p  374     85498
e.f.g.h  q.r.s.t  7       280
i.j.k.l  u.v.w.x  48      3465
…        …        …       …
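A table like the one above amounts to one (packets, bytes) bin per key of interest, updated once per packet. A minimal sketch, using the slide's placeholder addresses:

```python
from collections import defaultdict

# One [packet count, byte count] bin per key; the key here is the
# source/destination address pair, but it could equally be an AS,
# a destination prefix, or a service class.
totals = defaultdict(lambda: [0, 0])

def aggregate(src, dst, length):
    bin_ = totals[(src, dst)]
    bin_[0] += 1          # packets
    bin_[1] += length     # bytes

aggregate("a.b.c.d", "m.n.o.p", 1500)
aggregate("a.b.c.d", "m.n.o.p", 40)
print(totals[("a.b.c.d", "m.n.o.p")])  # [2, 1540]
```

Note the memory cost named in the comparison table later: one bin per value of interest, in exchange for exact counts.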
Sampling
• Sampling techniques– Systematic: every 100th packet or second– Random: coin-flip with 1/100 probability– Hash based: deterministic based on a function
• Drawing inferences
– Easy
• Mean or variance of byte count
– Hard
• Rare events (e.g., likelihood of SYN from A to B)
• Negative information (e.g., no packets to A)
• Sequences (e.g., packet to A followed by packet to B)
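The three sampling techniques can be sketched as predicates applied per packet (the 1-in-100 rate and the packet identifier are illustrative; hash-based selection is the idea behind the trajectory sampling mentioned in the outline):

```python
import hashlib
import random

def systematic(i, n=100):
    """Every n-th packet, by position in the stream."""
    return i % n == 0

def random_sample(p=0.01):
    """Independent coin flip per packet with probability p."""
    return random.random() < p

def hash_based(packet_bytes, n=100):
    """Deterministic: the same packet is selected (or not) at every
    observation point, since the decision depends only on its content."""
    digest = hashlib.sha256(packet_bytes).digest()
    return int.from_bytes(digest[:4], "big") % n == 0

print(systematic(200))                                    # True
print(hash_based(b"pkt-42") == hash_based(b"pkt-42"))     # True: deterministic
```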
Basic Properties
                  Filtering              Aggregation            Sampling
Precision         exact                  exact                  approximate
Generality        constrained a priori   constrained a priori   general
Local processing  filter criterion       table update           only sampling
                  for every object       for every object       decision
Local memory      none                   one bin per            none
                                         value of interest
Compression       depends on data        depends on data        controlled
SNMP/RMON
SNMP/RMON
• Simple Network Management Protocol (SNMP)– Standardized by IETF– Definition of Management Information Base (MIB)– Protocol for querying and effecting MIB
• MIB-II– Aggregate traffic statistics– State information
• RMON1 (Remote MONitoring)– More local intelligence in agent– Agent monitors entire shared LAN– Too complex for high-speed links
SNMP: Naming Hierarchy + Protocol
• Information model: MIB tree
– Naming & semantic convention between management station and agent (router)
• Protocol to access MIB– GET, SET, GET-NEXT: host-initiated– Notification: agent-initiated– UDP!
mgmt
└─ mib-2
   ├─ system
   ├─ interfaces
   ├─ …
   └─ rmon
      ├─ statistics, history, alarm, … (RMON1)
      └─ protocolDir, protocolDist, … (RMON2)
MIB-II Overview: Example Groups
• interfaces
– Operational state: interface ok, switched off, faulty
– Aggregate traffic statistics: # pkts/bytes in, out, …
– Use: obtain and manipulate operational state; sanity check (does link carry any traffic?); detect congestion
• ip
– Errors: IP header error, destination address not valid, destination unknown, fragmentation problems, …
– Forwarding tables, how was each route learned,...
– Use: detect routing and forwarding problems (e.g., excessive forwarding errors due to bogus destination)
SNMP MIB-II Limitations
• Statistics hardcoded– No local intelligence to accumulate statistics– Cannot alert NMS to prespecified conditions
• Highly aggregated traffic information– Aggregate link statistics– Cannot drill down into more detail
• Protocol: simple=dumb– Cannot express complex queries over MIB – More expressibility in SNMPv3 expression MIB
RMON1: Remote Monitoring
• Advantages– Local intelligence and memory– Reduce management overhead– Robustness to outages
[Figure: a management station polling RMON monitors attached to each subnet]
RMON: Passive Metrics
• statistics group
– For every monitored LAN segment:
• Number of packets, bytes, broadcast/multicast packets
• Errors: CRC, length problem, collisions
• Size histogram: [64, 65-127, 128-255, 256-511, 512-1023, 1024-1518]
– Similar to the interfaces group, but computed over the entire traffic on the LAN
Passive Metrics cont.
• history group
– Parameters: sample interval, # buckets
– Sliding window: robustness to limited outages
– Statistics: almost perfect overlap with statistics group: # pkts/bytes, CRC & length errors, utilization
counter in statistics group
vector of samples
Passive Metrics (Continued)
• host group– Aggregate statistics per host
• pkts in/out, bytes in/out, errors, broadcast/multicast pkts
• hostTopN group– Ordered access into host group– Order criterion configurable
• matrix group– Statistics per source-destination pair
RMON: Active Metrics
[Figure: packets going through the subnet feed the statistics group; when an alarm condition is met, the alarm group triggers an event, which is written to the event log or sent as an SNMP notification to the NMS; when a filter condition is met, the filter & capture groups copy packets into a packet buffer]
Active Metrics cont.
• alarm group
– An alarm refers to one (scalar) variable in the RMON MIB
– Define thresholds (rising, falling, or both)
• Absolute: e.g., alarm as soon as 1000 errors have accumulated
• Delta: e.g., alarm if error rate over an interval > 1/sec
– Limiting alarm overhead: hysteresis
– Action as a result of an alarm is defined in the event group
• event group
– Define events: triggered by alarms or packet capture
– Log events
– Send notifications to the management system
– Example: “send a notification to the NMS if # bytes in a sampling interval > threshold”
Alarm Definition
metric delta-metric
Rising alarm with hysteresis
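A rising alarm with hysteresis can be sketched as follows: fire when the metric crosses the rising threshold, then stay silent until it has dropped back below the falling threshold (the sample values and thresholds are illustrative, not from any real MIB):

```python
def alarm_events(samples, rising, falling):
    """RMON-style rising alarm with hysteresis over a metric time series.
    Returns the sample indices at which an alarm would fire."""
    events, armed = [], True
    for t, value in enumerate(samples):
        if armed and value >= rising:
            events.append(("rising-alarm", t))
            armed = False            # suppress repeats (hysteresis)
        elif not armed and value <= falling:
            armed = True             # re-arm only after falling below
    return events

# The metric crosses 75 twice, but stays high in between, so only
# two alarms fire rather than one per high sample.
print(alarm_events([10, 80, 90, 85, 20, 95], rising=75, falling=30))
# [('rising-alarm', 1), ('rising-alarm', 5)]
```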
Filter and Capture Groups
• filter group
– Define boolean functions over packet bit patterns and packet status
– Bit pattern: e.g., “if source_address in prefix x and port_number = 53”
– Packet status: e.g., “if packet experienced CRC error”
• capture group
– Buffer management for captured packets
RMON: Commercial Products
• Built-in
– Passive groups: supported on most modern routers for the LAN environment
– Active groups: alarm usually supported; filter/capture are too taxing
• Dedicated probes
– Typically support all nine RMON MIBs
– Vendors: Netscout, Allied Telesyn, 3Com, etc.
– Combinations are possible: passive groups supported natively, filter/capture through an external probe
SNMP/RMON: Summary
• Standardized set of traffic measurements– Multiple vendors for probes & analysis software
– Off-the-shelf tools are available
– IETF: work on MIBs for diffserv, MPLS
• RMON: edge/LAN only
– Full RMON support everywhere would probably cover all our traffic measurement needs
• Passive groups could be supported on high-speed links
• Active groups require per-packet operations & memory
– Following sections: sacrifice flexibility for speed
Packet Monitoring
Packet Monitoring: Outline
• Definition– Passively collecting IP packets on one or more links– Recording IP, TCP/UDP, or application-layer traces
• Scope
– Fine-grain information about user behavior (e.g., URLs)
– Passively monitoring parts of the network infrastructure
– Helpful in characterizing traffic and diagnosing problems
• Outline
– Tapping a link and capturing packets
– Information available in packet traces
– Mechanics of collecting the packet traces
– Practical challenges in collecting and analyzing the data
– Existing packet monitoring platforms
Monitoring a LAN Link
Host A Host B Monitor
Shared media (Ethernet, wireless)
Host A
Host B
Host C
Monitor
Switch
Multicast switch
Host A – Bridge/Monitor – Host B
Monitor integrated with a bridge
Monitoring a WAN Link
Router A Router B
Monitor
Splitting a point-to-point link
Router A
Line card that does packet sampling
Selecting the Traffic
• Filter to focus on a subset of the packets
– IP addresses/prefixes (e.g., to/from specific Web sites, client machines, DNS servers, mail servers)
– Protocol (e.g., TCP, UDP, or ICMP)
– Port numbers (e.g., HTTP, DNS, BGP, Napster)
• Collect first n bytes of packet (snap length)– Medium access control header (if present)– IP header (typically 20 bytes)– IP+UDP header (typically 28 bytes)– IP+TCP header (typically 40 bytes)– Application-layer message (entire packet)
IP Header Format
4-bit version | 4-bit header length | 8-bit type of service (TOS) | 16-bit total length (bytes)
16-bit identification | 3-bit flags | 13-bit fragment offset
8-bit time to live (TTL) | 8-bit protocol | 16-bit header checksum
32-bit source IP address
32-bit destination IP address
Options (if any)
Payload
(fixed header: 20 bytes)
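Capturing the first 20 bytes is enough to decode all of the fixed fields above. A sketch using Python's struct module (the sample header bytes are hand-built for illustration):

```python
import struct

def parse_ip_header(data):
    """Decode the fixed 20-byte IPv4 header (ignores any options)."""
    ver_ihl, tos, total_len, ident, flags_frag, ttl, proto, cksum, src, dst = \
        struct.unpack("!BBHHHBBH4s4s", data[:20])
    return {
        "version": ver_ihl >> 4,
        "header_len": (ver_ihl & 0x0F) * 4,   # in bytes
        "total_len": total_len,
        "ttl": ttl,
        "protocol": proto,                    # 6 = TCP, 17 = UDP
        "src": ".".join(str(b) for b in src),
        "dst": ".".join(str(b) for b in dst),
    }

# Hand-built sample: a 40-byte TCP packet from 12.34.158.5.
hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 1, 0, 64, 6, 0,
                  bytes([12, 34, 158, 5]), bytes([4, 24, 7, 105]))
print(parse_ip_header(hdr)["src"])   # 12.34.158.5
```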
Analysis of IP Header Traces
• Source/destination addresses for traffic – Identity of popular Web servers & heavy customers
• Traffic breakdown by protocol (TCP/UDP/ICMP)– Amount of traffic not using congestion control
• Distribution of packet delay through the router– Identification of typical delays and anomalies
• Distribution of packet sizes
– Workload models for routers and measurement devices
• Burstiness of the traffic on the link over time
– Provisioning rules for allocating link capacity
• Throughput between each pair of src/dest addresses– Detection and diagnosis of performance problems
TCP Header Format
16-bit source port number | 16-bit destination port number
32-bit sequence number
32-bit acknowledgement number
4-bit header length | flags (URG, ACK, PSH, RST, SYN, FIN) | 16-bit window size
16-bit TCP checksum | 16-bit urgent pointer
Options (if any)
Payload
(fixed header: 20 bytes)
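The TCP header decodes the same way as the IP header; the flag bits live in the low-order bits of the 16-bit word that also holds the header length. A sketch with an illustrative hand-built SYN:

```python
import struct

# Flag bits from least significant upward, as laid out in the header.
TCP_FLAGS = ["FIN", "SYN", "RST", "PSH", "ACK", "URG"]

def parse_tcp_header(data):
    """Decode the fixed 20-byte TCP header."""
    sport, dport, seq, ack, off_flags, window, cksum, urg = \
        struct.unpack("!HHIIHHHH", data[:20])
    flags = [name for i, name in enumerate(TCP_FLAGS) if off_flags & (1 << i)]
    return {"sport": sport, "dport": dport, "seq": seq, "ack": ack,
            "flags": flags, "window": window}

# Hand-built sample: a SYN from an ephemeral port to port 80 (HTTP),
# with a 5-word (20-byte) header and no options.
hdr = struct.pack("!HHIIHHHH", 1043, 80, 617756405, 0,
                  (5 << 12) | 0x02, 32120, 0, 0)
print(parse_tcp_header(hdr)["flags"])   # ['SYN']
```

Counting SYN, FIN, and RST flags this way is exactly what the analyses below (connection attempts, aborted transfers) are built on.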
Tcpdump Output (three-way TCP handshake and HTTP request message)
23:40:21.008043 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: S 617756405:617756405(0) win 32120 <mss 1460,sackOK,timestamp 46339 0,nop,wscale 0> (DF)
timestamp | client address and port # | Web server (port 80)
SYN flag
23:40:21.036758 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: S
2598794605:2598794605(0) ack 617756406 win 16384 <mss 512>
23:40:21.036789 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: . 1:1(0) ack 1 win 32120 (DF)
23:40:21.037372 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: P 1:513(512) ack 1 win 32256 (DF)
23:40:21.085106 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: . 1:1(0) ack 513 win 16384
23:40:21.085140 eth0 > 135.207.38.125.1043 > lovelace.acm.org.www: P 513:676(163) ack 1 win 32256 (DF)
23:40:21.124835 eth0 < lovelace.acm.org.www > 135.207.38.125.1043: P 1:179(178) ack 676 win 16384
sequence number TCP options
TCP Header Analysis
• Source and destination port numbers– Popular applications (HTTP, Napster, SMTP, DNS)– Number of parallel connections between source-dest pairs
• Sequence/ACK numbers and packet timestamps– Out-of-order/lost packets; violations of congestion control– Estimates of throughput and delay of Web downloads
• Number of packets/bytes per connection– Size of typical Web transfers; frequency of bulk transfers
• SYN flags from client machines– Unsuccessful connection requests; denial-of-service attacks
• FIN/RST flags from client machines– Frequency of Web transfers aborted by clients
Packet Contents
• Application-layer header– HTTP and RTSP request and response headers
– FTP, NNTP, and SMTP commands and replies
– DNS queries and responses; OSPF/BGP messages
• Application-layer body – HTTP resources (or checksums of the contents)
– User keystrokes in Telnet/Rlogin sessions
• Security/privacy– Significant risk of violating user privacy
– More sensitive for information from higher-level protocols
– Traffic analysis thwarted by use of end-to-end encryption
HTTP Request and Response Message
GET /tutorial.html HTTP/1.1
Date: Wed, 25 Aug 2003 08:09:01 GMT
From: [email protected]
User-Agent: Mozilla/4.03
CRLF

HTTP/1.1 200 OK
Date: Wed, 15 Aug 2003 08:09:25 GMT
Server: Netscape-Enterprise/3.5.1
Last-Modified: Sun, 12 Mar 2000 11:12:23 GMT
Content-Length: 23
CRLF
Traffic measurement talk
Request
Response
Application-Layer Analysis
• URLs from HTTP request messages– Popular resources/sites; potential benefits of caching
• Meta-data in HTTP request/response messages– Content type, cacheability, change frequency, etc.– Browsers, protocol versions, protocol features, etc.
• Contents of DNS messages– Common queries, frequency of errors, query latency
• Contents of Telnet/Rlogin sessions– Intrusion detection (break-ins, stepping stones)
• Routing protocol messages– Workload for routers; detection of routing anomalies– Tracking the current topology/routes in the backbone
Mechanics: Application-Level Messages
• Application-level transfer may span multiple packets– Demultiplex packets into separate “flows”– Key of source/dest IP addresses, port #s, and protocol– Hash table to store packets from different flows
[Figure: each packet's header is hashed to a key that selects the flow's list in the table; packets arrive over time]
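The demultiplexing step above can be sketched in Python; the packet representation (a dict with src/dst/sport/dport/proto fields) is illustrative, not from any particular capture library.

```python
from collections import defaultdict

def flow_key(pkt):
    """The 5-tuple used to demultiplex packets into flows."""
    return (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])

def demux(packets):
    """Hash table mapping each flow key to that flow's packets, in order."""
    flows = defaultdict(list)
    for pkt in packets:
        flows[flow_key(pkt)].append(pkt)
    return flows
```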
Mechanics: Application-Level Messages
• Reconstructing ordered, reliable byte stream– Sequence number and segment length in TCP header– Heap to store packets in correct order & discard duplicates
• Extraction of application-level messages– Parsing the syntax of the application-level header– Identifying the start of the next message (if any)
• Logging or online analysis of message– Record URL, header, body, checksum, timestamps, etc.– Copy traces or analysis result to separate machine
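A minimal sketch of the reassembly step, using a heap ordered by sequence number; it assumes segments are given as (seq, data) pairs and ignores wraparound of the 32-bit sequence space.

```python
import heapq

def reassemble(segments, isn=0):
    """Rebuild the byte stream from (seq, data) TCP segments.

    Orders segments with a heap keyed on sequence number, trims overlap,
    and discards duplicates; stops at the first gap (missing segment).
    """
    heap = list(segments)
    heapq.heapify(heap)
    stream, expected = bytearray(), isn
    while heap:
        seq, data = heapq.heappop(heap)
        if seq + len(data) <= expected:
            continue                     # wholly duplicate segment
        if seq < expected:               # partial overlap: trim the front
            data = data[expected - seq:]
            seq = expected
        if seq > expected:
            break                        # gap: wait for retransmission
        stream += data
        expected += len(data)
    return bytes(stream)
```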
Placement of the Monitor (Edge)
• Client edge– Capture all traffic to/from a single group of clients– Useful for evaluating effectiveness of a proxy– May not be representative of other clients
• Server edge– Capture all traffic to/from a set of Web sites– Useful for detailed characterization of access patterns– May not be representative of accesses to other sites
Placement of the Monitor (Core)
• Middle of network– Capture all traffic traversing a particular link– Useful for capturing a diverse mix of traffic– May not see all traffic traveling from host A to host B– May not see the reverse traffic from host B to host A
System Constraints
• High data rate– Bandwidth limits on CPU, I/O, memory, and disk/tape– Could monitor lower-speed links (near the edge of network)
• High data volume– Space limitations in main memory and on disk/tape– Could do online analysis to sample, filter, & aggregate
• High processing load– CPU/memory limits for extracting, counting, & analyzing– Could do offline processing for time-consuming analysis
• General solutions to system constraints– Sub-select the traffic (addresses/ports, first n bytes)– Kernel and interface card support for measurement– Efficient/robust software and hardware for the monitor
Packet Sampling
• Random sampling of individual packets– Per-packet: select 1-out-of-m packets
– Per-byte: sample in proportion to packet length
• Appropriate for certain types of analysis– Traffic mix by port #, src/dest IP address, protocol, etc.– Packet size distribution, frequency of IP/TCP header values– Not appropriate for gleaning application-level information
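The two sampling disciplines can be sketched as follows; the sampling rate m and the 1500-byte maximum packet length are illustrative.

```python
import random

def sample_per_packet(packets, m=100, rng=random):
    """Select each packet independently with probability 1/m."""
    return [p for p in packets if rng.randrange(m) == 0]

def sample_per_byte(packets, max_len=1500, rng=random):
    """Select each packet with probability proportional to its length,
    so bytes (rather than packets) are sampled roughly uniformly."""
    return [p for p in packets if rng.random() < p["len"] / max_len]
```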
Application-Layer Sampling
• Sample application-layer messages– Select 1-out-of-m flows (all packets or no packets)
– Cannot randomly sample at the packet level
• Sample based on hash of the packet header– TCP/UDP session: addresses and port numbers
– End-host pair: source and destination address
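A hash-based variant can be sketched as below: hashing the flow key (rather than flipping a coin per packet) makes the keep/drop decision consistent for every packet of a flow. The field names and the 1-out-of-16 rate are illustrative.

```python
import hashlib

def keep_packet(pkt, m=16):
    """Keep 1-out-of-m flows: hash the flow key, not the individual packet."""
    key = "%s,%s,%s,%s,%s" % (pkt["src"], pkt["dst"],
                              pkt["sport"], pkt["dport"], pkt["proto"])
    digest = hashlib.sha1(key.encode()).digest()
    return digest[0] % m == 0   # identical decision for every packet of the flow
```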
Packet Monitoring Platforms: Research
• Tcpdump software– Publicly available, easy to use, widely used– UNIX PC with Ethernet/FDDI interface and running tcpdump
• OCxMon (CoralReef) from CAIDA– Monitoring OC3/OC12 links with ATM and SONET interfaces– First ATM cell in a packet, all ATM headers, or all cells
• AT&T Labs GigaScope– Line-card support for pre-processing; fine-grain timestamps– Software for appl-level measurement (HTTP, multimedia)
• SprintLabs monitor – Special-purpose interface (to capture 44 bytes per packet)– Fine-grain timestamps with GPS and clock synchronization
Packet Monitors: Commercial
• Characteristics of commercial monitors– Interfaces for tapping a variety of types of links
– Collection at IP, TCP/UDP, and application layer
– Exporting of data via SNMP/RMON and using Netflow format
– Software for generating standard and customized reports
• Example products– Narus Semantic Traffic Analyzers
• http://www.narus.com/solutions/analyzers.html
– Niksun NetVCR• http://www.niksun.com/products/netvcr.html
– Sniffer.com• http://www.sniffer.com/products/default.asp
– Agilent NetMetrix• http://www.agilent.com/cm/product/netmetrix/nmx_products_datcol.shtml
Packet Sampling: psamp Working Group
• IETF working group on packet sampling– Minimal measurement functionality
– Suitable for high-speed line cards
– Tunable overhead/accuracy trade-offs– http://www.ietf.org/html.charters/psamp-charter.html
• Basic idea: parallel filter/sample banks– Filter on header fields (src/dest addr, port #s, …)
– 1-out-of-N sampling (random or hash-based)
– Extract header fields, output link, IP prefix, …
– Send group of records to remote collectors
Conclusions
• Packet monitoring– Detailed, fine-grain view of individual links
• Advantages– Finest level of granularity (individual IP packets)– Primary source of application-level information
• Disadvantages– Expensive to build and deploy– Large volume of measurement data– Difficult to deploy widely over a large network– Challenges collecting on high-speed links– Challenges of reconstructing application-level info
Flow Measurement
Flow Measurement: Outline
• Definition– Passively collecting statistics about groups of packets– Group packets based on headers and spacing in time– Essentially a way to aggregate packet measurement data
• Scope– Medium-grain information about user behavior– Passively monitoring the link or the interface/router– Helpful in characterizing, detecting, diagnosing, and fixing problems
• Outline– Definition of an IP “flow” (sequence of related packets)– Flow measurement data and its applications– Mechanics of collecting flow-level measurements– Reducing the overheads of flow-level measurement
[Figure: packets on a timeline grouped into flow 1, flow 2, flow 3, and flow 4]
IP Flows
• Set of packets that “belong together”– Source/destination IP addresses and port numbers
– Same protocol, ToS bits, …
– Same input/output interfaces at a router (if known)
• Packets that are “close” together in time– Maximum spacing between packets (e.g., 15 sec, 30 sec)
– Example: flows 2 and 4 are different flows due to time
Flow Abstraction
• A flow is not exactly the same as a “session”– Sequence of related packets may be multiple flows (due to the “close together in time” requirement)
– Sequence of related packets may not follow the same links (due to changes in IP routing)
– A “session” is difficult to measure from inside the network
• Motivation for this abstraction– As close to a “session” as possible from inside the network
– Flow switching paradigm from IP-over-ATM technology
– Router optimization for forwarding/access-control decisions (cache the result after the first packet in a flow)
– … might as well throw in a few counters
Recording Traffic Statistics (e.g., Netflow)
• Packet header information (same for every packet)– Source and destination IP addresses
– Source and destination TCP/UDP port numbers
– Other IP & TCP/UDP header fields (protocol, ToS bits, etc.)
• Aggregate traffic information (summary of the traffic)– Start and finish time of the flow (time of first & last packet)
– Total number of bytes and number of packets in the flow– TCP flags (e.g., logical OR over the sequence of packets)
[Example: a flow of four packets (SYN, ACK, ACK, FIN) totaling 1436 bytes; the record stores the start and finish times and the OR of the TCP flags: SYN, ACK & FIN]
Recording Routing Info (e.g., Netflow)
• Input and output interfaces– Input interface is where the packets entered the router– Output interface is the “next hop” in the forwarding table
• Source and destination IP prefix (mask length)– Longest prefix match on the src and dest IP addresses
• Source and destination autonomous system numbers– Origin AS for src/dest prefix in the BGP routing table
[Figure: router with six line cards attached to a switching fabric; the processor holds the BGP table and the forwarding table]
Measuring Traffic as it Flows By
[Figure: traffic observed at a link, annotated with input/output interface, source/dest address, source/dest prefix, source/dest AS, and intermediate AS]
Source and destination: IP header
Source and dest prefix: forwarding table or BGP table
Source and destination AS: BGP table
Packet vs. Flow Measurement
• Basic statistics (available from both techniques)– Traffic mix by IP addresses, port numbers, and protocol– Average packet size
• Traffic over time– Both: traffic volumes on a medium-to-large time scale– Packet: burstiness of the traffic on a small time scale
• Statistics per TCP connection– Both: number of packets & bytes transferred over the link– Packet: frequency of lost or out-of-order packets, and the number of application-level bytes delivered
• Per-packet info (available only from packet traces)– TCP seq/ack #s, receiver window, per-packet flags, …– Probability distribution of packet sizes– Application-level header and body (full packet contents)
Collecting Flow Measurements
Three placement options:
– Route CPU that generates flow records …may degrade forwarding performance
– Line card that generates flow records …more efficient to support measurement in each line card
– Packet monitor (third party) between routers that generates flow records
Router Collecting Flow Measurement
• Advantage– No need for separate measurement device(s)– Monitor traffic over all links in/out of router (parallelism)– Ease of providing routing information for each flow
• Disadvantage– Requirement for support in the router product(s)– Danger of competing with other 1st-order router features– Possible degradation of the throughput of the router– Difficulty of online analysis/aggregation of data on router
• Practical application– View from multiple vantage points (e.g., all edge links)
Packet Monitor Collecting Flow Records
• Advantages– No performance impact on packet forwarding– No dependence on support by router vendor – Possibility of customizing the thinning of the data
• Disadvantages– Overhead/cost of tapping a link & reconstructing packets– Cost of buying, deploying, and managing extra equipment– No access to routing info (input/output link, IP prefix, etc.)
• Practical application– Selective monitoring of a small number of links– Deployment in front of particular services or sites
• Packet monitor vendors support flow-level output
Mechanics: Flow Cache
• Maintain a cache of active flows– Storage of byte/packet counts, timestamps, etc.
• Compute a key per incoming packet– Concatenation of source, destination, port #s, etc.
• Index into the flow cache based on the key– Creation or updating of an entry in the flow cache
[Figure: a packet's header is hashed to a key that indexes flow-cache entries holding #bytes, #packets, start, finish]
Mechanics: Evicting Cache Entries
• Flow timeout– Remove flows that have not received a packet recently – Periodic sequencing through the cache to time out flows– New packet triggers the creation of a new flow
• Cache replacement– Remove flow(s) when the flow cache is full– Evict existing flow(s) upon creating a new cache entry– Apply eviction policy (LRU, random flow, etc.)
• Long-lived flows– Remove flow(s) that persist for a long time (e.g., 30 min)– … otherwise flow statistics don’t become available– … and the byte and packet counters might overflow
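A toy flow cache along these lines, with a 30-second idle timeout; the record layout [bytes, packets, start, finish] loosely mirrors the Netflow fields described earlier, and the global-table design is purely illustrative.

```python
TIMEOUT = 30.0   # evict flows idle longer than this many seconds
cache = {}       # flow key -> [bytes, packets, start, finish]
exported = []    # evicted (key, record) pairs, ready for export

def update(key, length, now):
    """Create or update the cache entry for the packet's flow."""
    entry = cache.get(key)
    if entry is None:
        cache[key] = [length, 1, now, now]
    else:
        entry[0] += length          # byte count
        entry[1] += 1               # packet count
        entry[3] = now              # finish timestamp

def expire(now):
    """Sequence through the cache, exporting flows idle past the timeout."""
    for key in [k for k, e in cache.items() if now - e[3] > TIMEOUT]:
        exported.append((key, cache.pop(key)))
```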
Measurement Overhead: Per Packet/Flow
• Per-packet overhead– Computing the key for the packet– Indexing into the flow cache based on the key– More work when the average packet size is small– May not be able to keep up with the link speed
• Per-flow overhead– Creation and eviction of entry in the flow cache– Volume of measurement data (# of flow records)– Larger number of flows when #packets per flow is small– Extreme case of one-packet, 40-byte flows (SYN attack!)– May overwhelm the system collecting/analyzing the data
Measurement Overhead: Active Flows
• Number of active flows on the link– Depends on the link speed and per-flow bandwidth
– Memory requirements of the flow cache
– Overhead for indexing into the flow cache
– Overhead for sequencing through the flow cache
• “Working set” may exceed the cache size– #of active flows may be larger than # cache entries
– Repeated eviction and creation of new cache entries
– Division of single flow into multiple flow records
– Increase in volume of flow measurement records
Sampling: Packet Sampling
• Packet sampling before flow creation (Sampled Netflow)– 1-out-of-m sampling of individual packets (e.g., m=100)– Creation of flow records over the sampled packets
• Reducing overhead– Avoid per-packet overhead on (m-1)/m packets– Avoid creating records for a large number of small flows
• Increasing overhead (in some cases)– May split some long transfers into multiple flow records – … due to larger time gaps between successive packets
[Figure: sampling removes packets, widening the gap between recorded packets so that the timeout splits one transfer into two flows]
Sampling: Flow-Level Sampling
• Sampling of flow records evicted from flow cache– When evicting flows from table or when analyzing flows
• Stratified sampling to put weight on “heavy” flows– Select all long flows and sample the short flows
• Reduces the number of flow records – Still measures the vast majority of the traffic
Example flow records:
Flow 1, 40 bytes
Flow 2, 15580 bytes
Flow 3, 8196 bytes
Flow 4, 5350789 bytes
Flow 5, 532 bytes
Flow 6, 7432 bytes
Sampling probability grows with flow size (e.g., 0.1% for the smallest flows, 10% for medium flows, 100% for the largest).
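One way to sketch the stratified scheme; the size thresholds and probabilities are made up for illustration.

```python
import random

def sample_flows(records, rng=random):
    """Keep all heavy flows, sample lighter strata with lower probability."""
    kept = []
    for flow_id, nbytes in records:
        if nbytes >= 1_000_000:
            prob = 1.0        # always keep heavy hitters
        elif nbytes >= 1_000:
            prob = 0.10
        else:
            prob = 0.001
        if rng.random() < prob:
            kept.append((flow_id, nbytes))
    return kept
```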
Aggregation
• Define flows at a coarser level– Ignore TCP/UDP port numbers, ToS bits, etc.– Source & destination IP prefix (not full addresses)– Source & destination Autonomous System number
• Advantages– Substantially reduce the size of the flow cache– Substantially reduce the number of flow records– Accurate information for a variety of applications
• Disadvantage– Lost information for basic traffic reporting– Impacted by the view of routing (prefixes) at this router
• Could aggregate after creating fine-grain flows– When evicting flows from the flow cache– When analyzing the flow records (offline)
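A sketch of offline aggregation to source/destination /24 prefixes; the record layout is illustrative, and a real implementation would use the router's actual prefix tables rather than fixed /24 masks.

```python
from collections import defaultdict

def to_slash24(addr):
    """Mask a dotted-quad address down to its /24 prefix."""
    return ".".join(addr.split(".")[:3]) + ".0/24"

def aggregate(records):
    """Sum bytes over (source /24, destination /24) pairs, ignoring ports."""
    totals = defaultdict(int)
    for src, dst, sport, dport, nbytes in records:
        totals[(to_slash24(src), to_slash24(dst))] += nbytes
    return dict(totals)
```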
IETF Standards Activity
• Real-Time Traffic Flow Meter (RTFM)– Past working group on describing and measuring flows– Meter with flow table and packet matching/handling– Meter readers that transport usage data from the meters– Manager for downloading rule sets to the meter– SNMP for downloading rules and reading usage data
• Internet Protocol Flow Information eXport (IPFIX)– Distinguishing flows (interfaces, IP header fields, transport header fields, MPLS label, diff-serv code point)– Metering (reliability, sampling, timestamps, flow timeout)– Data export (information model, reliability, confidentiality, integrity, anonymization, reporting times)– http://www.ietf.org/html.charters/ipfix-charter.html
Conclusions
• Flow measurement– Medium-grain view of traffic on one or more links
• Advantages– Lower measurement volume than full packet traces– Available on high-end line cards (Cisco Netflow)– Control over overhead via aggregation and sampling
• Disadvantages– Computation and memory requirements for the flow cache– Loss of fine-grain timing and per-packet information– Not uniformly supported by router vendors
Data Interpretation
Data Interpretation: Outline
• Problem– Measurement data collected inside the network– Ambiguity about applications, hosts, & institutions– An issue for both packet and flow measurement
• Outline– Identifying hosts and institutions– Dynamic IP addresses– Port number ambiguity– Multiple identifiers across data sets
Identifying Hosts and Institutions
• nslookup– Translate domain name to an IP address– Translate IP address into domain name(s)– Reverse queries often unsuccessful
• whois– Identify institution responsible for an IP address– Identify institution responsible for an AS number– Information often incomplete or out-of-date
• traceroute– Identify names of interfaces near the destination address
• BGP tables– Associate an IP address with an IP prefix– Associate an address or prefix with an origin AS– Different vantage points, multiple origin ASes, etc.
nslookup Example
jrex% nslookup 135.207.38.125
*** alliance.research.att.com can't find 135.207.38.125: Non-existent host/domain
jrex% nslookup 199.222.69.151
Name: lovelace.acm.org
Address: 199.222.69.151
whois Examples
jrex% whois -h whois.arin.net 135.207.38.125
AT&T ITS (NETBLK-ATT-135-197-219-B) ATT-135-197-219-B 135.197.0.0 - 135.219.255.255
AT&T Corporate ITS (NET-ATT-135-207-B) ATT-135-207-B 135.207.0.0 - 135.207.255.255

jrex% whois -h whois.arin.net 199.222.69.151
ANS Communications (NETBLK-ANS-C-BLOCK2) ANS-C-BLOCK2 199.219.0.0 - 199.222.255.255
Association for Computing Machiner (NETBLK-ACM-NET) ACM-NET 199.222.69.0 - 199.222.69.255
traceroute example
jrex% traceroute 199.222.69.151
traceroute to 199.222.69.151 (199.222.69.151), 30 hops max, 40 byte packets
 1 oden (135.207.31.1) 1 ms 1 ms 1 ms
 2 opsec (135.207.1.2) 2 ms 2 ms 2 ms
 3 argus (192.20.225.225) 33 ms 219 ms 4 ms
 4 Serial1-4.GW4.EWR1.ALTER.NET (157.130.0.177) 4 ms 3 ms 4 ms
 5 117.ATM4-0.XR1.EWR1.ALTER.NET (152.63.25.186) 4 ms 4 ms 4 ms
 6 293.ATM6-0.XR1.NYC4.ALTER.NET (146.188.178.57) 4 ms 5 ms 11 ms
 7 189.ATM8-0-0.GW2.NYC4.ALTER.NET (146.188.178.133) 5 ms 5 ms 6 ms
 8 acm-gw.customer.ALTER.NET (157.130.17.238) 11 ms 10 ms 10 ms
 9 lovelace.acm.org (199.222.69.151) 9 ms 8 ms 8 ms
BGP Table Example (Using RouteViews)
* 199.222.69.0    167.142.3.6       0  5056 701 7046 i
*                 4.0.0.2           0  1 701 7046 i
*                 204.42.253.253    0  267 2914 701 7046 i
*                 212.4.193.253     0  8918 701 7046 i
*                 205.215.45.50     0  4006 701 7046 i
*                 193.140.0.1       0  8517 9000 2548 701 7046 i
*                 165.87.32.5       0  2685 701 7046 e
*                 206.220.240.223   0  10764 1 701 7046 i
*                 203.62.248.4      0  1221 16779 1 701 7046 i
*                 203.62.252.21     0  1221 16779 1 701 7046 i
*                 157.22.9.7        0  715 1 701 7046 i
*                 193.0.0.56        0  3333 9057 3356 701 7046 i
*                 195.219.96.239    0  8297 6453 701 7046 i
Prefix 199.222.69.0/24 has origin AS 7046
(whois says that 7046 is ASN-UUNET-CUSTOMER)
Associating IP Address With Host
• Problem: dynamic assignment of IP addresses– Difficult to associate traffic with client or user over time
• Track assignment of IP address– Modem records that indicate IP address assignment– Logs or traces of address allocations by DHCP servers
• Timeout heuristic– Assume a single client is responsible for the traffic– … until a period of inactivity (IP address may be reassigned)– Useful for studying “session-level” user behavior
• Application-level headers– Extract client/user-level information from the application-level header– Unique user with From: [email protected] or Cookie: user17– Change from User-Agent: Mozilla/4.03 to Mozilla/2.0
Port Numbers: Ambiguity
• Associate well-known ports with applications– Port 53 for DNS, port 80 for HTTP, port 119 for NNTP, …
• Expect unusual use of port numbers– Keep track of de facto ports (e.g., 8000 and 8080 for HTTP)
– Check for anomalous transfers on well-known ports
– Explore sites using unexplained port #s (download games!)
• Expect dynamic assignment of port numbers– Check for parallel transfers between the same end points (e.g., to find FTP data transfer in parallel to FTP control)– Parse the control messages that select the dynamic port #s (e.g., parse FTP commands and RTSP messages)
Incoming Traffic Volume by Application
Application    July 1, 2001   Sep 13, 2000
HTTPS               1.0%           1.0%
NNTP                1.3%           3.2%
SMTP                1.5%           2.9%
Napster             2.0%          10.7%
UDP unknown         3.3%           3.2%
FTP                 5.7%           4.3%
KaZaA               9.9%           0.0%
TCP unknown        17.2%          12.9%
HTTP               46.7%          53.2%
Multiple Identifiers of Interfaces
• Flow-level measurements– Netflow record identifies an interface by SNMP index– Flow with input interface “7” and output interface “3”
• SNMP interface MIB– Index (7), IP address (12.123.36.73), and name (POS6/0)
• Configuration files– Interface IP address (12.123.36.73) and name (POS6/0)– BGP session to a neighbor AS associated with the interface
• Joining multiple data sets – Netflow: traffic volume and SNMP interface index – SNMP MIB: interface index, IP address, and name– Configuration: neighbor AS associated with interface
Conclusions
• Identifying institutions– Tools like nslookup, whois, traceroute, BGP tables
• IP and port number ambiguity– Heuristics, appl-level info, and de facto standards
• Multiple identifiers for same network element– Join of information from multiple data sets
Part 3
Measurements and Models
for Traffic Engineering
Outline
• Motivation– Traffic engineering and network-wide views
• Network-wide models– Path matrix, traffic matrix, & demand matrix
• Populating the models– Inference, mapping, and direct observation
• Applying the models– Tuning OSPF weights to the offered traffic
Traffic Engineering
• Goal: domain-wide control & management to– Satisfy performance goals– Use resources efficiently
• Knobs:– Topology: provisioning, capacity planning– Routing: OSPF weights, BGP policies,…– Traffic classification (diffserv), admission control,
…
• Measurements are key: closed control loop– Understand current state, load, and traffic flow– Do “what-if” analysis to select control actions– Inherently a coarse-grained activity
Methodology: Measure, Model, and Control
[Figure: measure the operational network (topology/configuration and offered traffic), feed the measurements into a model, run analysis & optimization, and apply control through changes to the network; a closed measure-model-control loop]
Network-Wide Traffic Representations
• Network-wide views– Not supported by IP (stateless, decentralized)
– Combining traffic, topology, and state information
• Challenges– Assumptions about the properties of the traffic
– Assumptions about the topology and routing
– Assumptions about the support for measurement
• Models: traffic, demand, and path matrices– Populating the models from measurement data
– Recent proposals for new types of measurements
End-to-End Traffic & Demand Models
path matrix = bytes per path
traffic matrix = bytes per source-destination pair
Ideally, the path matrix captures all the information about the current network state and behavior, while the traffic matrix captures all the information that is invariant with respect to the network state.
Domain-Wide Network Traffic Models
current state & traffic flow
fine grained: path matrix = bytes per path
intradomain focus: traffic matrix = bytes per ingress-egress
interdomain focus: demand matrix = bytes per ingress and set of possible egresses
predicted control action: impact of intra-domain routing
predicted control action: impact of inter-domain routing
Path Matrix: Operational Uses
• Congested link– Problem: easy to detect, hard to diagnose
– Which traffic responsible? Which traffic affected?
• Customer complaint– Problem: customer has limited visibility
– How is the traffic of a given customer routed?
– Where does the traffic experience loss and delay?
• Denial-of-service attack– Problem: spoofed source addr, distributed attack
– Where is attack coming from? Who is affected?
Traffic Matrix: Operational Uses
• Short-term performance problems– Problem: predicting link loads after routing change
– Map the traffic matrix onto the new set of routes
• Long-term performance problems– Problem: predicting loads after topology change
– Map traffic matrix onto the new topology
• Reliability despite equipment failures– Problem: allocating spare capacity for failover
– Find link weights so no failure causes overload
Demand Matrix: Motivating Example
[Figure: a Web site and a user site connected across the big Internet]
Coupling of Inter and Intradomain Routing
[Figure: ASes 1-4 between the Web site and the user site; BGP advertisements for destination U ("AS 3, U" and "AS 4, AS 3, U") give the Web site's provider multiple ways to reach the user]
Intradomain Routing: Hot Potato
Zoom in on AS1
[Figure: internal topology of AS1 with link weights (10, 25, 50, 75, 110, 200, 300) and egress points OUT 1, OUT 2, OUT 3 for traffic entering at IN]
Hot-potato routing: change in internal routing (link weights) configuration changes flow exit point!
Demand Model: Operational Uses
• Coupling problem with traffic matrix approach
Demands: # bytes for each (in, {out_1,...,out_m})– ingress link (in)
– set of possible egress links ({out_1,...,out_m})
[Figure: with a traffic matrix as input, traffic engineering and improved routing feed back on each other; with a demand matrix, traffic engineering yields improved routing without the feedback loop]
Populating the Domain-Wide Models
• Inference: assumptions about traffic & routing– Traffic data: byte counts per link (over time)
– Routing data: path(s) between each pair of nodes
• Mapping: assumptions about routing– Traffic data: packet/flow statistics at network edge
– Routing data: egress points per destination prefix
• Direct observation: no assumptions– Traffic data: packet samples at every link
– Routing data: none
Inference: Network Tomography
[Figure: sources and destinations connected through the network]
From link counts to the traffic matrix
Tomography: Formalizing the Problem
• Ingress-egress pairs – p is an ingress-egress pair of nodes
– x_p is the (unknown) traffic volume for this pair
• Routing– R_lp = 1 if link l is on the path for pair p
– Or, R_lp is the proportion of p’s traffic that traverses l
• Links in the network– l is a unidirectional edge
– y_l is the observed traffic volume on this link
• Relationship: y = Rx (work back to get x)
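A toy instance of y = Rx: with as many independent link observations as unknown pairs, the system can be solved directly, which is exactly what a real network lacks (hence the stochastic models that follow). The 2x2 example below is illustrative.

```python
def solve_2x2(R, y):
    """Exact solve of a 2x2 linear system R x = y by Cramer's rule."""
    (a, b), (c, d) = R
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det,
            (a * y[1] - c * y[0]) / det]

# R[l][p] = 1 if link l is on pair p's path:
# pair 1 crosses links 1 and 2; pair 2 crosses only link 2.
R = [[1.0, 0.0],
     [1.0, 1.0]]
y = [4.0, 7.0]           # observed byte counts on links 1 and 2
x = solve_2x2(R, y)      # inferred per-pair traffic volumes
```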
Tomography: One Observation Not Enough
• Linear system is underdetermined– Number of nodes n
– Number of links e is around O(n)
– Number of ingress-egress pairs c is O(n^2)
– Dimension of solution sub-space at least c - e
• Multiple observations are needed– k independent observations (over time)
– Stochastic model with Poisson iid counts
– Maximum likelihood estimation to infer matrix
– Vardi, “Network Tomography,” JASA, March 1996
Tomography: Challenges
• Limitations– Cannot handle packet loss or multicast traffic
– Statistical assumptions don’t match IP traffic
– Significant error even with large # of samples
– High computation overhead for large networks
• Directions for future work– More realistic assumptions about the IP traffic
– Partial queries over subgraphs in the network
– Incorporating additional measurement data
Promising Extension: Gravity Models
• Gravitational assumption– Ingress point a has incoming traffic v_in(a)– Egress point b has outgoing traffic v_eg(b)– Pair (a,b) has traffic proportional to v_in(a) * v_eg(b)
• Incorporating hot-potato routing– Combine traffic across egress points to same peer– Divide a’s traffic proportional to peer loads– “Hot potato” for single egress point for a’s traffic
• Experimental results [SIGMETRICS’03]– Reasonable accuracy, especially for large pairs– Sufficient accuracy for traffic engineering
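The gravitational assumption can be sketched in a few lines; the dict-based volumes v_in and v_out are illustrative, and dividing by the total egress volume makes each ingress row sum to v_in(a).

```python
def gravity_matrix(v_in, v_out):
    """Estimate pair (a, b) volume as v_in[a] * v_out[b] / total egress."""
    total = sum(v_out.values())
    return {(a, b): v_in[a] * v_out[b] / total
            for a in v_in for b in v_out}
```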
Mapping: Remove Traffic Assumptions
• Assumptions– Know egress point where traffic leaves the domain– Know path from the ingress to the egress point
• Approach– Collect fine-grain measurements at ingress points– Associate each record with path and egress point– Sum over measurements with same path/egress
• Requirements– Packet or flow measurement at the ingress points– Routing table from each of the egress points
Traffic Mapping: Ingress Measurement
• Traffic measurement data– Ingress point i
– Destination prefix d
– Traffic volume Vid
Traffic Mapping: Egress Point(s)
• Routing data (e.g., router forwarding tables)– Destination prefix d
– Set of egress points ed
Traffic Mapping: Combining the Data
• Combining multiple types of data
– Traffic: Vid (ingress i, destination prefix d)
– Routing: ed (set ed of egress links toward d)
– Combining: sum over Vid with same ed
Mapping: Challenges
• Limitations– Need for fine-grain data from ingress points
– Large volume of traffic measurement data
– Need for forwarding tables from egress point
– Data inconsistencies across different locations
• Directions for future work– Vendor support for packet/flow measurement
– Distributed infrastructure for collecting data
– Online monitoring of topology and routing data
Direct Observation: Overcome Uncertainty
• Internet traffic– Time fluctuation (burstiness, congestion control)
– Packet loss as traffic flows through the network
– Inconsistencies in timestamps across routers
• IP routing protocols– Changes due to failure and reconfiguration
– Large state space (high number of links or paths)
– Vendor-specific details (e.g., tie-breaking)
– Multicast trees that send to a set of receivers
• Better to observe traffic directly as it travels
Direct Observation: Straw-Man Approaches
• Path marking– Packet carries the path it has traversed so far
– Drawback: excessive overhead
• Packet or flow measurement on every link– Combine records across all links to obtain paths
– Drawback: excessive overhead
• Sample the entire path for certain packets– Sample & tag a fraction of packets at ingress point
– Sample all tagged packets inside the network
– Drawback: requires modification to IP
Direct Observation: Trajectory Sampling
• Sample packets at every link without tagging– Pseudo random sampling (e.g., 1-out-of-100)
– Either sample or don’t sample at each link
– Compute a hash over the contents of the packet
• Details of consistent sampling– x: subset of invariant bits in the packet
– Hash function: h(x) = x mod A
– Sample if h(x) < r, where r/A is a thinning factor
• Exploit entropy in packet contents
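The consistent-sampling rule, h(x) = x mod A, sample if h(x) < r, in miniature; the values of A and r below are illustrative, and x stands for the integer formed from the invariant packet bits.

```python
A = 2 ** 16
r = 655          # thinning factor r/A, roughly 1%

def sample(x):
    """x: integer formed from invariant packet bits; since every monitor
    applies the same function, a packet is sampled either at every hop
    on its path or at none."""
    return (x % A) < r
```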
Trajectory Sampling: Fields in Hashes
Trajectory Sampling: Labeling
• Reducing the measurement overhead– Do not need entire contents of packets
– Compute packet id using 2nd hash function
– Reconstruct trajectories from the packet ids
• Trade-off– Small labels: possibility of collisions
– Large labels: higher overhead
– Labels of 20-30 bits seem to be enough
Trajectory Sampling: Sampling & Labeling
Trajectory Sampling: Summary
• Advantages– Estimation of the path and traffic matrices
– Estimation of performance statistics
– No assumptions about routing or traffic
– Applicable to multicast traffic and DoS attacks
– Flexible control over measurement overhead
• Disadvantages– Requires new support on router interface cards
– Requires use of same hash function at each hop
Populating Models: Summary/Comparison
• Inference– Given: per-link counts and routes per src/dest pair– Network tomography with stochastic traffic model– Others: gravity models, entropy models, …
• Mapping– Given: ingress traffic measurement and routes– Combining flow traces and forwarding tables– Other: combining packet traces and BGP tables
• Direct observation– Given: measurement support at every link/router– Trajectory sampling with consistent hashing– Others: IP traceback, ICMP traceback
Applying the Models: Traffic Engineering
• Network topology– Connectivity and capacity of routers and links
• Routing configuration– Interdomain policies and intradomain weights
• Traffic demands– Expected load between points in the network
• Performance objective– Balanced load, low delay, peering agreements, …
Given topology and traffic, select routing parameters
Tuning OSPF/IS-IS Link Weights
• Measured inputs– Traffic demands– Network topology
• Objective function– Max link utilization or sum of exp(utilization)
• “What-if” model of intradomain routing– Select a closest exit point based on link weights– Compute shortest path(s) based on link weights– Capture traffic splitting over the shortest paths
• Search for a good setting of the weights
Weight Optimization
• Local search– Generate a candidate setting of the weights– Predict the resulting load on the network links– Compute the value of the objective function– Select solution with min objective function
• Efficient computation– Explore the “neighborhood” around good solutions– Exploit efficient incremental graph algorithms
• Performance on AT&T’s network– Much better than link capacity or physical distance– Competitive with multi-commodity flow solution
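A heavily simplified local search in this spirit: perturb one weight at a time, re-route each demand on its shortest path (single path, no ECMP splitting), and keep the candidate if the maximum link utilization improves. The topology, demands, capacities, and weight range are made up for illustration.

```python
import heapq, random

def shortest_path(graph, weights, src, dst):
    """Dijkstra; returns the list of directed links on one shortest path."""
    pq, seen = [(0, src, [])], set()
    while pq:
        cost, node, path = heapq.heappop(pq)
        if node in seen:
            continue
        seen.add(node)
        if node == dst:
            return path
        for nbr in graph[node]:
            if nbr not in seen:
                heapq.heappush(pq, (cost + weights[(node, nbr)], nbr,
                                    path + [(node, nbr)]))
    return None

def max_utilization(graph, weights, demands, capacity):
    """Route every demand on its shortest path; report the max link load."""
    load = {link: 0.0 for link in weights}
    for (src, dst), volume in demands.items():
        for link in shortest_path(graph, weights, src, dst):
            load[link] += volume
    return max(load[link] / capacity[link] for link in weights)

def local_search(graph, weights, demands, capacity, iters=200, rng=random):
    """Hill-climb over weight settings, accepting strict improvements."""
    best = max_utilization(graph, weights, demands, capacity)
    for _ in range(iters):
        candidate = dict(weights)
        candidate[rng.choice(list(candidate))] = rng.randint(1, 10)
        util = max_utilization(graph, candidate, demands, capacity)
        if util < best:
            weights, best = candidate, util
    return weights, best
```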
Incorporating Operational Realities
• Minimize changes to the network– Changing just 1-2 link weights is often enough
• Tolerate failure of network equipment– Weights settings usually remain good after failure– … or can be fixed by changing one or two weights
• Limit the number of distinct weight values– Small number of integer values is sufficient
• Limit dependence on measurement accuracy– Good weights remain good despite random noise
• Limit frequency of changes to the weights– Joint optimization for day & night traffic matrices
Conclusions
• Operating IP networks is challenging– IP networks stateless, best-effort, heterogeneous– Operators lack end-to-end control over the path– IP was not designed with measurement in mind
• Domain-wide traffic models– Needed to detect, diagnose, and fix problems– Models: path, traffic, and demand matrices– Techniques: inference, mapping, and direct observation– Different assumptions about traffic, routing, and data
• Follow-up references– http://www.research.att.com/~jrex/papers/sfi.ps– http://www.research.att.com/~jrex/papers/ieeecomm02.pdf
• Thanks for coming! Any questions?