Internet Routing Instability

Internet Routing Instability. Craig Labovitz, G. Robert Malan, Farnam Jahanian, University of Michigan. Presented by Krishnanand M Kamath.


Post on 10-Feb-2016



Page 1: Internet Routing Instability

Internet Routing Instability

Craig Labovitz, G. Robert Malan, Farnam Jahanian

University of Michigan

Presented by Krishnanand M Kamath

Page 2: Internet Routing Instability

Cause and Effect

Define routing instability
• Rapid change of network reachability and topology information

Causes
• Router configuration errors
• Transient physical and data link problems (problems with leased lines, router failures, high levels of congestion)
• Software configuration errors

Effects
• Numerous; see the next slide

Page 3: Internet Routing Instability

Effects

Increased network latency and time to convergence
• Dropped and out-of-order delivery of packets
• Miserable end-to-end performance

Loss of connectivity in national networks
• Routers combine a route caching architecture with low-end CPUs
• Pr(cache miss) increases, causing severe CPU load and memory problems
• Packet processing is delayed, so Keep-Alive packets are delayed
• Other routers flag the router as down and transmit updates
• The "down" router reinitiates its peering sessions
• Large state dump transmission
• Yet more routers fail: a route flap storm

Page 4: Internet Routing Instability

Solutions

Route Aggregation
• Reduces the overall number of networks visible in the core
• Requires cooperation between service providers
• Complicated by redundant connectivity to the Internet (multi-homing)

Route Dampening Algorithms
• Not a panacea: legitimate announcements may be delayed

Overall
• Multi-homing is exhibiting linear growth
• Internet topology is growing increasingly less hierarchical
• Topological complexity is increasing

Page 5: Internet Routing Instability

Recall

Updates

Announcements
• New route
• New policy decision for an existing route

Withdrawals
• Explicit: associated with a withdrawal message
• Implicit: existing route is replaced by the announcement of a new route

Page 6: Internet Routing Instability

Types of Updates

Inter-domain routing updates

Forwarding Instability
• Reflects legitimate topological changes and affects the paths on which data will be forwarded between ASes

Routing Policy Fluctuation
• Reflects changes in routing policy information that may not affect forwarding paths between ASes

Pathological Updates
• Redundant BGP information that reflects neither routing nor forwarding instability

Page 7: Internet Routing Instability

Major Results

• The number of BGP updates is one or more orders of magnitude larger than expected.
• Routing information is dominated by pathological updates.
• Instability and redundant updates exhibit a periodicity of 30 and 60 seconds.
• Instability and redundant updates show a correlation with network usage.
• Instability is not dominated by a small set of ASes or routes.
• Discounting policy fluctuation and pathological behavior, there remains a significant level of Internet forwarding instability.
• The work led to specific architectural and protocol implementation changes in commercial Internet routers through collaboration with vendors.

Page 8: Internet Routing Instability

Taxonomy

Data Analyzed
Sequences of BGP updates for each (prefix, peer) tuple

Events Identified

• WADiff
A route is explicitly withdrawn as it becomes unreachable and is later replaced with an alternative route to the same destination. The alternative route differs in its ASPATH or nexthop attribute information. (Forwarding Instability)

• AADiff
A route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available. (Forwarding Instability)

Page 9: Internet Routing Instability

Taxonomy (contd.)

Events Identified (contd.)

• WADup
A route is explicitly withdrawn and then re-announced as reachable. This may reflect transient topological failure, or it may represent a pathological oscillation. (Forwarding Instability or Pathological Behavior)

• AADup
A route is implicitly withdrawn and replaced with a duplicate of the original route. A duplicate route is defined as a subsequent route announcement that does not differ in nexthop or ASPATH attribute information. (Pathological Behavior or Routing Policy Fluctuation)

• WWDup
The repeated transmission of BGP withdrawals for a prefix that is currently unreachable. (Pathological Behavior)
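The five event types can be sketched as a small classifier over the update sequence for one (prefix, peer) tuple. This is only an illustration of the definitions on these slides, not the authors' analysis tooling: the `Update` record, its field names, and the `classify` function are hypothetical, and a route is reduced to its ASPATH and nexthop because those are the two attributes the taxonomy compares.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Update:
    """One BGP update for a single (prefix, peer) tuple (hypothetical record)."""
    kind: str                                   # "A" = announcement, "W" = withdrawal
    as_path: Optional[Tuple[int, ...]] = None   # only meaningful for announcements
    next_hop: Optional[str] = None


def classify(updates):
    """Label each update that completes one of the five taxonomy events."""
    labels = []
    last_route = None   # (as_path, next_hop) of the most recent announcement
    prev_kind = None    # kind of the previous update; None before the first one
    for u in updates:
        if u.kind == "W":
            if prev_kind == "W":
                labels.append("WWDup")          # withdrawal repeated while unreachable
        else:
            route = (u.as_path, u.next_hop)
            # The first announcement ever seen has nothing to be compared with.
            if prev_kind is not None and last_route is not None:
                same = route == last_route
                if prev_kind == "W":
                    labels.append("WADup" if same else "WADiff")
                else:
                    labels.append("AADup" if same else "AADiff")
            last_route = route
        prev_kind = u.kind
    return labels


p1 = Update("A", (65001, 65002), "192.0.2.1")
p2 = Update("A", (65001, 65003), "192.0.2.1")
w = Update("W")
print(classify([p1, p1, p2, w, w, p2, w, p1]))
# ['AADup', 'AADiff', 'WWDup', 'WADup', 'WADiff']
```

A withdrawal that follows an announcement gets no label of its own here, because each W-prefixed category is only decided by the update that comes after it.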

Page 10: Internet Routing Instability

Methodology

• Data collected: BGP routing messages
• Time period: nine months, starting January 1996
• Where: five of the major U.S. network exchange points
• Tool: Unix-based route servers, Multi-threaded Routing Toolkit (MRT)

Page 11: Internet Routing Instability

Gross Observations

We expect
• Instability proportional to the number of globally visible addresses and the total number of available paths

We observe
• For 45,000 prefixes and 1,500 paths: 3 to 6 million updates per day
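To give a feel for the scale, the observed daily totals can be turned into per-prefix and per-path rates. The arithmetic below uses only the figures on this slide; the variable and function names are illustrative, and averaging evenly over prefixes or paths is a simplification.

```python
PREFIXES = 45_000
PATHS = 1_500
UPDATES_PER_DAY = (3_000_000, 6_000_000)   # low and high end of the observed range


def per_unit_rates(updates_per_day, units):
    """Average updates per day for each unit (prefix or path)."""
    return tuple(u / units for u in updates_per_day)


per_prefix = per_unit_rates(UPDATES_PER_DAY, PREFIXES)
per_path = per_unit_rates(UPDATES_PER_DAY, PATHS)

print(f"per prefix: {per_prefix[0]:.0f} to {per_prefix[1]:.0f} updates/day")
print(f"per path:   {per_path[0]:.0f} to {per_path[1]:.0f} updates/day")
# per prefix: 67 to 133 updates/day
# per path:   2000 to 4000 updates/day
```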

Page 12: Internet Routing Instability

Pathological Behavior

Disturbing behaviors
• Most of the BGP updates are entirely pathological (WWDup)
• A single service provider can have a disproportionate effect on global routing
• Causal relationship between the manufacturer of a router and the level of pathological behavior
• Routing updates have a regular, specific periodicity of either 30 or 60 seconds
• Persistence of pathological behavior is under five minutes

Page 13: Internet Routing Instability

Origins of Pathologies

Stateless BGP: withdrawals are sent for every explicitly and implicitly withdrawn prefix, because the router keeps no state on the information it has advertised to peers
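The difference between a stateless and a stateful sender can be sketched as follows. Both classes are hypothetical toy models, not code from any router: the stateless one forwards a withdrawal to every peer regardless of history, while the stateful one remembers which prefixes it advertised to which peer and stays silent otherwise.

```python
class StatelessSpeaker:
    """Sends every withdrawal to every peer; keeps no per-peer record."""

    def __init__(self, peers):
        self.peers = list(peers)

    def withdraw(self, prefix):
        # No memory of earlier announcements, so every peer gets a message,
        # even peers that never heard of the prefix or were already told.
        return [(peer, prefix) for peer in self.peers]


class StatefulSpeaker:
    """Tracks which prefixes were advertised to which peer."""

    def __init__(self, peers):
        self.advertised = {peer: set() for peer in peers}

    def announce(self, prefix, peers):
        for peer in peers:
            self.advertised[peer].add(prefix)

    def withdraw(self, prefix):
        sent = []
        for peer, prefixes in self.advertised.items():
            if prefix in prefixes:        # only withdraw what was announced
                prefixes.remove(prefix)
                sent.append((peer, prefix))
        return sent


stateless = StatelessSpeaker(["peer1", "peer2"])
stateful = StatefulSpeaker(["peer1", "peer2"])
stateful.announce("203.0.113.0/24", ["peer1"])

print(len(stateless.withdraw("203.0.113.0/24")))  # 2
print(len(stateless.withdraw("203.0.113.0/24")))  # 2 (a repeated, redundant withdrawal)
print(stateful.withdraw("203.0.113.0/24"))        # [('peer1', '203.0.113.0/24')]
print(stateful.withdraw("203.0.113.0/24"))        # []
```

The second stateless call is the kind of repeated withdrawal the taxonomy labels WWDup. The cost of avoiding it is the per-peer memory held in `advertised`.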

Plausible Explanations
• CSU timer problems
• Unjittered 30-second interval timer, self-synchronization
• Misconfigured interaction of IGP/BGP protocols
• Router vendor software bugs
• Unconstrained routing policies

Page 14: Internet Routing Instability

Analysis of Instability

Instability as the sum of AADiff, WADiff and WADup updates
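As a small worked form of this definition: given a list of category labels, one per classified update, the instability count is the number of labels that fall into the three categories named above. The function name and the input format are assumptions made for illustration.

```python
from collections import Counter

# The three categories this slide counts as instability.
INSTABILITY_CATEGORIES = ("AADiff", "WADiff", "WADup")


def instability_count(labels):
    """Sum of AADiff, WADiff and WADup events; other labels are ignored."""
    counts = Counter(labels)
    return sum(counts[c] for c in INSTABILITY_CATEGORIES)


print(instability_count(["AADup", "AADiff", "WWDup", "WADup", "WADiff"]))  # 3
```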

Page 15: Internet Routing Instability

Fine-grained Instability Statistics

There is no correlation between the size of an AS and its proportion of the instability statistics.

Page 16: Internet Routing Instability

Fine-grained Instability Statistics

• No single AS or prefix consistently dominates the instability statistics
• Instability is evenly distributed across routes

Page 17: Internet Routing Instability

Temporal Properties of Instability

Plausible causes for the periodicity
• Routing software timers, self-synchronization, and routing loops
• CSU handshaking timeouts
• Flaw in the routing protocol

Page 18: Internet Routing Instability

Origins of Internet Routing Instability

Craig Labovitz, G. Robert Malan, Farnam Jahanian

University of Michigan

Page 19: Internet Routing Instability

Introduction

We observed
• Several orders of magnitude more routing updates than expected
• A large number of duplicate routing messages
• Unexpected frequency components between instability events

We extend the earlier analysis by
• Identifying the origins of many of the pathological behaviors
• Examining the impact of the specific commercial router software changes suggested
• Describing additional router software changes that can decrease the volume of updates exchanged by an additional 30 percent or more

Page 20: Internet Routing Instability

Major Results

• The volume of inter-domain routing updates has decreased by an order of magnitude since April 1997.
• The majority of BGP messages consist of redundant announcements.
• A growing proportion of instability stems from specific changes in Internet architecture coupled with limitations in router software and algorithms.
• Instability is not disproportionately dominated by prefixes of specific lengths.
• Persistently oscillating routes dominate the BGP traffic generated by a few Internet providers.
• A number of the origins of pathological routing behavior postulated in the earlier work were confirmed experimentally.

Page 21: Internet Routing Instability

Analysis of Gross Trends

Note
• Dramatic decrease in the number of withdrawals
• The number of announcements has doubled over the 28-month period
• Growth in BGP announcements is disproportionate to any corresponding increase in the number of routing table entries

Page 22: Internet Routing Instability

Taxonomy

Analyze sequences of BGP updates for each (prefix, peer) tuple

Identify the events

• AADup
A route is implicitly withdrawn and replaced with a duplicate of the original route. We define a duplicate route as a subsequent route announcement that does not differ in any BGP path attribute information.

• AADiff
A route is implicitly withdrawn and replaced by an alternative route as the original route becomes unreachable, or a preferred alternative path becomes available.

• Tup and Tdown
Fluctuation in the reachability of a given prefix.
Tup: a currently unreachable prefix is announced as reachable and transitions up.
Tdown: an announced route is withdrawn and transitions down.
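A minimal sketch of this four-way taxonomy, assuming each route is represented as a dictionary of BGP path attributes and an unreachable prefix as `None`. The function names and the attribute keys are illustrative only; the helper `changed_attributes` shows how an AADiff could be broken down by the attribute that changed.

```python
def classify_transition(previous, current):
    """Classify one state change for a (prefix, peer) tuple.

    previous / current: dict of path attributes for an announced route,
    or None when the prefix is withdrawn (unreachable).
    """
    if previous is None and current is None:
        return None        # repeated withdrawal; not one of the four events
    if previous is None:
        return "Tup"       # unreachable prefix announced as reachable
    if current is None:
        return "Tdown"     # announced route withdrawn
    if previous == current:
        return "AADup"     # no path attribute differs
    return "AADiff"        # at least one path attribute differs


def changed_attributes(previous, current):
    """Names of the path attributes that differ between two announcements."""
    keys = set(previous) | set(current)
    return sorted(k for k in keys if previous.get(k) != current.get(k))


route_a = {"as_path": (65001, 65002), "next_hop": "192.0.2.1", "med": 10}
route_b = {"as_path": (65001, 65002), "next_hop": "192.0.2.1", "med": 20}

print(classify_transition(None, route_a))        # Tup
print(classify_transition(route_a, dict(route_a)))  # AADup
print(classify_transition(route_a, route_b))     # AADiff
print(changed_attributes(route_a, route_b))      # ['med']
print(classify_transition(route_b, None))        # Tdown
```

Unlike the earlier definition of a duplicate, which compared only nexthop and ASPATH, equality here covers every attribute in the dictionary, so a change in MED alone is enough to make the pair an AADiff.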

Page 23: Internet Routing Instability

Analysis of Update Categories

AADup behavior stems from:
1. Non-transitive attribute filtering
2. Combination of the BGP minimum advertisement timer with stateless BGP
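One way the second cause could produce duplicates is sketched below. This is an interpretation, not something the slide spells out: it assumes the router batches route changes until a fixed advertisement timer expires and then sends its current route. If the route flapped away and back within one interval, a stateless router re-sends the route the peer already has. The function, its parameters, and the 30-second interval are all illustrative.

```python
def advertisements(changes, interval, horizon, stateful):
    """Simulate what a router sends to one peer for one prefix.

    changes:  list of (time, route) best-route changes, sorted by time
    interval: advertisement timer period in seconds
    horizon:  stop simulating after this many seconds
    stateful: if True, compare against the last route sent before sending
    """
    sent = []
    last_sent = None
    current = None
    dirty = False            # has the route changed since the last expiry?
    i = 0
    t = interval
    while t <= horizon:
        # Absorb every change that arrived before this timer expiry.
        while i < len(changes) and changes[i][0] <= t:
            current = changes[i][1]
            dirty = True
            i += 1
        if dirty:
            if not (stateful and current == last_sent):
                sent.append((t, current))
                last_sent = current
            dirty = False
        t += interval
    return sent


# The route flaps from A to B and back to A inside a single 30 s interval.
flap = [(1, "A"), (35, "B"), (50, "A")]

print(advertisements(flap, interval=30, horizon=90, stateful=False))
# [(30, 'A'), (60, 'A')]
print(advertisements(flap, interval=30, horizon=90, stateful=True))
# [(30, 'A')]
```

In the stateless run the peer sees A announced twice in a row, which is an AADup under the taxonomy, and the two announcements are exactly one timer interval apart.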

Page 24: Internet Routing Instability

Analysis of AADiffs

Note
• Low percentage of ASPath AADiffs
• Growth in the number of origin AADiffs is related to architecture and policy issues
• Growth in the number of community AADiffs reflects the attribute's recent adoption by many ISPs
• Oscillations in MED are due to the IBGP mapped MED policy at two service providers

Page 25: Internet Routing Instability

IBGP Mapped MED

Page 26: Internet Routing Instability

Frequency

Recall
• Frequency is defined as the inverse of the inter-arrival time between routing updates
• Predominant frequencies have a 30-second and 60-second periodicity

Cause
• Frequency components stem from a fixed minimum BGP advertisement timer used by at least one router vendor
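The definition of frequency used here can be made concrete with a short sketch: take the timestamps of successive updates, compute the gaps between them, and look for the most common gap. The function names, the 5-second binning, and the sample timestamps are assumptions for illustration, not measurements from the study.

```python
from collections import Counter


def inter_arrival_times(timestamps):
    """Gaps in seconds between consecutive updates."""
    ordered = sorted(timestamps)
    return [b - a for a, b in zip(ordered, ordered[1:])]


def dominant_period(timestamps, bin_width=5):
    """Most common inter-arrival gap, rounded to the nearest bin_width seconds.

    Needs at least two timestamps.
    """
    gaps = inter_arrival_times(timestamps)
    bins = Counter(bin_width * round(g / bin_width) for g in gaps)
    period, _ = bins.most_common(1)[0]
    return period


def frequency(period):
    """Frequency in Hz as the inverse of the inter-arrival time."""
    return 1.0 / period


# Made-up arrival times in seconds, roughly 30 s apart with one 60 s gap.
arrivals = [0, 30, 61, 90, 119, 150, 210]

print(inter_arrival_times(arrivals))   # [30, 31, 29, 29, 31, 60]
print(dominant_period(arrivals))       # 30
```

Binning absorbs small measurement noise around the timer value, so gaps of 29 and 31 seconds are counted together with 30.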

Page 27: Internet Routing Instability

Prefix Length Statistics

Page 28: Internet Routing Instability

Conclusions

The volume of routing update messages decreased by an order of magnitude as a result of specific software changes on the majority of core Internet backbone routers. The software changes successfully suppressed the generation of pathological withdrawals.

Proposed new software changes that may reduce instability levels by an additional thirty percent.

Instability is well distributed across both autonomous system and prefix space. No single service provider or set of network destinations appears to be at fault.