
SMU CSE 4344 routing

Upload: sybill-velasquez

Post on 01-Jan-2016








SMU CSE 4344



network layer in context

• layer 1: physical (PHY) layer
• layer 2: datalink layer and MAC sublayer
  – transmits frames (containing packets) between adjacent nodes, arbitrates access within collision domains
• layer 3: network layer
  – routes packets between endpoints
  – “end to end”, “source to destination”, “source to sink”
• transport layer
  – cooperates with transport layer on node at other end to manage streams of traffic supporting app layer comms
• (layers above 1, 2, 3 differ by network protocol stack)

SMU CSE 4344

challenges in internetworking

• heterogeneity
  – many different kinds of networks: Ethernets, wireless, pt-to-pt links, switched rings, etc.
• scale
  – Internet doubled each year for the last 20 years
  – which paths, thru millions/billions of nodes?
    • efficient, loop-free
  – unambiguous addressing of all these nodes?

SMU CSE 4344

store-and-forward packet switching

data network --- environment:

SMU CSE 4344

store-and-forward packet switching

• routers perform store-and-forward
• receive packet, do checksum, [queue,] compute egress link, transmit over egress link
• routers are aware of each other on the network layer, but the transport layer is not aware of routers
• what is the transport layer aware of?
  – a few generic system calls

SMU CSE 4344

network services for transport layer

• ideals

  – transport layer makes generic system calls to network
    • transport layer otherwise ignorant of what goes on in network
  – network shows tech-neutral face to transport layer
    • whether connection-oriented or connectionless
  – global network addresses

SMU CSE 4344

network layer in a nutshell

• “datagram subnet”, “packet-switched network”, “statistically multiplexed network”

• Internet Protocol, ATM, others

• for each generic router:
  – routing computations initialize & renew forwarding table
  – forwarding table (“next hop” table):
    • {(destination, egress line, <protocol specific>)}
  – receive packet, [checksum,] [queue,] route toward sink
    • next router does the same

SMU CSE 4344

connectionless service

routing within a packet-switching network

(sink, next hop)

SMU CSE 4344

routing algorithms

• the Optimality Principle
• shortest path routing
• flooding
• Distance Vector Routing (DVR)
• Link State Routing (LSR)
• hierarchical routing
• broadcast routing
• multicast routing
• routing for mobile hosts
• routing in ad hoc networks

SMU CSE 4344

graph theory and the network layer

• like a horse and carriage

• set N of nodes, set E of edges– edges are full duplex, arcs are simplex

• graph G = (N,E)

• N = (a,b,c,d)

• E = ((a,b), (a,c), (b,c), (b,c))

SMU CSE 4344

graph theory

• walk: a sequence of adjacent nodes

• path: a walk with no node repeated

• cycle: walk $(n_1, n_2, \ldots, n_n)$, only $n_1$ & $n_n$ the same

• connected graph

• unconnected graph components

• acyclic graph

• tree: acyclic connected graph – cardinality(E) = cardinality(N) - 1

SMU CSE 4344


• subgraph G' = (N',E')
  – G' is a graph, N' ⊆ N, E' ⊆ E
• let spanning tree T be a subgraph of G
  – where T is a tree and N' = N
• broadcast routing
  – with tree: n transmits on all adjacent edges e ∈ E'
    • other nodes relay on each non-ingress adjacent e ∈ E'
  – flood: n transmits on all adjacent edges, other nodes relay on each non-ingress adjacent e ∈ E

SMU CSE 4344

flooding algorithms

• send packets everywhere, “flood” the network
• static algorithm
  – receive flooding packet on one link
  – transmit flooding packet out on all other links
• robust: if paths exist, flooding finds shortest
• benchmark for other routing algorithms

SMU CSE 4344

challenges in flooding

• cycles/loops
• massive packet duplication
• how to make it stop?
  – hop count field in flood packets
    • hop count “too high”, discard packet
  – selective flooding (only in “right direction”)
  – only forward “new” packets (higher seq#s)
  – reverse path forwarding: only accept packets from source that arrive on the forwarding link to source
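The two simplest stopping rules above (a hop count limit and relaying only packets not seen before) can be sketched as a small simulation. This is an illustrative sketch, not a protocol implementation: the function name `flood`, the adjacency-dict graph format, and the `max_hops` parameter are all assumptions made for the example.

```python
from collections import deque

def flood(adjacency, source, max_hops):
    """Simulate flooding from `source`; return (nodes reached, transmissions).

    adjacency: dict mapping node -> list of neighbor nodes (hypothetical format).
    max_hops:  hop count limit; a packet that has used this many hops is not relayed.
    """
    seen = {source}                      # nodes that have already accepted the packet
    transmissions = 0                    # every copy put on a link counts as one
    queue = deque([(source, None, 0)])   # (node, ingress neighbor, hops used so far)
    while queue:
        node, ingress, hops = queue.popleft()
        if hops >= max_hops:             # hop count "too high": stop relaying
            continue
        for neighbor in adjacency[node]:
            if neighbor == ingress:      # never relay back out the ingress link
                continue
            transmissions += 1
            if neighbor not in seen:     # only "new" packets are accepted and relayed
                seen.add(neighbor)
                queue.append((neighbor, node, hops + 1))
    return seen, transmissions
```

Counting transmissions makes the overhead visible: duplicates are still sent on every non-ingress link, even when the receiver discards them.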

SMU CSE 4344

uses for flooding

• downside: very high overhead
• good for:
  – high risk environment, unreliable nodes
    • battlefield, ad hoc networks
  – distributed databases
    • concurrent global update

SMU CSE 4344

distributed spanning tree algorithms

• where do you start? stop? is it biased? not?
• minimum spanning tree (MST) algorithms
  – fragment: a component of MST, a “subtree”
  – generic MST: given a set of fragments, add a min weight edge to some fragment (no cycles)
  – Prim-Dijkstra: some node is first fragment; add min weight adjacent edge
  – Kruskal: all nodes are fragments; add min weight edge that joins two fragments
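A minimal, centralized sketch of the Prim-Dijkstra rule (one starting fragment, repeatedly add the minimum weight adjacent edge) may help make the fragment idea concrete. It is not the distributed algorithm the slide title refers to; the function name `prim_mst` and the `(u, v, weight)` edge-list format are assumptions for illustration.

```python
import heapq

def prim_mst(edges, start):
    """Grow one fragment from `start`; return the tree edges as (u, v, weight).

    edges: iterable of (u, v, weight) tuples for an undirected graph (assumed format).
    """
    adjacency = {}
    for u, v, w in edges:
        adjacency.setdefault(u, []).append((w, v))
        adjacency.setdefault(v, []).append((w, u))

    in_fragment = {start}
    # Heap of candidate edges leaving the fragment: (weight, inside node, outside node).
    frontier = [(w, start, v) for w, v in adjacency.get(start, [])]
    heapq.heapify(frontier)
    tree_edges = []
    while frontier:
        w, u, v = heapq.heappop(frontier)      # min weight adjacent edge
        if v in in_fragment:
            continue                           # both ends inside: would form a cycle
        in_fragment.add(v)
        tree_edges.append((u, v, w))
        for w2, x in adjacency[v]:
            if x not in in_fragment:
                heapq.heappush(frontier, (w2, v, x))
    return tree_edges
```

For a connected graph the result has cardinality(N) - 1 edges, matching the tree property stated earlier in the deck.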


SMU CSE 4344


graph slides adapted from E. Modiano, MIT

SMU CSE 4344

routing algorithms

• network layer routes from source to sink

• routing algorithm

– computation of output line per sink

– building lookup table: {(sink, output line)}

• forwarding: using lookup table

• ideal routing algorithms:

– simple, stable, fair, efficient

– cope well with changes in topology and load

SMU CSE 4344

tuning the network layer

• min( hop count ) => less delay, bandwidth

• packet delay v. overall network tput/capacity
• fairness v. overall nwk capacity (optimality)

SMU CSE 4344

optimality v. fairness

Conflict between fairness and optimality.

at max subnet throughput, path x → x’ is starved

SMU CSE 4344

optimality principle

• IF node j is on an optimal path (OP), i→k,
  – THEN an OP j→k is a subset of OP i→k

• the union of OPs for all nodes to a given sink is a sink tree

• routing algorithms seek to discover (or approximate) the sink trees of every sink in the subnet

• sink trees: max network capacity, min delay

SMU CSE 4344

shortest path (least cost) routing

• ways to measure path goodness (or “cost”):
  – hop length
  – mean queuing delay (higher with higher loads)
  – link latency/distance
  – capacity
• actual hop cost is abstracted as “distance”
• routers choose shortest available path, whether explicitly, or implicitly

SMU CSE 4344

static routing

• non-adaptive

• routes are computed in advance

• routes are optimal (global knowledge)

• routes do not change in response to topology and traffic pattern changes

• static routing used for:

– long-lasting topologies

– topologies with assigned PHY layer protection

SMU CSE 4344

Dijkstra (1959)

• static algorithm for shortest path (optimal sink tree)
• G(N,E)
  – set of nodes N; set of (edge, edge length) pairs E
• mark source node permanent, others temporary
• source becomes working node
• for each working node
  – label each node adjacent to a permanent node as (predecessor, distance to source)
  – of all temps, choose temp with least distance to source
  – mark temp permanent
  – temp becomes next working node, until sink is reached

worst case complexity of this algorithm? why?
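The labeling procedure can be sketched in a few lines. This is one possible rendering, assuming non-negative edge lengths and a dict-of-dicts graph; the function name `dijkstra` and the use of a binary heap to pick the closest temporary node are choices made for this sketch, not something the slides prescribe.

```python
import heapq

def dijkstra(graph, source, sink):
    """Return (path, distance) from source to sink, or (None, inf) if unreachable.

    graph: dict node -> dict neighbor -> edge length (assumed format, lengths >= 0).
    """
    labels = {source: (None, 0)}     # node -> (predecessor, distance to source)
    permanent = set()
    heap = [(0, source)]             # temporary nodes keyed by tentative distance
    while heap:
        dist, node = heapq.heappop(heap)   # temp with least distance to source
        if node in permanent:
            continue                       # stale heap entry, already finalized
        permanent.add(node)                # node becomes the working node
        if node == sink:
            break
        for neighbor, length in graph[node].items():
            if neighbor in permanent:
                continue
            new_dist = dist + length
            if neighbor not in labels or new_dist < labels[neighbor][1]:
                labels[neighbor] = (node, new_dist)   # relabel with better path
                heapq.heappush(heap, (new_dist, neighbor))
    if sink not in permanent:
        return None, float("inf")
    path, node = [], sink
    while node is not None:          # walk predecessor labels back to the source
        path.append(node)
        node = labels[node][0]
    return path[::-1], labels[sink][1]
```

How the closest temporary node is found (linear scan versus heap) is what drives the worst case cost asked about above.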

SMU CSE 4344

shortest path computation

first 5 steps to compute the shortest path from A to D

(arrows indicate working node)

first working node

(a) input to Dijkstra algorithm; (c) illustration for proof by contradiction

Tanenbaum, Computer Networks, 2003, p. 354

SMU CSE 4344

why is Dijkstra correct?

• in (c) above, E is now permanent
• could some AXYZE be shorter than ABE?
  – if Z permanent, ZE already rejected
  – if Z temp, and Z(dist to src) >= E(dist to src)
    • then AXYZE is not shorter than ABE
  – if Z temp, and Z(dist to src) < E(dist to src)
    • then Z would be permanent before E
    • why?
      • {perms} = source
      • REPEAT {perms} += closest temp UNTIL sink added

SMU CSE 4344

• a link to a scan of the hand-drawn image used in class to support the previous illustration of the Dijkstra proof has been placed on the class webpage, and leads to the following file.

• dijkstra-pf.pdf

SMU CSE 4344

dynamic routing

• decisions made on-line
• reacts to changes in topology, traffic, delays
• dynamic routing algorithms
  – distance vector
  – link state

SMU CSE 4344

distance vector routing

• aka: Bellman-Ford algorithm; best-known implementation: Routing Information Protocol (RIP)

• iterative, self-terminating, asynchronous, distributed

• nodes maintain adjacent link costs
• nodes exchange info, build forwarding tables
  – estimated distance to each node

• forwarding table: {(node, line, distance)}

SMU CSE 4344

Bellman-Ford equation

$d_x(y) = \min_{v} \{\, c(x,v) + d_v(y) \,\}$,

for all nodes y in N,

for all neighbors v of x

$d_x(y)$: cost of least-cost path from x to y

SMU CSE 4344

distance vector implementation

• data structures:
  – cost vector: cost to each direct (adjacent) neighbor
  – distance vector: estimated cost to each node in N
  – distance vector of each direct neighbor
• algorithm:
  – each node regularly sends DV to neighbors
  – upon DV receipt, each node:
    • saves neighbor’s DV
    • updates own DV using Bellman-Ford equation
    • if own DV changes, send it out to all neighbors
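The "updates own DV using Bellman-Ford equation" step can be sketched as a single function over the three data structures listed above. The function name `update_distance_vector` and the dict layouts are assumptions for illustration; message exchange and timers are left out.

```python
INF = float("inf")

def update_distance_vector(self_id, nodes, cost, neighbor_dvs):
    """Recompute this node's forwarding table from its neighbors' saved DVs.

    cost:         dict neighbor -> cost of the direct link (the cost vector)
    neighbor_dvs: dict neighbor -> {node: that neighbor's estimated distance}
    Returns dict node -> (next hop, estimated distance).
    """
    table = {}
    for y in nodes:
        if y == self_id:
            table[y] = (None, 0)
            continue
        best_hop, best = None, INF
        for v, link_cost in cost.items():
            # Bellman-Ford: distance via neighbor v is c(x,v) + d_v(y)
            candidate = link_cost + neighbor_dvs[v].get(y, INF)
            if candidate < best:
                best_hop, best = v, candidate
        table[y] = (best_hop, best)
    return table
```

A node would compare the distances in the returned table with its previous DV and send the new DV to all neighbors only if something changed.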

SMU CSE 4344

distance vector routing example

(a) subnet; (b) input for J from A, H, I, K; new routing table for J

(figure labels: J, neighbor links, link delays)

SMU CSE 4344

distance vector routing fatal flaw

A goes down; B believes C

fast convergence to shorter path

count-to-infinity problem

A comes up; B propagates

DVR: “where is it now?”
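The count-to-infinity behavior can be reproduced with a toy simulation. The setup is entirely an assumption for illustration: routers in a line (A-B-C-D-E), hop-count costs, synchronous exchange rounds, and A already down so that no router has a direct link to it. The function name `count_to_infinity` is hypothetical.

```python
def count_to_infinity(distances, rounds):
    """Simulate synchronous DV exchanges after the destination has gone down.

    distances: each remaining router's current hop count to the dead node,
               in line order (needs at least two routers).
    Returns the list of distance vectors, one per round (round 0 = input).
    """
    history = [list(distances)]
    for _ in range(rounds):
        prev = history[-1]
        new = []
        for i in range(len(prev)):
            # Each router only hears its line neighbors' previous estimates.
            heard = []
            if i > 0:
                heard.append(prev[i - 1])
            if i + 1 < len(prev):
                heard.append(prev[i + 1])
            new.append(min(heard) + 1)   # believe the best neighbor, add one hop
        history.append(new)
    return history
```

With starting distances `[1, 2, 3, 4]`, the estimates creep upward by bouncing between neighbors instead of jumping to "unreachable", which is why real protocols cap "infinity" at a small value.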

SMU CSE 4344

link state routing

- best-known implementation: OSPF

each router must:
• discover each neighbor (globally unique nwk ID)
• measure the delay or cost to each neighbor
• build packet: neighbor IDs, costs
• send packet to all other subnet routers
• compute shortest path to every other router

SMU CSE 4344

link state routing implementation

• HELLO
  – neighbor/address discovery
• ECHO
  – delay/cost to reach neighbor
  – queueing delay?
    • realistic estimate of fully-loaded delay
    • risk of oscillation

SMU CSE 4344

building link state (update) packets

• update packet contents
  – source ID
  – sequence #
  – TTL
  – neighbor list: {(neighbor ID, delay)}
• when to send updates?
  – triggered by state change
  – on regular schedule (timer)

SMU CSE 4344

link state packets

(a) subnet; (b) link state packets for subnet

(note: these are symmetric links)

SMU CSE 4344

reliable link state packet distribution

• source seq# in each update
  – increment seq# each update
  – higher seq# is more recent
• each router maintains seq# list
  – for each other router: (router ID, seq#)
• for each update arriving from router X:
  – if update seq# <= list seq#
    • update packet is a duplicate, discard it
  – if update seq# > list seq#
    • copy new seq# to list
    • forward update packet to remaining adjacent routers (flooding)
    • update own routing table
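The sequence number check can be sketched as a small class. The class name `LinkStateRouter`, the packet dict keys (`source`, `seq`, `links`), and the return value (the list of neighbors to relay to) are assumptions for this sketch; TTL aging, ACKs, and the route recomputation itself are omitted.

```python
class LinkStateRouter:
    """Tracks the freshest link state update seen from every other router."""

    def __init__(self, router_id, neighbors):
        self.router_id = router_id
        self.neighbors = list(neighbors)   # adjacent router IDs
        self.latest_seq = {}               # source router ID -> highest seq# seen
        self.link_state_db = {}            # source router ID -> {neighbor ID: delay}

    def receive_update(self, packet, ingress):
        """Process one update; return the neighbors it should be relayed to."""
        source, seq = packet["source"], packet["seq"]
        if seq <= self.latest_seq.get(source, -1):
            return []                      # duplicate or older update: discard
        self.latest_seq[source] = seq      # copy new seq# to list
        self.link_state_db[source] = dict(packet["links"])
        # Flood onward on every adjacent link except the one it arrived on.
        return [n for n in self.neighbors if n != ingress]
```

After a non-empty return, a router would also rerun its shortest path computation over `link_state_db` to refresh its forwarding table.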

SMU CSE 4344

link state packet age

• each update packet includes age of packet
  – TTL field: time to live
  – TTL initialized to some reasonable value
  – TTL “aged” by each router before relaying onward
  – TTL periodically decremented (e.g., each second)
• when TTL reaches 0, packet is discarded
  – prevents old info from wandering around forever

SMU CSE 4344

link state update refinements

• update holding time
  – wait to see if duplicate or fresher updates come in
• ACKs
  – all update packets are ACKed

SMU CSE 4344

where to relay, where to ACK

packet buffer for router B

(see prev. graph) each row indexes an update packet, ready for processing

SMU CSE 4344

computing routes from link state updates

• global state info
  – cooperative state distribution
  – local computation
• routing computation
  – aggregate link state info of all subnet routers
  – shortest path or all-pairs-shortest-paths algorithm
  – update local forwarding table

SMU CSE 4344

DV versus LS Routing

distance vector routing:

1. only neighbor-to-neighbor info exchange about what each has learned - low overhead
2. link costs: simple hop counts
3. nodes have no idea of global topology
4. protocol convergence is slow - count-to-infinity problem
5. due to 3 & 4, routing loops can occur

link state routing:

1. each node gives info to all other nodes (only verified info)
2. link costs: delay, bandwidth, link reliability, distance, etc.
3. each node can construct global topology from the info it receives from others
4. protocol convergence is fast
5. due to 3 & 4, routing loops do not occur

SMU CSE 4344

hierarchical routing

• link state routing tax on subnet resources
  – for n routers, averaging k neighbors
  – memory footprint grows as kn
  – bandwidth overhead grows as kn
  – CPU cycles overhead grow as kn

• so, partition the overall subnet into much smaller, more manageable chunks (“regions”)

• delegate regional routing to the much smaller number of routers actually in that region

SMU CSE 4344

the tradeoff

• with small regions, and limited connections into the next level of the hierarchy, routers gain much smaller, more manageable data structures

• the more levels, the longer the hop length of the average path

• for an N node subnet, (ln N) levels are optimal, with (e ln N) table entries per node
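A quick calculation shows how table size shrinks as levels are added, and what the (ln N) and (e ln N) figures come to. The 720-router subnet and its groupings are illustrative numbers, and the function names are hypothetical. The sketch assumes each router keeps one entry per router in its own region plus one entry for every other group at each higher level.

```python
import math

def table_entries(group_sizes):
    """Forwarding table entries per router in a hierarchy.

    group_sizes: number of members at each level, innermost first, e.g.
                 [10, 9, 8] = 10 routers per region, 9 regions per cluster,
                 8 clusters (an illustrative layout).
    """
    local = group_sizes[0]                               # every router in own region
    remote = sum(size - 1 for size in group_sizes[1:])   # other groups per level
    return local + remote

def optimal_hierarchy_estimate(n):
    """Return (levels, entries per node) using ln N and e * ln N."""
    levels = math.log(n)
    return levels, math.e * levels
```

For 720 routers this gives 720 entries with no hierarchy, 53 with 24 regions of 30, and 25 with 8 clusters of 9 regions of 10; the (ln N) formula suggests about 6.6 levels and about 18 entries per node.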

SMU CSE 4344

hierarchical routing example

SMU CSE 4344

broadcast routing: data to each host

• unicast to each host
  – source sends N flows, one to each address
• flood
  – still uses a lot of bandwidth
• multidestination routing
  – packet has list of all sinks, parsed by each router
• sink tree routing
  – each node must know current, global sink tree

... bullets in order of what?

SMU CSE 4344

broadcast routing

• reverse path forwarding
  – approximate sink tree forwarding
  – for each broadcast routed packet received, a router
    • checks which node sent the packet
    • checks current forwarding table
    • if packet comes in on the line going out to the source
      – send copies out only on all other lines
    • if packet comes in on any other line
      – drop it
• no extra packet info, no state, self-terminating
• simple, easy, effective, efficient
• Radia Perlman
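The per-packet decision is small enough to sketch directly. The function name `reverse_path_forward`, the line labels, and the dict-based forwarding table are assumptions for illustration.

```python
def reverse_path_forward(forwarding_table, lines, source, ingress_line):
    """Return the lines a broadcast packet is copied to; an empty list means drop.

    forwarding_table: dict destination -> egress line normally used to reach it
    lines:            all lines attached to this router
    """
    if forwarding_table.get(source) != ingress_line:
        # Packet did not arrive on the line this router would use to reach
        # the source, so it is probably a duplicate: drop it.
        return []
    # Arrived on the preferred line: copy to every other line.
    return [line for line in lines if line != ingress_line]
```

The router needs only its ordinary unicast forwarding table, which is why the scheme carries no extra packet info and keeps no per-broadcast state.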

SMU CSE 4344

reverse path forwarding example

(a) subnet; (b) sink tree; (c) tree built by reverse path forwarding

sink tree: max 4 hops, 14 packets

reverse path forwarding: max 5 hops, 24 packets ...

... depending on what?

SMU CSE 4344

multicast routing

• sending identical flows to network subset

• what is the great virtue of multicast?

• senders to group use a discrete group address

• group management (using, e.g., IGMP)

– group creation, destruction

– dynamic membership

– local process discovers group out-of-band

– host learns group membership from local process

– routers learn group memberships from hosts

– routers tell other routers, maintain state

• registry of which groups are reachable through which routers (here is hierarchical addressing again)

SMU CSE 4344

multicast and distance vector routing

• idea: “simulate” a spanning tree

• “flood and prune” protocols

• flood {(node, group)} info across internetwork

• routers prune themselves from “group tree”

– if no group host/router connections: send PRUNE msg

SMU CSE 4344

DV multicast (DVMRP) in a nutshell

• each host in group G periodically says so to nwk
• if group membership changed, router relays DVM update packet: {(source, cost, {group})}
• reverse path broadcast (RPB) updates
• routers update forwarding table: {(group, {link})}
• source-based “spanning trees” pruned de facto
  – empty leaves/branches not in group forwarding tables
• reverse path multicast (RPM)
  – for G, forward on each listed link, except source link
  – why “source link”, why not just “arriving link”?

SMU CSE 4344

spanning tree pruning example

(a) network (b) spanning tree for the leftmost router

(c) multicast tree for group 1 (d) multicast tree for group 2

(c) nodes do not have to be in group to be in pruned tree

SMU CSE 4344

scaling issues for pruned spanning tree multicast

• group forwarding table and data structure growth of O(mn) on each router, for

– m, average number of group members

– n, average number of groups

• why?

– for each source, a “tree”

– for each group, m trees

SMU CSE 4344

multicast and link state routing

• spanning tree computed locally with existing global link state

• each host in group G periodically says so to nwk

• group/router list applied to resulting spanning tree, going from each leaf to root

• if node is not in group, prune it from branch

• if node is in group, so is its branch toward root

SMU CSE 4344

core-based tree multicast

• each group designates a core node near the middle of the internetwork

• a single group spanning tree is computed, using the core as the root

• each group source sends its transmissions to the core, and thence along the single tree

• average path length is greater, but router overhead is reduced by factor of m

who might have a core-based tree, and where might its core be?

SMU CSE 4344

protocol independent multicast (PIM-SM)

• (PIM also has a dense mode, but prior protocols already work for dense groups)

• prior protocols do sparse mode poorly

– e.g., they broadcast updates by default

• PIM sparse mode (PIM-SM)

– routers explicitly join and leave groups, no global state

• “unicast routing protocol independent”

– PIM-SM depends on Internet Protocol

– particular unicast routing algorithm is irrelevant

SMU CSE 4344

PIM-SM routing state maintenance

• well-known rendezvous point (RP) per group

• RP selection is complex

• router X tunnels Join message to RP

• RP unicasts Join message to router X

– creates sender-specific state on router X branch

– why?

• Prune culls unused branches, en route to RP

• result: unique, implicit, default, shared tree

SMU CSE 4344


• shared tree can be costly, if shorter paths exist between mcast source and some sinks

• source-specific tree for high-traffic source

• source-specific Join from sink to source creates source-specific tree

• resulting mcast route bypasses RP router

• PIM-SSM: source + group – allocated subset of mcast IP address space

SMU CSE 4344

routing for mobile hosts

• stationary hosts
• mobile hosts
  – migratory hosts
  – roaming hosts
• for each host
  – permanent home location
  – permanent home address

SMU CSE 4344

mobile host context

WAN, with LANs, MANs, and wireless cells attached

SMU CSE 4344

how to find mobile hosts

• foreign agent in each geographical area
  – announces presence in area periodically
  – tracks all visiting mobile hosts
• home agent
  – tracks all mobile hosts that are away
• mobile host (MH) registers with foreign agent
• foreign agent notifies home agent of MH
• MH is now registered in foreign area

SMU CSE 4344

routing for mobile hosts

SMU CSE 4344

differences between mobile IP protocols

• protocol division between routers and hosts
• host nwk stack layer responsible for protocol
• intermediate router participation
• host or foreign agent temporary address
• forwarding by readdressing
• forwarding by tunneling
• security protocols

SMU CSE 4344

Mobile Ad hoc NETworks (MANETs)

• each node: mobile host/router combo
• no infrastructure, cell tower, nor basestation
• changing network membership, topology
• lossy RF links (each node has antenna)
  – links exist based on proximity, RF environment
  – fading, crosstalk, reflection, absorption, etc.
• mobile nodes provide all network services
  – routing
  – address assignment
  – name translation

SMU CSE 4344

context of ad hoc networks

• military vehicles on battlefield
  – no infrastructure
• fleet of ships at sea
  – all moving, all the time
• emergency workers at earthquake
  – infrastructure destroyed
• gathering of people with notebook computers
  – “come as you are”
• sensor networks
  – self-organizing, not necessarily stationary

SMU CSE 4344

example ad hoc networking algorithm: AODV

• Ad hoc On-demand Distance Vector routing
• C. Perkins & E. Royer (1999, 2001)
• “mobile Bellman-Ford”
• no full routing table update broadcasts
• respects bandwidth, limited battery power
• finds route to sink iff source actively seeking

SMU CSE 4344

AODV route discovery

• link exists iff two nodes can hear each other
• history table at each node: {(srcIP, reqID)}
• routing table at each node:
  – {(sink, next hop, <more>)}
  – all table info times out if not refreshed periodically
• if destination not in {(sink)}, then:
  – ROUTE REQUEST
  – (srcIP, reqID, sinkIP, src seq#, sink seq#, hop#)

SMU CSE 4344

ROUTE REQUEST processing

• ROUTE REQUEST packet received
• if (srcIP, reqID) in history table, drop packet
• if (local sink seq#) >= (RTE REQ sink seq#)
  – send ROUTE REPLY (“this node is next hop”)
• else, increment RTE REQ hop#, enter route to src in own table, flood RTE REQ onward
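The three-way decision (drop, reply, or flood onward) can be sketched as follows. Everything named here is an assumption for illustration: the class `AodvNode`, the dict field names, and the returned `(action, payload)` tuples. The sketch treats a stored route as fresh when its sink seq# is at least the requested one, which is the comparison RFC 3561 uses, and it also lets the sink itself answer. Timeouts and the ROUTE REPLY packet format are omitted.

```python
class AodvNode:
    """Minimal ROUTE REQUEST handling for one node (illustrative only)."""

    def __init__(self, address):
        self.address = address
        self.history = set()   # {(srcIP, reqID)} of requests already processed
        self.routes = {}       # sink -> {"next_hop": ..., "seq": ..., "hops": ...}

    def handle_route_request(self, req, from_neighbor):
        """Return ("drop", None), ("reply", neighbor) or ("flood", new_request)."""
        key = (req["src"], req["req_id"])
        if key in self.history:
            return ("drop", None)          # duplicate of a request seen before
        self.history.add(key)

        route = self.routes.get(req["sink"])
        is_sink = self.address == req["sink"]
        if is_sink or (route is not None and route["seq"] >= req["sink_seq"]):
            # Fresh enough route known: answer via the neighbor we heard it from.
            return ("reply", from_neighbor)

        # Otherwise remember the reverse route to the source and relay onward.
        self.routes[req["src"]] = {
            "next_hop": from_neighbor,
            "seq": req["src_seq"],
            "hops": req["hops"] + 1,
        }
        forwarded = dict(req, hops=req["hops"] + 1)   # copy with hop# incremented
        return ("flood", forwarded)
```

The reverse route stored in the flood branch is what lets a later ROUTE REPLY retrace the request's path back to the source.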

SMU CSE 4344

ROUTE REQUEST processing

• ROUTE REPLY
  – (srcIP, sinkIP, sink seq#, hop#, TTL)
  – sent back to neighbor RTE REQ came from
  – hop# reset, so src can know length of path
  – RTE REPLY follows (pruned) path of RTE REQ
  – nodes en route update info on route to sink
• side effect of ROUTE REQUEST flood
  – all nodes learn route to src, if not to sink
• TTL widening rings

SMU CSE 4344


shaded nodes: new recipients; arrows: possible reverse routes

• (a) Range of A's broadcast.
• (b) After B and D have received A's broadcast.
• (c) After C, F, and G have received A's broadcast.
• (d) After E, H, and I have received A's broadcast.

SMU CSE 4344

AODV route maintenance

• HELLO (“I’m still alive”)
• reply to HELLO (“me, too”)
• “disappeared” neighbors prompt table purge
• routing table
  – {(sinkIP, egress, #hops, {active neighbor}, <etc>)}
  – “active neighbor”: uses local node to get to sink
• purge proceeds through MANET recursively

SMU CSE 4344

AODV route maintenance example

(a) D's routing table before G goes down

(b) graph after G goes down