
Performance and Resource Cost Comparisons

for the CBT and PIM Multicast Routing Protocols

Tom Billhartz, J. Bibb Cain, Ellen Farrey-Goudreau, Doug Fieg
Harris Corporation, Melbourne, Florida¹

Steven Batsell
Oak Ridge National Laboratory, Washington, DC

Abstract

Researchers have proposed the Core Based Trees (CBT) and Protocol Independent Multicasting (PIM) protocols to route multicast data on internetworks. In this paper, we compare the simulated performance of CBT and PIM using the OPNET network simulation tool. Performance metrics include end-to-end delay, network resource usage, join time, the size of the tables containing multicast routing information, and the impact of the timers introduced by the protocols. We also offer suggestions to improve PIM Sparse Mode while retaining the ability to offer both shared tree and source-based tree routing.

1. Introduction

Multicasting is a communications service that allows an application to efficiently transmit cop-

ies of a data packet to a set of receivers that are members of a multicast group. The group is

identified by a location-independent multicast group address. Senders use this address in the des-

tination field of the packet; multicast routers forward the packet to group members using routing

table entries for this address. The entries form a tree, which may be a source-based tree or a center-

based tree depending on the multicast routing protocol.

Multicast group members may be spread across separate physical networks, they may join and

leave a group during the life of the group, and they may be members of multiple groups. (How

members learn of the multicast group is not part of the routing protocol’s function; one method is

for a multicast application to advertise groups using a well-known multicast address.)

1. This work was sponsored by the Defense Advanced Research Projects Agency (DARPA) through contract number N00014-93-C-2186 with the Naval Research Laboratory.

Since 1992, multicast routing has been performed by a multicast-capable, virtual network run-

ning “on top” of the internet called the Multicast Backbone (MBone) [Eri94]. The MBone uses the

Distance Vector Multicast Routing Protocol (DVMRP) [Wai88] or the Multicast Extensions for

Open Shortest Path First Protocol (MOSPF) [Moy94] to route multicast traffic. Common uses of

multicasting include audio- and video-conferencing, Distributed Interactive Simulation (DIS)

activities such as tank battle simulations, and exchanging experimental data and weather maps

[Eri94]. A comparison of the important features of these applications can be found in [Bil96].

DVMRP and MOSPF depend on features of underlying point-to-point (unicast) routing proto-

cols. Efforts to remove this dependency and to develop point-to-multipoint (multicast) routing

protocols that operate in a hierarchical manner with subnet multicast routing protocols led to the

development of the Core Based Tree (CBT) protocol [Bal94] and the Sparse Mode of the Protocol

Independent Multicasting (PIM) protocol [Dee95]. PIM and CBT are defined by Internet Drafts

and are still evolving.

In this paper, we describe the operation of CBT, the Dense Mode of PIM, and the Sparse Mode

of PIM. We discuss the OPNET network models that we built to compare the performances of

these protocols when a large number of groups and multiple senders are active simultaneously. We

discuss the simulation results and related analysis.

Wei and Estrin [Wei94] considered end-to-end delay, network resource usage, and traffic con-

centration for source-based trees and center-based trees by analyzing random graphs. This paper

measures these metrics plus overhead traffic, scalability, and join time for network models running

the PIM and CBT protocols. In our network models, key features of the protocols are implemented,

many groups are active simultaneously, and group membership is dynamic.

2. Multicast Routing Protocols

2.1 Source-Based Trees and Shared Trees

Data packets addressed to a multicast group may be routed on a tree that is specific to the par-

ticular sender and group or a tree that is shared by all of the senders to the group. The first approach

uses a source-based tree (SBT) that is a shortest-path tree rooted at a sender. The branches of the

tree are the shortest paths from the sender to each of the group members. A separate tree must be

constructed for each sender to each active multicast group. A protocol that implements SBTs is the

Dense Mode of PIM (PIM Dense) [Dee95]. The shared tree approach uses a single center-based

tree or core-based tree to route traffic from all senders to the group. The tree is a shortest-path tree

rooted at one or more predefined nodes in the network called Core nodes. A protocol based on cen-

ter-based trees is CBT [Bal94].

The Sparse Mode of PIM (PIM Sparse) [Dee95] first builds a center-based shared tree for each

multicast group. After a group member receives traffic over the shared tree, it may ask (though it

is not required to ask) the sender to send future traffic for that group along the shortest path. This

request triggers the multicast routers to construct a branch of the source-based tree for the group

from the sender to the receiver. We have modeled two special cases of PIM Sparse. The first deliv-

ers all traffic over the shared tree, which we will call the shared tree case of PIM Sparse (PIM-ShT).

In the second case, all group members ask to receive traffic only over source-based trees. We refer to this case (which is implemented in routers on the market) as the source-based tree case of PIM Sparse (PIM-SBT).

There is currently some debate over which type of tree provides the best performance [Wei94].

Algorithms that use CBTs construct a single tree for each group, regardless of the number of send-

ers. Because the packets are not guaranteed to travel the shortest path, one expects the end-to-end

delivery delay to be larger for CBT algorithms than for SBT ones. However, SBT algorithms scale poorly for large numbers of senders because the router resources required to maintain knowledge of the tree structure are considerable.

2.2 Internet Group Management Protocol (IGMP)

Multicast routers connected to Local Area Networks (LANs) learn which multicast groups the

hosts on the LANs wish to join using the Internet Group Management Protocol (IGMP) [Dee89].

Hosts notify the router of their group memberships and of their decisions to join or leave particular

multicast groups. Routers use this information to construct multicast trees.

2.3 Core-Based Trees (CBT) Multicast Routing Protocol

CBT sets up and maintains a single shared tree for every multicast group that is active in the

network. When a multicast router is notified via IGMP that a local host would like to join the group,

the router sends a join message for that group toward the Core node via the shortest path. A tree

rooted at the Core is constructed as the acknowledgments to the join messages are processed. The

resulting tree is a bidirectional, acyclic graph that reaches every group member. See Figure 1.

Figure 1. CBT shared tree used to route all multicast traffic for group G. (Diagram legend: multicast tree, network connections; labeled nodes: Core and three members of G.)

Forwarding packets to the group members using CBT is straightforward. When a node on the

tree receives a packet addressed to the group, it forwards copies of the packet on all branches of

the group’s tree except for the branch on which the packet arrived. Packet delivery is illustrated in

Figure 2.
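The forwarding rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the OPNET model used in the paper: the function name `forward_on_shared_tree` and the representation of the tree as a list of router pairs are assumptions made for the example.

```python
from collections import defaultdict


def forward_on_shared_tree(tree_edges, entry_router):
    """Deliver one packet over a bidirectional shared tree.

    tree_edges:   iterable of (router_a, router_b) pairs forming an acyclic tree
                  (hypothetical representation chosen for this sketch).
    entry_router: on-tree router where the packet first enters the tree.
    Returns the list of (from_router, to_router) transmissions.
    """
    neighbors = defaultdict(set)
    for a, b in tree_edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    transmissions = []
    # Each stack item is (router holding the packet, branch it arrived on).
    stack = [(entry_router, None)]
    while stack:
        router, arrived_from = stack.pop()
        for nxt in sorted(neighbors[router]):
            if nxt != arrived_from:  # never send back on the arrival branch
                transmissions.append((router, nxt))
                stack.append((nxt, router))
    return transmissions
```

Because the tree is acyclic, each branch carries exactly one copy of the packet, whichever on-tree router the packet enters at.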

2.4 PIM Dense Multicast Routing Protocol

The PIM Dense protocol [Dee95] floods the network with data packets to set up a source-based tree for every sender to every group. Initially, these trees reach every potential receiver in the network. After receiving multicast data for group G, each router that has no members of group G on its LANs sends prune messages toward the senders to remove unwanted branches from the trees for group G. This broadcast-type behavior recurs periodically after the pruned interfaces have timed out. A tree created by PIM Dense is shown in Figure 3.
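The flood-then-prune behavior can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the flood is modeled as a breadth-first search from the sender's router (so every link counts as one hop), pruning is modeled as keeping only the routers that lie on a path from the sender to a member router, and the names `flood_and_prune`, `adjacency`, and `member_routers` are hypothetical.

```python
from collections import deque


def flood_and_prune(adjacency, source, member_routers):
    """Return (flooded_routers, routers_kept_after_pruning).

    adjacency:      dict mapping each router to the set of its neighbors.
    source:         router attached to the sender.
    member_routers: routers with at least one group member on their LANs.
    """
    # Flood: a breadth-first search reaches every router in the network.
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adjacency[node]):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)

    # Prune: keep only routers on a path from the source to some member.
    kept = {source}
    for member in member_routers:
        node = member if member in parent else None
        while node is not None and node not in kept:
            kept.add(node)
            node = parent[node]
    return set(parent), kept
```

The difference between the two returned sets is the part of the network that receives flooded data and then has to send prunes.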

Figure 2. Packet delivery for group G and sender S along the CBT shared tree. (Diagram legend: packet delivery, network connections; labeled nodes: Core, Sender S, and members of G.)

2.5 PIM Sparse Multicast Routing Protocol

When a new multicast group is introduced to a network that uses the PIM Sparse Protocol, a

node in the network called the Rendezvous Point (RP) is assigned to the group. The RP will

become the center node of a directed, shared tree for the group. (The RP performs a function sim-

ilar to that of the Core node in CBT.)

Each multicast router that learns via IGMP that a local host has joined group G sends a join

message along the shortest path to the RP for that group. The join message triggers each router on

its path to the RP to set up or update a routing entry for the shared tree for group G. The shared tree

that is built by these actions is a directed tree rooted at the RP that can be used to deliver packets

to each member of the group.
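The way join messages build the shared tree can be sketched as follows. This is an illustrative sketch only: it assumes hop-count shortest paths, models a routing entry as the set of downstream neighbors recorded at a router, and uses hypothetical names (`next_hops_toward`, `send_join`, `entries`).

```python
from collections import deque


def next_hops_toward(adjacency, rp):
    """Map each router to its next hop on a shortest path toward the RP."""
    toward = {rp: None}
    queue = deque([rp])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adjacency[node]):
            if nbr not in toward:
                toward[nbr] = node
                queue.append(nbr)
    return toward


def send_join(entries, toward_rp, member_router):
    """Process one join: every router on the path to the RP sets up or
    updates its shared-tree entry with the interface the join arrived on."""
    entries.setdefault(member_router, set())
    node = member_router
    while toward_rp[node] is not None:
        upstream = toward_rp[node]
        entries.setdefault(upstream, set()).add(node)
        node = upstream
    return entries
```

After all joins are processed, following the recorded downstream neighbors from the RP reaches every member router, which is the directed shared tree described above.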

Each new sender to a group registers with the RP. In response, the RP initiates construction of

a directed source-based tree from the sender to the RP. At each router on this tree, the port leading

to the RP, as well as the ports leading to branches of the shared tree that do not lead to the source,

are added to the SBT routing table. Routers give precedence to source-based trees when two trees

are available for the same group. See Figure 4.

Figure 3. PIM Dense multicast routing tree for sender S and group G. (Diagram legend: multicast tree for sender S, network connections; labeled nodes: Sender S and members of G.)

2.5.1 Shared Tree Case of PIM Sparse (PIM-ShT)

The shared tree directed away from the RP overlaid with a source-based tree from each source

provides the same functionality as the simple bidirectional tree of CBT. We refer to packet delivery

using these trees as the shared tree case of PIM Sparse or PIM-ShT.

2.5.2 Source-Based Tree Case of PIM Sparse (PIM-SBT)

Even if we use PIM Sparse to deliver all multicast traffic over SBTs, the trees discussed in Section 2.5 must be constructed. Subsequently, more complicated trees are set up.

Each receiver learns of senders to a group when it receives data packets on the trees described

in Section 2.5. The receiver may send join messages toward specific senders to become part of the

shortest-path, source-based trees rooted at those senders. The receiver must also alert routers on

the shared tree that no packets from these sources should be forwarded. The switch from shared

tree delivery to source-based tree delivery introduces a high degree of complexity to PIM-SBT. See

Figure 5.

Figure 4. PIM-ShT shared tree for group G and source-based tree for sender S. (Diagram legend: source-based tree for sender S, shared tree for group G, network connections; labeled nodes: RP, Sender S, and members of G.)

3. Simulation Environment

3.1 OPNET Network Models

We modeled CBT, PIM Dense, the Shared Tree Case of PIM Sparse, and the Source-Based

Tree Case of PIM Sparse using the OPNET network simulation tool. OPNET was selected because

it allows construction of detailed protocol models. In addition, because OPNET has optimized its

memory use, large networks that route many packets concurrently can be simulated efficiently. We

used or modified standard OPNET process models including IP and Ethernet. We also developed

new OPNET process models for the IGMP and the multicast routing protocols. Details of our mod-

els can be found in [Bil95].

We constructed three network topologies using the OPNET network simulation tool to study

the multicast protocols:

1) AAI/MAGIC/ATDnet - The top level of this topology is representative of a system being constructed by a consortium of government, industry and educational team members. Link bandwidths range from 10 to 600 Mbits/second. This network is called the AAI Network in later sections.

Figure 5. PIM-SBT shared tree for group G and source-based tree for sender S. (Diagram legend: source-based tree for sender S, shared tree for group G, network connections; labeled nodes: RP, Sender S, and members of G.)

• The AAI/MAGIC (ACTS ATM Internetwork / Multidimensional Applications and Gigabit

Internetwork Consortium) network topology integrates research projects that have influenced

the fundamental attributes of the prototype of the DARPA Global Grid program. Our OPNET

model of the AAI/MAGIC network contains eleven sites and is shown in Figure 6. Five of the

sites are part of a star subnetwork labeled MAGIC (the icon is located in Nebraska) in Figure 6.

• The subnetwork labeled NRL_ATDnet connects eight federal agency sites in the Washington,

DC, area in a ring configuration. It was established by ARPA and serves as a testbed for ATM

and SONET technologies.

2) High-bandwidth mesh topology - This topology is composed of nineteen sites arranged in the grid pattern shown in Figure 7. The octagons symbolize sites. The bandwidths of the WAN links were selected to match those used in the AAI Network to permit direct comparisons with the AAI Network. This network is called the Mesh Network in later sections.

Figure 6. AAI/MAGIC Network.

3) Low-bandwidth mesh topology - This topology is the grid pattern defined for the high-bandwidth mesh topology and shown in Figure 7. The bandwidths of the WAN links are set to 3 Mbits/second to introduce congested links. This network is called the Stressed Mesh Network or S-Mesh in later sections.

In all three networks, each site consists of a WAN router connected to a site router, which in

turn connects to two LAN routers. Each LAN router is connected to two host nodes via Ethernet,

as shown in Figure 8. Each network has seventy-six hosts that generate multicast data traffic.

Figure 7. Mesh Network Topology

The Mesh Network topology is highly connected and provides numerous possible tree patterns.

In contrast, the AAI Network topology has two large loops that are the basis of the trees for every

multicast group. This contrast tests the multicast protocols under vastly different topological

conditions.

The network topologies do not change during the simulations and no packets are lost.

3.2 Simulation Parameters

Using OPNET, we were able to model delays due to the IP packet service rate; packet queueing

at routers and hosts; propagation delays on all links; and transmission delays based on link data

rate and packet size. We also modeled the size of the data packets (1824 bytes plus all appropriate

header information for the application we are interested in modeling) and the size of the overhead

packets defined by the protocol specifications. Because we are primarily interested in modeling the routing of packets, we chose not to model PIM and CBT processing beyond that of IP; the effect of operating system overhead on delay in routers and hosts; the effect of finite-size packet buffers in routers; or the effect of nonzero link error rates.

Figure 8. Site Topology. (Diagram: a WAN router connects the site to the WAN and to a site router; the site router connects to the LAN routers, which connect to hosts over Ethernets.)

In our simulations, each group was assigned a single RP or Core node selected from a set of

four centrally located nodes. Each of these nodes supported approximately one-quarter of the mul-

ticast groups. The RP for a given group in a PIM Sparse simulation was placed at the same location

as the Core node for that group in the CBT simulation. (Multiple Cores are supported by the CBT

protocol specification. PIM supports only a single RP.)

We performed twelve simulations for each combination of network and protocol. Each of the

parameters in Table 1 was varied independently in the simulations. We distributed membership

among 390 active multicast groups using a uniform distribution defined by the “Groups per Host”

and “Group Distribution Type” parameters explained in Table 1. A given host in simulations that have a common network and common parameters but different routing protocols will join the same multicast groups at the start of the simulations.

Each group member was both a sender and a receiver for the group. Group members joined and

left groups throughout each simulation.

Table 1: Simulation Parameters.

| Parameter | Description of Parameter and Simulation Values Used |
|---|---|
| Groups per Host | The maximum number of multicast groups to which each host can belong: 10, 20, or 30. |
| Group Distribution Type | The set of groups that hosts can join. Our model allows a host to join (a) groups with membership spread across the entire network or (b) groups with membership spread across the entire network AND groups with membership restricted to one quadrant of the network. |
| Join/Leave Dynamics | The maximum time between consecutive changes (join or leave) in group membership at a host: 5 or 10 seconds. The times between changes (joins or leaves) were uniformly distributed between zero and this parameter. |
| Traffic Generation Rate | Rate of packet transmission for each group at each host. Hosts sent packets to each group at intervals determined by a Poisson distribution with mean one packet per second per group. |
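The two timing parameters in Table 1 can be illustrated with a small Python sketch. It rests on an assumption about the table's wording: "a Poisson distribution with mean one packet per second" is read here as a Poisson process, so the gaps between packets are drawn from an exponential distribution. The function names are hypothetical and this is not the OPNET traffic generator used in the study.

```python
import random


def packet_send_times(duration_s, mean_rate_pps=1.0, rng=None):
    """Send times for one host and one group, modeled as a Poisson process
    (exponentially distributed gaps; an assumption of this sketch)."""
    rng = rng or random.Random()
    times, t = [], 0.0
    while True:
        t += rng.expovariate(mean_rate_pps)
        if t >= duration_s:
            return times
        times.append(t)


def membership_change_times(duration_s, max_gap_s, rng=None):
    """Times of join/leave events at a host; each gap is uniformly
    distributed between zero and max_gap_s (5 or 10 seconds in Table 1)."""
    rng = rng or random.Random()
    times, t = [], 0.0
    while True:
        t += rng.uniform(0.0, max_gap_s)
        if t >= duration_s:
            return times
        times.append(t)
```

Passing a seeded `random.Random` instance makes a run repeatable, which mirrors the paper's practice of giving each protocol the same group assignments at the start of a simulation.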

4. Experimental Results

This section examines simulation results for CBT, PIM Dense, and the two special cases of the PIM Sparse protocol on the AAI Network, the Mesh Network, and the Stressed Mesh Network. The metrics used to compare the protocols are described in Table 2.

Each statistic reported in Section 4 is the average of the twelve simulations performed for the given network and protocol. One simulation was performed for each combination of simulation parameters.

Table 2: Network Performance Metrics.

| Metric | Description |
|---|---|
| End-to-End Delay | Time elapsed between the generation of a packet at a source and the reception of that packet by a group member. |
| Network Resource Usage | Total number of hops that copies of a packet travel to reach all group members. Computed by dividing the total number of hops measured in a simulation (including overhead packets) by the number of packets received. |
| Overhead Traffic Percentage | The percentage of the total number of bits transmitted that are overhead bits. |
| Traffic Concentration | A measure of the distribution of the total network traffic on all links. Defined to be the ratio of the maximum throughput carried by any link to the mean throughput of all links. |
| Implementation Issues | (1) Size of the routing table and (2) the number of required timers. These issues impact memory requirements, speed, and operating system performance. |
| Join Time | The time elapsed between when a host requests to join a group and when that host receives its first message from another member of that group. |

4.1 End-to-End Delay

End-to-end delay is the period of time required for a data packet to be routed through the network from the application where it was created to a destination application. The average end-to-end delay measurements for each protocol and each network are shown in Figure 9. For illustration purposes, these values are normalized by the average end-to-end delay for the CBT protocol under the same test conditions.

PIM-ShT and CBT use shared trees to deliver data to multicast group members. These proto-

cols route packets for groups with the same members and the same Core and RP nodes on identical

paths. End-to-end delays for these protocols are similar.

PIM-SBT and PIM Dense deliver packets 13% to 31% faster than CBT, depending on the

topology simulated. These protocols use source-based trees to deliver packets along the shortest

path from the source to the receivers. The largest improvements of source-based tree protocols over shared tree protocols were observed for the highly connected Mesh Networks.

Figure 9. End-to-End Delay relative to CBT. (Bar chart: PIM/CBT ratio for PIM-ShT, PIM-SBT, PIM-D, and CBT on the AAI, Mesh, and S-Mesh networks.)

4.2 Network Resource Usage

A simple method to route a packet to all interested receivers is to unicast a copy of the packet

to each receiver. However, unicasting is likely to route several copies of the same packet over net-

work links. Multicast protocols send only a single copy of a packet over any link in the network

and require fewer hops to deliver the packet than unicasting in most cases. Our simulation results

were used to study which protocol delivers a copy of a packet to all group members in the fewest

number of hops. We refer to the number of hops as the Network Resource Usage metric.

Source-based tree protocols deliver packets along the shortest paths from the source to each

receiver. Because packets travel direct paths to their destinations, the fewest number of hops are

recorded for a single receiver. If multiple receivers are present in the network, packets still travel

directly to each receiver. Copies of the packet are not made until the paths diverge.

Shared tree protocols are expected to route packets on a common tree for greater distances.

Thus, even though the packets may not travel on the shortest paths to the receivers, copies of the

packets may travel fewer overall hops than source-based tree protocols to reach all of the

destinations.
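The hop-count comparison discussed above can be made concrete with a small Python sketch. It is a simplified illustration, not the measurement procedure of the simulations: every link counts as one hop, shortest paths come from a breadth-first search, the sender is assumed to be a group member (as in the simulations), and all function names are hypothetical.

```python
from collections import deque


def bfs_parents(adjacency, root):
    """Parent pointers of a hop-count shortest-path tree rooted at root."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in sorted(adjacency[node]):
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent


def path_links(parent, node):
    """Links on the path from node back to the root of the tree."""
    links = []
    while parent[node] is not None:
        links.append(frozenset((node, parent[node])))
        node = parent[node]
    return links


def unicast_hops(adjacency, source, receivers):
    """One separate copy per receiver; shared links are counted repeatedly."""
    parent = bfs_parents(adjacency, source)
    return sum(len(path_links(parent, r)) for r in receivers)


def source_tree_hops(adjacency, source, receivers):
    """One copy per link of the shortest-path tree rooted at the source."""
    parent = bfs_parents(adjacency, source)
    links = set()
    for r in receivers:
        links.update(path_links(parent, r))
    return len(links)


def shared_tree_hops(adjacency, center, source, receivers):
    """One copy per link of the tree that joins all members to the center."""
    parent = bfs_parents(adjacency, center)
    links = set()
    for node in list(receivers) + [source]:
        links.update(path_links(parent, node))
    return len(links)
```

Which tree type needs fewer hops depends on the topology and on where the center node sits relative to the sender and receivers, which is why the question is settled by simulation in this section.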

The histogram shown in Figure 10 illustrates that CBT, PIM-ShT, and PIM-SBT require a sim-

ilar number of hops to deliver copies of a packet to all multicast group members. The measured

values displayed in Figure 10 are normalized by the measured value for CBT under the same test

conditions.

Figure 10 shows that the network resource usage is much larger for PIM Dense than for the

other protocols in all of the networks. This difference is due to periodic flooding, which almost

doubled the number of data packets counted in some of our simulations. Examination of individual

simulation results shows that the network resource usage of PIM Dense approaches that of CBT as

the network becomes more densely populated with multicast group members. This result was

expected because PIM Dense was designed for a dense membership environment.

Figure 10. Network Resource Usage relative to CBT. (Bar chart: PIM/CBT ratio for each protocol on the AAI, Mesh, and S-Mesh networks.)

4.3 Overhead Traffic Percentage

Overhead messages are used to exchange IGMP information and to set up, maintain, and tear

down routing trees. We measured the number of each type of packet created during each simulation

-- including overhead packets -- and the number of hops that the packets traveled. From these val-

ues, we computed the percentage of bits transmitted during a simulation that were contained in

overhead messages. These percentages are shown in Figure 11 for each protocol and each network.

Figure 11 shows that CBT has the lowest overhead percentage of the protocols examined,

approximately 0.3%. PIM-ShT also has a low overhead percentage, only 5% greater than the per-

centage measured for CBT. The extra overhead bits make up the packets that set up source-based

trees between each sender and the RP.

More complicated source-based trees are set up in PIM-SBT. In addition, each time a host asks to join (or leave) a multicast group, overhead messages must be sent to each sender to the group to set up (or tear down) a branch of the multicast tree for that sender. (Only one tree needs to be updated in CBT or PIM-ShT.) These factors double the percentage of overhead bits measured for PIM-SBT compared to CBT or PIM-ShT.

Figure 11. Percentage of Bits Contained in Overhead Messages. (Bar chart: overhead percentage for each protocol on the AAI, Mesh, and S-Mesh networks.)

Due to prunes that follow flooding, the percentage of overhead bits for PIM Dense is more than

double the percentage for PIM-SBT. In some simulations, the percentage of overhead messages

was measured to be almost 5%. If data packets that are flooded but do not reach a multicast group

member are considered overhead, the performance of PIM Dense with respect to this metric

becomes much worse. In one simulation where each group had few members spread throughout

the network, 43% of all network traffic (measured in bits) was overhead or unwanted flooded data.

There are more paths to flood data in the highly connected Mesh Networks; thus, the overhead per-

centage was greater for these networks.

The number of overhead messages required to maintain the routing trees for an active group

depends on the number of members joining and leaving the group, not on the amount of traffic sent

to that group. Thus, the overhead percentages reported would be smaller for higher levels of traffic.

Because the absolute percentages are related to the level of traffic in the network, the numbers dis-

cussed here should only be used for comparison.
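The dependence of the overhead percentage on traffic level can be shown with a short calculation. The bit counts below are hypothetical values chosen only to illustrate the arithmetic; they are not measurements from the simulations.

```python
def overhead_percentage(overhead_bits, data_bits):
    """Percentage of all transmitted bits that are overhead bits."""
    total = overhead_bits + data_bits
    if total == 0:
        return 0.0
    return 100.0 * overhead_bits / total


# Hypothetical example: the join/leave activity (and so the overhead)
# stays fixed while the data traffic doubles.
baseline = overhead_percentage(overhead_bits=3, data_bits=997)
doubled = overhead_percentage(overhead_bits=3, data_bits=1994)
```

With the overhead held fixed, doubling the data traffic roughly halves the percentage, which is why the absolute values should be read only as a comparison between protocols under the same load.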

4.4 Traffic Concentration

The ratio of the largest average throughput measured on a network link to the average through-

put measured for all links, which we call the Traffic Concentration metric, can be used to examine

the distribution of traffic on the links in the network. The shared tree protocols CBT and PIM-ShT

are expected to concentrate traffic onto the subset of the network links that compose the shared

trees. The source-based tree protocols PIM Dense and PIM-SBT are expected to distribute the traf-

fic more evenly among all links because they use a different tree for each sender and each group.
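As a minimal sketch of the metric just defined, the following Python function computes the maximum-to-mean ratio from a list of per-link average throughputs. The function name and input format are assumptions made for illustration.

```python
def traffic_concentration(link_throughputs):
    """Ratio of the largest per-link throughput to the mean over all links."""
    values = list(link_throughputs)
    if not values:
        raise ValueError("at least one link is required")
    mean = sum(values) / len(values)
    if mean == 0:
        raise ValueError("mean throughput is zero; the ratio is undefined")
    return max(values) / mean
```

A value of 1 means the traffic is spread perfectly evenly; larger values mean that one link carries a disproportionate share of the load.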

In the Mesh and Stressed Mesh Networks, CBT and PIM-ShT have similar Traffic Concentra-

tion Metrics, as shown in Figure 12. The metric for PIM-SBT is 17% less. PIM Dense distributes

traffic most evenly among the links. Recall, however, that the overall traffic level of PIM Dense

(the denominator of the ratio) is much higher.

The results shown in Figure 12 for the AAI Network should not be considered representative because the link between the AAI/MAGIC Network and the ATDnet is a bottleneck link for each protocol. Every group that has a Core, RP, or member on the ATDnet and one on the AAI/MAGIC Network must include this link in its tree.

Only four nodes are used as Core and RP nodes in our simulations. A copy of each multicast data packet in CBT and PIM-ShT always passes through one of these nodes, causing a concentration of traffic on the links leading to them. SBT protocols build trees rooted at each sender for every group so that traffic is not constrained to pass through a small number of nodes. This result gives insight into a way that the locations of the Cores and RPs could be chosen if traffic concentration is a potential problem: allow each node to serve as the Core or RP for only a few multicast groups. The difference in the traffic concentration metric between the shared tree and source-based tree protocols should then be reduced.

Figure 12. Traffic Concentration Metric. (Bar chart: maximum-to-mean throughput ratio for each protocol on the AAI, Mesh, and S-Mesh networks.)

4.5 Implementation Issues

The OPNET model of the AAI Network was built to study the performance of a physical net-

work that will be used for large-scale, military simulation exercises. (The physical network will

have more complicated site topologies with hundreds of hosts.) These exercises are expected to be

large; the STOW-97 (Synthetic Theatre of War) simulation may have 20,000 groups with several

hundred members of each group. Many of these members will send to the group. The intricacy of

the protocol, operating system overhead, and routing table size become especially important issues

when the numbers of senders and groups grow to this size because router speed and memory

requirements are impacted.

This section analyzes the size of the routing tables needed for each protocol as a function of the

number of groups and senders per group. Each routing table entry initiates at least one timer. The

effect on the operating system of these timers is also considered.

4.5.1 Routing Table Size

Both PIM and CBT require that each multicast router maintain a table of multicast routing

information. We examine the size of the tables using the notation in Table 3 .

Table 3: Parameters Used to Analyze the Size of the Routing Tables for CBT and PIM

Variable Meaning

T(*) Average number of routing table entries in a router for protocol *.

N Number of multicast groups active in the network.

n Average number of senders to a group.

CBT sets up only one tree for each group. The average number of entries that a router maintains

is the likelihood that the router will be on a shared tree times the total numberN of shared trees.

(1)

The PIM Dense protocol sets up and maintains a source-based tree for every sender and every

group. Due to the flooding of the data packets, entries are set up throughout the network. Thus,

routing table entries for a group are maintained in portions of the network where no hosts are mem-

bers of that group and no branches of the tree are needed. These entries are removed after a period

of disuse, but re-flooding replaces them.

The average number of entries in each PIM Dense router is approximated by the following

expression.

(2)

A lower bound for this expression is given below.

r Total number of routers in the network.

Average percentage of the routers that are on a shared tree.Also, the average percentage of the routers

that are on a source-based tree for a group and a sender.

Average percentage of the routers that are on the source-based treefor a group and sender but not on the shared tree for that group.

d Average number of hops from a host to a RP.

p Average number of interfaces at a router.

Fraction of groups moving to SBT in PIM Sparse.

Length of timer that initiates PIM-Dense flooding.

Length of timer that initiates the removalof inactive PIM Dense entries < .

Table 3: Parameters Used to Analyze the Size of the Routing Tables for CBT and PIM

Variable Meaning

αC

αS

β

t f

t r

t f

αC

T CBT( ) N αC⋅=

T PIM Dense( ) N n αc 1 αc–( )+ t r t f⁄⋅[ ]⋅ ⋅≅

(3)

For comparison, we compute a lower bound on the ratio of the average number of routing tables

entries needed to implement PIM Dense to the number needed for CBT. We find the routing tables

in routers running PIM Dense are at leastn times larger than those running CBT, on average.

(4)

In our simulations, the ratio was even larger. When , routing table entries existed

at most routers for each active sender of every group plus some out-of-date senders whose entries

had not yet timed out.

PIM-ShT routes packets along shared trees as CBT does (contributing the first term of

Equation 5). PIM-ShT also sets up a source-based tree between each source S and the RP. The

average number of entries that a router maintains for source-based trees is the average number of

hopsd from a host to the RP divided by the number of routersr in the network.

(5)

A lower bound for this expression can be providedif every member is a sender. Every router

that is on the shared tree must lie on the shortest path between a group member and the RP accord-

ing to the rules for tree construction. Therefore, every router on the shared tree must be on the

source-based tree that connects a sender (the group member) to the RP and must have an source-

based tree entry.

(6)

(7)

The PIM routing tables will always be at least twice the size of the CBT routing tables and

T PIM Dense( ) N n αc⋅ ⋅≥

T PIM Dense( )T CBT( )------------------------------------

N n αc⋅ ⋅

N αc⋅------------------------ n=≥

t r t f⁄ 1 2⁄=

T PIM-ShT( ) N αC⋅ N n d r⁄⋅ ⋅+=

T PIM-ShT( ) 2 N αC⋅ ⋅≥ 2 T CBT( )⋅=

T PIM ShT–( )T CBT( )------------------------------------- 2≥

sometimes much larger for the same environment. These bounds are conservative for network

topologies where the average number of senders to a group is greater than the number of network

interfaces at the RP. Also, an entry is required for each source in the group at the RP.

PIM-SBT sets up and maintains many trees for each multicast group G in the network including

• one source-based tree for every sender S, called the (S,G) tree, and

• one shared tree, called the (*,G) tree.

Additional routing table entries called (S,G,RP) entries stop traffic that is delivered on source-

based trees from being forwarded on shared trees. See reference [Dee95] for a more detailed

description of these routing table entries.

A router that is part of a shared tree for group G has a (*,G) entry. If the router is on a source-

based tree for a sender S, it also has an (S,G) entry for sender S and group G. If the router is not on the

source-based tree for sender S, it has an (S,G,RP) entry. Thus, each router on the shared tree for

group G has another entry -- either an (S,G) or an (S,G,RP) entry -- for each sender to group G.
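The entry rule described above can be illustrated with a short sketch. It is only a toy model: the function name `pim_sbt_entries`, the tuple representation of entries, and the sender labels are assumptions made for the example and do not come from the PIM specification.

```python
from typing import List, Set, Tuple

Entry = Tuple[str, ...]


def pim_sbt_entries(group: str, senders: List[str], on_sbt_of: Set[str]) -> List[Entry]:
    """Entries held by one router that lies on the shared tree for `group`.

    `on_sbt_of` is the set of senders whose source-based tree passes
    through this router (hypothetical input for the illustration).
    """
    entries: List[Entry] = [("*", group)]          # shared-tree entry
    for sender in senders:
        if sender in on_sbt_of:
            entries.append((sender, group))        # (S,G): router is on S's tree
        else:
            entries.append((sender, group, "RP"))  # (S,G,RP): block S on the shared tree
    return entries


entries = pim_sbt_entries("G", ["S1", "S2", "S3"], on_sbt_of={"S2"})
for entry in entries:
    print(entry)
print(len(entries))  # one (*,G) entry plus one entry per sender
```

Whichever trees the router happens to lie on, it ends up with one entry per sender in addition to the (*,G) entry, which is where the factor of n+1 in the shared-tree term comes from.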

\[ T(\text{PIM-SBT}) = N \cdot [(n + 1) \cdot \alpha_C + n \cdot \alpha_S] \qquad (8) \]

If only some of the groups use source-based trees, the average number of routing table entries is given by the following expression.

\[ T(\text{PIM-SBT}) = N \cdot [(\beta \cdot n + 1) \cdot \alpha_C + \beta \cdot n \cdot \alpha_S] + (1 - \beta) \cdot N \cdot n \cdot \frac{d}{r} \qquad (9) \]

If all PIM traffic is delivered along the shortest path ($\beta = 1$), the ratio of the number of routing table entries for PIM-SBT to the number for CBT is given by Equation 10. The parameter $\rho$ is defined to be $\rho = \alpha_S / \alpha_C$.

\[ \frac{T(\text{PIM-SBT})}{T(\text{CBT})} = \frac{N \cdot [(n + 1) \cdot \alpha_C + n \cdot \alpha_S]}{N \cdot \alpha_C} = 1 + n \cdot (\rho + 1) \ge n + 1 \qquad (10) \]

On average, implementing PIM-SBT in a network will require each router to maintain at least n+1 times as many routing table entries as implementing CBT in the network.

4.5.2 Timer Requirements

According to our interpretation of the PIM protocol specification, several timers are required for PIM. Both PIM Dense and PIM Sparse have two types of timers that are kept at each router.

• One timer is maintained for each routing table entry at the router. This timer is updated each time the entry is used. If the entry is not used for 3 minutes, the entry is deleted. The length of these timers has been reduced to 30 seconds in our simulations.

• For each routing table entry, one timer is kept for each output interface. In PIM Dense, when one of these timers expires for a pruned interface, the interface for which it was set is added to the entry’s output interface list. In PIM Sparse, these timers are updated when routing information is exchanged by neighboring routers.

The average number of timers at a router is $(1 + p) \cdot T(\text{protocol})$. Complexity added by requiring that many PIM routing entries be maintained is compounded by requiring timers for each entry.

4.5.3 Implications of Complexity on Speed

As we discussed in Section 4.5, the STOW-97 exercises may have 20,000 multicast groups, each with several hundred members. As our analysis of the previous section shows, the number of routing table entries and the number of timers required to support an exercise of this size are extremely large.

To illustrate how the average size of the routing tables scales with the number of groups and members, consider an example where all multicast data is delivered on SBTs ($\beta = 1$). (PIM Sparse routers currently being built support only the source-based tree mode.) Let $\alpha_C = 0.5$, $\alpha_S = 0.25$, and $d/r = 0.05$. These numbers are consistent with our observations of the AAI Network model. Let $t_r / t_f = 1/3$. Table 4 shows the growth of the routing table size with the growth in the number of groups, N, and the number of members per group, n.

Table 4: Average Number of Routing Table Entries Required at Each Router for $\beta = 1$, $\alpha_C = 0.5$, $\alpha_S = 0.25$, $t_r / t_f = 1/3$, and $d/r = 0.05$.

             Number of Groups and Senders per Group
Protocol     N=1000      N=1000      N=20,000    N=20,000
             n=10        n=200       n=10        n=200
CBT             500         500      10,000      10,000
PIM Dense     6,700     133,000     133,000   2,670,000
PIM-SBT       8,000     150,500     160,000   3,010,000
PIM-ShT       1,000      10,500      20,000     210,000

If PIM-SBT or PIM Dense is used for the large number of senders and groups expected in STOW-97, each router will be forced to maintain a routing table whose average size may be several million entries. Routers near bottleneck and RP locations are likely to lie on many shared trees and require many more routing table entries. PIM-ShT and CBT, which deliver traffic using shared trees, require fewer routing table entries on average. For comparison, internet routers maintained routing tables for 30,000 routes at last check.

The maximum number of routes that can be supported in a network is a function of memory; thus, the amount of memory needed to store routing information cannot be overlooked when planning large simulations.
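The scaling of these averages can be checked numerically. The sketch below evaluates Equations 5, 8, and 9 for the parameter values used in Table 4, taking T(CBT) = N·α_C from the denominator of Equation 4. The function names are hypothetical, and exact fractions are used only to avoid floating-point rounding. PIM Dense is left out because its full expression does not appear in this section.

```python
from fractions import Fraction

ALPHA_C = Fraction(1, 2)    # alpha_C = 0.5
ALPHA_S = Fraction(1, 4)    # alpha_S = 0.25
D_OVER_R = Fraction(1, 20)  # d/r = 0.05


def t_cbt(N):
    """Average entries per router for CBT: N * alpha_C."""
    return N * ALPHA_C


def t_pim_sht(N, n):
    """Equation 5: shared-tree entries plus sender-to-RP tree entries."""
    return N * ALPHA_C + N * n * D_OVER_R


def t_pim_sbt(N, n, beta=1):
    """Equation 9; with beta = 1 it reduces to Equation 8."""
    tree_entries = N * ((beta * n + 1) * ALPHA_C + beta * n * ALPHA_S)
    rp_tree_entries = (1 - beta) * N * n * D_OVER_R
    return tree_entries + rp_tree_entries


print("N, n, CBT, PIM-SBT, PIM-ShT")
for N, n in [(1000, 10), (1000, 200), (20000, 10), (20000, 200)]:
    print(N, n, int(t_cbt(N)), int(t_pim_sbt(N, n)), int(t_pim_sht(N, n)))
```

Setting `beta=0` in `t_pim_sbt` gives the same value as `t_pim_sht`, which is a quick consistency check between Equations 5 and 9.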

As with any routing algorithm that relies on a table of this sort, a larger table leads to slower

performance. Our simulations do not model the time required to look up the route; the packet for-

warding time depends only on the size of the packet. A more complete model would consider the

delay added by searching the routing table. We expect increased end-to-end delay for packets

delivered by PIM.

Each routing table entry set up by PIM initiates at least one timer. Because the operating system

overhead impacts the performance of a router, the effect of several million timers on the operating

system must be addressed. An approach used by router manufacturers to minimize the impact of

the timers is to aggregate the timers in the routing daemons into a single timer. One effect of this

aggregation on performance can be seen in the measured join time.
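The aggregation idea can be sketched as follows. This is a minimal illustration and not any manufacturer's implementation: the class name `AggregatedTimerTable`, the 10-second sweep period, and the integer-second clock are assumptions made for the example. Only the 30-second entry lifetime comes from the simulations described above. Each entry stores the time it was last used, and a single periodic sweep deletes stale entries in place of one timer per entry.

```python
class AggregatedTimerTable:
    """Routing entries expired by one periodic sweep instead of per-entry timers."""

    def __init__(self, lifetime_s: int = 30, sweep_period_s: int = 10) -> None:
        self.lifetime_s = lifetime_s
        self.sweep_period_s = sweep_period_s
        self.last_used = {}               # entry key -> time of last use (seconds)
        self.next_sweep = sweep_period_s  # time of the next scheduled sweep

    def touch(self, key, now: int) -> None:
        """Record that the entry was used (or created) at time `now`."""
        self.last_used[key] = now

    def advance(self, now: int) -> list:
        """Run every sweep due by `now`; return (key, deletion_time) pairs."""
        deleted = []
        while self.next_sweep <= now:
            sweep_time = self.next_sweep
            stale = [key for key, used in self.last_used.items()
                     if sweep_time - used >= self.lifetime_s]
            for key in stale:
                del self.last_used[key]
                deleted.append((key, sweep_time))
            self.next_sweep += self.sweep_period_s
        return deleted


table = AggregatedTimerTable(lifetime_s=30, sweep_period_s=10)
table.touch(("S1", "G"), now=5)   # entry last used at t = 5 s
expired = table.advance(now=60)
print(expired)
```

In this run the entry becomes stale at t = 35 s but is not removed until the sweep at t = 40 s. With a single consolidated timer, an action can be delayed by up to one sweep period, which is the kind of effect that shows up in the measured join time.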

4.6 Join Time

We define join time to be the time between when a host asks to join a given multicast group and the time it receives a packet addressed to that group. The mean join times measured for CBT, PIM-ShT, and PIM-SBT in our simulations were each 230 ms. This value was highly dependent on the traffic levels. On average, 4.5 packets are created and sent for each group each second; thus, most of the measured join time is spent waiting for a packet to be created and sent to the group. If the traffic level in the network were higher, the join times would be smaller. We defined join time in this way to be consistent with measurements being taken on real networks and to be consistent among protocols.
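A rough calculation shows why the wait for the next packet can account for most of a 230 ms join time. The sketch below is an illustration only: the arrival process used in the simulations is not stated here, so both the Poisson and the strictly periodic cases are assumptions, and the function name `mean_wait_ms` is hypothetical.

```python
PACKETS_PER_SECOND = 4.5  # average per-group packet rate quoted in the text


def mean_wait_ms(rate_per_s: float, arrivals: str = "poisson") -> float:
    """Expected wait, in ms, from a random join instant to the next packet."""
    mean_gap_ms = 1000.0 / rate_per_s
    if arrivals == "poisson":
        # Memoryless arrivals: the expected residual wait equals the mean gap.
        return mean_gap_ms
    if arrivals == "periodic":
        # Fixed gaps: the join lands uniformly inside a gap, so half on average.
        return mean_gap_ms / 2.0
    raise ValueError("arrivals must be 'poisson' or 'periodic'")


print(round(mean_wait_ms(PACKETS_PER_SECOND, "poisson"), 1))
print(round(mean_wait_ms(PACKETS_PER_SECOND, "periodic"), 1))
```

Under the Poisson assumption the expected wait is about 222 ms, which is close to the measured mean. Doubling the packet rate would halve that wait under either assumption.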

PIM Dense join times are much higher than those of the other protocols for every network. The

mean join time for PIM Dense was 1.2 seconds with individual measurements as high as 40 sec-

onds. We have traced the long join times to the behavior of our implementation of the timers that

initiate flooding of data packets.

We cannot model the number of timers required by the PIM specification in OPNET in reason-

able periods of time. We chose to consolidate the timers in each router. This deviation from the

specification is not unreasonable; as we noted earlier, some router manufacturers consolidate tim-

ers in the routing daemons to minimize the impact of the timers on the operating system. However,

consolidating timers did introduce long join time delays (several times the length of the timers) in

both PIM Sparse and PIM Dense in initial simulations. We deactivated the timers in PIM Sparse

in our final simulations, leaving outdated routing table entries in the routers.

5. Suggestions for PIM Sparse

In this section, we note some features of PIM Sparse that could be altered to make the protocol

more attractive for the DIS community and possibly the wider multicast community. We are not

attempting to define a new protocol with all of the details required by a real protocol. We are just

offering some suggestions and thoughts as to how PIM might be improved.

The issues we address arise from the complex features instituted to achieve the laudable objec-

tive of being able to offer either a source-based packet delivery mode or a shared-tree mode.

When using PIM-SBT, the receiver initially receives data packets from the source S for group

G over the shared tree. Then, based upon some unspecified criteria, each receiver may send a

request to the sender S to join the source-based tree. After receiving data on the source-based tree

from S, the receiver sends a prune message up the shared tree to terminate data delivery of packets

from S along the shared tree. This prune request causes complicated routing table entries to be set

up to stop delivery of only the data from S along the shared tree.

We recommend that the decision as to whether the source-based tree mode or shared tree mode

be used be made by the group initiator rather than individually by receivers, as currently defined

in the specification [Dee95]. We do not see great advantages in allowing the receivers to control

this choice, and it leads to considerable complexity. In almost all cases, the application and the

nature of the group being initiated indicate which type of tree might be optimal. A videoconference

application with a single sender would use a source-based tree; large applications such as distributed

simulation would use shared trees.

The approach we recommend is described briefly below.

1) The group initiator registers the group at the RP established for that group and specifies whether the traffic should be delivered using shared trees or source-based trees.

2) For shared tree groups, requests to join a group are handled in a straightforward manner much like CBT or PIM-ShT. A shared tree that is a bidirectional graph (unlike the directed graph used currently) is constructed. Senders can use this tree immediately; there is no need to require them to register at the RP or to have source-based routes established from S to the RP. Thus, there is no need to have source-based routing table entries for groups using the shared tree mode.

3) Operation of source-based tree groups could follow these steps.

• A receiver that joins a group that uses source-based trees sends a join message to the RP to become part of a unidirectional tree for the group rooted at the RP. (This tree is used only for overhead messages.) The RP sends the current list of senders for group G back to the receiver.

• New sources register with the RP. The RP multicasts the arrival of the new source to all members of group G along the tree rooted at the RP. The RP may periodically transmit a list of senders to the group to ensure that the receiver knows about all current senders.

• The receiver requests to join the source-based trees of all senders on the list of senders. Note that the receiver communicates its intention to join group G and learns of a source S over the shared tree. Traffic is received from S only over the source-based tree. Since data is not forwarded on the tree based at the RP, no (S, G, RP) entries are needed.
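The control flow of the steps above can be sketched in a few lines. This is a toy model of the RP's bookkeeping only, not a protocol implementation: the class and method names (`RendezvousPoint`, `register_group`, `join`, `register_source`) and the notification list are hypothetical, and no packet forwarding, tree construction, or timers are modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class GroupState:
    mode: str                                    # "shared" or "source-based"
    receivers: Set[str] = field(default_factory=set)
    senders: List[str] = field(default_factory=list)


class RendezvousPoint:
    """Toy RP for the suggested scheme (control plane only)."""

    def __init__(self) -> None:
        self.groups: Dict[str, GroupState] = {}
        # (receiver, group, new_sender) announcements sent down the RP-rooted tree
        self.notifications: List[Tuple[str, str, str]] = []

    def register_group(self, group: str, mode: str) -> None:
        """Step 1: the initiator fixes the delivery mode for the whole group."""
        if mode not in ("shared", "source-based"):
            raise ValueError("mode must be 'shared' or 'source-based'")
        self.groups[group] = GroupState(mode)

    def join(self, group: str, receiver: str) -> List[str]:
        """Return the senders whose source-based trees the receiver should join."""
        state = self.groups[group]
        state.receivers.add(receiver)
        if state.mode == "shared":
            return []                 # step 2: data arrives on the shared tree itself
        return list(state.senders)    # step 3: current sender list from the RP

    def register_source(self, group: str, sender: str) -> None:
        """Step 3: a new source registers and is announced to group members."""
        state = self.groups[group]
        if state.mode == "shared":
            return                    # step 2: shared-tree senders need not register
        state.senders.append(sender)
        for receiver in sorted(state.receivers):
            self.notifications.append((receiver, group, sender))


rp = RendezvousPoint()
rp.register_group("G", "source-based")
rp.register_source("G", "S1")
to_join = rp.join("G", "R1")      # R1 learns it must join S1's tree
rp.register_source("G", "S2")     # R1 is told about the new sender S2
print(to_join, rp.notifications)
```

Because the mode is stored once per group, a router never needs to hold both shared-tree and source-based state for the same group, which is the source of the savings claimed for this approach.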

These modifications to the PIM protocol would keep the advantageous properties of the current

PIM specification. One disadvantage is that the join time when using source-based trees would be

longer because the receiver would not receive data from a sender until it joined the sender’s source-

based tree.

The advantages of implementing these suggested changes to PIM are that the intricacy of the

protocol is greatly reduced, the routing table sizes are always reduced (greatly reduced in some

cases) and the number of timers required is reduced. For PIM-ShT, the table sizes and the number

of timers are reduced at least by a factor of 2. The reduction is probably a great deal larger for appli-

cations with a large number of senders per group such as STOW-97. In PIM-SBT, no (S,G,RP)

entries are needed along the shared trees since no traffic will be routed on these trees. This again

leads to a savings in routing table size and number of timers.

6. Conclusions

This paper compares the performance of the PIM and CBT protocols with respect to the metrics

outlined in Section 4. These results are summarized in Table 5. PIM-SBT and PIM Dense have

slightly lower end-to-end delays than CBT and PIM-ShT, but the absolute delays are all very small.

Network resource usage is similar for all of the protocols except PIM Dense, which periodically

floods data on the network. Traffic concentration is observed in CBT and PIM-ShT, but does not

degrade performance significantly.

Table 5: Comparison of Multicast Routing Protocols.

Metric              CBT                 PIM-ShT             PIM-SBT             PIM-Dense

End-to-End Delay    Low                 Low                 Low                 Low

Network Resource    Moderate            Moderate            Moderate            High in sparse
Usage                                                                           group environment

Overhead Traffic    Small in the cases                                          High
Percentage          simulated but
                    proportional to
                    number of joins
                    per second

Join Time           Low                 Low                 Low                 High on average

Traffic             High                Highest             Low                 Lowest
Concentration

Routing Table Size  Linear with the     Proportional to     Proportional to     Proportional to
                    number of groups    the product of      the product of      the product of
                                        number of groups    number of groups    number of groups
                                        and mean number     and mean number     and mean number
                                        of senders per      of senders per      of senders per
                                        group               group               group

Implementation      Low to Moderate     Complex             Complex             Moderate
Difficulty

The size of the routing table and the impact of the timers on the operating system overhead may become a major factor. Operation of PIM-SBT or PIM Dense for a large number of members and groups requires each router to maintain large routing tables. For example, the average number of routing table entries at a single router in the STOW-97 exercises is expected to be several million entries. Routers near bottleneck locations are likely to lie on many shared trees and require many more routing table entries. Significant delay may be added each time one of these large tables is accessed.

On average, PIM-ShT routers have fewer routing table entries and fewer timers than the source-based tree protocols. CBT routers have even fewer. Based on these observations, we argue that with current technology CBT is the best suited multicast protocol for environments with a large number of groups, each with many senders.

Acknowledgments

We would like to thank Dr. Stuart Milner of DARPA for project guidance. We would also like

to thank the anonymous reviewers for their valuable comments and suggested improvements.

References

[Bal94] T. Ballardie, Core Based Tree (CBT) Multicast: Architectural Overview and Specification, Internet Draft RFC, July 1994.

[Bil95] T. Billhartz, J. Cain, E. Farrey-Goudreau, D. Fieg, and S. Batsell, “Simulation Comparison of CBT and PIM Multicasting for Distributed Interactive Simulation (DIS),” Proceedings of the 1996 Society for Computer Simulation Western Multiconference: Communications Networks Modeling and Simulation Conference, pp. 246-251, January 14-17, 1996.

[Bil96] T. Billhartz, J. Cain, E. Farrey-Goudreau, D. Fieg, and S. Batsell, “Performance and Resource Cost Comparisons for the CBT and PIM Multicast Routing Protocols in DIS Environments,” Proceedings of IEEE INFOCOM ’96, Vol. 1, pp. 85-93, March 26-28, 1996.

[Dee89] S. Deering, Host Extensions for IP Multicasting, Request for Comments 1112, DDN Network Information Center, August 1989.

[Dee95] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei, Protocol Independent Multicast (PIM): Protocol Specification, Internet Draft RFC, January 11, 1995.

[Eri94] H. Eriksson, “MBone: The Multicast Backbone,” Communications of the ACM, Volume 37, Number 8, pp. 54-60, August 1994.

[Moy94] J. Moy, “Multicast Routing Extensions for OSPF,” Communications of the ACM, Volume 37, Number 8, pp. 61-66, August 1994.

[Wai88] D. Waitzman, C. Partridge, and S. Deering, Distance Vector Multicast Routing Protocol, Request for Comments 1075, DDN Network Information Center, November 1988.

[Wei94] L. Wei and D. Estrin, “The Trade-Offs of Multicast Trees and Algorithms,” Proceedings of the 1994 International Conference on Computer Communication and Networks, September 12-14, 1994.

Authors

Tom Billhartz received the B.S. and M.S. degrees in electrical engineering from the Univer-

sity of Florida, Gainesville, in 1988 and 1990, respectively. He is currently an engineer at Harris

Corporation in Melbourne, Florida, where his work includes analysis and simulation of network

protocols, communication systems, and digital signal processing.

J. Bibb Cain (S’64 - M’69) received the B.S.E.E., M.S.E.E., and Ph.D. degrees in electrical

engineering from the University of Alabama, Tuscaloosa, in 1965, 1966, and 1969, respectively.

Since 1969, he has been with Harris Corporation in Melbourne, Florida, where he is a Senior

Scientist. During this time, he has contributed to the development of a number of different error-

correction decoder designs for block (BCH and Reed-Solomon), convolutional (with sequen-

tial, threshold, and Viterbi decoders), and concatenated codes. In addition, he has worked

extensively in applications of coding in a variety of systems including digital communications

links such as coded spread spectrum data links, communications networks, and storage systems.

More recently, his research has also included work in the areas of development of algorithms and

protocols for computer and communications networks. This work included the development, per-

formance, evaluation, and validation of link assignments, routing, and flow control protocols. He

has published numerous papers on these subjects, and he has co-authored the textbook

Error-Correction Coding for Digital Communications (New York: Plenum, 1981). He also holds five U.S.

patents in the areas of error-correction coding and networking protocols.

Dr. Cain is a member of Tau Beta Pi, Eta Kappa Nu, Sigma Xi, Phi Eta Sigma, and Pi Mu

Epsilon.

Ellen Farrey-Goudreau (S’88 - M’94) earned M.A. and Ph.D. degrees in electrical engineer-

ing from Princeton University in 1991 and 1994, respectively. Her research at Princeton examined

properties of sampled bilinear systems. She also earned a B.S. degree in Electrical Engineering

from Washington University in St. Louis in 1989.

Since 1994, Dr. Farrey-Goudreau has been a member of the systems modeling and analysis

group in the Communications Systems Division of Harris Corporation in Melbourne, Florida. In

addition to studying the performance of multicast protocols, she has worked on adaptive control

algorithms for phased array antennas and applications of fuzzy logic to radar systems. Her interests

include adaptive control and analysis of nonlinear systems.

She is a member of Tau Beta Pi, Eta Kappa Nu, and Pi Mu Epsilon.

Doug Fieg received his B.S. and M.S. degrees in Industrial Engineering from the University

of Missouri at Columbia in 1976 and 1977, respectively. He is currently a systems engineer at Har-

ris Corporation in Melbourne, Florida.

His expertise and experience are in the areas of discrete-event simulation, queuing theory,

mathematical optimization, and probability and statistics. He has applied this expertise to a variety

of applications, ranging from the analysis of network protocols and communications systems to

inventory control and other business applications.

Steven Gordon Batsell received a B.S. in physics and mathematics from Kansas State

University, an M.S.E.E. from the University of Illinois, Urbana-Champaign, and the Ph.D. in

electrical engineering from Texas Tech University.

He is currently the Network Research Group Leader at Oak Ridge National Laboratory where

he is involved in high speed networking. He spent the previous nineteen years in military commu-

nication system engineering, most recently at the Naval Research Laboratory in Washington, DC,

where he worked on the DARPA STOW program.

His current interests are in high-speed network design, network performance issues, and dis-

crete-event simulation of communication networks. He is a member of the IEEE Communications

Society, SIGCOMM, the IEEE Computer Society, Sigma Xi, Sigma Pi Sigma, and Phi Kappa Phi.
