Performance and Resource Cost Comparisons
for the CBT and PIM Multicast Routing Protocols
Tom Billhartz, J. Bibb Cain, Ellen Farrey-Goudreau, Doug Fieg
Harris Corporation, Melbourne, Florida¹
Steven Batsell
Oak Ridge National Laboratory, Washington, DC
Abstract
Researchers have proposed the Core Based Trees (CBT) and Protocol Independent Multicasting (PIM) protocols to route multicast data on internetworks. In this paper, we compare the simulated performance of CBT and PIM using the OPNET network simulation tool. Performance metrics include end-to-end delay, network resource usage, join time, the size of the tables containing multicast routing information, and the impact of the timers introduced by the protocols. We also offer suggestions to improve PIM Sparse Mode while retaining the ability to offer both shared tree and source-based tree routing.
1. Introduction
Multicasting is a communications service that allows an application to efficiently transmit cop-
ies of a data packet to a set of receivers that are members of a multicast group. The group is
identified by a location-independent multicast group address. Senders use this address in the des-
tination field of the packet; multicast routers forward the packet to group members using routing
table entries for this address. The entries form a tree, which may be a source-based tree or a center-
based tree depending on the multicast routing protocol.
Multicast group members may be spread across separate physical networks, they may join and
leave a group during the life of the group, and they may be members of multiple groups. (How
members learn of the multicast group is not part of the routing protocol’s function; one method is
for a multicast application to advertise groups using a well-known multicast address.)
1. This work was sponsored by the Defense Advanced Research Projects Agency (DARPA) through contract number N00014-93-C-2186 with the Naval Research Laboratory.
Since 1992, multicast routing has been performed by a multicast-capable, virtual network run-
ning “on top” of the internet called the Multicast Backbone (MBone) [Eri94]. The MBone uses the
Distance Vector Multicast Routing Protocol (DVMRP) [Wai88] or the Multicast Extensions for
Open Shortest Path First Protocol (MOSPF) [Moy94] to route multicast traffic. Common uses of
multicasting include audio- and video-conferencing, Distributed Interactive Simulation (DIS)
activities such as tank battle simulations, and exchanging experimental data and weather maps
[Eri94]. A comparison of the important features of these applications can be found in [Bil96].
DVMRP and MOSPF depend on features of underlying point-to-point (unicast) routing proto-
cols. Efforts to remove this dependency and to develop point-to-multipoint (multicast) routing
protocols that operate in a hierarchical manner with subnet multicast routing protocols led to the
development of the Core Based Tree (CBT) protocol [Bal94] and the Sparse Mode of the Protocol
Independent Multicasting (PIM) protocol [Dee95]. PIM and CBT are defined by Internet Drafts
and are still evolving.
In this paper, we describe the operation of CBT, the Dense Mode of PIM, and the Sparse Mode
of PIM. We discuss the OPNET network models that we built to compare the performances of
these protocols when a large number of groups and multiple senders are active simultaneously. We
discuss the simulation results and related analysis.
Wei and Estrin [Wei94] considered end-to-end delay, network resource usage, and traffic con-
centration for source-based trees and center-based trees by analyzing random graphs. This paper
measures these metrics plus overhead traffic, scalability, and join time for network models running
the PIM and CBT protocols. In our network models, key features of the protocols are implemented,
many groups are active simultaneously, and group membership is dynamic.
2. Multicast Routing Protocols
2.1 Source-Based Trees and Shared Trees
Data packets addressed to a multicast group may be routed on a tree that is specific to the par-
ticular sender and group or a tree that is shared by all of the senders to the group. The first approach
uses a source-based tree (SBT) that is a shortest-path tree rooted at a sender. The branches of the
tree are the shortest paths from the sender to each of the group members. A separate tree must be
constructed for each sender to each active multicast group. A protocol that implements SBTs is the
Dense Mode of PIM (PIM Dense) [Dee95]. The shared tree approach uses a single center-based
tree or core-based tree to route traffic from all senders to the group. The tree is a shortest-path tree
rooted at one or more predefined nodes in the network called Core nodes. A protocol based on cen-
ter-based trees is CBT [Bal94].
The Sparse Mode of PIM (PIM Sparse) [Dee95] first builds a center-based shared tree for each
multicast group. After a group member receives traffic over the shared tree, it may ask (though it
is not required to ask) the sender to send future traffic for that group along the shortest path. This
request triggers the multicast routers to construct a branch of the source-based tree for the group
from the sender to the receiver. We have modeled two special cases of PIM Sparse. The first deliv-
ers all traffic over the shared tree, which we will call the shared tree case of PIM Sparse (PIM-ShT).
In the second case, all group members ask to receive traffic only over source-based trees. We refer
to this case (which is implemented in routers on the market) as the source-based tree case of PIM
Sparse (PIM-SBT).
There is currently some debate over which type of tree provides the best performance [Wei94].
Algorithms that use CBTs construct a single tree for each group, regardless of the number of send-
ers. Because the packets are not guaranteed to travel the shortest path, one expects the end-to-end
delivery delay to be larger for CBT algorithms than for SBT ones. However, SBT algorithms scale
poorly for large numbers of senders because the router resources required to maintain knowledge
of the tree structure are considerable.
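A small sketch can make the delay trade-off concrete. The six-router ring, the node names, and the choice of D as the center are hypothetical, and hop count stands in for delay; none of this is taken from the simulations reported here.

```python
from collections import deque

# Hypothetical ring of six routers (not one of the simulated topologies).
GRAPH = {
    "A": ["B", "F"], "B": ["A", "C"], "C": ["B", "D"],
    "D": ["C", "E"], "E": ["D", "F"], "F": ["E", "A"],
}

def bfs_parents(graph, root):
    """Shortest-path tree rooted at `root`, stored as child -> parent."""
    parent = {root: None}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent

def path_to_root(parent, node):
    """Nodes visited when walking from `node` up to the tree root."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path

def sbt_hops(graph, sender, receiver):
    # Source-based tree: the packet follows a shortest path from the sender.
    return len(path_to_root(bfs_parents(graph, sender), receiver)) - 1

def shared_tree_hops(graph, center, sender, receiver):
    # Shared tree: the packet must stay on the tree rooted at the center.
    parent = bfs_parents(graph, center)
    up_s = path_to_root(parent, sender)
    up_r = path_to_root(parent, receiver)
    # Drop the part of both upward paths that lies above their meeting point.
    while len(up_s) > 1 and len(up_r) > 1 and up_s[-2] == up_r[-2]:
        up_s.pop()
        up_r.pop()
    return (len(up_s) - 1) + (len(up_r) - 1)

print(sbt_hops(GRAPH, "A", "F"))               # hops on the source-based tree
print(shared_tree_hops(GRAPH, "D", "A", "F"))  # hops on the shared tree
```

On this ring, a packet from A reaches its neighbor F in one hop on the source-based tree but needs five hops on the tree centered at D, which is the kind of detour the paragraph above describes.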
2.2 Internet Group Management Protocol (IGMP)
Multicast routers connected to Local Area Networks (LANs) learn which multicast groups the
hosts on the LANs wish to join using the Internet Group Management Protocol (IGMP) [Dee89].
Hosts notify the router of their group memberships and of their decisions to join or leave particular
multicast groups. Routers use this information to construct multicast trees.
2.3 Core-Based Trees (CBT) Multicast Routing Protocol
CBT sets up and maintains a single shared tree for every multicast group that is active in the
network. When a multicast router is notified via IGMP that a local host would like to join the group,
the router sends a join message for that group toward the Core node via the shortest path. A tree
rooted at the Core is constructed as the acknowledgments to the join messages are processed. The
resulting tree is a bidirectional, acyclic graph that reaches every group member. See Figure 1.
Figure 1. CBT shared tree used to route all multicast traffic for group G. (Legend: multicast tree, network connections. Labeled nodes: Core and three members of G.)
Forwarding packets to the group members using CBT is straightforward. When a node on the
tree receives a packet addressed to the group, it forwards copies of the packet on all branches of
the group’s tree except for the branch on which the packet arrived. Packet delivery is illustrated in
Figure 2.
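The forwarding rule described above can be sketched in a few lines. The tree, the router names, and the function names below are hypothetical and only illustrate the rule; they are not taken from the CBT specification or from our OPNET models.

```python
# Hypothetical bidirectional shared tree: router -> neighbors on the tree.
TREE = {
    "C": ["R1", "R2"],   # C plays the role of the Core
    "R1": ["C"],
    "R2": ["C", "R3"],
    "R3": ["R2"],
}

def cbt_forward(tree, node, arrival_branch):
    """Branches on which `node` sends a copy: all except the arrival branch."""
    return [nbr for nbr in tree[node] if nbr != arrival_branch]

def deliver(tree, sender):
    """Follow one packet through the tree; return hops needed to reach each node."""
    reached = {sender: 0}
    pending = [(sender, None)]          # (node, branch the packet arrived on)
    while pending:
        node, came_from = pending.pop()
        for nbr in cbt_forward(tree, node, came_from):
            reached[nbr] = reached[node] + 1
            pending.append((nbr, node))
    return reached

print(deliver(TREE, "R3"))
```

Because the tree is acyclic, excluding the arrival branch is enough to guarantee that each on-tree router sees the packet exactly once.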
2.4 PIM Dense Multicast Routing Protocol
The PIM Dense protocol [Dee95] floods the network with data packets to set up a source-based
tree for every sender to every group. Initially, these trees reach every potential receiver in the network.
After receiving multicast data for group G, each router that has no members of group G on
its LANs sends prune messages toward the senders to remove unwanted branches from the trees for
group G. This broadcast-type behavior recurs periodically after the pruned interfaces have timed
out. A tree created by PIM Dense is shown in Figure 3.
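The flood-and-prune behavior can be approximated with the following sketch. It is a simplification under stated assumptions: the topology and names are hypothetical, the initial flood is modeled as a shortest-path tree that reaches every router, and the periodic re-flooding after prune timeouts is not modeled.

```python
from collections import deque

# Hypothetical topology: S is the sender's router, D has a group member.
GRAPH = {
    "S": ["A"], "A": ["S", "B", "C"],
    "B": ["A"], "C": ["A", "D"], "D": ["C"],
}

def flood_tree(graph, sender):
    """Tree reached by the initial flood, stored as child -> parent."""
    parent = {sender: None}
    queue = deque([sender])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    return parent

def prune(parent, member_routers):
    """Routers left on the tree once prunes have propagated toward the sender."""
    children = {node: set() for node in parent}
    for node, up in parent.items():
        if up is not None:
            children[up].add(node)
    keep = set(parent)
    changed = True
    while changed:
        changed = False
        for node in list(keep):
            if parent[node] is None:
                continue                      # the sender's router stays
            has_downstream = bool(children[node] & keep)
            if node not in member_routers and not has_downstream:
                keep.discard(node)            # this router sends a prune upstream
                changed = True
    return keep

print(sorted(prune(flood_tree(GRAPH, "S"), {"D"})))
```

In this example only router B is pruned, since A and C still lead to the member behind D.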
Figure 2. Packet delivery for group G and sender S along the CBT shared tree. (Legend: packet delivery, network connections. Labeled nodes: Core, sender S, and members of G.)
2.5 PIM Sparse Multicast Routing Protocol
When a new multicast group is introduced to a network that uses the PIM Sparse Protocol, a
node in the network called the Rendezvous Point (RP) is assigned to the group. The RP will
become the center node of a directed, shared tree for the group. (The RP performs a function sim-
ilar to that of the Core node in CBT.)
Each multicast router that learns via IGMP that a local host has joined group G sends a join
message along the shortest path to the RP for that group. The join message triggers each router on
its path to the RP to set up or update a routing entry for the shared tree for group G. The shared tree
that is built by these actions is a directed tree rooted at the RP that can be used to deliver packets
to each member of the group.
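As an illustration of how join messages build the directed shared tree, consider the sketch below. The topology, router names, and the dictionary used to hold outgoing interfaces are hypothetical; in particular, a real join would stop at the first router that is already on the tree, whereas this sketch walks the whole path and arrives at the same entries.

```python
from collections import deque

# Hypothetical topology: M1 and M2 have local members, Y has none.
GRAPH = {
    "RP": ["X", "Y"], "X": ["RP", "M1", "M2"],
    "Y": ["RP"], "M1": ["X"], "M2": ["X"],
}

def shortest_path(graph, src, dst):
    """Breadth-first shortest path from src to dst (dst must be reachable)."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nbr in graph[node]:
            if nbr not in parent:
                parent[nbr] = node
                queue.append(nbr)
    path = [dst]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path[::-1]

def build_shared_tree(graph, rp, member_routers):
    """Return router -> set of downstream neighbors for the shared-tree entry."""
    outgoing = {}
    for router in member_routers:
        path = shortest_path(graph, router, rp)
        # Each router on the path records the neighbor the join arrived from.
        for downstream, upstream in zip(path, path[1:]):
            outgoing.setdefault(upstream, set()).add(downstream)
    return outgoing

print(build_shared_tree(GRAPH, "RP", ["M1", "M2"]))
```

Packets that reach the RP are then forwarded along the recorded downstream neighbors, which is why the resulting tree is directed away from the RP.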
Each new sender to a group registers with the RP. In response, the RP initiates construction of
a directed source-based tree from the sender to the RP. At each router on this tree, the port leading
to the RP, as well as the ports leading to branches of the shared tree that do not lead to the source,
are added to the SBT routing table. Routers give precedence to source-based trees when two trees
are available for the same group. See Figure 4.
Figure 3. PIM Dense multicast routing tree for sender S and group G. (Legend: multicast tree for sender S, network connections. Labeled nodes: sender S and members of G.)
2.5.1 Shared Tree Case of PIM Sparse (PIM-ShT)
The shared tree directed away from the RP overlaid with a source-based tree from each source
provides the same functionality as the simple bidirectional tree of CBT. We refer to packet delivery
using these trees as the shared tree case of PIM Sparse or PIM-ShT.
2.5.2 Source-Based Tree Case of PIM Sparse (PIM-SBT)
Even if we use PIM Sparse to deliver all multicast traffic over SBTs, the trees discussed in
Section 2.5 must be constructed. Subsequently, more complicated trees are set up.
Each receiver learns of senders to a group when it receives data packets on the trees described
in Section 2.5. The receiver may send join messages toward specific senders to become part of the
shortest-path, source-based trees rooted at those senders. The receiver must also alert routers on
the shared tree that no packets from these sources should be forwarded. The switch from shared
tree delivery to source-based tree delivery introduces a high degree of complexity to PIM-SBT. See
Figure 5.
Figure 4. PIM-ShT shared tree for group G and source-based tree for sender S. (Legend: source-based tree for sender S, shared tree for group G, network connections. Labeled nodes: RP, sender S, and members of G.)
3. Simulation Environment
3.1 OPNET Network Models
We modeled CBT, PIM Dense, the Shared Tree Case of PIM Sparse, and the Source-Based
Tree Case of PIM Sparse using the OPNET network simulation tool. OPNET was selected because
it allows construction of detailed protocol models. In addition, because OPNET has optimized its
memory use, large networks that route many packets concurrently can be simulated efficiently. We
used or modified standard OPNET process models including IP and Ethernet. We also developed
new OPNET process models for the IGMP and the multicast routing protocols. Details of our mod-
els can be found in [Bil95].
We constructed three network topologies using the OPNET network simulation tool to study
the multicast protocols:
1) AAI/MAGIC/ATDnet - The top level of this topology is representative of a system being
constructed by a consortium of government, industry, and educational team members. Link
bandwidths range from 10 to 600 Mbits/second. This network is called the AAI Network in later
sections.
Figure 5. PIM-SBT shared tree for group G and source-based tree for sender S. (Legend: source-based tree for sender S, shared tree for group G, network connections. Labeled nodes: RP, sender S, and members of G.)
• The AAI/MAGIC (ACTS ATM Internetwork / Multidimensional Applications and Gigabit
Internetwork Consortium) network topology integrates research projects that have influenced
the fundamental attributes of the prototype of the DARPA Global Grid program. Our OPNET
model of the AAI/MAGIC network contains eleven sites and is shown in Figure 6. Five of the
sites are part of a star subnetwork labeled MAGIC (the icon is located in Nebraska) in Figure 6.
• The subnetwork labeled NRL_ATDnet connects eight federal agency sites in the Washington,
DC, area in a ring configuration. It was established by ARPA and serves as a testbed for ATM
and SONET technologies.
2) High-bandwidth mesh topology - This topology is composed of nineteen sites arranged in the
grid pattern shown in Figure 7. The octagons symbolize sites. The bandwidths of the WAN links
were selected to match those used in the AAI Network to permit direct comparisons with the AAI
Network. This network is called the Mesh Network in later sections.
Figure 6. AAI/MAGIC Network.
3) Low-bandwidth mesh topology - This topology is the grid pattern defined for the high-bandwidth
mesh topology and shown in Figure 7. The bandwidths of the WAN links are set to 3 Mbits/second
to introduce congested links. This network is called the Stressed Mesh Network or S-Mesh in
later sections.
In all three networks, each site consists of a WAN router connected to a site router, which in
turn connects to two LAN routers. Each LAN router is connected to two host nodes via Ethernet,
as shown in Figure 8. Each network has seventy-six hosts that generate multicast data traffic.
Figure 7. Mesh Network Topology
The Mesh Network topology is highly connected and provides numerous possible tree patterns.
In contrast, the AAI Network topology has two large loops that are the basis of the trees for every
multicast group. This contrast tests the multicast protocols under vastly different topological
conditions.
The network topologies do not change during the simulations and no packets are lost.
3.2 Simulation Parameters
Using OPNET, we were able to model delays due to the IP packet service rate; packet queueing
at routers and hosts; propagation delays on all links; and transmission delays based on link data
rate and packet size. We also modeled the size of the data packets (1824 bytes plus all appropriate
header information for the application we are interested in modeling) and the size of the overhead
packets defined by the protocol specifications. Because we are primarily interested in modeling the
routing of packets, we chose not to model PIM and CBT processing beyond that of IP; the effect
of operating system overhead on delay in routers and hosts; the effect of finite size packet buffers
in routers; or the effect of nonzero link error rates.
Figure 8. Site Topology. (Labeled elements: hosts, Ethernets, LAN routers, site router, WAN router, link to WAN.)
In our simulations, each group was assigned a single RP or Core node selected from a set of
four centrally located nodes. Each of these nodes supported approximately one-quarter of the mul-
ticast groups. The RP for a given group in a PIM Sparse simulation was placed at the same location
as the Core node for that group in the CBT simulation. (Multiple Cores are supported by the CBT
protocol specification. PIM supports only a single RP.)
We performed twelve simulations for each combination of network and protocol. Each of the
parameters in Table 1 was varied independently in the simulations. We distributed membership
among 390 active multicast groups using a uniform distribution defined by the “Groups per Host”
and “Group Distribution Type” parameters explained in Table 1. A given host in simulations that
share a network and parameter values but use different routing protocols joins the same
multicast groups at the start of each simulation.
Each group member was both a sender and a receiver for the group. Group members joined and
left groups throughout each simulation.
Table 1: Simulation Parameters.
Parameter | Description of Parameter and Simulation Values Used
Groups per Host | The maximum number of multicast groups to which each host can belong: 10, 20, or 30.
Group Distribution Type | The set of groups that hosts can join. Our model allows a host to join (a) groups with membership spread across the entire network or (b) groups with membership spread across the entire network AND groups with membership restricted to one quadrant of the network.
Join/Leave Dynamics | The maximum time between consecutive changes (join or leave) in group membership at a host: 5 or 10 seconds. The times between changes (joins or leaves) were uniformly distributed between zero and this parameter.
Traffic Generation Rate | Rate of packet transmission for each group at each host. Hosts sent packets to each group at intervals determined by a Poisson distribution with mean one packet per second per group.
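The two stochastic parameters in Table 1 can be sketched as follows. This is only an illustration under one assumption: we read "Poisson distribution" as a Poisson arrival process, that is, exponentially distributed gaps between packets. The function names and the fixed seed are hypothetical and are not part of the OPNET models.

```python
import random

def event_times(draw_gap, horizon):
    """Accumulate gaps from draw_gap() until `horizon` seconds is reached."""
    now, times = 0.0, []
    while True:
        now += draw_gap()
        if now >= horizon:
            return times
        times.append(now)

def membership_change_times(max_gap, horizon, rng):
    # Join/Leave Dynamics: gaps uniformly distributed between 0 and max_gap.
    return event_times(lambda: rng.uniform(0.0, max_gap), horizon)

def packet_send_times(rate, horizon, rng):
    # Traffic Generation Rate: exponential gaps with mean 1/rate seconds
    # (assumed interpretation of the Poisson traffic source).
    return event_times(lambda: rng.expovariate(rate), horizon)

rng = random.Random(1)                       # fixed seed for repeatability
changes = membership_change_times(5.0, 60.0, rng)
packets = packet_send_times(1.0, 60.0, rng)
print(len(changes), len(packets))
```

With a maximum gap of 5 seconds the mean gap is 2.5 seconds, so a host changes its membership roughly 24 times per simulated minute, while it sends about 60 packets per minute to each group.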
4. Experimental Results
This section examines simulation results for CBT, PIM Dense, and the two special cases of
PIM Sparse on the AAI Network, the Mesh Network, and the Stressed Mesh Network. The metrics
used to compare the protocols are described in Table 2.
Each statistic reported in Section 4 is the average of the twelve simulations performed for
the given network and protocol. One simulation was performed for each combination of simulation
parameters.
Table 2: Network Performance Metrics.
Metric | Description
End-to-End Delay | Time elapsed between the generation of a packet at a source and the reception of that packet by a group member.
Network Resource Usage | Total number of hops that copies of a packet travel to reach all group members. Computed by dividing the total number of hops measured in a simulation (including overhead packets) by the number of packets received.
Overhead Traffic Percentage | The percentage of the total number of bits transmitted that are overhead bits.
Traffic Concentration | A measure of the distribution of the total network traffic on all links. Defined to be the ratio of the maximum throughput carried by any link to the mean throughput of all links.
Implementation Issues | (1) Size of the routing table and (2) the number of required timers. These issues impact memory requirements, speed, and operating system performance.
Join Time | The time elapsed between when a host requests to join a group and when that host receives its first message from another member of that group.
4.1 End-to-End Delay
End-to-end delay is the period of time required for a data packet to be routed through the network
from the application where it was created to a destination application. The average end-to-end
delay measurements for each protocol and each network are shown in Figure 9. For illustration,
these values are normalized by the average end-to-end delay for the CBT protocol under the same
test conditions.
PIM-ShT and CBT use shared trees to deliver data to multicast group members. These proto-
cols route packets for groups with the same members and the same Core and RP nodes on identical
paths. End-to-end delays for these protocols are similar.
PIM-SBT and PIM Dense deliver packets 13% to 31% faster than CBT, depending on the
topology simulated. These protocols use source-based trees to deliver packets along the shortest
path from the source to the receivers. The largest improvements of source-based tree protocols
over shared tree protocols were observed for the highly connected Mesh Networks.
Figure 9. End-to-End Delay relative to CBT. (Bar chart of the PIM/CBT ratio, vertical axis from 0.00 to 1.20, for PIM-ShT, PIM-SBT, PIM-D, and CBT on the AAI, Mesh, and S-Mesh networks.)
4.2 Network Resource Usage
A simple method to route a packet to all interested receivers is to unicast a copy of the packet
to each receiver. However, unicasting is likely to route several copies of the same packet over net-
work links. Multicast protocols send only a single copy of a packet over any link in the network
and require fewer hops to deliver the packet than unicasting in most cases. Our simulation results
were used to study which protocol delivers a copy of a packet to all group members in the fewest
number of hops. We refer to the number of hops as the Network Resource Usage metric.
Source-based tree protocols deliver packets along the shortest paths from the source to each
receiver. Because packets travel direct paths to their destinations, the fewest possible hops are
used to reach a single receiver. If multiple receivers are present in the network, packets still travel
directly to each receiver. Copies of the packet are not made until the paths diverge.
Shared tree protocols are expected to route packets on a common tree for greater distances.
Thus, even though the packets may not travel on the shortest paths to the receivers, copies of the
packets may travel fewer overall hops than source-based tree protocols to reach all of the
destinations.
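The difference between unicasting and multicasting in terms of hops can be sketched as follows. The four-router tree and the function names are hypothetical; the point is only that a link shared by several receivers is counted once under multicast and once per receiver under unicast.

```python
# Hypothetical delivery tree rooted at the sender's router S: child -> parent.
PARENT = {"S": None, "A": "S", "B": "A", "C": "A"}

def path_links(parent, receiver):
    """Links on the tree path from the root down to `receiver`."""
    links = []
    node = receiver
    while parent[node] is not None:
        links.append((parent[node], node))
        node = parent[node]
    return links

def unicast_hops(parent, receivers):
    # One full copy per receiver: shared links are crossed repeatedly.
    return sum(len(path_links(parent, r)) for r in receivers)

def multicast_hops(parent, receivers):
    # One copy per link: the packet is duplicated only where paths diverge.
    return len({link for r in receivers for link in path_links(parent, r)})

print(unicast_hops(PARENT, ["B", "C"]), multicast_hops(PARENT, ["B", "C"]))
```

Here the link from S to A is crossed twice under unicast and once under multicast, giving four hops against three.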
The histogram shown in Figure 10 illustrates that CBT, PIM-ShT, and PIM-SBT require a sim-
ilar number of hops to deliver copies of a packet to all multicast group members. The measured
values displayed in Figure 10 are normalized by the measured value for CBT under the same test
conditions.
Figure 10 shows that the network resource usage is much larger for PIM Dense than for the
other protocols in all of the networks. This difference is due to periodic flooding, which almost
doubled the number of data packets counted in some of our simulations. Examination of individual
simulation results shows that the network resource usage of PIM Dense approaches that of CBT as
the network becomes more densely populated with multicast group members. This result was
expected because PIM Dense was designed for a dense membership environment.
Figure 10. Network Resource Usage relative to CBT. (Bar chart of the PIM/CBT ratio, vertical axis from 0.00 to 1.60, for PIM-ShT, PIM-SBT, PIM-D, and CBT on the AAI, Mesh, and S-Mesh networks.)
4.3 Overhead Traffic Percentage
Overhead messages are used to exchange IGMP information and to set up, maintain, and tear
down routing trees. We measured the number of each type of packet created during each simulation
-- including overhead packets -- and the number of hops that the packets traveled. From these val-
ues, we computed the percentage of bits transmitted during a simulation that were contained in
overhead messages. These percentages are shown in Figure 11 for each protocol and each network.
Figure 11 shows that CBT has the lowest overhead percentage of the protocols examined,
approximately 0.3%. PIM-ShT also has a low overhead percentage, only 5% greater than the per-
centage measured for CBT. The extra overhead bits make up the packets that set up source-based
trees between each sender and the RP.
More complicated source-based trees are set up in PIM-SBT. In addition, each time a host asks
to join (or leave) a multicast group, overhead messages must be sent to each sender to the group to
set up (or tear down) a branch of the multicast tree for that sender. (Only one tree needs to be
updated in CBT or PIM-ShT.) These factors double the percentage of overhead bits measured for
PIM-SBT compared to CBT or PIM-ShT.
Figure 11. Percentage of Bits Contained in Overhead Messages. (Bar chart of overhead %, vertical axis from 0.00 to 5.00, for PIM-ShT, PIM-SBT, PIM-D, and CBT on the AAI, Mesh, and S-Mesh networks.)
Due to prunes that follow flooding, the percentage of overhead bits for PIM Dense is more than
double the percentage for PIM-SBT. In some simulations, the percentage of overhead messages
was measured to be almost 5%. If data packets that are flooded but do not reach a multicast group
member are considered overhead, the performance of PIM Dense with respect to this metric
becomes much worse. In one simulation where each group had few members spread throughout
the network, 43% of all network traffic (measured in bits) was overhead or unwanted flooded data.
There are more paths to flood data in the highly connected Mesh Networks; thus, the overhead per-
centage was greater for these networks.
The number of overhead messages required to maintain the routing trees for an active group
depends on the number of members joining and leaving the group, not on the amount of traffic sent
to that group. Thus, the overhead percentages reported would be smaller for higher levels of traffic.
Because the absolute percentages are related to the level of traffic in the network, the numbers dis-
cussed here should only be used for comparison.
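The dependence on traffic level can be written out explicitly. The symbols $B_o$ (overhead bits), $B_d$ (data bits), and the scale factor $k$ are introduced here only for illustration, and the sketch takes the premise above at face value, namely that the overhead bits do not change when the data traffic is scaled.

$$P = 100 \cdot \frac{B_o}{B_o + B_d}, \qquad P_k = 100 \cdot \frac{B_o}{B_o + k \cdot B_d} < P \quad \text{for } k > 1$$

When the overhead is a small share of the total, $P_k \approx P / k$, so doubling the data traffic roughly halves the reported percentage while leaving the ranking of the protocols unchanged.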
4.4 Traffic Concentration
The ratio of the largest average throughput measured on a network link to the average through-
put measured for all links, which we call the Traffic Concentration metric, can be used to examine
the distribution of traffic on the links in the network. The shared tree protocols CBT and PIM-ShT
are expected to concentrate traffic onto the subset of the network links that compose the shared
trees. The source-based tree protocols PIM Dense and PIM-SBT are expected to distribute the traf-
fic more evenly among all links because they use a different tree for each sender and each group.
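The metric itself is simple to compute from per-link throughput measurements. The link names and throughput values below are hypothetical.

```python
def traffic_concentration(link_throughput):
    """Ratio of the maximum link throughput to the mean over all links."""
    values = list(link_throughput.values())
    return max(values) / (sum(values) / len(values))

# Hypothetical average throughputs (same units on every link).
links = {"a": 4.0, "b": 1.0, "c": 1.0}
print(traffic_concentration(links))
```

A value of 1 means every link carries the same load; larger values indicate that some link carries a disproportionate share of the traffic.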
In the Mesh and Stressed Mesh Networks, CBT and PIM-ShT have similar Traffic Concentra-
tion Metrics, as shown in Figure 12. The metric for PIM-SBT is 17% less. PIM Dense distributes
traffic most evenly among the links. Recall, however, that the overall traffic level of PIM Dense
(the denominator of the ratio) is much higher.
The results shown in Figure 12 for the AAI Network should not be considered representative
because the link between the AAI/MAGIC network and the ATDnet is a bottleneck link for
each protocol. Every group that has a Core, RP, or member on the ATDnet and one on the
AAI/MAGIC network must include this link in its tree.
Only four nodes are used as Core and RP nodes in our simulations. A copy of each multicast
data packet in CBT and PIM-ShT always passes through one of these nodes, causing a concentration
of traffic on the links leading to them. SBT protocols build trees rooted at each sender for every
group so that traffic is not constrained to pass through a small number of nodes. This result gives
insight into a way that the locations of the Cores and RPs could be chosen if traffic concentration
is a potential problem: allow each node to serve as the Core or RP for only a few multicast
groups. The difference in the traffic concentration metric between the shared tree and source-based
tree protocols should then be reduced.
Figure 12. Traffic Concentration Metric. (Bar chart of the maximum-to-mean throughput ratio, vertical axis from 0.00 to 5.00, for PIM-ShT, PIM-SBT, PIM-D, and CBT on the AAI, Mesh, and S-Mesh networks.)
4.5 Implementation Issues
The OPNET model of the AAI Network was built to study the performance of a physical net-
work that will be used for large-scale, military simulation exercises. (The physical network will
have more complicated site topologies with hundreds of hosts.) These exercises are expected to be
large; the STOW-97 (Synthetic Theatre of War) simulation may have 20,000 groups with several
hundred members in each group. Many of these members will send to the group. The intricacy of
the protocol, operating system overhead, and routing table size become especially important issues
when the numbers of senders and groups grow to this size because router speed and memory
requirements are affected.
This section analyzes the size of the routing tables needed for each protocol as a function of the
number of groups and senders per group. Each routing table entry initiates at least one timer. The
effect on the operating system of these timers is also considered.
4.5.1 Routing Table Size
Both PIM and CBT require that each multicast router maintain a table of multicast routing
information. We examine the size of the tables using the notation in Table 3.
Table 3: Parameters Used to Analyze the Size of the Routing Tables for CBT and PIM
Variable | Meaning
T(*) | Average number of routing table entries in a router for protocol *.
N | Number of multicast groups active in the network.
n | Average number of senders to a group.
r | Total number of routers in the network.
$\alpha_C$ | Average percentage of the routers that are on a shared tree. Also, the average percentage of the routers that are on a source-based tree for a group and a sender.
$\alpha_S$ | Average percentage of the routers that are on the source-based tree for a group and sender but not on the shared tree for that group.
d | Average number of hops from a host to a RP.
p | Average number of interfaces at a router.
$\beta$ | Fraction of groups moving to SBT in PIM Sparse.
$t_f$ | Length of timer that initiates PIM Dense flooding.
$t_r$ | Length of timer that initiates the removal of inactive PIM Dense entries ($t_r < t_f$).
CBT sets up only one tree for each group. The average number of entries that a router maintains
is the likelihood $\alpha_C$ that the router will be on a shared tree times the total number N of shared trees.
$$T(\mathrm{CBT}) = N \cdot \alpha_C \tag{1}$$
The PIM Dense protocol sets up and maintains a source-based tree for every sender and every
group. Due to the flooding of the data packets, entries are set up throughout the network. Thus,
routing table entries for a group are maintained in portions of the network where no hosts are
members of that group and no branches of the tree are needed. These entries are removed after a period
of disuse, but re-flooding replaces them.
The average number of entries in each PIM Dense router is approximated by the following
expression.
$$T(\text{PIM Dense}) \cong N \cdot n \cdot \left[ \alpha_C + (1 - \alpha_C) \cdot \frac{t_r}{t_f} \right] \tag{2}$$
A lower bound for this expression is given below.
$$T(\text{PIM Dense}) \geq N \cdot n \cdot \alpha_C \tag{3}$$
For comparison, we compute a lower bound on the ratio of the average number of routing table
entries needed to implement PIM Dense to the number needed for CBT. We find that the routing tables
in routers running PIM Dense are at least n times larger than those running CBT, on average.
$$\frac{T(\text{PIM Dense})}{T(\mathrm{CBT})} \geq \frac{N \cdot n \cdot \alpha_C}{N \cdot \alpha_C} = n \tag{4}$$
In our simulations, the ratio was even larger. When $t_r / t_f = 1/2$, routing table entries existed
at most routers for each active sender of every group plus some out-of-date senders whose entries
had not yet timed out.
PIM-ShT routes packets along shared trees as CBT does (contributing the first term of
Equation 5). PIM-ShT also sets up a source-based tree between each source S and the RP. The
average number of entries that a router maintains for each of these source-based trees is the
average number of hops d from a host to the RP divided by the number of routers r in the network.
$$T(\text{PIM-ShT}) = N \cdot \alpha_C + N \cdot n \cdot \frac{d}{r} \tag{5}$$
A lower bound for this expression can be provided if every member is a sender. Every router
that is on the shared tree must lie on the shortest path between a group member and the RP
according to the rules for tree construction. Therefore, every router on the shared tree must be on the
source-based tree that connects a sender (the group member) to the RP and must have a
source-based tree entry.
$$T(\text{PIM-ShT}) \geq 2 \cdot N \cdot \alpha_C = 2 \cdot T(\mathrm{CBT}) \tag{6}$$
$$\frac{T(\text{PIM-ShT})}{T(\mathrm{CBT})} \geq 2 \tag{7}$$
The PIM routing tables will always be at least twice the size of the CBT routing tables and
sometimes much larger for the same environment. These bounds are conservative for network
topologies where the average number of senders to a group is greater than the number of network
interfaces at the RP. Also, an entry is required for each source in the group at the RP.
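The expressions for CBT, PIM Dense, and PIM-ShT (Equations 1, 2, and 5) can be evaluated with a few lines of code. The parameter values below are hypothetical and were chosen only to exercise the formulas; the sketch also assumes that the tree-membership parameter is supplied as a fraction between 0 and 1 rather than as a percentage.

```python
def t_cbt(N, alpha_c):
    # Equation 1: one shared-tree entry per group, at a fraction alpha_c of routers.
    return N * alpha_c

def t_pim_dense(N, n, alpha_c, t_r, t_f):
    # Equation 2: on-tree routers always hold an entry; off-tree routers hold
    # one for a fraction t_r / t_f of the time because of periodic flooding.
    return N * n * (alpha_c + (1 - alpha_c) * t_r / t_f)

def t_pim_sht(N, n, alpha_c, d, r):
    # Equation 5: shared-tree entries plus entries for each sender-to-RP tree.
    return N * alpha_c + N * n * d / r

# Hypothetical values, not taken from the simulations.
N, n, alpha_c, d, r = 100, 5, 0.2, 4, 50
cbt = t_cbt(N, alpha_c)
dense = t_pim_dense(N, n, alpha_c, t_r=1.0, t_f=2.0)
sht = t_pim_sht(N, n, alpha_c, d, r)
print(cbt, dense, sht)
```

For these values the three averages are about 20, 300, and 60 entries per router, which is consistent with the lower bounds of n times and twice the CBT table size.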
PIM-SBT sets up and maintains many trees for each multicast group G in the network, including
• one source-based tree for every sender S, called the (S,G) tree, and
• one shared tree, called the (*,G) tree.
Additional routing table entries called (S,G,RP) entries stop traffic that is delivered on source-
based trees from being forwarded on shared trees. See reference [Dee95] for a more detailed
description of these routing table entries.
A router that is part of a shared tree for group G has a (*,G) entry. If the router is on a
source-based tree for a sender S, it also has an (S,G) entry for sender S and group G. If the router
is not on the source-based tree for sender S, it has an (S,G,RP) entry. Thus, each router on the
shared tree for group G has another entry -- either an (S,G) or an (S,G,RP) entry -- for each sender to group G.
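The entry rules just described can be made concrete with a small sketch. The function name and the tuple encoding of entries below are illustrative assumptions, not definitions taken from [Dee95]; the sketch only lists the entries that one router on the shared tree for group G would hold.

```python
def entries_for_shared_tree_router(group, senders, on_source_tree):
    """List the routing entries held by one router on the shared tree for `group`.

    `senders` is the list of senders to the group. `on_source_tree` is the set of
    senders whose source-based tree passes through this router. The tuple encoding
    of the entries is a hypothetical illustration.
    """
    entries = [("*", group)]  # the (*,G) shared-tree entry
    for sender in senders:
        if sender in on_source_tree:
            entries.append((sender, group))        # (S,G) entry
        else:
            entries.append((sender, group, "RP"))  # (S,G,RP) entry
    return entries


if __name__ == "__main__":
    for entry in entries_for_shared_tree_router("G", ["S1", "S2", "S3"], {"S2"}):
        print(entry)
```

For a group with n senders the router holds n + 1 entries regardless of how many source-based trees pass through it, which is where the (n + 1) factor on the shared-tree term of Equation 8 comes from.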
$$T(\text{PIM-SBT}) = N \cdot [(n + 1) \cdot \alpha_C + n \cdot \alpha_S] \qquad (8)$$
If only some of the groups use source-based trees, the average number of routing table entries
is given by the following expression.
$$T(\text{PIM-SBT}) = N \cdot [(\beta \cdot n + 1) \cdot \alpha_C + \beta \cdot n \cdot \alpha_S] + (1 - \beta) \cdot N \cdot n \cdot d/r \qquad (9)$$
If all PIM traffic is delivered along the shortest path ($\beta = 1$), the ratio of the number of routing
table entries for PIM-SBT to the number for CBT is given by Equation 10. The parameter $\rho$ is
defined to be $\rho = \alpha_S / \alpha_C$.
$$\frac{T(\text{PIM-SBT})}{T(\text{CBT})} = \frac{N \cdot [(n + 1) \cdot \alpha_C + n \cdot \alpha_S]}{N \cdot \alpha_C} = 1 + n \cdot (\rho + 1) \ge n + 1 \qquad (10)$$
On average, implementing PIM-SBT in a network will require each router to maintain at least
n+1 times as many routing table entries as implementing CBT in the network.
4.5.2 Timer Requirements
According to our interpretation of the PIM protocol specification, several timers are required
for PIM. Both PIM Dense and PIM Sparse have two types of timers that are kept at each router.
• One timer is maintained for each routing table entry at the router. This timer is reset each
time the entry is used. If the entry is not used for 3 minutes, the entry is deleted. The length
of these timers has been reduced to 30 seconds in our simulations.
• For each routing table entry, one timer is kept for each output interface. In PIM Dense, when
one of these timers expires for a pruned interface, the interface for which it was set is added to
the entry’s output interface list. In PIM Sparse, these timers are updated when routing information
is exchanged by neighboring routers.
The average number of timers at a router is $(1 + p) \cdot T(\text{protocol})$. Complexity added by
requiring that many PIM routing entries be maintained is compounded by requiring timers for each
entry.
4.5.3 Implications of Complexity on Speed
As we discussed in Section 4.5, the STOW-97 exercises may have 20,000 multicast groups, each
with several hundred members. As our analysis in the previous section shows, the number of routing
table entries and the number of timers required to support an exercise of this size are extremely
large.
To illustrate how the average size of the routing tables scales with the number of groups and
members, consider an example where all multicast data is delivered on SBTs ($\beta = 1$). (PIM
Sparse routers currently being built support only the source-based tree mode.) Let $\alpha_C = 0.5$,
$\alpha_S = 0.25$, and $d/r = 0.05$. These numbers are consistent with our observations of the AAI
Network model. Let $t_r/t_f = 1/3$. Table 4 shows the growth of the routing table size with the growth
in number of groups, N, and the number of members per group, n.
If PIM-SBT or PIM Dense is used for the large number of senders and groups expected in
STOW-97, each router will be forced to maintain a routing table whose average size may be several
million entries. Routers near bottleneck and RP locations are likely to lie on many shared trees and
require many more routing table entries. PIM-ShT and CBT, which deliver traffic using shared
trees, require fewer routing table entries on average. For comparison, internet routers maintained
routing tables for 30,000 routes at last check.
The maximum number of routes that can be supported in a network is a function of memory;
thus, the amount of memory needed to store routing information cannot be overlooked when
planning large simulations.
Table 4: Average Number of Routing Table Entries Required at Each Router for $\beta = 1$, $\alpha_C = 0.5$, $\alpha_S = 0.25$, $t_r/t_f = 1/3$, and $d/r = 0.05$.
| Protocol / Number of Groups and Senders per Group | N=1000, n=10 | N=1000, n=200 | N=20,000, n=10 | N=20,000, n=200 |
|---|---|---|---|---|
| CBT | 500 | 500 | 10,000 | 10,000 |
| PIM Dense | 6,700 | 133,000 | 133,000 | 2,670,000 |
| PIM-SBT | 8,000 | 150,500 | 160,000 | 3,010,000 |
| PIM-ShT | 1,000 | 10,500 | 20,000 | 210,000 |
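The expressions of Section 4.5.1 can be evaluated directly to see how the table sizes scale. The sketch below is a minimal illustration: the function names are ours, T(CBT) = N·α_C is taken from the form used in Equation 6, β is read as the fraction of groups using source-based trees, and for PIM Dense only the lower bound of Equation 3 is computed, because the timer-dependent expression is not restated in this section.

```python
def t_cbt(num_groups, alpha_c):
    """Average CBT entries per router: N * alpha_C (the form used in Equation 6)."""
    return num_groups * alpha_c


def t_pim_dense_lower_bound(num_groups, senders, alpha_c):
    """Lower bound on PIM Dense entries per router (Equation 3)."""
    return num_groups * senders * alpha_c


def t_pim_sparse(num_groups, senders, alpha_c, alpha_s, d_over_r, beta):
    """Average PIM Sparse entries per router (Equation 9).

    beta = 1 reduces to PIM-SBT (Equation 8); beta = 0 reduces to PIM-ShT (Equation 5).
    """
    tree_entries = num_groups * ((beta * senders + 1) * alpha_c + beta * senders * alpha_s)
    sender_to_rp_entries = (1 - beta) * num_groups * senders * d_over_r
    return tree_entries + sender_to_rp_entries


if __name__ == "__main__":
    ALPHA_C, ALPHA_S, D_OVER_R = 0.5, 0.25, 0.05  # parameter values used for Table 4
    for N, n in [(1000, 10), (1000, 200), (20000, 10), (20000, 200)]:
        cbt = t_cbt(N, ALPHA_C)
        sbt = t_pim_sparse(N, n, ALPHA_C, ALPHA_S, D_OVER_R, beta=1)
        sht = t_pim_sparse(N, n, ALPHA_C, ALPHA_S, D_OVER_R, beta=0)
        dense = t_pim_dense_lower_bound(N, n, ALPHA_C)
        print(f"N={N} n={n}: CBT={cbt:.0f} PIM-SBT={sbt:.0f} "
              f"PIM-ShT={sht:.0f} PIM Dense>={dense:.0f}")
```

Writing Equation 9 as a single function makes the two limiting cases explicit and shows that only the CBT count is independent of the number of senders per group.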
As with any routing algorithm that relies on a table of this sort, a larger table leads to slower
performance. Our simulations do not model the time required to look up the route; the packet for-
warding time depends only on the size of the packet. A more complete model would consider the
delay added by searching the routing table. We expect increased end-to-end delay for packets
delivered by PIM.
Each routing table entry set up by PIM initiates at least one timer. Because the operating system
overhead impacts the performance of a router, the effect of several million timers on the operating
system must be addressed. An approach used by router manufacturers to minimize the impact of
the timers is to aggregate the timers in the routing daemons into a single timer. One effect of this
aggregation on performance can be seen in the measured join time.
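To illustrate the aggregation idea, the sketch below replaces one timer per routing table entry with a single periodic sweep over the table. It is a minimal illustration under our own assumptions: the class name, the sweep interval, and the use of explicit timestamps are hypothetical, and the code is neither the mechanism of any particular router nor the one used in our OPNET model.

```python
class SweptRoutingTable:
    """Routing entries expired by one periodic sweep instead of per-entry timers."""

    def __init__(self, lifetime, sweep_interval):
        self.lifetime = lifetime              # seconds an unused entry may live
        self.sweep_interval = sweep_interval  # period of the single shared timer
        self.last_used = {}                   # entry -> time of last use
        self.next_sweep = sweep_interval

    def touch(self, entry, now):
        """Record that `entry` was created or used at time `now`."""
        self.last_used[entry] = now

    def advance(self, now):
        """Run every sweep due up to `now`; return (entry, removal_time) pairs."""
        removed = []
        while self.next_sweep <= now:
            sweep_time = self.next_sweep
            stale = [entry for entry, used in self.last_used.items()
                     if sweep_time - used >= self.lifetime]
            for entry in stale:
                del self.last_used[entry]
                removed.append((entry, sweep_time))
            self.next_sweep += self.sweep_interval
        return removed


if __name__ == "__main__":
    table = SweptRoutingTable(lifetime=30, sweep_interval=30)
    table.touch(("S", "G"), now=1)
    print(table.advance(now=30))  # the entry is 29 s old, so it survives this sweep
    print(table.advance(now=60))  # the entry is removed at the second sweep
```

With a 30 second lifetime and a 30 second sweep, an entry last used at 1 s is removed only at 60 s. A single timer therefore trades operating system overhead for coarser expiry, which is one plausible way that consolidation can stretch timer-driven behavior.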
4.6 Join Time
We define join time to be the time between when a host asks to join a given multicast group
and the time it receives a packet addressed to that group. The mean join times measured for CBT,
PIM-ShT, and PIM-SBT in our simulations were each 230 ms. This value was highly dependent
on the traffic levels. On average, 4.5 packets are created and sent for each group each second;
thus, most of the measured join time is spent waiting for a packet to be created and sent to the
group. If the traffic level in the network were higher, the join times would be smaller. We defined
join time in this way to be consistent with measurements being taken on real networks and to be
consistent among protocols.
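As a rough check on this explanation, suppose that packets for a group are generated as a Poisson process at 4.5 packets per second. This is an assumption made only for the sketch below; the traffic model is not restated in this section. The expected wait for the next packet is then 1/4.5 s, or about 222 ms, which would account for most of the 230 ms mean join time.

```python
import random


def mean_wait_ms(packets_per_second):
    """Expected wait (ms) until the next packet, assuming Poisson arrivals."""
    return 1000.0 / packets_per_second


def simulated_mean_wait_ms(packets_per_second, joins=100_000, seed=1):
    """Monte Carlo estimate of the same wait.

    For Poisson arrivals the wait seen by a host joining at an arbitrary time is
    exponentially distributed, so each join draws one exponential sample.
    """
    rng = random.Random(seed)
    total = sum(rng.expovariate(packets_per_second) for _ in range(joins))
    return 1000.0 * total / joins


if __name__ == "__main__":
    analytic = mean_wait_ms(4.5)
    print(f"analytic wait:  {analytic:.1f} ms")
    print(f"simulated wait: {simulated_mean_wait_ms(4.5):.1f} ms")
    print(f"remainder of the 230 ms join time: {230 - analytic:.1f} ms")
```

Under a different arrival process the expected wait changes (strictly periodic traffic would halve it), so the figure should be read as an order-of-magnitude check rather than a model of the simulated traffic.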
PIM Dense join times are much higher than those of the other protocols for every network. The
mean join time for PIM Dense was 1.2 seconds with individual measurements as high as 40 sec-
onds. We have traced the long join times to the behavior of our implementation of the timers that
initiate flooding of data packets.
We cannot model the number of timers required by the PIM specification in OPNET in reason-
able periods of time. We chose to consolidate the timers in each router. This deviation from the
specification is not unreasonable; as we noted earlier, some router manufacturers consolidate tim-
ers in the routing daemons to minimize the impact of the timers on the operating system. However,
consolidating timers did introduce long join time delays (several times the length of the timers) in
both PIM Sparse and PIM Dense in initial simulations. We deactivated the timers in PIM Sparse
in our final simulations, leaving outdated routing table entries in the routers.
5. Suggestions for PIM Sparse
In this section, we note some features of PIM Sparse that could be altered to make the protocol
more attractive for the DIS community and possibly the wider multicast community. We are not
attempting to define a new protocol with all of the details required by a real protocol. We are just
offering some suggestions and thoughts as to how PIM might be improved.
The issues we address arise from the complex features instituted to achieve the laudable objec-
tive of being able to offer either a source-based packet delivery mode or a shared-tree mode.
When using PIM-SBT, the receiver initially receives data packets from the source S for group
G over the shared tree. Then, based upon some unspecified criteria, each receiver may send a
request to the sender S to join the source-based tree. After receiving data on the source-based tree
from S, the receiver sends a prune message up the shared tree to terminate data delivery of packets
from S along the shared tree. This prune request causes complicated routing table entries to be set
up to stop delivery of only the data from S along the shared tree.
We recommend that the decision as to whether the source-based tree mode or shared tree mode
be used be made by the group initiator rather than individually by each receiver, as the
specification [Dee95] currently defines. We do not see great advantages in allowing the receivers to control
this choice, and it leads to considerable complexity. In almost all cases, the application and the
nature of the group being initiated indicate which type of tree might be optimal. A videoconference
application with a single sender would use a source-based tree; large applications such as
distributed simulation would use shared trees.
The approach we recommend is described briefly below.
1) The group initiator registers the group at the RP established for that group and specifies whether
the traffic should be delivered using shared trees or source-based trees.
2) For shared tree groups, requests to join a group are handled in a straightforward manner much
like CBT or PIM-ShT. A shared tree that is a bidirectional graph (unlike the directed graph used
currently) is constructed. Senders can use this tree immediately; there is no need to require them
to register at the RP or to have source-based routes established from S to the RP. Thus, there is
no need to have source-based routing table entries for groups using the shared tree mode.
3) Operation of source-based tree groups could follow these steps.
• A receiver that joins a group that uses source-based trees sends a join message to the RP to
become part of a unidirectional tree for the group rooted at the RP. (This tree is used only for
overhead messages.) The RP sends the current list of senders for group G back to the receiver.
• New sources register with the RP. The RP multicasts the arrival of the new source to all
members of group G along the tree rooted at the RP. The RP may periodically transmit a list of
senders to the group to ensure that the receiver knows about all current senders.
• The receiver requests to join the source-based trees of all senders on the list of senders. Note
that the receiver communicates its intention to join group G and learns of a source S over the
shared tree. Traffic is received from S only over the source-based tree. Since data is not
forwarded on the tree based at the RP, no (S, G, RP) entries are needed.
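The bookkeeping that these steps imply at the RP can be sketched as follows. This is a toy model under our own assumptions: the class, method names, and return values are hypothetical, message delivery is reduced to return values, and nothing here is part of the PIM specification.

```python
class RendezvousPoint:
    """Toy model of the RP state in the suggested scheme (hypothetical API)."""

    SHARED = "shared"
    SOURCE_BASED = "source-based"

    def __init__(self):
        self.mode = {}       # group -> delivery mode chosen by the group initiator
        self.receivers = {}  # group -> receivers on the tree rooted at the RP
        self.senders = {}    # group -> registered senders (source-based groups only)

    def register_group(self, group, mode):
        """Step 1: the initiator registers the group and fixes its delivery mode."""
        if mode not in (self.SHARED, self.SOURCE_BASED):
            raise ValueError(f"unknown mode: {mode}")
        self.mode[group] = mode
        self.receivers[group] = set()
        self.senders[group] = set()

    def join(self, group, receiver):
        """Steps 2 and 3: add a receiver; return the sender list it must join."""
        self.receivers[group].add(receiver)
        if self.mode[group] == self.SOURCE_BASED:
            return sorted(self.senders[group])
        return []  # shared tree groups receive data on the bidirectional tree

    def register_source(self, group, source):
        """Step 3: record a new source; return the receivers to be notified."""
        if self.mode[group] == self.SHARED:
            return []  # senders use the shared tree immediately, no registration
        self.senders[group].add(source)
        return sorted(self.receivers[group])


if __name__ == "__main__":
    rp = RendezvousPoint()
    rp.register_group("G", RendezvousPoint.SOURCE_BASED)
    print(rp.join("G", "r1"))             # no senders are registered yet
    print(rp.register_source("G", "s1"))  # r1 is told about s1
    print(rp.join("G", "r2"))             # r2 learns of s1 when it joins
```

In this model the RP keeps a sender list only for source-based groups and never forwards data for them, which reflects why neither source-to-RP routes nor (S, G, RP) entries are required.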
These modifications to the PIM protocol would keep the advantageous properties of the current
PIM specification. One disadvantage is that the join time when using source-based trees would be
longer because the receiver would not receive data from a sender until it joined the sender’s
source-based tree.
The advantages of implementing these suggested changes to PIM are that the intricacy of the
protocol is greatly reduced, the routing table sizes are always reduced (greatly reduced in some
cases) and the number of timers required is reduced. For PIM-ShT, the table sizes and the number
of timers are reduced at least by a factor of 2. The reduction is probably a great deal larger for appli-
cations with a large number of senders per group such as STOW-97. In PIM-SBT, no (S,G,RP)
entries are needed along the shared trees since no traffic will be routed on these trees. This again
leads to a savings in routing table size and number of timers.
6. Conclusions
This paper compares the performance of the PIM and CBT protocols with respect to the metrics
outlined in Section 4. These results are summarized in Table 5. PIM-SBT and PIM Dense have
slightly lower end-to-end delays than CBT and PIM-ShT, but the absolute delays are all very small.
Network resource usage is similar for all of the protocols except PIM Dense, which periodically
floods data on the network. Traffic concentration is observed in CBT and PIM-ShT, but does not
degrade performance significantly.
Table 5: Comparison of Multicast Routing Protocols.
| Metric | CBT | PIM-ShT | PIM-SBT | PIM-Dense |
|---|---|---|---|---|
| End-to-End Delay | Low | Low | Low | Low |
| Network Resource Usage | Moderate | Moderate | Moderate | High in sparse group environment |
| Overhead Traffic Percentage | Small in the cases simulated but proportional to number of joins per second | Small in the cases simulated but proportional to number of joins per second | Small in the cases simulated but proportional to number of joins per second | High |
| Join Time | Low | Low | Low | High on average |
| Traffic Concentration | High | Highest | Low | Lowest |
| Routing Table Size | Linear with the number of groups | Proportional to the product of number of groups and mean number of senders per group | Proportional to the product of number of groups and mean number of senders per group | Proportional to the product of number of groups and mean number of senders per group |
| Implementation Difficulty | Low to Moderate | Complex | Complex | Moderate |
The size of the routing table and the impact of the timers on the operating system overhead may
become a major factor. Operation of PIM-SBT or PIM Dense for a large number of members and
groups requires each router to maintain large routing tables. For example, the average number of
routing table entries at a single router in the STOW-97 exercises is expected to be several million
entries. Routers near bottleneck locations are likely to lie on many shared trees and require many
more routing table entries. Significant delay may be added each time one of these large tables is
accessed.
On average, PIM-ShT routers have fewer routing table entries and fewer timers than the
source-based tree protocols. CBT routers have even fewer. Based on these observations, we argue
that with current technology CBT is the best suited multicast protocol for environments with a large
number of groups, each with many senders.
Acknowledgments
We would like to thank Dr. Stuart Milner of DARPA for project guidance. We would also like
to thank the anonymous reviewers for their valuable comments and suggested improvements.
References
[Bal94] T. Ballardie, Core Based Tree (CBT) Multicast: Architectural Overview and Specification, Internet Draft RFC, July 1994.
[Bil95] Billhartz, T., Cain, J., Farrey-Goudreau, E., Fieg, D., Batsell, S., “Simulation Comparison of CBT and PIM Multicasting for Distributed Interactive Simulation (DIS),” Proceedings of the 1996 Society for Computer Simulation Western Multiconference: Communications Networks Modeling and Simulation Conference, pp. 246-251, January 14-17, 1996.
[Bil96] Billhartz, T., Cain, J., Farrey-Goudreau, E., Fieg, D., Batsell, S., “Performance and Resource Cost Comparisons for the CBT and PIM Multicast Routing Protocols in DIS Environments,” Proceedings of IEEE INFOCOM ‘96, Vol. 1, pp. 85-93, March 26-28, 1996.
[Dee89] S. Deering, Host Extensions for IP Multicasting, Request for Comments 1112, DDN Network Information Center, August 1989.
[Dee95] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu, and L. Wei, Protocol Independent Multicast (PIM): Protocol Specification, Internet Draft RFC, January 11, 1995.
[Eri94] Eriksson, H., “MBone: The Multicast Backbone,” Communications of the ACM, Volume 37, Number 8, pp. 54-60, August 1994.
[Moy94] Moy, John, “Multicast Routing Extensions for OSPF,” Communications of the ACM, Volume 37, Number 8, pp. 61-66, August 1994.
[Wai88] D. Waitzman, C. Partridge, and S. Deering, Distance Vector Multicast Routing Protocol, Request for Comments 1075, DDN Network Information Center, November 1988.
[Wei94] L. Wei and D. Estrin, “The Trade-Offs of Multicast Trees and Algorithms,” Proceedings of the 1994 International Conference on Computer Communication and Networks, September 12-14, 1994.
Authors
Tom Billhartz received the B.S. and M.S. degrees in electrical engineering from the Univer-
sity of Florida, Gainesville, in 1988 and 1990, respectively. He is currently an engineer at Harris
Corporation in Melbourne, Florida, where his work includes analysis and simulation of network
protocols, communication systems, and digital signal processing.
J. Bibb Cain (S’64 - M’69) received the B.S.E.E., M.S.E.E., and Ph.D. degrees in electrical
engineering from the University of Alabama, Tuscaloosa, in 1965, 1966, and 1969, respectively.
Since 1969, he has been with Harris Corporation in Melbourne, Florida, where he is a Senior
Scientist. During this time, he has contributed to the development of a number of different
error-correction decoder designs for block (BCH and Reed-Solomon), convolutional (with sequential,
threshold, and Viterbi decoders), and concatenated codes. In addition, he has worked
extensively in applications of coding in a variety of systems including digital communications
links such as coded spread spectrum data links, communications networks, and storage systems.
More recently, his research has also included work in the areas of development of algorithms and
protocols for computer and communications networks. This work included the development, per-
formance, evaluation, and validation of link assignments, routing, and flow control protocols. He
has published numerous papers on these subjects, and he has co-authored the textbook
Error-Correction Coding for Digital Communications (New York: Plenum, 1981). He also holds five U.S.
patents in the areas of error-correction coding and networking protocols.
Dr. Cain is a member of Tau Beta Pi, Eta Kappa Nu, Sigma Xi, Phi Eta Sigma, and Pi Mu
Epsilon.
Ellen Farrey-Goudreau (S’88 - M’94) earned M.A. and Ph.D. degrees in electrical engineer-
ing from Princeton University in 1991 and 1994, respectively. Her research at Princeton examined
properties of sampled bilinear systems. She also earned a B.S. degree in Electrical Engineering
from Washington University in St. Louis in 1989.
Since 1994, Dr. Farrey-Goudreau has been a member of the systems modeling and analysis
group in the Communications Systems Division of Harris Corporation in Melbourne, Florida. In
addition to studying the performance of multicast protocols, she has worked on adaptive control
algorithms for phased array antennas and applications of fuzzy logic to radar systems. Her interests
include adaptive control and analysis of nonlinear systems.
She is a member of Tau Beta Pi, Eta Kappa Nu, and Pi Mu Epsilon.
Doug Fieg received his B.S. and M.S. degrees in Industrial Engineering from the University
of Missouri at Columbia in 1976 and 1977, respectively. He is currently a systems engineer at Har-
ris Corporation in Melbourne, Florida.
His expertise and experience are in the areas of discrete-event simulation, queuing theory,
mathematical optimization, and probability and statistics. He has applied this expertise to a variety
of applications, ranging from the analysis of network protocols and communications systems to
inventory control and other business applications.
Steven Gordon Batsell received a B.S. in physics and mathematics from Kansas State
University, an M.S.E.E. from the University of Illinois, Urbana-Champaign, and the Ph.D. in
electrical engineering from Texas Tech University.
He is currently the Network Research Group Leader at Oak Ridge National Laboratory where
he is involved in high speed networking. He spent the previous nineteen years in military commu-
nication system engineering, most recently at the Naval Research Laboratory in Washington, DC,
where he worked on the DARPA STOW program.
His current interests are in high-speed network design, network performance issues, and dis-
crete-event simulation of communication networks. He is a member of the IEEE Communications
Society, SIGCOMM, the IEEE Computer Society, Sigma Xi, Sigma Pi Sigma, and Phi Kappa Phi.