Presented by: Bill Nickless
Best Current Practices for IPv4 Multicast Deployment
Bill Nickless
[email protected]
http://www.mcs.anl.gov/home/nickless
What is Multicast?
• A multicast sender simply sends its data, and intervening routers "conspire" to get the data to all interested listeners. (S. Deering)
• Destination of IP multicast packets is a “Group” address, within 224.0.0.0/4.
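To make the 224.0.0.0/4 range concrete, here is a minimal sketch that tests whether an address is a multicast group address. The helper name is_group_address is only illustrative; the addresses are taken from examples elsewhere in these slides.

```python
import ipaddress

# 224.0.0.0/4 is the IPv4 multicast ("Group") range.
MULTICAST_RANGE = ipaddress.ip_network("224.0.0.0/4")


def is_group_address(addr: str) -> bool:
    """Return True if addr falls inside 224.0.0.0/4."""
    return ipaddress.ip_address(addr) in MULTICAST_RANGE


if __name__ == "__main__":
    for addr in ("224.2.177.155", "233.2.171.1", "140.221.11.103"):
        print(addr, is_group_address(addr))
```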
Notation
• Specific source address(es): S
• Specific group address(es): G
• Traffic from a specific source to a group: (S,G)
• Traffic from all sources to a group: (*,G)
• Rendezvous Point: RP
Any Source Multicast
• Senders send multicast group-addressed packets.
• Receivers register their interest in groups by way of IGMPv2 (*,G) Joins.
• Network keeps track of all senders for each group, and delivers packets from all senders to each interested Receiver.
Source Specific Multicast
• Senders send multicast group-addressed packets.
• Receivers register their interest in specific sources sending to specific groups by way of IGMPv3 (S,G) Joins (well, group membership reports...).
• Receivers are responsible for specifying which Senders’ traffic they want to receive.
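The difference between the two service models can be sketched as a toy delivery check. This is not a protocol implementation: a receiver's interest is modeled as a set of (source, group) pairs, with None standing in for the "*" of a (*,G) join. The function name wants and the second source address are assumptions made for illustration.

```python
from typing import Optional, Set, Tuple

# A subscription is (source, group); source None stands for "*".
Subscription = Tuple[Optional[str], str]


def wants(subscriptions: Set[Subscription], source: str, group: str) -> bool:
    """True if a receiver with these subscriptions should get this packet."""
    # (*,G) matches every sender to the group; (S,G) matches one sender only.
    return (None, group) in subscriptions or (source, group) in subscriptions


if __name__ == "__main__":
    asm_receiver = {(None, "233.2.171.1")}            # IGMPv2-style (*,G)
    ssm_receiver = {("140.221.34.1", "233.2.171.1")}  # IGMPv3-style (S,G)
    for src in ("140.221.34.1", "128.163.3.214"):
        print(src, wants(asm_receiver, src, "233.2.171.1"),
              wants(ssm_receiver, src, "233.2.171.1"))
```

With Any Source Multicast the network has to learn about every sender; with Source Specific Multicast the receiver carries that knowledge itself.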
Reachability
NOT DEFINED BY INTERNET STANDARDS
Reachability (Where To?)
• NOT DEFINED BY INTERNET STANDARDS
• Unicast reachability is interpreted by implementation and practice as: Send me IP packets with destination addresses that match this advertisement.
• Think ‘show ip route’
Reachability (Whence?)
• NOT DEFINED BY INTERNET STANDARDS
• Multicast reachability is interpreted by implementation and practice as: Here’s where to get IP packets from sources that match this advertisement.
• Think ‘show ip rpf’
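The "whence" interpretation is what a Reverse Path Forwarding (RPF) check applies: a multicast packet is accepted only if it arrives on the interface the router would use to reach the packet's source. A minimal sketch follows, assuming a hypothetical two-entry table; the default route via ATM3/0.145 is invented for the example, and real routers keep far more state.

```python
import ipaddress
from typing import Dict, Optional

# Hypothetical RPF table: prefix -> interface toward sources in that prefix.
RPF_TABLE: Dict[str, str] = {
    "140.221.8.0/22": "GigabitEthernet5/0",
    "0.0.0.0/0": "ATM3/0.145",  # assumed default route, for illustration only
}


def rpf_interface(source: str, table: Dict[str, str]) -> Optional[str]:
    """Longest-prefix match on the SOURCE address, not the destination."""
    src = ipaddress.ip_address(source)
    best = None
    for prefix, iface in table.items():
        net = ipaddress.ip_network(prefix)
        if src in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


def rpf_check(source: str, arrival_iface: str,
              table: Dict[str, str] = RPF_TABLE) -> bool:
    """Accept the packet only if it arrived on the RPF interface."""
    return rpf_interface(source, table) == arrival_iface


if __name__ == "__main__":
    print(rpf_check("140.221.11.103", "GigabitEthernet5/0"))
    print(rpf_check("140.221.11.103", "ATM3/0.145"))
```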
Reachability Examples
terra% netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags Iface
140.221.11.103 0.0.0.0 255.255.255.255 UH eth0
140.221.8.0 0.0.0.0 255.255.252.0 U eth0
127.0.0.0 0.0.0.0 255.0.0.0 U lo
224.0.0.0 0.0.0.0 240.0.0.0 U eth0
0.0.0.0 140.221.11.253 0.0.0.0 UG eth0
Reachability Examples
Kiwi#show ip route 140.221.11.103
Routing entry for 140.221.8.0/22
Known via "ospf 683", distance 110, metric 1117, type intra area
Last update from 140.221.20.124 on GigabitEthernet5/0, 03:35:56 ago
Routing Descriptor Blocks:
* 140.221.20.124, from 140.221.47.6, 03:35:56 ago, via GigabitEthernet5/0
Route metric is 1117, traffic share count is 1
Reachability Examples
Kiwi#show ip rpf 140.221.11.103
RPF information for terra.mcs.anl.gov (140.221.11.103)
RPF interface: GigabitEthernet5/0
RPF neighbor: stardust-msfc-20.mcs.anl.gov (140.221.20.124)
RPF route/mask: 140.221.8.0/22
RPF type: unicast (ospf 683)
RPF recursion count: 0
Doing distance-preferred lookups across tables
The Old MBONE
• Excellent first approximation.
• Used tunnels to encapsulate multicast traffic over unicast paths.
• Routing done by user-space daemons running on general purpose Unix boxes.
• Internet Group Management Protocol (IGMP) (Think Multicast ARP)
• Pre-dates the World Wide Web (hence SDR)
Lessons Learned from MBONE
• Distance Vector Multicast Routing Protocol (DVMRP) does not scale
– Easy to create IP Multicast “amplifiers”.
– Separate tunneled routing infrastructure not aligned with modern BGP Internetworking.
• Flood & Prune does not scale
– Examples: PIM-Dense Mode, DVMRP.
– Not sensitive to available bandwidth.
– Requires downstream routers that are smart and powerful enough to send prune messages.
Applying Those Lessons
• Multicast Border Gateway Protocol.
– Provides reachability and policy control for multicast routing, just as BGP does for unicast.
• Protocol Independent Multicast (Sparse Mode)
– Listeners receive traffic only when requested.
– Forms multicast distribution trees.
• Multicast Source Discovery Protocol
– Finding active sources in other PIM Sparse Mode domains (usually other ASes).
Setting Reachability Policy: Multicast Border Gateway Protocol
• RFC 2283 adds the MP_REACH_NLRI attribute to BGP-4.
– Identifies a BGP route as unicast, multicast, or both.
• When implemented in a router, all the standard BGP machinery is available for prefix filtering, preference setting, MEDs, AS path length comparisons, etc.
• M-BGP routes can be independent of unicast BGP routes, allowing for different inter-AS unicast/multicast reachability.
Cisco M-BGP Configuration
router bgp 683
 network 130.202.0.0 nlri unicast multicast
 network 140.221.0.0 nlri unicast multicast
 neighbor 192.5.170.130 remote-as 145 nlri unicast multicast
 neighbor 192.5.170.130 description vBNS
 neighbor 192.5.170.130 soft-reconfiguration inbound
 neighbor 192.5.170.130 route-map from-vbns-lp-400 in
 neighbor 192.5.170.130 route-map to-vbns-med-10 out
Cisco M-BGP Configuration
route-map from-vbns-lp-400 permit 10
 match nlri unicast
 set local-preference 400
!
route-map from-vbns-lp-400 permit 15
 match as-path 145
 match nlri multicast
 set local-preference 400
!
route-map to-vbns-med-10 permit 10
 match ip address 50
 set metric 10
Cisco M-BGP Configuration
access-list 50 permit 140.221.0.0
access-list 50 permit 130.202.0.0
!
ip as-path access-list 145 deny _24_
ip as-path access-list 145 deny _293_
ip as-path access-list 145 deny _11537_
ip as-path access-list 145 permit .*
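To see what as-path access-list 145 lets through, the sketch below evaluates the four entries in order against an AS path string. It assumes the usual Cisco regular-expression meaning of "_" (start or end of string, space, comma, brace, or parenthesis) and translates it into Python's re syntax. The helper names are illustrative and this is not how IOS is implemented.

```python
import re
from typing import List, Tuple

# Cisco's "_" matches a delimiter: start/end of string, space, comma,
# braces, or parentheses. This is its Python regex equivalent.
_DELIM = r"(?:^|$|[ ,{}()])"

# The four entries of "ip as-path access-list 145", in order.
AS_PATH_LIST_145: List[Tuple[str, str]] = [
    ("deny", "_24_"),
    ("deny", "_293_"),
    ("deny", "_11537_"),
    ("permit", ".*"),
]


def cisco_to_python(pattern: str) -> str:
    """Translate the Cisco '_' metacharacter into a Python regex group."""
    return pattern.replace("_", _DELIM)


def permitted(as_path: str, acl: List[Tuple[str, str]] = AS_PATH_LIST_145) -> bool:
    """First matching entry decides; no match means implicit deny."""
    for action, pattern in acl:
        if re.search(cisco_to_python(pattern), as_path):
            return action == "permit"
    return False


if __name__ == "__main__":
    for path in ("145 10490 10437", "24 145 10490 10437", "11537 683"):
        print(path, permitted(path))
```

Paths that cross AS 24, 293, or 11537 fail the match, so the multicast clause of the route-map does not raise their local preference to 400.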
Juniper M-BGP Configuration
routing-options {
    rib inet.2 {
        static {
            route 141.142.0.0/16 reject;
            route 141.142.109.0/25 next-hop 141.142.11.74;
            route 141.142.109.128/25 next-hop 141.142.11.74;
            route 141.142.104.0/24 next-hop 141.142.11.74;
            route 141.142.105.0/24 next-hop 141.142.11.74;
            route 141.142.108.0/24 next-hop 141.142.11.74;
        }
    }
}
Juniper M-BGP Configuration
routing-options {
    rib-groups {
        ifrg {
            import-rib [ inet.0 inet.2 ];
        }
        mcrg {
            export-rib inet.2;
            import-rib inet.2;
        }
        igp-rg {
            export-rib inet.0;
            import-rib [ inet.0 inet.2 ];
        }
    }
}
Juniper M-BGP Configuration
protocols {
    bgp {
        group anl {
            import [ bgp-anl-accept reject-all ];
            family inet {
                any;
            }
            export [ bgp-announce-ncsa reject-all ];
            peer-as 683;
            neighbor 206.220.243.21;
        }
    }
}
Monitoring M-BGP (Cisco)
Kiwi#show ip mbgp sum
BGP router identifier 192.5.170.2, local AS number 683
MBGP table version is 324285
4121 network entries and 12621 paths using 862335 bytes of memory
Neighbor        V  AS   MsgRcvd  MsgSent  TblVer  InQ  OutQ  Up/Down  State/PfxRcd
192.5.170.130   4  145  53420    20497    324285  0    0     5d14h    346
Kiwi#show ip mbgp 128.163.3.214
MBGP routing table entry for 128.163.0.0/16, version 323761
Paths: (3 available, best #2)
  24 145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.12.123.10 from 192.12.123.10 (198.10.80.66)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
  145 10490 10437, (aggregated by 10437 128.163.55.253)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 400, valid, external, atomic-aggregate, best
  145 10490 10437, (aggregated by 10437 128.163.55.253), (received-only)
    192.5.170.130 from 192.5.170.130 (204.147.135.241)
      Origin IGP, localpref 100, valid, external, atomic-aggregate
Monitoring M-BGP (Juniper)
nickless@charlie> show bgp neighbor 206.220.243.21
Peer: 206.220.243.21+179 AS 683 Local: 206.220.243.160+1969 AS 1224
[. . .]
  NLRI advertised by peer: inet-unicast inet-multicast
  NLRI for this session: inet-unicast inet-multicast
  Peer supports Refresh capability (2)
  Table inet.0 Bit: 10006
    Active Prefixes: 13
    Received Prefixes: 13
    Suppressed due to damping: 0
  Table inet.2 Bit: 20006
    Active Prefixes: 9
    Received Prefixes: 9
    Suppressed due to damping: 0
nickless@charlie> show route table inet.2 140.221.34.1
inet.2: 5046 destinations, 5046 routes (5045 active, 0 holddown, 1 hidden)
+ = Active Route, - = Last Active, * = Both

140.221.0.0/16     *[BGP/170] 2w5d 19:24:04, MED 0, localpref 1000
                      AS path: 683 I
                    > to 206.220.243.21 via at-1/0/0.683
                    [BGP/170] 3d 04:38:22, MED 0, localpref 60
                      AS path: 11537 683 I
                    > to 141.142.11.246 via so-2/2/0.0
                    [BGP/170] 1w0d 11:18:35, localpref 60
                      AS path: 145 683 I
                    > to 141.142.11.1 via at-1/0/0.145
                    [BGP/170] 2w5d 19:23:42, localpref 60
                      AS path: 38 683 I
                    > to 192.17.8.32 via at-1/0/0.38
                    [BGP/170] 4d 05:55:21, MED 5, localpref 20
                      AS path: 2914 683 I
                    > to 192.17.8.34 via at-1/0/0.2914
PIM Sparse Mode
• RFC 2362 defines PIM Sparse Mode.
• No PIM-SM activity until:
– A host starts transmitting traffic (or)
– A host subscribes to a group.
• A Rendezvous Point (RP) is the root of the shared distribution tree for multicast traffic within a PIM Domain.
• Given enough traffic, a source-based distribution tree is created. (Enough is typically anything greater than zero).
• Inter-PIM Domain distribution trees are all source-based.
PIM Sparse Mode
Multicast Source Discovery Protocol (MSDP)
• Not yet an RFC (in Last Call stage). See http://www.ietf.org/html.charters/msdp-charter.html and ftp://ftp.ietf.org/internet-drafts/draft-ietf-msdp-spec-09.txt
• Currently only covers IPv4.
• PIM-SM RPs communicate through MSDP to find active multicast sources.
• If “interested”, the RP initiates a PIM-SM Join towards each active source.
Reachability Redux
• A BGP NLRI=Multicast route is a statement of reachability.
• Inter-domain PIM-Sparse Mode Joins follow the BGP reachability topology.
• MSDP forwarding between RPs follows the BGP reachability topology.
• Not doing MSDP where you do M-BGP means that you’ve formed an MSDP “black hole”.
Cisco PIM-SM w/ MSDP Configuration
interface ATM3/0.145 point-to-point
 description vBNS MBGP+PIM-SM+MSDP
 ip address 192.5.170.129 255.255.255.252
 ip pim border
 ip pim sparse-mode
 ip multicast ttl-threshold 32
 ip multicast boundary 10

ip msdp peer 204.147.128.141
ip msdp description 204.147.128.141 vBNS
ip msdp sa-filter in 204.147.128.141 list 111
ip msdp sa-filter out 204.147.128.141 list 111
ip msdp sa-request 204.147.128.141
ip msdp ttl-threshold 204.147.128.141 32
ip msdp cache-sa-state
access-list 10 deny 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 10 deny 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 10 deny 239.0.0.0 0.255.255.255
access-list 10 permit 224.0.0.0 15.255.255.255

access-list 111 deny ip any host 224.0.2.2 ! SUN-RPC.MCAST.NET
access-list 111 deny ip any host 224.0.1.3 ! RWHOD.MCAST.NET
access-list 111 deny ip any host 224.0.1.24 ! MICROSOFT-DS.MCAST.NET
access-list 111 deny ip any host 224.0.1.22 ! SVRLOC.MCAST.NET
access-list 111 deny ip any host 224.0.1.2 ! SGI-DOG.MCAST.NET
access-list 111 deny ip any host 224.0.1.35 ! SVRLOC-DA.MCAST.NET
access-list 111 deny ip any host 224.0.1.60 ! HP-DEVICE-DISC.MCAST.NET
access-list 111 deny ip any host 224.0.1.39 ! CISCO-RP-ANNOUNCE.MCAST.NET
access-list 111 deny ip any host 224.0.1.40 ! CISCO-RP-DISCOVERY.MCAST.NET
access-list 111 deny ip any 239.0.0.0 0.255.255.255
access-list 111 deny ip 10.0.0.0 0.255.255.255 any
access-list 111 deny ip 127.0.0.0 0.255.255.255 any
access-list 111 deny ip 172.16.0.0 0.15.255.255 any
access-list 111 deny ip 192.168.0.0 0.0.255.255 any
access-list 111 permit ip any any
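When access-list 111 is used as an MSDP SA filter, the "source" field is matched against the S and the "destination" field against the G of each Source-Active entry. The sketch below models that first-match evaluation with Cisco-style wildcard masks. It treats the final entry as "permit ip any any", which is an assumption about the intended last line, and the helper names are illustrative only.

```python
import ipaddress
from typing import List, Tuple

Match = Tuple[str, str]  # (base address, wildcard mask)
ANY: Match = ("0.0.0.0", "255.255.255.255")


def host(addr: str) -> Match:
    """'host a.b.c.d' is shorthand for a wildcard mask of 0.0.0.0."""
    return (addr, "0.0.0.0")


def wildcard_match(addr: str, match: Match) -> bool:
    """Bits set in the wildcard mask are 'don't care' bits."""
    base, wildcard = match
    care = ~int(ipaddress.ip_address(wildcard)) & 0xFFFFFFFF
    return (int(ipaddress.ip_address(addr)) & care) == \
           (int(ipaddress.ip_address(base)) & care)


DENIED_GROUPS = ["224.0.2.2", "224.0.1.3", "224.0.1.24", "224.0.1.22",
                 "224.0.1.2", "224.0.1.35", "224.0.1.60", "224.0.1.39",
                 "224.0.1.40"]

ACL_111: List[Tuple[str, Match, Match]] = (
    [("deny", ANY, host(g)) for g in DENIED_GROUPS]
    + [("deny", ANY, ("239.0.0.0", "0.255.255.255")),
       ("deny", ("10.0.0.0", "0.255.255.255"), ANY),
       ("deny", ("127.0.0.0", "0.255.255.255"), ANY),
       ("deny", ("172.16.0.0", "0.15.255.255"), ANY),
       ("deny", ("192.168.0.0", "0.0.255.255"), ANY),
       ("permit", ANY, ANY)]  # assumed to be "permit ip any any"
)


def sa_allowed(source: str, group: str) -> bool:
    """First matching entry decides; no match means implicit deny."""
    for action, src_match, grp_match in ACL_111:
        if wildcard_match(source, src_match) and wildcard_match(group, grp_match):
            return action == "permit"
    return False


if __name__ == "__main__":
    print(sa_allowed("128.197.160.27", "224.2.177.155"))
    print(sa_allowed("10.1.2.3", "224.2.177.155"))
```

The filter keeps site-local service discovery groups, administratively scoped groups (239/8), and sources with private or loopback addresses out of inter-domain SA messages.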
Juniper PIM-SM w/ MSDP Config
protocols {
    pim {
        rib-group mcrg;
        rp {
            local {
                address 141.142.12.1;
            }
        }
        interface all {
            mode sparse;
            version 2;
        }
    }
}
Juniper PIM-SM w/ MSDP Config
protocols {
    msdp {
        rib-group mcrg;
        group anl {
            /* kiwi-loop.anchor.anl.gov */
            peer 192.5.170.2 {
                local-address 141.142.12.1;
            }
        }
    }
}
Monitoring MSDP and PIM-Sparse
• Verify that MSDP session has come up with your peer:
Kiwi#show ip msdp sum
MSDP Peer Status Summary
Peer Address     AS    State  Uptime/   Reset  Peer Name
                              Downtime  Count
204.147.128.141  145   Up     1d12h     11     cs.dng.vbns.net

nickless@charlie> show msdp peer 192.5.170.2
Peer address  Local address  State        Last up/down  Peer-Group
192.5.170.2   141.142.12.1   Established  2w5d18h       anl
Monitoring MSDP and PIM-Sparse
• Verify that active sources are being discovered:
Kiwi#show ip msdp sa-cache 224.2.177.155
MSDP Source-Active Cache - 4020 entries
(128.197.160.27, 224.2.177.155), RP 204.147.128.141,
MBGP/AS 145,
03:40:18/00:05:03
[…etc]
nickless@charlie> show msdp source-active group 233.2.171.1
Group address Source address Peer address Originator Flags
233.2.171.1 140.221.34.1 141.142.11.246 192.5.170.2 Accept
192.5.170.2 192.5.170.2 Accept
192.17.8.32 192.5.170.2 Accept
204.147.128.141 192.5.170.2 Accept
Monitoring MSDP and PIM-Sparse
• Verify that you are receiving traffic from those active sources, and are forwarding:
Kiwi#show ip mroute count 224.2.177.155 128.163.3.214
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per second
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 224.2.177.155, Source count: 26, Group pkt count: 31060731
  RP-tree: Forwarding: 159/0/429/0, Other: 72/0/0
  Source: 128.163.3.214/32, Forwarding: 7089/0/480/0, Other: 6/0/0
Kiwi#show ip mroute 224.2.177.155 128.163.3.214
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT,
       M - MSDP created entry, X - Proxy Join Timer Running
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(128.163.3.214, 224.2.177.155), 03:55:28/00:03:22, flags: MT
  Incoming interface: ATM3/0.145, RPF nbr 192.5.170.130, Mbgp
  Outgoing interface list:
    ATM0/0.216, Forward/Sparse, 03:55:28/00:03:08
    ATM0/0.200, Forward/Sparse, 03:55:28/00:02:04
nickless@charlie> show multicast route group 233.2.171.1 \
source-prefix 140.221.34.1 extensive
Group Source prefix Act Pru NHid Packets IfMismatch T/O
233.2.171.1 140.221.34.1 /32 A F 68 1829657 0 355
Upstream interface: at-1/0/0.683
Session name: Static Allocations
nickless@charlie> show multicast route group 233.2.171.1 \
source-prefix 140.221.34.1 extensive
Group Source prefix Act Pru NHid Packets IfMismatch T/O
233.2.171.1 140.221.34.1 /32 A F 68 1830512 0 355
Upstream interface: at-1/0/0.683
Session name: Static Allocations
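Running the same command twice is a quick way to confirm that traffic is actually flowing: the Packets counter should rise between samples. A small sketch of that comparison follows, assuming the whitespace-separated column layout shown in the two outputs above; the function name packet_count is illustrative.

```python
def packet_count(row: str) -> int:
    """Pull the Packets column out of one 'show multicast route' data row.

    Columns: Group, Source, prefix length, Act, Pru, NHid, Packets,
    IfMismatch, T/O.
    """
    return int(row.split()[6])


first = "233.2.171.1 140.221.34.1 /32 A F 68 1829657 0 355"
second = "233.2.171.1 140.221.34.1 /32 A F 68 1830512 0 355"

delta = packet_count(second) - packet_count(first)

if __name__ == "__main__":
    print("packets forwarded between samples:", delta)
```

A delta of zero for a source known to be active suggests the (S,G) entry exists but is not carrying traffic.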
nickless@charlie> show pim join 233.2.171.1 extensive
Group Source RP Flags
[. . .]
233.2.171.1 140.221.34.1 sparse,spt-pending
    Upstream interface: at-1/0/0.683
    Upstream State: Local RP, Join to Source
    Downstream Neighbors:
        Interface: ge-1/1/0.103
            141.142.0.14 State: Join Flags: S Timeout: 182
        Interface: gr-1/2/0.0
            141.142.11.74 State: Join Flags: S Timeout: 208
Other Tips
• ATM peerings are best done with point-to-point subinterfaces. (What’s a Designated Router in the context of an ATM exchange point, anyway?)
• MSDP Source Actives are made from PIM Register messages. If you’re not sending MSDP SA messages for a source, you may have a problem with the Designated Router for that source.
More Tips
• MSDP encapsulates data in its Source Active messages (just as the data was encapsulated in the PIM Sparse Mode Register messages). This was done primarily to support SDR.
• It is possible for MSDP to work while PIM-SM is not working, so you can’t always count on SDR to verify multicast routing.
Debugging Multicast
• You must have:
– at least one constantly active source
– at least one constantly active receiver
• Start near the receiver
– Identify the PIM-SM Designated Router
– Verify IGMP state in the Designated Router
– Look for (S,G) state in the Designated Router
Debugging Multicast
• Follow the Reverse Path Forwarding (RPF) from the Designated Router back towards the source
• Verify PIM-SM has been configured on each interface along the RPF, because that determines the forwarding tree topology.
• Check (S,G) state in each router.
• Check (S,G) counters in each router.
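The hop-by-hop walk described above can be sketched as a loop over a hand-built model of the path. Everything here is hypothetical: the router names, the Hop fields, and the idea that the per-router facts have already been collected (for example from the show commands on earlier slides). It only illustrates the order of the checks.

```python
from typing import Dict, List, NamedTuple, Optional


class Hop(NamedTuple):
    rpf_neighbor: Optional[str]  # next router toward the source; None at the end
    pim_enabled: bool            # PIM-SM configured on the RPF interface
    has_sg_state: bool           # router holds (S,G) state for the flow


def trace_rpf(start: str, routers: Dict[str, Hop]) -> List[str]:
    """Walk from the receiver's Designated Router back toward the source."""
    problems: List[str] = []
    seen = set()
    current: Optional[str] = start
    while current is not None and current not in seen:
        seen.add(current)  # guard against RPF loops in the model
        hop = routers[current]
        if not hop.pim_enabled:
            problems.append(f"{current}: PIM-SM not enabled on RPF interface")
        if not hop.has_sg_state:
            problems.append(f"{current}: no (S,G) state")
        current = hop.rpf_neighbor
    return problems


if __name__ == "__main__":
    # Hypothetical three-router path: dr -> core -> border.
    path = {
        "dr": Hop("core", True, True),
        "core": Hop("border", False, False),
        "border": Hop(None, True, True),
    }
    for line in trace_rpf("dr", path):
        print(line)
```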
Debugging Multicast
• If the source is external to your PIM Domain:
– Verify that you have an MSDP SA for that source.
– Verify that the M-BGP Next Hop is:
  • A PIM Sparse Mode neighbor
  • An MSDP peer
– Verify that you’re actually choosing the NLRI=Multicast route as your preferred RPF path. (hello BGP distance)
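The "hello BGP distance" remark refers to RPF lookups that compare candidate routes across tables by administrative distance (the "distance-preferred lookups across tables" line in the earlier show ip rpf output). A minimal sketch, assuming the common Cisco default distances of 20 for external BGP, 110 for OSPF, and 200 for internal BGP; the Candidate type and choose_rpf function are illustrative.

```python
from typing import List, NamedTuple

# Common Cisco default administrative distances (assumed; they are tunable).
EBGP, OSPF, IBGP = 20, 110, 200


class Candidate(NamedTuple):
    table: str     # "multicast" (NLRI=Multicast) or "unicast"
    protocol: str
    distance: int  # administrative distance; lower wins


def choose_rpf(candidates: List[Candidate]) -> Candidate:
    """Pick the RPF route with the lowest administrative distance."""
    return min(candidates, key=lambda c: c.distance)


if __name__ == "__main__":
    # An M-BGP route learned over internal BGP loses to an OSPF unicast route.
    internal = [Candidate("multicast", "ibgp", IBGP),
                Candidate("unicast", "ospf", OSPF)]
    print(choose_rpf(internal))
    # The same prefix learned over external BGP wins.
    external = [Candidate("multicast", "ebgp", EBGP),
                Candidate("unicast", "ospf", OSPF)]
    print(choose_rpf(external))
```

If the unicast candidate wins, Joins follow the unicast topology instead of the multicast one, which is the failure this check is meant to catch.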
Debugging Multicast
• What if nobody can hear your source?
– Verify that the (S,G) shows up at your RP.
– Verify that your RP is MSDP announcing the source, and that it shows up in your peer’s MSDP SA cache.
– Verify your PIM-SM adjacency with your peer.
– Verify that you have your peer’s interface in the outgoing list for the (S,G).
– Verify that packet counters show traffic going out.
The Beacon: Test Signal
• Testing Multicast requires active sessions
• http://dast.nlanr.net/projects/beacon
• In Java, so runs anywhere
The Beacon: Issues
• Shows current state only.
– Archive state over time?
– How to visualize evolving state? Inherently a 3-dimensional problem, since state is 2D already.
• Server scaling problems with O(40) beacons.
– Currently seeing O(70) beacons at any time.
• Assumes Any Source Multicast model.
Core Multicast Building Blocks
• M-BGP: RFC 2283 is implemented by Juniper and Cisco in all major releases. AG community has used Juniper/Cisco the most.
• MSDP: Implemented by Juniper, Cisco, Foundry...
• PIM-Sparse Mode: RFC 2362 is implemented by a whole raft of vendors, including Cisco, Juniper, Foundry, Extreme, Marconi, etc.
Edge Multicast Building Blocks
• IGMPv2 is widely available in Layer 2 and Layer 3 devices, and in most host operating systems.
• IGMPv3 is coming soon to support SSM:
– Available in Layer 3 devices from Cisco and Juniper.
– IGMPv3 will be available in Windows XP (Whistler).
– Ugly hack workarounds exist (URD et al).
North American IP Multicast Status
• ESNet, Abilene, vBNS+, and NREN all running M-BGP, MSDP, and PIM-SM amongst themselves and with their customers/peers.
• Regional and Institutional networks are currently the most common stumbling blocks for multicast apps.
• STARTAP in Chicago is an international IP multicast meeting point.
• International / commercial networks are coming online.