Multicast Troubleshooting Tutorial

Caren Litvanyi
litvanyi@grnoc.iu.edu

Joint Techs Meeting
Salt Lake City, Utah
February 2005

Tutorial Outline

• Review IP multicast terminology and basic functionality.
• Review how the most common multicast protocols in use today work.
• Discuss some design issues.
• Troubleshooting multicast methodology, particularly interdomain multicast.
• Mention some tools and resources.

Multicast Functionality and Terminology

Unicast vs. Multicast

[Figure: two diagrams, labeled Multicast and Unicast.]

Multicast Building Blocks

• The SENDERS send without worrying about receivers.
  – Packets are sent to a multicast address (224.0.0.0 - 239.255.255.255).

• The RECEIVERS inform their local routers what they want to receive.

• The routers build a tree backwards (reverse-path) towards the source, thus making sure the STREAMS make it to the correct receiving networks.

Essential Multicast Terminology

A few things to note here:

The IP source address is the IP address of the server, BUT the destination address in the packet is NOT an IP address of a receiver. It is a multicast IP address: 224.0.0.0 - 239.255.255.255

tree = the path taken by multicast data. Routing loops are not allowed, so there is always a unique series of branches between the root of the tree and the receivers.

IP source = IP unicast addr; Ethernet source = MAC addr

IP destination = IP multicast addr; Ethernet dest = MAC addr

[Figure: a source (sender), e.g., a video server at 128.138.10.2, sends a multicast stream to group 233.12.24.11 down the distribution tree to the receivers (listeners, group members).]

(S,G) notation

• For every multicast stream there must be two pieces of information: the source IP address, S, and the group address, G.
  – These correspond to the sender and receiver addresses in unicast.
  – This is generally expressed as (S,G).
  – Also commonly used is (*,G) - every source for a particular group.
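A minimal sketch of how (S,G) and (*,G) entries relate, assuming a toy state table keyed by (source, group) in which "*" stands for any source. The table contents and the `lookup` helper are hypothetical; real routers keep far richer state.

```python
from typing import Optional

# Hypothetical state table: keys are (source, group); "*" means "any source".
STATE = {
    ("128.138.10.2", "233.12.24.11"): "source-specific (S,G) entry",
    ("*", "233.12.24.11"): "shared (*,G) entry",
}

def lookup(source: str, group: str) -> Optional[str]:
    """Prefer the exact (S,G) entry; otherwise fall back to (*,G)."""
    if (source, group) in STATE:
        return STATE[(source, group)]
    return STATE.get(("*", group))
```

A packet from a source with no entry of its own still matches the (*,G) entry for its group; a group with no state at all matches nothing.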

Multicast Addressing

• RFC 3171: 224.0.0.0 – 239.255.255.255
• Examples of Reserved & Link-local Addresses
  – 224.0.0.0 - 224.0.0.255: reserved & not forwarded
  – 224.0.0.1 - All local hosts
  – 224.0.0.2 - All local routers
  – 224.0.0.4 - DVMRP
  – 224.0.0.5 - OSPF
  – 224.0.0.6 - Designated Router OSPF
  – 224.0.0.9 - RIP2
  – 224.0.0.13 - PIM
  – 224.0.0.18 - VRRP
  – 224.0.0.22 - All IGMPv3-capable routers
  – 239.0.0.0 - 239.255.255.255: Administrative Scoping
  – 232.0.0.0/8: available for SSM use
• “Ordinary” multicasts don’t have to request a multicast address from IANA. Use GLOP space (233.0.0.0/8) – RFC 2770.
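As a quick aid when reading addresses in router output, the ranges on this slide can be checked with Python's standard `ipaddress` module. This is a sketch: the `classify` helper and its labels are made up for illustration, and the GLOP block is taken to be 233.0.0.0/8 as defined in RFC 2770.

```python
import ipaddress

# Ranges from the slide; GLOP is 233.0.0.0/8 (RFC 2770).
RANGES = [
    ("link-local (not forwarded)", ipaddress.ip_network("224.0.0.0/24")),
    ("SSM", ipaddress.ip_network("232.0.0.0/8")),
    ("GLOP", ipaddress.ip_network("233.0.0.0/8")),
    ("administratively scoped", ipaddress.ip_network("239.0.0.0/8")),
]

def classify(addr: str) -> str:
    """Return a rough label for an IPv4 address (hypothetical helper)."""
    ip = ipaddress.ip_address(addr)
    if not ip.is_multicast:          # outside 224.0.0.0/4
        return "not multicast"
    for name, net in RANGES:
        if ip in net:
            return name
    return "other multicast"
```

For example, `classify("233.12.24.11")` falls in the GLOP block, while `classify("128.138.10.2")` is an ordinary unicast address.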

Essential Multicast Protocols

• Group Management Protocol - enables hosts to dynamically join/leave multicast groups. Receivers send group membership reports to the nearest router.

• Multicast Routing Protocol - enables routers to build a delivery tree backwards from the receivers to the sender of a multicast stream.

[Figure: senders and receivers joined by routers. Labels: Group Management Protocol (IGMPv2 or v3), membership reports, Multicast Routing Protocol (PIM-SM), reverse path tree, data flow.]

Multicast Protocol Summary

• Essential Protocols
  – IGMP - Internet Group Management Protocol is used by hosts and routers to tell each other about group membership. (Usually version 2)
  – PIM-SM - Protocol Independent Multicast - Sparse Mode is used to propagate forwarding state between routers.
• Other Protocols (for interdomain)
  – MBGP - Multiprotocol Border Gateway Protocol is used to exchange routing information for inter-domain reverse-path forwarding (RPF) checking.
  – MSDP - Multicast Source Discovery Protocol is used to exchange active-source information.

IGMP Protocol Flow - Join a Group

• Router triggers group membership request to PIM.

• Hosts can send unsolicited Join membership messages – called reports in the RFC (usually more than 1)

• Or hosts can join by responding to periodic query from router

[Figure: a host says "I want to JOIN!" and sends a report for 230.0.0.1 ("I want 230.0.0.1"); the router adds the group and forwards the 230.0.0.1 stream.]
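On the host side, the unsolicited report is normally produced by the operating system when an application joins a group. The sketch below uses the standard `IP_ADD_MEMBERSHIP` socket option; the helper names and the port number in the usage note are assumptions for illustration, and which IGMP version is actually sent depends on the host's IP stack.

```python
import socket
import struct

def make_mreq(group: str, local_ip: str = "0.0.0.0") -> bytes:
    """Pack an ip_mreq: group address followed by the local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(local_ip))

def join_group(group: str, port: int) -> socket.socket:
    """Open a UDP socket and ask the kernel to join `group`.

    Setting IP_ADD_MEMBERSHIP is what prompts the host's IP stack to
    send its unsolicited IGMP membership report(s).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock
```

A test receiver would call something like `join_group("230.0.0.1", 5004)` and then `recvfrom()` on the returned socket; closing the socket (or setting `IP_DROP_MEMBERSHIP`) is what leads the stack to leave the group.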

IGMPv2

• Router:
  – sends Membership Query messages to All Hosts (224.0.0.1)
    • default query-interval = 125 seconds
  – router with lowest IP address is Querier (rest non-queriers)
  – If lower-IP address query heard, back off to non-querier state
    • Other Querier Present Interval default: (robust-count x query-interval) + (0.5 x query-response-interval) = 255 seconds
  – listens for reports (whether querier or not) and adds group to membership list for that interface
    • default query-response-interval = 10 seconds
  – timeout (Group member interval) default:
    • (robust-count x query-interval) + (1 x query-response-interval) = 260 seconds
  – robust-count - provides fine-tuning to allow for expected packet loss on a subnet. Default = 2 (tunable from 2-10)
  – Triggers group membership request to PIM.
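The two derived timers above follow directly from the three tunables. A small sketch, using the slide's defaults as function defaults (the function names are made up for illustration), makes it easy to see what a changed robust-count or query-interval does to the timeouts:

```python
def other_querier_present_interval(robust_count=2, query_interval=125,
                                   query_response_interval=10):
    """Seconds a non-querier waits before taking over as querier."""
    return robust_count * query_interval + 0.5 * query_response_interval

def group_membership_interval(robust_count=2, query_interval=125,
                              query_response_interval=10):
    """Seconds without a report before the router times out a group."""
    return robust_count * query_interval + query_response_interval
```

With the defaults these give 255 and 260 seconds, matching the slide. Raising robust-count to 3 stretches the group timeout to 385 seconds, which is worth remembering when a "stuck" group takes longer than expected to disappear.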

IGMPv2

• Host:
  – responds to router query with Membership Report messages to groups it is a member of (e.g. 224.10.8.5)
    • waits 0-10 sec (default; specified in Query)
    • Hosts listen to other host reports
    • Only 1 host responds. Others become “idle-members.”
  – sends unsolicited Membership Reports (i.e., Join Messages) to group address (e.g. 224.10.8.5)
  – sends Leave messages to All Routers (224.0.0.2)
  – reports group membership ONLY – no sources.
  – Only the existence of local group members is known, not the actual members themselves (due to idle-member state).

IGMP Protocol Flow - Querier

• Hosts respond to query to indicate (new or continued) interest in group(s)
  – only one host should respond per group

• Hosts fall into idle-member state when same-group report heard.

• After 260 sec with no response, router times out group.

[Figure: every 125 sec the router sends a general query ("Still interested?") to 224.0.0.1; after 0-10 sec a member of the 230.0.0.1 group answers "Yes, me!" with a report to 230.0.0.1 ("I want 230.0.0.1").]

IGMP Protocol Flow - Leave a Group

• Hosts that support IGMPv2 send Leave messages to all-routers group indicating group they’re leaving.
  – Router follows up with 2 group-specific query messages.

• IGMPv1 hosts leave by not responding to queries (260 sec timeout).

[Figure: a member of the 230.0.0.1 group says "I want to leave!" and sends a Leave for 230.0.0.1 to 224.0.0.2 ("I don’t want 230.0.0.1 anymore"); the router asks "Anyone still want this group?" with two group-specific queries to 230.0.0.1, 1 sec apart (re-transmit timer).]

Switches and Snooping

• IGMP host reports (Joins) tell the router to start sending multicast traffic to the LAN, since one or more hosts on the LAN are members of the group.

• In a conventional shared broadcast LAN using switches that have no multicast smarts, the traffic is flooded to all hosts.

• With multiple high bandwidth multicast sources (e.g. video at 5 Mbps), this does not scale.

• There are a few techniques used to deal with this...

IGMP Snooping

• Implemented by several vendors. Support for IGMPv2 is common; support for IGMPv3 is becoming more common.

• What happens at the MAC layer:

– IGMP snoopers add a bridge table entry for each multicast group destination address (GDA) to each switch port that has the interested member's unicast source address (USA) already on it.

– Remember that there are likely to be hubs or switches downstream of a given switch port, so more than one USA can be on a single port.

– When an IGMP Leave is received, the GDA entries are pruned.

Why IGMP snooping is harder than it looks

• The IGMP membership reports have to be captured from each host and suppressed to other hosts to prevent the others from going into idle-member state. Every interested host has to be spoofed into thinking it is the only member of the group, so that it actively sends membership reports.
• The IGMP snooper then forwards one of these membership reports up to the router or makes up a fake membership report coming from one of:
  – the host
  – the switch’s management IP address, or
  – 0.0.0.0

Why IGMP snooping is harder than it looks, continued

• Since multiple USAs can be on a port (via downstream switch), the switch has to actually do the IGMP membership query/timeout before pruning a port.

• Since membership reports are sent to the same GDA as the (possibly high-bandwidth) multicast traffic, there is a potential for heavy loading of the switch CPU, unless you use more expensive ASICs that can separate the IGMP protocol messages from general traffic and route only the IGMP messages to the CPU.

• The switch has to know which is the multicast router port. It does this by snooping for IGMP queries.

Join without IGMP snooping

[Figure: router, switch, and hosts; two hosts say "I want 230.0.0.1", and 230.0.0.1 traffic appears on every switch port.]

1. Host A sends membership report.
2. Switch floods it to all ports.
3. Router sends traffic (floods).
4. Host B wants to join. No IGMP message needed (idle-member).

Join with IGMP snooping

[Figure: router, switch, and hosts; two hosts say "I want 230.0.0.1", and 230.0.0.1 traffic appears only on the router port and the two member ports.]

1. Host A sends membership report.
2. Switch forwards it to router.
3. Router sends traffic.
4. Host B sends membership report. Switch suppresses it and adds port to bridge table.

Maintaining state w/IGMP snooping

[Figure: router, switch, and hosts. Labels: General Query to 224.0.0.1 flooded to the host ports ("224.0.0.1 ?"), 230.0.0.1 reports and traffic.]

1. Router sends general query.
2. A&B both respond w/membership report (no idle member).
3. Switch sends one to router and suppresses one.

Leave with IGMP snooping

[Figure: router, switch, and hosts. Labels: Leave to 224.0.0.2 <230.0.0.1>, group-specific query "230.0.0.1 ?", 230.0.0.1 traffic, done.]

1. Host A sends Leave.
2. Switch spoofs G-specific query.
3. No reply, switch prunes port.

(Nothing sent to router.)

Leave with IGMP snooping, cont’d

[Figure: router, switch, and hosts. Labels: Leave to 224.0.0.2 <230.0.0.1> from Host B and from the switch to the router, group-specific queries "230.0.0.1 ?", 230.0.0.1 traffic, done.]

1. Host B sends Leave.
2. Switch spoofs G-specific query.
3. No reply; switch prunes port.
4. Switch sends Leave to router.
5. Router sends 2 G-specific queries, gets no response, and prunes the group. (Queries may [not] be suppressed)
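The join and leave walk-throughs above boil down to a table of group → member ports plus a decision about what, if anything, to pass up to the router. The class below is a toy sketch of that bookkeeping only; the names are made up, and it ignores the query/timeout handling and the CPU-load issues described earlier.

```python
from collections import defaultdict

class SnoopingTable:
    """Toy bridge table: group destination address -> set of switch ports."""

    def __init__(self):
        self.ports_by_group = defaultdict(set)

    def report(self, group, port):
        """Membership report seen on `port`.

        Returns True if this is the first member port for the group, i.e.
        the report should be passed up to the router rather than suppressed.
        """
        first = not self.ports_by_group[group]
        self.ports_by_group[group].add(port)
        return first

    def prune(self, group, port):
        """Port pruned after the switch's own query went unanswered.

        Returns True if no member ports remain, i.e. the switch should
        now send a Leave to the router.
        """
        ports = self.ports_by_group.get(group, set())
        ports.discard(port)
        if not ports:
            self.ports_by_group.pop(group, None)
            return True
        return False

    def egress_ports(self, group):
        """Ports that should receive traffic for `group`."""
        return set(self.ports_by_group.get(group, ()))
```

Replaying the slides: Host A's report on port 1 returns True (forwarded), Host B's on port 2 returns False (suppressed), pruning port 1 returns False (nothing sent to router), and pruning port 2 returns True (Leave sent).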

Sourcing Multicast: conventional switch

[Figure: a video server sends to 230.0.0.1 through a switch; 230.0.0.1 traffic appears on every port.]

Multicast is just like broadcast: Flooded out all ports.

Sourcing with multicast-aware switch

[Figure: a video server sends to 230.0.0.1 through a switch; 230.0.0.1 traffic appears only on the router port.]

Multicast traffic is forwarded only to mrouter ports (learned by snooping for IGMP queriers).

Exception: flood 224.0.0.0/24

Design Consequences for Networks

• Be careful selecting/purchasing switches if you plan to support multicast. Try to do a test/eval before buying. Many vendors say they support IGMP, but how well they do so varies widely, even among products from the same vendor.

• Consider your physical topology design. Is it possible to put multicast-heavy subnets closer to the core, or on higher-class switches? Can you avoid switches and connect direct to a router?

• Keep subnets small. Less churn in joins/leaves.

• Check defaults. What is turned on and what is not?

Consequences for Troubleshooting

• In general, multicast on the LAN is not as well understood as multicast on the WAN.

• Bugs are common.

• The horsepower of your switch(es) might matter. When snooping is enabled and CPU load is high, they may drop packets that shouldn’t be dropped.

• Even without snooping, sometimes they step outside their bailiwick, trying to do non-Layer-2 tasks.

• Management visibility into the switch may be limited.

• Often testing to a host directly connected to a router can expose these problems.

PIM-SM Protocol Independent Multicast - Sparse Mode

• The core multicast protocol: builds and tears down multicast trees.

• “Protocol Independent” means independent of the protocol used to build the reachability table, not independent of IP. (More on reachability in a moment.)

• “Sparse Mode” refers to the explicit join approach taken by PIM-SM — the protocol assumes that not everyone wants the data.

• PIM also has a Dense Mode, which starts with the assumption that everyone does want the data. This is also known as a flood-and-prune approach. Not recommended!

Multicast “Routing”

• Multicast routing can be thought of as the reverse of unicast forwarding.
  – Unicast forwarding is concerned with where the packet is going.
  – Multicast routing is concerned with where the packet will be coming from.
• Multicast paths to receivers form a “tree”. The tree is built (or torn down) from the receiver back toward the source. This is easy to forget, but very important to remember.
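"Where the packet will be coming from" is the reverse-path forwarding (RPF) question: which interface leads back toward the source? A minimal sketch, assuming a hypothetical reachability table and made-up interface names, does a longest-prefix match on the source address rather than the destination:

```python
import ipaddress

# Hypothetical reachability table: prefix -> interface toward that prefix.
RPF_TABLE = {
    ipaddress.ip_network("0.0.0.0/0"): "if2",
    ipaddress.ip_network("128.138.0.0/16"): "if0",
    ipaddress.ip_network("128.138.10.0/24"): "if1",
}

def rpf_interface(source: str) -> str:
    """Longest-prefix match on the *source* address."""
    ip = ipaddress.ip_address(source)
    best = max((net for net in RPF_TABLE if ip in net),
               key=lambda net: net.prefixlen)
    return RPF_TABLE[best]

def rpf_check(source: str, arrived_on: str) -> bool:
    """A packet passes only if it arrived on the interface toward its source."""
    return rpf_interface(source) == arrived_on
```

When troubleshooting, the same question is asked hop by hop: does each router's idea of the interface toward the source agree with where the traffic actually arrives?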

Multicast “Routing”

• Multicast forwarding topology is stored in outgoing interface lists (OILs).

• On each router, PIM-SM maintains an OIL for each group for which it has downstream listeners.

• Once the multicast distribution tree is built, multicast forwarding works similarly to unicast forwarding — but instead of using unicast forwarding tables to send packets out single interfaces, routers use OILs to send packets out multiple interfaces.

• Multicast packets received from a given source on an incoming interface for a given group are sent out only on the interfaces specified in the appropriate outgoing interface list (OIL).
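The forwarding step described above can be sketched as a lookup keyed by (source, group) that yields an incoming interface and an OIL. The table contents and interface names below are hypothetical, and real routers also handle (*,G) entries, timers, and flags that are left out here.

```python
# Hypothetical forwarding state: (source, group) -> (incoming interface, OIL).
MROUTES = {
    ("128.138.10.2", "233.12.24.11"): ("if0", {"if1", "if3"}),
}

def forward(source, group, arrived_on):
    """Return the set of interfaces the packet is replicated to."""
    entry = MROUTES.get((source, group))
    if entry is None:
        return set()              # no state: no downstream listeners
    incoming, oil = entry
    if arrived_on != incoming:
        return set()              # arrived on the wrong interface: drop
    return set(oil) - {arrived_on}
```

An empty result is the interesting case in practice: either there is no state for the stream, the OIL is empty, or the packet is arriving on an interface other than the expected incoming one.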

ASM: the original multicast service model

• Packet transmission is based on UDP, so packet delivery is “best-effort”, with no loss detection or retransmission

• A source can send multicast packets at any time, with no need to register or schedule transmissions.

• Sources do not know the group membership. A group may have many sources and many members.

• Group members may come and go at will, with no need to coordinate with a central authority.

• And, critically, group members know only the group. They don’t need to know anything about sources — not even whether or not any sources exist.

• This is the any-source multicast (ASM) paradigm. It requires sender registration and tree-switching.

Multicast Distribution Trees

• In the original multicast service model, a connection between a source and a receiver is first set up by building an RPT (RP tree) from the receiver back to a Rendezvous Point (RP), then an SPT (shortest path tree, or source tree) from the RP back to the source.
• Then, once data starts flowing to the receiver, an SPT is built directly from the receiver back to the source.
• This is called “tree-switching”.
• A special router adjacent to the receiver is responsible for this – the PIM Designated Router (DR).
• Each multicast-enabled routed segment on your network has a PIM DR.

Designated Router (DR)

• DR sends
  – “Join/Prune” messages toward the RP from receiver network
  – “Register” messages toward the RP from source network
• Selecting the DR:
  – Neighboring PIM-SM routers multicast periodic “Hello” messages to each other (default is every 30 seconds; the hello-interval is tunable for faster convergence).
  – On receipt of a Hello message, a router stores the IP address and priority for that neighbor.
  – The router with the highest priority is selected as the DR; if the priorities match, the router with the highest IP address is selected.
• When DR goes down, a new one is selected by scanning all neighbors on the interface and choosing the one with the highest IP address.
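The DR selection above can be sketched as a single comparison over the neighbors learned from Hellos. The `Neighbor` type and the default priority of 1 are assumptions for illustration; note that addresses must be compared numerically, not as strings.

```python
import ipaddress
from typing import NamedTuple

class Neighbor(NamedTuple):
    address: str
    priority: int = 1   # assumed default; all equal unless configured

def elect_dr(neighbors):
    """Highest priority wins; ties are broken by the highest IP address."""
    return max(
        neighbors,
        key=lambda n: (n.priority, int(ipaddress.ip_address(n.address))),
    )
```

With equal priorities, 10.0.0.20 beats 10.0.0.9 (a string comparison would get this wrong). If the DR disappears from the neighbor list, running the same election over the remaining neighbors picks its replacement.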

ASM RP Tree Join

Receiver announces desire to join group G with IGMPv2 host report – (*,G).

DR sends PIM (*,G) Join toward the RP; subsequent routers do likewise.

[Figure: receiver, DR, and RP. Labels: IGMPv2 host report, (*, G) Join, RP Tree.]

ASM Sender Registration

Active source triggers DR to send (S,G) Register message to RP.

RP sends (S,G) Join to source.

[Figure: source, RP, and receiver. Labels: (S, G) Register (unicast), (S, G) Join, RP Tree, Shortest Path Tree, Traffic Flow.]

ASM Sender Registration

(S, G) traffic begins arriving at the RP via the SPT.

RP sends a Register-Stop back to the first-hop router to stop the Register process.

[Figure: source, RP, and receiver. Labels: (S, G) Register (unicast), (S, G) Register-Stop (unicast), RP Tree, Shortest Path Tree, Traffic Flow.]

ASM Sender Registration

Source traffic flows natively along SPT to RP.

From RP, traffic flows down the RPT to the receiver.

[Figure: source, RP, and receiver. Labels: Shortest Path Tree, RP Tree, Traffic Flow.]

ASM SPT Cutover

Last-hop router joins the SPT.

[Figure: source, RP, and receiver. Labels: (S, G) Join, Shortest Path Tree, RP Tree, Traffic Flow.]

ASM SPT Cutover

Traffic begins flowing down the new branch of the SPT.

Additional (S, G) state is created along the RPT to prune off (S, G) traffic.

[Figure: source, RP, and receiver. Labels: (S, G) RP-bit Prune, Shortest Path Tree, RP Tree, Traffic Flow.]

ASM SPT Cutover

(S,G) traffic flow is now pruned off of this branch of the RPT and is flowing to the receiver via the SPT.

Traffic for other sources may still be flowing down the RPT.

[Figure: source, RP, and receiver. Labels: Shortest Path Tree, RP Tree, Traffic Flow.]

ASM SPT Cutover

(S, G) traffic flow is no longer needed by the RP, so it prunes the flow of (S, G) traffic.

[Figure: source, RP, and receiver. Labels: (S, G) Prune, Shortest Path Tree, RP Tree, Traffic Flow.]

ASM SPT Cutover

(S, G) traffic now flows to the receiver only via a single branch of the SPT.

As long as the source remains active, its first-hop router sends Null-Register messages to the RP, enabling the RP to maintain a list of all active sources.

[Figure: source, RP, and receiver. Labels: Shortest Path Tree, RP Tree, Traffic Flow.]

RP Options

• Remember, the RP is used to “hook up” receivers with senders. Receivers only know group address.
• Static RP
  – Recommended
  – Easy transition to Anycast-RP
  – Allows for a hierarchy of RPs
• Auto-RP (Cisco proprietary)
  – Fixed convergence timers (slow)
  – Must flood RP mapping traffic
• Bootstrap router
  – Fixed convergence timers (slow)
  – Allows for a hierarchy of RPs

RP Options

• In most cases, static RP is the best option:
  – simple: just tell every router the RP address (once!)
  – flexible: use a /32 on a loopback interface so it can be moved
  – scalable: add more instances of same RP address for redundancy, load splitting, topological localization, etc.
  – survivable: fail-over from one RP to another is as fast as IGP convergence
  – blessed: RFC 3446 (just 8 pages!)
• Only use more complicated options if you really need to:
  – different RP(s) for different groups
  – see later Anycast-RP slides for details

Inter-domain ASM and MSDP

• A PIM domain is a network in which all routers use the same RP for any given multicast group.

• Inter-domain ASM requires another protocol: Multicast Source Discovery Protocol (MSDP).
  – Why? Because the receiver is restricted to sending only (*,G) joins to its RP. And its RP doesn’t know where the source is, because the source is registered to a different RP. MSDP is needed for the receiver's RP to find the (S,G).
  – Officially, MSDP is a temporary solution. We shall see.

MSDP Peers (inter-domain case)

• MSDP establishes a neighbor relationship between MSDP peers
  – Peers connect using TCP port 639
  – Peers send keepalives every 60 secs (fixed)
  – Peer connection reset after 75 seconds if no MSDP packets or keepalives are received
• MSDP peers must have knowledge of multicast topology.
  – Required for peer-RPF checking of the RP address in the SA to prevent SA looping. Note that this is not the same thing as the multicast routing RPF check.

MSDP Operation — Flooding

• Initial SA message sent when source DR first registers
  – May optionally encapsulate first data packet
• Originating RP sends subsequent SA messages every 60 seconds, for as long as source remains active
• Flooding
  – SA (source active) packets periodically sent to MSDP peers indicating:
    • source IP address of active streams
    • group multicast IP address of active streams
    • IP address of RP originating the SA
  – An RP originates SAs only for sources within its own domain!
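To make the three SA fields and the periodic refresh concrete, here is a toy SA cache. The `SA` record mirrors the fields listed above; the `SACache` class, its method names, and especially the 180-second hold time are assumptions for illustration, not values taken from the slides or from any implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SA:
    source: str   # source IP address of the active stream
    group: str    # group multicast IP address
    rp: str       # IP address of the RP originating the SA

class SACache:
    def __init__(self, hold_time=180.0):
        # hold_time is an assumed value: three missed 60-second refreshes.
        self.hold_time = hold_time
        self.last_seen = {}            # SA -> time it was last received

    def learn(self, sa, now):
        """Record (or refresh) an SA received at time `now` (seconds)."""
        self.last_seen[sa] = now

    def expire(self, now):
        """Drop entries that have not been refreshed within hold_time."""
        self.last_seen = {sa: t for sa, t in self.last_seen.items()
                          if now - t < self.hold_time}

    def sources_for(self, group):
        """Active sources known for `group`, as a sorted list."""
        return sorted(sa.source for sa in self.last_seen if sa.group == group)
```

The lookup a receiver's RP cares about is `sources_for(group)`: if it comes back empty for a group with local receivers, the RP has nothing to send (S,G) joins toward.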

MSDP Overview

[Figure: five PIM domains (A through E), each with its own RP; the RPs are MSDP peers. A source s is registered with its RP (Register 192.1.1.1, 224.2.2.2), and Source Active (SA) messages for 192.1.1.1, 224.2.2.2 are flooded from RP to RP. A receiver r sends Join (*, 224.2.2.2) toward its own RP.]

MSDP Overview

[Figure: the same five domains and MSDP peers. Join (S, 224.2.2.2) messages travel across domains toward the source s, and multicast traffic flows from s to the receiver r.]

MSDP so far

• Allows RPs to share information about which sources in their domains are actively sending.

• Interconnects RPs (MSDP Peers) between domains, using TCP connections to pass source active messages (SAs).

• SAs are Peer-RPF checked before accepting or forwarding.

• RPs may trigger (S,G) Joins on behalf of local receivers.

• MSDP connections typically (but not always) parallel MBGP connections.

• Next: Peer-RPF checking in detail. This is complex.

MSDP RPF Rules

If any of the following tests passes, the SA is accepted.

1. The MSDP peer sending the SA is the originating RP
2. The MSDP peer sending the SA is the eBGP next hop for the originating RP
3. The MSDP peer sending the SA is the iBGP advertiser for the originating RP
4. The MSDP peer sending the SA is in the same AS as the next hop for the originating RP
5. The MSDP peer sending the SA is statically configured to be the RPF peer

For any given (S,G), there can be one or more accepted SAs in the SA cache.
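The five tests can be read as an ordered chain of comparisons. The sketch below assumes a simplified, hypothetical model of what the router knows about the route to the originating RP (`RouteToRP`); real implementations differ in detail and in rule ordering, so treat it only as a reading aid for the list above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Peer:
    address: str
    asn: int

@dataclass
class RouteToRP:
    """Hypothetical summary of the (M)BGP route toward the originating RP."""
    ebgp_next_hop: Optional[str] = None
    ibgp_advertiser: Optional[str] = None
    next_hop_as: Optional[int] = None

def peer_rpf_rule(peer, originating_rp, route, static_rpf_peers=frozenset()):
    """Return the number of the first rule that passes, or None to reject."""
    if peer.address == originating_rp:
        return 1
    if route.ebgp_next_hop is not None and peer.address == route.ebgp_next_hop:
        return 2
    if route.ibgp_advertiser is not None and peer.address == route.ibgp_advertiser:
        return 3
    if route.next_hop_as is not None and peer.asn == route.next_hop_as:
        return 4
    if peer.address in static_rpf_peers:
        return 5
    return None
```

Returning the rule number rather than a bare True/False mirrors how the check is debugged by hand: knowing which rule accepted (or that none did) usually points at the BGP route that needs inspecting.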

Design Issue: Anycast-RP

• MSDP used intra-domain to provide RP redundancy
• Becoming best common practice for large networks
• Specified in RFC 3446
• Allows deployment of multiple RPs within a domain (for the same group range)
• Adding more RPs does not require changes to non-RP routers
• Sources and receivers use closest RP, as determined by the IGP
• RPs share information about sources via MSDP mesh group
• Note: MSDP peering uses normal address, not Anycast-RP address

MSDP Application: Anycast-RP

• Rules are fairly simple
  – Have e-MSDP peers and i-MSDP peers, similar to BGP
• If a mesh group member originates an SA message
  – Send to all i-MSDP peers and any e-MSDP peers
• If a mesh group member receives an SA message from an i-MSDP peer
  – Send to any e-MSDP peers
  – Do NOT send to other i-MSDP peers
• If a mesh group member receives an SA message from an e-MSDP peer
  – Check RPF; if it passes, then
  – Flood to all i-MSDP peers and any other e-MSDP peers.
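The mesh-group rules above amount to a small decision function. In this sketch the peer names are arbitrary strings, `received_from=None` stands for a locally originated SA, and the peer-RPF result is passed in as a flag; all of these are illustrative choices rather than anything defined by MSDP itself.

```python
def forward_sa(received_from, i_peers, e_peers, rpf_ok=True):
    """Return the set of peers an SA message is sent to.

    received_from is None when this router originates the SA.
    """
    i_peers, e_peers = set(i_peers), set(e_peers)
    if received_from is None:                 # originated locally
        return i_peers | e_peers
    if received_from in i_peers:              # from a mesh-group (i-MSDP) peer
        return e_peers                        # never back into the mesh
    if received_from in e_peers:              # from an external (e-MSDP) peer
        if not rpf_ok:
            return set()                      # failed peer-RPF: do not flood
        return i_peers | (e_peers - {received_from})
    return set()                              # not a configured peer: ignore
```

The key property is the second branch: an SA learned from one mesh-group member is never re-sent to the others, which is what lets a full mesh avoid SA loops without RPF checks inside the mesh.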

MBGP Overview

• MBGP: Multiprotocol BGP (aka multicast BGP in multicast networks)
  – Makes it possible for multicast routing policies to differ from unicast routing policies
  – Can carry different route types for different purposes
    • Unicast
    • Multicast
  – Both route types carried in same BGP session
  – Has nothing to do with multicast state information!
  – Same path selection and validation rules
    • AS-Path, LocalPref, MED, …

MBGP

• Tag unicast prefixes as multicast source prefixes for intra-domain mcast routing protocols (PIM, MSDP) to do RPF checks.

• WHY? Allows for inter-domain RPF checking where unicast and multicast paths are non-congruent.

• DO I REALLY NEED IT?
  – YES, if:
    • ISP to ISP peering
    • Multiple-homed networks
  – NO, if:
    • You are single-homed

New multiprotocol attributes

• MP_REACH_NLRI and MP_UNREACH_NLRI
  – Address Family Information (AFI) = 1 (IPv4)
    • Sub-AFI = 1 (NLRI is used for unicast forwarding)
    • Sub-AFI = 2 (NLRI is used for multicast PIM RPF check and MSDP peer-RPF check)

MBGP - Capability Negotiation

• BGP routers establish BGP sessions through the OPEN message
• OPEN message contains optional parameters
• BGP session is terminated if OPEN parameters are not recognised
• New parameter: CAPABILITIES
  – Multiprotocol extension
  – Multiple routes for same destination
• Configures router to negotiate either or both NLRI
  – If neighbor configures both or subset, common NLRI is used in both directions
  – If there is no match, notification is sent and peering doesn’t come up
  – If neighbor doesn’t include the capability parameters in open, session backs off and reopens with no capability parameters
    • Peering comes up in unicast-only mode
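The negotiation outcomes above can be sketched as a set intersection over (AFI, Sub-AFI) pairs. The constants use the values from the previous slide; the `negotiate` function, its use of `None` for "no capability parameters", and the exception type are illustrative assumptions.

```python
UNICAST = (1, 1)      # (AFI, Sub-AFI): IPv4 unicast
MULTICAST = (1, 2)    # (AFI, Sub-AFI): IPv4 multicast (RPF) routes

def negotiate(local, remote):
    """Return the NLRI types used on the session.

    local/remote are sets of (AFI, Sub-AFI) pairs. remote=None means the
    neighbor sent no capability parameters in its OPEN.
    """
    if remote is None:
        return {UNICAST}          # session reopens without capabilities
    common = set(local) & set(remote)
    if not common:
        raise ConnectionError("no common NLRI: notification sent, peering stays down")
    return common
```

A session that negotiates only `UNICAST` will come up and look healthy while carrying no multicast routes at all, which is one reason to check the negotiated capabilities early when inter-domain RPF checks fail.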

MBGP — Summary

• Solves part of inter-domain problem

– Can exchange unicast prefixes for multicast RPF checks

– Uses standard BGP configuration knobs

– Permits separate unicast and multicast topologies if desired

• Still must use PIM to:

– Build distribution trees

– Actually forward multicast traffic

Page 60: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

End of Protocol Review.

Questions?

Page 61: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

A Methodology for Troubleshooting Inter-domain IP Multicast

Page 62: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Problems Addressed

• The main types of problems addressed in this section are topology/reachability problems – the packets aren’t flowing.

• The source and receiver are assumed to be in two different ASes. Troubleshooting multicast within your own campus network is a subset of interdomain troubleshooting.

• Because it is the most common today, we assume ASM. Many problems would go away with SSM.

• We will mention some things about performance issues at the end, and list some tools/references.

Page 63: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Why the need for a “methodology”?

• Most engineers don’t troubleshoot multicast problems as often as unicast.

• As we have learned, multicast is receiver-driven (somewhat backwards).

• The problem can be far from the symptom.

• The same symptom can have many different causes, at different places in the path.

Page 64: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Overview

Gather information

Verify receiver interest

Verify knowledge of active source

Trace forwarding state back

Page 65: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

STEP 1: GATHER INFORMATION

Page 66: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

What is the problem?

“Nobody can see me!”

“Some sites can hear us, but others can’t.”

“Multicast is broken … again”

“Multicast isn’t working between here and there.”

“Site X called to say they can’t see my presentation!”

“We’re not getting anything.”

Page 67: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Gather Information

• End-users seem to have trouble reporting multicast problems in our language.

• Performance issue vs. topology/reachability issue?

• Was it working recently then stopped working, or has one site gotten nothing at all from another site?
  – If nothing, double-check group and port info, TTL at sender

• Is the problem intermittent, cyclic, or steady-state?

• User education about how to report a problem before a problem happens is very helpful!

Page 68: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Gather Information

• Pick ONE direction (that is the problem, or seems representative of the problem).

• Identify source end and receiving end.

• Recall multicast is unidirectional in nature…

[Figure: whether B can or can’t receive from A implies almost nothing about whether A can or can’t receive from B.]

Page 69: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Gather Information

Now that you have a direction, you will need:

• A constantly active source IP address

• A constantly active receiver IP address

• The group address

It is virtually impossible to debug a multicast problem without specifying all of these!!!
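Before going further it can help to sanity-check the three addresses you were given, since users often swap the group and a host address. A minimal Python sketch using the standard ipaddress module; check_inputs is a hypothetical helper name, and the addresses in the usage line are the ones used in the examples later in this tutorial.

```python
import ipaddress
from typing import List


def check_inputs(source: str, receiver: str, group: str) -> List[str]:
    """Return a list of problems with the three addresses (empty if none).

    Raises ValueError if any string is not a valid IP address.
    """
    problems = []
    for label, text in (("source", source), ("receiver", receiver)):
        if ipaddress.ip_address(text).is_multicast:
            problems.append(f"{label} {text} is a multicast address, expected a host address")
    if not ipaddress.ip_address(group).is_multicast:
        problems.append(f"group {group} is not a multicast address")
    return problems


print(check_inputs("141.142.64.104", "140.221.34.1", "233.2.171.1"))
```

An empty list only means the addresses are plausible; it says nothing about whether the source and receiver are actually active.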

Page 70: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Gather Information

• OK – we know the IP addresses for the problem source, receiver, and group, and that the source and receiver are active.

Move on to step 2…

Page 71: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

STEP 2: VERIFY RECEIVER INTEREST

Page 72: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• Because of the way multicast distribution trees are built, it is almost always easier to debug a problem by starting at the receiver. If you are the sender, you are pretty much working blind.

• Recall in ASM, group interest on a subnet is indicated by a host sending out (multicast) an IGMPv2 membership report.

• The DR (designated router) on a segment is responsible for listening to that report, and forwarding a PIM ( * , G) join towards the RP.

• For this step, all we need to do is verify which router is the DR, and check that it knows it has interested listeners for that group on the interface facing the receiver. Stop there. Don't worry about getting to the RP at this point.

Page 73: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• What can go wrong?
  – No host is sending out IGMP membership reports, or not the right version.
  – A switch is in the path that is dropping/limiting multicast/IGMP.
  – The router is not running IGMP, PIM, etc.
  – A device has been elected DR that shouldn't have been.
  – Bugs, incompatible timer implementations, querier confusion, etc.
  – ACLs, firewalls.

[Figure: the receiver sends an IGMP report and a ( * , G) join should go towards the RP. Two routers on the receiver’s segment are each labelled “DR?”; one of them says “Gack! I dunno where RP…”.]

Page 74: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• You might think you know which router is the DR, but you should not proceed until it has been verified. It only takes a couple of seconds.

• To verify the DR, log into the router you think should be routing multicast for the receiver.
  1) Find/verify the interface that serves the receiver’s subnet.
  2) Check that there is no other PIM router that thinks it is the DR for the subnet.

• Although in our workshop lab our first-hop routers are Ciscos, the following examples show both Junipers and Ciscos.

Page 75: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

1) Cisco: find the right interface (RPF towards the receiver):

squash# show ip rpf 140.221.34.1
RPF information for ws-video.mcs.anl.gov (140.221.34.1)
  RPF interface: GigabitEthernet5/7
  RPF neighbor: ? (0.0.0.0) - directly connected
  RPF route/mask: 140.221.34.0/28
  RPF type: unicast (connected)
  RPF recursion count: 0
  Doing distance-preferred lookups across tables
squash#

Page 76: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

1) Juniper: find the right interface (RPF towards the receiver):

remote@MREN-M5> show multicast rpf 140.221.34.1
Multicast RPF table: inet.2, 5051 entries

140.221.34.0/27
    Protocol: Direct
    Interface: ge-0/0/0.108

Page 77: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

 

2) Cisco: verify DR for that interface:

squash#sh ip igmp interface gig5/7
GigabitEthernet5/7 is up, line protocol is up
  Internet address is 140.221.34.13/28
  IGMP is enabled on interface
  Current IGMP host version is 2
  Current IGMP router version is 2
  IGMP query interval is 60 seconds
  IGMP querier timeout is 120 seconds
  IGMP max query response time is 10 seconds
  Last member query response interval is 1000 ms
  Inbound IGMP access group is not set
  IGMP activity: 867 joins, 866 leaves
  Multicast routing is enabled on interface
  Multicast TTL threshold is 0
  Multicast designated router (DR) is 140.221.34.13 (this system)
  IGMP querying router is 140.221.34.13 (this system)
  No multicast groups joined
squash#
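If you collect this output from several routers, the DR line is easy to pull out with a script. A minimal Python sketch, assuming the output has been saved as text; dr_info is a hypothetical helper, SAMPLE is abridged from the Cisco output on this slide, and the regular expression only targets the one line shown there.

```python
import re
from typing import Optional, Tuple

# Abridged from the "show ip igmp interface" output on this slide.
SAMPLE = """\
GigabitEthernet5/7 is up, line protocol is up
  Internet address is 140.221.34.13/28
  Multicast designated router (DR) is 140.221.34.13 (this system)
  IGMP querying router is 140.221.34.13 (this system)
"""


def dr_info(output: str) -> Optional[Tuple[str, bool]]:
    """Return (DR address, True if the DR is this router), or None if not found."""
    match = re.search(r"designated router \(DR\) is (\S+)( \(this system\))?", output)
    if match is None:
        return None
    return match.group(1), match.group(2) is not None


print(dr_info(SAMPLE))
```

If the second element is False, some other router on the segment is the DR, and that is the router to log into next.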

Page 78: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

 

2) Juniper: verify DR for that interface:

remote@MREN-M5> show pim interfaces
Instance: PIM.master

Name            Stat  Mode    IP  V  State  Count  DR address
at-0/2/1.237    Up    Sparse   4  2  P2P        1
at-0/2/1.6325   Up    Sparse   4  2  P2P        1
at-0/2/1.9149   Up    Sparse   4  2  P2P        1
ge-0/0/0.108    Up    Sparse   4  2  DR         1  140.221.34.13
ge-0/0/0.109    Up    Sparse   4  2  NotDR      1  10.10.10.1

remote@MREN-M5>

Page 79: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• SO… now you are sure you are on your receiver’s DR.

• Remember, multicast is receiver-driven.

• QUESTION: Does the DR know that there are interested receivers of the group on your host’s subnet??

• Look at IGMP for the group in question.

 

Page 80: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

 

On the DR (Cisco):

squash#sh ip igmp group 233.2.171.1
IGMP Connected Group Membership
Group Address    Interface            Uptime    Expires   Last Reporter
233.2.171.1      Vlan1                1d03h     00:02:16  140.221.10.87
233.2.171.1      GigabitEthernet5/7   7w0d      00:02:21  140.221.34.1
squash#

(233.2.171.1 is the group you are debugging.)

If the receiver’s interface is in this list, you are OK. You might want to watch for a while to ensure no timeouts are occurring.
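The same check can be scripted if you want to poll the DR for a while and watch for timeouts. A minimal Python sketch under the assumption that the Cisco output is whitespace-separated as shown on this slide; interfaces_with_group is a hypothetical helper and SAMPLE is copied from the slide.

```python
from typing import List

# Copied from the "show ip igmp group" output on this slide.
SAMPLE = """\
IGMP Connected Group Membership
Group Address    Interface            Uptime    Expires   Last Reporter
233.2.171.1      Vlan1                1d03h     00:02:16  140.221.10.87
233.2.171.1      GigabitEthernet5/7   7w0d      00:02:21  140.221.34.1
"""


def interfaces_with_group(output: str, group: str) -> List[str]:
    """Return the interfaces on which the router has members for the group."""
    found = []
    for line in output.splitlines():
        fields = line.split()
        # Data rows have five columns and start with the group address.
        if len(fields) >= 5 and fields[0] == group:
            found.append(fields[1])
    return found


print(interfaces_with_group(SAMPLE, "233.2.171.1"))
```

The receiver’s interface (GigabitEthernet5/7 in this example) should be in the returned list.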

Page 81: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

 

On the DR (Juniper):

remote@MREN-M5> show igmp group 233.2.171.1
Interface: ge-0/0/0.108
    Group: 233.2.171.1
        Source: 0.0.0.0
        Last Reported by: 206.220.240.86
        Timeout: 156 Type: Dynamic
remote@MREN-M5>

(233.2.171.1 is the group you are debugging.)

If the receiver’s interface is in this list, you are OK. You might want to watch for a while to ensure no timeouts are occurring.

Page 82: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• What if your interface isn’t listed with that group, even though everything else about the DR looked fine??

• You have a problem!
  – Host OS / driver problem
  – Application problem
  – Broken IGMP snooping switches in the middle
  – Try tcpdump on the host: can you see the IGMP membership reports on the wire? (Remember, they don't have to come from that particular host.)

STOP

Page 83: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify Receiver Interest

• If your receiver’s DR knows it has listeners of your group on that interface, you are done with this step.

Move on to step 3…

Page 84: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

STEP 3: VERIFY KNOWLEDGE OF ACTIVE SOURCE

Page 85: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• This is often the most complex part – the bulk of your work could be here. As we have learned, a lot has to happen for the receiver’s DR to know about a particular source.

• You MAY have to view this from both ends
  – The receiver’s RP
  – The source’s RP

• For most interdomain cases, these RPs will not be the same, and MSDP will be involved.

Page 86: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• First, let’s check to see if this is a problem at all.

• If the receiver’s DR has (S,G) state already, we know we are ok on knowledge of active source, and we can skip this whole step!

[Figure: source, two RPs, the receiver’s DR, and the receiver. Check for (S,G) state on the receiver’s DR.]

Page 87: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

On the receiver's DR (Cisco):

squash# show ip mroute 233.2.171.1 141.142.64.104
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, s - SSM Group, C - Connected, L - Local,
       P - Pruned, R - RP-bit set, F - Register flag, T - SPT-bit set,
       J - Join SPT, M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP, U - URD,
       I - Received Source Specific Host Report
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(141.142.64.104, 233.2.171.1), 1w0d/00:02:59, flags: CJT
  Incoming interface: Vlan669, RPF nbr 130.202.222.74
  Outgoing interface list:
    GigabitEthernet5/7, Forward/Sparse, 20:19:14/00:02:08
    Vlan1, Forward/Sparse, 1w0d/00:01:56

GOOD!

Page 88: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

On the receiver's DR (Juniper):

remote@starlight-m10> show multicast route group 233.2.171.1 source-prefix 141.142.64.104
Family: INET
Group           Source prefix        Act Pru InIf  NHid  Session Name
233.2.171.1     141.142.64.104 /32   A   F   6     246   Static Alloc

GOOD!

The same command with “extensive” appended:

Family: INET
Group           Source prefix        Act Pru NHid  Packets   IfMi  Timeout
233.2.171.1     141.142.64.104 /32   A   F   246   8702556   69    360
    Upstream interface: ge-0/0/0.0
    Session name: Static Allocations
    Forwarding rate: 1 kBps (9 pps)

Page 89: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• If the DR does NOT know about the source, we may only see a ( * , G) entry on a Cisco DR, and we have some work to do.

squash# show ip mroute 233.2.171.1 141.142.64.104
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, s - SSM Group, C - Connected, L - Local,
       P - Pruned, R - RP-bit set, F - Register flag, T - SPT-bit set,
       J - Join SPT, M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP, U - URD,
       I - Received Source Specific Host Report
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 233.2.171.1), 7w0d/00:02:59, RP 192.5.170.2, flags: SJCF
  Incoming interface: Vlan29, RPF nbr 140.221.20.97
  Outgoing interface list:
    GigabitEthernet5/7, Forward/Sparse, 20:22:27/00:02:52
    Vlan1, Forward/Sparse, 7w0d/00:02:45

( * , G) only is BAD!
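The GOOD/BAD distinction on the Cisco DR comes down to which entries appear in the mroute output. A minimal Python sketch of that classification, assuming the output is available as text; mroute_state is a hypothetical helper and the two test strings are entry lines taken from the Cisco examples in this section.

```python
import re

# Matches entry headers such as "(141.142.64.104, 233.2.171.1)" or "(*, 233.2.171.1)".
ENTRY = re.compile(r"\((\*|\d+\.\d+\.\d+\.\d+), (\d+\.\d+\.\d+\.\d+)\)")


def mroute_state(output: str, source: str, group: str) -> str:
    """Classify 'show ip mroute' output for one source and group."""
    entries = set(ENTRY.findall(output))
    if (source, group) in entries:
        return "(S,G)"        # DR knows about the source: GOOD
    if ("*", group) in entries:
        return "(*,G) only"   # receiver interest but no source knowledge: BAD
    return "none"


good = "(141.142.64.104, 233.2.171.1), 1w0d/00:02:59, flags: CJT"
bad = "(*, 233.2.171.1), 7w0d/00:02:59, RP 192.5.170.2, flags: SJCF"
print(mroute_state(good, "141.142.64.104", "233.2.171.1"))
print(mroute_state(bad, "141.142.64.104", "233.2.171.1"))
```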

Page 90: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• If the DR does NOT know about the source, we may see nothing on a Juniper DR, and we have some work to do.

BAD!

remote@starlight-m10> show multicast route group 233.2.171.1 source-prefix 141.142.64.104
Family: INET
Group           Source prefix        Act Pru InIf  NHid  Session Name

remote@starlight-m10>

Page 91: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Recall that knowledge of active sources is first spread through a given PIM domain by per-group RP-rooted shared distribution trees.

• Current practice is to set the Shortest Path Tree (SPT) threshold to zero, so that (S,G) state is created on the first packet sent through the RP.

• But if the RPT doesn’t get built properly, the SPT never will!

Page 92: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• So, first, we will work back from the receiver’s DR to its RP, to be sure that the RPT branch is built correctly.

• Second, we will check to see if the receiver’s RP knows about the source.

• Third, we will check with the source end for their RP’s knowledge and advertisement of the source.

• Last, we will troubleshoot MSDP as needed to make sure knowledge of the source can get from one RP to the other.

• The following page has a rough flowchart for later reference.

Page 93: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

Recv DR know of source?
  Yes, but still no traffic → Go to step 4
  No → Is RPT built correctly recv DR to recv RP?
    No → Troubleshoot RPF, PIM
    Yes → Recv RP know of source?
      Yes → …
      No → Source RP know of source?
        No → Troubleshoot source DR to RP
        Yes → Troubleshoot MSDP
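The flowchart can also be read as a chain of yes/no checks. A minimal Python sketch, assuming you already know the answer to each question; next_action and its return strings are illustrative only. The branch where the receiver’s RP knows the source but the DR does not has no clear target in the chart, so the sketch returns None for it rather than guess.

```python
from typing import Optional


def next_action(dr_knows_source: bool,
                rpt_ok: bool,
                recv_rp_knows_source: bool,
                src_rp_knows_source: bool) -> Optional[str]:
    """Walk the step 3 flowchart and return the next thing to do."""
    if dr_knows_source:
        return "go to step 4"
    if not rpt_ok:
        return "troubleshoot RPF, PIM"
    if recv_rp_knows_source:
        return None  # target of this branch is not clear from the chart
    if not src_rp_knows_source:
        return "troubleshoot source DR to RP"
    return "troubleshoot MSDP"


print(next_action(False, True, False, True))
```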

Page 94: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• First, we check that the RPT is built properly from the receiver’s DR back to the receiver’s RP.

[Figure: receiver, DR, intermediate router, RP. Each hop from the DR towards the RP does an RPF check and sends a ( * ,G) join.]

Page 95: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Does the DR have the right RP (Cisco)?
  – We can first just look at the ( * , G) entry on the receiver's DR.
  – If that doesn't look right, we can look to see how it learned about the RP with show ip pim rp mapping <group> .

squash# show ip mroute 233.2.171.1
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, s - SSM Group, C - Connected, L - Local,
       P - Pruned, R - RP-bit set, F - Register flag, T - SPT-bit set,
       J - Join SPT, M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP, U - URD,
       I - Received Source Specific Host Report
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 233.2.171.1), 7w0d/00:02:59, RP 192.5.170.2, flags: SJCF
  Incoming interface: Vlan29, RPF nbr 140.221.20.97
  Outgoing interface list:
    GigabitEthernet5/7, Forward/Sparse, 20:22:27/00:02:52
    Vlan1, Forward/Sparse, 7w0d/00:02:45

Page 96: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Does the DR have the right RP (Juniper)?

remote@MREN-M5> show pim rps detail
Instance: PIM.master
Family: INET
RP: 206.220.241.254
Learned via: static configuration
Time Active: 13w2d 09:59:40
Holdtime: 0
Group Ranges:
        224.0.0.0/4
Active groups using RP:
        224.2.127.254
        233.2.171.1
        239.22.33.5
        total 3 groups active
remote@MREN-M5>

Page 97: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• What if the RP is wrong?
  – A common problem is that auto-RP and/or PIMv2 BSR may be running without the admin's knowledge (on Ciscos they are on by default when PIM-SM is enabled, and Junipers listen to them). Information can leak from a neighboring AS! These take precedence over anything you statically configure.
    Hint: use ip pim rp-address <address> override
  – Auto-RP and BSR are complex, and could have any one of a number of problems. We recommend static configuration in most campus networks, Anycast-RP in backbone/transit networks.
  – Might just be a typo in entering the static RP address.

Page 98: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Now that you are sure of what the RP is (and it is correct), starting at the receiver’s DR, work your way back to the receiver’s RP:

• Check that the RPF is pointing the way you expect.

• Check that PIM is configured and working properly on the interface. A common problem is that PIM is not turned on for a particular interface.

• You may also want to double-check that each router has ( * , G) state for the group you are debugging.

Page 99: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

Cisco:

– show ip rpf <RP ip address>
– show ip pim neighbor <rpf interface>

squash# show ip rpf 192.5.170.2
RPF information for kiwi-loop.anchor.anl.gov (192.5.170.2)
  RPF interface: Vlan29
  RPF neighbor: kiwi.anchor.anl.gov (140.221.20.97)
  RPF route/mask: 192.5.170.2/32
  RPF type: unicast (ospf 683)
  RPF recursion count: 0
  Doing distance-preferred lookups across tables

squash# show ip pim neighbor Vlan29
PIM Neighbor Table
Neighbor Address  Interface  Uptime  Expires   Ver  Mode
140.221.20.97     Vlan29     7w0d    00:01:35  v2   (DR)
squash#

Page 100: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

Juniper:

– show multicast rpf <RP ip address>
– show pim neighbors

remote@MREN-M5> show multicast rpf 206.220.241.254
Multicast RPF table: inet.2, 5061 entries

206.220.241.0/24
    Protocol: BGP
    Interface: ge-0/0/0.108

remote@MREN-M5> show pim neighbors
Instance: PIM.master

Interface       IP  V  Mode  Option  Uptime   Neighbor addr
at-0/2/1.237     4  2        H       4w6d11h  192.122.182.13
at-0/2/1.6325    4  2        H       4w6d11h  206.166.9.33
at-0/2/1.9149    4  2        HP B    4w6d11h  199.104.137.245
ge-0/0/0.108     4  2        H G     4w6d11h  140.221.20.97

Page 101: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Repeat that process until you have verified the RPF paths and the PIM adjacencies back to the receiver's RP. This verifies that the RPT has been built correctly.

[Figure: receiver, DR, intermediate router, RP. Each hop from the DR towards the RP does an RPF check and sends a ( * ,G) join.]

Page 102: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Next Big Question: Does the receiver's RP have knowledge of the active source?

• Since we already checked that the RPT is correct, it probably doesn’t, or the DR would have likely had (S,G) information.

• If it doesn’t, but has ( * , G) only, and no MSDP SA (source-active) cache entry for that source, we will have to find out some information about the source end of things, then troubleshoot MSDP.

• Note it does not matter which peer you get an SA from as long as it is accepted and in the cache. However, if you are going to open a ticket with an upstream, you might as well figure out who you expect to accept it from.

Page 103: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• The objective here will be to get an MSDP source-active about the source to our receiver’s RP.

• The SA originates from the source’s RP, and is re-advertised/flooded by MSDP peers along the way.

• Some sites have estimated that about half of their multicast problems are problems associated with missing MSDP SA information.

[Figure: five PIM domains (A to E), each with an RP, connected by MSDP peerings. Source s sends a Register for (192.1.1.1, 233.2.171.1) to its RP, which originates an SA that is flooded from RP to RP over MSDP. Receiver r joins (*, 233.2.171.1) towards the RP in its own domain.]

Page 104: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

On the receiver’s RP:

Kiwi#sh ip mroute 233.2.171.1 141.142.64.102
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Grp, s - SSM Grp, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advert,
       U - URD, I - Recved Source Specific Host Rpt, Z - Mcast Tunnel,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(*, 233.2.171.1), 6w6d/stopped, RP 192.5.170.2, flags: S
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet5/0, Forward/Sparse, 6w6d/00:03:01

BAD!

Kiwi#sh ip msdp sa-cache 233.2.171.1 141.142.64.102
MSDP Source-Active Cache Entry not found

BAD!

Page 105: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Recall it is MSDP's job to flood source-active advertisements between peers so that an RP in one PIM domain can know about active sources in another.

• MSDP SA advertisements are accepted/forwarded or rejected based on MSDP "peer-RPF" rules covered earlier in this workshop.

• Remember, the information being tested against the peer-RPF rules is the originating RP's IP address. Not the IP of the source itself, but its RP.

• We need to trace the source-RP via the peer-RPF rules from our receiver's RP out into our neighbor's AS.

Page 106: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• But… how do we know the source’s RP if we run only the receiver network?
  – You may have to pick up the phone and walk them through verifying the source’s DR and finding the group-to-RP mapping there.
  – Get them to tell you they have verified the source is sending, the group, port number, source TTL setting and the IP of their RP is ___.
  – You might want to have them look to see that they mark the mroute as a candidate for MSDP advertisement while you're there. (Example - next slide.)

Page 107: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

On the source’s RP to show generating an SA (141.142.64.104 is the source IP):

Kiwi#sh ip mroute 233.2.171.1 141.142.64.104
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, B - Bidir Group, s - SSM Group, C - Connected,
       L - Local, P - Pruned, R - RP-bit set, F - Register flag,
       T - SPT-bit set, J - Join SPT, M - MSDP created entry,
       X - Proxy Join Timer Running, A - Candidate for MSDP Advertisement,
       U - URD, I - Recv Source Specific Host Report, Z - Multicast Tunnel,
       Y - Joined MDT-data group, y - Sending to MDT-data group
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(141.142.64.104, 233.2.171.1), 6w6d/00:03:26, flags: TA
  Incoming interface: GigabitEthernet5/0, RPF nbr 141.142.20.124
  Outgoing interface list:
    ATM3/0.6200, Forward/Sparse, 2w0d/00:02:42 (ttl-threshold 32)
Kiwi#

Page 108: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Now we have the source/originating RP's IP address.

• The idea here is we are trying to figure out which of our MSDP peers we should expect to get knowledge of the actual source from.
  – If the source RP is an MSDP peer of our RP, the source RP is the RPF peer.
  – Otherwise, look at show ip mbgp <source RP IP> : the MSDP peer in the adjacent AS on the best path is the RPF peer.
  – In practice, in most campus networks, show ip rpf <source RP IP> and show ip mbgp <source RP IP> will usually get you going in the right direction.

Page 109: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

(141.142.20.124 is the source’s RP.)

guava#sh ip rpf 141.142.20.124
RPF information for lsd6509.sl.startap.net (206.220.241.254)
  RPF interface: Vlan109
  RPF neighbor: mren-anl-gige.anchor.anl.gov (192.5.170.214)
  RPF route/mask: 141.142.0.0/16
  RPF type: mbgp
  RPF recursion count: 0
  Doing distance-preferred lookups across tables

guava#sh ip mbgp 141.142.20.124
BGP routing table entry for 141.142.0.0/16, version 1977637
Paths: (2 available, best #1, table NULL)
Flag: 0x208
  Advertised to peer-groups:
     imbgp-mesh
  22335 11537 1224
    192.5.170.214 from 192.5.170.214 (206.220.241.254)
      Origin IGP, localpref 40100, valid, external, best
      Community: 683:65001 11537:950 22335:11537
  293 11537 1224
    192.5.170.78 from 192.5.170.78 (134.55.29.97)
      Origin IGP, metric 100, localpref 10000, valid, external
      Community: 293:52 683:293 no-export

Page 110: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Assuming we do not have an entry for the source and group in our receiver RP's SA-cache, we might be able to see if we are getting a reasonable SA advertisement but rejecting it:

LSD6509#sh ip msdp sa-cache 233.2.171.1 141.142.64.104 rejected detail read-only
MSDP Rejected SA Cache
5285 rejected SAs received over 00:00:13, cache size: 2000 entries
Timestamp (source, group)
3928782.016, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:206.220.240.220 Reason: rpf-fail
3928782.076, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:205.189.32.74 Reason: rpf-fail
3928782.120, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:205.189.32.70 Reason: rpf-fail
3928782.148, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:205.213.117.13 Reason: rpf-fail

This is a circular buffer, so it's hit-or-miss...

• On a Juniper, turn on MSDP traceoptions and search the file.

    flag source-active receive detail
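Because the rejected-SA cache is a circular buffer, it can help to capture the output a few times and tally which peers the rejected SAs came from. A minimal Python sketch, assuming the Cisco line format shown on this slide; rejected_by_peer is a hypothetical helper and SAMPLE holds two of the lines from the slide.

```python
import re
from collections import Counter

# One rejected-SA line: "(S, G), RP: <rp>, Peer:<peer> Reason: <reason>"
LINE = re.compile(
    r"\((?P<src>[\d.]+), (?P<grp>[\d.]+)\), RP: (?P<rp>[\d.]+), "
    r"Peer:\s*(?P<peer>[\d.]+)\s+Reason: (?P<reason>\S+)"
)

SAMPLE = """\
3928782.016, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:206.220.240.220 Reason: rpf-fail
3928782.076, (141.142.64.104, 233.2.171.1), RP: 141.142.12.1, Peer:205.189.32.74 Reason: rpf-fail
"""


def rejected_by_peer(output: str, source: str, group: str) -> Counter:
    """Count rejected SAs for one (source, group), keyed by (peer, reason)."""
    tally = Counter()
    for match in LINE.finditer(output):
        if match["src"] == source and match["grp"] == group:
            tally[(match["peer"], match["reason"])] += 1
    return tally


for (peer, reason), count in rejected_by_peer(SAMPLE, "141.142.64.104", "233.2.171.1").items():
    print(peer, reason, count)
```

If the peer you expected to be the RPF peer shows up here with rpf-fail, the SA is arriving but your own peer-RPF check is rejecting it.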

Page 111: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• If we are getting an SA from what we think should be the RPF peer, yet rejecting it, we need to work through the MSDP peer-RPF rules to figure out why. Possible reasons:
  – We've configured to use only the multicast RIB, yet we have no MBGP route to the originating RP. Check that the source network is advertising the route to the RP in MBGP and we are accepting it (policy misconfigurations).
  – We have MBGP running, but not MSDP, with a peer that appears to have a better route to the originating RP than who we think is the RPF peer.
  – Incorrectly configured default peer.
  – Bugs, voodoo, who knows!

Page 112: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Assuming you are not getting an SA from the peer you think should be the RPF peer, you may need to open a ticket with your upstream provider or peer. You can give them the following:
  – We are not getting an SA for <source IP address>
  – The group address is <group address>
  – The source’s RP is <source RP IP address>
  – We expected to get this from <MSDP peer’s IP address>

• Also report if you’re not getting the MBGP route.
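If you open these tickets often, the four lines above are easy to template. A minimal Python sketch; missing_sa_ticket is a hypothetical helper, and the addresses in the usage line are taken from the examples earlier in this section purely for illustration.

```python
def missing_sa_ticket(source: str, group: str, source_rp: str, expected_peer: str) -> str:
    """Format the four facts an upstream needs to chase a missing MSDP SA."""
    return "\n".join([
        f"We are not getting an SA for {source}",
        f"The group address is {group}",
        f"The source's RP is {source_rp}",
        f"We expected to get this from {expected_peer}",
    ])


print(missing_sa_ticket("141.142.64.104", "233.2.171.1", "141.142.12.1", "192.5.170.214"))
```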

Page 113: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Other than just turning the problem over to your upstream provider, for many Internet2 campuses, Abilene core routers will be in the path.

• It is sometimes helpful to go to the router proxy closest to the source and check for the SA-cache entry for the source/group in question there.

• If there is no entry there, it is not too surprising your campus is not getting a valid SA. (We have a screenshot at the end of these slides.)

http://ratt.uits.iu.edu/routerproxy/abilene/

Page 114: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Verify knowledge of active source

• Since you have already checked your path back from the receiver to your RP, you should get (S,G) state on the receiver’s DR once you fix whatever was causing a received SA to be rejected, or once your upstream provider or peer resolves the ticket about the missing SA.

Move on to step 4…

Page 115: Multicast Troubleshooting Tutorial Caren Litvanyi litvanyi@grnoc.iu.edu Joint Techs Meeting Salt Lake City, Utah February 2005

Overview Refresher!

Gather information

Verify receiver interest

Verify knowledge of active source

Trace forwarding state back


STEP 4: TRACE FORWARDING STATE BACK


Trace forwarding state back

• We now have (S,G) state on the receiver’s DR.

• Next, we need to check to see if traffic is actually flowing… (Cisco example)

squash# show ip mroute 233.2.171.1 141.142.64.104 count
IP Multicast Statistics
226 routes using 103842 bytes of memory
42 groups, 4.38 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kilobits per sec
Other counts: Total/RPF fail/Other drops(OIF-null,rate-limit,etc)

Group: 233.2.171.1, Source count: 100, Group pkt count: 987910557
  Source: 141.142.64.104/32, Forwarding: 0/0/0/0, Other: 6/0/6
squash#

If the Forwarding counts are zero, as they are here, you still have a problem.
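If you collect this output from several routers, the per-source line can be pulled apart with a regular expression. This is a rough sketch that assumes the line layout shown in these slides; the names `SOURCE_LINE`, `parse_source_counts`, and `is_forwarding` are hypothetical, and treating "packets per second greater than zero" as "traffic is flowing" is an assumption, not a rule from the router documentation.

```python
import re

# Matches the per-source line of "show ip mroute <group> <source> count",
# e.g. "Source: 141.142.64.104/32, Forwarding: 0/0/0/0, Other: 6/0/6".
SOURCE_LINE = re.compile(
    r"Source:\s*(?P<source>\d+\.\d+\.\d+\.\d+)/\d+,\s*"
    r"Forwarding:\s*(?P<pkts>\d+)/(?P<pps>\d+)/(?P<size>\d+)/(?P<kbps>\d+),\s*"
    r"Other:\s*(?P<total>\d+)/(?P<rpf_fail>\d+)/(?P<other_drops>\d+)"
)


def parse_source_counts(output):
    """Return a dict of counters from the first per-source line, or None."""
    match = SOURCE_LINE.search(output)
    if match is None:
        return None
    fields = match.groupdict()
    # Keep the source address as a string; convert every counter to int.
    return {k: (v if k == "source" else int(v)) for k, v in fields.items()}


def is_forwarding(counts):
    """Assumption: a non-zero packets-per-second figure means traffic flows."""
    return counts["pps"] > 0
```

A non-zero `other_drops` alongside zero forwarding counts, as in the squash output, suggests packets are arriving but being dropped rather than forwarded.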


Trace forwarding state back

• Here’s how to check if traffic is flowing on a Juniper:

litvanyi@starlight-m10> show multicast route group 233.2.171.1 source-prefix 141.142.64.104 extensive
Family: INET
Group          Source prefix        Act Pru NHid  Packets ...
233.2.171.1    141.142.64.104 /32   A   F   426   0   0   249
    Upstream interface: ge-0/0/0.11537
    Session name: Static Allocations
    Forwarding rate: 0 kBps (0 pps)

litvanyi@starlight-m10>

If the forwarding rate is zero, as it is here, you still have a problem.
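The same check can be scripted for the Juniper output. This sketch assumes the "Forwarding rate: N kBps (M pps)" wording shown above; `RATE_LINE` and `parse_forwarding_rate` are hypothetical names.

```python
import re

# Matches e.g. "Forwarding rate: 1 kBps (9 pps)" from
# "show multicast route ... extensive".
RATE_LINE = re.compile(r"Forwarding rate:\s*(\d+)\s*kBps\s*\((\d+)\s*pps\)")


def parse_forwarding_rate(output):
    """Return (kBps, pps) as integers, or None if the line is absent."""
    match = RATE_LINE.search(output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A result of `(0, 0)` corresponds to the "still have a problem" case on this slide.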


Trace forwarding state back

• Start on your receiver’s DR.

• This time, RPF back towards the actual source IP address (as opposed to the source RP).

On a Cisco:

squash# show ip rpf 141.142.64.104
RPF information for ag-nl-video.ncsa.uiuc.edu (141.142.64.104)
  RPF interface: Vlan669
  RPF neighbor: guava-stardust.anchor.anl.gov (130.202.222.74)
  RPF route/mask: 0.0.0.0/0
  RPF type: unicast (ospf 683)
  RPF recursion count: 0
  Doing distance-preferred lookups across tables


Trace forwarding state back

On a Juniper:

litvanyi@starlight-m10> show multicast rpf 141.142.64.104
Multicast RPF table: inet.2, 5060 entries

204.121.50.0/24
    Protocol: BGP
    Interface: ge-0/0/0.293
    Neighbor: 198.125.140.97

litvanyi@starlight-m10>

• You are checking how you expect the SPT to be built, that is, where you actually expect the packet flow to come from.
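Conceptually, the RPF check picks the route that best matches the source address and reports the interface and neighbor on that route. The sketch below shows a simplified longest-prefix-match version of that idea. It is an illustration only: real routers also decide which table to consult (for example inet.2 or MBGP versus unicast) and may prefer routes by administrative distance, and the table contents, interface names, and the function `rpf_lookup` are all hypothetical.

```python
import ipaddress

# Hypothetical RPF table: (prefix, interface, neighbor). The addresses are
# documentation ranges, not taken from the routers in these slides.
RPF_TABLE = [
    ("0.0.0.0/0", "ge-0/0/0.1", "192.0.2.1"),
    ("203.0.113.0/24", "ge-0/0/0.2", "198.51.100.1"),
]


def rpf_lookup(source_ip, rpf_table):
    """Return (network, interface, neighbor) for the longest matching prefix.

    Returns None when no prefix in the table covers the source address.
    """
    addr = ipaddress.ip_address(source_ip)
    best = None
    for prefix, interface, neighbor in rpf_table:
        net = ipaddress.ip_network(prefix)
        # Prefer the most specific (longest) prefix that contains the source.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, interface, neighbor)
    return best
```

Here `rpf_lookup("203.0.113.10", RPF_TABLE)` selects the /24 entry, so the (S,G) join would be expected to go out `ge-0/0/0.2` towards `198.51.100.1`, and that is the interface on which traffic should arrive.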


Trace forwarding state back

• Work your way back towards the source IP, looking for PIM problems along the way.

Cisco:

squash# show ip pim neighbor Vlan669
PIM Neighbor Table
Neighbor Address  Interface  Uptime  Expires   Ver  Mode
130.202.222.74    Vlan669    7w0d    00:01:35  v2   (DR)

Juniper:

litvanyi@starlight-m10> show pim neighbors detail | find "ge-0/0/0.293"

Interface: ge-0/0/0.293

    Address: 198.125.140.97, IPv4, PIM v2
        Hello Option Holdtime: 105 seconds 98 remaining
        Hello Option DR Priority: 1
        Hello Option LAN Prune Delay: delay 500 ms override 2000 ms
        Rx Join: Group          Source            Timeout
                 233.2.171.1    203.255.248.51    201
                 233.2.171.1    150.183.121.105   201
                 233.2.171.1    131.94.133.48     201


Trace forwarding state back

• Log into that upstream router and check state there with:

router# show ip mroute <group> <source>

router# show ip mroute <group> <source> count

• Or (Juniper):

router> show multicast route group <group> source-prefix <source> extensive

• Look to see if the downstream router is in the outgoing interface list, and whether you see a positive traffic rate.

• Hopefully you will work your way back to a router that is seeing the traffic flow.
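The hop-by-hop walk described above amounts to finding the first place where an upstream router sees traffic and its downstream neighbor does not. The following sketch assumes you have already recorded a packets-per-second figure at each hop; the router names and the function `find_loss_boundary` are hypothetical.

```python
def find_loss_boundary(hops):
    """Locate where traffic stops along the path.

    hops is a list of (router_name, pps) tuples ordered from the
    receiver's DR back towards the source. Returns a
    (downstream, upstream) pair of names where the upstream router
    sees traffic and the downstream one does not, or None.
    """
    for downstream, upstream in zip(hops, hops[1:]):
        if downstream[1] == 0 and upstream[1] > 0:
            return downstream[0], upstream[0]
    return None
```

For example, `find_loss_boundary([("dr", 0), ("hop1", 0), ("hop2", 9)])` returns `("hop1", "hop2")`, which says the packets are being lost between those two routers. A result of `None` with all-zero rates means you have not yet reached a router that sees the flow.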


Trace forwarding state back

[Diagram: the receiver, its DR, and the RP. Hops back from the DR are labeled “RPF, (S,G) join” and “Traffic?”; the label “don’t care” also appears.]

We are tracing back the SPT...


Trace forwarding state back

Kiwi#sh ip mroute 233.2.171.1 141.142.64.104 count
IP Multicast Statistics
493 routes using 224398 bytes of memory
71 groups, 5.94 average sources per group
Forwarding Counts: Pkt Count/Pkts per second/Avg Pkt Size/Kbits per sec
Other counts: Total/RPF failed/Other drops(OIF-null, rate-limit etc)

Group: 233.2.171.1, Source count: 123, Group pkt count: 82381322
  Source: 141.142.64.104/32, Forwarding: 37847545/9/89/6, Other: 33/0/0

Kiwi#sh ip mroute 233.2.171.1 141.142.64.104
IP Multicast Routing Table
Flags: <cut>
Outgoing interface flags: H - Hardware switched
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode

(141.142.64.104, 233.2.171.1), 6w6d/00:03:26, flags: TA
  Incoming interface: Vlan109, RPF nbr 192.5.170.214, Mbgp, RPF-MFD
  Outgoing interface list:
    Vlan669, Forward/Sparse, 5d18h/00:02:37, H


Trace forwarding state back

Juniper:

litvanyi@starlight-m10> show multicast route group 233.2.171.1 source-prefix 141.142.64.104 extensive

Family: INET
Group          Source prefix       Act Pru NHid  Packets   IfMismtch  Timeout
233.2.171.1    128.55.247.10 /32   A   F   426   5251621   0          360
    Upstream interface: ge-0/0/0.293
    Session name: Static Allocations
    Forwarding rate: 1 kBps (9 pps)


Trace forwarding state back

• If you get to a point where the upstream router IS showing it is receiving the packets, but your downstream is not, you need to figure out why those packets are getting lost.

– ACLs or VPNs?

– Broken IGMP snooping switch in the middle?

– PIM problem?

– TTL on sender too low?
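On the last item: many IP stacks default to a multicast TTL of 1, so packets from an application that never sets it will not cross the first router. The sketch below shows how a sender written in Python might raise it; `open_sender` is a hypothetical helper, and the TTL value in the usage note is only an example.

```python
import socket


def open_sender(ttl):
    """Create a UDP socket whose multicast packets are sent with this TTL."""
    if not 1 <= ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    # IP_MULTICAST_TTL applies only to multicast destinations; unicast
    # traffic from the same socket keeps the normal IP_TTL setting.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, ttl)
    return sock
```

A sender created with `open_sender(32)` could then call `sendto(data, (group, port))` and have its packets survive up to 32 router hops. When debugging someone else's application, look for the equivalent TTL option in its own settings.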


Trace forwarding state back

• You may work this back to the edge of your area of responsibility, and may have to open a ticket with your upstream to continue the process towards the source. Give them:

– The active source IP address

– The group address

– The circuit / link towards which your router has sent the (S,G) join

– The fact that you are not receiving packets for that (S,G) on that shared link.


Summary

Gather information

Verify receiver interest

Verify knowledge of active source

Trace forwarding state back


Summary

Gather information

• Pick a direction

• Active source and receiver IP addresses

• Group address


Summary

Verify receiver interest

• Identify the DR for the receiver.

• Verify the DR knows of interest in that group.

• Check that the DR is not receiving traffic.


Summary

Verify knowledge of active source

• Might mean fixing multicast reachability topology or PIM state.

• Probably will involve MSDP SA debugging.


Summary

Trace forwarding state back

• Trace forwarding state from receiver’s DR.

• Work towards the actual source.

• Verify reachability, PIM state, and whether traffic is flowing at each step.


A word on troubleshooting performance problems...

• Performance problems in multicast inherit virtually all the problems associated with unicast performance issues, which you know how to troubleshoot:

– packet loss due to congestion.

– latency/jitter due to queueing, traffic shaping devices, interleaving, etc.

– duplex problems, cable issues, etc.

• Users often neglect to look at their host performance. Video apps can drive the CPU to where it cannot handle the load.

• It is usually more fruitful to look to the above issues before spending a lot of time looking at timers and such in multicast protocols.


Tools

• Beacon http://dast.nlanr.net/projects/Beacon/

– The beacon is an application to monitor multicast reachability and performance among beacon-group participants. Participants both send and receive on a known group.

– The results are displayed as a matrix of hosts, with receivers on the vertical axis and sources on the horizontal axis.

– A host’s source number matches its receiver number.


Tools

http://dast.nlanr.net/projects/Beacon/


Tools

• If the beacon is broken, that gives you higher confidence the problem is not just user error or host issues.

• It is sometimes possible to use the beacon as the constantly active source and receiver for debugging.

• However, many times the beacon can be fine yet multicast is broken for a different group.

• It will not catch new/transient problems with source knowledge or state creation, because the beacon’s tree has already been built.

• Encourage sites you collaborate with to participate in a beacon group!


Tools

• Example: GEANT http://beaconserver.geant.net:9999


Tools

• Some web tools exist to look at peers’ routers.

• Again, the Abilene router proxy: http://ratt.uits.iu.edu/routerproxy/abilene/

• Also, some looking-glass pages include multicast information as queries you can run: http://www.nordu.net/connectivity/looking-glass/lg.cgi

• You can get the proxy code free from IU after signing a license agreement. You can freely download the looking glass code and modify it yourself if you would like to make your network visible to others.


Tools

• rtpqual ftp://ftp.ee.lbl.gov/rtpqual.c

– Simple Multiprotocol Multicast Signal Quality Meter

– very useful for establishing a receiver (even if the multicast is not using RTP)

– also useful for finding packet loss problems and whether they are periodic or not

– If you know the group but not the port, you can use rtpqual to join with any port, then use tcpdump to find out which port the traffic is actually going to.

• Mtrace ftp://ftp.parc.xerox.com/pub/net-research/ipmulti/mtrace5.2.tar.gz

– Simple host-based rpf check tool

• Iperf http://dast.nlanr.net/projects/iperf

– Source/client traffic generator that can generate multicast packets (requires access to device at both ends of path)
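If none of these tools is at hand, a receiver can also be established with a few lines of Python, since joining a group from any host is enough to make the DR see interest. This is a minimal sketch under stated assumptions, not a replacement for rtpqual: `make_mreq`, `open_receiver`, and `print_packets` are hypothetical helper names, and the code joins on the default interface chosen by the operating system.

```python
import ipaddress
import socket


def make_mreq(group, local_ip="0.0.0.0"):
    """Build the 8-byte ip_mreq structure used by IP_ADD_MEMBERSHIP."""
    if not ipaddress.ip_address(group).is_multicast:
        raise ValueError(f"{group} is not a multicast address")
    # Group address followed by the local interface address (0.0.0.0 = any).
    return socket.inet_aton(group) + socket.inet_aton(local_ip)


def open_receiver(group, port):
    """Bind a UDP socket to the port and join the multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # Joining triggers an IGMP membership report towards the local router.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, make_mreq(group))
    return sock


def print_packets(sock, count):
    """Print the sender and size of the next `count` packets received."""
    for _ in range(count):
        data, (src_ip, src_port) = sock.recvfrom(65535)
        print(f"{len(data)} bytes from {src_ip}:{src_port}")
```

As with rtpqual, the join itself does not depend on the port: `open_receiver("233.2.171.1", 5004)` (the port is an arbitrary example) creates the group membership even if the traffic uses a different port, and tcpdump on the host can then show which port the packets are actually addressed to.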


Internet2 Workshops

• Your institution can sponsor a 2.5-day hands-on workshop with lectures and labs!

• Typically 12-20 students.

• Lab setup consists of 4 “pods” of 5 routers and a switch, plus 2 PCs (Linux) with multicast tools and a camera. Currently, each pod has 4 Cisco routers, one Juniper, and an HP switch.

• Varying amount of instructor support possible (0-4 instructors).

• Does not require multicast connectivity to the world, just a unicast tunnel.

• See: http://multicast.internet2.edu/workshops/


Information Online

• tutorial-style paper at: http://multicast.internet2.edu/almeroth.pdf

• http://www.ncne.nlanr.net/documentation/faq/mcast_eng_faq.html

• http://dast.nlanr.net/projects/Beacon/

• GEANT: http://www.dante.net/nep/GEANT-MULTICAST/ (links to some troubleshooting docs and monitoring tools)

• ftp://ftpeng.cisco.com/ipmulticast.html

• http://www.sprint.net/multicast/faq.html

• Abilene router proxy: http://ratt.uits.iu.edu/routerproxy/abilene/


Questions?

Thank you!

Caren Litvanyi litvanyi@grnoc.iu.edu