Inter-Peer NOC Communication
Mike Hughes [email protected]
Scene Setting: Straw Poll
Who here in this room does peering?
Have you ever had issues resolving problems with your peerings?
– Difficulties contacting peers, finding the right contact, communication problems?
Do you maintain a local db of contacts?
– Why? Issues with freshness of data?
When a peer needs to talk to you, where does their call/email arrive?
– Main NOC contact? Dedicated peering contact? “Customer Care”?
Some names have been changed to protect the innocent… and guilty…
Why do you go peering?
Long-term money savings
Less transit
Lower latency, better performance
Traffic control
Diversity, reliability
Presence
…and so on…
Where’s the problem?
Poor inter-peer communication seems to be common
– Friendly IX operator called in to “mediate”
Communication hitting the wrong place
– Customer NOCs
– IX operator
– IP address maintainer (e.g. whois contact)
Identifying the right contact
Sources of information:
– Whois queries to databases
– IXP-maintained NOC and Peering contact db
– Internal databases
– Third-party voluntary databases
  • http://puck.nether.net/netops list
  • peeringdb.com
All above are vulnerable to information “rot”
How to drive RIPEdb/RA, etc
Some really subtle differences in the implementations
– RIPE expects “AS” before an AS number!
Which contacts are useful
Which objects to look up
– Like the Peer ASN, not the Peer IP address!
Why can’t ASN be logged in adjacency changes on routers?
– This seems to drive IP-based lookups
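The “AS” prefix difference can be absorbed once in a small helper. This is a minimal sketch, not part of the original talk: the function name `asn_query` is hypothetical, and it only builds the query string without contacting any whois server.

```python
import re


def asn_query(asn):
    """Return an AS number in the 'AS<number>' form (hypothetical helper).

    Accepts 3856, "3856", "as3856" or "AS3856"; rejects anything else,
    including IP addresses, so an IP-based lookup fails early.
    """
    text = str(asn).strip().upper()
    match = re.fullmatch(r"(?:AS)?(\d+)", text)
    if match is None:
        raise ValueError(f"not an AS number: {asn!r}")
    return f"AS{int(match.group(1))}"
```

Rejecting dotted input is a deliberate choice here: it keeps the lookup keyed on the peer ASN, not the peer IP address.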
Drive the Data Sources Properly!
Example: using WHOIS queries
“Oh, I have an outage on WAIX, I’ll look up the IP address”
$ whois -h whois.arin.net 198.32.212.11|less
…
OrgName: Exchange Point Blocks
…
RTechHandle: WM110-ARIN
RTechName: Manning, Bill
RTechPhone: +1-310-322-8102
RTechEmail: [email protected]
Bad Data Enters the System
“Okay, I’ll phone Bill Manning”
– But all Bill did was give WAIX some v4 space
– Bill doesn’t run WAIX, and isn’t an operational contact for WAIX
So, Bill either ignores your voicemail, or tells you to call someone else
Whatever – it’s added delay, increased frustration – it’s how not to do it
Driving Whois Properly
Always look up the PEER ASN
– Not the IP address!
– It’s a BGP problem, we use ASNs in BGP
$ whois -h whois.ra.net AS3856|less
aut-num: AS3856
as-name: UNSPECIFIED
descr: Packet Clearing House
www.pch.net
admin-c: Bill Woodcock
tech-c: Bill Woodcock
remarks: [email protected], +1 866 BGP PEER
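The same ASN lookup can be scripted. The sketch below is an illustration under assumptions, not the presenter's tooling: `whois_query` and `parse_rpsl` are hypothetical names, the query follows the plain WHOIS exchange (one query line over TCP port 43, then read until the server closes), and the parser assumes a single object in “attribute: value” form with continuation lines that start with whitespace or “+”. Real servers differ in the details of their output.

```python
import socket


def whois_query(server, query, port=43, timeout=10.0):
    """Send one WHOIS query line and return the reply text (needs network access)."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        conn.sendall(query.encode("ascii") + b"\r\n")
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closes the connection when done
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_rpsl(text):
    """Collect 'attribute: value' lines of one object into a dict of lists."""
    attrs = {}
    last = None
    for line in text.splitlines():
        # Skip blank lines and server comments.
        if not line.strip() or line.startswith(("%", "#")):
            continue
        # A continuation line extends the most recent attribute value.
        if line[0] in " \t+" and last is not None:
            attrs[last][-1] += " " + line[1:].strip()
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue
        last = key.strip().lower()
        attrs.setdefault(last, []).append(value.strip())
    return attrs
```

With network access, `parse_rpsl(whois_query("whois.ra.net", "AS3856"))` would give a dict in which the `tech-c` and `remarks` entries carry the contacts worth reading first.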
So you’ve found the contact
How do they respond to you?
– Confusing recursive call trees?
– Recalcitrant ticketing systems?
– First-line NOC – “Is it switched on?”
– “You’re not a customer, go away”
Once negotiated, peering is an engineering relationship
– So backbone ops, not “customer care”
Expectations of Peer Contacts
Choose your points of contact carefully
Big problems with:
– What’s peering/BGP/WAIX?
– Are you a customer?
– What’s your circuit ID?
– Go away, you aren’t a customer
All serious no-nos – be nice to your peers!
PCH INOC-DBA Phones
PCH operate a “dial by ASN” NOC hotline system
– They run the SIP registry/proxy
– “Bring your own” SIP-compliant phone
The idea is that it should get through to someone clueful
– No call-trees, no music-on-hold
http://www.pch.net/inoc-dba/
Suggested Role Contacts
Peering@
– For setting up new peerings, changing existing ones, no 24x7 expectation
– Shouldn’t go exclusively to sales@ ;-)
NOC@
– Reaches your 24x7 NOC, which is either BGP friendly and has enable, or knows when, how and where to escalate
Support@
– Is generally your “customer-care”/call center
Getting the message across
Okay, so you’ve made contact
– Now, make your point
Provide the peer with useful information
– Start with the subject line
– Be informative: who, when, what
– Messages like “Help” and “Peering down” aren’t helpful
How not to do it…
Where? How does it affect me?
– All detail buried in wordy message body
When? No TZ stamp!
Help me handle my huge NOC inbox!
-----Original Message-----
From: Joe Schmoe <[email protected]>
Sent: Wednesday, January 25, 2006 5:41 PM
Subject: Maintenance Notification
Dear Peers,
…
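One way to avoid the missing TZ stamp is to refuse to format a time that carries no offset. A minimal sketch, assuming a hypothetical helper `stamp`; the UTC-5 offset in the usage note is an illustrative assumption, since the message above does not say which zone it was sent from.

```python
from datetime import datetime, timedelta, timezone


def stamp(dt):
    """Format a time with its UTC offset and the UTC equivalent (hypothetical helper)."""
    if dt.tzinfo is None or dt.utcoffset() is None:
        raise ValueError("refusing to format a time without a timezone")
    local = dt.strftime("%Y-%m-%d %H:%M %z")
    utc = dt.astimezone(timezone.utc).strftime("%H:%M UTC")
    return f"{local} ({utc})"
```

For example, with `tz = timezone(timedelta(hours=-5))`, `stamp(datetime(2006, 1, 25, 17, 41, tzinfo=tz))` gives `2006-01-25 17:41 -0500 (22:41 UTC)`, which a reader in any NOC can place without guessing.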
Example: Useful Subject Headers
AS7132’s preferred subject line format:
<IX location> - <peer writing to/ASN> - <peer writing from/ASN> - <what is the issue> - <date of initial correspondence> - <time of initial message>
Example subject line:
Equinix-Ashburn - RCN/6079 - SBC/7132 - new session turn-up - 29-Mar-06 - 9:45 am EST
Thanks to Ren Provo
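The subject format above is regular enough to generate. A minimal sketch, assuming a hypothetical function `peering_subject`; the field order and separators follow the AS7132 format on this slide, and the date and time strings are passed through as given.

```python
def peering_subject(ix_location, to_name, to_asn, from_name, from_asn,
                    issue, date, time):
    """Build a subject line in the slide's field order (hypothetical helper)."""
    fields = [
        ix_location,
        f"{to_name}/{to_asn}",      # peer being written to
        f"{from_name}/{from_asn}",  # peer writing
        issue,
        date,
        time,                       # should include the timezone
    ]
    # An empty field defeats the point of a structured subject line.
    if any(not str(field).strip() for field in fields):
        raise ValueError("every subject field must be filled in")
    return " - ".join(str(field).strip() for field in fields)
```

Called with the values from the slide's example, `peering_subject("Equinix-Ashburn", "RCN", 6079, "SBC", 7132, "new session turn-up", "29-Mar-06", "9:45 am EST")` reproduces that subject line.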
Look clueful
What does this say about your peer?
– Don’t you think they look silly?
Run tools to help you answer these questions yourself
– Netflow, MAC accounting, etc.
Subject: Traffic Drop
Dear Peer,
We suddenly noticed a 300Mb drop in traffic on our connection to the PIE-IX. Can you investigate, and help us find where the traffic has gone?
Regards,
…
How to escalate
Check your equipment first
Ask your peer - “What’s up?”
– Often you can resolve a problem bi-laterally
Go to the IX only if you need to
– Not all IX operators can provide a 24x7 contact
When to escalate a customer fault
– Don’t stonewall customer reports
– Don’t point them to the IX operator
– Co-ordinate directly with your peers
How the IXP Op can help
Provide an up-to-date list of IX participants and their NOC/Peering contact information
– Usually password protected
Help break comms deadlock
– Help fix “dead ends”
Otherwise, they can only help with “physical” problems
– “link down”, packet loss, broken cables, packet corruption to all destinations connected to the IXP
In Summary
Keep your own information up to date
– Whois db objects, third party dbs
Make sure your peering and NOC contacts are appropriate
– No-one likes call-trees and holding
Find the right contacts at your peers
Be nice to your peers!
Thanks