ServiceNow Integration with PagerDuty
(On-Call Schedules and Escalations)
July 23, 2013
Agenda
• On-Call before PagerDuty
• About PagerDuty
• ServiceNow and PagerDuty
• Important Information
• Important Dates
• Live Demonstration
• Q&A
On-Call Today – ITS
• Phones and pagers are physically transferred between people
• Email lists and aliases have to be modified to redirect to the correct person
• Phone trees have to be maintained and distributed
• Calendars need to be maintained and made available
• Many contact methods and escalations are manual
On-Call Today – ITS
• Existing ITS PagerDuty Instances (Schluntz and Kluba)
  – Current ServiceNow integration (Schluntz) will run in parallel for a short period of time to ensure the new Enterprise instance works as expected (duplicate user profiles, notifications, and acknowledgements).
  – Automated PagerDuty escalations (Schluntz and Kluba) will be integrated into the new Enterprise instance within the coming weeks.
About PagerDuty
• Vendor-hosted On-Call escalation and notification service
• Supports multiple…
  – alert sources / types
  – escalation policies
  – on-call schedules
  – contact methods
    • SMS / TXT
    • Email
    • Phone Call
    • Push (iOS & Android)
About PagerDuty
• Hosted in Amazon Web Services
• Replicated to two data centers with fast failover
• Engineer on call 24x7 to ensure service continuity
About PagerDuty
PagerDuty officially supports the following web browsers:
• Desktop Browsers:
  – Google Chrome newest versions
  – Internet Explorer v. 8 and higher
  – Firefox v. 10 and higher
  – Safari all versions
About PagerDuty
PagerDuty officially supports the following web browsers:
• Mobile Browsers:
  – iOS v 6.0 and higher
  – Safari all versions
  – Chrome all versions
  – Android v 2.3 and higher
About PagerDuty
• PagerDuty Components
  – Users – who & method of contact
  – On-Call Schedules – when to contact
  – Escalation Policies – which schedule to use
  – Services – trigger integration
    • Email
    • Generic API (web POST)
    • Prebuilt API tools for some platforms like Nagios
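To make the "Generic API (web POST)" trigger option concrete, here is a minimal Python sketch of how a monitoring script might build such a POST. It is an illustration only, not the UCSF integration: the endpoint URL and the field names (service_key, event_type, description, incident_key) are assumptions based on PagerDuty's generic events API and should be checked against current PagerDuty documentation, and the service key and incident text are hypothetical placeholders. The request is built but deliberately not sent.

```python
import json
import urllib.request

# Assumption: endpoint of PagerDuty's generic events API.
# Verify against current PagerDuty documentation before use.
EVENTS_URL = "https://events.pagerduty.com/generic/2010-04-15/create_event.json"


def build_trigger_request(service_key, description, incident_key=None, details=None):
    """Build (but do not send) a web POST that would open a PagerDuty incident."""
    payload = {
        "service_key": service_key,   # identifies the PagerDuty Service
        "event_type": "trigger",      # open a new incident
        "description": description,   # text shown in the alert
    }
    if incident_key is not None:
        payload["incident_key"] = incident_key  # lets later events refer to the same incident
    if details:
        payload["details"] = details
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        EVENTS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    req = build_trigger_request(
        "HYPOTHETICAL_SERVICE_KEY",
        "INC0812345 Example short description",
        incident_key="INC0812345",
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending would be: urllib.request.urlopen(req)  (intentionally not executed here)
```

Building the request separately from sending it keeps the payload easy to inspect and test without contacting the service.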
ServiceNow / PagerDuty Workflow
• P1/P2 Incident Assigned to Group (no assignee)
• PagerDuty Incident Opened
• Escalation Policy Activated
• On-Call Schedule Check to Assign User
• Contact User
  – If user fails to respond, escalate
• User Acknowledge or Reassign via PagerDuty
  – Or, manually update ServiceNow Incident
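The escalation steps in the workflow above can be sketched in Python as follows. This is a simplified model under stated assumptions, not PagerDuty's actual implementation: an escalation policy is represented as an ordered list of on-call schedules, a schedule as a list of hour-based shifts, and the names Shift, on_call_user, escalate, and the sample users are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Shift:
    start_hour: int  # inclusive, 0-23
    end_hour: int    # exclusive, 1-24
    user: str


def on_call_user(schedule: List[Shift], hour: int) -> Optional[str]:
    """On-call schedule check: who is on call at this hour, if anyone?"""
    for shift in schedule:
        if shift.start_hour <= hour < shift.end_hour:
            return shift.user
    return None


def escalate(
    policy: List[List[Shift]],
    hour: int,
    acknowledges: Callable[[str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Walk the policy's schedules in order; stop at the first user who acknowledges."""
    contacted: List[str] = []
    for schedule in policy:
        user = on_call_user(schedule, hour)
        if user is None:
            continue  # nobody on call at this level; move to the next one
        contacted.append(user)   # "Contact User"
        if acknowledges(user):   # "User Acknowledge"
            return user, contacted
        # no response: fall through and escalate to the next level
    return None, contacted


# Hypothetical schedules for illustration.
primary = [Shift(0, 12, "alice"), Shift(12, 24, "bob")]
secondary = [Shift(0, 24, "carol")]

owner, contacted = escalate([primary, secondary], hour=14,
                            acknowledges=lambda user: user == "carol")
print(owner, contacted)  # carol ['bob', 'carol']
```

In this run bob is on call at 14:00 but does not respond, so the incident escalates to carol, who acknowledges it.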
ServiceNow / PagerDuty Workflow
• Integration Latency “Known-Issue”
• When a ServiceNow Incident first meets the qualification to trigger a PagerDuty escalation policy, there is a one-time 10-second delay when saving the Incident.
• When a ServiceNow Incident no longer meets the qualification for a PagerDuty escalation policy (Priority changed to Low, an Assigned to person added, etc.), there is a one-time 25-second delay when saving the Incident.
  – This latency is actively under investigation.
• Less than 1% of ServiceNow Incidents will activate PagerDuty (based on Critical/High Priority).
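The two saves that incur a one-time delay correspond to an Incident entering or leaving the qualifying state. A minimal Python sketch of that check follows, assuming the qualification is exactly "P1/P2, assigned to a group, no assignee" as shown in the workflow; the real rule set is the comprehensive list in the ServiceNow Integration documentation. The names Incident, qualifies, and action_on_save, and the action labels, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: priorities 1 and 2 stand for P1/P2 (Critical/High).
PAGING_PRIORITIES = {1, 2}


@dataclass
class Incident:
    priority: int
    assignment_group: Optional[str]
    assigned_to: Optional[str] = None


def qualifies(incident: Incident) -> bool:
    """True if the Incident should be escalated through PagerDuty."""
    return (
        incident.priority in PAGING_PRIORITIES
        and incident.assignment_group is not None
        and incident.assigned_to is None
    )


def action_on_save(before: Optional[Incident], after: Incident) -> str:
    """Compare the Incident before and after a save and name the resulting action."""
    was_qualified = before is not None and qualifies(before)
    is_qualified = qualifies(after)
    if is_qualified and not was_qualified:
        return "start_escalation"  # the save with the one-time 10-second delay
    if was_qualified and not is_qualified:
        return "stop_escalation"   # the save with the one-time 25-second delay
    return "none"                  # no PagerDuty interaction on this save


# Hypothetical example: a new P1 Incident assigned to a group, then picked up by a person.
new_incident = Incident(priority=1, assignment_group="Network")
picked_up = Incident(priority=1, assignment_group="Network", assigned_to="jdoe")
print(action_on_save(None, new_incident))       # start_escalation
print(action_on_save(new_incident, picked_up))  # stop_escalation
```

Saves that do not change whether the Incident qualifies return "none", which matches the slide's point that the delay is one-time rather than on every save.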
Contacting You – SMS / Text
Contacting You – Phone Call
“You have one triggered incident on [Service Name]. The failure is INC0812345 [short description]. Press 4 to acknowledge, press 6 to resolve, press 0 for help or press star to repeat this message.”
Mobile Device (Browser or App)
• PagerDuty is mobile enabled
• You can access the site and incidents from your phone (browser or app)
Contacting You – Email
PagerDuty Documentation
• http://tiny.ucsf.edu/pagerduty
  – Quick Reference Cards (QRC)
    • User Profile; Dashboard; Exporting On-Call Schedules to Your Calendar Application; Responding to an Alert; Maintaining On-Call Schedules (managers & team-leads only)
  – ServiceNow Integration
    • Comprehensive list of ServiceNow Incidents that will trigger PagerDuty
  – General Presentation
    • Enterprise PagerDuty Overview (this presentation)
    • Video Training (On-Call staff & managers/leads)
Important Information
• Ensure all On-Call staff have set up their User Profiles before Go-Live.
• Every On-Call group must identify a Subject Matter Expert (SME) to train new users and answer questions.
Important Information
• It is extremely important that On-Call staff do NOT adjust Services, Escalation Policies, or On-Call Schedules.
• Only managers or team leads should adjust On-Call Schedules. If you can’t find a user’s name, they haven’t been set up in PagerDuty yet. Submit a ServiceNow Incident request assigned to “Service Now Admin” to add them to PagerDuty.
Important Dates
• Go-Live: Monday, July 29, 8:00 AM
• Training schedule 7/23 – 7/26
  – On-Call Staff (4 sessions)
  – Managers & Team-Leads (3 sessions)
• Drop-In Q&A 7/29 – 7/30
  – Mon 7/29, Conf 316A, 10:00-noon & 1-4pm
  – Tue 7/30, Conf 316A, 9-11am
PagerDuty Walkthrough / Demo
• Login
  – URL: https://ucsf.pagerduty.com
  – Credentials: email address and temp password (plan for future MyAccess authentication)
• Turn off the "Welcome to PagerDuty" banner.
• Set up and maintain your Profile: Contact Methods, Notification Rules, Change Password. (QRC)
PagerDuty Walkthrough / Demo
• How to export On-Call schedules to my email/calendar tool via calendar feed (QRC)
• PagerDuty Dashboard (QRC)
• How to set up and maintain On-Call Schedules [MANAGERS] (QRC)
• ServiceNow ticket triggers PagerDuty (QRC)
• Responding to an Alert (QRC)
Questions?