Bullet Proof Solutions: Resiliency for Avaya Aura UC · 12.04.2016
BULLET PROOF SOLUTIONS:
RESILIENCY FOR AVAYA AURA® UC
Lisa Marinelli Senior Product Manager, Team Engagement Solutions
DRAFT as of March 21, 2016
© 2016 Avaya Inc. All rights reserved
BULLET PROOF SOLUTIONS:
RESILIENCY AND SECURITY FOR AVAYA AURA® UC
Introduction
Reliability Basics
The Avaya Aura® Core Reliability
Avaya Aura® CM and the Media Server
The Avaya Aura® Branch Survivability
Avaya Aura® Survivability Enhanced with VMware
Use Cases
Avaya Aura® Rock Solid Security
INTRODUCTION
Avaya Aura® provides ongoing business continuity in situations
that would otherwise be catastrophic.
– Core and Branch protection against server failures and network outages
– Maximizes enterprise productivity under a wide variety of failure and disaster scenarios
– Coordinated architectures monitor resource exhaustion, network failures, component failures, and system failures in real time
– Self-monitoring software and firmware
– Avaya Aura® Communication Manager, Session Manager, Media Server, endpoints,
gateways, and applications coordinate to recover quickly and in many cases seamlessly
– Duplication of critical hardware components
– Redundancy of key resources
– Administrable failover and fall-back policies.
POST FAILURE LEVEL OF SERVICE
Failure happens before call (increasing completeness order)
1. No new calls can be made
2. New calls can be made, but calls cannot be received
3. New calls can be made and received
4. Complete feature state preserved (e.g. call forwarding state)
Failure happens during call (increasing completeness order)
1. No Preservation: call is torn down and the client may not even indicate it
2. Connection Preservation: media path is preserved (can still talk/see each other)
3. Call/Feature Preservation: clients can continue to communicate
4. Transaction Preservation: ringing calls can be answered and monitoring services informed
Affected further by:
– Time lapse from failure to particular level (instantly is best)
– Amount of manual intervention required—also affects time lapse (none is best)
ROCK SOLID RELIABILITY
Survivable Core – Seamless Failover, Geo Redundancy
Inter-gateway Routing – PSTN Wrap-Around
Active-Active – SIP Call Preservation
Survivable Branch – Call/Trunk Preservation
System Management – Geo Redundancy
VMware Leverage – VM High Availability
SM RELIABILITY DESIGN - SERIES
Markov Series System Example: SM
$A = A_{HW} \cdot A_{SW} \approx 99.96\%$, or 3+ “9’s”
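The series relationship can be checked numerically. The following is a minimal sketch, assuming hypothetical hardware and software availabilities of 99.98% each; the slide gives only the product (about 99.96%), so both inputs and all function names are illustrative.

```python
import math

def series_availability(*components: float) -> float:
    """Availability of components in series: all must be up, so multiply."""
    result = 1.0
    for a in components:
        result *= a
    return result

def nines(availability: float) -> float:
    """Number of 'nines', e.g. 0.999 gives about 3."""
    return -math.log10(1.0 - availability)

def downtime_minutes_per_year(availability: float) -> float:
    """Expected unavailable minutes in a 365-day year."""
    return (1.0 - availability) * 365 * 24 * 60

# Hypothetical inputs, chosen so the product is close to the slide's ~99.96%.
A_HW = 0.9998
A_SW = 0.9998
A_SM = series_availability(A_HW, A_SW)
```

With these assumed inputs the result lands between three and four nines, which matches the slide's "3+" wording.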
GOOD, TRADITIONAL RELIABILITY: “1+1”
Markov 1+1 Availability
$A_{SYS} = 1 - [(1 - A)(1 - A)]$, or 4-5 “9’s”
BETTER, COST EFFECTIVE RELIABILITY: “N+1”
Markov N+1 Availability
$A_{SYS} = 1 - \binom{N+1}{N}(1 - A)^{N+1}$, or 4-5 “9’s”
BEST RELIABILITY: “N+M”
Markov N+M Availability
$A_{SYS} = 1 - \binom{N+M}{N}(1 - A)^{N+M}$, or > 6 “9’s”
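To compare the 1+1, N+1, and N+M designs side by side, the textbook k-out-of-n availability can be computed directly. This is a sketch under the assumption of identical, independent servers with perfect failover. That is not necessarily the Markov model behind the slide figures, so the results are idealized. The per-server availability of 0.9996 comes from the slide's series example; the N and M values are hypothetical.

```python
from math import comb

def k_of_n_availability(a: float, n_required: int, n_total: int) -> float:
    """P(at least n_required of n_total independent servers are up)."""
    return sum(
        comb(n_total, up) * a**up * (1.0 - a) ** (n_total - up)
        for up in range(n_required, n_total + 1)
    )

A = 0.9996  # per-server availability from the series example

one_plus_one = k_of_n_availability(A, 1, 2)  # equals 1 - (1 - A)**2
n_plus_one = k_of_n_availability(A, 4, 5)    # hypothetical N=4, M=1
n_plus_m = k_of_n_availability(A, 4, 6)      # hypothetical N=4, M=2
```

Under these assumptions each added spare shrinks the unavailability by roughly another factor of (1 - A), which is why N+M can reach more nines than N+1 for the same N.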
AVAYA AURA® ARCHITECTURE
[Figure: architecture diagram. Labels: System Manager; six SM instances; CM instances; Media Services; EDP; PS; AAM; SBC with SIP trunks to the PSTN; SIP hard and soft endpoints; third-party SIP phones; CS 1000; IP Office; 3rd party equipment. Elements are interconnected over SIP.]
AVAYA AURA® SESSION MANAGER AVAILABILITY
Active-active redundancy
– Geo-redundant SIP Routing Element (SRE) allows servers to be geographically separated with no distance restrictions
– Server hardware failure or network impairment results in alternate Session Manager(s) providing equivalent SRE routing functionality for SIP entities including feature servers, application servers, and endpoints
– Service provider SIP trunks can be configured as a shared resource between Session Managers, so there is no single point of failure
Server/network failures transparent to users
– No difference in behavior or routing
– Connection preservation for stable calls
SESSION MANAGER DATA GRID
[Figure: SMGR with the master DB and a saved copy of the DB; three SMs, each with an SM DB subset; a software JMS/JGroup “bus” connecting them.]
IMPROVED RELIABILITY – N+M CONFIGURATION
[Figure: SM1, SM2, ... SMN+M serving SIP entities (SBC, VP/IVR, AACC); SIP trunks connect the SBC to the PSTN.]
IMPROVED RELIABILITY – FAILURE EX #1: SINGLE FAILURE
[Figure: the N+M configuration (SM1, SM2, ... SMN+M; SIP entities SBC, VP/IVR, AACC; SIP trunks to the PSTN) with a single failure, annotated with steps 1 to 3.]
IMPROVED RELIABILITY – FAILURE EXAMPLE #3: ALTERNATE ROUTING
[Figure: call flow between the PSTN, the SBC (SIP trunks), two SMs, and CM/AACC. Labelled steps: 1 INVITE, 2 INVITE, 3 Timer B expires, 4 INVITE, 5 INVITE, followed by the SIP session.]
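The alternate-routing flow can be sketched as a loop over candidate Session Managers. This is an illustrative simulation, not Avaya code: `send_invite`, the SM names, and the reachability table are hypothetical. Timer B is the INVITE transaction timeout defined in RFC 3261 (64 × T1, which is 32 seconds with the default T1 of 500 ms).

```python
from typing import Callable, Optional, Sequence

T1_SECONDS = 0.5
TIMER_B_SECONDS = 64 * T1_SECONDS  # RFC 3261 INVITE transaction timeout

def route_invite(
    targets: Sequence[str],
    send_invite: Callable[[str, float], bool],
) -> Optional[str]:
    """Try each SM in order; move on when the INVITE times out (Timer B)."""
    for sm in targets:
        if send_invite(sm, TIMER_B_SECONDS):
            return sm  # this SM answered; the SIP session proceeds through it
    return None  # every candidate timed out

# Hypothetical stand-in for the network: SM-1 is down, SM-2 is up.
reachable = {"SM-1": False, "SM-2": True}

def fake_send_invite(sm: str, timeout: float) -> bool:
    """Pretend to send an INVITE; True means a response arrived in time."""
    return reachable[sm]

chosen = route_invite(["SM-1", "SM-2"], fake_send_invite)
```

In this simulation the first INVITE goes unanswered, and the sender retries through the second SM, which is the behaviour the numbered steps in the figure describe.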
IMPROVED RELIABILITY – FAILURE EXAMPLE #4: FEATURE SERVER
[Figure: call flow between SIP entity VP/IVR, two SMs, and three CMs. Labelled steps: 1 INVITE, 2 INVITE, 3 Timer B expires, 4 INVITE.]
IMPROVED RELIABILITY FOR ENDPOINTS/CLIENTS
Avaya Aura® Allows Multiple Concurrent Device Registrations
H.323 Phone: Alternate Gatekeeper List - Registrar #1 - Registrar #2 - Registrar #3
SIP Phone: Simultaneous Server List - Registrar #1 - Registrar #2 - Registrar #3
NOTE: H.323 Registrars are CLANs, PROCR, LSPs, …
NOTE: SIP Registrars are Session Managers, Branch SM, …
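A minimal sketch of the SIP case, where the phone holds registrations with several registrars at once and uses the highest-priority one that is still reachable. The `Endpoint` class and its methods are hypothetical illustrations, not an Avaya API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

@dataclass
class Endpoint:
    registrars: List[str]                       # ordered by preference
    registered: Set[str] = field(default_factory=set)

    def register_all(self, up: Set[str]) -> None:
        """Register with every listed registrar that is currently up."""
        self.registered = {r for r in self.registrars if r in up}

    def active_registrar(self, up: Set[str]) -> Optional[str]:
        """Highest-priority registrar that is both registered and still up."""
        for r in self.registrars:
            if r in self.registered and r in up:
                return r
        return None

all_up = {"Registrar #1", "Registrar #2", "Registrar #3"}
phone = Endpoint(["Registrar #1", "Registrar #2", "Registrar #3"])
phone.register_all(all_up)
before = phone.active_registrar(all_up)
after = phone.active_registrar(all_up - {"Registrar #1"})  # Registrar #1 fails
```

Because the registrations already exist, losing the first registrar needs no new registration in this model; the phone simply continues with the next one on its list.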
AVAYA AURA® COMMUNICATION MANAGER SERVER
RELIABILITY OVERVIEW
Communication Manager has physically separate active and standby servers providing full redundancy in the event of catastrophic failure
Memory is synchronized between active and standby servers over a duplication link (standard Ethernet link)
– No specialized hardware required
State of health is constantly monitored between both servers
– Arbiter software process controls interchanges
– Immediate interchange to standby server if active server hardware fails
Server interchange is transparent to users on stable calls.
Communication Manager provides
– Server software duplication for reliability
– Geo-redundant survivability
Protects against both server failures and network failures
Communication Manager endpoints, media servers, gateways, servers, and applications work together to automatically and quickly recover full service
Enterprise can configure its system design and parameters to meet specific availability requirements
CM REDUNDANCY DESIGN
Levels of redundancy
1 CM Software Duplicated FS/ES CMs
2 Duplicated Geo-Redundant FS/ES SCs
3 Multiple Geo-Redundant SCs - Simplex
4 Multiple SMs – Geo-Redundant CM Connections
5 Redundant Connections to Phones, Branches
6 Multiple Geo-Redundant Media Servers and Gateways
7 Geo-Redundant PSTN Connections
8 Branch Survivability
[Figure: Data Center 1 (CM Main, SM), Data Center 2 (CM SC), Data Center 3-N (CM SC), SMs, a Remote Branch (CM SR), and the PSTN, with callouts 1 to 8 marking the levels listed above.]
AVAYA AURA®
COMMUNICATION MANAGER SURVIVABILITY
Main server pair failure or inability to reach main server pair results in survivable server(s) taking over all or part of enterprise network
Survivable server can provide core or branch redundant backup
– Survivable Core Server (SCS fka ESS) and
– Survivable Remote Server (SRS fka LSP)
Administered data is automatically synchronized between main and all survivable servers
Media Servers, port networks, media gateways, endpoints, & applications respond quickly to recover and restore service
Options to meet the needs of enterprise networks with different configurations and availability requirements
AVAYA AURA® SURVIVABILITY
SURVIVABLE CORE SERVER (SCS)
Can recover all or any subset of a Main’s PNs, MGs, Media Servers, endpoints, and applications.
SCS servers can be duplicated and can be distributed across different locations
Flexible failover strategy
– PNs, MGs, Media Servers and endpoints have an administered list of survivable servers.
Shuffled calls preserved
– voice/video connections maintained
Full recovery in less than 10 minutes with 48k busy-hour (BH) calls
Fall-back is administrable and can be manual, scheduled, or auto
CM SERVER INTERCHANGE – WHAT IS IT AND WHAT CAUSES IT?
An Interchange is the controlled process by which the Active and Standby servers swap states.
Spontaneous Interchanges
– Hang/Crash of Active Server
– Standby server is “healthier”
– Hardware problems (bad UPS, overheating CPU, disk errors, fan failures, corrupted memory, NIC failure, etc.)
– Network problems (network sanity failures in links to media gateways / port networks).
– Takes 6-12 seconds
Scheduled Interchanges (via the “server -i” command or from the Maintenance Web pages)
– Less disruptive
– Near instantaneous and transparent: 3-4 seconds
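The interchange decision can be pictured as a comparison of state-of-health values. The sketch below is an illustration built on assumptions: the numeric health score, the `ServerState` type, and the decision rule are hypothetical and do not describe the actual Arbiter process.

```python
from dataclasses import dataclass

@dataclass
class ServerState:
    name: str
    alive: bool
    health: int  # hypothetical score; higher means healthier

def should_interchange(active: ServerState, standby: ServerState,
                       requested: bool = False) -> bool:
    """Decide whether the standby should take over the active role."""
    if not standby.alive:
        return False   # nothing to interchange to
    if not active.alive:
        return True    # hang or crash of the active server
    if requested:
        return True    # scheduled interchange
    return standby.health > active.health  # standby is "healthier"

active = ServerState("cm-a", alive=True, health=70)
standby = ServerState("cm-b", alive=True, health=90)
decision = should_interchange(active, standby)
```

The ordering of the checks reflects the two families on the slide: spontaneous interchanges triggered by failure or relative health, and scheduled interchanges requested by an administrator.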
HOW DOES THE AVAYA AURA® MEDIA SERVER
AFFECT SURVIVABILITY?
Before support of the AAMS, CM survivability depended on
Port Networks and/or Media Gateways to trigger the CM
failover and recovery
In an AAMS environment, there might not be any PN or MG
So CM relies on status information relayed by AAMS as an
additional trigger for failover and recovery
MEDIA SERVER HIGH AVAILABILITY DESIGN
AAMS high availability clusters are deployed in 1+1 server pairs in an Active/Hot Standby configuration
Each media server in a 1+1 cluster has a unique IP for management and intra-cluster communication
A floating IP address is used on the active server and is acquired by the standby during a failure
Servers monitor each other in real time using a rapid heartbeat mechanism
Interruptions in the heartbeat (network, hardware failures, power failures), events due to process crashes, or management operational state changes trigger a failover
Active and standby servers are equivalent in functionality, and can assume either state (active or standby) indefinitely.
If a previously failed server returns to service, it protects the active automatically (no fail back required)
Management interfaces allow administrators to “force” failover for maintenance reasons (upgrades, patches, etc.)
[Figure: 1+1 Avaya Aura Media Server cluster. Labels: Communication Manager, SIP/MSML, SIP, (S)RTP, AMS (Active), AMS (Hot Standby), Check Pointing, Health Checks.]
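The heartbeat-driven failover and the floating IP hand-over can be sketched as a small state machine. This is a simplified model, not AAMS code: the `HaPair` class, the one-second timeout, and the node names are hypothetical, and the address comes from the documentation range.

```python
from dataclasses import dataclass

HEARTBEAT_TIMEOUT_S = 1.0  # hypothetical; the slides only say "rapid"

@dataclass
class Node:
    name: str
    role: str  # "active" or "standby"

class HaPair:
    def __init__(self, active: Node, standby: Node, floating_ip: str) -> None:
        self.active, self.standby = active, standby
        self.floating_ip = floating_ip
        self.floating_ip_owner = active.name
        self.last_heartbeat_from_active = 0.0

    def heartbeat(self, now: float) -> None:
        """Record a heartbeat received from the active server."""
        self.last_heartbeat_from_active = now

    def check(self, now: float) -> bool:
        """Fail over if the active's heartbeat is overdue; True on failover."""
        if now - self.last_heartbeat_from_active <= HEARTBEAT_TIMEOUT_S:
            return False
        self.active, self.standby = self.standby, self.active
        self.active.role, self.standby.role = "active", "standby"
        self.floating_ip_owner = self.active.name  # standby acquires the IP
        self.last_heartbeat_from_active = now
        return True

pair = HaPair(Node("AMS-1", "active"), Node("AMS-2", "standby"), "192.0.2.10")
pair.heartbeat(0.0)
on_time = pair.check(0.5)   # heartbeat still fresh
failed_over = pair.check(2.0)  # heartbeat overdue, roles swap
```

After the swap the former active is simply the new standby, which mirrors the "no fail back required" point: the model never forces the roles back.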
AAMS RELIABILITY - CLUSTERING
An AAMS cluster is two or more servers working together to provide scalability, redundancy and high availability
Two cluster types are defined:
– Load Sharing
  – All AAMS servers in the cluster are active and handling traffic (N+1)
  – Clusters are typically engineered with a floating spare server to maintain capacity if one server goes down
  – Cluster limit at 8 servers
  – Sessions may be lost if an active server goes down (unless the controlling application moves the sessions)
– High Availability
  – AAMS servers are deployed in pairs (1+1)
  – One server in the pair is active, the other in standby
  – Each server can run indefinitely as active or standby
  – Automatic recovery of SIP signaling state and media occurs after a failover
  – Sessions are preserved if an active server goes down
MAIN CM FAILURE WITH AAMS
[Figure: Main CM, Survivable Core, and AAMS1 through AAMS4.]
During normal operation, the Main CM and the Survivable Processor are both communicating status to the AAMS
If the Main CM dies, the AAMS will tell the Survivable Processor that the Main CM is no longer reporting status
This will cause the Survivable Processor to become active
Similarly to recovery with Port Networks and Media Gateways, control is not returned to the Main CM until the recovery rule is met (manual, automatic or scheduled.)
NETWORK FAILURE WITH AAMS
If the network fails, AAMS3 and AAMS4 will tell the Survivable Processor that the Main CM is no longer reporting status
This will cause the Survivable Processor to become active to serve the part of the network that it can reach
Meanwhile, the Main CM continues to serve the part of the network that it can reach
The Split Registration Prevention feature, if enabled, prevents phones that registered with the Survivable Processor from registering back to the Main CM until the recovery is complete
[Figure: Main CM, Survivable Core, and AAMS1 through AAMS4.]
AVAYA AURA® BRANCH SURVIVABILITY
Unparalleled in the industry, allowing for full feature and translation duplication at branches of any size
Fully software-only model for SIP-only branches
Supports SIP, H.323, legacy trunking, etc.
Supports full contact center functionality for agents in the branch
Provided by a Communication Manager Survivable Remote Server and a Branch Session Manager
AVAYA AURA® SURVIVABILITY
COMMUNICATION MANAGER SURVIVABLE
REMOTE SERVER (SRS)
Can recover Media Servers, MGs, and endpoints, but not PNs.
Provides branch survivability
Runs on a media module in a GW or on a standalone simplex server.
Flexible failover strategy
– MGs and endpoints have an administered list of survivable servers.
Fall-back is administrable and can be manual, scheduled, or auto.
FULLY REDUNDANT BRANCH SURVIVABILITY
Unprecedented complete feature functionality at the Branch
[Figure: three SMs and PSTN connections, linked over SIP and SIP-ISC; paths are labelled Primary Service and Backup Service.]
DIAL PLAN TRANSPARENCY – USERS DIAL THE SAME NUMBERS IN SUNNY AND RAINY DAY
Original CM-based version for H.323 endpoints
Simple DPT Scenario
If G450 registers with LSP, calls from G450 are routed over Public Voice Network (TG3→TG1 or TG3→TG2)
[Figure: labels include CM server, G650 Port Networks, G450, two G430s, LSP – S8300, Network Regions 1 to 3, Trk Grp 1 to 3, Enterprise Data Network, Public Voice Network, and WAN OUTAGE.]
DIAL PLAN TRANSPARENCY
Updated version for SIP endpoints
Simple DPT Scenario
If G450 registers with LSP, calls from G450 are routed over Public Voice Network (TG3→TG1 or TG3→TG2)
[Figure: labels include Avaya Aura core (CM / SM), BSM, G650 Port Networks, two G430s, LSP – S8300, Network Regions 1 to 3, Trk Grp 1 to 3, Enterprise Data Network, Public Voice Network, and WAN OUTAGE.]
DELIVERING RELIABILITY AND RESILIENCY
Avaya Aura application availability
Communication Manager – Software Duplication for failover; Survivable Core and Survivable Remote geo-redundancy
Session Manager – Active-active clustering, N+M routing
System Manager – Geo-redundancy
VMware: Avaya Aura VE is tested with VMware availability features
– vMotion
– Storage vMotion
– VMware High Availability
– VMware Snapshot
Avaya Aura Virtualized Environment
AVAYA AURA REDUNDANCY
CM Software Duplication & VMware HA
[Figure: a data center with ESXi Cluster 1 and ESXi Cluster 2 on VMware hosts connected by a network. Communication Manager A and Communication Manager B are paired by CM Software Duplication; VMware HA restarts Communication Manager A as Communication Manager A - 2.]
AVAYA AURA REDUNDANCY
CM Software Duplication & VMware HA
[Figure: Data Center 1 and Data Center 2 with ESXi Cluster 1 and ESXi Cluster 2 on VMware hosts connected by a network. Communication Manager A and Communication Manager B are paired by CM Software Duplication; Communication Manager C is the CM Survivable Core.]
GEO-REDUNDANCY CONFIGURATIONS
A Simple Active-Active Configuration with common CM and SCS – in a single data center
Configuration
• Phones simultaneously register with multiple SMs
• SM1 and SM2 serve SIP endpoints in Active-Active configuration
• CM and SCS act in an Active-Standby mode
Note: Media Gateway is not shown for brevity (CM and SC need media resources even for SIP endpoints)
[Figure: three panels (Sunny Day, Primary SM fails, Primary CM fails), each showing Phone1, Phone2, SM1, SM2, CM, and SCS in a single data center; an x marks the failed element.]
GEO-REDUNDANCY CONFIGURATIONS
An Active-Active Configuration with separate CM and SCS – in a dual data center configuration
Configuration
• Phones simultaneously register with multiple SMs
• SM1 and SM2 serve SIP endpoints in Active-Active Config
• Two pairs of CMs and SCSs provide geo-redundancy
Note: Media Gateway is not shown for brevity (CM and SC need media resources even for SIP endpoints)
[Figure: three panels (Sunny Day, Primary SM fails, Primary CM fails), each showing Phone1, Phone2, SM1, SM2, CM-1, CM-2, SCS-1, and SCS-2 across Data Center 1 and Data Center 2; an x marks the failed element.]
EXAMPLE – AGENT CALLS IN QUEUE --
CALL PRESERVATION FOR SIP TRUNKS ON SM FAILURE
State of affairs prior to the SM Call Preservation support
– The active-active SM redundancy approach ensured SIP clients and other network elements could establish new calls via the alternate SM when one instance of SM failed
– As media (RTP stream) does not pass through SM, SIP elements could continue to maintain the media for the existing calls past the SM failure
– Calls in the in-progress phase (with pending SIP transactions) would time out, and callers needed to dial again. Not good for callers in queue, who are lost!
Problem Domain
– With typical Contact Center based configurations, a large number of inbound calls (PSTN → Agents) stay “in-queue” while waiting for the availability of agents
– As these calls are “in progress” and not active, the calls were dropped when SM switched over
– It was not acceptable to expect that large number of “in-queue” callers to call again when SM failed
SOLUTION OVERVIEW
The SM Call Preservation feature ensures the routing function (the SRE) continues to carry traffic uninhibited via the alternate SM
Instead of IP addresses, SM inserts a Failover Group Domain Name (FGDN) in the pertinent SIP headers of the requests
The SM FGDN resolves to an ordered set of SM instances, with the preferred SM instance first
If an SM fails over in the middle of a call, SM peer elements continue the message exchange with the alternate SM, derived from SM FGDN resolution
– Mid-transaction and mid-dialog message exchange is successfully carried over the alternate SM
Requires peer elements to support domain resolution
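The FGDN idea can be sketched as a lookup that returns SM instances in preference order, with the peer choosing the first one that is still up. This is an illustrative model only: the domain name, the addresses (from the documentation range), and the `next_hop` helper are hypothetical, and a real peer would use DNS rather than a dictionary.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical FGDN and addresses; the preferred SM instance is listed first.
FGDN_TABLE: Dict[str, List[str]] = {
    "sm-fg.example.com": ["192.0.2.11", "192.0.2.12"],
}

def resolve_fgdn(fgdn: str) -> List[str]:
    """Return the ordered set of SM instances behind a failover group name."""
    return list(FGDN_TABLE[fgdn])

def next_hop(fgdn: str, is_up: Callable[[str], bool]) -> Optional[str]:
    """Pick the most-preferred SM instance that is currently reachable."""
    for address in resolve_fgdn(fgdn):
        if is_up(address):
            return address
    return None

down = {"192.0.2.11"}  # the preferred SM has failed mid-call
hop = next_hop("sm-fg.example.com", lambda addr: addr not in down)
```

Because the SIP headers carry the group name rather than one SM's IP address, a mid-dialog request in this model reaches the alternate SM without the peer needing any per-call state about the failure.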
KEY POINTS
The solution covers:
– Contact Center configuration with H.323 Agents only
– The existing configuration comprises SM, CM, AAEP and M3K
– ASBCE support coming soon (1H’2016)
The solution expects SM’s peer element to support domain resolution. Although this is standards-compliant, not many third-party elements support it
SIP TRUNK CALL PRESERVATION CONFIGURATION # 1
CCElite (without AAEP)
• When one SM goes down, the other SM can serve responses and in-dialog requests (required for the mid-call features) that were originally homed on the down SM
• New calls are established via SM-2 – these calls stay with SM-2 even after SM-1 is restored
After the SM handling calls goes down, the existing calls are handled via the alternate SM
[Figure: physical view and logical view showing the PSTN, M3K/G860, SM-1, SM-2, CCElite (CMcc), and H.323 based agent devices; an x marks the failed SM.]
SIP TRUNK CALL PRESERVATION CONFIGURATION # 2
CCElite (with AAEP/ICR)
• When one SM goes down, the other SM can serve responses and in-dialog requests that were originally homed on the down SM
After the SM handling calls goes down, the existing calls are handled via the alternate SM
[Figure: physical view and logical view showing the PSTN, G860, SM-1, SM-2, AAEP/ICR, and CM; an x marks the failed SM.]
HIGH SECURITY PROFILE IN AVAYA AURA SOLUTION
Access Control
Restricted access to applications and services
Authentication of all users with two-factor authentication
Authorization of all user actions is usually enforced via role-based access control (RBAC) or mandatory access control
Secure Communication
Securing all traffic flow: use TLS (but no SSL) to secure signaling and data paths; secure the media path with SRTP
Confidentiality: encryption of all network traffic using standard protocols and standard encryption algorithms (AES preferred)
Integrity: the ability to verify the integrity of all communications; usually a keyed hash is used to provide message integrity (HMAC-SHA2 preferred, SHA1 should be retired)
Non-repudiation: build in the ability to irrefutably identify the parties in a call or SIP session by means of security logs and use of PKI certificate extension
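The keyed-hash integrity check mentioned above can be illustrated with Python's standard library. This is a generic HMAC-SHA256 sketch, not the mechanism any specific Avaya component uses; the key and message are placeholder values.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag (SHA-2 family) over the message."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(sign(key, message), tag)

key = b"placeholder-shared-secret"
message = b"INVITE sip:user@example.com SIP/2.0"
tag = sign(key, message)
```

A receiver holding the same key recomputes the tag; any change to the message in transit makes `verify` return `False`.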
HIGH SECURITY PROFILE IN
AVAYA AURA SOLUTION (CONT’D)
Secure Network
Network Access Control / 802.1x for all endpoints
Separation of voice, data, and management traffic using VLANs with different VLAN IDs
Firewall and SBC to detect and defend against intrusions
Access Control by means of building a ‘white list’
Secure Servers
Harden the Operating System to meet customer specifications
No USB tokens, drives, or cameras. No NFS-mounted file systems
Close all unnecessary or unused ports and user accounts
Auditability of the Environment
Audit trail of all administrative actions, logins, and logouts
Generate Call Detail Records of all calls
Log and generate alarm for any system abnormalities
SYSTEM MANAGER AS A CERTIFICATE AUTHORITY (CA)
System Manager is by default a Root CA (self-signed root certificate) or can be set up as a Sub-CA (from a Third Party Certificate Authority)
Uses a third-party open source application, Enterprise Java Beans Certificate Authority (EJBCA), to issue identity and trusted certificates to applications through Simple Certificate Enrollment Protocol (SCEP)
System Manager Trust Management provisions and manages certificates of various applications, such as servers and devices, enabling the applications to have secure inter-element communication
System Manager generates certificates using SHA2 as the signing algorithm and 2048 bits as the default key size
AN IMPORTANT UPDATE ABOUT AVAYA DIGITAL
CERTIFICATE POLICY CHANGE
Avaya products have used digital certificates for some time. However…
– Non-unique Avaya SIP Product CA certificates were previously used across products to provide out-of-box lab/demo/default support for TLS.
– The Avaya default certificate is a SHA1/1024 identity certificate that no longer meets current NIST standards.
New Policy:
New greenfield installations do not come with default CA certs
– New SM servers no longer use the Avaya “SIP Product CA” issued Default Certificates.
– The new generation of Avaya Communicator Clients will no longer trust demo certificates and will require servers to have certificates issued by a trusted CA in order for the client to establish a secure connection.
A new CA certificate must be created via SMGR EJBCA (recommended), Enterprise (Private) CA, or a third party trusted root authority. The default certificate that came with each product should be upgraded.
All products deployed in an enterprise must use the same secure hash algorithm and key size across a solution, i.e.,
– SHA1 2048 or SHA2 2048
AVAYA AURA SYSTEM MANAGER SINGLE SIGN ON AND CENTRAL MANAGEMENT
Centralized Certificate Management
Trust Management service that provisions and manages certificates of various applications, servers, and devices, thereby enabling secure inter-element communication
Support for https for System Manager administration - System Manager uses certificates to create a trust domain where trust relationships are established between hosts and services to enable secure communication (e.g. Transport Layer Security)
Central management of identity and trusted certificates using System Manager web console
Support for 3rd Party certificate
Support for element/application registration and Public-Key Infrastructure (PKI), including trust management for the elements. The managed elements require the enrollment password to request certificates from the System Manager Trust Management
SHA2 Support
Embedded identity management (OpenSSO)
• Multiple administrators can access multiple devices through the System Manager User Interface
• Administrators log in once to the System Manager User Interface and can manage multiple devices without the need to log in separately to each device
Enterprise level authentication
• Active Directory
• Sun LDAP 5.2
• OpenLDAP
• RADIUS
AVAYA AURA SESSION MANAGER SECURITY OVERVIEW
BEST PRACTICE #1:
SECURE ALL DATA IN TRANSMISSION
Goal: Ensure ALL data is transmitted in an encrypted format. No clear text transmission is allowed.
Best Practice:
– Remove SSLv3 (a server that accepts SSLv3 is vulnerable to POODLE, a protocol flaw, even if it also supports more recent versions of TLS).
– All communications between the client and the servers in the Avaya Aura environment can be secured using the Transport Layer Security (TLS) protocol. This includes signaling and data traffic.
– TLS 1.2 is used by default, with the capability to enable/disable TLS 1.0 and 1.1 to provide backward compatibility
– Use SRTP (RFC 3711), SDP (RFC 4568) with Cap-Neg (RFC 6871) to secure media traffic.
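On the client side, the "TLS 1.2 by default, no SSLv3" guidance can be expressed with Python's standard `ssl` module. This is a generic sketch of a client context, not Avaya configuration; it covers signaling and data connections only, since SRTP for media is negotiated separately.

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Client-side TLS context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()            # verifies certs and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # rejects SSLv3, TLS 1.0, TLS 1.1
    return ctx

context = make_client_context()
```

Lowering `minimum_version` would be the equivalent of re-enabling TLS 1.0 or 1.1 for backward compatibility, which this sketch deliberately does not do.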
BEST PRACTICE #2:
USE A MORE STRINGENT CRYPTOGRAPHY MODULE
Goal: Replace cryptographic modules, algorithms, and random number generator in your application with FIPS 140-2 compliant ones if the application implements encryption, key exchange, digital signature, and hash functionality.
Best Practice:
– Set the OS security module in 'FIPS' mode.
– The applications, including SSH, must be configured to use only FIPS 140-2 approved ciphers.
– The FIPS 140-2 supported cryptographic modules can be found at: http://csrc.nist.gov/publications/fips/fips140-2/fips1402annexa.pdf
BEST PRACTICE #3: ENSURE ACCOUNT PASSWORDS
AND KEYS ARE STORED IN AN APPROVED ENCRYPTED FORMAT
Goal: To ensure passwords and private keys are stored in an encrypted format that cannot be seen/decrypted
Best Practice:
– Storing (current and N-previous) passwords/PINs in SHA256 hashed form is recommended
– In the Avaya Aura® solution, System Manager (SMGR) is responsible for ensuring storage of account passwords in an approved encrypted format.
– Password/PIN/Key file content should never be in clear text, and the filename needs to be obfuscated and accessible only by the proper application.
– Database/DBMS files should be protected, with no general user logins or access, only applications.
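A sketch of storing the current and N previous passwords in hashed form. Note the substitution: instead of a bare SHA-256 digest, this uses salted PBKDF2 with HMAC-SHA256, a common hardening that still builds on SHA-256. The iteration count, history depth, and function names are assumptions for illustration and say nothing about how System Manager stores credentials.

```python
import hashlib
import hmac
import os
from typing import List, Tuple

ITERATIONS = 100_000  # hypothetical work factor
HISTORY_DEPTH = 3     # hypothetical "N previous" passwords

Record = Tuple[bytes, bytes]  # (salt, derived key); never the clear text

def hash_password(password: str, salt: bytes = b"") -> Record:
    """Derive a key from the password with a random (or supplied) salt."""
    salt = salt or os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, key

def matches(password: str, record: Record) -> bool:
    """Re-derive with the stored salt and compare in constant time."""
    salt, key = record
    return hmac.compare_digest(hash_password(password, salt)[1], key)

def change_password(history: List[Record], new_password: str) -> bool:
    """Reject reuse of the current or N previous passwords; history[0] is current."""
    if any(matches(new_password, r) for r in history):
        return False
    history.insert(0, hash_password(new_password))
    del history[HISTORY_DEPTH + 1:]
    return True
```

Keeping only salts and derived keys in the history lets the reuse check work without ever storing a recoverable password.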
BEST PRACTICE #4: HARDENING THE OS PLATFORM
Goal: Avaya Aura® uses the hardened Red Hat Enterprise Linux operating system as a secure foundation, which can provide the platform security by limiting the number of access ports, services, and executables and protects the system from typical modes of attack.
Best Practice:
– Remove unused Linux RPM modules
– Close unnecessary IP ports/accounts
– File system partitions for the directories /tmp and /var are secured by setting noexec and nosuid
– Run ‘auditd’ to log the usage of the privileged commands such as unauthorized attempts to delete or modify files, change system time, scheduled jobs, change permissions, and add accounts
– Monitor the system integrity using Linux AIDE, SELinux, and HBSS
BEST PRACTICE #5:
ENFORCE THE NETWORK LAYER SECURITY
Goal: To prevent attacks to the Avaya Aura® Core at the network layer
Best Practice:
– Separate the management traffic and SIP signaling traffic onto different VLANs
– Use Linux ACL (e.g. via Linux IPTables) as the network firewall to close unused TCP or UDP ports
– Use the SIP application firewall to set up rules that rate-limit SIP messages and prevent SIP message spam
– HTTP/HTTPS DoS protection by rate-limiting the number of connections (e.g. 3 HTTP/HTTPS connections per remote entity) and the packet rate (e.g. < 200 packets/sec)
– Authenticate the remote service logins (sroot, inads, rasaccess, init, craft) via ASG/EASG to prevent unauthorized access
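Rate limiting of the kind described above is often implemented as a token bucket. The sketch below is a generic illustration rather than the SIP firewall's actual algorithm; the 200 per second rate echoes the slide's example threshold, while the burst size and class name are hypothetical.

```python
class TokenBucket:
    """Allow up to `burst` messages at once, refilled at `rate_per_sec`."""

    def __init__(self, rate_per_sec: float, burst: int) -> None:
        self.rate = rate_per_sec
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.last = 0.0

    def allow(self, now: float) -> bool:
        """Return True if a message arriving at time `now` may pass."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

# 200/sec echoes the slide's example threshold; the burst size is hypothetical.
bucket = TokenBucket(rate_per_sec=200, burst=5)
results = [bucket.allow(now=0.0) for _ in range(7)]
```

In this run the first five messages of the burst pass and the rest are dropped until enough time has elapsed for tokens to refill.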
BEST PRACTICE #6:
ENFORCE APPLICATION LAYER SECURITY
Goal: To ensure Avaya Aura® application layer security
Best Practice:
– Implement Role-Based Access Control (RBAC) to categorize users and assign them different roles and privileges.
– Implement and enforce a password/PIN complexity policy (e.g. password aging, complexity, length, character combination, and reuse requirements)
– Enforce active-session and inactive-session timeouts
– Always run system audit and security/syslog logging
– Ensure proper reporting of events or alarms on system security events or anomalies.
BEST PRACTICE #7:
USE PKI FOR CERTIFICATE MANAGEMENT
Goal: To ensure the Avaya Aura® system uses the PKI infrastructure for
certificate management and processing
Best Practice:
– Avoid using default/demo certificate
– Ensure the certificate trust chain is checked and path processing is performed in all accounts
– Build in mutual authentication by means of entity certificates in the certificate trust chain
– Validate certificate revocation status using OCSP, CRL or CTL
– Ensure the received SIP message matches the identity stated in the identity certificate of the sending party
BEST PRACTICE #8: COMPLY WITH BOTH THE FEDERAL
AND COMMERCIAL SECURITY STANDARDS
Goal: Comply with the federal and commercial industry security standards to ensure maximum security posture and minimum risk of security breaches
Best Practice:
– Allow user authentication and authorization via an external AAA/RADIUS server
– System must support regulatory compliance such as FISMA (for Federal and Federal IS), HIPAA (for healthcare/health insurance industry), PCI (for credit card and payment account data security) and be compliant with the more stringent federal standards such as JITC/DISA STIGs, NIST US-CERT alerts (CVEs/CCEs vulnerabilities).