increasing ip network survivability: an introduction to protection mechanisms
DESCRIPTION
Increasing IP Network Survivability: An Introduction to Protection Mechanisms. 20. October 22, 2000. Jonathan Sadler, Lead Engineer - ONG SE. Motivation. There is increasing demand to carry mission critical traffic, real-time traffic, and other high priority traffic over the public internet.
File Name
Increasing IP Network Survivability: An Introduction to Protection Mechanisms
Jonathan Sadler, Lead Engineer - ONG SE
October 22, 2000
20
Motivation
There is increasing demand to carry mission critical traffic, real-time traffic, and other high priority traffic over the public internet
Any network that carries critical, high-priority traffic needs to be resilient to faults
As network technologies continue to improve and converge, protection and restoration schemes have become available at multiple layers
Protection
What is it?
Automated mechanism for recovering traffic path
Invoked when the current working path fails
Requirements
Fast restoration time
Voice / video / data can tolerate small outages (< 50 ms)
Predictable
Protection path is pre-determined
Can be dedicated (1+1) or shared (M:N)
Can be preemptive
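The cost difference between dedicated and shared protection can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name `protection_overhead` and the group of four working channels are assumptions, not figures from the presentation.

```python
from fractions import Fraction


def protection_overhead(working_channels: int, protect_channels: int) -> Fraction:
    """Return spare capacity as a fraction of the working capacity it protects."""
    if working_channels <= 0 or protect_channels < 0:
        raise ValueError("need at least one working channel and no negative counts")
    return Fraction(protect_channels, working_channels)


# Dedicated (1+1): every working channel has its own protect channel.
dedicated = protection_overhead(working_channels=1, protect_channels=1)
# Shared: a hypothetical group where one protect channel covers four working channels.
shared = protection_overhead(working_channels=4, protect_channels=1)
print(dedicated, shared)
```

Shared protection reserves less spare capacity, at the price of covering only as many simultaneous failures as there are protect channels in the group.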
Protection
How is protection different from dynamic rerouting?
Dynamic rerouting develops a new path utilizing current network state information
Delay incurred as state updates are flooded through network
Time to re-converge on new end-to-end path is long
Therefore time until destinations become re-reachable is long
Side Effect: State information will be received by nodes that are not involved in restoration causing unnecessary CPU usage
While best effort services may tolerate this behavior, new services will not
VoIP
Virtual leased line
Protection Domains
Method of dividing up a network into separate sub-networks in which a protection mechanism will operate
Cross domain coordination is required
Protection Topologies
Within a protection domain, a number of protection topologies may be used
Linear
Ring
Mesh
For any topology the following terminology applies:
Working: The path or span being used to carry live traffic
Protect: The path or span that will be used to recover live traffic
Protection Topologies - Linear
Two nodes connected to each other with two or more sets of links
[Figure: linear protection between two nodes, showing working and protect links in (1+1) and (1:n) arrangements]
Protection Topologies - Ring
Two or more nodes connected to each other with a ring of links
Line vs. Drop interfaces
East vs. West interfaces
[Figure: ring of nodes with East (E) and West (W) line (L) interfaces, drop (D) interfaces, and working and protect paths]
Protection Topologies - Mesh
Three or more nodes connected to each other
Can be sparse or complete meshes
Spans may be individually protected with linear protection
Overall edge-to-edge connectivity is protected through multiple paths
[Figure: mesh of nodes showing a working path and a protect path]
Protection Mechanisms
Protection mechanisms are the algorithms which will restore services carried by a specific network topology
Typically take advantage of topology characteristics
Two different approaches exist
Link oriented
Multiple links that support end-to-end connectivity can be individually switched to restore service
Path oriented
Two paths exist which can be “globally” switched to restore service
Protection Mechanisms - Linear APS
Two nodes connected to each other with two or more sets of transmission facilities
Receiving node will signal source node to change from working to protect facility via out-of-band communication
[Figure: two nodes joined by working and protect facilities, exchanging “Switchover” and “OK” messages]
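The exchange can be sketched as two small state holders. This is a minimal illustration, not the SONET APS protocol itself: the class names, the `"IGNORED"` reply, and the direct method call standing in for the out-of-band channel are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SourceNode:
    active: str = "working"  # facility currently used to transmit

    def handle(self, message: str) -> str:
        """React to a request from the receiving node and acknowledge it."""
        if message == "Switchover":
            self.active = "protect"
            return "OK"
        return "IGNORED"


@dataclass
class ReceivingNode:
    selected: str = "working"  # facility currently used to receive

    def on_signal_fail(self, source: SourceNode) -> None:
        """Ask the source to switch, then follow once it acknowledges."""
        reply = source.handle("Switchover")  # stands in for the out-of-band channel
        if reply == "OK":
            self.selected = "protect"


source = SourceNode()
receiver = ReceivingNode()
receiver.on_signal_fail(source)
print(source.active, receiver.selected)
```

The receiver moves its selector only after the acknowledgement, so both ends agree on which facility carries traffic.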
Protection Mechanisms - BLSR
Bi-directional = both directions are handled as one unit
Line Switched = multiple nodes reconfigure line behavior
Ring
Node that determines need for change will signal out-of-band to other node. All intermediate nodes on protect path then “reconfigure”.
Pros: Efficient
Cons: Not as fast as other protection mechanisms
[Figure: ring between nodes A and Z showing A-Z and Z-A working and protect paths, with “Switchover” and “OK” messages]
Protection Mechanisms - BLSR cont’d
How is this efficient?
Each node is involved in reconfiguring when a protection switch is necessary. Consequently, each node knows if the bandwidth reserved for a service is actually in use.
If a specific route is declared the “primary route” for the service, then the protect path will only be used when trying to restore a failure on the primary route.
As a result, it is possible to insert a second signal on the protect path.
When a protection switch is necessary to handle the higher priority traffic, then the “Extra Traffic” will be removed by the nodes as part of the switchover activity.
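The Extra Traffic idea can be sketched as a protect channel that accepts low-priority traffic while idle and drops it on a protection switch. The class and method names here are hypothetical; real BLSR nodes coordinate this through ring signaling rather than a single object.

```python
class ProtectChannel:
    """Protect bandwidth that may carry preemptible Extra Traffic while idle."""

    def __init__(self) -> None:
        self.carrying = None  # None (idle), "extra", or "protected"

    def insert_extra_traffic(self) -> bool:
        """Place Extra Traffic on the channel; succeeds only if it is idle."""
        if self.carrying is None:
            self.carrying = "extra"
            return True
        return False

    def protection_switch(self) -> bool:
        """Claim the channel for protected traffic.

        Returns True if Extra Traffic had to be removed to do so.
        """
        preempted = self.carrying == "extra"
        self.carrying = "protected"
        return preempted


channel = ProtectChannel()
accepted = channel.insert_extra_traffic()
dropped = channel.protection_switch()
print(accepted, dropped, channel.carrying)
```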
Protection Mechanisms - BLSR cont’d
Why is more time needed for a protection switch?
Signaling latency
Traffic cross connect activation / deactivation in intermediate nodes
Definitely needed when Extra Traffic is in use
Protection Mechanisms - UPSR
Unidirectional = Each traffic direction is independent
Path Switched = Not handled “node-by-node”
Ring
Source generates two copies of signal
Destination evaluates both copies and chooses “best path” signal
Pros: Low switch time
Cons: Not efficient?
[Figure: ring between nodes A and Z showing A-Z and Z-A working and protect paths]
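The destination's choice between the two copies can be sketched as a simple selector. The quality measure (an error count, with `None` meaning the copy is lost) and the tie-break in favor of the working copy are assumptions made for illustration.

```python
from typing import Optional


def select_copy(working_errors: Optional[int], protect_errors: Optional[int]) -> Optional[str]:
    """Pick the better of two received copies.

    Each argument is an error count for that copy, or None if the copy is lost.
    Returns "working", "protect", or None when neither copy is usable.
    """
    if working_errors is None and protect_errors is None:
        return None
    if working_errors is None:
        return "protect"
    if protect_errors is None:
        return "working"
    # On a tie, stay on the working copy to avoid needless switching.
    return "working" if working_errors <= protect_errors else "protect"


print(select_copy(0, 0), select_copy(None, 3), select_copy(10, 2))
```

Because the decision is local to the destination, no signaling round trip is needed, which is consistent with the low switch time noted on the slide.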
Protection Mechanisms - Mesh
End-to-End Path Oriented
Requires:
Topology Discovery
Constrained Route Selection (x2)
Primary route
Protection route
Resource affinity (diversity)
Signaling Protocol
Service setup
Protection switchover
No standard solutions (yet)
Protection Mech. - Revertive Switching
Once the failed path has been restored, should the traffic be moved back?
Non-revertive Switching
Done when failed path is no longer going to be used with service (i.e. service rolls)
Revertive Switching
Automatic
System determines primary path is acceptable
Wait to Restore Time
Manual
Technician determines primary path is acceptable
Good in cases where the fault is experienced only under load
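Automatic revertive switching with a Wait to Restore time can be sketched as a single check. The function name, the use of seconds, and the 300-second value in the example are assumptions, not values from the presentation.

```python
from typing import Optional


def should_revert(now: float, repaired_at: Optional[float], wait_to_restore: float) -> bool:
    """Decide whether traffic may move back to the primary path.

    repaired_at is the time the primary path last became healthy, or None while
    it is still failed. The caller resets it to None if the path fails again.
    """
    if repaired_at is None:
        return False
    return now - repaired_at >= wait_to_restore


# Hypothetical 300-second Wait to Restore time.
print(should_revert(now=100.0, repaired_at=None, wait_to_restore=300.0))
print(should_revert(now=250.0, repaired_at=100.0, wait_to_restore=300.0))
print(should_revert(now=400.0, repaired_at=100.0, wait_to_restore=300.0))
```

Requiring the path to stay healthy for the whole interval keeps an intermittent fault from bouncing traffic back and forth.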
Protection Domain Consideration
What should be the scope of repair?
Global Repair
Traffic is restored using facilities within the global network
Local Repair
Traffic is restored using the minimum amount of facilities
Lacks network view, leading to potentially inefficient resource utilization
Protection Hierarchy
Protection functionality is defined for:
Optical Layer
SONET
ATM / Frame Relay
MPLS / IP
How should all these layers interact?
They shouldn’t
Two Layer Recovery Model
Most providers are adopting a two-layer model, where:
Very-fast bulk restoration is done as close to the transport media as possible
Optical Switching
SONET where Optical Switching is not available
Service level restoration is done at the specific service layer
SONET -- VT1.5, STS-1, STS-3c, STS-12c, STS-48c services
ATM / FR -- Switched Data Services
MPLS -- IP Services
Layers in between are not used for restoration
Service level restoration timers are set so that transport restoration can be attempted first
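The timer relationship in the last bullet can be sketched as follows. The function name, the millisecond units, and the example values are assumptions; the slide states only that service-level timers are set so that transport restoration is attempted first.

```python
from typing import Optional


def recovering_layer(transport_restore_ms: Optional[float], hold_off_ms: float) -> str:
    """Report which layer ends up restoring a failed service.

    transport_restore_ms is how long the transport layer takes to restore the
    failure, or None if it cannot. hold_off_ms is how long the service layer
    waits before acting on the same failure.
    """
    if transport_restore_ms is not None and transport_restore_ms <= hold_off_ms:
        return "transport"
    return "service"


# Hypothetical values: the service layer holds off for 100 ms.
print(recovering_layer(50.0, 100.0))
print(recovering_layer(None, 100.0))
print(recovering_layer(250.0, 100.0))
```

The hold-off must exceed the transport layer's expected switch time; otherwise both layers may react to the same failure.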
Two Layer Recovery Model - Why?
Why have two layers instead of one?
Optical switching allows for the greatest number of services to be restored with the least amount of overhead
Optical switching will find out about physical failures first
Loss of light
Optical AIS
Optical protection domains are typically smaller than service-level protection domains, reducing signaling time
Service layers understand service specific performance requirements best, but may have a large number of services to restore
Protection in SONET/SDH
Topologies / Mechanisms Available
1+1 Linear APS
UPSR
BLSR
2-fiber -- Restoration channels must be reserved, reducing protected capacity
4-fiber -- two sets of Tx/Rx fibers for each line interface
Span Switch: Can restore by utilizing alternate Tx/Rx fibers
Ring Switch: Utilizes restoration channels located on a separate ring
Extra Traffic possible
APS, BLSR signaling done in K1 / K2 bytes of overhead
Protection in SONET/SDH (cont’d)
Failure Criteria
Loss of Signal (LOS)
Loss of Frame (LOF)
Threshold Crossing
Bit Error Rate (BER)
Coding Violations (CV)
Excessive SONET Pointer Justifications
Alarm Indication Signal (AIS)
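A rough sketch of how these criteria might combine into a single switch decision is shown below. The grouping of LOS, LOF, and AIS as immediate triggers, the function name, and the 1e-6 BER threshold are assumptions for illustration; actual thresholds are provisioned per the relevant equipment criteria.

```python
from typing import Iterable

# Alarms treated here as immediate triggers (an assumption for this sketch).
ALARM_TRIGGERS = {"LOS", "LOF", "AIS"}


def needs_protection_switch(alarms: Iterable[str], bit_errors: int,
                            bits_observed: int, ber_threshold: float) -> bool:
    """Return True if an alarm is present or the measured BER crosses the threshold."""
    if ALARM_TRIGGERS & set(alarms):
        return True
    if bits_observed <= 0:
        return False  # nothing measured yet, so no threshold can be crossed
    return bit_errors / bits_observed > ber_threshold


print(needs_protection_switch(["LOS"], 0, 0, 1e-6))
print(needs_protection_switch([], 5, 1_000_000, 1e-6))
print(needs_protection_switch([], 1, 1_000_000_000, 1e-6))
```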
Applying Protection to MPLS
What does this do for me?
Provides fast restoration of MPLS services
Can be done on a service-by-service basis. For example:
Best effort could be biased to use Extra Traffic links
Bronze could be put on unprotected, but avoid Extra Traffic
Silver could be protected 1:n
Gold could be protected 1+1
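The per-service example above could be captured as a small policy table. The dictionary layout and key names are hypothetical; only the class names and their protection levels come from the slide.

```python
# Hypothetical policy table built from the four example service classes.
PROTECTION_POLICY = {
    "best-effort": {"protection": None, "link_preference": "extra-traffic"},
    "bronze": {"protection": None, "link_preference": "avoid-extra-traffic"},
    "silver": {"protection": "1:n", "link_preference": None},  # preference not stated on the slide
    "gold": {"protection": "1+1", "link_preference": None},    # preference not stated on the slide
}


def protection_for(service_class: str):
    """Return the protection scheme for a class, or None if it is unprotected."""
    return PROTECTION_POLICY[service_class]["protection"]


print(protection_for("gold"), protection_for("silver"), protection_for("bronze"))
```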
Applying Protection to MPLS - How?
Perform constraint-based route selection for primary path
Signal creation of working path LSP
Perform constraint-based route selection for secondary path, adding a constraint which removes links that do not meet diversity requirements
Signal “reservation” of protect path LSP
[Figure: network showing the working and protect paths]
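The two route-selection steps can be sketched on a toy topology. This is a simplified illustration under stated assumptions: fewest-hop breadth-first search stands in for constraint-based routing, the `risk_group` mapping is a hypothetical way to express diversity requirements, and the function names and topology are invented for the example.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Fewest-hop path over undirected links using BFS, or None if unreachable."""
    adjacency = {}
    for a, b in links:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)
    previous = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adjacency.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


def diverse_pair(links, risk_group, src, dst):
    """Select a primary path, prune non-diverse links, then select a protect path."""
    primary = shortest_path(links, src, dst)
    if primary is None:
        return None, None
    used = {frozenset(hop) for hop in zip(primary, primary[1:])}
    risky = {risk_group[link] for link in used if link in risk_group}
    remaining = [
        link for link in links
        if frozenset(link) not in used and risk_group.get(frozenset(link)) not in risky
    ]
    return primary, shortest_path(remaining, src, dst)


links = [("A", "B"), ("B", "D"), ("A", "C"), ("C", "D"), ("A", "E"), ("E", "D")]
# Links B-D and C-D are assumed to share a duct, so they fail together.
risk_group = {frozenset(("B", "D")): "duct-1", frozenset(("C", "D")): "duct-1"}
primary, protect = diverse_pair(links, risk_group, "A", "D")
print(primary, protect)
```

Choosing the primary first and pruning afterwards is simple but greedy: on some topologies it finds no protect path even though a diverse pair exists, and it enforces link diversity only, not node diversity.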
Applying Protection to MPLS - How?
Extensions to IS-IS / OSPF
Utilizes the same Constraint Routing extensions as TE
New constraint: Shared Risk Link Group (SRLG)
Used for diversity determination
Extensions to CR-LDP / RSVP-TE
Add Protection LSP declaration to ERO
Add Reverse Notification Tree & Fault Notification Messages
MPLS Protection - General Mesh Mech?
End-to-End Path Oriented
Requires:
Topology Discovery
Constrained Route Selection (x2)
Primary route
Protection route
Resource affinity (diversity)
Signaling Protocol
Service setup
Protection switchover
Topology Discovery: OSPF w/ TE, IS-IS w/ TE
Signaling Protocol: RSVP-TE, CR-LDP
Benefits of a Generalized Control Plane
Extension of MPLS to non-IP technologies allows for:
Rapid provisioning of lower layer connections
Optical trails
SONET / SDH trails
Cut-through connections
Reduces traffic load on core routers
Extension of IP semantics (i.e. diff-serv)
Validates services that paid for protection are protected
Cut-through connection (simplified example)
Four IP Routers operating over Optical Network
Initial overlay network connects routers in a hub / spoke topology
High traffic load exists between Router A and D
Router A realizes need for direct path (based on link load threshold crossing), and signals request for path into network
New direct path is now used for A-D traffic
[Figure: routers A, B, C, and D connected over the optical network]
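The trigger in this example can be sketched as a utilization check. The function name, the 80% threshold, and the traffic figures are assumptions; the slide says only that the decision is based on a link load threshold crossing.

```python
def needs_direct_path(load_bps: float, capacity_bps: float, threshold: float = 0.8) -> bool:
    """Return True when link utilization reaches the configured threshold."""
    if capacity_bps <= 0:
        raise ValueError("capacity must be positive")
    return load_bps / capacity_bps >= threshold


# Hypothetical numbers: A-D traffic riding a 155 Mb/s link toward the hub.
print(needs_direct_path(load_bps=140e6, capacity_bps=155e6))
print(needs_direct_path(load_bps=60e6, capacity_bps=155e6))
```

A deployed trigger would likely add hysteresis or an averaging window so that a brief burst does not cause a path request.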
Summary
New services require mechanisms to recover working traffic as fast as possible
Optical Layer protection tools provide restoration with the least amount of overhead
Service Layer protection is also necessary
MPLS-TE with extensions can provide protection support for IP Networks
Can be extended to support any mesh network
Use of MPLS to integrate Optical and IP control planes allows IP service semantics to control protection mechanisms used at lower layers
Sample Deployment
[Figure: sample deployment showing SONET rings, DACS, ADMs, T1-attached routers, LATA routers, distribution routers, core routers at inter-connection points, OC-3 links with APS, and a long-haul network of OXCs]
Sample Deployment - LATA
SONET Protection in Local Loop Network
IP Mesh Protection in Distribution Network for IP services
SONET Protection in Distribution Network for Private Line services
[Figure: LATA portion of the sample deployment showing SONET rings, DACS, T1-attached routers, LATA routers, distribution routers, and an OC-3 link with APS]
Sample Deployment - Long-Haul
Private Line and IP services are clients of Optical Core Network
Optical Core Network is a sparse mesh protected by MPLS mechanisms
[Figure: long-haul network drawn as a mesh of OXCs]
References
GR-253-CORE, “Synchronous Optical NETwork (SONET) Transport Systems: Common Generic Criteria,” Issue 2 rev 2, (Bellcore, January 1999)
GR-1230-CORE, “SONET Bi-directional Line Switched Ring (BLSR) Equipment Generic Criteria,” Issue 4, (Bellcore, December 1998)
GR-1400-CORE, “SONET Dual-Fed Unidirectional Path Switched Ring (UPSR) Equipment Generic Criteria,” Issue 2, (Bellcore, January 1999)
draft-owens-te-network-survivability-00.txt, “Network Survivability Considerations for Traffic Engineered IP Networks,” (IETF, March 2000)
draft-ietf-mpls-recovery-frmwrk-00.txt, “Framework for MPLS-based Recovery,” (IETF, September 2000)
draft-chang-mpls-path-protection-01.txt, “A Path Protection / Restoration Mechanism for MPLS Networks,” (IETF, July 2000)
draft-chang-mpls-rsvpte-path-protection-ext-00.txt, “Extensions to RSVP-TE for MPLS Path Protection,” (IETF, June 2000)