exploiting network kriging for fault localization · exploiting network kriging for fault...
TRANSCRIPT
![Page 1: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/1.jpg)
Exploiting Network Kriging for Fault Localization
K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1
OFC 2016
1: Computer Engineering and Informatics Department, University of Patras, and Computer Technology Institute and Press, Patras, Greece
2: Scuola Superiore Sant’Anna, Pisa, Italy
![Page 2: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/2.jpg)
Introduction
![Page 3: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/3.jpg)
Introduction• Evolution toward high flexibility:
• transmission parameters optimized for setup and changed in case of degradations/faults • reduction of worst-case margins to reduce costs (e.g., [a])
• soft failures (e.g., implying Quality of Transmission – QoT – degradations) more frequent à need of monitoring to re-act to both soft- and hard-failures • At the control plane, ABNO is emerging as an architecture for the control and management
• ABNO OAM Handler responsible to receive alarms, to correlate the alarms, and to take actions to preserve services
• Network Kriging (NK) [b]: mathematical framework used for correlation !In this paper: • a correlation framework based on NK for (soft or hard) fault localization • if correlation does not solve localization (ambiguity): setup of new lightpaths with the scope of identifying
unambiguously the failed elements • alternatively, pre-establishment of LPs for monitoring and failure localization
• We also propose a heuristic Failure Localization-Aware Routing and Spectrum Allocation (FLA-RSA) algorithm that provisions lightpaths with the objective of reducing the failure localization ambiguity
[a] Y. Pointurier, invited talk OFC 2016 [b] D.B. Chua et al., IEEE JSAAC 24(1)
![Page 4: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/4.jpg)
Control&management workflow for failure localization
YES
NO
Detected failed lightpaths Notify OAM Handler
OAM Handlertriggers re-routing of interested lightpaths
Link maintenance can take place
OAM Handler exploits NK for
fault localization
Exploit monitoring as a service plus
NK for fault localization
Is localization unambiguous?
Assumption: lightpath monitors based on DSP employed at the receiver
![Page 5: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/5.jpg)
Control&management workflow for failure localization
YES
NO
Detected failed lightpaths Notify OAM Handler
OAM Handlertriggers re-routing of interested lightpaths
Link maintenance can take place
OAM Handler exploits NK for
fault localization
Exploit monitoring as a service plus
NK for fault localization
Is localization unambiguous?
Assumption: lightpath monitors based on DSP employed at the receiver
![Page 6: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/6.jpg)
Control&management workflow for failure localization
YES
NO
Detected failed lightpaths Notify OAM Handler
OAM Handlertriggers re-routing of interested lightpaths
Link maintenance can take place
OAM Handler exploits NK for
fault localization
Exploit monitoring as a service plus
NK for fault localization
Is localization unambiguous?
Assumption: lightpath monitors based on DSP employed at the receiver
![Page 7: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/7.jpg)
Control&management workflow for failure localization
YES
NO
Detected failed lightpaths Notify OAM Handler
OAM Handlertriggers re-routing of interested lightpaths
Link maintenance can take place
OAM Handler exploits NK for
fault localization
Exploit monitoring as a service plus
NK for fault localization
Is localization unambiguous?
Assumption: lightpath monitors based on DSP employed at the receiver
![Page 8: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/8.jpg)
Control&management workflow for failure localization
YES
NO
Detected failed lightpaths Notify OAM Handler
OAM Handlertriggers re-routing of interested lightpaths
Link maintenance can take place
OAM Handler exploits NK for
fault localization
Exploit monitoring as a service plus
NK for fault localization
Is localization unambiguous?
Assumption: lightpath monitors based on DSP employed at the receiver
![Page 9: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/9.jpg)
Network Kriging concept
• Suppose LP1, LP2, LP3 established and monitored • LP4 to be estimated • Given additive metric y s.t. y=Gx
• y=e2e metric, G=routing matrix, x=link metric
• [y’m y’n]=[G’m G’n]x, where by m we represent the lightpaths for which monitoring data are available and by n those that it should be estimated
• Can estimate ŷ4 given y1, y2, y3
• Estimation technique = “network kriging”
yn=GnGm(GmGTm)+ym
![Page 10: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/10.jpg)
• Assumption: single link failure
Localization problem formulation with NK
![Page 11: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/11.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
Localization problem formulation with NK
![Page 12: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/12.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
Localization problem formulation with NK
![Page 13: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/13.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1
Localization problem formulation with NK
![Page 14: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/14.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1 • yn is set to the APp of monitored lightpaths, G’n is the corresponding routing
Localization problem formulation with NK
![Page 15: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/15.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1 • yn is set to the APp of monitored lightpaths, G’n is the corresponding routing
• Localization can be unambiguous (if one link is identified to have APl=0) or not
Localization problem formulation with NK
![Page 16: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/16.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1 • yn is set to the APp of monitored lightpaths, G’n is the corresponding routing
• Localization can be unambiguous (if one link is identified to have APl=0) or not
Localization problem formulation with NK
A B CLP1
LP2
!
(a)
Unambiguous: APAB=1; APBC=0
![Page 17: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/17.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1 • yn is set to the APp of monitored lightpaths, G’n is the corresponding routing
• Localization can be unambiguous (if one link is identified to have APl=0) or not
Localization problem formulation with NK
A B CLP1
LP2
!
(a)
Unambiguous: APAB=1; APBC=0
D E F G
LP3 LP4
!
(b)
Ambiguous: APDE=0.5; APEF=0.5; APFG=1
![Page 18: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/18.jpg)
• Assumption: single link failure• We define the Active Parameter (AP)
• APl =1 if link l is active, while APl=0 if it has failed (these are the unknown parameters)
• APp=N if path p traversing N links is active, otherwise, if it is failed APp=N-1 • yn is set to the APp of monitored lightpaths, G’n is the corresponding routing
• Localization can be unambiguous (if one link is identified to have APl=0) or not
Localization problem formulation with NK
A B CLP1
LP2
!
(a)
Unambiguous: APAB=1; APBC=0
D E F G
LP3 LP4
!
(b)
Ambiguous: APDE=0.5; APEF=0.5; APFG=1
If ambiguous à monitor as a service: set up of new lightpaths for failure localization
![Page 19: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/19.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]
Failure Localization Aware RSA (FLA-RSA)
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 20: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/20.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d
Failure Localization Aware RSA (FLA-RSA)
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 21: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/21.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d
Failure Localization Aware RSA (FLA-RSA)
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 22: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/22.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d• which of the k path does enrich the routing matrix with information that can
improve failure localization?
Failure Localization Aware RSA (FLA-RSA)
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 23: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/23.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d• which of the k path does enrich the routing matrix with information that can
improve failure localization?• FLA-RSA maximizes the rank on the routing matrix Gm
Failure Localization Aware RSA (FLA-RSA)
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 24: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/24.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d• which of the k path does enrich the routing matrix with information that can
improve failure localization?• FLA-RSA maximizes the rank on the routing matrix Gm
Failure Localization Aware RSA (FLA-RSA)
LP1
LP2
Failure Localization Unaware RSA
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 25: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/25.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d• which of the k path does enrich the routing matrix with information that can
improve failure localization?• FLA-RSA maximizes the rank on the routing matrix Gm
Failure Localization Aware RSA (FLA-RSA)
LP1
LP2
LPNEW
Failure Localization Unaware RSA
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 26: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/26.jpg)
• Objective: increase the probability of unambiguous failure localization in case of failure
• It is an extension of [c]• Request from s to d• k paths between s and d• which of the k path does enrich the routing matrix with information that can
improve failure localization?• FLA-RSA maximizes the rank on the routing matrix Gm
Failure Localization Aware RSA (FLA-RSA)
LP1
LP2
LPNEW
LP1LP2
LPNEW
Failure Localization Unaware RSA Failure Localization Aware RSA
[c] K. Christodoulopoulos, P. Soumplis, E. Varvarigos, "Planning flexible optical networks under physical layer constraints," JOCN, 2013
![Page 27: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/27.jpg)
• Compare Failure localization aware (FLA)-RSA with simple RSA • DT network topology • load is expressed as a percentage, with load=1 denoting the all-to-all
communication (note that for load=1 we have unambiguous localization)
• 100 Gbps PM-QPSK lightpaths, 37.5 GHz and 1500 km reach • (FLA)-RSA using k=3,6,10 paths • Metrics:
• number of monitors as a service required for achieving unambiguous failure localization
• number of slots required for serving traffic
Simulation scenario
![Page 28: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/28.jpg)
• Plenty extra monitors are required at low load (up to 40%) to resolve ambiguity
• FLA-RSA reduces substantially the extra monitors required • k=3 is enough for loads higher than
0.5
Results
![Page 29: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/29.jpg)
• Plenty extra monitors are required at low load (up to 40%) to resolve ambiguity
• FLA-RSA reduces substantially the extra monitors required • k=3 is enough for loads higher than
0.5
• The price paid to improve failure localization is that of longer paths à a higher spectrum utilization
• The increase in spectrum is small, which is less than 5 slots for k=3
Results
![Page 30: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/30.jpg)
• We proposed a correlation framework for (soft- or hard) fault localization, leveraging information from established lightpaths
• Since a fault can be localized with ambiguity, the control plane triggers the setup of new lightpaths (monitors as a service) with the scope of identifying the failed element
• The ambiguity can be reduced using the proposed Failure Localization-Aware Routing and Spectrum Allocation (FLA-RSA) algorithm.
à lower monitors as a service à more fast and responsive reaction to failures
ACK: The work has been supported by the ORCHESTRA project.
Conclusions
![Page 31: Exploiting Network Kriging for Fault Localization · Exploiting Network Kriging for Fault Localization K. Christodoulopoulos1, N. Sambo2, E. Varvarigos1 OFC 2016 1: Computer Engineering](https://reader034.vdocuments.us/reader034/viewer/2022050112/5f49d7a79dad5145b4451c5e/html5/thumbnails/31.jpg)
email: [email protected]