TRANSCRIPT
Association Rules Extraction for the Identification of Dependent
Abnormal Behaviours in Complex Technical Infrastructures
Politecnico di Milano
Laboratory of Signal and Risk Analysis (LASAR)
Federico ANTONELLO
Piero BARALDI
Ahmed SHOKRY
Enrico ZIO
10.04.2018
CERN EN-ARP-PPM Section
Ugo GENTILE
Luigi SERIO
Functional Dependencies Analysis for Complex Technical Infrastructures (CTIs) 2
[Figure: operational data collected from sensors, signals y1(t) to y4(t) plotted against t [seconds]]
Alarm Messages
CTIs
• CTI grows and changes in time
  • New/updated components/connections
• Intricate interactions among components of different systems
  • Physical, functional, spatial, data, operator etc.
Traditional methods cannot be applied
• Functional logic reconstruction very complex and time consuming
• Hidden functional dependencies and interconnections
• Dependent Abnormal Behavior
• Failure propagations across systems
Work Objective: Data-Driven analysis 3
Identification of dependent abnormal behaviours from the analysis of alarm datasets
Alarm Messages
Abnormal behaviours
Data Acquisition Solution 4
ETL (Extract-Transform-Load) tools have been developed
- ETL is the most underestimated activity when applying machine learning
- ETL is also the most time-consuming activity.
[Diagram: ETL pipeline. Extraction from physical signals, SCADA system, … through data connectors; data transformation steps; LOAD into the Analysis Tool]
Heterogeneous and big data sources:
- PSEN: access only through a graphical interface -> very time-consuming data retrieval;
- CV/CRYO: different access, but to stored tables;
- TI Logbook: written in natural language (French/English), no information on downtime.
Proposed Methodology: Multiple Constraints Targeted Association Rules Mining [MCT-ARM] 5
INPUT: CTI alarms messages datasets
MCT-ARM:
1) Database Representation
2) Association rules mining (ARM)
3) Causality
OUTPUT: Functional Dependencies
Association rules: probabilistic logic expressions which describe the conditional co-occurrence of alarms
[Figure: extracted rules (Rule 1, Rule 2, Rule 3, ...); the alarms of Rule 3 ordered along a time axis (⇒ ⇒)]
Time-dependent causal rules
1) Alarms database representation for rules mining 6
Time discretization to define transactions (events)
Time discretization and Boolean representation of transactions as vectors
Ex: $T(t_2) = [0,1,0,0,0,0]$
Alarm sequences
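The discretization step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the window length, the alarm identifiers and the `(timestamp, alarm_id)` record layout are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def build_transactions(
    alarms: Sequence[Tuple[float, str]],
    alarm_ids: Sequence[str],
    window: float,
) -> List[List[int]]:
    """Discretize time into windows and return one Boolean vector per window.

    alarms    -- (timestamp in seconds, alarm id) records (assumed layout)
    alarm_ids -- fixed ordering of the alarm ids (vector columns)
    window    -- window length in seconds (assumed parameter)
    """
    if not alarms:
        return []
    column: Dict[str, int] = {a: i for i, a in enumerate(alarm_ids)}
    start = min(t for t, _ in alarms)
    end = max(t for t, _ in alarms)
    n_windows = int((end - start) // window) + 1
    transactions = [[0] * len(alarm_ids) for _ in range(n_windows)]
    for t, alarm in alarms:
        k = int((t - start) // window)   # index of the window containing t
        transactions[k][column[alarm]] = 1
    return transactions


# Hypothetical alarm log: six alarm types, 60 s windows.
ALARM_IDS = ["A1", "A2", "A3", "A4", "A5", "A6"]
LOG = [(0.0, "A1"), (70.0, "A2"), (130.0, "A1"), (150.0, "A4")]
TRANSACTIONS = build_transactions(LOG, ALARM_IDS, window=60.0)
print(TRANSACTIONS[1])  # [0, 1, 0, 0, 0, 0]
```

With these made-up records, the second window contains only alarm `A2`, which gives a vector of the same shape as the example on the slide.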
2) Association Rules Mining (ARM): Basics 7
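If alarm $C_{21}$ occurs then alarm $C_{32}$ occurs
Support: $T(C_{21}, C_{32}) = \dfrac{\text{no. of events in which } C_{21} \text{ and } C_{32} \text{ occur together}}{\text{total number of events}}$
Confidence: $C(C_{21} \rightarrow C_{32})$ = conditional probability of the occurrence of $C_{32}$ given the occurrence of $C_{21}$
Example: $T(C_{21}, C_{32}) = 3/8$, $C(C_{21} \rightarrow C_{32}) = 3/4$
For each possible combination of alarms (e.g. $C_{21}$ and $C_{32}$) evaluate:
1) the support $T(C_{21}, C_{32})$
2) the confidence $C(C_{21} \rightarrow C_{32})$
3) If {$T(C_{21}, C_{32})$ > threshold1 and $C(C_{21} \rightarrow C_{32})$ > threshold2} THEN generate the association rule $C_{21} \rightarrow C_{32}$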
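The support and confidence values of the slide can be reproduced with a small worked example. The eight events below are hypothetical, chosen only so that the support is 3/8 and the confidence is 3/4; the threshold values and the function names are likewise assumptions. Confidence is computed with the standard definition, joint support divided by the support of the antecedent.

```python
from fractions import Fraction
from itertools import permutations

# Eight hypothetical events (transactions); each is the set of active alarms.
EVENTS = [
    frozenset({"C21", "C32"}),
    frozenset({"C21", "C32"}),
    frozenset({"C21", "C32"}),
    frozenset({"C21"}),
    frozenset({"C32"}),
    frozenset({"C32", "C11"}),
    frozenset({"C11"}),
    frozenset(),
]


def support(events, *alarms):
    """Fraction of events in which all the given alarms occur together."""
    hits = sum(1 for e in events if all(a in e for a in alarms))
    return Fraction(hits, len(events))


def confidence(events, antecedent, consequent):
    """Conditional probability of `consequent` given `antecedent`."""
    return support(events, antecedent, consequent) / support(events, antecedent)


def mine_pair_rules(events, threshold1, threshold2):
    """Return {(a, b): (support, confidence)} for rules a -> b above both thresholds."""
    alarms = sorted(set().union(*events))
    rules = {}
    for a, b in permutations(alarms, 2):
        t = support(events, a, b)
        # The support test comes first, so confidence is only computed
        # for pairs that actually co-occur.
        if t > threshold1 and confidence(events, a, b) > threshold2:
            rules[(a, b)] = (t, confidence(events, a, b))
    return rules


RULES = mine_pair_rules(EVENTS, Fraction(1, 4), Fraction(7, 10))
print(support(EVENTS, "C21", "C32"))     # 3/8
print(confidence(EVENTS, "C21", "C32"))  # 3/4
print(sorted(RULES))                     # [('C21', 'C32')]
```

In this toy dataset the reverse rule C32 -> C21 has confidence 3/5 and is rejected, which shows that confidence, unlike support, depends on the direction of the rule.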
2) ARM for Functional Dependencies in CTIs 8
Multiple Constraints Targeted ARM [MCT-ARM]
Method: selection of candidate sets of alarms for rule mining [Target]
• Cross system alarms
• infrasystem alarms
Multiple constraints: Threshold1 = f(T( ), T( ))
[Plot: Threshold1 versus Min[T( ), T( )]; labels: OK, NO]
CERN CTI Functional Dependencies:
• Different FDs have different frequencies of occurrence
• FDs among components of different systems are rare compared with FDs among components of the same system
MCT-ARM Objective: find FDs of interest (e.g., cross system FDs)
Traditional ARM requires unfeasible computational effort to find rare rules
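The two ingredients above, targeting and a support threshold that depends on the individual alarm supports, can be sketched as follows. The slide does not give the form of f: the proportional rule `beta * min(T(a), T(b))`, the value of `beta`, the `system_of` mapping and the toy events are all assumptions for illustration.

```python
from fractions import Fraction
from itertools import combinations


def support(events, *alarms):
    """Fraction of events in which all the given alarms occur together."""
    hits = sum(1 for e in events if all(a in e for a in alarms))
    return Fraction(hits, len(events))


def targeted_pairs(alarms, system_of, cross_system=True):
    """Target step: keep only pairs from different systems (or from the same one)."""
    for a, b in combinations(sorted(alarms), 2):
        if (system_of[a] != system_of[b]) == cross_system:
            yield a, b


def mct_candidates(events, system_of, beta, cross_system=True):
    """Keep targeted pairs whose joint support exceeds a pair-specific threshold.

    Assumed form of f: threshold1 = beta * min(T(a), T(b)).
    """
    alarms = set().union(*events)
    kept = []
    for a, b in targeted_pairs(alarms, system_of, cross_system):
        threshold1 = beta * min(support(events, a), support(events, b))
        if support(events, a, b) > threshold1:
            kept.append((a, b))
    return kept


# Hypothetical data: E1, E2 are frequent electrical alarms, K1 a rare cryogenic one.
SYSTEM_OF = {"E1": "electric", "E2": "electric", "K1": "cryogenic"}
EVENTS = (
    [frozenset({"E1", "E2"})] * 8
    + [frozenset({"E1", "K1"})] * 2
    + [frozenset({"E2"})] * 10
)
CROSS = mct_candidates(EVENTS, SYSTEM_OF, beta=Fraction(1, 2))
INTRA = mct_candidates(EVENTS, SYSTEM_OF, beta=Fraction(1, 2), cross_system=False)
print(CROSS)  # [('E1', 'K1')]
print(INTRA)  # [('E1', 'E2')]
```

In this toy case the pair (E1, K1) has a joint support of only 1/10, so a single global threshold of, say, 1/5 would discard it; scaling the threshold by the rarer alarm's support keeps it, and restricting the search to cross system pairs avoids evaluating the same-system combinations at all.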
3) Causality 9
[Diagram: Rule (alarm IDs) ⟹ Pairwise causality ⟹ Quick sort algorithm ⟹ Alarms sequence (functional dependence)]
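A sketch of how pairwise causality and a quick sort could turn the alarms of a rule into an ordered sequence. The slide gives only the block names, so the precedence criterion used here (alarm a is placed before alarm b if a occurs first in most of the events where both appear) and the event layout are assumptions.

```python
from typing import Dict, List, Sequence

# Each hypothetical event maps alarm id -> time of first occurrence (seconds).
Event = Dict[str, float]


def precedes(a: str, b: str, events: Sequence[Event]) -> bool:
    """Pairwise causality (assumed criterion): a comes before b in most shared events."""
    shared = [e for e in events if a in e and b in e]
    first = sum(1 for e in shared if e[a] < e[b])
    return 2 * first > len(shared)


def causal_quicksort(alarms: List[str], events: Sequence[Event]) -> List[str]:
    """Quick sort of the alarms of a rule, using `precedes` as the comparison."""
    if len(alarms) <= 1:
        return list(alarms)
    pivot, rest = alarms[0], alarms[1:]
    before = [a for a in rest if precedes(a, pivot, events)]
    after = [a for a in rest if not precedes(a, pivot, events)]
    return causal_quicksort(before, events) + [pivot] + causal_quicksort(after, events)


EVENTS = [
    {"C1": 0.0, "C2": 40.0, "C3": 95.0},
    {"C1": 10.0, "C2": 35.0, "C3": 60.0},
    {"C1": 5.0, "C2": 3.0, "C3": 50.0},   # one event with C2 slightly ahead of C1
]
SEQUENCE = causal_quicksort(["C3", "C1", "C2"], EVENTS)
print(" ⟹ ".join(SEQUENCE))  # C1 ⟹ C2 ⟹ C3
```

The sort relies on the pairwise relation being consistent (if a precedes b and b precedes c, then a precedes c); with noisy data that does not hold in general, so the result of this sketch should be read as one plausible ordering.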
Tool Application: CERN LHC point 8 10
Characterization of CERN LHC CTI Point 8 [data of 2016]
                            Electric system   Cryogenic system   Cooling and ventilation (CV) systems   Total LHC Zone 8
No. of malfunction types    9,655             1,472              2,324                                  13,451
No. of alarms               16,034,050        2,607,030          70,657                                 18,711,737
Results:
Rules
Total No. of rules extracted                             1031
No. of causal cross systems rules                        147
No. of groups of functionally dependent malfunctions     8
Tool application to CERN CTI zone 8: Example of Rules Engineering Analysis (1) 11
Example of engineering analysis of rules (validated by CERN experts)
INPUT: CTI alarms messages datasets
TOOL: MCT-ARM
Rule visualization:
Tool application to CERN CTI zone 8: Example of Rules Engineering Analysis (2) 12
Example of engineering analysis of rules (validated by CERN experts)
INPUT: CTI alarms messages datasets
TOOL: MCT-ARM
Rule visualization:
Conclusions (1) 13
Identification of dependent abnormal behavior components
Alarm databases
CTIs
• Evolving design
• Intricate components interactions
• Hidden functional dependencies
Facility structure & functional logic?
Traditional methods
• Functional Analysis
• Logic Decomposition
This work
Alarm databases of LHC zone 8 (18 million messages)
8 groups of dependent abnormal behavior components
Proposed methodology
Conclusions (2) 14
The proposed data-driven tool is capable of discovering dependent abnormal behaviours:
• Retrieving various types of dependent abnormal behaviours (cross-systems, infrasystem) in CERN CTI;
• Analysing systems dependability (causality characterization);
• Easily applied to all LHC points (applicable to systems, subsystems, groups of components, buildings, …)
Next steps:
• Finalization of the graphical interface to facilitate use of the tool by CERN operators
• Dynamic update of discovered FDs on INFOR
• Use of the tool for root cause of failures reconstruction, alarm suppression, maintenance supervision ...
Artificial Case Study: application and results 16
Artificial CTI description
• 3 different systems, 300 components, 900 different malfunctions
• Redundancy in the alarms generation
• Simulation time: 720 d
• Over 71807 alarms generated
Simulated Functional Dependencies (FD):
• 10 FD: C1,x ⟹ C2,x [time of propagation (Tpr): 1’-30’]
• 1 FD: C3,1 ⟹ C3,2 ⟹ C1,11 ⟹ C1,12 [Tpr: 1’-5’]
• 1 FD: C3,7 ⟹ C3,8 ⟹ C2,11 ⟹ C2,12 ⟹ C1,13 ⟹ C1,14 [Tpr: 1’-20’]
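A sketch of how alarms for one simulated functional dependence could be generated. The slides state only the simulation horizon (720 d) and the range of the propagation time; the uniform distributions, the number of failures and the function name `simulate_fd` are assumptions for illustration.

```python
import random
from typing import List, Tuple

MINUTE = 60.0
DAY = 24 * 60 * MINUTE


def simulate_fd(
    cause: str,
    effect: str,
    n_failures: int,
    tpr_minutes: Tuple[float, float],
    horizon_days: float,
    rng: random.Random,
) -> List[Tuple[float, str]]:
    """Generate time-sorted (time in seconds, alarm id) records for cause ⟹ effect.

    Failure times and propagation delays are drawn uniformly (assumed).
    """
    lo, hi = tpr_minutes
    log = []
    for _ in range(n_failures):
        # Leave room for the longest delay so the effect stays inside the horizon.
        t0 = rng.uniform(0.0, horizon_days * DAY - hi * MINUTE)
        delay = rng.uniform(lo, hi) * MINUTE
        log.append((t0, cause))
        log.append((t0 + delay, effect))
    return sorted(log)


# One FD of the type C1,x ⟹ C2,x with Tpr between 1 and 30 minutes.
LOG = simulate_fd("C1,1", "C2,1", n_failures=5, tpr_minutes=(1, 30),
                  horizon_days=720, rng=random.Random(0))
print(len(LOG))  # 10
```

A log produced this way can be discretized into transactions and mined for rules, which allows the extracted dependencies to be checked against the ones that were simulated.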
Results:
• 12 FD correctly identified
• Identification of the FD among components of different systems
• Causality characterization 100% correct
• Examples (functional dependences found by MCT-ARM):
  C2,15 ⟹ C2,16 ⟹ C1,13 ⟹ C1,14
  C1,1 ⟹ C2,1