Association Rules Extraction for the Identification of Dependent Abnormal Behaviours in Complex Technical Infrastructures Politecnico di Milano Laboratory of Signal and Risk Analysis (LASAR) Federico ANTONELLO Piero BARALDI Ahmed SHOKRY Enrico ZIO 10.04.2018 CERN EN-ARP-PPM Section Ugo GENTILE Luigi SERIO


Association Rules Extraction for the Identification of Dependent Abnormal Behaviours in Complex Technical Infrastructures

Politecnico di Milano, Laboratory of Signal and Risk Analysis (LASAR): Federico ANTONELLO, Piero BARALDI, Ahmed SHOKRY, Enrico ZIO

CERN, EN-ARP-PPM Section: Ugo GENTILE, Luigi SERIO

10.04.2018

Functional Dependencies Analysis for Complex Technical Infrastructures (CTIs)

[Figure: operational data collected from sensors, shown as four signals y1(t), y2(t), y3(t), y4(t) plotted against t [seconds], together with the resulting alarm messages]

CTIs

• CTI grows and changes in time: new/updated components and connections
• Intricate interactions among components of different systems: physical, functional, spatial, data, operator, etc.

Traditional methods cannot be applied

• Functional logic reconstruction is very complex and time-consuming
• Hidden functional dependencies and interconnections
• Dependent abnormal behaviour
• Failure propagation across systems

Work Objective: Data-Driven Analysis

Identification of dependent abnormal behaviours from the analysis of alarm datasets

[Diagram: alarm messages → abnormal behaviours]

Data Acquisition Solution

ETL (Extract-Transform-Load) tools have been developed

- ETL is the most underestimated activity when applying machine learning
- ETL is also the most time-consuming activity

[Diagram: ETL pipeline. Extraction: data connectors to the physical signals and to the SCADA system. Transform: a data transformation step for each source. Load: analysis tool]

Heterogeneous and large data sources:

- PSEN: access only through a graphical interface, so data retrieval is very time-consuming
- CV/CRYO: different access method, but to stored tables
- TI Logbook: written in natural language (French/English), with no information on downtime

Proposed Methodology: Multiple Constraints Targeted Association Rules Mining [MCT-ARM]

INPUT: alarm message datasets from the CTI

MCT-ARM:
1) Database representation
2) Association rules mining (ARM)
3) Causality

OUTPUT: functional dependencies

Association rules: probabilistic logic expressions which describe the conditional co-occurrence of alarms.

[Figure: example association rules (Rule 1, Rule 2, Rule 3, ...) and, after ordering along the time axis, time-dependent causal rules of the form ... ⇒ ... ⇒ ...]

1) Alarms database representation for rules mining

• Time discretization of the alarm sequences to define transactions (events)
• Boolean representation of each transaction as a vector

Ex: $T(t_2) = [0, 1, 0, 0, 0, 0]$

[Figure: alarm sequences]
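The database representation step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the window width, the alarm identifiers `A1` to `A6` and the function name `build_transactions` are hypothetical and not taken from the slides. Each time window becomes one transaction, stored as a Boolean vector with one position per alarm.

```python
from datetime import datetime, timedelta

def build_transactions(alarms, alarm_ids, start, window):
    """Discretize time into fixed windows; return one Boolean vector per window.

    alarms    : list of (timestamp, alarm_id) pairs, all with timestamp >= start
    alarm_ids : ordered list of alarm identifiers (defines the vector positions)
    start     : timestamp at which the first window begins
    window    : timedelta, width of each window (an assumed tuning parameter)
    """
    if not alarms:
        return []
    index = {alarm_id: i for i, alarm_id in enumerate(alarm_ids)}
    last = max(t for t, _ in alarms)
    n_windows = int((last - start) / window) + 1
    transactions = [[0] * len(alarm_ids) for _ in range(n_windows)]
    for t, alarm_id in alarms:
        k = int((t - start) / window)        # index of the window containing t
        transactions[k][index[alarm_id]] = 1  # alarm seen at least once in window k
    return transactions

# Hypothetical alarm log with six alarm types and 10-minute windows
t0 = datetime(2016, 1, 1)
ids = ["A1", "A2", "A3", "A4", "A5", "A6"]
log = [
    (t0 + timedelta(minutes=2), "A1"),
    (t0 + timedelta(minutes=12), "A2"),
    (t0 + timedelta(minutes=25), "A4"),
    (t0 + timedelta(minutes=27), "A2"),
]
T = build_transactions(log, ids, t0, timedelta(minutes=10))
print(T[1])  # [0, 1, 0, 0, 0, 0]
```

The second window contains only alarm `A2`, which gives a vector of the same form as the example above. The choice of window width matters: too short and dependent alarms fall into different transactions, too long and unrelated alarms are merged.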

2) Association Rules Mining (ARM): Basics

Rule: if alarm C21 occurs, then alarm C32 occurs.

For each possible combination of alarms (e.g. C21 and C32), evaluate:

1) Support:
$T(C21, C32) = \dfrac{\text{no. of events in which } C21 \text{ and } C32 \text{ occur together}}{\text{total number of events}}$
Example: $T(C21, C32) = 3/8$

2) Confidence:
$C(C21 \rightarrow C32)$ = conditional probability of the occurrence of C32 given the occurrence of C21
Example: $C(C21 \rightarrow C32) = 3/4$

3) If $T(C21, C32) > \text{threshold1}$ and $C(C21 \rightarrow C32) > \text{threshold2}$, then generate the association rule $C21 \rightarrow C32$.
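As a worked illustration of support and confidence, the sketch below evaluates both on eight hypothetical events. The event contents, the extra alarm `C11` and the threshold values are made up for illustration and are chosen only so that the numbers match the example (3/8 and 3/4). Confidence is estimated in the standard way, as the support of the pair divided by the support of the antecedent.

```python
from fractions import Fraction

def support(events, alarms):
    """Fraction of events in which all the given alarms occur together."""
    hits = sum(1 for e in events if all(a in e for a in alarms))
    return Fraction(hits, len(events))

def confidence(events, antecedent, consequent):
    """Estimate P(consequent | antecedent) as T(both) / T(antecedent).

    Raises ZeroDivisionError if the antecedent never occurs.
    """
    return support(events, [antecedent, consequent]) / support(events, [antecedent])

def mine_pair(events, a, b, threshold1, threshold2):
    """Return the rule (a, b) if it passes both thresholds, else None."""
    if support(events, [a, b]) > threshold1 and confidence(events, a, b) > threshold2:
        return (a, b)
    return None

# Eight hypothetical events: C21 occurs in 4 of them, C21 and C32 together in 3
events = [
    {"C21", "C32"},
    {"C21", "C32"},
    {"C21", "C32"},
    {"C21"},
    {"C32"},
    {"C11"},
    {"C11", "C32"},
    set(),
]
T = support(events, ["C21", "C32"])
C = confidence(events, "C21", "C32")
print(T, C)  # 3/8 3/4

# Assumed thresholds, for illustration only
print(mine_pair(events, "C21", "C32", Fraction(1, 4), Fraction(1, 2)))  # ('C21', 'C32')
```

Note that confidence is not symmetric: with the same events, C(C32 → C21) is 3/5, because C32 occurs in five events.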

2) ARM for Functional Dependencies in CTIs

Multiple Constraints Targeted ARM [MCT-ARM]

CERN CTI Functional Dependencies (FDs):

• Different FDs have different frequencies of occurrence
• FDs among components of different systems are rare compared with FDs among components of the same system

Traditional ARM requires infeasible computational effort to find rare rules.

MCT-ARM Objective: find FDs of interest (e.g., cross-system FDs)

Method:

• Target: selection of candidate sets of alarms for rule mining (cross-system alarms, intra-system alarms)
• Multiple constraints: Threshold1 = f(T( ), T( ))

[Figure: Threshold1 plotted against Min[T( ), T( )], with candidate sets marked OK or NO]
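The two ideas on this slide, targeting and a support threshold that depends on the individual alarm supports, can be sketched as follows. The slides do not give the form of f, so the function `adaptive_threshold` below (a fixed fraction `alpha` of the smaller individual support) is purely an assumption, as are the alarm names and the alarm-to-system mapping.

```python
from itertools import combinations

def item_support(events, alarms):
    """Fraction of events in which all the given alarms occur together."""
    hits = sum(1 for e in events if all(a in e for a in alarms))
    return hits / len(events)

def cross_system_pairs(alarm_system):
    """Target: keep only candidate pairs whose alarms belong to different systems."""
    return [(a, b) for a, b in combinations(sorted(alarm_system), 2)
            if alarm_system[a] != alarm_system[b]]

def adaptive_threshold(t_a, t_b, alpha=0.5):
    """ASSUMED form of f: a fixed fraction of the rarer alarm's support."""
    return alpha * min(t_a, t_b)

def targeted_frequent_pairs(events, alarm_system, alpha=0.5):
    """Return (a, b, support) for targeted pairs that pass the adaptive threshold."""
    kept = []
    for a, b in cross_system_pairs(alarm_system):
        t_a = item_support(events, [a])
        t_b = item_support(events, [b])
        t_ab = item_support(events, [a, b])
        if t_ab > 0 and t_ab >= adaptive_threshold(t_a, t_b, alpha):
            kept.append((a, b, t_ab))
    return kept

# Hypothetical data: a frequent intra-system pair and a rare cross-system pair
alarm_system = {"EL1": "electric", "EL2": "electric", "CRYO1": "cryogenic"}
events = [{"EL1", "EL2"}] * 6 + [{"EL1", "CRYO1"}] * 2 + [set()] * 2
print(targeted_frequent_pairs(events, alarm_system))  # [('CRYO1', 'EL1', 0.2)]
```

In this toy example a single global support threshold of, say, 0.3 would keep the intra-system pair (support 0.6) and discard the cross-system pair (support 0.2). Restricting the candidates to the target and scaling the threshold with the rarer alarm keeps the rare pair without enumerating every combination.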

3) Causality

[Diagram: the alarm IDs of a rule are ordered by pairwise causality using a quick sort algorithm; the resulting alarm sequence (... ⟹ ... ⟹ ...) gives the functional dependence]
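A minimal sketch of the ordering step is given below. The slides do not say how pairwise causality is decided, so the test used here (alarm a is taken to precede alarm b if it occurs earlier in the majority of events containing both) is an assumption, and the function names and onset times are hypothetical. The quick sort uses that pairwise test as its comparison.

```python
def precedes(events, a, b):
    """ASSUMED pairwise causality test: a occurs before b in most shared events.

    events : list of dicts mapping alarm_id -> onset time (seconds) in that event
    """
    before = sum(1 for e in events if a in e and b in e and e[a] < e[b])
    after = sum(1 for e in events if a in e and b in e and e[b] < e[a])
    return before > after

def quicksort_alarms(events, alarms):
    """Order the alarm IDs of a rule so that preceding alarms come first."""
    if len(alarms) <= 1:
        return list(alarms)
    pivot, rest = alarms[0], alarms[1:]
    earlier = [a for a in rest if precedes(events, a, pivot)]
    later = [a for a in rest if not precedes(events, a, pivot)]
    return quicksort_alarms(events, earlier) + [pivot] + quicksort_alarms(events, later)

# Hypothetical onset times for the three alarms of one rule
events = [
    {"A": 0, "B": 60, "C": 200},
    {"A": 10, "B": 40, "C": 95},
    {"B": 5, "C": 30},
]
sequence = quicksort_alarms(events, ["C", "A", "B"])
print(" ⟹ ".join(sequence))  # A ⟹ B ⟹ C
```

The result is only well defined when the pairwise relation is consistent (if A precedes B and B precedes C, then A precedes C); otherwise the order can depend on the pivot choice.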

Tool Application: CERN LHC Point 8

Characterization of CERN LHC CTI Point 8 [data of 2016]:

                           Electric system   Cryogenic system   Cooling and ventilation (CV) systems   Total LHC Zone 8
No. of malfunction types   9,655             1,472              2,324                                  13,451
No. of alarms              16,034,050        2,607,030          70,657                                 18,711,737

Results (rules):

Total no. of rules extracted                             1031
No. of causal cross-system rules                         147
No. of groups of functionally dependent malfunctions     8

Tool application to CERN CTI zone 8: Example of Rules Engineering Analysis (1)

Example of engineering analysis of rules (validated by CERN experts)

[Diagram: CTI alarm message datasets (INPUT) → MCT-ARM tool → rule visualization]

Tool application to CERN CTI zone 8: Example of Rules Engineering Analysis (2)

Example of engineering analysis of rules (validated by CERN experts)

[Diagram: CTI alarm message datasets (INPUT) → MCT-ARM tool → rule visualization]

Conclusions (1)

Identification of dependent abnormal behavior components from alarm databases

CTIs:
• Evolving design
• Intricate component interactions
• Hidden functional dependencies

Facility structure & functional logic?

Traditional methods:
• Functional Analysis
• Logic Decomposition

This work:
• Proposed methodology applied to the alarm databases of LHC zone 8 (18 million messages)
• 8 groups of dependent abnormal behavior components

Conclusions (2)

The proposed data-driven tool is capable of discovering dependent abnormal behaviours:

• Retrieving various types of dependent abnormal behaviours (cross-system, intra-system) in the CERN CTI;
• Analysing system dependability (causality characterization);
• Easily applicable to all LHC points (systems, subsystems, groups of components, buildings, …)

Next steps:

• Finalization of the graphical interface to facilitate use of the tool by CERN operators
• Dynamic update of discovered FDs on INFOR
• Use of the tool for root-cause reconstruction of failures, alarm suppression, maintenance supervision ...


Artificial Case Study: application and results

Artificial CTI description:

• 3 different systems, 300 components, 900 different malfunctions
• Redundancy in the alarms generation
• Simulation time: 720 d
• Over 71,807 alarms generated

Simulated Functional Dependencies (FD):

10 FD: C1,x C2,x [time of propagation (Tpr): 1’-30’]
1 FD: C3,1 C3,2 C1,11 C1,12 [Tpr: 1’-5’]
1 FD: C3,7 C3,8 C2,11 C2,12 C1,13 C1,14 [Tpr: 1’-20’]

Results:

• 12 FD correctly identified
• Identification of the FD among components of different systems
• Causality characterization 100% correct
• Examples of functional dependences found by MCT-ARM:

C2,15 ⟹ C2,16 ⟹ C1,13 ⟹ C1,14
C1,1 ⟹ C2,1
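To make the notion of a simulated dependence with a propagation time concrete, the sketch below generates alarms for a single dependence C1,1 ⟹ C2,1. It is not the simulator used in the case study: the uniform distributions, the number of events, the seed and the function names are all assumptions. Only the 720-day horizon and the 1’-30’ propagation range come from the slide.

```python
import random

def simulate_dependence(n_events, horizon_min, tpr_min, tpr_max, seed=0):
    """Return (t_cause, t_effect) pairs in minutes for one cause/effect dependence.

    Each cause time is drawn uniformly over the horizon (assumption), and the
    effect follows after a uniform propagation delay in [tpr_min, tpr_max].
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_events):
        t_cause = rng.uniform(0, horizon_min)
        t_effect = t_cause + rng.uniform(tpr_min, tpr_max)
        pairs.append((t_cause, t_effect))
    return pairs

def to_alarm_log(pairs, cause_id, effect_id):
    """Flatten the pairs into a time-ordered list of (time, alarm_id) records."""
    log = [(t, cause_id) for t, _ in pairs] + [(t, effect_id) for _, t in pairs]
    return sorted(log)

# 720 days expressed in minutes; propagation time between 1 and 30 minutes
pairs = simulate_dependence(n_events=50, horizon_min=720 * 24 * 60, tpr_min=1, tpr_max=30)
log = to_alarm_log(pairs, "C1,1", "C2,1")
print(len(log))  # 100
```

A log generated this way can be discretized into transactions and mined, which gives a ground truth against which the identified dependences and their direction can be checked.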