S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs


Upload: virtual-campus

Post on 25-Dec-2014

Page 1: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Exploiting Knowledge on Past Process Execution to Improve SBA Analysis

Mining Lifecycle Event Logs for Enhancing SBAs

ISTI-CNR (CNR), TU Wien (TUW)

Franco Maria Nardini, Gabriele Tolomei, CNR

Page 2: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Learning Package Categorization

S-Cube

Monitoring and Analysis of SBA

Process Mining

Exploiting Knowledge on Past Process Execution to Improve SBA Analysis

Page 3: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Connections to the S-Cube IRF

  Conceptual Research Framework:

–  Service Composition and Coordination

–  Service Infrastructure

–  Adaptation and Monitoring

  Logical Run-Time Architecture:

–  Monitoring Engine

–  Adaptation Engine

–  Negotiation Engine

–  Runtime QA Engine

–  Resource Broker


Page 4: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Overview

  Introduction

  Goal

  Methodology

  Experiments

  Conclusions

Page 5: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

SBA Event Logs

  Most complex software systems collect their lifecycle usage data in event log files

  SBA event logs contain a variety of information about service components exchanging messages

–  e.g., service invocation, service failure, registry querying, etc.

  Event logs represent a huge source of “hidden” information (i.e., knowledge)


Page 6: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Mining SBA Event Logs

  Data Mining algorithms and techniques allow extracting valuable knowledge from event logs

  Extracted knowledge may refer to several aspects:

–  e.g., service usage patterns, service failure patterns, etc.

  If properly exploited, such knowledge might help improve the overall quality of the system:

–  recommending frequently invoked services;

–  avoiding/handling anomalous situations, etc.


Page 7: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Process Mining (PM)

  Process Mining (PM) is an application of data mining techniques to SBA event logs

  PM aims at discovering structured process models derived from patterns that are present in actual traces of service executions

  Each process is usually represented by a digraph and the problem of PM has been modeled as:

–  finite state machine [CW96]

–  sequential pattern mining (SPM) [AGL98]

–  Petri net [vdAWM04]
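To make the digraph representation concrete, here is a minimal sketch that builds a "directly follows" graph from execution traces. It is not one of the three cited formulations; it only illustrates how a set of traces can be summarized as a weighted digraph. The service names and traces are hypothetical.

```python
from collections import Counter

def directly_follows_graph(traces):
    """Count how often service b is invoked immediately after service a."""
    edges = Counter()
    for trace in traces:
        # Pair each invocation with its immediate successor in the same trace.
        for a, b in zip(trace, trace[1:]):
            edges[(a, b)] += 1
    return edges

# Hypothetical traces: each is the ordered list of services in one execution.
traces = [
    ["Login", "Search", "Book", "Pay"],
    ["Login", "Search", "Search", "Book"],
    ["Login", "Book", "Pay"],
]
graph = directly_follows_graph(traces)
for (a, b), weight in sorted(graph.items()):
    print(f"{a} -> {b}: {weight}")
```

Each key of `graph` is a directed edge and each value is the number of times that transition was observed, so heavier edges correspond to more common process steps.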


Page 8: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Another Example: Web Search Engines

  Web Search Engines (WSEs) are another example of systems that benefit from mining their event log data (i.e., Query Logs)

  Query Log Mining (QLM) has proven to be effective for enhancing the overall performance of WSEs

  We propose a QLM technique for identifying search patterns (tasks) from the stream of queries recorded in query logs [LOPST11]


Page 9: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Overview

  Introduction

  Goal

  Methodology

  Experiments

  Conclusions

Page 10: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Goal

  Treat PM as an instance of the SPM problem

  Detect frequent sequential patterns of service invocation, i.e., services that are frequently co-invoked within the same sequence

–  e.g., service Y is usually invoked after service X

  Find which services are actually used, and how

–  service recommendation

–  avoiding/handling anomalous situations


Page 11: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Overview

  Introduction

  Goal

  Methodology

  Experiments

  Conclusions

Page 12: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Sequential Pattern Mining

  An event log can be viewed as a sequence of events ordered in time (a time series)

  We are interested in finding sequences of services that are frequently invoked in a specific order, i.e., sequential patterns

  Sequential Pattern Mining (SPM) is the process of extracting sequential patterns whose support exceeds a predefined minimal support threshold min_supp
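The notion of support can be sketched as follows: the support of a pattern is the fraction of sequences that contain the pattern's items in the same order, not necessarily contiguously. The sequences and the threshold value below are hypothetical and chosen only for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the items of pattern occur in sequence in the same order."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)

def support(pattern, sequences):
    """Fraction of sequences that contain pattern as a subsequence."""
    if not sequences:
        return 0.0
    return sum(is_subsequence(pattern, s) for s in sequences) / len(sequences)

# Hypothetical invocation sequences (one list per session).
sequences = [
    ["A", "B", "C"],
    ["A", "C"],
    ["B", "A", "C"],
    ["C", "A"],
]
min_supp = 0.5  # hypothetical threshold
print(support(["A", "C"], sequences))  # 0.75
print(support(["A", "B"], sequences))  # 0.25
```

With this threshold, the pattern A followed by C counts as a sequential pattern, while A followed by B does not.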


Page 13: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

PrefixSpan

  One of the most efficient algorithms for finding sequential patterns [PHMP01]

  Mines the complete set of patterns while greatly reducing the effort of candidate subsequence generation

  Takes into account only the chronological order between events

–  i.e., it only cares whether X comes before Y, regardless of the actual time interval between them
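The prefix-projection idea can be illustrated with a simplified sketch. It assumes each event is a single service name (the full algorithm in [PHMP01] also handles itemsets) and takes the threshold as an absolute count `min_count`, treated as inclusive; both are simplifications made for this example, and the input sequences are hypothetical.

```python
def prefixspan(sequences, min_count):
    """Return {pattern: support count} for all frequent sequential patterns."""
    results = {}

    def mine(prefix, projected):
        # `projected` lists (sequence index, start position) pairs: the
        # suffixes that remain after matching `prefix` in each sequence.
        occurrences = {}
        for seq_id, start in projected:
            seq = sequences[seq_id]
            seen = set()
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:
                    # Keep only the first occurrence per sequence, so the
                    # list length equals the number of supporting sequences.
                    seen.add(item)
                    occurrences.setdefault(item, []).append((seq_id, pos + 1))
        for item, next_projected in occurrences.items():
            if len(next_projected) >= min_count:
                pattern = prefix + (item,)
                results[pattern] = len(next_projected)
                # Grow the pattern only inside its own projected database.
                mine(pattern, next_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results

# Hypothetical invocation sequences.
sequences = [
    ["A", "B", "C"],
    ["A", "C"],
    ["B", "A", "C"],
    ["C", "A"],
]
patterns = prefixspan(sequences, min_count=2)
for pattern, count in sorted(patterns.items()):
    print(pattern, count)
```

No candidate patterns are generated and then tested here: each recursive call only extends a prefix with items that actually occur in its projected suffixes, which is where the savings over candidate-generation approaches come from.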


Page 14: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

MiSTA

  Hint: observing that two services are invoked close together rather than far apart in a sequence could lead to distinct conclusions

  MiSTA [GNPP06] is able to deal with the actual time interval between any two consecutive service invocations

  It needs a time threshold tau for specifying the maximum time interval of events in a frequent sequence
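The effect of the time threshold can be illustrated with a deliberately simplified sketch. This is not the MiSTA algorithm of [GNPP06]; it only counts pairs of consecutive invocations that occur at most `tau` seconds apart, to show how the set of frequent patterns changes with tau. The timestamps, service names, and `min_count` are hypothetical.

```python
from collections import Counter

def frequent_timed_pairs(sequences, tau, min_count):
    """Frequent pairs (x, y) of consecutive invocations at most tau seconds apart."""
    counts = Counter()
    for seq in sequences:
        pairs = set()  # count each pair at most once per sequence
        for (x, tx), (y, ty) in zip(seq, seq[1:]):
            if ty - tx <= tau:
                pairs.add((x, y))
        counts.update(pairs)
    return {pair: c for pair, c in counts.items() if c >= min_count}

# Hypothetical sequences of (service, timestamp in seconds).
logs = [
    [("Search", 0), ("Book", 3), ("Pay", 120)],
    [("Search", 10), ("Book", 14), ("Pay", 40)],
    [("Search", 5), ("Book", 200), ("Pay", 230)],
]
for tau in (5, 60, 300):
    print(tau, frequent_timed_pairs(logs, tau, min_count=2))
```

With tau=5 only Search followed by Book is frequent; raising tau to 60 also admits Book followed by Pay, and tau=300 makes both pairs hold in all three sequences.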


Page 15: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Overview

  Introduction

  Goal

  Methodology

  Experiments

  Conclusions

Page 16: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Data Set: VRESCo

  VRESCo is the runtime environment for Service-oriented Computing developed by VITALab@TUW

  It collects usage data (i.e., events) in the form of an XML log file

  The VRESCo event log file contains information about: invoked services, service rebinding, service failure, etc.

  We only focus on service invocation events
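Filtering an XML event log down to invocation sequences might look like the sketch below. The slides do not show the VRESCo log schema, so the element name `event` and the attributes `type`, `session`, `service`, and `timestamp` are invented for illustration and should not be read as the actual VRESCo format.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical log excerpt; the real VRESCo schema is not given in the source.
SAMPLE_LOG = """\
<events>
  <event type="ServiceInvoked" session="s1" service="Search" timestamp="10"/>
  <event type="ServiceInvoked" session="s1" service="Book" timestamp="12"/>
  <event type="ServiceFailed" session="s1" service="Book" timestamp="13"/>
  <event type="ServiceInvoked" session="s2" service="Pay" timestamp="30"/>
  <event type="ServiceInvoked" session="s2" service="Search" timestamp="20"/>
</events>
"""

def invocation_sequences(xml_text):
    """Group invocation events by session and order them by timestamp."""
    root = ET.fromstring(xml_text)
    by_session = defaultdict(list)
    for ev in root.iter("event"):
        if ev.get("type") != "ServiceInvoked":
            continue  # keep only service invocation events
        by_session[ev.get("session")].append(
            (int(ev.get("timestamp")), ev.get("service"))
        )
    return {sid: [svc for _, svc in sorted(evs)] for sid, evs in by_session.items()}

sequences = invocation_sequences(SAMPLE_LOG)
print(sequences)
```

The resulting per-session lists are the kind of input a sequential pattern miner expects; keeping the timestamps as well would be needed for a time-aware algorithm.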


Page 17: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

PrefixSpan: min_supp=25%


Page 18: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

PrefixSpan: min_supp=50%


Page 19: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

PrefixSpan: min_supp=66%


Page 20: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

MiSTA: min_supp=32%, tau=5sec.


Page 21: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

MiSTA: min_supp=32%, tau=60sec.


Page 22: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

MiSTA: min_supp=32%, tau=300sec.


Page 23: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Results

  The service logs coming from the VRESCo runtime environment contain frequent patterns of services;

  Those patterns contain information about: invoked services, service rebinding, service failure, etc.;

  Those patterns can be mined by considering only the order in which services co-occur in a sequence (PrefixSpan) or also the time elapsed between invocations (MiSTA);

  Such inferred knowledge can be used to enhance SBAs: e.g., by means of novel design tools like service recommendation.
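One way such knowledge could feed a service recommendation tool is sketched below: given mined patterns and the service a designer has just used, suggest the services that most often follow it. The pattern counts are hypothetical, not results from the VRESCo experiments, and `recommend_next` is an illustrative helper rather than part of any tool described in the slides.

```python
def recommend_next(patterns, last_service, top_k=3):
    """Rank services that frequently follow last_service in mined patterns."""
    candidates = {
        pattern[1]: count
        for pattern, count in patterns.items()
        if len(pattern) == 2 and pattern[0] == last_service
    }
    # Highest support first; ties broken alphabetically for a stable order.
    ranked = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
    return [service for service, _ in ranked[:top_k]]

# Hypothetical frequent patterns with their support counts.
patterns = {
    ("Search",): 4,
    ("Search", "Book"): 3,
    ("Search", "Pay"): 2,
    ("Book", "Pay"): 2,
}
print(recommend_next(patterns, "Search"))  # ['Book', 'Pay']
print(recommend_next(patterns, "Pay"))     # []
```

Only length-2 patterns are used here for simplicity; longer patterns could be matched against a longer history of invocations in the same way.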


Page 24: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Overview

  Introduction

  Goal

  Methodology

  Experiments

  Conclusions

Page 25: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

Conclusions

  Event logs collected by complex software systems represent a huge source of information (knowledge)

  Find sequences of frequently co-invoked services from SBA event logs using Sequential Pattern Mining (SPM)

  Two SPM algorithms, PrefixSpan and MiSTA, were run on a real-world SBA event log (VRESCo)

  Experimental results show that some services are often invoked together in a frequent sequence

  Exploit such inferred knowledge to enhance SBAs: e.g., by means of novel design tools like service recommendation

Page 26: S-CUBE LP: Mining Lifecycle Event Logs for Enhancing SBAs

References

–  [CW96] J. E. Cook and A. L. Wolf, “Discovering models of software processes from event-based data,” Technical Report CUCS-819-96, Computer Science Dept., Univ. of Colorado, 1996.

–  [AGL98] R. Agrawal, D. Gunopulos, and F. Leymann, “Mining Process Models from Workflow Logs,” in Sixth International Conference on Extending Database Technology, 1998, pp. 469–483.

–  [vdAWM04] W. van der Aalst, T. Weijters, and L. Maruster, “Workflow Mining: Discovering Process Models from Event Logs,” IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 9, pp. 1128–1142, Sep. 2004.

–  [LOPST11] C. Lucchese, S. Orlando, R. Perego, F. Silvestri, and G. Tolomei, “Identifying task-based sessions in search engine query logs,” in WSDM ’11. ACM, 2011, pp. 277–286.

–  [PHMP01] J. Pei, J. Han, B. Mortazavi-Asl, and H. Pinto, “PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth,” in ICDE ’01. IEEE, 2001.

–  [GNPP06] F. Giannotti, M. Nanni, D. Pedreschi, and F. Pinelli, “Mining sequences with temporal annotations,” in SAC ’06. ACM, 2006, pp. 593–597.