PhD Day 2011



DESCRIPTION

Slides of my presentation at our faculty's PhD Day, 24 May 2011, Gent, B

TRANSCRIPT

Page 1: PhD Day 2011

Ghent University, Faculty of Economics and Business Administration Department of Management Information and Operations Management

Jan Claes, Geert Poels

FACULTY OF ECONOMICS AND BUSINESS ADMINISTRATION

Merging Log Files for Process Mining

Jan Claes

Promotor: Prof. Dr. Geert Poels
Copromotor: Prof. Dr. Ir. Birger Raa

Page 2: PhD Day 2011


Roadmap

1. Process Mining

2. Merging log files

3. Genetic algorithm

4. Experiment results

5. Discussion


Page 3: PhD Day 2011


1. Process Mining

Page 4: PhD Day 2011


A plane crashed... What happened?

Analyse the ‘black box’

Page 5: PhD Day 2011


A process failed... What happened?

Analyse the ‘black box’: look for historical data

Process Mining: reconstruct and analyse processes from historical process data

• Log files
• Audit trails
• Database history fields/tables
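To make the idea of reconstructing a process from historical data concrete, here is a minimal sketch in Python. It assumes each historical record carries a case id, an activity name and a timestamp; the records, the field layout and the function name `reconstruct_traces` are made up for illustration and do not come from the slides.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event records: (case id, activity, ISO timestamp).
EVENTS = [
    ("order-2", "Receive order", "2011-05-02T09:00:00"),
    ("order-1", "Ship goods", "2011-05-01T15:30:00"),
    ("order-1", "Receive order", "2011-05-01T08:15:00"),
    ("order-2", "Reject order", "2011-05-02T11:45:00"),
    ("order-1", "Check stock", "2011-05-01T10:00:00"),
]


def reconstruct_traces(events):
    """Group events per case and order them by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, stamp in events:
        by_case[case_id].append((datetime.fromisoformat(stamp), activity))
    # Sorting the (timestamp, activity) pairs puts each case in time order.
    return {
        case_id: [activity for _, activity in sorted(rows)]
        for case_id, rows in by_case.items()
    }


traces = reconstruct_traces(EVENTS)
for case_id in sorted(traces):
    print(case_id, "->", " > ".join(traces[case_id]))
```

Each resulting trace is the ordered list of activities for one case, which is the kind of input a process mining tool analyses.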

Page 6: PhD Day 2011


Process Mining

Audit trail, database fields, CSV log file → ProM log file → ProM → Analyses

Page 7: PhD Day 2011


2. Merging log files

Page 8: PhD Day 2011


Merging log files

Process Mining

Page 9: PhD Day 2011


Merging log files

Process Mining

Log Merging
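The slides do not spell out the merge rules, so the following Python sketch only illustrates the simplest situation: two logs whose traces share a matching id. The data structures, the sample logs and the function name `merge_logs` are assumptions made for this example.

```python
def merge_logs(log_a, log_b):
    """Merge two logs whose traces share case ids.

    Each log maps a case id to a list of (timestamp, activity) events.
    Traces with the same id are linked and their events interleaved by
    time; traces without a partner are copied unchanged.
    """
    merged = {}
    for case_id in sorted(set(log_a) | set(log_b)):
        events = log_a.get(case_id, []) + log_b.get(case_id, [])
        merged[case_id] = sorted(events)
    return merged


# Hypothetical logs from two systems; timestamps are minutes since start.
SALES = {
    "c1": [(0, "Create order"), (30, "Confirm order")],
    "c2": [(5, "Create order")],
}
WAREHOUSE = {"c1": [(12, "Pick items"), (45, "Ship items")]}

merged = merge_logs(SALES, WAREHOUSE)
print(merged["c1"])
```

When no matching id exists, the links between traces have to be inferred from other clues, which is where a search technique such as a genetic algorithm comes in.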

Page 10: PhD Day 2011


3. Genetic algorithm

Page 11: PhD Day 2011


Genetic algorithm

1st generation 2nd generation 3rd generation

cross-over

mutation

survival of the fittest

Page 12: PhD Day 2011


Genetic algorithm

1st generation 2nd generation 3rd generation

mutation

cross-over

survival of the fittest

Fitness function score (values shown in the figure): 14, 27, 6, 18, 29, 5, 18, 28, 32
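The three ingredients named on these slides (cross-over, mutation, survival of the fittest) can be sketched as a toy genetic algorithm in Python. This is not the algorithm or fitness function used in the research. The encoding (gene i holds the index of the log A trace linked to trace i of log B), the time-window fitness and all sample data are assumptions chosen to keep the example small.

```python
import random


def evolve(fitness, n_genes, n_values, pop_size=20, generations=30,
           mutation_rate=0.1, seed=0):
    """Toy genetic algorithm; returns best fitness per generation and best individual."""
    rng = random.Random(seed)
    population = [[rng.randrange(n_values) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Survival of the fittest: keep the better half as parents.
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)       # cross-over point
            child = mother[:cut] + father[cut:]
            for i in range(n_genes):              # mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(n_values)
            children.append(child)
        population = parents + children
    return history, max(population, key=fitness)


# Hypothetical data: time window of each trace in log A and
# start time of each trace in log B.
A_WINDOWS = [(0, 10), (20, 30), (40, 50), (60, 70)]
B_STARTS = [45, 5, 65, 25]


def fitness(links):
    """Count links where the B trace starts inside the linked A trace's window."""
    return sum(A_WINDOWS[a][0] <= B_STARTS[b] <= A_WINDOWS[a][1]
               for b, a in enumerate(links))


history, best = evolve(fitness, n_genes=len(B_STARTS), n_values=len(A_WINDOWS))
print("best fitness per generation:", history)
print("best links:", best)
```

Because the parents survive unchanged into the next generation, the best score in this sketch can never decrease from one generation to the next.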

Page 13: PhD Day 2011


4. Experiment results

Page 14: PhD Day 2011


Experiment: proof of concept

Simulated data
• Given model
• Generate
  • random set of logs
  • single log (= solution)

Use merge algorithm to merge set of logs

Check resulting log with solution log
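The final check against the solution log can be sketched as follows. The slides report a percentage of incorrect links but do not define it precisely, so this Python example assumes it is the share of found links that are absent from the known solution. The link representation and the sample data are hypothetical.

```python
def incorrect_link_percentage(found_links, solution_links):
    """Percentage of found links that do not occur in the known solution."""
    if not found_links:
        return 0.0
    wrong = sum(1 for link in found_links if link not in solution_links)
    return 100.0 * wrong / len(found_links)


# Hypothetical links between trace ids of two logs: (id in log A, id in log B).
SOLUTION = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
FOUND = {("a1", "b1"), ("a2", "b3"), ("a3", "b2"), ("a4", "b4")}

score = incorrect_link_percentage(FOUND, SOLUTION)
print(f"{score:.0f}% incorrect links")
```

With simulated data the solution set is known in advance, which is what makes this kind of automatic scoring possible.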

Page 15: PhD Day 2011


Experiment: proof of concept

Advantages of using simulated data
• Solution is known
• Controllable parameters (e.g. noise, overlap, matching id)

Disadvantages of using simulated data
• Limited internal validity (are results realistic?)
• No external validity (results not generalisable)

Page 16: PhD Day 2011


Experiment results

[Chart: percentage of incorrect links (y-axis, 0% to 100%) at x-axis values 0, 10, 20, 50, 75 and 100. Series: Overlap, Noise. Scenario: matching id values]

Mean time: 4 min

Results of version of 31/03/2011: GA-inspired algorithm

Page 17: PhD Day 2011


Experiment results

[Chart: percentage of incorrect links (y-axis, 0% to 100%) at x-axis values 0, 10, 20, 50, 75 and 100. Series: Overlap, Noise. Scenario: no matching id values]

Mean time: 4 min

Results of version of 31/03/2011: GA-inspired algorithm

Page 18: PhD Day 2011


Experiment results: NEW ALGORITHM

[Chart: percentage of incorrect links (y-axis, 0% to 100%) at x-axis values 0, 10, 20, 50, 75 and 100. Series: Overlap, Noise. Scenario: no matching id values]

Mean time: 350 msec

Results of version of 15/05/2011: AIS algorithm

Page 19: PhD Day 2011


5. Discussion

Page 20: PhD Day 2011


Future work

Optimise genetic algorithm
• Fewer incorrect links
• Faster implementation
• Fitness function factors

Validation with real test cases
• Ghent University DPO (Human Resources)
• Century21 (Real Estate)
• BNP Paribas Fortis (Loan approvals)
• ...

Page 21: PhD Day 2011


Contact information

Jan Claes
[email protected]
processmining.ugent.be

FEB08, Tweekerkenstraat 2, 9000 Gent, Belgium