chapter 6 advanced process discovery techniques · process modeling and analysis chapter 3 data...
Chapter 6
Advanced Process Discovery Techniques
prof.dr.ir. Wil van der Aalst
www.processmining.org
Overview
Chapter 1 Introduction
Part I: Preliminaries
Chapter 2 Process Modeling and Analysis
Chapter 3 Data Mining
Part II: From Event Logs to Process Models
Chapter 4 Getting the Data
Chapter 5 Process Discovery: An Introduction
Chapter 6 Advanced Process Discovery Techniques
Part III: Beyond Process Discovery
Chapter 7 Conformance Checking
Chapter 8 Mining Additional Perspectives
Chapter 9 Operational Support
Part IV: Putting Process Mining to Work
Chapter 10 Tool Support
Chapter 11 Analyzing “Lasagna Processes”
Chapter 12 Analyzing “Spaghetti Processes”
Part V: Reflection
Chapter 13 Cartography and Navigation
Chapter 14 Epilogue
Process discovery
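[Figure: process mining overview. The “world” (business processes, people, machines, components, organizations) is supported/controlled by a software system that records events (e.g., messages, transactions) in event logs. A (process) model models and analyzes the world and specifies, configures, implements, and analyzes the software system. Discovery, conformance, and enhancement relate event logs and models.]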
Challenge
Process discovery has to balance four quality criteria:
• Fitness: “able to replay event log”
• Simplicity: “Occam’s razor”
• Precision: “not underfitting the log”
• Generalization: “not overfitting the log”
Observing a stable process infinitely long
[Figure legend: trace in event log; frequent behavior; all behavior (including noise).]
Target model
Non-fitting model
Overfitting model
Underfitting model
Characteristics of process discovery algorithms
• Representational bias
− Inability to represent concurrency
− Inability to deal with (arbitrary) loops
− Inability to represent silent actions
− Inability to represent duplicate actions
− Inability to model OR-splits/joins
− Inability to represent non-free-choice behavior
− Inability to represent hierarchy
• Ability to deal with noise
• Completeness notion assumed
• Approach used (direct algorithmic approaches, two-phase approaches, computational intelligence approaches, partial approaches, etc.)
Examples
• Algorithmic techniques
− Alpha miner
− Alpha+, Alpha++, Alpha#
− FSM miner
− Fuzzy miner
− Heuristic miner
− Multi phase miner
• Genetic process mining
− Single/duplicate tasks
− Distributed GM
• Region-based process mining
− State-based regions
− Language-based regions
• Classical approaches not dealing with concurrency
− Inductive inference (Mark Gold, Dana Angluin et al.)
− Sequence mining
Heuristic mining
• To deal with noise and incompleteness.
• To have a better representational bias than the α algorithm (AND/XOR/OR/skip).
• Uses C-nets.
[Figure: C-net with activities a (register claim), b (check policy), c (check damage), d (consult expert), and e (close case).]
Example log; problem α algorithm
[Figure: Petri net with transitions a, b, c, d, e and places start, p1, p2, p3, p4, p5, end.]
Taking into account frequencies
Dependency measure
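The formula on this slide is not preserved in the transcript. As a sketch, a dependency measure commonly used in heuristic mining compares how often a is directly followed by b with how often b is directly followed by a: (|a>b| − |b>a|) / (|a>b| + |b>a| + 1) for a ≠ b, and |a>a| / (|a>a| + 1) for a self-loop. The event log below is an assumption: it does not appear in the transcript and was chosen so that its counts agree with the frequencies shown on the threshold slides.

```python
from collections import Counter

# Assumed example log (trace variant -> number of cases); not from the transcript.
LOG = {
    ("a", "e"): 5,
    ("a", "b", "c", "e"): 10,
    ("a", "c", "b", "e"): 10,
    ("a", "b", "e"): 1,
    ("a", "c", "e"): 1,
    ("a", "d", "e"): 10,
    ("a", "d", "d", "e"): 2,
    ("a", "d", "d", "d", "e"): 1,
}

def direct_succession(log):
    """Count how often x is directly followed by y, weighted by trace frequency."""
    counts = Counter()
    for trace, freq in log.items():
        for x, y in zip(trace, trace[1:]):
            counts[(x, y)] += freq
    return counts

def dependency(counts, a, b):
    """Dependency measure between a and b; values near 1 suggest that a causes b."""
    ab = counts[(a, b)]
    if a == b:
        return ab / (ab + 1)
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)

counts = direct_succession(LOG)
for a, b in [("a", "b"), ("a", "d"), ("a", "e"), ("d", "d"), ("b", "c")]:
    print(a, b, counts[(a, b)], round(dependency(counts, a, b), 2))
```

Under these assumptions, b and c follow each other equally often in both directions, so their dependency is 0, which is what one would expect for concurrent activities.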
Example
Lower threshold (2 direct successions and a dependency of at least 0.7)
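[Figure: dependency graph over a, b, c, d, e. Arc annotations, given as frequency (dependency): 11 (0.92) on four arcs, 13 (0.93) on two arcs, 5 (0.83) on one arc, and 4 (0.80) on one arc.]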
Higher threshold (5 direct successions and a dependency of at least 0.9)
[Figure: dependency graph over a, b, c, d, e. Arc annotations, given as frequency (dependency): 11 (0.92) on four arcs and 13 (0.93) on two arcs.]
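The effect of the two threshold settings can be illustrated with a small sketch. The counts below are assumptions: the transcript preserves the arc annotations (11, 13, 5, 4) but not which arc each one belongs to, so the assignment to arcs and the counts for b and c following each other are illustrative only.

```python
# Assumed direct-succession counts |x > y|; the assignment to arcs is illustrative.
COUNTS = {
    ("a", "b"): 11, ("a", "c"): 11, ("a", "d"): 13, ("a", "e"): 5,
    ("b", "c"): 10, ("c", "b"): 10,
    ("b", "e"): 11, ("c", "e"): 11, ("d", "d"): 4, ("d", "e"): 13,
}

def dependency(counts, a, b):
    """Dependency measure, with the usual special case for self-loops."""
    ab = counts.get((a, b), 0)
    if a == b:
        return ab / (ab + 1)
    ba = counts.get((b, a), 0)
    return (ab - ba) / (ab + ba + 1)

def dependency_graph(counts, min_count, min_dependency):
    """Keep arc (a, b) only if it meets both the frequency and dependency threshold."""
    return {
        (a, b): (n, round(dependency(counts, a, b), 2))
        for (a, b), n in counts.items()
        if n >= min_count and dependency(counts, a, b) >= min_dependency
    }

lower = dependency_graph(COUNTS, min_count=2, min_dependency=0.7)
higher = dependency_graph(COUNTS, min_count=5, min_dependency=0.9)
print(sorted(lower))
print(sorted(set(lower) - set(higher)))  # arcs dropped by the stricter setting
```

With these assumed counts, the stricter setting drops the arc with frequency 5 (its dependency of 0.83 is below 0.9) and the self-loop with frequency 4 (below 5 direct successions), which matches the annotations that disappear between the two slides.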
Learning splits and joins
[Figure: C-net with activity frequencies a 40, b 21, c 21, d 17, e 40; the input and output bindings are annotated with frequencies 20, 13, 5, and 4.]
Alternative visualization
[Figure: the C-net with activity frequencies a 40, b 21, c 21, d 17, e 40, shown next to an alternative visualization over a, b, c, d, e in which two bindings are marked AND.]
Characteristics of heuristic mining
• Can deal with noise and is therefore quite robust.
• Improved representational bias.
• Split and join rules are only considered locally (therefore most of the discovered models are not sound and require repair actions).
Genetic process mining
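[Figure: genetic process mining loop. From the event log, create an initial population and compute fitness. Elitism carries the best individuals to the next generation, a tournament selects parents, crossover and mutation produce children, and the remaining individuals are “dead”. On termination, select the best individual.]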
Design decisions
• Representation of individuals
• Initialization
• Fitness function
• Selection strategy (tournament and elitism)
• Crossover
• Mutation
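To make these design decisions concrete, here is a minimal sketch of the loop with one deliberately simple choice for each of them. Everything in it is an assumption made for illustration: the toy log, the representation of an individual as a set of directly-follows arcs, the fitness function, and all parameter values. Real genetic process mining uses richer representations and fitness functions.

```python
import random
from itertools import product

LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]  # assumed toy log
ACTIVITIES = sorted({x for trace in LOG for x in trace})
ALL_ARCS = list(product(ACTIVITIES, repeat=2))  # candidate arcs (x, y)

def fitness(individual):
    """Toy fitness: share of replayable traces minus a small penalty per arc."""
    replayable = sum(
        all(pair in individual for pair in zip(trace, trace[1:])) for trace in LOG
    )
    return replayable / len(LOG) - 0.01 * len(individual)

def tournament(population, rng, size=3):
    """Pick the fittest of a few randomly drawn individuals."""
    return max(rng.sample(population, size), key=fitness)

def crossover(p1, p2, rng):
    """Each candidate arc is inherited from one randomly chosen parent."""
    return frozenset(
        arc for arc in ALL_ARCS if arc in (p1 if rng.random() < 0.5 else p2)
    )

def mutate(individual, rng, rate=0.05):
    """Flip the presence of each arc with a small probability."""
    flipped = {arc for arc in ALL_ARCS if rng.random() < rate}
    return frozenset(individual ^ flipped)

def evolve(generations=60, pop_size=30, elite=2, seed=1):
    rng = random.Random(seed)
    population = [
        frozenset(arc for arc in ALL_ARCS if rng.random() < 0.3)
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        next_generation = population[:elite]  # elitism
        while len(next_generation) < pop_size:
            parents = tournament(population, rng), tournament(population, rng)
            next_generation.append(mutate(crossover(*parents, rng), rng))
        population = next_generation  # everything else is "dead"
    return max(population, key=fitness)

best = evolve()
print(sorted(best), round(fitness(best), 2))
```

Because of elitism, the best fitness in the population never decreases from one generation to the next. The penalty term is a crude stand-in for simplicity: without it, the individual containing every possible arc would replay every trace and count as optimal.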
Example: crossover
[Figure: crossover. Two parent process models and the two child models obtained by exchanging parts of the parents. All models run from start to end over the activities a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), and h (reject request).]
Example: mutation
[Figure: mutation. A process model over the activities a (register request), b (examine thoroughly), c (examine casually), d (check ticket), e (decide), f (reinitiate request), g (pay compensation), and h (reject request), before and after mutation; annotations: “remove place” and “added arc”.]
Characteristics of genetic process mining
• Requires a lot of computing power.
• Can be distributed easily.
• Can deal with noise, infrequent behavior, duplicate tasks, invisible tasks, etc.
• Allows for incremental improvement and combinations with other approaches (heuristics post-optimization, etc.).
Region-based mining
• Two types of region theory:
− State-based regions
− Language-based regions
• All about discovering places (like in the α algorithm)!
[Figure: place p(A,B) with input transitions A = {a1, a2, …, am} and output transitions B = {b1, b2, …, bn}.]
State-based regions
Two steps:
1. Discover a transition system (different abstractions are possible).
2. Convert the transition system into an “equivalent” Petri net.
Step 1: learning a transition system
• past, future, past+future
• sequence, multiset, set abstraction
• limited horizon to abstract further
• filtering, e.g., based on transaction type, names, etc.
• labels based on activity name or other features
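[Figure: trace a b c d c d c d e f a g h h h i with a marked current state. The state can be based on the past (events before the current state), the future (events after it), or past and future together.]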
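The abstraction choices listed above can be sketched as a function that maps the prefix of a trace (the past) to a state. This is only an illustration: the function names and parameters are hypothetical, and the log is an assumption inferred from the final states that appear on the slides that follow.

```python
from collections import Counter

LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]  # assumed log

def past_state(prefix, abstraction="sequence", horizon=None):
    """Map the events seen so far to a state."""
    if horizon is not None:
        # Limited horizon: only the last `horizon` events matter.
        prefix = prefix[max(0, len(prefix) - horizon):]
    if abstraction == "sequence":
        return tuple(prefix)
    if abstraction == "multiset":
        return tuple(sorted(Counter(prefix).items()))
    if abstraction == "set":
        return frozenset(prefix)
    raise ValueError(f"unknown abstraction: {abstraction}")

def transition_system(log, **options):
    """Replay every trace and record the visited states and transitions."""
    states, transitions = set(), set()
    for trace in log:
        for i, activity in enumerate(trace):
            source = past_state(trace[:i], **options)
            target = past_state(trace[:i + 1], **options)
            states.update((source, target))
            transitions.add((source, activity, target))
    return states, transitions

for options in ({}, {"abstraction": "multiset"}, {"horizon": 1}):
    states, transitions = transition_system(LOG, **options)
    print(options, len(states), "states,", len(transitions), "transitions")
```

Coarser abstractions merge states: for this assumed log, the full sequence keeps every distinct prefix apart, the multiset abstraction merges prefixes that differ only in order, and a horizon of 1 keeps only the last event.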
Past without abstraction (full sequence)
[Figure: transition system with states ‹›, ‹a›, ‹a,b›, ‹a,c›, ‹a,e›, ‹a,b,c›, ‹a,c,b›, ‹a,e,d›, ‹a,b,c,d›, ‹a,c,b,d› and transitions labeled a, b, c, d, e.]
Future without abstraction
[Figure: transition system with states ‹a,b,c,d›, ‹a,c,b,d›, ‹a,e,d›, ‹b,c,d›, ‹c,b,d›, ‹e,d›, ‹c,d›, ‹b,d›, ‹d›, ‹› and transitions labeled a, b, c, d, e.]
Past with multiset abstraction
[Figure: transition system with states [ ], [a], [a,b], [a,c], [a,e], [a,b,c], [a,d,e], [a,b,c,d] and transitions labeled a, b, c, d, e.]
Only last event matters for state
[Figure: transition system with states ‹›, ‹a›, ‹b›, ‹c›, ‹d›, ‹e› and transitions labeled a, b, c, d, e.]
Step 2: constructing a Petri net using regions
[Figure: a region R in a transition system and the corresponding place pR. Legend: a = enter, b = enter, c = exit, d = exit, e = do not cross, f = do not cross.]
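The enter/exit/do-not-cross legend suggests a simple check: a set of states R is a region if every activity behaves uniformly with respect to R, that is, all of its transitions enter R, all exit R, or none cross its border. The sketch below assumes a small transition system (the multiset abstraction of an assumed log ‹a,b,c,d›, ‹a,c,b,d›, ‹a,e,d›); the function name and the string encoding of states are hypothetical.

```python
# Assumed transition system (source state, activity, target state).
# States are multisets of past events, written as sorted strings for brevity.
TRANSITIONS = [
    ("", "a", "a"),
    ("a", "b", "ab"), ("a", "c", "ac"), ("a", "e", "ae"),
    ("ab", "c", "abc"), ("ac", "b", "abc"),
    ("abc", "d", "abcd"), ("ae", "d", "ade"),
]

def classify(transitions, region):
    """Return {activity: 'enter' | 'exit' | 'do not cross'} if `region` is a
    region, or None if some activity behaves inconsistently."""
    result = {}
    for source, activity, target in transitions:
        if source not in region and target in region:
            kind = "enter"
        elif source in region and target not in region:
            kind = "exit"
        else:
            kind = "do not cross"
        # All transitions with the same label must agree on their kind.
        if result.setdefault(activity, kind) != kind:
            return None
    return result

print(classify(TRANSITIONS, {"a", "ac"}))
print(classify(TRANSITIONS, {"ab"}))
```

For the first set, a enters and b and e exit, so the corresponding place would have a as input and b and e as outputs. The second set is not a region because one c transition exits it while another c transition does not cross it.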
Example
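[Figure: the transition system with states [ ], [a], [a,b], [a,c], [a,e], [a,b,c], [a,d,e], [a,b,c,d] and the Petri net derived from it, with transitions a, b, c, d, e and places start, p1, p2, p3, p4, end.]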
Language based regions
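[Figure: place pR with initial marking c, a set X of transitions with arcs into pR, and a set Y of transitions with arcs out of pR; the figure also shows transitions a1, a2, b1, b2, c1, d, e, f.]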
Region R = (X,Y,c) corresponding to place pR: X = {a1,a2,c1} = transitions producing a token for pR, Y = {b1,b2,c1} = transitions consuming a token from pR, and c is the initial marking of pR.
Basic idea: enough tokens should be present when consuming
A place is feasible if it can be added without disabling any of the traces in the event log.
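As a sketch of this feasibility condition, a candidate place (X, Y, c) can be checked by replaying each trace and verifying that a token is available whenever a transition in Y fires. The log and the function name are assumptions for illustration, and all arcs are assumed to have weight 1.

```python
LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]  # assumed log

def is_feasible(log, producers, consumers, initial_tokens=0):
    """Check that the place (producers=X, consumers=Y, initial_tokens=c)
    never disables an event while replaying the log."""
    for trace in log:
        tokens = initial_tokens
        for activity in trace:
            if activity in consumers:
                if tokens == 0:
                    return False  # the place would block this event
                tokens -= 1
            if activity in producers:
                tokens += 1
    return True

print(is_feasible(LOG, producers={"a"}, consumers={"b", "e"}))
print(is_feasible(LOG, producers={"b"}, consumers={"a"}))
print(is_feasible(LOG, producers={"b"}, consumers={"a"}, initial_tokens=1))
```

A place that consumes before anything has produced is only feasible if it starts with tokens, which is why the initial marking c is part of the region.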
Example
Regions
Model
[Figure: Petri net with transitions a, b, c, d, e and places p1, p2, p3, p4, p5, p6.]
Characteristics of region-based mining
• Can be used to discover more complex control-flow structures.
• Classical approaches need to be adapted (overfitting!).
• Representational bias can be parameterized (e.g., free-choice nets, label splitting, etc.).
• Problems dealing with noise.
Other approaches, e.g. fuzzy mining
Evaluating the discovered process
Structure: Is this the simplest model (Occam's Razor)?
Fitness: Is the event log possible according to the model?
Precision: Is the model not underfitting (allowing for too much behavior)?
Generalization: Is the model not overfitting (allowing only for the “accidental” examples)?
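Two of these questions can be illustrated with toy proxies. The sketch below assumes that a model is given as a finite set of allowed traces, which is a strong simplification; the log, the three example models, and both functions are assumptions for illustration and do not correspond to the measures used in any particular tool.

```python
from itertools import permutations

LOG = [("a", "b", "c", "d"), ("a", "c", "b", "d"), ("a", "e", "d")]  # assumed log

def fitness(log, model_traces):
    """Share of log traces that the model can replay."""
    return sum(trace in model_traces for trace in log) / len(log)

def precision_proxy(log, model_traces):
    """Share of the model's traces that were actually observed in the log."""
    return len(model_traces & set(log)) / len(model_traces)

models = {
    "non-fitting": {("a", "b", "c", "d")},
    "matches log": set(LOG),
    # Starts with a, ends with d, and allows any ordering of any
    # subset of b, c, e in between.
    "underfitting": {
        ("a", *middle, "d")
        for r in range(4)
        for middle in permutations("bce", r)
    },
}

for name, traces in models.items():
    print(name, round(fitness(LOG, traces), 2), round(precision_proxy(LOG, traces), 2))
```

Neither proxy says anything about generalization or simplicity: a model that merely enumerates the observed traces scores perfectly on both, yet it may be overfitting if the log contains only some of the possible behavior.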