An introduction to Process Mining
Diogo R. Ferreira
DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018
Introduction
• Example of a business process
• sequence of activities
• branching and parallel behavior
• process model vs. process instances
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 2
a. Fill out request
b. Approve request
c. Archive request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request
Requisitionapproved?
Yes
No
+ +
Introduction
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 3
a. Fill out request
b. Approve request
c. Archive request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request
Requisitionapproved?
Yes
No
+ +
a b ca b d e f g ha b d e g f ha b d g e f h
Introduction
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 4
a. Fill out request
b. Approve request
c. Archive request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request
Requisitionapproved?
Yes
No
+ +
a b ca b d e f g ha b d e g f ha b d g e f h
Introduction
• Possible questions• what is the most frequent path?
• how many requests (%) get approved?
• how much time does the process (or each activity) take?
• does every instance comply with the model?
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 5
a. Fill out request
b. Approve request
c. Archive request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request
Requisitionapproved?
Yes
No
+ +
Introduction
• Task allocation
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 6
ha a
bb b d g
fe f
ce
dgha
cb
u3 u4 u5 u6 u7 u8u2u1
tasks
users
a. Fil l out request
b. Approve request
c. Archive request
a. Fil l out request
b. Approve request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request+ +
a. Fil l out request
b. Approve request
c. Archive request
a. Fil l out request
b. Approve request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request+ +
Introduction
• More possible questions• what is the distribution of work, i.e. who does what?
• how do users interact and collaborate with each other?
• what is the performance of each user in each activity?
• are certain resources being overloaded?
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 7
ha a
bb b d g
fe f
ce
dgha
cb
u3 u4 u5 u6 u7 u8u2u1
tasks
users
a. Fil l out request
b. Approve request
c. Archive request
a. Fil l out request
b. Approve request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request+ +
a. Fil l out request
b. Approve request
c. Archive request
a. Fil l out request
b. Approve request
d. Order product
e. Receive product
f. Updatestock
g. Handle payment
h. Close request+ +
Event logs
• Example of an event log
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 8
Event logs
• Process mining perspectives• control-flow perspective (task column)
• organizational perspective (user column)
• performance perspective (timestamp column)
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 9
• Task transitions within each case id
Control-flow perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 10
• Transition matrix
Control-flow perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 11
a b c d e f g h
a 3
b 1 2
c
d 1 1
e 2
f 1 1
g 1 1
h
• Transition graph
Control-flow perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 12
• Transition graph for a larger event log
• percentage of approvals: 5683 / (1866 + 5683) ≈ 75%
Control-flow perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 13
• Improving the visualization• edge thickness
• activity counts
• node coloring
Control-flow perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 14
• Handover of work
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 15
• Transition matrix
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 16
u1 u2 u3 u4 u5 u6 u7 u8
u1 1
u2 1 1
u3 1 1
u4 1
u5 1
u6 1 2
u7 2
u8 1 1
• Transition graph
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 17
• Transition graph for a larger event log
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 18
• Working together
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 19
• Number of cases where each pair of users have worked together
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 20
u1 u2 u3 u4 u5 u6 u7 u8
u1 3 2 1 1 2 2 2
u2 3 2 1 1 2 2 2
u3 2 2 1 2 2 2
u4 1 1
u5 1 1 1 1 1 1
u6 2 2 2 1 2 2
u7 2 2 2 1 2 2
u8 2 2 2 1 2 2
• Number of cases where each pair of users have worked together
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 21
• Number of cases where each pair of users have worked together
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 22
• Distribution of work
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 23
• Distribution of work
Organizational perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 24
• Timestamp difference between events
Performance perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 25
• Timestamp difference between events
Performance perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 26
• Timeline of events
Performance perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 27
• Relative time
Performance perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 28
• Activity duration
Performance perspective
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 29
Conclusion
• Where to go from here• process mining books
• A Primer on Process Mining: Practical Skills with Python and Graphviz, D. R. Ferreira (Springer, 2017)
• Process Mining: Data Science in Action, W. van der Aalst (Springer, 2016)
• process mining courses• Process Mining: Data Science in Action
https://www.coursera.org/learn/process-mining
• Introduction to Process Mining with ProMhttps://www.futurelearn.com/courses/process-mining
• process mining website• http://www.processmining.org
• process mining tools: ProM, Disco, etc.
© 2018 Diogo R. Ferreira DELix – Lisbon Winter School on Data Science and Engineering | February 15-17, 2018 30