business process management -...
TRANSCRIPT
Business Process Management
Paolo Bottoni
Lecture 11: Process Mining
Adapted from the slides for the book :
Dumas, La Rosa, Mendling & Reijers: Fundamentals of Business Process Management, Springer 2013
http://courses.cs.ut.ee/2013/bpm/uploads/Main/ITlecture7.ppt
Where are we?
2
Business Process Management
• Runtime Monitoring (Business Activity Monitoring)
– Viewing the load of the process
– Identifying problematic cases
– Identifying late cases (risk of missing deadlines), etc.
• Post-mortem Monitoring (aka Business Process Analytics)
– Performance KPIs: cycle times, resource utilization, error rates, …
– Identification of bottlenecks
• See for example:
– BizAgi BAM: http://wiki.bizagi.com/en/index.php?title=Analysis_Reports_BAM
– Analytics: http://wiki.bizagi.com/en/index.php?title=Analysis_Reports_Analytics
Types of Process Monitoring
Process Monitoring: Dashboards
Process
Frequency
of Order
Processing
Process Cycle
Time
of Order
Processing
Process Cycle Time
of Order Processing
split up to different
Plants
IDS (2003)
Process Mining Tools
• ARIS Process Performance Manager
• Percetive Reflect
• Fujitsu Interstage (BPM Analytics)
• ProM
Process Mining
Starting point: Event Logs
<B A E F E>
<A A D F C D E>
<A B D F E>
<D D A D F C D E>
<F A D E>
Starting point: Event Logs
Case ID Task Name Originator Timestamp
Case ID Task Name Originator Timestamp
1 File Fine Anne 20-07-2004 14:00:00 3 Reminder John 21-08-2004 10:00:00
2 File Fine Anne 20-07-2004 15:00:00 2 Process Payment system 22-08-2004 09:05:00
1 Send Bill system 20-07-2004 15:05:00 2 Close case system 22-08-2004 09:06:00
2 Send Bill system 20-07-2004 15:07:00 4 Reminder John 22-08-2004 15:10:00
3 File Fine Anne 21-07-2004 10:00:00 4 Reminder Mary 22-08-2004 17:10:00
3 Send Bill system 21-07-2004 14:00:00 4 Process Payment system 29-08-2004 14:01:00
4 File Fine Anne 22-07-2004 11:00:00 4 Close Case system 29-08-2004 17:30:00
4 Send Bill system 22-07-2004 11:10:00 3 Reminder John 21-09-2004 10:00:00
1 Process Payment system 24-07-2004 15:05:00 3 Reminder John 21-10-2004 10:00:00
1 Close Case system 24-07-2004 15:06:00 3 Process Payment system 25-10-2004 14:00:00
2 Reminder Mary 20-08-2004 10:00:00 3 Close Case system 25-10-2004 14:01:00
Process Mining
Process Models
Process discovery
α algorithm
22 Opbergen en
einde
10 registreren
14 eindcontrolere,
tekenen Standaard
17 bepalen vervolg
9 Bepalen vervolg1
18 registreren offerte
gesloten
13 inv., 1e controle,
printen STANDAARD
3 controleren
compleetheid/juistheid
1 start
2 collectief of
particulier
12 Bepalen offerte
standaard of NIET
klaar voor invoeren
Goedgekeurde offerte
begin proces
klaar voor controle
compleet/juist
klaar voor registreren
naar registreren
offerte uitgeprint
klaar voor einde
Standaard offerte
afgekeurde offerte
20 ontvangst
verklaring
P2 accoord
verklaring
7 ontvangst
gegevens
P1 ontbrekende
gegevens
19 wachten op
accoord verklaring
16 eindcontrolere,
tekenen niet std.
15 inv, 1e controle,
printen NIET STD.
retour gewenst
wachten2
4 dubbele aanvraag?
5 navraag VA
(telefoon)
6 opvragen
ontbrekende
gegevens
NS uitgeprint
D2 geen retour
ontvangen
Niet Standaard offerte
21 registreren offerte
afgelegd
is collectief
opvagen gegevens
wachten
dubbele
D1 Geen reactie
8 verlopen deadline
11 afwijzen
Afgekeurd NS
afgewezen
collectief retour reeds ontvangen
P of C retour gewenst
particulier zonder retour
collectief
particulier en invoeren
particulier en afwijzen
niet compleet/onjuist
particulier
collectief
incompleet
voldoende
onvoldoende
www.processmining.org
Conformance Checking
Register
order
Prepare
shipment
Ship
goods
Receive
payment
(Re)send
bill
Contact
customer
Archive
order
Materialis released
TO itemconfirmed
withoutdifferences
Warehouse/Stores
Transferorderitem
is confirmed
Paymentmust
be effected
PurchaseRequisition
Requirementfor materialhas arisen
Requisitionreleased
for schedulingagreement
schedule/SA release
InvoiceVerification
Purchaserequisitionreleased
for purchaseorder
Inbounddeliveryentered
Goodsreceived
Goodsreceiptposted
GoodsReceipt
Purchaseorder
created
Purchasing
Invoicereceived
Decide To Buy Computer
Choose Model
Save Money
Read Test Reviews
Check Bank Account
[reviews ok]
[bad reviews]
[enough]
Order Machine
Order Screen
Receive Machine
[desktop]
Receive Screen
Set Up And Connect
Plug In And Power On
[laptop]
Open Lid
Choose Operating System
Order Windows
Receive Windows
[windows]
Download Linux
[linux]
Install Operating System
Work Hard
[not enough]
[laptop]
[desktop]
www.processmining.org
ProM Demo
www.workflowcourse.com
http://www.processmining.org/
Motivation
• Up until now: – Designed or pre-defined models
– Assumption that they are appropriate
• Process Mining
– Consideration of information from the
execution of proceses
– This is covered in log data
• Logs
– Sequence of log entries, which capture
events in a company that relate to
processes
15 Business Process Management
Log entries
• Examples of log entries
– Check Invoice for Invoice No. 4567 completed on 12.11.2010 at
9:19:57
– Function StoreCustomerData(„Müller“, c1987, „Bad Bentheim“)
completed on 12.11.2010 at 9:22:24
– Send Invoice for Invoice No. 4567 completed on 12.11.2010 at
9:23:18
– Function ContactCustomer(c1987, PromoMailing) completed on
12.11.2010 at 9:24:10
– Function StoreCustomerData(„Miller“, c1988, „Osnabrück“)
completed on 12.11.2010 at 9:26:08
– Check Invoice for Invoice No. 4568 completed on 12.11.2010 at
9:26:38
– Function ContactCustomer(c1988, PromoMailing) completed
on 12.1 1.2010 at Send 9:27:32
–
16 Business Process Management
Logs bear valuable information
Logs bear valuable information to answer
questions like
– When and how many process instances have
been executed?
– Are there recurring patterns in the execution of
activities?
– Can process models be derived from the data?
– Which paths of execution are used how often in
the process models?
– Are there paths which are never taken?
17 Business Process Management
Process Discovery
• Process Discovery is a technique for deriving a
process model from log data
• Input: execution logs as ordered lists of activities
with time stamp and case id
• Output: process model which could have
generated the execution logs
• The case id is often not directly covered in the
data, and needs to be generated in pre-processing
18 Business Process Management
Process Conformance
• Process Conformance is a technique to analyze
the relationship between log data and process
models
• Input: Logs and process model
• Output: information on the relationship, e.g. fitness
19 Business Process Management
Overview
20 Business Process Management
Execution Logs
• Assumption – Execution log defines complete order of events, which
can all be related to process activities
– All events in the execution log relate to process instances of the considered process
• Hint – Often log entries refer to different process models
– This warrants filtering activities
• Abstraction – Techniques often work on abstraction of logs
– Focus on case id and activities
21 Business Process Management
Execution Log Format
• Log format – (caseID, activity)
• Example
– Check Invoice for Invoice No. 4567 completed on
12.11.2010 at 9:19:57
– Function StoreCustomerData(„Müller“, c1987, „Bad
Bentheim“) completed on 12.11.2010 at 9:22:24
– Send Invoice for Invoice No. 4567 completed on
12.11.2010 at 9:23:18
• Resulting Log
– (4567, Check Invoice), (c1987, StoreCustomerData),
(4567, Send Invoice), etc.
22 Business Process Management
Execution Log
• Further abstraction
– A‘s and B‘s
– (case id, task id)
• Additional information
– Event type, time,
resource, data
– Not considered here
• Assumption
– Activity execution
captured by one event
– No intermediate activities
case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D
23 Business Process Management
Order relations
Log based order relations for pairs of activities
a, b T in a workflow log W:
• Direct successor
a >w b i.e. in an execution sequence b directly follows a
• Causality
a w b i.e. a >w b and not b >w a
• Concurrency
a ║w b i.e. a >w b and b >w a
• Exclusiveness
a w b i.e. not a >w b and not b >w a
– Activity pairs which never succeed each other
case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D
• W = {ABCD, ACBD, EF} • Direct successor
• Causality
• Concurrency
Execution log analysis
case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D
A>B
A>C
B>C
B>D
C>B
C>D
E>F
A B
A C
B D
C D
E F
B||C
C||B
1) 2) 3)
• W = {ABCD, ACBD, EF}
• Direct successor
• Causality
• Concurrency
Execution log analysis
α-Algorithm
• The idea is to utilize order relations for deriving a
workflow net that is compliant with these
relations
• Precisely, each order relation results in a petri
net fragment, which imposes the respective
relationship
α-Algorithm
• Idea (a)
a b
α-Algorithm
• Idea (b)
a b, a c and b # c
α-Algorithm
• Idea (c)
b d, c d and b # c
α-Algorithm
• Idea (d)
a b, a c and b || c
α-Algorithm
• Idea (e)
b d, c d and b || c
The Alpha-Algorithm (simplified)
• 1. Identify the set of all tasks in the log as TL.
• 2. Identify the set of all tasks that have been observed as the
first task in some case as TI.
• 3. Identify the set of all tasks that have been observed as the
last task in some case as TO.
• 4. Identify the set of all connections to be potentially
represented in the process model as a set XL. Add the
following elements to XL:
– a. Pattern (a): all pairs for which hold a→b.
– b. Pattern (b): all triples for which hold a→(b#c).
– c. Pattern (c): all triples for which hold (b#c)→d.
• Note that triples for which Pattern (d) a→(b||c) or Pattern
(e) (b||c)→d hold are not included in XL.
The Alpha-Algorithm (cont.)
• 5. Construct the set YL as a subset of XL by:
– a. Eliminating a→b and a→c if there exists some a→(b#c).
– b. Eliminating b→c and b→d if there exists some (b#c)→d.
• 6. Connect start and end events in the following way:
– a. If there are multiple tasks in the set TI of first tasks, then draw
a start event leading to an XOR-split, which connects to every
task in TI. Otherwise, directly connect the start event with the only
first task.
– b. For each task in the set TO of last tasks, add an end event and
draw an arc from the task to the end event.
The Alpha-Algorithm (cont.)
• 7. Construct the flow arcs in the following way:
– a. Pattern (a): For each a→b in YL, draw an arc a to b.
– b. Pattern (b): For each a→(b#c) in YL, draw an arc from a to an
XOR-split, and from there to b and c.
– c. Pattern (c): For each (b#c)→d in YL, draw an arc from b and c
to an XOR-join, and from there to d.
– d. Pattern (d) and (e): If a task in the so constructed process
model has multiple incoming or multiple outgoing arcs, bundle
these arcs with an AND-split or AND-join, respectively.
• 8. Return the newly constructed process model.
α-Algorithm Example
case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D
α-Algorithm Example
case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D
(W):
α-Algorithm
Log Completeness
• Level of completeness required for a log
– Assume for the execution sequence EF, there is a log missing
– Then, the correct process model cannot be derived
• Basic assumption: each execution sequence must be part of the log
– Consequence: the complete behaviour is visible
– Problem: amount of required instances grows dramatically
– Example:
• 10 activities are executed in parallel
• Amount of potential execution sequences:
10! = 3.628.800
Log Completeness
• Result
– For the α-Algorithm it is sufficient to have completeness in terms of the successor relationship (>w)
• Reason
– All other relations are derived from direct successorship
• Interpretation
– Each time two activities may succeed each other, this must be
visible in at least one execution sequence
• Hint
– In case of highly concurrent process models, this reduces the
amount of required execution sequences dramatically
Advanced Features: Decision mining
Advanced Features: Social network
mining