discovering interacting artifacts from erp systems ...ceur-ws.org/vol-1701/paper2.pdfjan mendling...

4
Jan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft f ¨ ur Informatik, Bonn 2016 Discovering Interacting Artifacts from ERP Systems (Extended Abstract) 3 D. Fahland 1 , X. Lu 1 , Marijn Nagelkerke 2 , Dennis van de Wiel 2 Abstract: Enterprise Resource Planning (ERP) systems are widely used to manage business doc- uments along a business processes and allow very detailed recording of event data of past process executions and involved documents. This recorded event data is the basis for auditing and detecting unusual flows. Process mining techniques can analyze event data of processes stored in linear event logs to discover a process model that reveals unusual executions. Existing techniques assume a linear event log that use a single case identifier to which all behavior can be related. However, in ERP sys- tems processes such as Order to Cash operate on multiple interrelated business objects, each having their own case identifier, their own behavior, and interact with each other. Forcing these into a single case creates ambiguous dependencies caused by data convergence and divergence which obscures unusual flows in the resulting process model. We present a new semi-automatic, end-to-end approach for analyzing event data in a plain database of an ERP system for unusual executions. We identify an artifact-centric process model describing the business objects, their life-cycles, and how the various objects interact along their life-cycles. The technique was validated in two case studies and reliably revealed unusual flows later confirmed by domain experts. The work summarized in this extended abstract has been published in [Lu15]. Keywords: Process Mining, ERP-System, Artifact-Centric Model, Object Life-Cycle, Interaction Discovery 1 Introduction and Problem Description Information systems (IS) not only store and process data in an organization but also record event data about how and when information changed. This “historical event data” can be used to analyze, for instance, whether information processing in the past conformed to the prescribed processes or to compliance requirements. Process mining [Aa11] offers auto- mated techniques for this task. In particular exploring visual models discovered from event data allows to identify unusual flows and their circumstances; based on which concrete measures for process improvement can be devised [Ec15]. Prerequisite to this analysis is a process event log that holds events about information changes with the assumption that each event belong to one specific execution of a specific process. In general, information access is not tied to a particular process execution; rather the same information can be accessed and changed from various processes and applications. For 1 [email protected], Eindhoven University of Technology, The Netherlands 2 KPMG IT Advisory N.V., Eindhoven, The Netherlands 3 This article summarizes problem, approach, and selected findings of a study published as Xixi Lu, Marijn Nagelkerke, Dennis van de Wiel, and Dirk Fahland. Discovering Interacting Artifacts from ERP Systems. Ser- vices Computing, IEEE Transactions on, 8(6), 2015 doi:10.1109/TSC.2015.2474358 [Lu15]. 1

Upload: others

Post on 25-Apr-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Discovering Interacting Artifacts from ERP Systems ...ceur-ws.org/Vol-1701/paper2.pdfJan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft fur Informatik,

Jan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016,Gesellschaft fur Informatik, Bonn 2016

Discovering Interacting Artifacts from ERP Systems(Extended Abstract)3

D. Fahland1, X. Lu1, Marijn Nagelkerke2, Dennis van de Wiel2

Abstract: Enterprise Resource Planning (ERP) systems are widely used to manage business doc-uments along a business processes and allow very detailed recording of event data of past processexecutions and involved documents. This recorded event data is the basis for auditing and detectingunusual flows. Process mining techniques can analyze event data of processes stored in linear eventlogs to discover a process model that reveals unusual executions. Existing techniques assume a linearevent log that use a single case identifier to which all behavior can be related. However, in ERP sys-tems processes such as Order to Cash operate on multiple interrelated business objects, each havingtheir own case identifier, their own behavior, and interact with each other. Forcing these into a singlecase creates ambiguous dependencies caused by data convergence and divergence which obscuresunusual flows in the resulting process model. We present a new semi-automatic, end-to-end approachfor analyzing event data in a plain database of an ERP system for unusual executions. We identify anartifact-centric process model describing the business objects, their life-cycles, and how the variousobjects interact along their life-cycles. The technique was validated in two case studies and reliablyrevealed unusual flows later confirmed by domain experts. The work summarized in this extendedabstract has been published in [Lu15].

Keywords: Process Mining, ERP-System, Artifact-Centric Model, Object Life-Cycle, InteractionDiscovery

1 Introduction and Problem Description

Information systems (IS) not only store and process data in an organization but also recordevent data about how and when information changed. This “historical event data” can beused to analyze, for instance, whether information processing in the past conformed to theprescribed processes or to compliance requirements. Process mining [Aa11] offers auto-mated techniques for this task. In particular exploring visual models discovered from eventdata allows to identify unusual flows and their circumstances; based on which concretemeasures for process improvement can be devised [Ec15]. Prerequisite to this analysis isa process event log that holds events about information changes with the assumption thateach event belong to one specific execution of a specific process.

In general, information access is not tied to a particular process execution; rather the sameinformation can be accessed and changed from various processes and applications. For

1 [email protected], Eindhoven University of Technology, The Netherlands2 KPMG IT Advisory N.V., Eindhoven, The Netherlands3 This article summarizes problem, approach, and selected findings of a study published as Xixi Lu, Marijn

Nagelkerke, Dennis van de Wiel, and Dirk Fahland. Discovering Interacting Artifacts from ERP Systems. Ser-vices Computing, IEEE Transactions on, 8(6), 2015 doi:10.1109/TSC.2015.2474358 [Lu15].

1

mendling
Stamp
Page 2: Discovering Interacting Artifacts from ERP Systems ...ceur-ws.org/Vol-1701/paper2.pdfJan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft fur Informatik,

Documents Changes

Change id Date changed Reference id Table name Change type Old Value New Value

1 17-5-2020 S1 SD Price updated 100 80

2 19-5-2020 S1 SD Delivery block released X -

3 19-5-2020 S1 SD Billing block released X -

4 10-6-2020 B1 BD Invoice date updated 20-6-2020 21-6-2020

Billing documents (BD)

BD id Date created Document type Clearing date

B1 20-5-2020 Invoice 31-5-2020

B2 24-5-2020 Invoice 5-6-2020

Delivery documents (DD)

DD id Date created Reference SD id Reference BD Document type Picking date

D1 18-5-2020 S1 B1 Delivery 31-5-2020

D2 22-5-2020 S1 B2 Delivery 5-6-2020

D3 25-5-2020 S2 B2 Delivery 5-6-2020

D4 12-6-2020 S3 null Return Delivery NULL

Sales documents (SD)

SD id Date created Reference id Document type Value Last change

S1 16-5-2020 null Sales Order 100 10-6-2020

S2 17-5-2020 null Sales Order 200 31-5-2020

S3 10-6-2020 S1 Return Order 10 NULL

F1

F2

F4 F3

Parent

table

Child

table

Fig. 1: The tables of the simplified OTC example

(Sales Order)

(Delivery)

(Invoice)

(Delivery)

(Invoice)

(Return Order)

(Return Delivery)

(Sales Order) (Invoice) (Delivery)

Divergence

Convergence

7 events “Created” related to S1

3 events “Created” related to S2

(a)

Legend:

Return delivery

Invoice

Sales order

created2

Delivery

created3

created2

Return order

created1

created1

2

1

2

1

1

artifact

Event type

Causal relationor interaction

Deviating interaction

Sales orderSales order created2

Deliverycreated

3

Invoicecreated

3

Return order

created1

Return deliverycreated

1

1

1

2

2

1 1

(b)

(c)

Fig. 2: Creation of documents of Fig. 1 along time (a), a classic process model based on a single caseidentifier (b), and an artifact-centric model based on 5 case identifiers (c).

instance, in Enterprise Resource Planning (ERP) systems (such as SAP and Oracle En-terprise), information is stored in business objects (or documents) which are linked viaone-to-many and many-to-many relations, typically in the relational database. The objectsthemselves are encapsulated in services [AMZ00] which are invoked by high-level end-to-end business processes; each invocation is called a transactions which is logged in thedata object itself.

Fig. 1 shows a simplified example of the transactional data of an Order to Cash (OTC)process supported by SAP systems; Fig. 2(a) visualizes the events of Fig. 1 that are relatedto document creation. There are two sales orders S1 and S2; creation of S1 is followedby creation of a delivery document D1, an invoice B1, another delivery document D2, andanother invoice B2 which also contains billing information about S2. Creation of S2 isalso followed by creation of another delivery document D3. Further, there is a return orderS3 related to S1 with its own return delivery document D4. The many-to-many relationsbetween documents surface in the transactional data of Fig. 1: a sales document can berelated to multiple billing documents (S1 is related to B1 and B2) and a billing documentcan be related to multiple sales document (B2 is related to S1 and S2). This behavioralready contains an unusual flow: delivery documents were created twice before the billingdocument (main flow), but once the order was reversed (B2 before D3).

When applying classical process mining techniques, one first has to extract an event logbased on a single case identifier to which all event data can be related. Choosing SD idin Fig. 1 leads to the two sequences of events shown in Fig. 2(a). Process discovery on

2

Page 3: Discovering Interacting Artifacts from ERP Systems ...ceur-ws.org/Vol-1701/paper2.pdfJan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft fur Informatik,

this log yields the model of Fig. 2(b) which is wrong: two invoices are created beforetheir deliveries instead of one, and three invoices are created instead of two (known asdivergence and convergence, respectively) [Pi11].

2 Approach: Discovering Artifact-Centric Models

We propose to approach the problem under the “conceptual lens” of artifact-centric mod-els [CH09]. An artifact is a data object over an information model; each artifact instanceexposes services that allow changing its informational contents; a life-cycle model gov-erns when which service of the artifact can be invoked; the invocation of a service in oneartifact may trigger the invocation of another service in another artifact. Information mod-els of different artifacts can be in one-to-many and many-to-many relations allowing todescribe behavior over complex data in terms of multiple objects interacting via serviceinvocations. Under this lens, each document of an ERP system can be seen as an arti-fact; a transaction on a document is a service call on the artifact; behavioral dependenciesbetween transactions of documents can be seen as life-cycle behavior and dependenciesof service calls. Describing the transactional data of Fig. 1 with artifact-centric conceptsyields the model of Fig. 2(c); it visualizes the order in which objects are created and alsohighlights the unusual flow of invoice B2 being created before delivery D2.

Case A1Case A2

Case An

Case B1Case B2

Case Bm

Case C1Case C2

Case Ck

Data Source

Database Schema

Artifact Type

Event Log

1.2 Extract

1.1 Discover

2.1 Discover

Type-Level Interaction

2.2 Add casereferences

Life-Cycle Models...

2.3 Discover

+ Activity-Level Interactions

1.3 Discover

Fig. 3: An overview on our approach.

The problem of discovering an artifact-centricprocess model from relational ERP data de-composes into two sub-problems. (1) Given arelational data source, identify a set of artifacts,extract for each artifact an event log, and dis-cover a model of its life-cycle. (2) Given a setof artifacts and their data source, identify inter-actions between the artifacts, between their in-stances, between their event types and betweentheir events. Figure 3 shows the overview ofour approach. (1.1) We use the data schema ofthe data source to discover artifact types whichdetail all timestamped columns related to a par-ticular business object. (1.2) For each artifactwe then extract a classical event log [Aa11],each case describes all events related to oneinstance of the artifact. (1.3) Existing processdiscovery algorithms allow discovering a life-cycle model of the artifact. In parallel, (2.1)we discover interactions between artifacts fromforeign key relations in the data source; (2.2)during log extraction, each case of an artifact isannotated with references to cases of other artifacts this case interacts with. (2.3) The casereferences are refined into interactions between activities of different artifact life-cycles.

3

Page 4: Discovering Interacting Artifacts from ERP Systems ...ceur-ws.org/Vol-1701/paper2.pdfJan Mendling and Stefanie Rinderle-Ma, eds.: Proceedings of EMISA 2016, Gesellschaft fur Informatik,

3 Results

We implemented our approach based on [NvDF12] and conducted two case studies. Byseparating data into artifacts along one-to-many relations, we eliminated divergence andconvergence, the interaction flows discovered from one-to-many relations were meaning-ful to business users, and unusual flows were detected.

8

1

1472

1

6

341

346

53

59

142

18

17

90

108

138

1

141

13

3

257

22

65

47

2

278

1360

1

2439

23

907

1

46

620

597

1

1

212

1

1

1

2

22

10

12

6

1

2

1

15

107

78

26

31

7

12

371

316

2

11

49

9

15

22 1

89

1

3

18

1298

854

354

42

5 9

1

14

15

1140

646

39

46

55

641

14

304

271

568

734

625

4

8

49

7

5

6

Created

2581

Delivery H_Created

5118

Payment05or15_Payment Received

822

PostInAR_PostedInAR

3479

Invoice H_Created

2629

InvoiceCancellation H_Created

40

Contract H_Created

741

ProFormaInvoice H_Created

730

ReturnDelivery H_Created

1

CreditMemo H_Created

34

ReturnOrder H_Created

1

CreditMemoRequest H_Created

33

DebitMemoRequest H_Created

2

DebitMemo H_Created

14

Fig. 4: SAP OTC process, classical process modelobtained from single event log (top) and artifact-centric model highlighting outliers (bottom)

Fig. 4 shows models obtained from 2months data of an SAP Order-to-Cash pro-cess (11 document header tables, 134,826records of 5-49 attributes); the model at thetop was obtained with a classical approach(only 29 of 77 edges are correct); themodel at the bottom was obtained usingour approach. In both case studies the dis-covered process models were assessed asaccurate graphical representations of thesource data by domain experts; all edgesincluding outlier edges were assessed ascorrect and traced back to the source datatogether with domain experts. These in-sights could be obtained exploratively andmuch faster than with existing best practices.

References[Aa11] Aalst, W.M.P. van der: Process Mining: Discovery, Conformance and Enhancement of

Business Processes. Springer, 2011.

[AMZ00] Al-Mashari, Majed; Zairi, Mohamed: Supply-chain re-engineering using enterprise re-source planning (ERP) systems: an analysis of a SAP R/3 implementation case. IJPDLM,30(3/4):296–313, 2000.

[CH09] Cohn, D.; Hull, R.: Business artifacts: A data-centric approach to modeling businessoperations and processes. Bulletin of the IEEE Computer Society TCDE, 32(3):3–9,2009.

[Ec15] van Eck, Maikel L.; Lu, Xixi; Leemans, Sander J. J.; van der Aalst, Wil M. P.: PM ˆ2: A Process Mining Project Methodology. In: CAiSE 2015. volume 9097 of LNCS.Springer, pp. 297–313, 2015.

[Lu15] Lu, Xixi; Nagelkerke, Marijn; van de Wiel, Dennis; Fahland, Dirk: Discovering Inter-acting Artifacts from ERP Systems. IEEE Trans. Services Computing, 8(6):861–873,2015.

[NvDF12] Nooijen, Erik H. J.; van Dongen, Boudewijn F.; Fahland, Dirk: Automatic Discoveryof Data-Centric and Artifact-Centric Processes. In: DAB’12. volume 132 of LNBIP.Springer, pp. 316–327, 2012.

[Pi11] Piessens, D.A.M.: Event Log Extraction from SAP ECC 6.0. Master’s thesis, EindhovenUniversity of Technology, 2011.

4