representing and visualizing mined artful processes in m ail o f m ine claudio di ciccio, massimo...
TRANSCRIPT
Representing and Visualizing Mined Artful Processes in MailOfMine
Claudio Di Ciccio, Massimo Mecella, Tiziana Catarci
HCI-KDD @ USAB 2011: Information Quality in e-HealthGraz, 2011, November the 24th
Motivation (1)
• Artful processes [HillEtAl06]– informal processes typically
carried out by those people whose work is mental rather than physical (managers, professors, researchers, engineers, etc.)
• “knowledge workers”[ACTIVE09]
• Knowledge workers create artful processes “on the fly”
• Though artful processes are frequently repeated, they are not exactly reproducible, even by their originators, nor can they be easily shared.
Artful processes and knowledge workers
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 2
Motivation (2)
• In collaborative contexts, knowledge workers share their information and outcomes with other knowledge workers
– E.g., a software development mgr.
• Typically, by means of several e-mail conversations
– E-mail conversations are actual traces of running processes that knowledge workers adhere to
E-mail conversations
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 3
Motivation (3)
• From the collection of e-mail messages, you can extract the processes that lay behind– Related e-mail conversations are traces of their runs
• Valuable advantages for users– Automated discovery of formal representations
• with no effort for knowledge workers
– Tidy organization for naïve best practices kept only in mind
– Opportunity to share and compare the knowledge on methodologies
– Automated discovery of bottlenecks, delays, structural defects• from the analysis of previous runs
• E-mail conversations are a kind of semi-structured text– this approach is not tailored to the electronic mail
• it can be extended to the analysis of other semi-structured texts
Processes from e-mail conversations
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 4
Motivation (4)
• Personal information management (PIM)– how to organize one’s own activities, contacts, etc. through the
usage of software [CatarciEtAl07, ACTIVE09]
• Information warfare– in supporting anti-crime intelligence agencies
• Enterprise engineering– for knowledge-heavy industries, where preserving documents
making up product data is not enough[SmVortex, Heutelbeck11]
• eHealth– for the automatic discovery of medical treatment procedures on
top of patient health records
Some areas of applicability
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 5
The approach
• Representation– Declarative workflows [vanDerAalstEtAl09] for representing artful
processes
– Regular grammars to express declarative workflows constraints
• Mining– Object Matching [ZardettoEtAl10] for
• clustering e-mail conversations• finding the matching between activity and tasks instances
– Regular expression mining [GarofalakisEtAl99] for inferring constraints
– Supervised learning to group activities into processes
– Text mining information extraction to determine tasks out of e-mail messages [CohenEtAl04, SakuraiEtAl05]
How to represent and infer artful processes
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 6
Algorithm (1)From the e-mail archive to key parts
Mail archive Mail Database Conversations KeyParts
Multi-format mail storageplug-in based crawlers
[ZardettoEtAl10]-basedclustering algorithm
[CarvalhoEtAl04]-based filter
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 7
Algorithm (2)From key parts to the activities
Activityindicium
Tasks
Key PartsConcatenation
[ZardettoEtAl10]-based
Activities
[GarofalakisEtAl99]-based
pattern miner
[CohenEtAl04,SakuraiEtAl05]
-based task extractor
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 8
Algorithm (3)From activities to the processes
[GarofalakisEtAl99]-based
pattern miner
Supervised learning
Process indicium
Process
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 9
On the visualization of processes
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 10
The imperative model
• Represents the whole process at once
• The most used notation is based on a subclass of Petri Nets (namely, the Workflow Nets)
On the visualization of processes
• Rather than using a procedural language for expressing the allowed sequence of activities, it is based on the description of workflows through the usage of constraints• the idea is that every task
can be performed, except the ones which do not respect such constraints
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 11
The declarative modelIf A is performed,
B must be perfomed,no matter
before or afterwards(responded existence)
Whenever B is performed,C must be performed
afterwardsand B can not be repeated
until C is done(alternate response)
The notation here is based on [VanDerAalstEtAl06] (DecSerFlow)
On the visualization of processes
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 12
Imperative vS declarative
Imperative
Declarative
Declarative models work better in presence of a partial specification of the process scheme
Representation of artful processesAn example of expected outcome
Existenceconstraints
Relationconstraints
Tasks
Notation based on [VanDerAalstEtAl06] (DecSerFlow)
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 13
On the visualization of processesAn example of DecSerFlow [VanDerAalstEtAl06] notation
No, it is not the initialaction
You could even start from here
• You might want to run a legal trace like this:
⟨ a3, a3, a3, a2, a2, a3, a4, a5, a6, a7, a6, a5 ⟩• What we want to state here is that such a notation is probably not quite intuitive
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 14
On the visualization of processesOur proposal
• We do not consider a static graph-based global representation alone the best suitable solution.
• A graphical representation, easy to understand at a first glimpse, must be used.
• Idea:– when presenting the process schema (static view):
1) a local view on tasks/activities, showing related constraints only;
2) a global view on the process, either:a) basic (less information, less symbols), or
b) extended (more information, more symbols, extending (a));
– (2) can work as a kind of navigation map for (1)
1) when presenting the running instance (dynamic view):1) a dynamic interactive trace representation diagram, based on the
local static view notation.
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 15
On the visualization of processesIntroducing the new local view: the rationale
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 16
On the visualization of constraintsThe static local view: some examples
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 17
On the representation of processesThe static global view
Basic Extended
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 18
A GUI sketchLocal and global views together
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 19
On the representation of constraintsDynamic view
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 20
References
• [BranderEtAl11] Brander, S., Hinkelmann, K., Hu, B., Martin, A., Riss, U. V., Thönssen, B., Witschel, H. F.. Refining process models through the analysis of informal work practice. In BPM, Lecture Notes in Computer Science (6896), 116–131. Springer (2011).
• [HillEtAl06] Hill, C., Yates, R., Jones, C., Kogan, S.L.: Beyond predictable workflows: Enhancing productivity in artful business processes. IBM Systems Journal 45(4), 663–682 (2006)
• [ACTIVE09] Warren, P., Kings, N., et al.: Improving knowledge worker productivity - the active integrated approach. BT Technology Journal 26(2), 165–176 (2009)
• [CatarciEtAl07] Catarci, T., Dix, A., Katifori, A., Lepouras, G., Poggi, A.: Task-centred information management. Proc. DELOS Conference, LNCS 4877 (2007)
• [SmVortex] Smart vortex – management and analysis of massive data streams to support large-scale collaborative engineering projects. FP7 IP Project: http://www.smartvortex.eu/
• [Heutelbeck11] Heutelbeck, D.: Preservation of enterprise engineering processes by social collaboration software (2011), personal communication, to appear in Proc. COLLIN2011 - 2nd Symposium on Collective Intelligence
• [vanDerAalstEtAl09] van der Aalst, W.M.P., Pesic, M., Schonenberg, H.: Declarative workflows: Balancing between flexibility and support. Computer Science - R&D 23(2), 99–113 (2009)
• [ZardettoEtAl10] Zardetto, D., Scannapieco, M., Catarci, T.: Effective automated object matching. Proc. ICDE 2010
• [CohenEtAl04] Cohen, W.W., Carvalho, V.R., Mitchell, T.M.: Learning to classify email into “speech acts”. Proc. EMNLP 2004
• [SakuraiEtAl05] Sakurai, S., Suyama, A.: An e-mail analysis method based on text mining techniques. Appl. Soft Comput. 6(1), 62–71 (2005)
• [CarvalhoEtAl04] Carvalho, V.R., Cohen, W.W.: Learning to extract signature and reply lines from email. Proc. CEAS 2004
• [GarofalakisEtAl99] Garofalakis, M.N., Rastogi, R., Shim, K.: Spirit: Sequential pattern mining with regular expression constraints. Proc. VLDB 1999
• [vanDerAalstEtAl06] van der Aalst, W.M.P., Pesic, M.: Decserflow: Towards a truly declarative service flow language. Proc. WS-FM 2006
• [Pnueli77] Pnueli, A.: The Temporal Logic of Programs. Proc. 18th Annual Symposium on Foundations of Software Technology and Theoretical Computer Science, 1977
Cited articles and resources, in order of appearance
Representing and Visualizing Mined Artful Processes in MailOfMineClaudio Di Ciccio (DIS, SAPIENZA – Università di Roma) P. 21