visual analytics meets process mining: challenges and...
Post on 20-Jun-2019
Visual Analytics Meets Process Mining: Challenges and Opportunities
Theresia Gschwandtner and Silvia Miksch
Information Overload
[Howson, 2008]
[Aigner - presentation 2015]
Image: Brett Ryder (The Economist, 2010)
Information Overload
Electronic health records
Real-time sensors
Communication logs
Transactions
Motivation: An Information Gap
Somewhere in the data there is valuable information.
[Card et al., 1999]
One Approach: Machine Learning & Data Mining
Tap the power of computers
Statistical analysis
Reports
[Card et al., 1999]
well grounded field
Descriptive statistics
Confirmatory data analysis
main challenge: find a model
Bayesian statistics
Exploratory data analysis [Tukey, 1977]
visual exploration methods
insights about what the data looks like → find a model
Statistics
[Fekete et al., 2008]
Another Approach: Visualization
Tap the power of human perception
Complex view of the data
Interactive controls to explore data and see patterns
[Card et al., 1999]
WHY VISUALIZATION?
Anscombe's Quartet
[http://en.wikipedia.org/wiki/Anscombe's_quartet; Anscombe, 1973]
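The point of the quartet can be checked numerically. The sketch below uses the published values from [Anscombe, 1973] and computes the same summary statistics for all four data sets; the helper names (`mean`, `pearson`, `summarize`) are illustrative and not from the slides. The summaries come out nearly identical, although the four scatter plots look completely different.

```python
from math import sqrt

# Anscombe's quartet; sets I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(xs, ys):
    """Summary statistics that hide the differences between the four sets."""
    return {"mean_x": mean(xs), "mean_y": mean(ys), "r": pearson(xs, ys)}


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, s in SUMMARIES.items():
        print(f"{name:>3}: mean_x={s['mean_x']:.2f} mean_y={s['mean_y']:.2f} r={s['r']:.3f}")
```

Only a plot reveals that one set is roughly linear, one is curved, and two are dominated by a single outlier.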
Goal: automatically find interesting facts in the data
Hertzsprung-Russell Diagram
[Fekete et al., 2008]
X-axis: temperature of stars
Y-axis: their magnitude
Let's play a game: the game of 15. The ‘pieces’ for the game are the nine digits - 1, 2, 3, 4, 5, 6, 7, 8, 9. Each player takes a digit in turn. Once a digit is taken, it cannot be used by the other player. The first player to get three digits that sum to 15 wins.
Player A takes 8
Player B takes 2
Player A takes 4
Player B takes 3
Player A takes 5
Question 1: Suppose you are now to step in and play for B. What move would you make?
Game of 15
[Norman, 1993]
Tic-Tac-Toe
[Figure: tic-tac-toe board with X and O marks]
[Norman, 1993]
Game of 15
[Norman, 1993]
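The position above can also be worked out mechanically. The following sketch is an illustration, not part of the original slides: it lists the digits that would complete a sum of 15 for each player and then places the same position on a 3×3 magic square, where every row, column, and diagonal sums to 15. The function names and the particular orientation of the magic square are assumptions.

```python
from itertools import combinations

DIGITS = range(1, 10)
TARGET = 15


def winning_digits(own, taken):
    """Free digits that complete a three-digit sum of 15 together with two digits in `own`."""
    free = set(DIGITS) - set(taken)
    return sorted(
        d for d in free
        if any(a + b + d == TARGET for a, b in combinations(own, 2))
    )


# Position from the slide: A holds 8, 4, 5 and B holds 2, 3.
a_digits = [8, 4, 5]
b_digits = [2, 3]
taken = a_digits + b_digits

b_wins_now = winning_digits(b_digits, taken)  # digits that win for B immediately
a_threats = winning_digits(a_digits, taken)   # digits B has to take away from A

# One orientation of the 3x3 magic square (rows, columns, diagonals sum to 15).
MAGIC_SQUARE = [[2, 7, 6], [9, 5, 1], [4, 3, 8]]


def render(square, a, b):
    """Draw the position as a tic-tac-toe board: X for player A, O for player B."""
    def mark(d):
        return "X" if d in a else "O" if d in b else "."
    return [" ".join(mark(d) for d in row) for row in square]


if __name__ == "__main__":
    print("B can win with:", b_wins_now)
    print("B must block:", a_threats)
    print("\n".join(render(MAGIC_SQUARE, a_digits, b_digits)))
```

Seen as digits, the threat has to be computed; seen as a board, the open diagonal is visible at a glance, which is the argument for choosing a good external representation.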
[Few, 2006]
High bandwidth
Fast, parallel
Pattern recognition
Pre-attentive
Expand human working memory
“The eye... the window of the soul, is the principal means by which the central sense can most completely and abundantly appreciate the infinite works of nature.”
Leonardo da Vinci (1452 – 1519)
Human Vision
EXTERNAL COGNITION
»The power of the unaided mind is highly overrated. […] How have we increased memory, thought, and reasoning? By the invention of external aids: It is things that make us smart.«
[Norman, 1993, p. 43]
External Cognition
34 × 72
68 + 2380 = 2448
[Figure: bar chart of time to multiply (sec), mental vs. paper & pencil]
[Card et al., 1999]
Visualization Success Story
Mystery: What is causing a cholera epidemic in London in 1854?
[Tufte, 1997], adapted from [Hearst, 2004]
London 1854
Analytical Reasoning Process
[Thomas and Cook, 2005]
Why Visualization?
Increasing cognitive resources
such as by using a visual resource to expand human working memory
Reducing search
such as by representing a large amount of data in a small space
Enhancing the recognition of patterns
such as when information is organized in space by its time relationships
Supporting the easy perceptual inference of relationships
that are otherwise more difficult to induce
Perceptual monitoring of a large number of potential events
Providing a manipulable medium
that, unlike static diagrams, enables the exploration of a space of parameter values
[Card et al., 1999]
Method
[Aigner - presentation 2015]
MACHINE LEARNING & DATA MINING
Machine Learning & Data Mining
[Aigner et al., 2011]
Visualization vs. Computation
[http://infoproc.blogspot.co.at/2013/06/spy-vs-spy.html]
VISUAL ANALYTICS
COMBINATION OF VISUAL AND ANALYTICAL METHODS
[http://infoproc.blogspot.co.at/2013/06/spy-vs-spy.html]
Screen resolution: 1600 × 900 = 1,440,000
Yearly measurements of water level in Lower Austria (1): 5,256,000
Number of cellular phones in Austria (2005) (2): 8,160,000
Transmitted emails every hour (world-wide) (3): 35,388,000
Whole data often not presentable
Statistics, Machine Learning & Data Mining
Most important data and information
Results
Huge Amounts of Data vs. Screen Resolution
1 ... Amt der NÖ Landesregierung, Abt. WA5 - Hydrologie, http://www.noel.gv.at/SERVICE/WA/WA5/htm/wnd.htm
2 ... CIA Factbook, https://www.cia.gov/cia/publications/factbook/
3 ... How Much Information?, UC Berkeley, http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/
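To make the mismatch concrete, the sketch below computes how many data values compete for each pixel and shows one common way to reduce a long series before drawing it: keeping the minimum and maximum per pixel column. This is an illustrative aggregation, not a method named on the slides; `points_per_pixel` and `minmax_per_column` are hypothetical helper names.

```python
def points_per_pixel(n_points, width=1600, height=900):
    """Average number of data values per pixel if every pixel showed data."""
    return n_points / (width * height)


def minmax_per_column(values, n_columns):
    """Reduce a series to at most n_columns (min, max) pairs, one per pixel column."""
    if n_columns <= 0:
        raise ValueError("n_columns must be positive")
    n = len(values)
    columns = min(n_columns, n)  # never create empty columns
    result = []
    for c in range(columns):
        # Integer arithmetic splits the series into nearly equal, non-empty chunks.
        chunk = values[c * n // columns:(c + 1) * n // columns]
        result.append((min(chunk), max(chunk)))
    return result


if __name__ == "__main__":
    # 5,256,000 water-level measurements on a 1600 x 900 screen.
    print(points_per_pixel(5_256_000))
    # A toy series of 10 values squeezed into 5 pixel columns.
    print(minmax_per_column(list(range(10)), 5))
```

Even a screen filled edge to edge would have to show several measurements per pixel, so some analytical reduction has to happen before the data reach the display.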
Visual Analytics
“Visual analytics is the science of analytical reasoning facilitated by interactive visual interfaces.”
[Thomas and Cook, 2005]
Visual Analytics – Process
[Keim et al., 2008]
Visual Analytics
Data Mining vs InfoVis Analytic Process
[Bertini and Lalanne, 2010]
INTERACTIVITY
“Interaction between human and computer is at the heart of modern information visualization and for a single overriding reason: the enormous benefit that can accrue from being able to change one's view of a corpus of data. […]
Those who wish to acquire insight must explore, interactively, subsets of that corpus to find their way towards the view that triggers an 'a ha!' experience.”
[Spence, 2007]
Interactivity
Past
Only passive observations
Representation not changeable
“one fits all”
Today
Active examination with visualizations
Dynamically adaptable and modifiable
→ Different users, goals, and data
Interactivity
[http://www.sxc.hu]
[http://www.google.com]
Interaction Taxonomy
[Yi et al., 2007] and [Raskin, 2000]
Indicate
Select
Explore
Reconfigure
Encode
Abstract/Elaborate
Filter
Connect
Activate
Modify
InfoScope
Indicate: show me where I am pointing at
[Brodbeck and Girardin, 2003]
Select: mark something as interesting
[Brodbeck and Girardin, 2003]
InfoScope
Explore: show me something else
Overview + Detail, Zooming + Panning
[Aigner and Miksch, 2006]
[Bade et al., 2004]
Gravi++
Reconfigure: show me a different arrangement
[Hinum et al., 2006]
Encode: show me a different representation
Multiple Views: Brushing & Linking
[Baldonado et al., 2000]
Displaying detailed information about data case(s)
Abstract/Elaborate: show me more or less detail
[Weishapl and Aigner, 2007]
Filter: show me something conditionally
[Shneiderman, 1994 ff]
Connect: show me related items
[Brodbeck and Girardin, 2003]
Activate: trigger actions
[Rind et al., 2010]
VisuExplore
generate
delete
move
transform
copy
Modify: manipulate elements
Value of Interaction
Reduction of distance
Reducing the gulfs of execution and evaluation
[Norman, 1988]
Value of Interaction
Reduction of distance
Reducing the gulfs of execution and evaluation
Reduction of cognitive load
Cognitive offloading, external anchoring, information foraging
Higher engagement
Feeling of being in control / first person-ness
Higher expressiveness of the user interface language
Richer possibilities for input and output
VISUAL APPROACHES FOR PROCESS MINING
C9: Combining Process Mining With Other Types of Analysis
“By combining automated process mining techniques with interactive visual analytics, it is possible to extract more insights from event data.”
[van der Aalst et al., 2011]
Process Mining Tasks
[van der Aalst et al., 2011]
Process Discovery
[Günther and van der Aalst, 2007]
Fuzzy Miner
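As a rough illustration of what process discovery starts from, the sketch below counts how often one activity directly follows another in an event log and drops rare edges. This is a deliberately simplified stand-in and not the Fuzzy Miner algorithm of [Günther and van der Aalst, 2007]; the example log, the activity names, and the threshold are assumptions.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


def prune(counts, min_count):
    """Keep only edges observed at least min_count times (a crude simplification step)."""
    return {edge: n for edge, n in counts.items() if n >= min_count}


# Hypothetical event log: one list of activities per case.
LOG = [
    ["register", "check", "decide", "notify"],
    ["register", "check", "recheck", "check", "decide", "notify"],
    ["register", "decide", "notify"],
]

EDGES = directly_follows(LOG)
FREQUENT_EDGES = prune(EDGES, min_count=2)

if __name__ == "__main__":
    for (a, b), n in sorted(FREQUENT_EDGES.items()):
        print(f"{a} -> {b}: {n}")
```

The pruned edge list is the kind of graph that a node-link view would then draw, with the threshold exposed as an interactive control.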
Outflow: aggregated temporal event data
Process Discovery
[Wongsuphasawat, 2011]
Process Discovery
Glyph for a reoccurring graph pattern [Maguire et al., 2013]
Process Discovery
Different views for a process model: logical and time based
[Hipp et al., 2012]
Process Discovery
[Vrotsou et al., 2009]
ActiviTree
Process Conformance
[Adriansyah et al., 2011]
Process Conformance
[http://www.processmining.org/online/conformance_checker]
Process Enhancement [Rozinat, 2009]
Disco
ProM Performance Analysis Plugin
Color: waiting time
Labels: transition probabilities
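The two quantities encoded in such a view can be derived from a timestamped log. The sketch below is an assumption-laden illustration, not the Disco or ProM implementation: it computes, for every transition, the share of cases leaving the source activity along it and the mean waiting time. The log and the time unit (minutes) are hypothetical.

```python
from collections import defaultdict


def edge_statistics(traces):
    """Per-edge transition probability and mean waiting time.

    traces: list of cases, each a time-ordered list of (activity, timestamp) pairs.
    """
    waits = defaultdict(list)    # (a, b) -> waiting times between a and b
    outgoing = defaultdict(int)  # a -> number of transitions leaving a
    for trace in traces:
        for (a, t_a), (b, t_b) in zip(trace, trace[1:]):
            waits[(a, b)].append(t_b - t_a)
            outgoing[a] += 1
    return {
        edge: {
            "probability": len(w) / outgoing[edge[0]],
            "mean_wait": sum(w) / len(w),
        }
        for edge, w in waits.items()
    }


# Hypothetical log with timestamps in minutes since the start of each case.
TIMED_LOG = [
    [("register", 0), ("check", 10), ("decide", 40)],
    [("register", 0), ("check", 20), ("decide", 30)],
    [("register", 0), ("check", 30), ("decide", 50)],
    [("register", 0), ("decide", 60)],
]

STATS = edge_statistics(TIMED_LOG)

if __name__ == "__main__":
    for (a, b), s in sorted(STATS.items()):
        print(f"{a} -> {b}: p={s['probability']:.2f}, mean wait={s['mean_wait']:.1f} min")
```

In a visualization, `mean_wait` would drive the edge color and `probability` the edge label.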
Process Enhancement
[http://www.processmining.org]
Process Enhancement
[van der Aalst, 2011]
Potential of Visual Analytics
Visualizations, interactions & mining techniques to
Explore the event log data beforehand
Understand data
Identify interesting behaviour
Investigate and fine-tune process model
EXAMPLES
Guideline Conformance
Actions scheduled by Clinical Practice Guideline vs. actions executed by caregiver
Valid Action:
Applied correctly (conditions fulfilled)
Invalid Action:
Applied by caregiver, but not according to guideline
Missing Action/Missing Action Interval:
Missing in the execution (although scheduled)
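The three categories can be sketched as a simple comparison of scheduled and executed actions. The example below assumes that the only guideline condition is a time window per scheduled action, which is a strong simplification of real guideline conditions; the action names and the function `check_conformance` are hypothetical and not taken from [Bodesinsky et al., 2013].

```python
def check_conformance(scheduled, executed):
    """Classify actions as valid, invalid, or missing.

    scheduled: list of (action, earliest, latest) tuples from the guideline.
    executed:  list of (action, time) tuples recorded for the caregiver.
    """
    valid, invalid = [], []
    fulfilled = set()  # indices of scheduled actions that were carried out
    for action, time in executed:
        match = next(
            (i for i, (name, earliest, latest) in enumerate(scheduled)
             if i not in fulfilled and name == action and earliest <= time <= latest),
            None,
        )
        if match is None:
            invalid.append((action, time))  # applied, but not according to guideline
        else:
            fulfilled.add(match)
            valid.append((action, time))    # applied within its scheduled window
    missing = [s for i, s in enumerate(scheduled) if i not in fulfilled]
    return {"valid": valid, "invalid": invalid, "missing": missing}


# Hypothetical guideline schedule and execution log (times in hours).
SCHEDULED = [("measure", 0, 4), ("treat", 4, 8), ("measure", 12, 16)]
EXECUTED = [("measure", 2), ("treat", 10)]

RESULT = check_conformance(SCHEDULED, EXECUTED)

if __name__ == "__main__":
    for category, items in RESULT.items():
        print(category, items)
```

The missing entries keep their time window, which corresponds to a missing action interval that a temporal view can draw as a span.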
Guideline Conformance
[Bodesinsky et al., 2013]
Guideline Conformance
Temporal view of executed actions
Valid Actions as diamond
Invalid Actions marked with “X”
Time spans (missing action execution): connected upper and lower parallel bars
[Bodesinsky et al., 2013]
Guideline Conformance
Highlighting of time points
Vertical line for quick identification
[Bodesinsky et al., 2013]
Plan Strips
[Seyfang et al., 2012]
EventExplorer
books – sports & wellness
sports
sportswear
[Bodesinsky et al., 2015]
EventExplorer
view
related item
buy
[Bodesinsky et al., 2015]
EventExplorer: Temporal Scale
[Bodesinsky et al., 2015]
EventExplorer: Pattern Mining
Automatic
Select minimum frequency and length of patterns of interest
Overview of occurring patterns
Interactive
Select sequence of interest and count re-occurrences
Search for a specific pattern
Use of wildcards
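Both modes can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the EventExplorer implementation: patterns are contiguous event sequences, frequency is the number of occurrences across all sessions, and a wildcard matches exactly one event. The sessions reuse the event types view, related item, and buy; the function names are hypothetical.

```python
from collections import Counter

WILDCARD = "*"  # assumed to match exactly one arbitrary event


def frequent_patterns(sessions, min_length, max_length, min_frequency):
    """Automatic mode: contiguous patterns of the given lengths that occur often enough."""
    counts = Counter()
    for session in sessions:
        for length in range(min_length, max_length + 1):
            for i in range(len(session) - length + 1):
                counts[tuple(session[i:i + length])] += 1
    return {p: n for p, n in counts.items() if n >= min_frequency}


def count_matches(sessions, pattern):
    """Interactive mode: count occurrences of one pattern, wildcards allowed."""
    k = len(pattern)
    if k == 0:
        return 0
    return sum(
        all(p == WILDCARD or p == e for p, e in zip(pattern, session[i:i + k]))
        for session in sessions
        for i in range(len(session) - k + 1)
    )


# Hypothetical sessions.
SESSIONS = [
    ["view", "related item", "buy"],
    ["view", "view", "buy"],
    ["view", "related item", "view", "related item", "buy"],
]

FREQUENT = frequent_patterns(SESSIONS, min_length=2, max_length=3, min_frequency=2)
WILDCARD_HITS = count_matches(SESSIONS, ("view", WILDCARD, "buy"))

if __name__ == "__main__":
    for pattern, n in sorted(FREQUENT.items(), key=lambda item: -item[1]):
        print(" > ".join(pattern), n)
    print("view > * > buy:", WILDCARD_HITS)
```

Counting every substring is fine for a sketch but grows quickly with session length, which is one reason to let the analyst restrict frequency and length interactively.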
Other Features
Grouping events
Grouping similar events
E.g., ‘Indian restaurant’ and ‘Vietnamese restaurant’ into ‘exotic restaurant’
Sorting sessions
Time
Pattern count
Sequence length
Display other attributes
Filtering
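Grouping and sorting are simple transformations on the session data. The sketch below assumes that a session is a dictionary with a start time and a list of events, and that groups are defined by a lookup table; these structures, the extra event type cinema, and the function names are illustrative assumptions.

```python
# Assumed lookup table: specific event type -> group label.
EVENT_GROUPS = {
    "indian restaurant": "exotic restaurant",
    "vietnamese restaurant": "exotic restaurant",
}


def group_events(events, groups=EVENT_GROUPS):
    """Replace each event by its group label, if it has one."""
    return [groups.get(event, event) for event in events]


def count_pattern(events, pattern):
    """Number of contiguous occurrences of pattern in one session."""
    k = len(pattern)
    if k == 0:
        return 0
    return sum(events[i:i + k] == list(pattern) for i in range(len(events) - k + 1))


def sort_sessions(sessions, by, pattern=()):
    """Sort by start time (ascending) or by pattern count / length (descending)."""
    keys = {
        "time": lambda s: s["start"],
        "pattern count": lambda s: -count_pattern(s["events"], pattern),
        "sequence length": lambda s: -len(s["events"]),
    }
    return sorted(sessions, key=keys[by])


# Hypothetical sessions; grouping is applied before sorting.
RAW = [
    {"id": "s1", "start": 30, "events": ["indian restaurant", "cinema"]},
    {"id": "s2", "start": 10,
     "events": ["vietnamese restaurant", "cinema", "indian restaurant", "cinema"]},
    {"id": "s3", "start": 20, "events": ["cinema"]},
]
GROUPED = [{**s, "events": group_events(s["events"])} for s in RAW]

if __name__ == "__main__":
    for criterion in ("time", "pattern count", "sequence length"):
        ordered = sort_sessions(GROUPED, criterion, pattern=("exotic restaurant", "cinema"))
        print(criterion, [s["id"] for s in ordered])
```

Grouping first matters: the pattern exotic restaurant followed by cinema only becomes countable once the two restaurant types share a label.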
Open Issues
Visualization
Scalability and aggregation
Better visual pattern detection
Interactions
Ordering sessions by similarity
Different vertical alignments (sequential and temporal)
Pattern mining
Fuzzy pattern mining
Pattern mining with temporal distances
FURTHER CHALLENGES
Data quality and uncertainty
Complex and less structured logs
Case heterogeneity
Event granularity
Concept drift
Missing, incorrect, imprecise, uncertain, irrelevant data, noise
van der Aalst et al.: finding, merging, and cleaning event data
[Bose et al., 2013] [van der Aalst et al., 2011]
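Some of these problems can be flagged automatically before any mining or visualization takes place. The sketch below checks three of them (missing timestamps, duplicate events, and events recorded out of order within a case); the event structure and the issue labels are assumptions, and the list of checks is far from complete.

```python
def find_quality_issues(events):
    """Return (index, issue) pairs for a list of event dictionaries.

    Each event is assumed to have the keys 'case', 'activity', and 'timestamp'
    (a number, or None if the timestamp was not recorded).
    """
    issues = []
    seen = set()  # (case, activity, timestamp) triples already encountered
    latest = {}   # case -> latest timestamp seen so far
    for index, event in enumerate(events):
        case, activity, ts = event["case"], event["activity"], event["timestamp"]
        if ts is None:
            issues.append((index, "missing timestamp"))
            continue
        if (case, activity, ts) in seen:
            issues.append((index, "duplicate event"))
        seen.add((case, activity, ts))
        if case in latest and ts < latest[case]:
            issues.append((index, "out of order"))
        latest[case] = max(ts, latest.get(case, ts))
    return issues


# Hypothetical log in recording order.
EVENTS = [
    {"case": "c1", "activity": "register", "timestamp": 1},
    {"case": "c1", "activity": "check", "timestamp": None},
    {"case": "c1", "activity": "decide", "timestamp": 5},
    {"case": "c1", "activity": "check", "timestamp": 3},
    {"case": "c2", "activity": "register", "timestamp": 2},
    {"case": "c2", "activity": "register", "timestamp": 2},
]

ISSUES = find_quality_issues(EVENTS)

if __name__ == "__main__":
    for index, issue in ISSUES:
        print(index, issue, EVENTS[index])
```

Flags like these are candidates for visual marking rather than silent repair, so that the analyst can judge whether a value is wrong or merely unusual.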
Time has a Complex Structure
Modelling Time
[Aigner et al., 2011]
Modelling Time
[Aigner et al., 2011]
Challenges – Summary
Intertwining Process Mining with Visual Analytics
Interaction to properly support process discovery and enhancement
Scalable analysis from single event sequences to multiple event logs
Data quality and uncertainty
Complexity of time-oriented data
Evaluation
Thanks to…
Wolfgang Aigner, Peter Bodesinsky, Paolo Federico, Silvia Miksch, Wil van der Aalst, and many more
...for I reused and adapted some of their slides, pictures, and ideas