PerCon : A Personal Digital Library for Heterogeneous Data
Center for the Study of Digital Libraries
Department of Computer Science & Engineering
Texas A&M University
Outline
• Background
• Motivation
• Objective
• Approach
• System Evaluation
• Results
• Conclusion
• Appendix
Buzzwords in CS
Big Data
Hadoop
Distributed computing
Machine Learning
Cloud
Multithread
Multicore
Web services
Social network
Platform
Crowdsourcing
Information retrieval
Agile
Algorithm
Data science
Data mining
Background
• Data explosion / data-intensive scientific discovery
  • Interdisciplinary research
  • Advances in devices/sensors, software
  • …
• More data of more data types
• Demands on collecting, managing, and interpreting heterogeneous data
Motivation
• Data management and analysis
  • Domain-specific representations, visualizations, interfaces, tools, etc.
  • Separate “silos” for data of each data type
• Need for a heterogeneous data environment
  • Ingesting, processing, and indexing data
  • Searching, browsing, visualizing, and annotating data
  • Representing and sharing information and knowledge
  • Facilitating interactions between a user and a system with heterogeneous data
Objective
• A digital library that supports the collection, management, and interpretation of unanticipated collections of data types
• PerCon: Personalized and Contextual Data Environment
  • A personal or small group digital library system for data management and analysis
PerCon: Designed Workflow
[Figure: PerCon designed workflow. Labeled components: User, User Interface, Application, Web Server, Query Parser, Data/Query Processing, Data Ingestion, Data Processing, Data Analysis, Database Repository, Heterogeneous Scientific Dataset, System Resource, System, User Information/Knowledge Space, and Feature/Knowledge Space (domain-dependent, cross-domain, and personalized feature spaces). Steps are marked with timestamps T1 to T4; the legend distinguishes data flow from query flow.]
System Architecture
Resource Layer: Original data objects, computed/filtered datasets, and metadata.
Middleware Layer: Data ingestion, access, automated analysis, visualization, workspace, etc.
Application Layer: User interfaces, external systems access
System Interfaces
[Figure: PerCon interface screenshot with labeled regions: Menu & Toolbars, Visual Workspace, Repository Viewer, Suggestion History]
Workspace
• Exploration and representation of data with visual and spatial attributes
• Translation of data into information in multiple representations
• Knowledge discovery from information
• Data object model: multiple applicable data visualizations
System Base (Object) Panel
- Visual and spatial attributes
- User expression for data interpretation
User Application (Object) Panel
- Individual visualization/application
- Application-specific interaction
Integrated Visual Workspace
• Interoperation with history mechanism
• Event records used for mixed-initiative interaction
• Representation of any Java application as individual data objects
Mixed-Initiative Interaction
• User-control system
• Menu, toolbar, button, etc.
• System-control system
• Automated system
• Example: call center
• Mixed-control system
• Turn-taking & Alternating control
• High computation + high interpretation
• Recommender system
Recommendation in PerCon
• Inference of user interests based on user behaviors/events/tasks/goals
• Location and recommendation of related data within the current collection
Procedures
1. Building feature space (understanding relationships in data)
2. Recording workspace events (history)
3. Inferring user interests using probabilistic networks
4. Selecting relevant data
5. Recording user’s acceptance/rejection
6. Adopting user feedback
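The six steps above can be read as a loop. The following is a minimal Python sketch of that loop, not PerCon's implementation: the class name `Recommender`, the use of Pearson correlation as the relationship measure, the event-frequency stand-in for the probabilistic network in step 3, and the 0.1 feedback weight are all assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


class Recommender:
    """Hypothetical helper that mirrors the six procedure steps."""

    def __init__(self, series_by_id):
        self.series = series_by_id
        # Step 1: feature space = pairwise similarity between data objects.
        self.similarity = {
            (a, b): pearson(self.series[a], self.series[b])
            for a in self.series for b in self.series if a != b
        }
        self.events = []           # Step 2: workspace history
        self.feedback = Counter()  # Steps 5-6: net acceptances per object

    def record_event(self, event_type, data_id):
        # Step 2: append one workspace event, e.g. ("ADD_SYMBOL", "river_level").
        self.events.append((event_type, data_id))

    def infer_interests(self):
        # Step 3 (stand-in, assumed): relative event frequency per data object.
        counts = Counter(data_id for _, data_id in self.events)
        total = sum(counts.values())
        return {d: c / total for d, c in counts.items()} if total else {}

    def suggest(self, k=1):
        # Step 4: rank objects the user has not touched by interest-weighted
        # similarity, shifted by earlier feedback (assumed weight 0.1).
        interests = self.infer_interests()
        scores = {}
        for cand in self.series:
            if cand in interests:
                continue
            base = sum(p * self.similarity[(seen, cand)]
                       for seen, p in interests.items())
            scores[cand] = base + 0.1 * self.feedback[cand]
        return sorted(scores, key=scores.get, reverse=True)[:k]

    def record_feedback(self, data_id, accepted):
        # Steps 5-6: store acceptance or rejection so later rankings change.
        self.feedback[data_id] += 1 if accepted else -1
```

With three short series, a single event on one of them makes `suggest()` return the series most correlated with it; calling `record_feedback` afterwards nudges future rankings.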
Data Analysis Agent
• Internal process for mixed-initiative interaction
[Figure: Data analysis agent diagram connecting Agent and User. Labeled elements: Dataset; Metadata; Data Matrix (data ID, type, date); Relationship (similarity, distance, etc.); Index; User Events; User Activity / History (visual attributes, spatial attributes, annotation, query exploration); Workspace Monitoring; Suggestion Request; Processing; Training; Inference (Probabilistic) Network; Inference; Suggestion; Suggestion History; Feedback; Feedback Recording; Update; Mixed-initiative interaction.]
![Page 16: PerCon : A Personal Digital Library for Heterogeneous Data Center for the Study of Digital Libraries Department of Computer Science & Engineering Texas](https://reader038.vdocuments.us/reader038/viewer/2022110210/56649e7a5503460f94b7a0fd/html5/thumbnails/16.jpg)
16
Probabilistic Inference Network
[Figure: probabilistic inference network relating one hidden variable to several observable variables]
Hidden variable
• Data source(s) that a user is interested in: River level, River discharge, Precipitation, Temperature, Humidity, …
Observable variables
• Activity in workspace: ADD_SYMBOL, DELETE_SYMBOL, MOVE_SYMBOL, RESIZE_SYMBOL, CHANGE_BORDER_COLOR, …
• Data relationships between data objects explored: River level – river level, Precipitation – precipitation, River level – Precipitation, …
• Data application: Plot, Timeline, Multimedia player, XML viewer, DB viewer, Calendar viewer, …
• Visual attribute 1 (Background color): Blue, Red, Black, Green, …
• Visual attribute 2 (Border color): Blue, Red, Black, Green, …
• Data source creation: River level, River discharge, Precipitation, Temperature, Humidity, …
• Annotation: Yes, No
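To make the hidden/observable split concrete, here is a small Python sketch of inferring the hidden variable from observed values. The slides do not give the network structure or any probabilities, so this assumes the simplest case: observable variables that are conditionally independent given the hidden one (a naive Bayes structure). The function name `posterior` and every probability below are hypothetical.

```python
def posterior(prior, likelihoods, observations):
    """Posterior over the hidden variable via Bayes' rule.

    prior:        {source: P(source)}
    likelihoods:  {variable: {source: {value: P(value | source)}}}
    observations: {variable: observed value}
    Assumes observables are conditionally independent given the source.
    """
    unnormalized = {}
    for source, p in prior.items():
        for variable, value in observations.items():
            # Unseen values get a tiny probability instead of zero.
            p *= likelihoods[variable][source].get(value, 1e-6)
        unnormalized[source] = p
    total = sum(unnormalized.values())
    return {s: p / total for s, p in unnormalized.items()}


# Illustrative numbers only; none of these come from the slides.
prior = {"river_level": 0.5, "precipitation": 0.5}
likelihoods = {
    "data_source_creation": {
        "river_level": {"river_level": 0.8, "precipitation": 0.2},
        "precipitation": {"river_level": 0.3, "precipitation": 0.7},
    },
    "annotation": {
        "river_level": {"yes": 0.6, "no": 0.4},
        "precipitation": {"yes": 0.5, "no": 0.5},
    },
}
observations = {"data_source_creation": "river_level", "annotation": "yes"}
result = posterior(prior, likelihoods, observations)
```

Under these made-up numbers the posterior favors `river_level`, since both observations are more likely when that source is the one of interest.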
System Evaluation - User Study
• Hypotheses
  • H1: Visual workspace helps a user to manage data and to translate data into knowledge about the domain
  • H2: Mixed-initiative recommendations improve a user’s ability to explore and analyze data
• 24 participants
  • 1 undergraduate, 4 Master's, 16 PhD students, and 3 postdoctoral researchers
  • Age from 24 to 36
• Various disciplines
  • Computer science, computer engineering, electrical engineering, soil hydrology, biomedical engineering, industrial engineering, and management information systems
Domain Data for Participant Analysis
• Two years of weather and river data (from 2011 to 2013)
  • Weather data from NOAA
    • Temperature, precipitation, relative humidity, wind speed, and wet bulb temperature
  • River data from Brazos River Authority in Texas
    • River level and discharge
• Two equivalent “weather and river” datasets
  • Dataset 1 collected from College Station, Waco, and Temple
  • Dataset 2 collected from South Bend, Seymour, and Fort Griffin
[Figure labels: Upstream, Downstream]
Tasks
• Task 1 (20 minutes): Classifying and organizing data
  • Organize and classify river level and precipitation data according to common trends, quantities, durations, or other user-perceived criteria.
• Task 2 (10 minutes): Investigating and identifying data correlation
  • Investigate which weather factor(s) affect river level, and how.
  • Investigate how rivers at different places are correlated.
• Task 3 (5 minutes): Interpreting and estimating river data events/causes
  • Estimate the (average) time delay in the flow, if you find any.
  • Explain the changes, considering weather factors and other river stream flows.
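Task 3 asks participants to estimate a time delay by inspecting the data; the slides do not describe an automated method. As a rough illustration of what such an estimate involves, the Python sketch below picks the lag at which a downstream series best correlates with an upstream one. The function names and the two short series are synthetic assumptions, not study data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def best_lag(upstream, downstream, max_lag):
    """Return (lag, correlation) for the lag, in samples, at which the
    downstream series best matches earlier upstream values."""
    best = (None, -2.0)
    for lag in range(max_lag + 1):
        # Pair upstream[t] with downstream[t + lag].
        xs = upstream[: len(upstream) - lag]
        ys = downstream[lag:]
        r = pearson(xs, ys)
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic example: the downstream series repeats the upstream peak
# two samples later.
upstream = [1, 1, 5, 9, 4, 2, 1, 1, 1, 1]
downstream = [1, 1, 1, 1, 5, 9, 4, 2, 1, 1]
lag, correlation = best_lag(upstream, downstream, max_lag=4)
```

For daily readings, the returned lag would be read as a delay in days between the two gauge sites.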
System Conditions
| | W/O Mixed-Initiative Recommendation | W/ Mixed-Initiative Recommendation |
| --- | --- | --- |
| W/O Visual Workspace | Configuration 1 | Configuration 2 |
| W/ Visual Workspace | Configuration 3 | Configuration 4 |
Task Procedure (120 minutes)
• Participants learning with a user manual + 10-minute video clip
• 5-minute trial of how to use PerCon
• Tasks (Task 1, 2, and 3)

| Group | Subgroup | Tasks with dataset 1 | Tasks with dataset 2 |
| --- | --- | --- | --- |
| Group 1 | A | Configuration 1 | Configuration 2 |
| Group 1 | B | Configuration 2 | Configuration 1 |
| Group 2 | A | Configuration 1 | Configuration 3 |
| Group 2 | B | Configuration 3 | Configuration 1 |
| Group 3 | A | Configuration 1 | Configuration 4 |
| Group 3 | B | Configuration 4 | Configuration 1 |
| Group 4 | A | Configuration 2 | Configuration 3 |
| Group 4 | B | Configuration 3 | Configuration 2 |
| Group 5 | A | Configuration 2 | Configuration 4 |
| Group 5 | B | Configuration 4 | Configuration 2 |
| Group 6 | A | Configuration 3 | Configuration 4 |
| Group 6 | B | Configuration 4 | Configuration 3 |
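The group table follows a regular pattern: each group is one unordered pair of the four configurations, and subgroups A and B swap which configuration is used with which dataset. A short Python sketch that regenerates the table, assuming this reading of it is correct (the function name `build_schedule` is hypothetical):

```python
from itertools import combinations

# Configuration definitions as given on the System Conditions slide.
CONFIGS = {
    1: {"visual_workspace": False, "recommendation": False},
    2: {"visual_workspace": False, "recommendation": True},
    3: {"visual_workspace": True, "recommendation": False},
    4: {"visual_workspace": True, "recommendation": True},
}


def build_schedule(config_ids):
    """Return rows (group, subgroup, config for dataset 1, config for dataset 2).

    Every unordered pair of configurations forms one group; subgroup B
    reverses the order used by subgroup A to counterbalance dataset effects.
    """
    schedule = []
    pairs = combinations(sorted(config_ids), 2)
    for group, (first, second) in enumerate(pairs, start=1):
        schedule.append((group, "A", first, second))
        schedule.append((group, "B", second, first))
    return schedule


schedule = build_schedule(CONFIGS)
for group, subgroup, with_dataset_1, with_dataset_2 in schedule:
    print(f"Group {group}{subgroup}: "
          f"Configuration {with_dataset_1} -> Configuration {with_dataset_2}")
```

This yields 6 groups and 12 subgroups, with each configuration paired equally often with each dataset. If the 24 participants were split evenly, that would be two per subgroup, but the slides do not state the assignment.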
Responses to Questions Related to Workspace
[Figure: bar chart “Visual Workspace”: score (axis 0 to 7) for Q1 to Q4, with series for config 1 to config 4]
Statements
Q1: I had enough support to understand data content in the workspace
Q2: I had enough support to express relationships in the way I wanted
Q3: It was easy to interpret and characterize given/created objects in the workspace
Q4: I had enough support to effortlessly / quickly browse and select data
Responses to Questions Related to Recommendations
[Figure: bar chart “Mixed-initiative interaction”: score (axis 0 to 7) for Q5 to Q8, with series for Config 2 and Config 4]
Statements
Q5: I was satisfied with the data suggested
Q6: I was satisfied with the suggestion request
Q7: I had enough support to find and interpret data I was interested in
Q8: I had enough support to find correlations within the dataset
Participant Work Practices
[Figure: bar chart “Avg. number of data objects classified/analyzed”: average number of data objects (axis 0 to 50) for Config 1 to Config 4]
Distribution of Activities
• Ordering of user events in repository browser and workspace shows distinct patterns of work
[Figure: event sequences for User 1 to User 4, comparing Config. 1 (without visual workspace) and Config. 3 (with visual workspace)]
A Sequence of Recommendation Events
[Figure: sequence of recommendation events for User 1 to User 12, distinguishing user-requested from system-triggered recommendations]
History in PerCon
• Storing and processing data
• Visualizing data
• Managing and analyzing data
• Human activities of locating, annotating, and interpreting data / Data platform
Conclusion
• Workspace has a large effect on data (analysis) practice
• Recommendation overcomes the difficulty of locating data for users
• Visual workspace
  • Facilitates information representation
  • Aids in identification and interpretation of relationships between datasets
  • Helps users learn, solve problems, and make decisions
• Mixed-initiative interaction (recommendation)
  • Encourages users to explore data
  • Leads users to identify more evidence of correlation among data streams
  • Is valuable for data analysis
Future Work
• Improvement of workspace interactions
• More dynamic and tailorable data visualization
• Expansion of recommendation subsystem
• Cross-domain/cross-data-type similarities in the workspace
• Various similarity metrics
• Recommendation algorithms
• Exploration of the use of PerCon
• In new domains
• With new user communities.
Questions?