The Prajna Project: Utilities for Understanding
Edward Swing
Why Another Toolkit?
• Most toolkits focus on Information Control
  • Information retrieval (databases)
  • Information presentation (displays)
  • Linking to other information (cross-referencing)
• Less emphasis on information understanding
• Prajna: Sanskrit for “Wisdom”
  • Provide tools and capabilities to enhance the user’s understanding
Knowledge Engineering
• Involves Developer and User
• Steps for understanding and applying the knowledge
  • Explicit and inherent structures of data
  • Identify secondary sources of information
  • Add reasoning components
Prajna Design
• Java Utility Library
• Extensible Classes
  • Allow developers to integrate specific features
• Integration Points for other Software Libraries, Products, and Utilities
• Still Evolving
Prajna Conceptual Design
[Figure: layered conceptual design. Data Sources: databases, data files, other content. Data Accessors: retrieve data and assemble structures (trees, graphs, grids, datasets). Reasoning Layer, which augments data: Semantic Reasoning, Data Fusion, Future Capabilities. Representation Layer: Visualization Display, Data Export.]
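The layering in this design (access, then reasoning, then representation) can be sketched in a few lines of Java. This is only an illustration of the idea: every interface and method name below is hypothetical and is not taken from the Prajna library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; this is NOT the Prajna API.
class PipelineSketch {
    /** Access layer: retrieves records from some source. */
    interface Accessor {
        List<Map<String, String>> retrieve();
    }

    /** Reasoning layer: returns an augmented version of one record. */
    interface Reasoner {
        Map<String, String> augment(Map<String, String> record);
    }

    /** Representation layer: turns records into display or export text. */
    interface Representation {
        String render(List<Map<String, String>> records);
    }

    /** Runs every record through each reasoner in order, then renders the result. */
    static String run(Accessor accessor, List<Reasoner> reasoners, Representation out) {
        List<Map<String, String>> augmented = new ArrayList<>();
        for (Map<String, String> record : accessor.retrieve()) {
            // Copy first so reasoners never mutate the accessor's own data.
            Map<String, String> current = new HashMap<>(record);
            for (Reasoner reasoner : reasoners) {
                current = reasoner.augment(current);
            }
            augmented.add(current);
        }
        return out.render(augmented);
    }
}
```

Keeping the three layers behind separate interfaces means a new data source or a new display can be swapped in without touching the reasoning step, which is the kind of extensibility the design aims for.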
Core Components
• Data Structures
  • Trees, Node-link Graphs, Grids (Volumes)
  • Geographic, Temporal Data
• Semantic Content: Data Accessors
  • Common representation of data
  • Access data from a variety of sources
• Semantic Reasoning
  • Ontologies
  • Intelligent Fusion
• Visualization Displays
Data Accessors
[Figure: Data Accessor diagram. Labels: Data Source; Data Accessor; output structures (Tree, Graph, Grid, DataSet); Data Elements (Strings, Enums, Integers, Measures, Locations, DateTime); Data Configuration (Structures, Fields, Reasoners).]
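One way to picture a configurable accessor is a field configuration that says how each raw value becomes a typed data element. The sketch below covers three of the element kinds named above (strings, integers, dates). The names `FieldType` and `toElements` are made up for this example and are not Prajna classes.

```java
import java.time.LocalDate;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch; names are illustrative, NOT the Prajna API.
class FieldConfigSketch {
    /** A few of the element kinds, each with its own parser. */
    enum FieldType {
        STRING(s -> s),
        INTEGER(Integer::valueOf),
        DATE(LocalDate::parse); // expects ISO-8601 text such as 2009-05-01

        private final Function<String, Object> parser;

        FieldType(Function<String, Object> parser) {
            this.parser = parser;
        }

        Object parse(String raw) {
            return parser.apply(raw.trim());
        }
    }

    /** Converts only the configured fields; missing or blank values are skipped. */
    static Map<String, Object> toElements(Map<String, String> raw,
                                          Map<String, FieldType> config) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, FieldType> field : config.entrySet()) {
            String value = raw.get(field.getKey());
            if (value != null && !value.trim().isEmpty()) {
                typed.put(field.getKey(), field.getValue().parse(value));
            }
        }
        return typed;
    }
}
```

Because the configuration is data rather than code, the same conversion routine can serve different sources by changing only the field map.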
Representation Components
[Figure: Representation Components diagram. Visualization Display: Prefuse Visualization, Prajna Visualizations, JFreeChart Charts, Geographic Display. Data Export: KML, GraphML, Serialized Objects (Servlet), Data Accessor Update.]
Advantages
• Easy Extensibility
• Configurable Data Accessors
  • Regardless of underlying data
  • Use data where it exists
• Uses other toolkits where appropriate
  • New toolkit interfaces
  • No reinventing the wheel
• Integrates easily into larger projects
Integration Points
• Data Accessors
  • Endeca, XML formats, JDBC, SOLR, …
  • Streaming Data (in development)
• Data Generation
  • Common data formats: SVG, GraphML, …
• Visualization Displays
  • Prefuse toolkit for visualizations
  • JFreeChart for charting
• Geographic Data
  • ESRI shape files
  • Google Earth KML
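As an illustration of the data generation integration point, here is a minimal sketch that writes a node-link graph as GraphML. It does not show how Prajna itself exports GraphML; the class and method names are hypothetical. Only the GraphML element names and the namespace come from the GraphML format.

```java
import java.util.List;

// Hypothetical helper, NOT the Prajna exporter.
class GraphMlSketch {
    /** Escapes the characters that are unsafe inside an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds a directed GraphML document; each edge is {sourceId, targetId}. */
    static String toGraphMl(List<String> nodeIds, List<String[]> edges) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\">\n");
        xml.append("  <graph id=\"G\" edgedefault=\"directed\">\n");
        for (String id : nodeIds) {
            xml.append("    <node id=\"").append(escape(id)).append("\"/>\n");
        }
        for (String[] edge : edges) {
            xml.append("    <edge source=\"").append(escape(edge[0]))
               .append("\" target=\"").append(escape(edge[1])).append("\"/>\n");
        }
        xml.append("  </graph>\n");
        xml.append("</graphml>\n");
        return xml.toString();
    }
}
```

A file produced this way should load in tools that read GraphML, so the same graph can be handed to another toolkit rather than redrawn by hand.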
Demo: Fusion with Reasoning
• Core data: CSV file of wine data
  • Fields include cost, winery, region, flavors, vintage year, review score and date, etc.
  • 57,000+ records
  • Not clean data
  • Some fields with multiple values (e.g., flavors)
• Additional information:
  • DBpedia access to determine grape species
  • GeoNames access to determine region location
  • Wine Ontology for additional information
• Display data in List, Map or Chart
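The two data problems named in this demo, multi-valued fields and fusion with a secondary source, can be sketched as follows. This is a simplified stand-in, not the demo's code. It assumes a semicolon separates multiple values and that no cell contains a quoted comma, and the `lookup` function stands in for a real service such as GeoNames. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch, NOT the demo's code.
class WineFusionSketch {
    /** Splits a multi-valued cell into trimmed, lower-cased, de-duplicated values. */
    static List<String> splitValues(String cell) {
        List<String> values = new ArrayList<>();
        if (cell == null) {
            return values;
        }
        for (String part : cell.split(";")) { // assumed separator
            String value = part.trim().toLowerCase(Locale.ROOT);
            if (!value.isEmpty() && !values.contains(value)) {
                values.add(value);
            }
        }
        return values;
    }

    /** Parses one CSV line against the header; assumes no quoted commas. */
    static Map<String, Object> parse(String[] header, String line) {
        String[] cells = line.split(",", -1);
        Map<String, Object> record = new LinkedHashMap<>();
        for (int i = 0; i < header.length && i < cells.length; i++) {
            String name = header[i].trim();
            record.put(name, name.equals("flavors") ? splitValues(cells[i]) : cells[i].trim());
        }
        return record;
    }

    /** Fusion step: adds newField from a secondary source when the lookup finds a value. */
    static void fuse(Map<String, Object> record, String keyField, String newField,
                     Function<String, String> lookup) {
        Object key = record.get(keyField);
        if (key instanceof String) {
            String found = lookup.apply((String) key);
            if (found != null) {
                record.put(newField, found);
            }
        }
    }
}
```

Records for which the lookup finds nothing are left unchanged, so unclean rows do not block the rest of the data from being augmented.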
Demo: Wine Data WebApp
• Core Data: Wine data, accessed from Endeca Information Access Platform
• Faceted Navigation Web Interface
  • Allows easy search, filtering
  • Standard Endeca capability
• Augmented Data
  • Wine Ontology, GeoNames
  • Food Ontology for matching Food to Wine
• Displays
  • Google Maps display
  • Simile Timeline
Demo: VAST Cell Phone Challenge
• Scenario: Phone network analysis of Paraiso leadership over a 10-day period
• Data Provided: cell phone call records (CSV)
  • Fields included ID of caller, receiver, cell phone tower, time and duration of call
• Auxiliary data:
  • Map with estimated locations of cell towers
• Goal: identify key personnel and changes to the social network over the time period
• Displays: Network, Map, Statistical Charts, Prefuse Force-directed display
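A simple starting point for finding key personnel in call records is to build a contact graph and rank IDs by their number of distinct contacts (degree centrality). The sketch below shows that one technique under simplifying assumptions. It is not the analysis used in the contest entry, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, NOT the contest entry's analysis.
class CallGraphSketch {
    /** One call record reduced to caller and receiver IDs (other fields omitted). */
    static final class Call {
        final String from;
        final String to;

        Call(String from, String to) {
            this.from = from;
            this.to = to;
        }
    }

    /** Maps each ID to the set of distinct IDs it exchanged calls with. */
    static Map<String, Set<String>> contacts(List<Call> calls) {
        Map<String, Set<String>> graph = new HashMap<>();
        for (Call call : calls) {
            graph.computeIfAbsent(call.from, k -> new HashSet<>()).add(call.to);
            graph.computeIfAbsent(call.to, k -> new HashSet<>()).add(call.from);
        }
        return graph;
    }

    /** Returns the ID with the most distinct contacts, or null for an empty graph. */
    static String mostConnected(Map<String, Set<String>> graph) {
        String best = null;
        int bestDegree = -1;
        for (Map.Entry<String, Set<String>> entry : graph.entrySet()) {
            int degree = entry.getValue().size();
            // Ties are broken by ID so the result does not depend on map order.
            if (degree > bestDegree
                    || (degree == bestDegree && entry.getKey().compareTo(best) < 0)) {
                best = entry.getKey();
                bestDegree = degree;
            }
        }
        return best;
    }
}
```

Running `contacts` on each day's records separately and comparing the results is one way to look for changes in the network over the period.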
Further Information
• Other Applications
  • Other VAST contest entries
  • Data Fusion & Visualization applications
• Prajna released to SourceForge:
  • http://sourceforge.net/projects/prajna
  • Includes various demonstration projects
• Contact Information
  • [email protected] or [email protected]