Using formal ontology for integrated spatial data mining
Julie Sungsoon Hwang
Department of Geography
State University of New York at Buffalo
ICCSA04, Perugia, Italy
May 14, 2004
Research purposes
Clarify the role of formal ontology in KDD
Propose a conceptual framework for ontology-based spatial data mining
Case study: ontology-based spatial clustering algorithms
Problems in focus (cont.)
No single algorithm is best suited to all research purposes and application domains.
Without considering domain knowledge, the same algorithm can yield results inconsistent with fact
The same data may have to be analyzed in different ways depending on users’ goals
Problems in focus
[Figure: fitting algorithms to Domain and Task. Developing new algorithms (Algorithm A, Algorithm B, Algorithm C, Algorithm D) versus re-using existing algorithms (Algorithm D’, suited to domain and task).]
How can algorithms be customized to varying domain and task?
Relation between data mining and ontology construction
[Figure: levels of abstraction. Data Mining (knowledge discovery): Data, Information, Knowledge. Ontology Construction (knowledge acquisition): Knowledge, Ontology.]
Role of formal ontology in KDD
Provide the context in which the knowledge extracted from data is interpreted and evaluated
Guide algorithms such that they can be suitable for domain-specific and task-oriented concepts
[Figure: KDD process diagram]
Using ontology for spatial data mining
Ontology formalizes how the knowledge is conceptualized, thereby making implicit meaning explicit
Data mining extracts high-level knowledge from low-level data, thereby enhancing the level of understanding
[Figure: Ontology (Domain Model, Task Model) and Spatial Data Mining (from low-level data to high-level knowledge)]
Domain-specific spatial data mining
Let’s compare two different domains: traffic accidents versus retailers

                     Domain of traffic accidents   Domain of retailers
Is-a                 Event                         Physical object
Spatial constraints  In road network               Outside of road network

Spatial data mining algorithms should take into account different conceptualizations (domain-specific properties)
Task-oriented spatial data mining
Let’s compare two different tasks: detecting hotspots of traffic accidents versus partitioning market areas based on the location of retailers

                  Detect hotspots of traffic accidents                  Partition market areas to a retailer
# of clusters k   Depends on spatial distribution                       Given (resource constraint)
Level of detail   Varies with scale (depends on area of users’ interest)   Doesn’t vary with scale

Spatial data mining algorithms should take into account different tasks and users’ needs
Ontology as an active component of information system
[Figure: kinds of ontology. Top-level Ontology (e.g. space, time, matter, object, event); Domain Ontology (e.g. medicine); Task Ontology (e.g. diagnosing); Application Ontology. Other labels: dependence, subject.]
From Guarino, 1998
OBSDM:: Input:: Metadata
Tag structure of XML can be utilized to inform domain ontology of the semantics of data
OBSDM:: OBSDMM:: Domain Ont.
Terms within the “theme” tag in the metadata are used as a token to locate the appropriate domain ontology
Domain ontology specifies the definition, class, and properties
Class example: Accident is a Subclass-Of Temporal-Thing
Properties example: Road has a Geographic-Region as a Value-Type
Classes inherit properties from the top-level ontology
Domain ontology := Traffic accident
Theory TRAFFIC-ACCIDENT-DOMAIN
As a spatial thing,
  Point(x) On(x, y) Roadway(y) Line(y) In(y, z) Geographic-Region(z)
As a temporal thing,
  Point(x) At(x, y) Time(y)
  Event(x) <=> Occurrence(x) Notification(x) Response(x) Arrival(x)
  Before(Occurrence(x), Notification(x))
As an intangible thing,
  Accident(x) RelatedTo(x, y) Vehicle(y)
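The lookup and inheritance described on these slides can be sketched roughly as follows. This is a minimal illustration, not the OBSDM implementation: the `OntologyClass` type, the `DOMAIN_ONTOLOGIES` table, the `locate_domain_ontology` function, the XML layout, and the property names `At` and `On` (borrowed from the predicates in the theory) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class OntologyClass:
    """Hypothetical ontology class: a name, an optional parent, and properties."""
    name: str
    parent: Optional["OntologyClass"] = None
    properties: Dict[str, str] = field(default_factory=dict)  # property -> value type

    def is_a(self, ancestor: str) -> bool:
        # Walk the Subclass-Of chain up to the top-level ontology.
        node = self
        while node is not None:
            if node.name == ancestor:
                return True
            node = node.parent
        return False

    def all_properties(self) -> Dict[str, str]:
        # Own properties are merged over those inherited from the parent.
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}


# Assumed fragment: Accident is a Subclass-Of Temporal-Thing.
temporal_thing = OntologyClass("Temporal-Thing", properties={"At": "Time"})
accident = OntologyClass("Accident", parent=temporal_thing,
                         properties={"On": "Roadway"})

# Hypothetical registry keyed by the text of the "theme" tag.
DOMAIN_ONTOLOGIES = {"traffic accident": accident}


def locate_domain_ontology(metadata_xml: str) -> Optional[OntologyClass]:
    """Use terms inside <theme> tags as tokens to find a domain ontology."""
    root = ET.fromstring(metadata_xml)
    for theme in root.iter("theme"):
        key = (theme.text or "").strip().lower()
        if key in DOMAIN_ONTOLOGIES:
            return DOMAIN_ONTOLOGIES[key]
    return None


found = locate_domain_ontology(
    "<metadata><theme>Traffic Accident</theme></metadata>"
)
```

Here `found.all_properties()` combines the `At` property inherited from Temporal-Thing with the `On` property declared on Accident, which is one simple way to read "inherit from top-level ontology".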
OBSDM:: Input:: User Interface
Users can specify a goal, level of detail, and geographic area of interest through UI
OBSDM:: OBSDMM:: Task Ont.
The inputs specified by users in the user interface are translated into task ontology
Task ontology explicitly specifies goals, methods, requirements, and constraints
Task ontology := Spatial clustering
Theory SPATIAL-CLUSTERING-TASK
Documentation: This theory defines a task ontology for the spatial clustering task. The spatial clustering task, which is a class of clustering task, is a problem of grouping similar spatial objects into classes.
Super classes: Clustering
Subclasses:
Sub goal: “Find hot spots” “Group similar patterns” “Partition into k-clusters”
Requirement:
  Assignment-Object
    Source: Spatial Objects
    Target: Clusters
  Geographic-Scale
  Detail-Level
Constraint: Spatial Objects Operational Constraints
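One way to picture the translation from user-interface inputs into a task specification is the sketch below. It is an assumption-laden illustration: the `SpatialClusteringTask` type and `task_from_ui` function are hypothetical names, and the rule that k must be supplied only for the partitioning sub goal is inferred from the earlier task comparison rather than stated by the theory.

```python
from dataclasses import dataclass
from typing import Optional

# Sub goals listed in the SPATIAL-CLUSTERING-TASK theory, lower-cased.
SUB_GOALS = {"find hot spots", "group similar patterns", "partition into k-clusters"}


@dataclass(frozen=True)
class SpatialClusteringTask:
    """Hypothetical task specification built from user-interface inputs."""
    goal: str
    level_of_detail: Optional[str] = None  # Detail-Level requirement
    place_name: Optional[str] = None       # geographic area of interest
    k: Optional[int] = None                # number of clusters, if given


def task_from_ui(goal: str,
                 level_of_detail: Optional[str] = None,
                 place_name: Optional[str] = None,
                 k: Optional[int] = None) -> SpatialClusteringTask:
    normalized = goal.strip().lower()
    if normalized not in SUB_GOALS:
        raise ValueError(f"unknown sub goal: {goal!r}")
    # Assumption: partitioning takes k as given (a resource constraint),
    # while hot-spot detection leaves k to the spatial distribution.
    if normalized == "partition into k-clusters" and k is None:
        raise ValueError("k must be given for partitioning")
    return SpatialClusteringTask(normalized, level_of_detail, place_name, k)


hotspot_task = task_from_ui("Find hot spots", "County", "New York")
```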
OBSDM:: OBSDMM:: Alg. Builder
OBSDM:: Output:: GVis tool
The algorithm builder puts together the requirements for building the algorithm best suited to the domain of the data and the users’ input (task).
Data content is filtered through the domain ontology, and the users’ requirements are filtered through the task ontology.
The geographic visualization tool displays the results (the patterns discovered)
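As a rough sketch of what "putting together requirements" could look like, the function below merges a domain-derived constraint with task-derived settings into one parameter set. The `DOMAIN_CONSTRAINTS` table and `build_parameters` function are hypothetical; the slides do not describe how the algorithm builder is actually implemented.

```python
from typing import Dict, Optional

# Hypothetical table: the constraint each domain ontology contributes.
DOMAIN_CONSTRAINTS = {"traffic accident": "Roadway"}


def build_parameters(theme: str,
                     goal: str,
                     level_of_detail: Optional[str] = None,
                     place_name: Optional[str] = None) -> Dict[str, Optional[str]]:
    """Combine domain-side and task-side inputs into algorithm parameters."""
    return {
        # Domain side: data content filtered through the domain ontology.
        "constraint": DOMAIN_CONSTRAINTS.get(theme.strip().lower()),
        # Task side: user requirements filtered through the task ontology.
        "goal": goal,
        "level_of_detail": level_of_detail,
        "place_name": place_name,
    }


params = build_parameters("Traffic Accident", "identify hot spots",
                          level_of_detail="State", place_name="New York")
```

A theme with no entry in the table yields `constraint = None`, which plays the role of a Null constraint in this sketch.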
Case study: ontology-based spatial clustering of traffic accidents
OBSC
Input: 353 features in Erie
Setting
Metadata
Theme := Traffic Accident
User interface
Goal := “identify hot spots”
LevelOfDetail := State
PlaceName := New York
Method
Algorithm := SMTIN
Constraint := Named-Roadway
Output: 18 clusters in Erie County
Case study: Effect of scale (Task ontology)
OBSC clusters reflect spatial distribution specific to the scale of users’ interest
Control Algorithm
TASK
LevelOfDetail := Null
PlaceName := Null
DOMAIN
Constraint := Roadway
OBSC Algorithm
TASK
LevelOfDetail := County
PlaceName := New York
DOMAIN
Constraint := Roadway
Specifying area of interest doesn’t mask details
Case study: Effect of constraint (Domain ontology)
OBSC clusters identify the physical barrier due to concept implicit in domain
Control Algorithm
TASK
LevelOfDetail := State
PlaceName := New York
DOMAIN
Constraint := Null
OBSC Algorithm
TASK
LevelOfDetail := State
PlaceName := New York
DOMAIN
Constraint := Roadway
Separated by body of water
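The effect of a domain constraint can be illustrated with a toy example. The sketch below is not the SMTIN algorithm named in the case study; it swaps in a simple distance-threshold grouping (connected components via union-find) and a made-up `same_bank` predicate standing in for a roadway constraint, so that points on opposite sides of a body of water are never linked directly. The coordinates and the `bank` attribute are invented for the illustration.

```python
from itertools import combinations
from math import dist


def cluster(points, max_gap, can_link=lambda a, b: True):
    """Group points whose pairwise gap is <= max_gap and that may be linked."""
    parent = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        a, b = points[i], points[j]
        if dist(a[:2], b[:2]) <= max_gap and can_link(a, b):
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy data: (x, y, bank). The two banks are close in straight-line distance.
accidents = [(0, 0, "west"), (1, 0, "west"), (2, 0, "east"), (3, 0, "east")]


def same_bank(a, b):
    # Hypothetical stand-in for "connected along a roadway".
    return a[2] == b[2]


control = cluster(accidents, max_gap=1.5)                          # Constraint := Null
constrained = cluster(accidents, max_gap=1.5, can_link=same_bank)  # Constraint := Roadway
```

With these inputs the unconstrained run merges all four points into one group, while the constrained run keeps the west and east banks apart, mirroring the "separated by body of water" annotation.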
Case study: Benefit of using ontology in spatial clustering
Incorporating ontology in spatial clustering algorithms enhances the quality of spatial clustering results
Task ontology makes clusters usable: responsive to users’ view
Domain ontology makes clusters natural: dictated by concept implicit in domain
Conclusion (cont.)
Presents how ontologies are incorporated in spatial data mining algorithms
Semantic linkage between ontologies and algorithms through parameterization
Scale as a task-oriented property
Constraint as a domain-specific property
Conclusion
Ontology is examined as a means to customize algorithms to varying domain and task
Ontology enables algorithms to reflect concepts implicit in the domain and adapt to users’ view
Ontology provides a semantically plausible way to re-use existing algorithms
Ontology provides a systematic way of organizing the various factors that dictate the mechanisms underlying the data mining process