chapter 2
Geographic Data Mining
Paradigms for Spatial and Spatio-temporal Data Mining
About …
• Mining from spatial and spatio-temporal data
• Meta-mining as a discovery process paradigm
• Processes for theory/hypothesis management
Mining from spatial and spatio-temporal data
• Rule types
• Spatial vs. spatio-temporal data
• Handling second-hand data
Rule types
• Spatio-temporal associations
• Spatio-temporal generalization
• Spatio-temporal clustering
• Evolution rules
• Meta-rules
Spatio-temporal associations
• X -> Y (c%, s%), where c is the confidence and s the support of the rule
• Require the use of spatial and temporal predicates
• For temporal association rules, the emphasis moves from the data itself to changes in the data
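The (c%, s%) annotation can be made concrete with a small sketch. It assumes each observation is a set of spatial and temporal predicates that have already been evaluated for one object; the predicate strings, the sample observations and the helper name `rule_stats` are illustrative and do not come from the slides.

```python
def rule_stats(observations, antecedent, consequent):
    """Return (confidence %, support %) for the rule antecedent -> consequent.

    observations: list of sets of predicate strings, one set per object.
    """
    x = set(antecedent)
    xy = x | set(consequent)
    n_x = sum(1 for obs in observations if x <= obs)    # objects satisfying X
    n_xy = sum(1 for obs in observations if xy <= obs)  # objects satisfying X and Y
    support = 100.0 * n_xy / len(observations) if observations else 0.0
    confidence = 100.0 * n_xy / n_x if n_x else 0.0
    return confidence, support


# Made-up observations: predicates that hold for each region-season pair.
observations = [
    {"close_to(desert)", "season=summer", "hot_dry"},
    {"close_to(desert)", "season=summer", "hot_dry"},
    {"close_to(desert)", "season=summer"},
    {"close_to(coast)", "season=summer"},
]
c, s = rule_stats(observations, {"close_to(desert)", "season=summer"}, {"hot_dry"})
print(f"close_to(desert) AND season=summer -> hot_dry ({c:.0f}%, {s:.0f}%)")
```

Here two of the three desert-adjacent summer observations are hot and dry, so the rule holds with roughly 67% confidence and 50% support.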
Spatio-temporal generalization
• Concept hierarchies are used to aggregate data
• Spatial-data-dominant
  – ‘South Australian summers are commonly hot and dry’
• Nonspatial-data-dominant
  – ‘Hot dry summers are often experienced by areas close to large desert systems’
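As a rough illustration of the spatial-data-dominant case, the sketch below climbs one level of a spatial concept hierarchy (place to region) and reports the dominant condition per region and season. The hierarchy, the records and the function name `generalize` are assumptions made for the example, not material from the slides.

```python
from collections import Counter, defaultdict

# Hypothetical spatial concept hierarchy: place -> region (one level up).
PLACE_TO_REGION = {
    "Adelaide": "South Australia",
    "Port Augusta": "South Australia",
    "Hobart": "Tasmania",
}


def generalize(records, hierarchy):
    """Roll place-level records up to regions and report the dominant value.

    records: iterable of (place, season, condition) tuples.
    Returns {(region, season): (condition, share)}, where share is the
    fraction of that group's records carrying the dominant condition.
    """
    groups = defaultdict(Counter)
    for place, season, condition in records:
        groups[(hierarchy[place], season)][condition] += 1
    summary = {}
    for key, counts in groups.items():
        condition, n = counts.most_common(1)[0]
        summary[key] = (condition, n / sum(counts.values()))
    return summary


# Made-up observations, for illustration only.
records = [
    ("Adelaide", "summer", "hot and dry"),
    ("Port Augusta", "summer", "hot and dry"),
    ("Adelaide", "summer", "mild"),
    ("Hobart", "summer", "mild"),
]
summary = generalize(records, PLACE_TO_REGION)
for (region, season), (condition, share) in summary.items():
    print(f"{region} {season}s are commonly {condition} ({share:.0%} of records)")
```

A nonspatial-data-dominant generalization would group on the condition first and then describe the places that share it.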
Spatio-temporal clustering
• Similar to normal clustering
• Far more complex
• Characteristic features of objects in a spatio-temporal region OR spatio-temporal characteristics of a set of objects are sought
Evolution rules
• Explicit temporal and spatial context
• Describes the manner in which spatial entities change over time
• Exponential number of rules can be generated
  – Example predicates
Example predicates
• Follows
  – One cluster of objects traces the same (or similar) spatial route as another cluster at a later time (spatial coordinates are fixed)
• Coincides
  – One cluster of objects traces the same (or similar) spatial path whenever a second cluster undergoes specified activity (temporal coordinates are fixed)
• Parallels
  – One cluster of objects traces the same (or a similar) spatial pattern but offset in space (temporal coordinates are fixed)
• Mutates
  – One cluster of objects transforms itself into a second cluster
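Two of these predicates can be sketched in code under simplifying assumptions: each cluster is reduced to a track of centroid positions indexed by integer time step, and "similar" means within a distance tolerance `tol`. The representation, the tolerance and the function names are hypothetical choices for this example. Coincides and Mutates are left out because they need activity and membership information that a centroid track does not carry.

```python
import math


def _close(p, q, tol):
    """True if points p and q are within `tol` of each other."""
    return math.dist(p, q) <= tol


def follows(leader, follower, lag, tol=1.0):
    """True if `follower` retraces `leader`'s route `lag` time steps later."""
    if lag <= 0:
        return False
    times = [t for t in leader if t + lag in follower]
    return bool(times) and all(
        _close(leader[t], follower[t + lag], tol) for t in times
    )


def parallels(a, b, tol=1.0):
    """True if `b` traces the same pattern as `a` at the same times, offset in space."""
    times = sorted(t for t in a if t in b)
    if len(times) < 2:
        return False
    t0 = times[0]
    dx, dy = b[t0][0] - a[t0][0], b[t0][1] - a[t0][1]
    if math.hypot(dx, dy) <= tol:
        return False  # no spatial offset: same track, not a parallel one
    return all(_close((a[t][0] + dx, a[t][1] + dy), b[t], tol) for t in times)


# Made-up tracks: {time step: (x, y) centroid}.
a = {0: (0, 0), 1: (1, 0), 2: (2, 1)}
b = {2: (0, 0), 3: (1, 0), 4: (2, 1)}  # same route, two steps later
c = {0: (0, 5), 1: (1, 5), 2: (2, 6)}  # same shape, shifted 5 units in y

print(follows(a, b, lag=2), parallels(a, c))
```

With these tracks, `b` follows `a` and `c` parallels `a`, while `c` does not follow `a` and `b` does not parallel `a`.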
Meta-rules
• Created when rule sets rather than datasets are inspected for trends and coincidental behaviour
• Describe observations discovered amongst sets of rules
  – ‘The support for suggestion X is increasing’
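A meta-rule such as "the support for X is increasing" can be derived from rule sets alone. The sketch below assumes the same rule has been mined in several successive time windows and classifies the trend of its support with a least-squares slope; the threshold `min_change`, the sample values and the function name are assumptions for illustration.

```python
def support_trend(supports, min_change=0.0):
    """Classify the trend of a rule's support across successive rule sets.

    supports: support values (in %) of one rule in consecutive mining runs.
    Returns "increasing", "decreasing" or "stable", based on the
    least-squares slope of support against run index.
    """
    n = len(supports)
    if n < 2:
        return "stable"
    mean_t = (n - 1) / 2
    mean_s = sum(supports) / n
    num = sum((t - mean_t) * (s - mean_s) for t, s in enumerate(supports))
    den = sum((t - mean_t) ** 2 for t in range(n))
    slope = num / den
    if slope > min_change:
        return "increasing"
    if slope < -min_change:
        return "decreasing"
    return "stable"


# Made-up rule sets mined from four successive time windows: rule -> support %.
rule_sets = [
    {"X": 12.0, "Y": 30.0},
    {"X": 15.0, "Y": 29.0},
    {"X": 19.0, "Y": 31.0},
    {"X": 24.0, "Y": 30.0},
]
meta_rules = {
    rule: support_trend([rs[rule] for rs in rule_sets], min_change=0.5)
    for rule in rule_sets[0]
}
print(meta_rules)
```

Only the rule sets are inspected here; the underlying data is never touched, which is the point of a meta-rule.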
Spatial vs. Spatio-temporal data
• Dimensioning-up
• Time: uni-directional and linear
  – Relational concepts (before, during, etc.) are easily understood, communicated and accommodated
• Space: bi-directional and nonlinear
• Time & space: both continuous phenomena
  – Time: commonly treated as discrete and isomorphic with the integers
    • Larger granularity often selected (days, years, etc.)
  – Space: commonly treated as isomorphic with the real numbers
    • Granularity generally smaller (relative to the domain)
• Dimensioning-up strategies work poorly
• Are accepted data mining procedures flawed?
• No: Time scale differences between data types generally match characteristics we wish to include in most analyses of land-cover
• For example
  – Spectral time slice provides discrimination between vegetation types
  – Environmental data provide long-term conditions which match germination, growth and development of largest plants
• Very often, too little consideration is given to the appropriate temporal scales necessary
• Example
  – Monitoring of wetlands in dry tropics
  – The extent of these land-cover elements varies considerably through time
  – Inter-annual variability in extent is greater than the average annual variability
  – A spectral image without annual and seasonal, and without monthly, rainfall and evaporation figures is meaningless
• Temporal scales used in conjunction with spatial data are often inconsistent -> they need to be chosen more carefully
• Considerably better results will be achieved by a considered re-coding of the temporal data
  – Palaeo-climate reconstruction demonstrates: Time can be associated with the relative positioning of the Earth, the Sun and the major planets in space
• Time is a spatial phenomenon
• A point on the Earth’s surface (latitude, longitude, elevation) is not static in space, but moving through a complex energy environment
• This movement, and the dynamics of the energy environment, is ‘time’
• Three main components to the environmental energy field
  – Gravity
  – Radiation
  – Magnetism
• Feedback relationships: Time
• Most important relationships
  – Relative positions of a point on the surface of the Earth and the Sun (Diurnal cycle)
  – Orbit of the Moon around the Earth
  – Orbit of the Earth around the Sun
• These relationships have a very significant influence on our natural, cultural and economic environments
• Other relationships
  – Solar day: This sweeps a pattern of four solar magnetic sectors past the Earth in about 27 days. This correlates with a fluctuation in the generation of low-pressure systems
  – Lunar cycle: This is a 27.3-day period in the declination of the Moon during which it moves north for 13.65 days and south for 13.65 days. This correlates with certain movements of pressure systems on Earth
  – Solar year: The orbit of the Sun around the center of the solar system. This cycle correlates with long-term variation in a large number of natural, cultural and economic indices
• These relate to both the Earth’s energy environment and the sorts of scales we are most concerned with in data mining
• Recoding the time stamp on data to a relevant continuous variable (e.g. time of the Solar year) gives most ‘intelligent’ data mining software a considerably better chance of identifying important relationships in spatio-temporal data
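One way to picture such a recoding is to replace a raw time stamp with its position inside a cycle, encoded so that the end of the cycle sits next to its start. The sketch below is only an approximation under stated assumptions: it uses the calendar year as a stand-in for the Earth's orbit around the Sun (not the "Solar year" defined above), treats the 27.3-day lunar declination cycle as a fixed-length period, and counts that period from an arbitrary, hypothetical epoch. A real analysis would take phases from an ephemeris.

```python
import math
from datetime import datetime, timedelta, timezone


def year_fraction(ts):
    """Position of `ts` within its calendar year, as a fraction of the year."""
    start = datetime(ts.year, 1, 1, tzinfo=ts.tzinfo)
    end = datetime(ts.year + 1, 1, 1, tzinfo=ts.tzinfo)
    return (ts - start) / (end - start)


def cycle_phase(ts, period_days, epoch):
    """Fraction of a fixed-length cycle elapsed at `ts`, counted from `epoch`."""
    elapsed_days = (ts - epoch).total_seconds() / 86400.0
    return (elapsed_days / period_days) % 1.0


def cyclic(fraction):
    """Map a cycle fraction to (sin, cos) so the cycle end meets its start."""
    angle = 2.0 * math.pi * fraction
    return math.sin(angle), math.cos(angle)


# Hypothetical reference epoch for the lunar declination cycle (an assumption).
EPOCH = datetime(2021, 1, 1, tzinfo=timezone.utc)

ts = datetime(2021, 7, 2, 12, tzinfo=timezone.utc)  # midpoint of a 365-day year
features = {
    "year": cyclic(year_fraction(ts)),
    "lunar_declination": cyclic(cycle_phase(ts, 27.3, EPOCH)),
}
half_cycle = cycle_phase(EPOCH + timedelta(days=13.65), 27.3, EPOCH)
print(year_fraction(ts), half_cycle)
```

The (sin, cos) pair is a design choice: a single fraction would place 31 December and 1 January at opposite ends of the scale, whereas the pair keeps them adjacent.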
Handling second-hand data
• The need to re-use data collected for other purposes
• Few data collection methods take into account the non-deterministic nature of data mining
• Results in heterogeneous data sources being brought together
Possible errors
• The rules reflect the heterogeneity of the data sets rather than any differences in the observed phenomenon.
• The data sets being temporally incompatible
• The collection methods being incompatible
Meta-mining as a discovery process paradigm
• Target of mining: traditionally data itself
• Increase in data & polynomial complexity of many mining algorithms
  – Extraction of useful rules becomes difficult
• A solution: mine from either summaries of the data or from results of previous mining exercises
• For each rule generated some ‘irrelevant’ data is removed
  – Support and confidence ratings must be taken into account
  – Clusters may use criteria that may mask important outlying facts
Processes for theory/hypothesis management
• Analysis into geographic, geo-social, socio-political and environmental issues requires a more formal, strongly ethically driven approach
  – Environmental science uses a formal scientific experimentation process requiring the formulation and refutation of a credible null hypothesis
• Data mining over the past few years
  – Largely oriented towards the discovery of previously unknown but potentially useful rules
  – Some useful rules can be mined
  – Potential for either logical or statistical error is extremely high
  – Result of much data mining is at best a set of suggested topics for further investigation
The process of scientific induction
• Two distinct forms of knowledge discovery
  – Process modeling approach: Real world is modeled in a mathematical manner
  – Pattern matching approach: Prediction is made on past experience
• Data mining is the latter
• Another view of scientific induction
  – Given an infinitely large hypothesis space
  – Rules extracted from data are used to constrain the hypothesis space
• Very complex (search space is exponential)
  – Less than useful answers or high computational overhead
Using data mining to support scientific induction
• Develop hypotheses that will constrain the search space by defining areas within which the search is to take place
  – Starting point: user supplied conceptual model
  – Hypothesis supported: weight is added to confidence of conceptual model
  – Hypothesis not supported: change to conceptual model or need for external input is indicated
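The loop described in these bullets might be sketched as follows. This is not the authors' algorithm, only a minimal illustration: hypotheses are expressed as antecedent/consequent predicate sets, "supported" is taken to mean that the rule's confidence over the data reaches a threshold, and each supported hypothesis adds a fixed weight of 1.0 to the model. The class, the threshold `min_confidence`, the weighting and the sample data are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptualModel:
    name: str
    confidence: float = 0.0
    needs_revision: list = field(default_factory=list)


def evaluate(model, hypotheses, observations, min_confidence=0.7):
    """Test each hypothesis against the data and update the conceptual model.

    hypotheses: {label: (antecedent, consequent)}, both sets of predicates.
    A hypothesis counts as supported when the rule's confidence over the
    observations reaches `min_confidence`.
    """
    for label, (antecedent, consequent) in hypotheses.items():
        # Only observations matching the antecedent are searched at all,
        # which is how a hypothesis constrains the search space.
        matching = [o for o in observations if antecedent <= o]
        hits = sum(1 for o in matching if consequent <= o)
        if matching and hits / len(matching) >= min_confidence:
            model.confidence += 1.0  # supported: add weight to the model
        else:
            model.needs_revision.append(label)  # flag for change or external input
    return model


# Made-up observations and hypotheses, for illustration only.
observations = [
    {"wet_season", "flooded"},
    {"wet_season", "flooded"},
    {"wet_season", "flooded"},
    {"dry_season"},
    {"dry_season", "flooded"},
]
hypotheses = {
    "H1": ({"wet_season"}, {"flooded"}),
    "H2": ({"dry_season"}, {"flooded"}),
}
model = evaluate(ConceptualModel("wetland extent"), hypotheses, observations)
print(model.confidence, model.needs_revision)
```

Running the same evaluation over several alternative conceptual models and sorting them by `confidence` would give a simple ranking of the models.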
• Three aspects of interest
  – Able to accept alternative conceptual models and provide a ranking. Also allows for modification to a conceptual model
  – Hypothesis generation component may yield new unexplored insights into accepted conceptual models
  – Reasonably efficient because of directed mining algorithms