![Page 1: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/1.jpg)
The end of geographic theory?
Prospects for model discovery in the geographic domain
Mark Gahegan
Centre for eResearch & Dept. Computer Science
University of Auckland, New Zealand
![Page 2: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/2.jpg)
The holy grail of analytics
Analytical models that can explain their own reasoning
– David Harvey
– Peter Gould
– Stan Openshaw
Computational Model Discovery (or Discovery Informatics)
![Page 3: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/3.jpg)
Recap: there are two kinds of analytical models…
- Predictive models
- Descriptive models
![Page 4: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/4.jpg)
In what way is this new?
Data mining & knowledge discovery
– Does not emphasize model comprehensibility
– Does not take advantage of prior knowledge
– Produces predictive models that do not connect to existing knowledge
Computational Model Discovery
– Focus on interpretability of models by humans
– Interested in explanations by connecting observations to theory
![Page 5: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/5.jpg)
Explanation in Geography (Harvey 1969)
Examines the stages of geographic investigation and how together they support explanation, via:
- methodological frameworks: the nature of investigation, and
- philosophy: the nature of the science process and its various conceptual artifacts (includes ontology),
- which determine representation: how we abstract and represent the world, and
- analysis: how we model and analyze the world,
- through to explanation: which uses theory to describe what our analysis reveals.
Theory
Domain model
Scientific process model
![Page 6: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/6.jpg)
Inductive learning of models based on processes
• A process is a collection of related functions
  – Differential or algebraic form
  – Can be a single equation
• Can have unobserved variables
• Specifies a causal relationship between one or more input and output variables
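The definition above can be sketched as a small data structure. This is only an illustration: the `Process` class, its field names, and the `decay` example are hypothetical and do not come from the slides or from any particular model discovery system.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]


@dataclass(frozen=True)
class Process:
    """A named causal relation from input variables to output variables."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    # Maps the current state to a rate of change for each output variable.
    rates: Callable[[State], Dict[str, float]]


def derivatives(processes: List[Process], state: State) -> Dict[str, float]:
    """Sum the contributions of every process to each variable's rate of change."""
    total = {var: 0.0 for var in state}
    for process in processes:
        for var, rate in process.rates(state).items():
            total[var] += rate
    return total


# Hypothetical single-equation process: exponential decay of a stock x.
decay = Process(
    name="decay",
    inputs=("x",),
    outputs=("x",),
    rates=lambda s: {"x": -0.5 * s["x"]},
)

print(derivatives([decay], {"x": 4.0}))  # {'x': -2.0}
```

Keeping each process as a separate object is what lets a search procedure add or remove processes from a candidate model without rewriting the equations by hand.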
![Page 7: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/7.jpg)
Computational Model Discovery
Predator prey ecosystem
[Figure: observed concentration against time (days) for prey and predator; series labelled Aur-Obs and Nas-OBS]
![Page 8: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/8.jpg)
Example Process Model (from SC-IPM, Bridewell, 2008)
Prey growth
Predation
Predator loss
Algebraic process to calculate grazing rate
Bridewell et al, 2008
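The slide's process model is shown only as an image, so the following is a minimal sketch of how three such processes could combine into one simulation. It assumes the classic Lotka-Volterra functional forms, which are not necessarily those used in SC-IPM, and every parameter value (`r`, `a`, `e`, `m`, `dt`) is an illustrative assumption.

```python
# Each process returns its contribution to the rate of change of the
# variables it affects. Forms and parameter values are assumptions.

def prey_growth(prey, pred, r=1.0):
    return {"prey": r * prey}


def predation(prey, pred, a=0.1, e=0.5):
    grazing = a * prey * pred  # algebraic grazing rate (assumed mass-action form)
    return {"prey": -grazing, "pred": e * grazing}


def predator_loss(prey, pred, m=0.5):
    return {"pred": -m * pred}


PROCESSES = (prey_growth, predation, predator_loss)


def simulate(prey, pred, dt=0.01, steps=1000):
    """Integrate the combined processes with a simple forward Euler scheme."""
    trajectory = [(prey, pred)]
    for _ in range(steps):
        d = {"prey": 0.0, "pred": 0.0}
        for process in PROCESSES:
            for var, rate in process(prey, pred).items():
                d[var] += rate
        prey += dt * d["prey"]
        pred += dt * d["pred"]
        trajectory.append((prey, pred))
    return trajectory


trajectory = simulate(prey=20.0, pred=5.0)
print(len(trajectory))  # 1001
```

With these assumed parameters the system has an equilibrium at prey = 10, pred = 10, where the three processes cancel; starting elsewhere produces the familiar oscillation. Forward Euler is used only for brevity and drifts over long runs.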
![Page 9: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/9.jpg)
Inducing Process Models Summary
• Input
  – Time-series data
  – Domain knowledge
  – Processes and constraints
• Structure search
  – Combine processes together using constraints and an evaluation strategy to limit the search
• Parameter search
  – For a given structure, fit parameters and evaluate
• Output
  – List of models ranked by score
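The summary above can be sketched as a toy search loop. This is not the SC-IPM algorithm: the generic processes, the parameter grids, the single constraint, the exhaustive enumeration, and the sum-of-squared-errors score are all assumptions chosen to keep the example small, and the "observations" are synthetic.

```python
from itertools import combinations, product

# Generic processes: each contributes one term to dx/dt and has one parameter.
GENERIC = {
    "growth": lambda x, p: p * x,
    "crowding": lambda x, p: -p * x * x,
    "saturating_loss": lambda x, p: -p * x / (1.0 + x),
}
# Candidate parameter values per process (a coarse, hypothetical grid).
GRID = {
    "growth": (0.5, 1.0, 1.5),
    "crowding": (0.05, 0.1, 0.2),
    "saturating_loss": (0.5, 1.0),
}


def simulate(structure, params, x0, dt, steps):
    """Forward Euler simulation of the summed process terms."""
    xs, x = [x0], x0
    for _ in range(steps):
        x += dt * sum(GENERIC[name](x, p) for name, p in zip(structure, params))
        xs.append(x)
    return xs


def sse(predicted, observed):
    return sum((a - b) * (a - b) for a, b in zip(predicted, observed))


def satisfies_constraints(structure):
    # Hypothetical domain constraint: every plausible model includes growth.
    return "growth" in structure


def induce(observed, x0, dt):
    """Structure search over process subsets, grid parameter search within each."""
    steps = len(observed) - 1
    names = list(GENERIC)
    ranked = []
    for size in range(1, len(names) + 1):
        for structure in combinations(names, size):
            if not satisfies_constraints(structure):
                continue
            score, params = min(
                (sse(simulate(structure, candidate, x0, dt, steps), observed), candidate)
                for candidate in product(*(GRID[name] for name in structure))
            )
            ranked.append((score, structure, params))
    return sorted(ranked)


# Synthetic observations generated from a known structure, for illustration only.
observed = simulate(("growth", "crowding"), (1.0, 0.1), x0=1.0, dt=0.1, steps=50)
ranked = induce(observed, x0=1.0, dt=0.1)
for score, structure, params in ranked:
    print(f"{score:12.4f}  {structure}  {params}")
```

Because the synthetic data come from the same simulator, the generating structure is recovered with a score of zero. Real data would need a proper optimizer for the parameter search and some penalty for model complexity.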
![Page 10: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/10.jpg)
Computational Model Discovery
Given:
– a methodology for the research and
– a meta-model for the process of the research and
– a set of representational forms for the observations (data)
– observations for a set of variables;
– a set of categories (entities) that the model may include;
– a set of generic processes that specify relations among entities;
– a set of constraints that indicate plausible relations among processes and entities;
Find:
– a specific process model and associated parameterization that not only predicts the observed values but also explains them
![Page 11: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/11.jpg)
EVE, a bench robot for drug discovery?
Qi et al, 2010, Journal of Integrative Bioinformatics, 7(3):126, 2010 http://journal.imbio.de
![Page 12: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/12.jpg)
GOES early fire detection system
Koltunov et al, 2012
![Page 13: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/13.jpg)
So, how close are we, in GIScience, to discovering process models?
![Page 14: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/14.jpg)
Example domain model: OneGeology
![Page 15: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/15.jpg)
Example library of analytical functions (PySAL)
![Page 16: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/16.jpg)
One possible process for scientific investigation
[Figure: cycle of investigation stages and the artifacts passed between them]
Stages:
- Exploration: EXPLORING, DISCOVERING
- Analysis: GENERALIZING, MODELING
- Evaluation: EXPLAINING, TESTING, GENERALIZING
- Presentation: COMMUNICATING, CONSENSUS-BUILDING
- Synthesis: LEARNING, CATEGORIZING
Artifacts: Data; Map; Explanation confidence; Results; Theory; Category, relation; Model; Concept; Hypothesis
Gahegan, 2005
![Page 17: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/17.jpg)
CyberGIS Grand Challenge
Create a ‘Geographical Process Model Discovery System’ that integrates:
– a science model
– a domain (data) model
– analysis software
– data
– (constraints)
![Page 18: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/18.jpg)
Are there limits to what we can learn from data?
• Yes, but our learned models may still be useful
• Yes, the model is, at best, as good as the data
  – But this still might be better than current theory
• Yes, but as data becomes ubiquitous, then these limits will retreat
![Page 19: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/19.jpg)
End
![Page 20: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/20.jpg)
CyberGIS Workflow: 5 simple (and also very complicated) steps
1. Discover and gain access to, and – to some extent – understand (e.g. the semantics, the provenance, the limitations of) each dataset we intend to use.
2. Harmonize these datasets into a consistent form (data model), for example by re-projecting, converting from raster to vector and harmonizing the semantics. (Data Model Integration)
3. Analyze the datasets via an analytical workflow of some kind. (Software Integration)
4. Validate the accuracy and suitability of the results and
5. Publish the results back into the Infrastructure. The results are of little value unless they maintain connections to the above steps.
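One way to read the last step is that every result should carry a record of the steps that produced it. The sketch below illustrates that idea only: the `Artifact` class, the `step` helper, the dataset name `example-survey`, and the sample values are hypothetical and do not correspond to any CyberGIS API.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """A value plus the ordered provenance records of the steps that produced it."""
    value: object
    provenance: list = field(default_factory=list)


def step(name, func, artifact, **params):
    """Apply func to the artifact's value and append a provenance record."""
    record = {"step": name, "params": params}
    return Artifact(func(artifact.value, **params), artifact.provenance + [record])


# Hypothetical stand-ins for steps 2 to 4, operating on elevations in feet.
def to_metres(values, factor):
    return [v * factor for v in values if v is not None]


def mean(values):
    return sum(values) / len(values)


def within_range(value, low, high):
    return value if low <= value <= high else None


# 1. Discover: the source and its known limitations are recorded up front.
raw = Artifact(
    [328.0, 656.0, None],
    [{"step": "discover", "params": {"source": "example-survey", "units": "feet"}}],
)
harmonized = step("harmonize", to_metres, raw, factor=0.3048)               # 2
analyzed = step("analyze", mean, harmonized)                                # 3
validated = step("validate", within_range, analyzed, low=0.0, high=9000.0)  # 4
# 5. Publish the result together with its full history.
published = {"result": validated.value, "provenance": validated.provenance}

for record in published["provenance"]:
    print(record["step"], record["params"])
```

The published record keeps the chain from source to result, which is the connection the final step asks for.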
![Page 21: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/21.jpg)
Learn a predictive model, even when entire steps/states are missing?
Bayesian belief network learning
![Page 22: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/22.jpg)
An example inferred model from GIScience
The consumer wants fit-for-purpose data, but the task and domain semantics are not given (latent variables).
Gahegan & Adams, 2014
![Page 23: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/23.jpg)
![Page 24: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/24.jpg)
![Page 25: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/25.jpg)
The education of the GIScientist?
• Better data custodian skills
• Better scientific computing skills, but you have to bring the geographic understanding too
• Deeper awareness of the processes/philosophy of our science
• A greater respect for data…
• An outward gaze…
![Page 26: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/26.jpg)
…with types of inference and examples of visual and computational methods
[Figure: cycle of investigation stages, the artifacts passed between them, and example methods]
Stages:
- Exploration: EXPLORING, DISCOVERING
- Analysis: GENERALIZING, MODELING
- Evaluation: EXPLAINING, TESTING, GENERALIZING
- Presentation: COMMUNICATING, CONSENSUS-BUILDING
- Synthesis: LEARNING, CATEGORIZING
Artifacts: Data; Concept; Results; Theory; Explanation confidence; Category, relation; Map; Model; Hypothesis
Example methods:
- Scatterplot, grand tour, projection pursuit, parallel coordinate plot, iconographic displays
- Self organizing map, k-means, clustering, geographical analysis machine, data mining, concept learning
- Interactive visual classification, parallel coordinate plot, separability plots, graphs of relationships
- Machine learning, maximum likelihood, decision trees, regression & correlation analysis
- Scene composition, information fusion, visual overlay
- Statistical modeling
- Uncertainty visualization
- Statistical testing, M-C simulation
- Maps, navigable worlds, charts, immersive visualizations
- Databases, digital libraries, clearinghouses
![Page 27: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/27.jpg)
The Evolving Paths to Knowledge
- The First Paradigm: Experiment/Measurement
- The Second Paradigm: Analytical Theory
- The Third Paradigm: Numerical Simulations
- The Fourth Paradigm: Data-Driven Science?
  Data fusion + data mining + synthesis/learning + explanation
(George Djorgovski, Caltech)
![Page 28: The end of geographic theory ? Prospects for model discovery in the geographic domain Mark Gahegan Centre for eResearch & Dept. Computer Science University](https://reader036.vdocuments.us/reader036/viewer/2022062421/56649f445503460f94c64ddb/html5/thumbnails/28.jpg)
Building Explanatory Models from Time-Series Data
• Process models are a natural choice
• Many ways to define process
• Processes are causal relations between one or more input and output variables
• Processes represent knowledge in notation familiar to scientists
  – Helpful for explanation