![Page 1: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/1.jpg)
U.S. Department of the Interior
U.S. Geological Survey
Data Integration Progress and Guiding Principles
Disciplines, generalization, and open-access.
David Blodgett – [email protected]
Office of Water Information, Center for Integrated Data Analytics
![Page 2: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/2.jpg)
Outline
· Data Integration Disambiguation
· Barriers to moving Forward.
· Anecdotes, everyone loves anecdotes!
· Principles to go Forward!
![Page 3: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/3.jpg)
![Page 4: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/4.jpg)
Disclosures
· I’m a water guy.
· I’m a millennial.
· I assume Internet.
· I’m a Badger.
· … Forward!
![Page 5: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/5.jpg)
Data Integration – Disambiguated.
Integration is the act of combining multiple things into a whole.
![Page 6: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/6.jpg)
Data Integration – Disambiguated.
What makes something integrated?
How different do things need to be to count?
Do you just need to combine things?
![Page 7: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/7.jpg)
What kind of data integration is needed for decisions?
2014-05-12
· Visual Integration
· Data Consolidation
· Data Warehouse
· Data Bundling
· Data Fusion
· Integrated Search
· Multi-source Data Ingest
· in the Cloud?
Application / Decision Driven Model of Data Integration
Slide Credit: Jeff de La Beaujardiere
![Page 8: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/8.jpg)
![Page 9: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/9.jpg)
A general model for data integration.
Disciplinary Details
Free and Open Service Access
Generalized Standards
![Page 10: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/10.jpg)
Service Orientation
On local machines, we run software.
List, introspect, summarize, transform, integrate.
Can scan the entire domain of the data!
A service may do any or all of these things.
Software on the server can summarize the domain and range of its holdings (i.e., deliver dynamic metadata).
![Page 11: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/11.jpg)
Web Service – So what?
Software on the server can summarize the domain and range of its holdings.
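As a rough illustration of what "summarizing the domain and range of its holdings" could look like, the sketch below computes a spatial and temporal extent (the domain) and a value extent (the range) from a handful of records. The `Observation` type, the `summarize_holdings` function, and the sample values are all hypothetical and are not taken from any USGS service.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    """One hypothetical record held by the service."""
    lon: float
    lat: float
    day: date
    value: float


def summarize_holdings(records):
    """Return the domain (space/time extent) and range (value extent)."""
    if not records:
        raise ValueError("no holdings to summarize")
    lons = [r.lon for r in records]
    lats = [r.lat for r in records]
    days = [r.day for r in records]
    values = [r.value for r in records]
    return {
        "count": len(records),
        "bbox": (min(lons), min(lats), max(lons), max(lats)),
        "time_extent": (min(days).isoformat(), max(days).isoformat()),
        "value_range": (min(values), max(values)),
    }


# Made-up holdings, used only to exercise the function.
HOLDINGS = [
    Observation(-89.4, 43.1, date(2014, 5, 1), 12.5),
    Observation(-90.2, 44.0, date(2014, 5, 12), 9.0),
    Observation(-88.0, 42.6, date(2014, 4, 20), 15.25),
]

summary = summarize_holdings(HOLDINGS)
print(summary)
```

Because the summary is computed from the holdings at request time, it stays current as records are added, which is one way to read "dynamic metadata".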
![Page 12: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/12.jpg)
Generalized Aspects of Data Services
Spatial / Temporal Extent
Attribute Extent
Blob of Bits
Available Formats
International Standards.
Various Communities’ Interchange
Discipline specific linked to other disciplines.
![Page 13: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/13.jpg)
Practical Barriers
‘I don’t know how to use the required software.’
‘The software I need is really expensive.’
‘The information I need is a big mess.’
‘The information I need is really big.’
![Page 14: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/14.jpg)
Understanding Barriers
‘The information is in a language I don’t know.’
‘The information is in a format I’ve never seen.’
‘The taxonomy used doesn’t work with mine.’
‘I’m not sure if what I’m seeing is a data quality issue or real.’
![Page 15: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/15.jpg)
Defensive Barriers
‘I collected this data and want to publish on it.’
‘People won’t interpret my data correctly.’
‘I don’t want to be liable for decisions made.’
‘This data’s quality is too low to stand behind.’
![Page 16: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/16.jpg)
Square Pegs and Round Holes
Coverages and Features
A grid cell IS NOT a point measurement!!!
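A small numeric sketch may help show why the two are not interchangeable. It assumes a made-up continuous field (`field`), approximates the areal mean that a grid cell would represent, and compares it with the value a gauge would read at one point inside that cell. The field and the gauge position are illustrative assumptions only.

```python
def field(x, y):
    """Hypothetical continuous field, e.g. precipitation depth."""
    return 10.0 + 4.0 * x + 2.0 * y


def cell_mean(x0, y0, size, n=100):
    """Approximate the areal mean over a square cell by midpoint sampling."""
    step = size / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += field(x0 + (i + 0.5) * step, y0 + (j + 0.5) * step)
    return total / (n * n)


# A gauge sitting near one corner of a unit cell (a feature).
gauge_value = field(0.9, 0.9)

# The value the grid cell carries for the same area (a coverage).
grid_value = cell_mean(0.0, 0.0, 1.0)

print("point measurement:", gauge_value)
print("grid cell mean:   ", grid_value)
```

The two numbers describe the same place but answer different questions, so comparing or merging them needs an explicit step such as areal weighting or interpolation.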
![Page 17: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/17.jpg)
Scale Discontinuity
![Page 18: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/18.jpg)
Anecdotes!...
Because they are instructive!
![Page 19: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/19.jpg)
Water Quality Portal
http://www.waterqualitydata.us
A joint USGS, EPA, and USDA service providing water quality and other environmental monitoring data.
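To show what machine access to a portal like this might look like, the sketch below only builds a station query URL and does not send a request. The `/data/Station/search` path and the `statecode` and `mimeType` parameters are assumptions based on my understanding of the portal's public web services and should be checked against its current documentation. `station_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumed endpoint path; verify against the portal's web services guide.
BASE_URL = "https://www.waterqualitydata.us/data/Station/search"


def station_query_url(state_code, mime_type="csv"):
    """Build a station search URL for one state (parameter names assumed)."""
    params = {"statecode": state_code, "mimeType": mime_type}
    return f"{BASE_URL}?{urlencode(params)}"


# "US:55" is the FIPS-style code for Wisconsin.
url = station_query_url("US:55")
print(url)
```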
![Page 20: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/20.jpg)
Integrated Ocean Observing System
![Page 21: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/21.jpg)
Weather Underground
42K Current Conditions Weather Stations!
![Page 22: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/22.jpg)
Geo Data Portal Data Integration Framework
Common architecture for accessing and processing multiple environmental data resources!
· Weather
· Landscape
· Climate
Center for Integrated Data Analytics: Nate Booth, Tom Kunicki, Dave Blodgett, Jordan Walker, Ivan Suftin, I-Lin Kuo.
![Page 23: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/23.jpg)
Enabling Technologies…
![Page 24: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/24.jpg)
____.data.gov – Big Win!
Data access type is a first class citizen!
Includes both human and machine metadata.
Machine-interpretability is an expectation.
Content management systems and catalogs are becoming data service providers!!!
![Page 25: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/25.jpg)
Forward!
![Page 26: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/26.jpg)
Principle #1: Data Object Patterns
We must continue to identify and model the common patterns our data adhere to.
Non-interpretive content / attributes should be provided by service ‘methods’.
These patterns must transcend discipline or implementation.
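One way to picture a pattern that transcends discipline and implementation is a shared set of service "methods" that report non-interpretive facts. In the sketch below, the `DataService` protocol, the two example services, and all returned values are hypothetical and are not drawn from any existing standard.

```python
from typing import Dict, List, Protocol, Tuple


class DataService(Protocol):
    """Hypothetical pattern: facts any data service can report about itself."""

    def spatial_extent(self) -> Tuple[float, float, float, float]: ...
    def temporal_extent(self) -> Tuple[str, str]: ...
    def formats(self) -> List[str]: ...


class StreamGaugeService:
    """Feature-style holdings from a hydrology provider (made-up values)."""

    def spatial_extent(self):
        return (-92.9, 42.5, -86.8, 47.1)

    def temporal_extent(self):
        return ("1990-01-01", "2014-05-12")

    def formats(self):
        return ["csv", "xml"]


class ClimateGridService:
    """Coverage-style holdings from a climate provider (made-up values)."""

    def spatial_extent(self):
        return (-125.0, 24.0, -66.0, 50.0)

    def temporal_extent(self):
        return ("1950-01-01", "2099-12-31")

    def formats(self):
        return ["netcdf"]


def describe(service: DataService) -> Dict[str, object]:
    """Collect the same non-interpretive facts from any conforming service."""
    return {
        "bbox": service.spatial_extent(),
        "time": service.temporal_extent(),
        "formats": service.formats(),
    }


catalog = [describe(s) for s in (StreamGaugeService(), ClimateGridService())]
print(catalog)
```

The client code in `describe` never needs to know which discipline a service comes from, which is the point of modeling the pattern rather than the implementation.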
![Page 27: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/27.jpg)
Principle #2: Domain Semantics.
Semantic relationships are necessarily governed by a given scientific domain itself.
This is foundational to all additional interdisciplinary concerns.
![Page 28: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/28.jpg)
Principle #3: __ - Agnostic Standards
Standards, specifications, and best practices must be ____ - agnostic.
A standard can be implemented using any technology, in any discipline.
e.g., WaterML2 -> TimeSeriesML
![Page 29: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/29.jpg)
Principle #4: Identity Management
Uniqueness can’t be taken for granted and must be curated very deliberately.
You are not your location. Neither is a place.
Foundational to linking any and all information to an entity.
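The sketch below illustrates the idea under simple assumptions: each entity receives a curated identifier that is independent of its coordinates, so two sites at the same location stay distinct and a relocated site keeps its linked information. `MonitoringSite`, `register`, and the sample values are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MonitoringSite:
    """Hypothetical entity whose identity is separate from its location."""
    name: str
    lon: float
    lat: float
    site_id: str = field(default_factory=lambda: uuid.uuid4().hex)


registry = {}


def register(site):
    """Store a site under its identifier and return that identifier."""
    registry[site.site_id] = site
    return site.site_id


# Two different wells that happen to share coordinates.
well_a = MonitoringSite("Well A", -89.4, 43.07)
well_b = MonitoringSite("Well B", -89.4, 43.07)
id_a = register(well_a)
id_b = register(well_b)

# Information is linked to the identifier, not to the coordinates.
observations = {id_a: [7.1], id_b: [6.8]}

# Correcting or moving a location does not change which entity this is.
well_a.lon, well_a.lat = -89.5, 43.10

print(id_a != id_b, registry[id_a].lon, observations[id_a])
```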
![Page 30: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/30.jpg)
A few thoughts to leave you with…
Maps are metadata.
Index-based data access is dead.
A geospatial database should be coherent without its spatial table.
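One possible reading of the last statement is that geometry should live beside the entities rather than define them. The sketch below uses an in-memory SQLite database with hypothetical table and column names: attribute queries give the same answer before and after the geometry table is dropped.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE site (site_id TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE result (
        site_id TEXT REFERENCES site(site_id), day TEXT, value REAL
    );
    -- Geometry is keyed by the same identifier but stored separately.
    CREATE TABLE site_geometry (
        site_id TEXT PRIMARY KEY REFERENCES site(site_id), wkt TEXT
    );
    """
)
conn.executemany(
    "INSERT INTO site VALUES (?, ?)",
    [("s1", "Upstream"), ("s2", "Downstream")],
)
conn.executemany(
    "INSERT INTO result VALUES (?, ?, ?)",
    [("s1", "2014-05-01", 4.0), ("s1", "2014-05-02", 6.0),
     ("s2", "2014-05-01", 9.0)],
)
conn.executemany(
    "INSERT INTO site_geometry VALUES (?, ?)",
    [("s1", "POINT(-89.4 43.1)"), ("s2", "POINT(-89.3 43.0)")],
)


def mean_by_site(connection):
    """Average result per site, using only non-spatial tables."""
    rows = connection.execute(
        "SELECT s.name, AVG(r.value) FROM site s "
        "JOIN result r ON r.site_id = s.site_id "
        "GROUP BY s.site_id ORDER BY s.name"
    ).fetchall()
    return dict(rows)


before = mean_by_site(conn)
conn.execute("DROP TABLE site_geometry")  # remove the spatial table
after = mean_by_site(conn)
print(before == after, after)
```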
![Page 31: Data Integration Progress and Guiding Principles](https://reader035.vdocuments.us/reader035/viewer/2022081503/5681635a550346895dd41989/html5/thumbnails/31.jpg)
Summary
A standard is an established generalization.
Scientific disciplines govern their semantics.
Open-access (the internet) must be a given.