Some problems with standard geospatial metadata
TRANSCRIPT
Simon Cox, Bruce Simons, Nick Car
12 March 2015
LAND AND WATER FLAGSHIP
This presentation
• Asks some questions
• Does not provide all the answers
• … but suggests some directions …
Problems with metadata | Nick Car
Outline
• ANZLIC and GeoNetwork
• Where did ANZLIC come from?
• Records
• Uses of metadata
• UML vs XML
• RDF
• RDF vocabularies
ANZLIC Metadata
Where did ANZLIC come from?
• ANZLIC is a profile of ISO 19115:2003
• ISO 19115 was designed by a committee (horse designed by committee = a camel?)
• US FGDC metadata was a strong precedent
• Requirements were collected in the 1990s: dawn of the internet, dataset = file
• Image and map librarians: 10,000s of datasets in standard series, metadata = digital ‘index cards’
Problem #1: Data ≠ Datasets?
• When cataloguing books, maps, images, even files, the card-index metaphor is OK
– A discrete record for each item of data
• Now we expect to access data at a variety of granularities, so the dataset/metadata-record paradigm no longer applies
• It is a sea of data, and should be matched by a sea of metadata (maybe in the same place)
Breaking it down
• Structural decomposition
• Functional decomposition
Lawrence, Lowry, Miller, Snaith & Woolf, Information in environmental data grids. Phil. Trans. A, 2009
Problem #2: One record can’t serve all purposes
• But one ‘record’ is all you got!
GeoNetwork stores metadata as XML documents in a text database (Lucene)
Problem #3: Documents package text, not objects
• Instances of UML classes = Objects
• XML document = serialization for transport
Treating the XML document as ‘canonical’ makes a basic category error: XML validation ≠ quality control
If you only intend to manage it as text, why bother with a UML analysis?
For object-oriented behavior, the serialized form must be ‘un-marshalled’ for processing
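The gap between a document that parses and an object that makes sense can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not GeoNetwork's or ANZLIC's actual processing: the element names, the `Record` and `Extent` classes, and the sample values are all hypothetical and far simpler than real ISO 19115 XML. The fragment is well-formed, and a schema that only required numeric attributes would accept it, yet the un-marshalled object can report that its latitudes are the wrong way round.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List

# Hypothetical, heavily simplified record (real ISO 19115 XML is namespaced and much richer).
# Note that south and north are swapped: the text is well-formed but the content is wrong.
RECORD_XML = """\
<metadata>
  <title>Surface water observations</title>
  <extent west="138.0" east="153.0" south="-10.0" north="-29.0"/>
</metadata>
"""


@dataclass
class Extent:
    west: float
    east: float
    south: float
    north: float

    def problems(self) -> List[str]:
        """Checks that need object semantics, not just text structure."""
        found = []
        if self.south > self.north:
            found.append("south latitude is greater than north latitude")
        if not (-90.0 <= self.south <= 90.0 and -90.0 <= self.north <= 90.0):
            found.append("latitude outside the range -90..90")
        return found


@dataclass
class Record:
    title: str
    extent: Extent


def unmarshal(xml_text: str) -> Record:
    """Turn the serialized form back into objects."""
    root = ET.fromstring(xml_text)  # succeeds: the document is well-formed
    attrs = root.find("extent").attrib
    extent = Extent(
        west=float(attrs["west"]),
        east=float(attrs["east"]),
        south=float(attrs["south"]),
        north=float(attrs["north"]),
    )
    return Record(title=root.findtext("title"), extent=extent)


record = unmarshal(RECORD_XML)
issues = record.extent.problems()
print(issues)
```

The check lives on the object, so it applies however the record was serialized; the XML is only one way of moving it around.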
Problem #4: Index cards are not infrastructure
• Metadata-entry paradigm encourages record counting as a KPI
• Surely there are better measures of usefulness?
• How can we know, if it is not part of a joined-up architecture?
What does everyone else do?
1. Specialist systems for specialized communities
– Is spatial special? Do we want our spatial data in the mainstream?
2. Don’t bother with metadata, just index the content
– The original strategy of the search engines
– Google Knowledge Graph now works with entities, not text
– (shame the entities don’t have persistent URIs …)
3. Metadata annotations – schema.org – semantic-web-lite
4. What about the Data Repositories?
Research Data Repositories
• Still a lot of variation
– RIF-CS
– MARC
– Dublin Core
– Data Catalog Vocabulary (DCAT)
• RDF vocabularies?
– DC, DCAT
– FOAF, PROV-O, VoID, SKOS, ADMS, LOCN
RDF benefits
• Standard vocabularies used in the broader community
• Intrinsically object/resource oriented
• URIs for keys - linked data
• Open world – missing information doesn’t make it invalid
• No intrinsic granularity
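A rough way to picture these properties is a set of subject-predicate-object statements. The sketch below uses plain Python tuples rather than an RDF library, so it is only an analogy for a triple store. The Dublin Core Terms, DCAT and RDF namespace URIs are the published ones; everything under `http://example.org/` and all the sample values are hypothetical.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"
EX = "http://example.org/"  # hypothetical namespace for the things being described

# Each statement is (subject, predicate, object); URIs act as the keys.
graph = {
    (EX + "dataset/1", RDF_TYPE, DCAT + "Dataset"),
    (EX + "dataset/1", DCT + "title", "Surface water observations"),
    # No intrinsic granularity: a single observation is described in the same graph.
    (EX + "observation/42", DCT + "isPartOf", EX + "dataset/1"),
    (EX + "observation/42", DCT + "created", "2015-03-12"),
}


def describe(subject, triples):
    """Collect every (predicate, object) pair stated about one resource."""
    return sorted((p, o) for s, p, o in triples if s == subject)


before = describe(EX + "dataset/1", graph)

# Open world: the dataset had no creator statement, which did not make it invalid.
# A statement added later, possibly by someone else, simply extends the description.
graph.add((EX + "dataset/1", DCT + "creator", EX + "agent/a"))
after = describe(EX + "dataset/1", graph)

print(len(before), len(after))
print(describe(EX + "observation/42", graph))
```

Because the dataset and the observation are both just subjects in one graph, there is no separate "record" whose boundary has to be decided in advance.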
Summary
ANZLIC + GeoNetwork:
Record-oriented metadata doesn’t match granularity of data
Each record must serve multiple functions
Object-oriented design, but serialization-oriented processing
Incentive to create records, not architecture
Not aligned with anyone else’s metadata
RDF?:
Graph of metadata to match graph of data
Targeted metadata subsets can be constructed using SPARQL
Intrinsically resource-oriented
Part of web of Linked Data
Standard RDF vocabularies
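The idea of targeted subsets can be sketched without a triple store. The Python below stands in for SPARQL with a simple pattern matcher over tuples; it is an illustration under stated assumptions, not a SPARQL implementation. The `match` and `subset` helpers, the `http://example.org/` resources and the sample values are hypothetical; the Dublin Core Terms and PROV-O property URIs are the published ones.

```python
from typing import Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]

DCT = "http://purl.org/dc/terms/"
PROV = "http://www.w3.org/ns/prov#"
EX = "http://example.org/"  # hypothetical namespace

GRAPH: Set[Triple] = {
    (EX + "dataset/1", DCT + "title", "Surface water observations"),
    (EX + "dataset/1", DCT + "license", EX + "licence/open"),
    (EX + "dataset/1", PROV + "wasGeneratedBy", EX + "activity/survey-7"),
    (EX + "activity/survey-7", PROV + "used", EX + "sensor/3"),
}


def match(graph: Set[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> Set[Triple]:
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return {
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


def subset(graph: Set[Triple], predicates: Iterable[str]) -> Set[Triple]:
    """Build a purpose-specific view from the one shared graph."""
    return {t for pred in predicates for t in match(graph, p=pred)}


# Roughly what this SPARQL would return (prefix declarations omitted):
#   CONSTRUCT { ?s ?p ?o }
#   WHERE { ?s ?p ?o . FILTER(?p IN (dct:title, dct:license)) }
discovery = subset(GRAPH, [DCT + "title", DCT + "license"])

# A different function of metadata, drawn from the same graph.
provenance = subset(GRAPH, [PROV + "wasGeneratedBy", PROV + "used"])

print(len(discovery), len(provenance))
```

Each view is computed on demand, so no single record has to serve discovery and provenance at once.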
LAND AND WATER FLAGSHIP
Thank you
Land and Water Flagship
Nick Car
Research Engineer
t +61 7 3833 5600
e [email protected]
Land and Water Flagship
Simon Cox
Research Scientist
t +61 3 9252 6342
e [email protected]
people.csiro.au/C/S/Simon-Cox