Some problems with standard geospatial metadata. Simon Cox, Bruce Simons, Nick Car. 12 March 2015.


Simon Cox, Bruce Simons, Nick Car

12 March 2015

LAND AND WATER FLAGSHIP

Some problems with standard geospatial metadata

This presentation

• Asks some questions

• Does not provide all the answers

• … but suggests some directions …


Problems with metadata | Nick Car 2 |

Outline

• ANZLIC and GeoNetwork

• Where did ANZLIC come from?

• Records

• Uses of metadata

• UML vs XML

• RDF

• RDF vocabularies


ANZLIC Metadata


Where did ANZLIC come from?

● ANZLIC is a profile of ISO 19115:2003

● ISO 19115 was designed by a committee (horse designed by committee = camel)

○ US FGDC metadata a strong precedent

○ requirements collected in the 1990s

○ image and map librarians

> dawn of the internet, dataset == file

> 10,000s of datasets in standard series, metadata == digital ‘index cards’
Problem #1: Data ≠ Datasets?

• When cataloguing books, maps, images, even files, the card-index metaphor is OK

• A discrete record for each item of data

• Now we expect to access data at a variety of granularities, so the dataset/metadata record paradigm no longer applies

• It is a sea of data, and should be matched by a sea of metadata (maybe in the same place)


Breaking it down

• Structural decomposition


• Functional decomposition

Lawrence, Lowry, Miller, Snaith & Woolf, Information in environmental data grids. Phil. Trans. A, 2009

Problem #2: One record can’t serve all purposes

• But one ‘record’ is all you got!


ISO metadata was formalized as UML classes


GeoNetwork stores metadata as XML documents in a text database (Lucene)


Problem #3: Documents package text, not objects

• Instances of UML classes = Objects

• XML document = serialization for transport

• Treating the XML document as ‘canonical’ makes a basic category error:

➢ XML validation ≠ quality control

➢ if you only intend to manage it as text, why bother with a UML analysis?

For object-oriented behavior, the serialized form must be ‘un-marshalled’ for processing
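A minimal sketch of the point above, assuming a made-up, simplified record (the element and attribute names are illustrative only and are not the ISO 19139 encoding). The XML parses without complaint, yet a problem in the content only shows up once the text is un-marshalled into objects that carry their own checks:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical, simplified record. Well-formed XML, but south and north
# are swapped, which no purely textual check would notice.
RECORD_XML = """
<record>
  <title>Soil moisture grid</title>
  <extent south="-10.0" north="-44.0" west="112.0" east="154.0"/>
</record>
"""


@dataclass
class Extent:
    south: float
    north: float
    west: float
    east: float

    def problems(self):
        """Object-level quality checks on the bounding box."""
        issues = []
        if self.south > self.north:
            issues.append("south latitude is greater than north latitude")
        for name in ("south", "north"):
            if not -90.0 <= getattr(self, name) <= 90.0:
                issues.append(f"{name} latitude out of range")
        return issues


@dataclass
class Record:
    title: str
    extent: Extent


def unmarshal(xml_text: str) -> Record:
    """Turn the serialized form into objects that can be processed."""
    root = ET.fromstring(xml_text)
    ext = root.find("extent")
    keys = ("south", "north", "west", "east")
    return Record(
        title=root.findtext("title"),
        extent=Extent(**{k: float(ext.get(k)) for k in keys}),
    )


record = unmarshal(RECORD_XML)   # parsing succeeds: the text is fine
issues = record.extent.problems()  # the object reports the real problem
```

Parsing stands in here for schema validation, which the Python standard library does not provide; the same argument applies to a schema-valid document.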


Metadata creation


Problem #4: Index cards are not infrastructure

• Metadata-entry paradigm encourages record counting as a KPI

• Surely there are better measures of usefulness?

• How can we know, if it is not part of a joined-up architecture?


What does everyone else do?

1. Specialist systems for specialized communities

– Is spatial special? Do we want our spatial data in the mainstream?

2. Don’t bother with metadata, just index the content

– The original strategy of the search engines

– Google Knowledge Graph now works with entities, not text

– (shame the entities don’t have persistent URIs …)

3. Metadata annotations

– schema.org

– semantic-web-lite

4. What about the Data Repositories?


Research Data Repositories

• Still a lot of variation

• RIF-CS

• MARC

• Dublin Core

• Data Catalog Vocabulary (DCAT)

• RDF vocabularies?

• DC, DCAT

• FOAF, PROV-O, VoID, SKOS, ADMS, LOCN


INSPIRE profile of DCAT-AP


INSPIRE metadata record as RDF


RDF benefits

• Standard vocabularies used in the broader community

• Intrinsically object/resource oriented

• URIs for keys - linked data

• Open world – missing information doesn’t make it invalid

• No intrinsic granularity
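A rough sketch of these properties, using plain Python tuples as triples rather than an RDF library. The Dublin Core and DCAT term URIs are the published ones; everything under `http://example.org/` is a hypothetical identifier invented for illustration:

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# One graph holds statements about a whole dataset and about a single
# observation inside it: no intrinsic granularity, URIs as keys.
triples = {
    (EX + "dataset/1", RDF_TYPE, DCAT + "Dataset"),
    (EX + "dataset/1", DCT + "title", "Stream gauge readings"),
    (EX + "dataset/1/obs/42", DCT + "isPartOf", EX + "dataset/1"),
    (EX + "dataset/1/obs/42", DCT + "creator", EX + "person/a"),
}


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Metadata attached to one observation, not to the dataset as a whole.
about_obs = match(s=EX + "dataset/1/obs/42")

# Open world: the dataset has no stated creator. The result is simply
# empty; nothing about the graph becomes invalid.
dataset_creators = match(s=EX + "dataset/1", p=DCT + "creator")
```

A real deployment would more likely use a triple store or an RDF library; the tuples here only make the shape of the data visible.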


Summary

ANZLIC + GeoNetwork:

☹ Record-oriented metadata doesn’t match granularity of data

☹ Each record must serve multiple functions

☹ Object-oriented design, but serialization-oriented processing

☹ Incentive to create records, not architecture

☹ Not aligned with anyone else’s metadata

RDF?:

☺ Graph of metadata to match graph of data

☺ Targeted metadata subsets can be constructed using SPARQL

☺ Intrinsically resource-oriented

☺ Part of web of Linked Data

☺ Standard RDF vocabularies
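To illustrate the idea of targeted subsets, here is a small sketch in plain Python that plays the role a SPARQL CONSTRUCT query would play against a real triple store. The view names and the grouping of predicates into "discovery" and "access" are assumptions made for this example, and the `http://example.org/` identifiers are hypothetical:

```python
DCT = "http://purl.org/dc/terms/"
DCAT = "http://www.w3.org/ns/dcat#"
EX = "http://example.org/"  # hypothetical namespace, for illustration only

# One graph of metadata about a dataset.
GRAPH = [
    (EX + "dataset/1", DCT + "title", "Stream gauge readings"),
    (EX + "dataset/1", DCAT + "keyword", "hydrology"),
    (EX + "dataset/1", DCT + "license", EX + "licence/open"),
    (EX + "dataset/1", DCAT + "landingPage", EX + "dataset/1/about"),
]

# Assumed purpose-specific views: which predicates each use cares about.
VIEWS = {
    "discovery": {DCT + "title", DCAT + "keyword"},
    "access": {DCT + "license", DCAT + "landingPage"},
}


def construct(view):
    """Build the subgraph for one purpose from the single shared graph."""
    wanted = VIEWS[view]
    return [t for t in GRAPH if t[1] in wanted]


discovery = construct("discovery")  # what a search interface needs
access = construct("access")        # what a download client needs
```

Each purpose gets its own subset on demand, so no single stored record has to serve every function.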


Thank you

Land and Water Flagship
Nick Car
Research Engineer
t +61 7 3833 5600
e nicholas.car@csiro.au

Land and Water Flagship
Simon Cox
Research Scientist
t +61 3 9252 6342
e simon.cox@csiro.au
w people.csiro.au/C/S/Simon-Cox
