GLOBAL BIODIVERSITY INFORMATION FACILITY: Metadata publishing with the IPT. Éamonn Ó Tuama, Senior Programme Officer, IDA, 21 June 2011. WWW.GBIF.ORG


Page 1: GLOBAL BIODIVERSITY INFORMATION FACILITY Éamonn Ó Tuama Senior Programme Officer, IDA 21 June 2011  Metadata publishing with the IPT

GLOBAL BIODIVERSITY INFORMATION FACILITY

Metadata publishing with the IPT

Éamonn Ó Tuama

Senior Programme Officer, IDA

21 June 2011

WWW.GBIF.ORG

Page 2:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 4:

Why metadata?

"Data Intensive Science"

"Fourth Science Paradigm"

"Digital Data Deluge"

- e-Infrastructure Reflection Group (European Strategy Forum on Research Infrastructures). Report on Data Management, November 2009. http://www.e-irg.eu/images/stories/publ/task_force_reports/dmtfjointreport.pdf

- The Fourth Paradigm: Data-Intensive Scientific Discovery. http://research.microsoft.com/en-us/collaboration/fourthparadigm/contents.aspx

Key requirement: high quality metadata for long term curation and use of datasets

Page 5:

Why metadata?

William K. Michener, Meta-information concepts for ecological data management, Ecological Informatics, Volume 1, Issue 1, January 2006, Pages 3-7, ISSN 1574-9541, DOI: 10.1016/j.ecoinf.2005.08.004. (http://www.sciencedirect.com/science/article/B7W63-4HJRS57-3/2/ea2e08412c6776456f540e66983546c0)

Information about datasets deteriorates over time!

Page 6:

Why metadata?

Metadata supports:

- Discovery

- Interpretation/Evaluation

- Provenance

- Quality

- Fitness-for-use

- Analytical re-use

Page 7:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 8:

Metadata Standards

Ecological Metadata Language (EML) v2.1.1 http://knb.ecoinformatics.org/software/eml/

Page 9:

Metadata Standards

Dublin Core http://dublincore.org/documents/dcmi-terms/

Page 10:

Metadata Standards

Directory Interchange Format (DIF) http://gcmd.nasa.gov/User/difguide/difman.html

Page 11:

Metadata Standards

ISO 19115/19139 Geographic Metadata

- ISO 19115: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=26020

- ISO 19139: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=32557

Page 12:

Metadata Standards

Natural Collections Descriptions (NCD) http://www.tdwg.org/standards/312/

Page 13:

Metadata Standards

Federal Geographic Data Committee (FGDC) Biological Profile http://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/biometadata/

An extension of the FGDC CSDGM (Content Standard for Digital Geospatial Metadata)

Page 14:

Metadata Standards

Multimedia Resources Metadata Schema http://www.tdwg.org/charters/article/view/448/36

Page 15:

NAP ISO 19115

Attributes describing the metadata

Components to describe the resource

Source: http://www.fgdc.gov/standards/projects/incits-l1-standards-projects/NAP-Metadata/napMetadataProfileV101.pdf/view

Page 16:

ISO 19115/19139

North American Profile of ISO 19139 http://www.fgdc.gov/standards/projects/incits-l1-standards-projects/NAP-Metadata/napMetadataProfileV101.pdf/view

Several resources available for crosswalk, transform and view:

- EML to FGDC Biological Profile: https://code.ecoinformatics.org/code/eml/trunk/lib/eml2tonbii/

- FGDC CSDGM to ISO Transform; FGDC CSDGM to ISO Crosswalk; ISO XML to HTML View; FGDC BIO to ISO Transform; FGDC BIO to ISO Crosswalk: http://www.ncddc.noaa.gov/technology/metadataandxml/view

- EML to ISO 19139: http://code.google.com/p/gbif-metadata/source/browse/trunk/metadata/src/main/resources/eml2iso19139.xsl

- Open source INSPIRE-compliant MD editor (multilingual functionality): http://www.inspire-geoportal.eu/EUOSME/

FGDC CSDGM

ISO 19139

Page 17:

Metadata and Languages

A Multilingual Metadata Catalog for the ILTER: Issues and Approaches. Vanderbilt, K.L., et al., Ecological Informatics, Volume 5, Issue 3, May 2010, Pages 187-193, doi:10.1016/j.ecoinf.2010.02.002

- Adopt a lingua franca, e.g., English: data publishers provide discovery level metadata in English; full metadata in local language.

- Just use local language with keywords from multilingual thesauri, e.g.:

  - GEMET, the GEneral Multilingual Environmental Thesaurus; 27 languages. http://www.eionet.europa.eu/gemet/

  - AGROVOC; agriculture, forestry, fisheries, food and related domains, e.g., environment; 20 languages. http://www4.fao.org/agrovoc/default.htm

- Long term solution: multilingual ontologies

- Issues? Additional burden; tools; metadata standards

Page 18:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 19:

GBIF EML Profile

- Requirements gathering

  - GBIF Metadata Task Group: http://www2.gbif.org/GBIF-MIFTG-Report.pdf

  - EML; ISO 19115; NCD; INSPIRE Directive: http://community.gbif.org/mod/file/download.php?file_guid=10915; http://community.gbif.org/mod/file/download.php?file_guid=5656

- GBIF EML schema: http://rs.gbif.org/schema/eml-gbif-profile/1.0/eml-gbif-profile.xsd

- GBIF community site, metadata network: http://community.gbif.org/pg/groups/5258/gbif-metadata-network/

- GBIF profile documentation: http://links.gbif.org/gbif_metadata_profile_how-to_en_v1; http://links.gbif.org/gbif_metadata_profile_guide_en_v1

Page 20:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 21:

Preparing metadata

- Metadata editors, e.g., IPT; Spreadsheet template; Morpho; EUOSME

- Scripting

  - Output directly from existing metadata database

  - Transform from another metadata specification to EML

- Editing XML directly

  - Validation essential
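
The scripting route (output directly from an existing metadata database) might look roughly like the sketch below. It is an illustration only: the record fields (`id`, `title`, `organisation`, `abstract`), the `system` value and the function name `record_to_eml` are hypothetical, and the skeleton it emits covers just a few dataset elements, so it should not be expected to validate against the GBIF EML profile schema as written. Full validation against the XSD remains essential and is not shown here.

```python
import xml.etree.ElementTree as ET

# Namespace of EML 2.1.1, the version named on the "Metadata Standards" slide.
EML_NS = "eml://ecoinformatics.org/eml-2.1.1"
ET.register_namespace("eml", EML_NS)


def record_to_eml(record: dict) -> str:
    """Serialise one (hypothetical) database record as a minimal EML skeleton."""
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {"packageId": record["id"], "system": "example-system"},  # assumed values
    )
    dataset = ET.SubElement(root, "dataset")
    ET.SubElement(dataset, "title").text = record["title"]
    creator = ET.SubElement(dataset, "creator")
    ET.SubElement(creator, "organizationName").text = record["organisation"]
    abstract = ET.SubElement(dataset, "abstract")
    ET.SubElement(abstract, "para").text = record["abstract"]
    return ET.tostring(root, encoding="unicode")


# Example record, standing in for a row read from a metadata database.
example = {
    "id": "example-dataset-1",
    "title": "Example bird observations",
    "organisation": "Example Museum",
    "abstract": "Observations collected for illustration only.",
}
eml_text = record_to_eml(example)

# Cheap sanity check: the output must at least parse as well-formed XML.
parsed = ET.fromstring(eml_text)
print(parsed.find("dataset/title").text)
```

Parsing the output back only confirms well-formedness; it says nothing about whether required elements are present or correctly ordered, which is what schema validation checks.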

Page 22:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 23:

Where does the metadata go?

http://metadata.gbif.org

Page 24:

Sources of Metadata

GBIF Data Cache

- Registered IPT installations

- National/regional/organisation level catalogues

- Thematic catalogues, e.g., OBIS

Our approach:

- no imposed metadata standard or preferred catalogue implementation for participants;

- avoidance of lossy conversions in submitting metadata

GBIF Participants

External networks, e.g., Knowledge Network for Biocomplexity (KNB)

Page 25:

GBIF metadata architecture

[Diagram. Components: GBIF Catalogue; GBIF Registry; EuroGEOSS Catalogue; Catalogue, e.g., GBIF Node; IPT Instance; Catalogue, e.g., KNB; GBIF Data Cache. Connections labelled: OAI-PMH; Direct payload.]

GBIF metadata catalogue specification: http://links.gbif.org/gbif_metadata_catalogue_specification.pdf

Page 26:

OAI-PMH

Open Archives Initiative Protocol for Metadata Harvesting

- Providing a low-barrier mechanism for interoperability across distributed metadata repositories

- Data providers expose metadata; service providers consume metadata through a client application known as a harvester that issues OAI-PMH service requests over HTTP:

1. GetRecord: return individual record
2. Identify: retrieve information about repository
3. ListIdentifiers: retrieve headers of records
4. ListMetadataFormats: return metadata formats available
5. ListRecords: return records from repository
6. ListSets: retrieve set structure (groupings) of repository

http://www.openarchives.org/pmh/

GBIF: role as harvester and provider
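
A harvester's side of this exchange can be sketched as follows. This is a minimal illustration, not GBIF's harvester: the base URL `http://example.org/oai`, the sample response and the function names are made up for the example, and no network request is actually sent. The six verb names, the `verb` and `metadataPrefix` request arguments, the `oai_dc` prefix and the response namespace come from the OAI-PMH 2.0 specification.

```python
from typing import List
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# The six OAI-PMH verbs listed on the slide.
OAI_VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}

# XML namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def build_request(base_url: str, verb: str, **arguments: str) -> str:
    """Return the HTTP GET URL for one OAI-PMH request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    return base_url + "?" + urlencode({"verb": verb, **arguments})


def list_identifiers(response_xml: str) -> List[str]:
    """Extract record identifiers from a ListIdentifiers response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.findall(".//oai:header/oai:identifier", OAI_NS)]


# Hypothetical repository endpoint; a real harvester would fetch this URL.
url = build_request("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc")
print(url)

# Hand-written, abbreviated response used in place of a live repository.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header><identifier>oai:example.org:dataset-1</identifier></header>
    <header><identifier>oai:example.org:dataset-2</identifier></header>
  </ListIdentifiers>
</OAI-PMH>"""
print(list_identifiers(SAMPLE_RESPONSE))
```

A production harvester would also need to follow resumption tokens for large result sets and handle the protocol's error responses, both omitted here for brevity.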

Page 27:

Outline

- Why metadata?

- The GBIF EML profile

- Metadata standards

- Preparation of metadata

- Where does the metadata go?

- Preparing metadata using the IPT

Page 28:

IPT metadata editor

- Preparing metadata according to GBIF EML profile

  - specimens/observations

  - names (checklists)

  - other (e.g., ecological data)

  - derived products (e.g., species distribution maps)

- Data set level metadata

- Output as part of DwC-A zip file (EML.xml)

- Metadata for published and unpublished data sets
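
Because the metadata travels inside the DwC-A zip file, a consumer can pull it out with standard tools. The sketch below is an assumption-laden illustration: it builds a tiny archive in memory rather than using a real IPT export, matches the metadata file name case-insensitively (the slide writes `EML.xml`), and the function name `read_dataset_title` and the sample contents are hypothetical.

```python
import io
import zipfile
from typing import Optional
import xml.etree.ElementTree as ET


def read_dataset_title(archive_bytes: bytes) -> Optional[str]:
    """Return the dataset title from the EML file inside a DwC-A zip, if any."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        # Accept "EML.xml" or "eml.xml" at the top level of the archive.
        name = next((n for n in zf.namelist() if n.lower() == "eml.xml"), None)
        if name is None:
            return None
        root = ET.fromstring(zf.read(name))
    title = root.find("dataset/title")
    return title.text if title is not None else None


# Build a stand-in archive in memory: one metadata file and one data file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr(
        "eml.xml",
        '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1">'
        "<dataset><title>Example checklist</title></dataset>"
        "</eml:eml>",
    )
    zf.writestr("taxon.txt", "taxonID\tscientificName\n1\tPuffinus puffinus\n")

print(read_dataset_title(buffer.getvalue()))
```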

Page 29:

How to contact GBIF:

Web site: www.gbif.org

Data portal: data.gbif.org

GBIF Secretariat, Universitetsparken 15, 2100 Copenhagen, Denmark

E-mail: [email protected]

Tel: +45 3532 1470

Fax: +45 3532 1480