GLOBAL BIODIVERSITY INFORMATION FACILITY
Éamonn Ó Tuama
Senior Programme Officer, IDA
21 June 2011
WWW.GBIF.ORG
Metadata publishing with the IPT
Outline
- Why metadata?
- The GBIF EML profile
- Metadata standards
- Preparation of metadata
- Where does the metadata go?
- Preparing metadata using the IPT
Why metadata?
“Data Intensive Science”
“Fourth Science Paradigm”
“Digital Data Deluge”
- e-Infrastructure Reflection Group (European Strategy Forum on Research Infrastructures). Report on Data Management, November 2009. http://www.e-irg.eu/images/stories/publ/task_force_reports/dmtfjointreport.pdf
- The Fourth Paradigm: Data-Intensive Scientific Discovery. http://research.microsoft.com/en-us/collaboration/fourthparadigm/contents.aspx
Key requirement: high-quality metadata for long-term curation and use of datasets
Information about datasets deteriorates over time!
Source: William K. Michener, Meta-information concepts for ecological data management, Ecological Informatics, Volume 1, Issue 1, January 2006, Pages 3-7, ISSN 1574-9541, DOI: 10.1016/j.ecoinf.2005.08.004. (http://www.sciencedirect.com/science/article/B7W63-4HJRS57-3/2/ea2e08412c6776456f540e66983546c0)
Metadata supports:
- Discovery
- Interpretation/Evaluation
- Provenance
- Quality
- Fitness-for-use
- Analytical re-use
Metadata Standards
- Ecological Metadata Language (EML) v2.1.1: http://knb.ecoinformatics.org/software/eml/
- Dublin Core: http://dublincore.org/documents/dcmi-terms/
- Directory Interchange Format (DIF): http://gcmd.nasa.gov/User/difguide/difman.html
- ISO 19115/19139 Geographic Metadata
  - ISO 19115: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=26020
  - ISO 19139: http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=32557
- Natural Collections Descriptions (NCD): http://www.tdwg.org/standards/312/
- Federal Geographic Data Committee (FGDC) Biological Profile: http://www.fgdc.gov/standards/projects/FGDC-standards-projects/metadata/biometadata/
  - An extension of the FGDC CSDGM (Content Standard for Digital Geospatial Metadata)
- Multimedia Resources Metadata Schema: http://www.tdwg.org/charters/article/view/448/36
ISO 19115/19139: North American Profile (NAP) of ISO 19115
- Attributes describing the metadata
- Components to describe the resource
Source: http://www.fgdc.gov/standards/projects/incits-l1-standards-projects/NAP-Metadata/napMetadataProfileV101.pdf/view
Several resources are available for crosswalk, transform, and view:
- EML to FGDC Biological Profile: https://code.ecoinformatics.org/code/eml/trunk/lib/eml2tonbii/
- FGDC CSDGM to ISO 19139 (http://www.ncddc.noaa.gov/technology/metadataandxml/view):
  - FGDC CSDGM to ISO Transform
  - FGDC CSDGM to ISO Crosswalk
  - ISO XML to HTML View
  - FGDC BIO to ISO Transform
  - FGDC BIO to ISO Crosswalk
- EML to ISO 19139: http://code.google.com/p/gbif-metadata/source/browse/trunk/metadata/src/main/resources/eml2iso19139.xsl
Open source INSPIRE-compliant MD editor (multilingual functionality)http://www.inspire-geoportal.eu/EUOSME/
Metadata and Languages
Source: A Multilingual Metadata Catalog for the ILTER: Issues and Approaches. Vanderbilt, K.L., et al., Ecological Informatics, Volume 5, Issue 3, May 2010, Pages 187-193, doi:10.1016/j.ecoinf.2010.02.002
- Adopt a lingua franca, e.g., English
  - data publishers provide discovery-level metadata in English;
  - full metadata in local language.
- Just use the local language with keywords from multilingual thesauri, e.g.:
  - GEMET, the GEneral Multilingual Environmental Thesaurus; 27 languages. http://www.eionet.europa.eu/gemet/
  - AGROVOC; agriculture, forestry, fisheries, food and related domains, e.g., environment; 20 languages. http://www4.fao.org/agrovoc/default.htm
- Long-term solution: multilingual ontologies
- Issues? Additional burden; tools; metadata standards
GBIF EML Profile
- Requirements gathering
- GBIF Metadata Task Group: http://www2.gbif.org/GBIF-MIFTG-Report.pdf
- EML; ISO 19115; NCD; INSPIRE Directive:
  - http://community.gbif.org/mod/file/download.php?file_guid=10915
  - http://community.gbif.org/mod/file/download.php?file_guid=5656
- GBIF EML schema: http://rs.gbif.org/schema/eml-gbif-profile/1.0/eml-gbif-profile.xsd
- GBIF community site, metadata network: http://community.gbif.org/pg/groups/5258/gbif-metadata-network/
- GBIF profile documentation:
  - http://links.gbif.org/gbif_metadata_profile_how-to_en_v1
  - http://links.gbif.org/gbif_metadata_profile_guide_en_v1
Preparing metadata
- Metadata editors, e.g., IPT; spreadsheet template; Morpho; EUOSME
- Scripting
  - Output directly from an existing metadata database
  - Transform from another metadata specification to EML
- Editing XML directly
- Validation essential
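The scripting route can be illustrated with a minimal sketch that turns one record from an existing metadata database into an EML document. The record layout (`id`, `title`, `organisation`, and so on), the sample values, and the `system` attribute value are assumptions made for illustration; only the EML 2.1.1 namespace and the element names come from the EML standard. The result is checked for well-formedness only, so it would still need validating against the GBIF EML profile schema.

```python
import xml.etree.ElementTree as ET

EML_NS = "eml://ecoinformatics.org/eml-2.1.1"  # EML 2.1.1 namespace
ET.register_namespace("eml", EML_NS)


def build_eml(record: dict) -> bytes:
    """Serialise one metadata record (hypothetical dict layout) as EML."""
    root = ET.Element(
        f"{{{EML_NS}}}eml",
        {
            "packageId": record["id"],
            "system": "http://gbif.org",  # illustrative value, an assumption
            "scope": "system",
        },
    )
    dataset = ET.SubElement(root, "dataset")
    # Children are added in the order the EML dataset module expects them.
    ET.SubElement(dataset, "title").text = record["title"]
    for role in ("creator", "metadataProvider"):
        party = ET.SubElement(dataset, role)
        ET.SubElement(party, "organizationName").text = record["organisation"]
    ET.SubElement(dataset, "pubDate").text = record["pub_date"]
    ET.SubElement(dataset, "language").text = record["language"]
    abstract = ET.SubElement(dataset, "abstract")
    ET.SubElement(abstract, "para").text = record["abstract"]
    contact = ET.SubElement(dataset, "contact")
    ET.SubElement(contact, "organizationName").text = record["organisation"]
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


# Made-up record standing in for a row from an existing metadata database.
record = {
    "id": "example-dataset-1",
    "title": "Example herbarium specimens",
    "organisation": "Example Natural History Museum",
    "pub_date": "2011-06-21",
    "language": "en",
    "abstract": "Specimen records digitised from an example collection.",
}

eml_bytes = build_eml(record)
ET.fromstring(eml_bytes)  # raises ParseError if the output is not well-formed
print(eml_bytes.decode("utf-8"))
```

Parsing the output back only proves the XML is well-formed. Schema validation needs an XSD-aware validator, which the Python standard library does not provide.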
Where does the metadata go?
http://metadata.gbif.org
Sources of metadata:
- GBIF Data Cache
- Registered IPT installations
- National/regional/organisation level catalogues
- Thematic catalogues, e.g., OBIS
Our approach:
- no imposed metadata standard or preferred catalogue implementation for participants;
- avoidance of lossy conversions in submitting metadata.
GBIF Participants
External networks, e.g., Knowledge Network for Biocomplexity (KNB)
GBIF metadata architecture
[Diagram. Components: GBIF Catalogue; GBIF Registry; EuroGEOSS Catalogue; Catalogue, e.g., GBIF Node; IPT Instance; Catalogue, e.g., KNB; GBIF Data Cache. Connections: OAI-PMH; direct payload.]
GBIF metadata catalogue specification: http://links.gbif.org/gbif_metadata_catalogue_specification.pdf
OAI-PMH
Open Archives Initiative Protocol for Metadata Harvesting: http://www.openarchives.org/pmh/
- Provides a low-barrier mechanism for interoperability across distributed metadata repositories
- Data providers expose metadata
- Service providers consume metadata through a client application known as a harvester that issues OAI-PMH service requests over HTTP:
  1. GetRecord: return individual record
  2. Identify: retrieve information about repository
  3. ListIdentifiers: retrieve headers of records
  4. ListMetadataFormats: return metadata formats available
  5. ListRecords: return records from repository
  6. ListSets: retrieve set structure (groupings) of repository
- GBIF: role as harvester and provider
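As a rough sketch of what a harvester does, the snippet below builds OAI-PMH request URLs and extracts record identifiers from a ListIdentifiers response. The endpoint `http://example.org/oai`, the sample response, and the helper names are hypothetical. The verb names, the `metadataPrefix` argument, the `resumptionToken` element, and the response namespace come from the OAI-PMH 2.0 specification. No network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}
BASE_URL = "http://example.org/oai"  # hypothetical repository endpoint


def oai_request(verb: str, **arguments: str) -> str:
    """Build the URL for one OAI-PMH request (sent as a plain HTTP GET)."""
    return f"{BASE_URL}?{urlencode({'verb': verb, **arguments})}"


def record_identifiers(response_xml: str) -> list:
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    return [
        header.findtext("oai:identifier", namespaces=OAI_NS)
        for header in root.iterfind(".//oai:header", OAI_NS)
    ]


def resumption_token(response_xml: str):
    """Return the token for the next page, or None when the list is complete."""
    root = ET.fromstring(response_xml)
    token = root.findtext(".//oai:resumptionToken", namespaces=OAI_NS)
    return token or None


# Made-up response of the shape a data provider returns for ListIdentifiers.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2011-06-21T12:00:00Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://example.org/oai</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:dataset-1</identifier>
      <datestamp>2011-06-01</datestamp>
    </header>
    <header>
      <identifier>oai:example.org:dataset-2</identifier>
      <datestamp>2011-06-15</datestamp>
    </header>
    <resumptionToken>page-2</resumptionToken>
  </ListIdentifiers>
</OAI-PMH>
"""

print(oai_request("ListIdentifiers", metadataPrefix="oai_dc"))
print(record_identifiers(SAMPLE_RESPONSE))
print(oai_request("ListIdentifiers", resumptionToken=resumption_token(SAMPLE_RESPONSE)))
```

A real harvester would fetch each URL, repeat the request with the returned `resumptionToken` until none is left, and then call GetRecord or ListRecords for the metadata itself.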
IPT metadata editor
- Preparing metadata according to GBIF EML profile
  - specimens/observations
  - names (checklists)
  - other (e.g., ecological data)
  - derived products (e.g., species distribution maps)
- Data set level metadata
- Output as part of DwC-A zip file (EML.xml)
- Metadata for published and unpublished data sets
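Because the metadata travels inside the DwC-A zip file, a consumer can read it without unpacking the archive. The sketch below assumes the metadata file sits at the archive root and is named `eml.xml` (matched case-insensitively). The demo archive, its contents, and the function names are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_dataset_title(archive) -> str:
    """Return the dataset title from the EML file inside a DwC-A zip."""
    with zipfile.ZipFile(archive) as dwca:
        # Assumption: the metadata file is at the archive root, named eml.xml.
        name = next(n for n in dwca.namelist() if n.lower() == "eml.xml")
        root = ET.fromstring(dwca.read(name))
    return root.findtext("dataset/title")


def make_demo_archive() -> io.BytesIO:
    """Build a tiny in-memory archive standing in for a published DwC-A."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as dwca:
        dwca.writestr(
            "eml.xml",
            '<eml:eml xmlns:eml="eml://ecoinformatics.org/eml-2.1.1">'
            "<dataset><title>Demo checklist</title></dataset>"
            "</eml:eml>",
        )
        dwca.writestr("taxon.txt", "taxonID\tscientificName\n1\tPuma concolor\n")
    buffer.seek(0)
    return buffer


print(read_dataset_title(make_demo_archive()))
```

The same function accepts a file path in place of the in-memory buffer, since `zipfile.ZipFile` takes either.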
How to contact GBIF:
Web site: www.gbif.org
Data portal: data.gbif.org
GBIF Secretariat
Universitetsparken 15
2100 Copenhagen
Denmark
E-mail: [email protected]
Tel: +45 3532 1470
Fax: +45 3532 1480