linking scientific metadata (presented at dc2010)

Post on 19-Jan-2015

1.158 Views

Category:

Technology

4 Downloads

Preview:

Click to see full reader

DESCRIPTION

Linked entity data in metadata records builds a foundation for semantic web. Even though metadata records contain rich entity data, there is no linking between associated entities such as persons, datasets, projects, publications, or organizations. We conducted a small experiment using the dataset collection from the Hubbard Brook Ecosystem Study (HBES), in which we converted the entities and their relationships into RDF triples and linked the URIs contained in RDF triples to the corresponding entities in the Ecological Metadata Language (EML) records. Through the transformation program written in XML Stylesheet Language (XSL), we turned a plain EML record display into an interlinked semantic web of ecological datasets. The experiment suggests a methodological feasibility in incorporating linked entity data into metadata records. The paper also argues for the need of changing the scientific as well as general metadata paradigm.

TRANSCRIPT

School of Information StudiesSyracuse University

Linking Entities in Scientific Metadata

Jian Qin, Miao Chen, Xiaozhong Liu, & Andrea Wiggins

School of Information Studies, Syracuse University

The context: Islands of research information

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 2

Data

Projects

Publications

Research interest

Researchers

Unlinked entities

Same entity!

04/10/2023 3Linking Entities in Scientific Metadata -- DC2010

Duplication of entity data entry

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 4

Seamless Daily Precipitation for the Conterminous United States

Metadata:Identification_InformationData_Quality_InformationSpatial_Data_Organization_InformationSpatial_Reference_InformationEntity_and_Attribute_InformationDistribution_InformationMetadata_Reference_Information

What’s lacking in scientific metadata?• Standards focus on describing datasets, not

entities• No mechanism is provided for linking entities

– It is considered as an implementation issue• Islands of entities duplication of data entry

for the same entity – Increased costs and time in creating metadata– Effect in resource discovery and browse

04/10/2023 5Linking Entities in Scientific Metadata -- DC2010

Defining the research Problem

04/10/2023 6Linking Entities in Scientific Metadata -- DC2010

How can we build an interlinked network of entities for a scientific domain?

How can we associate the linked entities with their corresponding metadata records?

Linked Data: A solution

04/10/2023 7Linking Entities in Scientific Metadata -- DC2010

Relational database

containing entities and relationships

Metadata records in

XML format

Problem: Lack relationships between entities

Problem: Not related to metadata records

ResourcePropertyType

Value

RDF TriplesConvert to Embed RDF triples into

Solution

Linked data: How it works

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 8

Linked data is

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 9

“…a recommended best practice for exposing, sharing, and connecting

pieces of data, information, and knowledge on the Semantic Web

using URIs and RDF.”

--Wikipedia, http://en.wikipedia.org/wiki/Linked_Data

A case study

04/10/2023 10Linking Entities in Scientific Metadata -- DC2010

Dataset collection search interface at HBES (http://hubbardbrook.org/data/dataset_search.php)

Hubbard Brook Ecosystem Study (HBES)• Long term ecological research sites since 1960s• 3,160 hectare reserve• Six principle organizations & 10 other participants:

– USDA Forest Service– Cornell– Dartmouth– Syracuse– Yale– the Institute of Ecosystem Studies (IES)– the U.S. Geological Survey

• Over 300 datasets available and 2000 publications

04/10/2023 11Linking Entities in Scientific Metadata -- DC2010

HBES Data Collection• Focused on entities on the HBES site:

– Projects– Persons– Publications– Subject interests– Datasets– Events

• Verified Person and Project information against the Long-Term Ecological Research (LTER) directory if necessary;

• Stored the entities in relational database• Metadata records in EML format

04/10/2023 12Linking Entities in Scientific Metadata -- DC2010

Ecological Metadata Language (EML) Structure and Modules

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 13

Conditions required for interlinking

• URI-identified entities• Relationships between these entities• Relationships between the entities

and metadata records

04/10/2023 14Linking Entities in Scientific Metadata -- DC2010

Experiment stage 1: Data prep• Two sets of data:

– Entities and their relationships• Person, subject interest, project, dataset, and paper• Many-to-many relations between the entities

– Sample EML records in XML format• Downloaded from HBES website• Entity URIs added to the corresponding XML files to be

used as semantic identifiers and hyperlinks to the entities

• 126 XML files in total

04/10/2023 15Linking Entities in Scientific Metadata -- DC2010

Entity relationships

04/10/2023 16Linking Entities in Scientific Metadata -- DC2010

Experiment stage 2: Converting to RDF• Toolkit: D2R, a service for converting

relational databases into RDF triples and publishing them on the web– Turn each table into a class– Turn each column as class property– Make each value in a column as an instance– Assign a URI to each class, property, and instance

04/10/2023 17Linking Entities in Scientific Metadata -- DC2010

04/10/2023 Linking Entities in Scientific Metadata -- DC2010 18

Experiment stage 3: Incorporating URI into XML records

• Add the URIs generated from the D2R software to their corresponding entities in EML records by using an XSL program

• Transform the EML records with inserted URIs into the HTML format for display in browser

04/10/2023 19Linking Entities in Scientific Metadata -- DC2010

Example of name with URI inserted

04/10/2023 Linking Entities in Scientific Metadata -- DC2010 20

Original EML record without URI URI added to individual name element

<individualName> <givenName>Thomas G</givenName> <surName>Siccama</surName></individualName>

<individualName> <givenName>Thomas G. </givenName> <surName>Siccama</surName> <personURI>page/people/tsiccama </personURI></individualName>

04/10/2023

Linking Entities in Scientific Metadata -- DC2010 21

Original display of EML record RDF-enabled display of EML record

Discussion• Methodology for transforming islands of entities

into linked scientific metadata• A larger scale data set needed to test its

scalability• Potentials:

– Reducing duplicate entity data entry – Applicable to legacy metadata generated using older

data model– Linking semantic data already published on the web– Facilitating data/metadata visualization??

04/10/2023 22Linking Entities in Scientific Metadata -- DC2010

DEMO

http://sdl.syr.edu/eml/

04/10/2023 Linking Entities in Scientific Metadata -- DC2010 23

top related