Department of Computer Science Prof. Eero Hyvönen, director Semantic Computing Research Group (SeCo) Aalto University and University of Helsinki Helsinki Centre for Digital Humanities (HELDIG) http://seco.cs.aalto.fi/ http://heldig.fi/ Semantic Web meets biology: Finnish experiences in modelling, managing, and using biological linked data

Post on 22-Jul-2020

Page 1:

Department of Computer Science

Prof. Eero Hyvönen, director
Semantic Computing Research Group (SeCo)
Aalto University and University of Helsinki
Helsinki Centre for Digital Humanities (HELDIG)
http://seco.cs.aalto.fi/ http://heldig.fi/

Semantic Web meets biology: Finnish experiences in modelling, managing, and using biological linked data

Page 3:

Digital Humanities

[Diagram labels: Social Sciences, Humanities, Computer Science]

Page 4:

Digital Biology

[Diagram labels: Biology, Humanities, Computer Science]

Page 5:

Big Data

Page 6:

Interoperability

Page 7:

Artificial Intelligence

Page 8:

Study Biology as Data

Page 9:

Contents

1. Semantic Web and Linked Data

2. Problem: Aggregating and publishing distributed collection data

3. Solution: Shared Semantic Web infrastructure

4. What is needed for an infrastructure?

5. Examples of applications

Page 10:

http://seco.cs.aalto.fi/

Semantic Web and Linked Data

Page 11:

Semantic Web Activity at W3C Started 2001

Page 12:

Web of Pages: WWW

Web of Data: GGG (Giant Global Graph)

Page 13:

WWW Document: Semantic portal HealthFinland

Page 14:

Corresponding GGG graph (in RDF)

Page 15:

LODStats 2018 (http://stats.lod2.eu/): 10,000 datasets, 150 billion triples

Linked Open Data Cloud 2018

Page 16:

GGG: Big Boys Have Entered the Game …

http://schema.org

• Google Knowledge Graph

• Microsoft Satori Knowledge Base

Page 17:

2001

(Hyvönen, ed., 2002)

Page 18:

Building Semantic Web Infrastructure in Finland with Applications

Ca. 50 funding organizations, including LUOMUS

• National Finnish Ontology Project 2003-2012

• Linked Data Finland 2012-

• Linked Open Science 2014-2017

Page 19:

http://seco.cs.aalto.fi/

Linked Data for Managing Distributed Collections

Page 21:

Problem 2: Cultural Content Production System - Distributed and Independent

Museums

Libraries

Archives

Land survey

Linked Data

Web 2.0 sites

Media

Citizens

Page 22:

”Sampo” Model for Semantic Portals

Ontology & Data Infrastructure

Semantic Metadata

Content Providers

Land survey

Museums

Archives

Linked Data

Citizens

Libraries

Web 2.0 sites

Media

Page 24:

Traditional Infras: (rail)roads, electricity, …

Semantic Content Infra: Ontologies, data, metadata

Page 30:

RDF Connects and Harmonizes Linked Data into a GGG

[Diagram labels, RDF graph.
Instances: H1 (”Akseli Gallen-Kallela”), H2 (”Gustaf Mannerheim”), T1.
Values and places: 1929, maalaus (painting), Lemu, Askainen, Turku, Varsinais-Suomen lääni, Finland.
Classes: Person, Artist, Marshal, Profession, Place, County, Province, Time, PhysicalObject, Concept, Endurant, Perdurant, Abstract.
Properties: type, name, profession, birthPlace, part-of, subClassOf, and in Finnish tekijä (creator), aiheaika (topic time), tyyppi (type), yläluokka (superclass).
Other labels: Portal, Triplestore.]
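The point of this slide, that records from independent providers join into one graph once they share identifiers, can be sketched with plain Python sets of (subject, predicate, object) triples. This is only an illustration, not the portal's implementation: the three provider graphs and the exact edges are assumptions, because the arrows of the diagram cannot be recovered from the transcript. Only the node and property labels are taken from the slide.

```python
# A graph is a set of (subject, predicate, object) triples.
# Labels come from the slide; WHICH node links to which is an assumption.
museum = {
    ("T1", "tyyppi", "maalaus"),   # type: painting
    ("T1", "tekijä", "H1"),        # creator
    ("T1", "aiheaika", "1929"),    # topic time
}
people = {
    ("H1", "name", "Akseli Gallen-Kallela"),
    ("H2", "name", "Gustaf Mannerheim"),
    ("H2", "birthPlace", "Askainen"),
}
places = {
    ("Askainen", "part-of", "Varsinais-Suomen lääni"),
    ("Varsinais-Suomen lääni", "part-of", "Finland"),
}


def merge(*graphs):
    """Union of triple sets; shared identifiers connect the sources."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def objects(graph, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def enclosing_places(graph, place):
    """Follow part-of links transitively from a place."""
    found, todo = set(), [place]
    while todo:
        for parent in objects(graph, todo.pop(), "part-of"):
            if parent not in found:
                found.add(parent)
                todo.append(parent)
    return found


ggg = merge(museum, people, places)

# Needs the museum graph AND the people graph.
creator_names = {
    name
    for creator in objects(ggg, "T1", "tekijä")
    for name in objects(ggg, creator, "name")
}

# Needs the people graph AND the places graph.
birth_regions = {
    region
    for place in objects(ggg, "H2", "birthPlace")
    for region in enclosing_places(ggg, place)
}
```

Neither query can be answered from a single provider's triples. Both become answerable after the union because "H1" and "Askainen" denote the same resource in every source.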

Page 36:

Concepts (IDs) and Data Model Glue Data Together

Solution: Shared Ontology Infrastructure

[Diagram labels, RDF graph.
Instances: P1 (”Akseli Gallen-Kallela”), P2 (”Gustaf Mannerheim”), W1.
Values and places: 1929, painting, Lemu, Askainen, Turku, Varsinais-Suomi, Finland.
Classes: person, artist, marshal, occupation, place, municipality, province, time, physical object, concept, endurant, perdurant, abstract.
Properties: type, name, occupation, birth place, creator, topictime, part-of, subclassOf.]
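Comparing this slide with the earlier RDF graph slide, the local labels (tekijä, aiheaika, tyyppi, yläluokka, profession, birthPlace) appear here as shared terms (creator, topictime, type, subclassOf, occupation, birth place). A minimal sketch of that harmonization step is shown below. Treating it as a pair of lookup tables is an assumption of this example, and so are the sample triples; real ontology alignment is more involved.

```python
# Local term -> shared ontology term. The pairs mirror the label changes
# between the two slides; using plain lookup tables is an ASSUMPTION.
PROPERTY_MAP = {
    "tekijä": "creator",
    "aiheaika": "topictime",
    "tyyppi": "type",
    "yläluokka": "subclassOf",
    "subClassOf": "subclassOf",
    "profession": "occupation",
    "birthPlace": "birth place",
}
CONCEPT_MAP = {
    "H1": "P1",
    "H2": "P2",
    "T1": "W1",
    "maalaus": "painting",
    "Marshall": "marshal",
    "Varsinais-Suomen lääni": "Varsinais-Suomi",
}


def harmonize(triples):
    """Rewrite subjects, predicates, and objects into shared terms.

    Terms without a mapping are passed through unchanged.
    """
    return {
        (CONCEPT_MAP.get(s, s), PROPERTY_MAP.get(p, p), CONCEPT_MAP.get(o, o))
        for s, p, o in triples
    }


# Sample local triples; the edges are assumed for illustration.
local = {
    ("T1", "tekijä", "H1"),
    ("T1", "tyyppi", "maalaus"),
    ("H2", "profession", "Marshall"),
    ("H2", "name", "Gustaf Mannerheim"),
}
shared = harmonize(local)
```

After harmonization, a query written once against the shared terms works over data from every provider, which is the "glue" the slide title refers to.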

Page 37:

Components of a Semantic Web Infrastructure

RDF

Metadata Models

Domain Ontologies

Datasets

Page 38:

1. Metadata Models for Modeling the Domain of Discourse

❏ Object and document centric
  ❏ Traditional approach in museums
  ❏ E.g. Darwin Core
❏ Event-centric harmonizing models
  ❏ E.g. CIDOC CRM, FRBRoo
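The difference between the two modelling styles can be sketched as follows. The flat record uses Darwin Core term names as keys, with invented values. The event-centric form uses a hypothetical `Event` class and event kinds chosen for this example; it imitates the event-centric idea only and does not use actual CIDOC CRM classes or properties.

```python
from dataclasses import dataclass

# Object-centric: one flat record per specimen.
# Keys are Darwin Core term names; all values are invented.
flat_record = {
    "catalogNumber": "EX-0001",
    "scientificName": "Parus major",
    "recordedBy": "A. Collector",
    "eventDate": "1987-06-12",
    "locality": "Turku",
    "identifiedBy": "B. Expert",
    "dateIdentified": "1990-02-01",
}


@dataclass(frozen=True)
class Event:
    """Hypothetical event type: who did what to which object, when, where."""
    kind: str
    actor: str
    date: str
    object_id: str
    place: str = ""
    result: str = ""


def to_events(record):
    """Re-express one flat record as the events it implicitly describes."""
    return [
        Event(
            kind="collecting",
            actor=record["recordedBy"],
            date=record["eventDate"],
            object_id=record["catalogNumber"],
            place=record["locality"],
        ),
        Event(
            kind="identification",
            actor=record["identifiedBy"],
            date=record["dateIdentified"],
            object_id=record["catalogNumber"],
            result=record["scientificName"],
        ),
    ]


events = to_events(flat_record)
```

In the flat form the collector and the identifier are just two columns. In the event form they are actors in separate events, so a later re-identification can be added as one more event without changing the schema.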

Page 39:

TaxMeOn Metaontology: The big picture

OWL/RDF(S)

TaxMeOn

Ontology

Metadata

Data

Page 40:

Three parts of the model

[Diagram labels.
Parts: Biological research results, Species lists, Common names.
User groups: Researchers; Environmental authorities, ecologists; Hobbyists, journalists, translators.]

Page 41:

Research Case Study: Modeling False Click Beetles

(Tuominen, Laurenne, Hyvönen, ESWC 2011)

[Diagram labels, changes in taxonomic concepts, steps numbered 1, 2, 3, 4a, 4b, 4c, 4d, 5a, 5b, 5c.
Genera: Galba, Pterotarsus, Galbites, Balgus. Species: Galba tuberculata. Other label: historio.
Families: Elateridae, Eucnemidae, Throscidae.
Publications: Guerin-Meneville, 1830; Guerin-Meneville, 1831; Guerin-Meneville, 1838; Lameere, 1900; Fleutiaux, 1918; Fleutiaux, 1920; Fleutiaux, 1945; Schenkling, 1928; Cobos, 1961; Crowson, 1967; Muona, 1987.
Relations: publishedIn, isPartOfHigherTaxon, changeInTaxonomicConcept, Split, congruentTaxon, potentialRelation, isOlderThan, before, after.
Note on Guerin-Meneville, 1838: the illustrations of the book were published later, and Galba tuberculata had the name Pterotarsus marmorata.]
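The diagram's central idea is that a name alone does not identify a taxon: a taxon concept is a name as used in a particular publication, and concepts are linked by change relations. A minimal sketch of that idea follows. The class, the pairings of genus and family, the placeholder publications `pub:A` to `pub:C`, and the order of changes are all hypothetical. They reuse names from the slide but are not the actual history of these beetles and not the TaxMeOn schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonConcept:
    name: str          # scientific name as used in the publication
    higher_taxon: str  # family in which the publication places it
    publication: str   # placeholder label, NOT a real citation


# HYPOTHETICAL data for illustration only.
concepts = {
    "c1": TaxonConcept("Pterotarsus", "Elateridae", "pub:A"),
    "c2": TaxonConcept("Pterotarsus", "Eucnemidae", "pub:B"),
    "c3": TaxonConcept("Galbites", "Eucnemidae", "pub:C"),
}

# (earlier concept, later concept), standing in for the
# changeInTaxonomicConcept relation with its before/after roles.
changes = [("c1", "c2"), ("c2", "c3")]


def concepts_named(name):
    """All concept ids that use the given name, in id order."""
    return sorted(cid for cid, c in concepts.items() if c.name == name)


def history(start):
    """Follow change links forward from one concept to its successors."""
    chain = [start]
    while True:
        later = [b for a, b in changes if a == chain[-1] and b not in chain]
        if not later:
            return chain
        chain.append(later[0])
```

With this structure, `concepts_named("Pterotarsus")` returns two distinct concepts, which is why observations should be linked to a concept identifier rather than to a bare name.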

Page 42:

http://schema.onki.fi/taxmeon

Page 43:

2. Domain Ontologies for Populating Metadata Schemas

❏ Tangible Objects

❏ Intangible Subject Matter

❏ Contemporary Places & Historical Places

❏ Actors (Persons, Groups, Organizations)

❏ Events (historical and provenance)

❏ Nomenclatures (e.g. taxonomies, chemical compounds, …)

❏ …

Page 44:

Resolving Identities: URIs

Page 45:

(Centralized) ONKI Ontology Services

1. Ontology Developers
- Collaborative development of interdependent ontologies
- Versioning and support for updates

2. Information Searchers
- Support concept-based search
- Keyword disambiguation
- Finding the right search concepts

3. Information Indexers
- Support finding indexing concepts
- Keyword disambiguation
- Support indexing patterns

Nokia: company or city?
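The "Nokia: company or city?" question illustrates keyword disambiguation: one label maps to several concepts, and the indexer or searcher must pick a concept identifier rather than a string. The sketch below shows the idea with an in-memory index. The URIs under example.org, the type names, and the function names are hypothetical and are not the ONKI service API.

```python
# Label -> candidate concepts. URIs and type names are HYPOTHETICAL.
INDEX = {
    "nokia": [
        {"uri": "http://example.org/org/nokia", "type": "company", "label": "Nokia"},
        {"uri": "http://example.org/place/nokia", "type": "city", "label": "Nokia"},
    ],
}


def candidates(keyword):
    """All concepts whose label matches the keyword, ignoring case."""
    return INDEX.get(keyword.strip().lower(), [])


def disambiguate(keyword, wanted_type):
    """Pick the candidate of the wanted type, or None if there is none."""
    for concept in candidates(keyword):
        if concept["type"] == wanted_type:
            return concept
    return None


city = disambiguate("Nokia", "city")
company = disambiguate("Nokia", "company")
```

Once a record is annotated with one of the two URIs instead of the string "Nokia", a concept-based search for the city no longer returns documents about the company.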

Page 46:

ONKI widget for using ontologies in mashup applications

● Ontology services are automatically available after publishing a vocabulary or ontology with ONKI

● Simple AJAX-based widget for creating mash-ups

Page 49:

ONKI Light deployed January 2014 by the National Library as Finto

Permanent free national service funded by Ministry of Education and Culture and Ministry of Finance from the state budget

Page 50:

3. Datasets for Research and Applications

❏ Museum collections

❏ Observations (e.g., GBIF)

❏ Environmental measurements

❏ Encyclopedic data about biology & sciences

❏ Geographical data, biotype data, etc

❏ Weather data

❏ Legal data, regulations, red lists, …

Page 51:

Example Application: BirdWatch

● ”Tiira.fi” citizen science observations / GBIF -> RDF
● Birds of the World ontology
● Ontology of Finnish bird characteristics
● Geographical places & maps (Google)
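The first bullet, turning observation records into RDF, can be sketched as a small conversion to N-Triples lines. Everything specific here is an assumption for illustration: the example.org namespace, the property names, the record fields, and the one-entry species lookup. The actual BirdWatch conversion and vocabularies are not described in the slides.

```python
BASE = "http://example.org/"  # hypothetical namespace

# Finnish common name -> species URI (hypothetical URI).
SPECIES_URIS = {"talitiainen": BASE + "taxon/parus-major"}


def uri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def observation_to_ntriples(obs_id, obs):
    """Convert one observation record into a list of N-Triples lines."""
    subject = uri(BASE + "observation/" + obs_id)
    triples = [
        # Link to an ontology concept instead of storing the name as text.
        (subject, uri(BASE + "schema/species"), uri(SPECIES_URIS[obs["species"]])),
        (subject, uri(BASE + "schema/date"), literal(obs["date"])),
        (subject, uri(BASE + "schema/place"), literal(obs["place"])),
    ]
    return [f"{s} {p} {o} ." for s, p, o in triples]


lines = observation_to_ntriples(
    "1", {"species": "talitiainen", "date": "2013-05-26", "place": "Helsinki"}
)
```

Linking the species to a concept URI is what lets the observation connect to the bird ontologies listed in the other bullets.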

Page 52:

http://seco.cs.aalto.fi/applications/birdwatch/

ESWC WS, 2013

Page 53:

Filling in Observation Form

Page 56:

Conclusions

● Big data about biology is available
  ● Cannot be investigated manually
  ● Enriched big data provide new insights
    ● For the public in semantic portals
    ● For researchers in data analysis
● Linked data is a promising approach
  ● Data harmonization
  ● Data aggregation
  ● Knowledge-based analysis

Page 57:

More Info – Questions?

https://www.amazon.com/Publishing-Cultural-Heritage-Synthesis-Technology/dp/1608459977

Semantic Web & Linked Data
http://www.w3.org/standards/semanticweb/

Sampo Model & Applications
http://seco.cs.aalto.fi/publications

https://www.gaudeamus.fi/semanttinen-web/