Department of Computer Science
Prof. Eero Hyvönen, director
Semantic Computing Research Group (SeCo)
Aalto University and University of Helsinki
Helsinki Centre for Digital Humanities (HELDIG)
http://seco.cs.aalto.fi/ http://heldig.fi/
Semantic Web meets biology: Finnish experiences in modelling, managing, and using biological linked data
Digital Humanities
Social Sciences, Humanities
Computer Science
Digital Biology
Biology, Humanities
Computer Science
Big Data
Interoperability
Artificial Intelligence
Study Biology as Data
Contents
1. Semantic Web and Linked Data
2. Problem: Aggregating and publishing distributed collection data
3. Solution: Shared Semantic Web infrastructure
4. What is needed for an infrastructure?
5. Examples of applications
Semantic Web and Linked Data
Semantic Web Activity at W3C Started 2001
Web of Pages: WWW
Web of Data: GGG (Giant Global Graph)
WWW Document: Semantic portal HealthFinland
Corresponding GGG graph (in RDF)
LODStats 2018 (http://stats.lod2.eu/): 10000 datasets, 150 billion triples
Linked Open Data Cloud 2018
GGG: Big Boys Have Entered the Game … http://schema.org
• Google Knowledge Graph
• Microsoft Satori Knowledge Base
2001
(Hyvönen, ed., 2002)
Building Semantic Web Infrastructure in Finland with Applications
Ca. 50 funding organizations, including LUOMUS
• National Finnish Ontology Project 2003-2012
• Linked Data Finland 2012-
• Linked Open Science 2014-2017
Linked Data for Managing Distributed Collections
Problem 1: Content Complexity - Heterogeneous and Interlinked
Encyclopedia
Artefacts
Maps
Videos
Buildings
Fine arts
Narratives
Literature
Cultural sites
Music
Problem 2: Cultural Content Production System - Distributed and Independent
Museums
Libraries
Archives
Land survey
Linked Data
Web 2.0 sites
Media
Citizens
”Sampo” Model for Semantic Portals
Ontology & Data Infrastructure
Semantic Metadata
Content Providers: Museums, Libraries, Archives, Land survey, Linked Data, Web 2.0 sites, Media, Citizens
”Intellectuals solve problems - geniuses prevent them”
Albert Einstein
Why infrastructure?
Traditional Infras: (rail)roads, electricity, …
Semantic Content Infra: Ontologies, data, metadata
How Does Linked Data Work in Practice?
Biographical Registries Collect Data about Persons
Biography Center registry:

Person  Name                   Profession  Birth Place
H1      Akseli Gallen-Kallela  artist      Lemu
H2      Gustaf Mannerheim      marshal     Askainen
...

The same data as RDF triples:

H1  type        Person
H1  name        ”Akseli Gallen-Kallela”
H1  profession  Artist
H1  birthPlace  Lemu
H2  type        Person
H2  name        ”Gustaf Mannerheim”
H2  profession  Marshal
H2  birthPlace  Askainen
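The step from a registry table to a graph can be sketched in a few lines of plain Python. This is only an illustration, not the SeCo tooling: the `row_to_triples` helper and the tuple representation of a triple are assumptions made here, while the row values and the property names (type, name, profession, birthPlace) are taken from the slide.

```python
# Illustrative sketch only: row_to_triples is a hypothetical helper,
# not part of any SeCo tool. A triple is a (subject, predicate, object) tuple.

def row_to_triples(row):
    """Turn one registry row into RDF-style triples about a person."""
    subject = row["person"]
    return [
        (subject, "type", "Person"),
        (subject, "name", row["name"]),
        (subject, "profession", row["profession"]),
        (subject, "birthPlace", row["birth_place"]),
    ]

# The two rows shown on the slide.
registry = [
    {"person": "H1", "name": "Akseli Gallen-Kallela",
     "profession": "Artist", "birth_place": "Lemu"},
    {"person": "H2", "name": "Gustaf Mannerheim",
     "profession": "Marshal", "birth_place": "Askainen"},
]

# Flatten all rows into one list of triples.
biography_graph = [t for row in registry for t in row_to_triples(row)]

for triple in biography_graph:
    print(triple)
```

Each table row becomes four triples sharing the row identifier as subject, which is what lets other datasets later point at H1 or H2.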
Art Museum Catalogs Paintings
Art Museum Collection:

Work  Name                    Creator                Time  Subject
T1    Mannerheimin muotokuva  Akseli Gallen-Kallela  1929  Gustaf Mannerheim
T2    Aino-triptyykki         Akseli Gallen-Kallela  1891  Aino, Kalevala
...

[Graph: T1 has type Painting and time 1929; its creator is a node named ”Akseli Gallen-Kallela” and its subject is a node named ”Gustaf Mannerheim”.]
Land Survey Organizations Know Places
Land Survey:

Municipality  Province
Askainen      Varsinais-Suomen lääni
Helsinki      Uudenmaan lääni
Lemu          Varsinais-Suomen lääni
Turku         Varsinais-Suomen lääni
...

[Graph: Askainen, Lemu and Turku are each part-of Varsinais-Suomen lääni, which is part-of Finland. Types shown: County, Province.]
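The part-of links are useful because they can be followed transitively. The sketch below is a minimal illustration under two assumptions made here: each place has a single direct parent, and the link from Varsinais-Suomen lääni to Finland is as read from the slide's diagram. The `containing_places` function is a hypothetical name.

```python
# Illustrative sketch: part-of facts as a mapping from a place to its
# direct parent. Assumes every place has at most one direct parent.
PART_OF = {
    "Askainen": "Varsinais-Suomen lääni",
    "Lemu": "Varsinais-Suomen lääni",
    "Turku": "Varsinais-Suomen lääni",
    "Varsinais-Suomen lääni": "Finland",
}

def containing_places(place):
    """Follow part-of links upward and return every enclosing place."""
    result = []
    while place in PART_OF:
        place = PART_OF[place]
        result.append(place)
    return result

print(containing_places("Lemu"))  # ['Varsinais-Suomen lääni', 'Finland']
```

With this, a search for things related to Finland can also find things recorded only at the municipality level.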
Ontologies are Developed by Semantic Web Researchers
KOKO Ontology Subclass Hierarchy (FinnONTO)
[Figure: subClassOf hierarchy over the classes Concept, Endurant, Perdurant, Abstract, PhysicalObject, Place, County, Province, Profession, Artist, Marshal, Person, Painting and TimePeriod.]
RDF Connects and Harmonizes Linked Data into a GGG
[Figure: the biography, museum and land survey graphs joined with the ontology in one graph. Persons H1 and H2, painting T1 (1929) and the places Lemu, Askainen, Turku, Varsinais-Suomen lääni and Finland are connected by type, name, profession, birthPlace, part-of and subClassOf links. The museum data keeps its Finnish labels: maalaus (painting), tekijä (creator), aihe (subject), aika (time), tyyppi (type), yläluokka (superclass).]
Portal Triplestore
1+1>2
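A small sketch of why the merged graph is worth more than its parts: the question below needs facts from all three sources at once. The data is the slide's running example held as plain tuples; that the museum data already refers to persons by the shared identifiers H1 and H2 is an assumption made here, and the helper names are hypothetical.

```python
# Illustrative sketch: three tiny datasets as (subject, predicate, object) tuples.
biographies = [
    ("H1", "name", "Akseli Gallen-Kallela"),
    ("H1", "birthPlace", "Lemu"),
    ("H2", "name", "Gustaf Mannerheim"),
    ("H2", "birthPlace", "Askainen"),
]
museum = [
    ("T1", "type", "Painting"),
    ("T1", "creator", "H1"),   # assumes the museum uses the shared person IDs
    ("T1", "subject", "H2"),
]
land_survey = [
    ("Lemu", "part-of", "Varsinais-Suomen lääni"),
    ("Askainen", "part-of", "Varsinais-Suomen lääni"),
]

# Once identifiers are shared, merging is simply a set union.
graph = set(biographies) | set(museum) | set(land_survey)

def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}

def paintings_by_creators_born_in(region):
    """Works whose creator was born in a place that is part of `region`."""
    found = set()
    for s, p, o in graph:
        if p != "creator":
            continue
        for place in objects(o, "birthPlace"):
            if region in objects(place, "part-of"):
                found.add(s)
    return found

print(paintings_by_creators_born_in("Varsinais-Suomen lääni"))  # {'T1'}
```

No single dataset can answer this: the museum knows the creator, the registry knows the birthplace, and the land survey knows the province.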
AI
WarSampo: Finnish WW II on the Semantic Web
http://sotasampo.fi
https://vimeo.com/212249404
In Principle a Piece of Cake but …
How to align concepts (URIs) used by different organizations?
How to align metadata models used by different organizations?
SHARED INFRA NEEDED!
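The two alignment questions can be illustrated with a toy mapping step. This is a sketch under stated assumptions, not how any real ontology service works: the two dictionaries are hand-written stand-ins for what a shared infrastructure would provide, and the `align` function is a hypothetical name. The Finnish property names and the identifiers come from the example slides.

```python
# Hypothetical mapping tables: in practice these would come from a shared
# ontology infrastructure, not from hand-written dictionaries.
PROPERTY_ALIGNMENT = {      # metadata model alignment
    "tekijä": "creator",
    "aihe": "subject",
    "aika": "time",
    "tyyppi": "type",
}
IDENTITY_ALIGNMENT = {      # concept (URI) alignment
    "Akseli Gallen-Kallela": "H1",
    "Gustaf Mannerheim": "H2",
    "maalaus": "Painting",
}

def align(triple):
    """Rewrite a local triple using shared property names and identifiers."""
    s, p, o = triple
    return (s, PROPERTY_ALIGNMENT.get(p, p), IDENTITY_ALIGNMENT.get(o, o))

# Museum data as it might look before harmonization.
local_museum_data = [
    ("T1", "tyyppi", "maalaus"),
    ("T1", "tekijä", "Akseli Gallen-Kallela"),
    ("T1", "aihe", "Gustaf Mannerheim"),
    ("T1", "aika", "1929"),
]

aligned = [align(t) for t in local_museum_data]
for triple in aligned:
    print(triple)
```

Values without a mapping, such as the year, pass through unchanged. Matching persons by name string alone is fragile in real data, which is one reason shared identifiers are needed.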
Concepts (IDs) and Data Model Glue Data Together
Solution: Shared Ontology Infrastructure
[Figure: the same graph with harmonized identifiers and property names. Persons P1 and P2, work W1 (1929) and the places Lemu, Askainen, Turku, Varsinais-Suomi and Finland are linked by type, name, occupation, birth place, creator, topic, time, part-of and subclassOf. Classes: concept, endurant, perdurant, abstract, physical object, place, municipality, province, occupation, artist, marshal, person, painting, time.]
Components of a Semantic Web Infrastructure
RDF
Metadata Models
Domain Ontologies
Datasets
1. Metadata Models for Modeling the Domain of Discourse
❏ Object and document centric
❏ Traditional approach in museums
❏ E.g. Darwin Core
❏ Event-centric harmonizing models
❏ E.g. CIDOC CRM, FRBRoo
TaxMeOn Metaontology: The big picture
OWL/RDF(S)
TaxMeOn
Ontology
Metadata
Data
Three parts of the model
Parts: Biological research results, Species lists, Common names
User groups: Researchers; Environmental authorities, ecologists; Hobbyists, journalists, translators
[Figure, panels 1-3, 4a-4d, 5a-5c: taxonomic history of the genera Pterotarsus, Galba, Galbites and Balgus across the families Elateridae, Eucnemidae and Throscidae. Taxon concepts are linked to publications (Guerin-Meneville, 1830, 1831, 1838; Lameere, 1900; Fleutiaux, 1918, 1920, 1945; Schenkling, 1928; Cobos, 1961; Crowson, 1967; Muona, 1987) by the relations publishedIn, isPartOfHigherTaxon, changeInTaxonomicConcept, congruentTaxon, potentialRelation and isOlderThan, and by a Split event with before and after links. Species labels: Galba tuberculata, Pterotarsus marmorata, historio.]
Note on Guerin-Meneville, 1838: the illustrations of the book were published later, and Galba tuberculata had the name Pterotarsus marmorata.
(Tuominen, Laurenne, Hyvönen, ESWC 2011)
Research Case Study: Modeling False Click Beetles
http://schema.onki.fi/taxmeon
2. Domain Ontologies for Populating Metadata Schemas
❏ Tangible Objects
❏ Intangible Subject Matter
❏ Contemporary Places & Historical Places
❏ Actors (Persons, Groups, Organizations)
❏ Events (historical and provenance)
❏ Nomenclatures (e.g. taxonomies, chemical compounds, …)
❏ …
Resolving Identities: URIs
(Centralized) ONKI Ontology Services
1. Ontology Developers
- Collaborative development of interdependent ontologies
- Versioning and support for updates
2. Information Searchers
- Support concept-based search
- Keyword disambiguation
- Finding the right search concepts
3. Information Indexers
- Support indexing concept finding
- Keyword disambiguation
- Support indexing patterns
Nokia: company or city?
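Keyword disambiguation can be pictured as a label lookup that returns several candidate concepts, from which the searcher or indexer picks one. The sketch below is an assumption-laden toy, not the ONKI API: the concept records, the `ex:` identifiers and the `find_concepts` function are all hypothetical.

```python
# Hypothetical concept records; the "ex:" identifiers are placeholders,
# not real ONKI or Finto URIs.
CONCEPTS = [
    {"id": "ex:nokia-company", "label": "Nokia", "type": "Company"},
    {"id": "ex:nokia-city", "label": "Nokia", "type": "City"},
    {"id": "ex:turku", "label": "Turku", "type": "City"},
]

def find_concepts(keyword, concept_type=None):
    """Return IDs of concepts whose label matches, optionally filtered by type."""
    matches = [c for c in CONCEPTS if c["label"].lower() == keyword.lower()]
    if concept_type is not None:
        matches = [c for c in matches if c["type"] == concept_type]
    return [c["id"] for c in matches]

print(find_concepts("Nokia"))          # two candidates: the keyword is ambiguous
print(find_concepts("Nokia", "City"))  # one candidate once the type is chosen
```

Indexing with the chosen concept identifier instead of the string "Nokia" removes the ambiguity for every later search.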
ONKI widget for using ontologies in mashup applications
● Ontology services are automatically available after publishing a vocabulary or ontology with ONKI
● Simple AJAX-based widget for creating mash-ups
ONKI Light deployed January 2014 by the National Library as Finto
Permanent free national service funded by Ministry of Education and Culture and Ministry of Finance from the state budget
3. Datasets for Research and Applications
❏ Museum collections
❏ Observations (e.g., GBIF)
❏ Environmental measurements
❏ Encyclopedic data about biology & sciences
❏ Geographical data, biotype data, etc
❏ Weather data
❏ Legal data, regulations, red lists, …
Example Application: BirdWatch
● ”Tiira.fi” citizen science observations / GBIF -> RDF
● Birds of the World ontology
● Ontology of Finnish bird characteristics
● Geographical places & maps (Google)
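One benefit of linking observations to a bird ontology is that they can be aggregated at levels the observers never recorded. The sketch below is illustrative only: the observation records are made up for this example and are not Tiira.fi or GBIF data, the species-to-family table is a tiny hand-made stand-in for an ontology, and `individuals_per_family` is a hypothetical name.

```python
from collections import Counter

# Tiny hand-made stand-in for a bird ontology: species -> family.
SPECIES_FAMILY = {
    "Parus major": "Paridae",
    "Cyanistes caeruleus": "Paridae",
    "Corvus corax": "Corvidae",
}

# Made-up observation records for illustration; not real Tiira.fi or GBIF data.
observations = [
    {"species": "Parus major", "place": "Helsinki", "count": 3},
    {"species": "Cyanistes caeruleus", "place": "Helsinki", "count": 2},
    {"species": "Corvus corax", "place": "Turku", "count": 1},
]

def individuals_per_family(records):
    """Sum observed individuals per family using the species -> family link."""
    totals = Counter()
    for record in records:
        family = SPECIES_FAMILY[record["species"]]
        totals[family] += record["count"]
    return dict(totals)

print(individuals_per_family(observations))  # {'Paridae': 5, 'Corvidae': 1}
```

The observers only entered species names; the family-level totals come from the ontology link.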
http://seco.cs.aalto.fi/applications/birdwatch/
ESWC WS, 2013
Filling in Observation Form
Conclusions
● Big data about biology is available
  ● Cannot be investigated manually
  ● Enriched big data provide new insights
    ● For the public in semantic portals
    ● For researchers in data analysis
● Linked data is a promising approach
  ● Data harmonization
  ● Data aggregation
  ● Knowledge-based analysis
More Info – Questions?
https://www.amazon.com/Publishing-Cultural-Heritage-Synthesis-Technology/dp/1608459977
Semantic Web & Linked Data
http://www.w3.org/standards/semanticweb/
Sampo Model & Applications
http://seco.cs.aalto.fi/publications
https://www.gaudeamus.fi/semanttinen-web/