What Is Linked Historical Data?
DESCRIPTION
Traditionally, historians have distinguished between primary and secondary sources in order to guarantee independence and reliability in their reconstruction of history. A particularly interesting characteristic of primary sources is that they need to be immutable, that is, to be curated and preserved from change over time. In this presentation we study how theories of persistence (through part of the DOLCE ontology) and metaproperties of 'healthy' ontologies (through the OntoClean metaproperties) might uncover interesting semantics for designing an ontological framework of historical sources on the Web.
TRANSCRIPT
What Is Linked Historical Data?
Albert Meroño-Peñuela
Rinke Hoekstra
@albertmeronyo
EKAW 2014, Linköping, Sweden 26/11/2014
Primary sources
Secondary sources
Historical Sources…
• Independence
• Reliability
• Immutability
…as RDF Graphs?
1. An IRI, once minted, should never change its intended referent.
2. Literals, by design, are constants and never change their value.
3. A relationship that holds between two resources at one time may not hold at another time.
4. RDF sources may change their state over time. That is, they may provide different RDF graphs at different times.
5. Some RDF sources may, however, be immutable snapshots of another RDF source, archiving its state at some point in time.
From http://www.w3.org/TR/rdf11-concepts/#change-over-time on RDF and change over time
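The five points above can be illustrated with a minimal sketch, assuming triples are modelled as plain Python tuples of strings. The `RdfSource` class and the `ex:` names are hypothetical and only serve the illustration; they are not part of the talk or of any RDF library.

```python
from typing import FrozenSet, Set, Tuple

Triple = Tuple[str, str, str]


class RdfSource:
    """A mutable RDF source: it may provide different graphs at different times."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, triple: Triple) -> None:
        self._triples.add(triple)

    def remove(self, triple: Triple) -> None:
        self._triples.discard(triple)

    def snapshot(self) -> FrozenSet[Triple]:
        """Return an immutable copy of the current state (point 5)."""
        return frozenset(self._triples)


# Hypothetical example data.
source = RdfSource()
source.add(("ex:source1", "ex:heldBy", "ex:archiveA"))
archived = source.snapshot()

# The live source keeps changing (points 3 and 4) ...
source.remove(("ex:source1", "ex:heldBy", "ex:archiveA"))
source.add(("ex:source1", "ex:heldBy", "ex:archiveB"))

# ... while the snapshot still holds the archived state.
snapshot_differs_from_live = archived != source.snapshot()
```

A `frozenset` offers no `add` or `remove` method, so the snapshot cannot be modified after it is taken, which is the property the talk asks of primary sources.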
Linked Historical Data: A Matter of Life and Death
Dichotomy:
• Alive Web
• Dead Web
An Ontological Framework of Historical Sources
• Problem: historians' fundamental requirements on historical sources seem to be violated by design in Linked Data
• (Part of the) solution: gain an understanding of the essential characteristics of historical sources
• Gain understanding = explicitly state their semantics
– Persistence theories (e.g. inst. in DOLCE)
– OntoClean methodology
Persistence
• The continued or prolonged existence of something
• Perdurantism: ordinary things have temporal parts (i.e. persist by perduring)
• Endurantism: ordinary things are wholly present whenever they exist (i.e. persist by enduring)
– Can “genuinely” change over time
Persistence of Historical Sources
• Secondary sources are endurants
• Primary sources
– Same enduring properties
– Requirement: perdurance immutability (can change but should not)
– Strong endurants (i.e. can’t change over time)
The Identity Problem: OntoClean
If sources can change over time, how can we guarantee that they are the same entity?
Study of the essential characteristics of primary and secondary sources – OntoClean metaproperties
Metaproperties of Historical Sources
• Rigidity (+R): a rigid property is a property that is essential to all its instances, i.e. ∀x φ(x) → □φ(x)
• Non-rigid (-R), anti-rigid (~R)
• E.g. person(x) is rigid; student(x) is anti-rigid
• Primary sources = +R
• Secondary sources = ~R
Metaproperties of Historical Sources
• Sortals (+I): classes all of whose instances are identified in the same way
• Identity criteria of historical sources as RDF graphs?
• Primary sources = +I
• Secondary sources carry no identity criteria
Metaproperties of Historical Sources
• Unity (+U): classes all of whose individuals are wholes under the same relation (a whole does not create instances of its class when subdivided).
• E.g. person(x), clay(x)
• Primary sources = ~U
• Secondary sources = +U
Metaproperties of Historical Sources
• Dependence (+D): a property is dependent if each instance of it implies the existence of another entity.
• E.g. student(x) → teacher(y)
• Primary sources = -D
• Secondary sources = +D
Violating Historical Source Metaproperties
• Historical graphs published in arbitrary sources on the Web
• The AAA rule: Anyone can say Anything about Any topic
– Historical graph ?g with { ?s ?p ?o } changed by
• Unauthoritative statement on a primary source: { ?s’ ?p’ ?o’ } with ?s’ = ?s
• Inbound links: { ?s’ ?p’ ?o’ } with ?o’ = ?s
• Reliability? Independence?
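The two cases on this slide can be sketched as simple checks over triples, assuming graphs are plain collections of string tuples. The function names and the `ex:` data below are hypothetical and are not taken from the talk.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def subjects_of(historical_graph: Iterable[Triple]) -> Set[str]:
    """Collect every ?s that appears in the historical graph ?g."""
    return {s for s, _, _ in historical_graph}


def unauthoritative_statements(
    historical_graph: Iterable[Triple], external_graph: Iterable[Triple]
) -> List[Triple]:
    """External triples { ?s' ?p' ?o' } with ?s' = ?s."""
    subjects = subjects_of(historical_graph)
    return [t for t in external_graph if t[0] in subjects]


def inbound_links(
    historical_graph: Iterable[Triple], external_graph: Iterable[Triple]
) -> List[Triple]:
    """External triples { ?s' ?p' ?o' } with ?o' = ?s."""
    subjects = subjects_of(historical_graph)
    return [t for t in external_graph if t[2] in subjects]


# Hypothetical data: one archived statement and three statements made elsewhere.
historical = [("ex:record1", "ex:value", "42")]
external = [
    ("ex:record1", "ex:value", "43"),      # someone else talks about ex:record1
    ("ex:blog", "ex:cites", "ex:record1"),  # someone else links to ex:record1
    ("ex:other", "ex:p", "ex:q"),          # unrelated
]

overrides = unauthoritative_statements(historical, external)
links = inbound_links(historical, external)
```

Both checks only flag candidate statements; deciding whether they actually harm reliability or independence is the open question the slide raises.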
Trusted primary sources from digital archives
True / false answer on the existence of authoritative primary source statements
ASK
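As a rough illustration of the true/false check, the sketch below mimics a SPARQL ASK over a single triple pattern, assuming the trusted archive graph is available as a set of string tuples. The `ask` helper and the `ex:` names are hypothetical; a real archive would expose this through its own query service.

```python
from typing import Iterable, Optional, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def ask(archive_graph: Iterable[Triple], pattern: Pattern) -> bool:
    """Return True if any archived triple matches the pattern.

    None in the pattern plays the role of a SPARQL variable.
    """
    return any(
        all(want is None or want == have for want, have in zip(pattern, triple))
        for triple in archive_graph
    )


# Hypothetical archive content.
archive = {("ex:source1", "ex:createdIn", "ex:year1")}

is_authoritative = ask(archive, ("ex:source1", "ex:createdIn", "ex:year1"))
is_known_subject = ask(archive, ("ex:source1", None, None))
is_unknown = ask(archive, ("ex:source2", None, None))
```

The answer reveals only whether an authoritative statement exists, not the archived graph itself.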
Reliability
Trusted in-archive IRI dereferencing services
• IRIs of the primary source remain intact
• Copy has prov:wasDerivedFrom relations
• Resolution and dereferenceability mechanisms
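One way to read the first two bullets is sketched below: the copy gets its own IRIs and each of them points back to the untouched archive IRI with prov:wasDerivedFrom. This is a minimal sketch under assumptions: terms are plain strings, the `qualified_copy` function and both prefixes are hypothetical, and only the PROV-O property IRI is a real identifier.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]

PROV_WAS_DERIVED_FROM = "http://www.w3.org/ns/prov#wasDerivedFrom"


def qualified_copy(
    primary_graph: Iterable[Triple], archive_prefix: str, copy_prefix: str
) -> Set[Triple]:
    """Copy a primary source graph under new IRIs, linked back to the originals."""
    renamed: Dict[str, str] = {}

    def rename(term: str) -> str:
        # Only terms minted by the archive are re-minted for the copy.
        if term.startswith(archive_prefix):
            renamed[term] = copy_prefix + term[len(archive_prefix):]
            return renamed[term]
        return term

    copy = {(rename(s), p, rename(o)) for s, p, o in primary_graph}
    provenance = {
        (new, PROV_WAS_DERIVED_FROM, old) for old, new in renamed.items()
    }
    return copy | provenance


# Hypothetical prefixes and data.
primary = {("http://archive.example/source1", "http://example.org/p", "v")}
result = qualified_copy(primary, "http://archive.example/", "http://copy.example/")
```

The input graph is never modified, so the archive's IRIs and statements remain intact while the copy stays traceable to them.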
Independence
easy:anne-frank-diary ?
Qualified copy
Primary source RDF graph
Future Work
• Further theoretical study w/ historians
• Experimental evaluation w/ historians
– Metaproperties
• Existing historical ontologies (scarce)
• New ones
– Primary source resolution http://easy.dans.knaw.nl/
• Use cases, historical concepts
– Dutch historical censuses http://cedar-project.nl/
– Dutch book trade http://stcn.data2semantics.org/
Thank you
Primary Source                   | Secondary Source
Dead (archived) graphs           | Living LOD
Strong endurant                  | Endurant
Rigid (+R)                       | Anti-rigid (~R)
Sortal (+I)                      | Non-sortal (-I)
Anti-unity (~U)                  | Unity (+U)
Independent (-D)                 | Dependent (+D)
Dereferenceable only by archives | Dereferenceable by anyone
Comments, suggestions most welcome
@albertmeronyo
https://www.cedar-project.nl/