pattern based knowledge base enrichment · buhmann, lehmann 2013/10/25 pattern based knowledge base...
TRANSCRIPT
tugraz
AKSW Research Group
Pattern Based Knowledge Base Enrichment
Lorenz Buhmann, Jens Lehmann
Agile Knowledge Engineering and Semantic Web (AKSW), University of Leipzig
25th October 2013
Buhmann, Lehmann 2013/10/25 Pattern Based Knowledge Base Enrichment
Outline
1 Motivation
2 Approach
3 Experiments
4 Conclusion
rise in the availability and usage of knowledge bases
still a lack of knowledge bases that consist of sophisticated schema information and instance data adhering to this schema
e.g. in the life sciences several knowledge bases
  only consist of schema information
  are, to a large extent, a collection of facts without a clear structure (e.g. information extracted from databases)
combination of sophisticated schema and instance data would allow powerful reasoning, consistency checking, and improved querying
→ create schemata based on existing data
Example
Given knowledge base with
property birthPlace
subjects in triples of this property, e.g. Brad Pitt, Angela Merkel, Albert Einstein
Suggestions: birthPlace may be functional and has the domain Person, ...
ObjectProperty: birthPlace
    Characteristics: Functional
    Domain: Person
    Range: Place
    SubPropertyOf: hasBeenAt
Advantages of more complex schemas
additional implicit information can be inferred
axioms serve as documentation for the purpose and correct usage of schema elements
  dbo:author for books
  dbo:writer for film scripts
improve the application of schema debugging techniques
next slides
Each person was only born at one place?!
Figure: one subject with two birthPlace values that differ (≠)
birthPlace is functional
SELECT ?s WHERE {
  ?s dbo:birthPlace ?o1 .
  ?s dbo:birthPlace ?o2 .
  FILTER (?o1 != ?o2)
}
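The check expressed by this query can be sketched in Python over an in-memory list of triples. This is a minimal illustration only: the helper name `functionality_violations` and the `ex:` toy triples are assumptions made for the example, not DBpedia data.

```python
from collections import defaultdict

def functionality_violations(triples, prop):
    """Return the subjects that have more than one distinct object for prop."""
    objects = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            objects[s].add(o)
    return sorted(s for s, objs in objects.items() if len(objs) > 1)

# Toy triples (hypothetical, for illustration only).
triples = [
    ("ex:alice", "dbo:birthPlace", "ex:Leipzig"),
    ("ex:alice", "dbo:birthPlace", "ex:Saxony"),
    ("ex:bob", "dbo:birthPlace", "ex:Graz"),
]
violations = functionality_violations(triples, "dbo:birthPlace")
print(violations)
```

Every subject returned contradicts the axiom "birthPlace is functional", so it is either a data error or a sign that the axiom is too strict.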
Basic Approach
We provide a light-weight method for the semi-automatic enrichment of SPARQL knowledge bases to reduce the effort of creating and maintaining such schema information.
3 Steps to get a schema
Input: Entity URI, Axiom Type, Knowledge Base (SPARQL Endpoint)
3-Phase Enrichment Learning Approach:
1. obtain schema information (only executed once per knowledge base)
   SPARQL Endpoint → Background Knowledge
2. obtain axiom type and entity specific data (sample data if necessary)
   SPARQL Endpoint → Background Knowledge + Relevant Instance Data
   Reasoner (optional invocation)
3. run machine learning algorithm
   Learner (DL-Learner) → List of Axiom Suggestions + Metadata (Enrichment Ontology)
iterate over all axiom types and schema entities for full enrichment
Starting Point
SPARQL endpoint: http://dbpedia.org/sparql
Entity URI: http://dbpedia.org/ontology/author
Axiom Type: Object Property Domain
Step 1 - Obtain Schema Information
CONSTRUCT WHERE {
  ?sub rdfs:subClassOf ?sup .
} ORDER BY DESC(?sub) LIMIT 1000 OFFSET 1000
dbo:Disease rdfs:subClassOf owl:Thing .
dbo:Book rdfs:subClassOf dbo:WrittenWork .
dbo:WrittenWork rdfs:subClassOf dbo:Work .
dbo:Work rdfs:subClassOf owl:Thing .
dbo:Philosopher rdfs:subClassOf dbo:Person .
dbo:Person rdfs:subClassOf dbo:Agent .
dbo:Agent rdfs:subClassOf owl:Thing .
dbo:Sport rdfs:subClassOf dbo:Activity .
dbo:Activity rdfs:subClassOf owl:Thing .
dbo:Fish rdfs:subClassOf dbo:Animal .
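The LIMIT/OFFSET pair suggests that the schema is fetched page by page. A minimal Python sketch of generating such paged queries follows; the function name `paged_construct_queries` and the `max_pages` cut-off are assumptions for illustration, and PREFIX declarations are omitted as on the slide.

```python
def paged_construct_queries(pattern, order_var, page_size=1000, max_pages=3):
    """Yield CONSTRUCT WHERE queries that walk through the results page by page."""
    for page in range(max_pages):
        yield (
            f"CONSTRUCT WHERE {{ {pattern} }} "
            f"ORDER BY DESC({order_var}) LIMIT {page_size} OFFSET {page * page_size}"
        )

# Hypothetical usage: three pages of subclass axioms.
queries = list(paged_construct_queries("?sub rdfs:subClassOf ?sup .", "?sub"))
for q in queries:
    print(q)
```

In practice the loop would presumably stop as soon as a page comes back with fewer than `page_size` triples instead of using a fixed page count.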
Step 2 - Obtain axiom type and entity specific data
CONSTRUCT WHERE {
  ?ind dbo:author ?o .
  ?ind a ?type .
} ORDER BY DESC(?ind) LIMIT 1000 OFFSET 2000
...
dbpedia:The_Adventures_of_Tom_Sawyer
    dbo:author dbpedia:Mark_Twain ;
    rdf:type dbo:Book .
dbpedia:The_Zombie_Survival_Guide
    dbo:author dbpedia:Max_Brooks ;
    rdf:type dbo:WrittenWork .
dbpedia:Web_Therapy
    dbo:author dbpedia:Lisa_Kudrow ;
    rdf:type dbo:Book .
...
Step 3 - Machine Learning
dbpedia:The_Adventures_of_Tom_Sawyer
    dbo:author dbpedia:Mark_Twain ;
    rdf:type dbo:Book .
dbpedia:The_Zombie_Survival_Guide
    dbo:author dbpedia:Max_Brooks ;
    rdf:type dbo:WrittenWork .
dbpedia:Web_Therapy
    dbo:author dbpedia:Lisa_Kudrow ;
    rdf:type dbo:Book .
Score(Domain(dbo:author, dbo:Book)) = 2/3 ≈ 66.7%
Score(Domain(dbo:author, dbo:WrittenWork)) = 1/3 ≈ 33.3%
dbo:Book rdfs:subClassOf dbo:WrittenWork .
Score(Domain(dbo:author, dbo:WrittenWork)) = 3/3 = 100%
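The effect of the subclass axiom on the scores can be reproduced with a small Python sketch. The function `domain_scores` and its data layout are assumptions for illustration; the three subjects are the ones shown on the slide.

```python
def domain_scores(types_by_subject, superclasses=None):
    """Fraction of subjects that belong to each class, optionally using subclass axioms."""
    superclasses = superclasses or {}
    total = len(types_by_subject)
    counts = {}
    for types in types_by_subject.values():
        inferred = set(types)
        stack = list(types)
        while stack:  # add all (transitive) superclasses
            for sup in superclasses.get(stack.pop(), ()):
                if sup not in inferred:
                    inferred.add(sup)
                    stack.append(sup)
        for cls in inferred:
            counts[cls] = counts.get(cls, 0) + 1
    return {cls: n / total for cls, n in counts.items()}

types_by_subject = {
    "dbpedia:The_Adventures_of_Tom_Sawyer": {"dbo:Book"},
    "dbpedia:The_Zombie_Survival_Guide": {"dbo:WrittenWork"},
    "dbpedia:Web_Therapy": {"dbo:Book"},
}
plain = domain_scores(types_by_subject)
with_schema = domain_scores(types_by_subject, {"dbo:Book": ["dbo:WrittenWork"]})
print(plain, with_schema)
```

Without the schema the two candidate domains split the evidence; with the subclass axiom every subject counts as a dbo:WrittenWork.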
Step 3 - Machine Learning(2)
Problem:
support for axiom in KB not taken into account → no difference between 3 out of 3 and 100 out of 100
Solution:
Average of 95% confidence interval (Wald method)
p' = \frac{s + 2}{m + 4}, \quad s = \#\text{success}, \quad m = \#\text{total}
upper bound: \min\left(1,\; p' + 1.96 \cdot \sqrt{\frac{p' \cdot (1 - p')}{m + 4}}\right)
lower bound: \max\left(0,\; p' - 1.96 \cdot \sqrt{\frac{p' \cdot (1 - p')}{m + 4}}\right)
“In 95% of the intervals the true value is between ... and ...”
Score(Domain(dbo:author, dbo:Book)) ≈ 57.3%
Score(Domain(dbo:author, dbo:WrittenWork)) ≈ 69.1%
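A minimal Python sketch of this score, implementing the formula exactly as printed above (the function name `wald_score` is an assumption). Evaluated this way, 2 out of 3 gives about 57.1% and 3 out of 3 about 69.0%, slightly below the values on the slide; one possible explanation, stated here only as an assumption, is that the reported numbers use the unrounded constants z²/2 ≈ 1.92 and z² ≈ 3.84 in place of 2 and 4.

```python
import math

def wald_score(successes, total, z=1.96):
    """Centre of the clipped 95% interval around the adjusted proportion p'."""
    p = (successes + 2) / (total + 4)
    half_width = z * math.sqrt(p * (1 - p) / (total + 4))
    upper = min(1.0, p + half_width)
    lower = max(0.0, p - half_width)
    return (lower + upper) / 2

score_2_of_3 = wald_score(2, 3)
score_3_of_3 = wald_score(3, 3)
score_100_of_100 = wald_score(100, 100)
print(score_2_of_3, score_3_of_3, score_100_of_100)
```

The score now separates 3 out of 3 from 100 out of 100: the larger sample yields a narrower interval and therefore a higher score.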
http://www.genomic-cds.org/ont/genomic-cds.owl
Class: human_with_CYP2C19_star_26
    SubClassOf: human_with_genetic_polymorphism
    Annotations: rdfs:label "human with CYP2C19 *26"
    SubClassOf:
        has some rs11188072_C and has some rs11568732_T and
        has some rs118203756_G and has some rs118203757_G and
        has some rs118203759_C and has some rs12248560_C and
        has some rs12571421_A and has some rs12769205_A and
        has some rs17878459_G and has some rs17878649_G and
        has some rs17879685_C and has some rs17879992_T and
        has some rs17882687_A and has some rs17884712_G and
        has some rs17884832_T and has some rs17885098_T and
        has some rs17886522_A and has some rs28399504_A and
        has some rs28399513_T and has some rs3758580_C and
        has some rs3758581_G and has some rs41291556_T and
        has some rs4244285_G and has some rs4417205_C and
        has some rs4917623_T and has some rs4986893_G and
        has some rs4986894_T and has some rs55640102_A and
        has some rs55752064_T and has some rs56337013_C and
        has some rs58973490_G and has some rs6413438_C and
        has some rs7088784_A and has some rs72552267_G and
        has some rs72558186_T and has some rs7902257_G and
        has some rs7916649_G
GALEN Ontology
Class: AbstractCavity
    EquivalentTo:
        BodyCavity and isSpaceDefinedBy some
            (BodyStructure and hasTopology some
                (Topology and hasAbsoluteState some surfaceHollow))
A ≡ B ⊓ ∃r.(C ⊓ ∃s.(D ⊓ ∃t.E))
Pattern Based Knowledgebase Enrichment
Input: Entity URI, Pattern, Knowledge Base (SPARQL Endpoint)
Preparation Phase
  Repositories: TONES, Oxford Library, BioPortal
  1. extract and normalise patterns → Normalised Axiom Frequency Database
  2. pattern to query rewriting → SPARQL Query Pattern Library
Execution Phase
  1. obtain schema information (only executed once per knowledge base)
     SPARQL Endpoint → Background Knowledge
  2. obtain axiom type and entity specific data (query modes: direct, sampled or local)
     SPARQL Endpoint → Background Knowledge + Relevant Instance Data
     Reasoner (optional invocation)
  3. compute confidence scores
     Learner (DL-Learner) → List of Axiom Suggestions + Metadata (Enrichment Ontology)
  batch mode: iterate over patterns and entities
Axiom Normalisation
based on structural equivalence defined in OWL 2 specification
For subclass axioms
reordering of class expressions in sub- and superclass
replacement of entities from left to right
Ensures that
  Father ⊑ Male ⊓ ∃hasChild.Person
  Carnivore ⊑ ∃eat.Meat ⊓ Animal
result in
  A ⊑ B ⊓ ∃r.C
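The two normalisation steps can be sketched in Python on a toy encoding of class expressions. Everything in this sketch is an assumption for illustration: the tuple encoding, the function names, and the ordering rule (named classes before existential restrictions); the OWL 2 structural-equivalence rules themselves are not reproduced here.

```python
def shape(ce):
    """Name-independent key describing the structure of a class expression."""
    if isinstance(ce, str):                      # named class
        return "C"
    if ce[0] == "some":                          # ("some", property, filler)
        return "some(" + shape(ce[2]) + ")"
    return "and(" + ",".join(sorted(shape(op) for op in ce[1])) + ")"

def reorder(ce):
    """Sort the operands of every intersection by their structural key."""
    if isinstance(ce, str):
        return ce
    if ce[0] == "some":
        return ("some", ce[1], reorder(ce[2]))
    return ("and", sorted((reorder(op) for op in ce[1]), key=shape))

def normalise(sub, sup):
    """Reorder operands, then replace entities from left to right."""
    classes, props = {}, {}
    class_letters, prop_letters = iter("ABCDEFGH"), iter("rstuvw")

    def fresh(name, table, letters):
        if name not in table:
            table[name] = next(letters)
        return table[name]

    def rename(ce):
        if isinstance(ce, str):
            return fresh(ce, classes, class_letters)
        if ce[0] == "some":
            prop = fresh(ce[1], props, prop_letters)
            return ("some", prop, rename(ce[2]))
        return ("and", [rename(op) for op in ce[1]])

    return rename(reorder(sub)), rename(reorder(sup))

father = normalise("Father", ("and", ["Male", ("some", "hasChild", "Person")]))
carnivore = normalise("Carnivore", ("and", [("some", "eat", "Meat"), "Animal"]))
print(father == carnivore, father)
```

Both axioms map to the same pattern, so their occurrences are counted together. Operands with identical structure keep their original order in this sketch, so it is not a complete canonical form.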
Pattern Transformation
Class Expression Ci    Graph Pattern p = τ(Ci, ?var)
A                      {?var a A.}
¬C                     {?var ?p ?o . FILTER NOT EXISTS {τ(C, ?var)}}
{a1, ..., an}          {?var ?p ?o . FILTER (?var IN (a1, ..., an))}
C1 ⊓ ... ⊓ Cn          {τ(C1, ?var) ∪ ... ∪ τ(Cn, ?var)}
C1 ⊔ ... ⊔ Cn          {τ(C1, ?var)} UNION ... UNION {τ(Cn, ?var)}
∃r.C                   {?var r ?s.} ∪ τ(C, ?s)
∃r.{a}                 {?var r a.}
∃r.SELF                {?var r ?var.}
...                    ...
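A few rows of this table can be sketched as a recursive Python function that emits SPARQL graph patterns as strings. The tuple encoding of class expressions, the function name `tau`, and the `?s0, ?s1, ...` naming of fresh variables are assumptions for illustration; only named classes, intersection, union and existential restriction are covered.

```python
from itertools import count

def tau(ce, var, fresh):
    """Translate a class expression into a SPARQL graph pattern for variable var."""
    if isinstance(ce, str):                      # A  ->  ?var a A .
        return f"{var} a <{ce}> ."
    kind = ce[0]
    if kind == "and":                            # union of the triple patterns
        return " ".join(tau(op, var, fresh) for op in ce[1])
    if kind == "or":                             # SPARQL UNION of group patterns
        return " UNION ".join("{ " + tau(op, var, fresh) + " }" for op in ce[1])
    if kind == "some":                           # ∃r.C  ->  ?var r ?s . τ(C, ?s)
        s = f"?s{next(fresh)}"
        return f"{var} <{ce[1]}> {s} . " + tau(ce[2], s, fresh)
    raise ValueError(f"unsupported constructor: {kind}")

pattern = tau(("and", ["B", ("some", "r", "C")]), "?x", count())
print(pattern)
```

The filler of an existential restriction is translated against the fresh variable, not against `?x`, which is what links the two triple patterns.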
Pattern Transformation - Example
A ⊑ B ⊓ ∃r.C
?x a <A>.
?x a <B>.
?x <r> ?s0.
?s0 a <C>.
A = Book
SELECT ?p ?cls0 ?cls1 (COUNT(DISTINCT ?x) AS ?cnt) {
  ?x a <Book>.
  ?x a ?cls0.
  ?x ?p ?s0.
  ?s0 a ?cls1.
} GROUP BY ?p ?cls0 ?cls1 ORDER BY DESC(?cnt)
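What this query computes can be mimicked in Python on an in-memory triple list. This is a sketch under stated assumptions: the function name, the `ex:` toy triples and the `rdf:type` string convention are made up for illustration, and, unlike the query, `rdf:type` triples are not considered as candidates for `?p`.

```python
from collections import defaultdict

def count_instantiations(triples, cls):
    """Count distinct instances of cls per (property, class, filler class) candidate."""
    types = defaultdict(set)   # subject -> its classes
    edges = defaultdict(set)   # subject -> (property, object) pairs
    for s, p, o in triples:
        if p == "rdf:type":
            types[s].add(o)
        else:
            edges[s].add((p, o))
    members = defaultdict(set)  # (p, cls0, cls1) -> distinct ?x
    for x, x_types in types.items():
        if cls not in x_types:
            continue
        for p, s0 in edges[x]:
            for cls0 in x_types:
                for cls1 in types.get(s0, ()):
                    members[(p, cls0, cls1)].add(x)
    return sorted(((len(xs), key) for key, xs in members.items()), reverse=True)

# Toy triples (hypothetical, for illustration only).
triples = [
    ("ex:b1", "rdf:type", "ex:Book"), ("ex:b1", "ex:author", "ex:p1"),
    ("ex:b2", "rdf:type", "ex:Book"), ("ex:b2", "ex:author", "ex:p2"),
    ("ex:b2", "ex:publisher", "ex:o1"),
    ("ex:p1", "rdf:type", "ex:Person"), ("ex:p2", "rdf:type", "ex:Person"),
    ("ex:o1", "rdf:type", "ex:Company"),
]
ranked = count_instantiations(triples, "ex:Book")
print(ranked)
```

Each ranked entry corresponds to one candidate instantiation of the pattern for the fixed class, together with the number of instances supporting it.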
Experimental Setup - Pattern Detection
3 Ontology Repositories:
#Ontologies
Repository   Total   Error
TONES        219     12
BioPortal    385     101
Oxford       793     0

#Axioms
             Total                TBox                 RBox          ABox
Repository   Avg      Max         Avg      Max         Avg   Max     Avg      Max
TONES        14,299   1,235,392   8297     658,449     20    932     5981     1,156,468
BioPortal    25,541   847,755     23,353   847,755     35    1339    2152     220,948
Oxford       49,997   2,492,761   15,384   2,259,770   25    1365    34,587   2,452,737
processed ≈ 1400 ontologies
Pattern Frequency Detection
                                                             Rank per repository
Pattern                        Frequency    #Ontologies   TONES   BioPortal   Oxford
1.  A ⊑ B                      10,174,991   1050          2       1           1
2.  A ⊑ ∃p.B                   8,199,457    604           1       2           2
3.  A ⊑ ∃p.(∃q.B)              509,963      24                                3
4.  A ≡ B ⊓ ∃p.C               361,777      319           8       4           4
5.  B ⊑ ¬A                     237,897      417           3       3           9
6.  A ≡ B                      104,508      151           13      34          7
7.  A ≡ ∃p.B                   70,040       139           36      32          8
8.  ∃p.Thing ⊑ A               41,876       595           6       7           11
9.  A ⊑ ∀p.B                   27,556       266           4       11          19
10. A ≡ B ⊓ ∃p.C ⊓ ∃q.D        24,277       196           11      13          13
11. A ≡ B ⊓ C                  16,597       78            5       20          22
12. A ⊑ ∃p.(B ⊓ ∃q.C)          12,453       84            23      18          15
13. A ⊑ ∃p.{a}                 11,816       65            12      22          20
14. A ≡ B ⊓ ∃p.(C ⊓ ∃q.D)      10,430       60            39      21          17
15. p ≡ q⁻                     9943         433           17      19          23
Fixpoint Analysis
How does the ranking of the most frequent axiom patterns change?
Experimental Setup - Pattern Application
DBpedia 3.8 (http://dbpedia.org/sparql)
100 random classes with at least 5 instances
60s data retrieval
at most 100 pattern instantiations per pattern
3 non-author evaluators
Manual Evaluation Results
                                         manual evaluation in %
pattern                    sample size   correct   minor issues   incorrect   Fleiss' κ
A ⊑ ∃p.B                   50            88.0      0.7            11.3        24.8
A ⊑ B                      47            63.8      2.1            34.0        53.8
A ≡ B                      25            10.7      0.0            89.3        44.0
A ≡ ∃p.B                   68            29.9      2.0            68.1        60.4
A ≡ B ⊓ ∃p.C               100           25.0      3.0            72.0        72.9
A ≡ B ⊓ ∃p.(C ⊓ ∃q.D)      100           23.0      5.3            71.7        43.5
A ⊑ ∃p.(∃q.B)              71            85.0      3.3            11.7        34.0
A ⊑ ∃p.(B ⊓ ∃q.C)          100           87.0      0.3            12.7        -2.8
A ⊑ ∃p.{a}                 15            71.1      0.0            28.9        45.9
A ≡ B ⊓ C                  42            14.3      7.1            78.6        46.7
A ≡ B ⊓ ∃p.C ⊓ ∃q.D        100           37.0      2.7            59.7        75.0
Total                      718           48.2      2.7            49.0        66.1
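The agreement column can be recomputed from raw ratings with the standard Fleiss' κ formula. The sketch below is illustrative only: the function name and the toy rating matrices are assumptions, and the κ values in the table appear to be given in percent, which is also an assumption.

```python
def fleiss_kappa(ratings):
    """ratings[i][j] = number of raters who assigned item i to category j."""
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Observed agreement per item, averaged over all items.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    p_bar = sum(p_items) / n_items
    # Agreement expected by chance from the overall category proportions.
    p_cat = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1 - p_e)

# Toy data: 3 raters, categories (correct, minor issues, incorrect).
perfect = fleiss_kappa([[3, 0, 0], [0, 3, 0]])
mixed = fleiss_kappa([[2, 1, 0], [1, 2, 0]])
print(perfect, mixed)
```

Values near 1 indicate strong agreement among the evaluators, values near 0 (or below) indicate agreement no better than chance.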
Threshold Analysis
How many of the pattern instantiations with an accuracy value in a particular interval are correct using majority voting?
Conclusion
We proposed an approach, which allows for
detecting frequent axiom usage patterns
converting them into SPARQL query patterns
learning complex schema axioms
The evaluation has shown that
it is feasible
but still should be used in a semi-automatic manner
Overall it results in
a freely available tool that can be used to suggest both complex TBox and RBox axioms on large knowledge bases accessible via SPARQL endpoints
Future Work
more ontology repositories, e.g. LOD cloud
check for patterns containing more than one axiom
improve score computation
learn appropriate thresholds for each axiom type
GeoKnow
Thank You! Questions?