Assigning Global Relevance Scores to DBpedia Facts
Philipp Langer, Patrick Schulze, Stefan George, Tobias Metzke, Ziawasch Abedjan, Gjergji Kasneci
DESWeb 03/31/2014
Structured Data
■ Advantages of structured data over unstructured data:
□ Search for explicit facts
□ Summarization of possibly interesting information
□ Automated knowledge discovery
■ Google Knowledge Graph
■ RDF Knowledge bases
□ DBpedia, YAGO/NAGA
A handful of salient facts about the query entity.
Querying YAGO
■ Asking for classes to which Albert Einstein belongs
Querying DBpedia
■ Asking for classes to which Albert Einstein belongs
predicate object
rdf:type owl:Thing
rdf:type dbpedia:Agent
rdf:type dbpedia:Person
rdf:type dbpedia:Scientist
rdf:type umbel:Scientist
rdf:type schema:Person
rdf:type yago:Astronomer109818343
rdf:type foaf:Person
rdf:type 19th-centuryAmericanPeople
rdf:type 19th-centuryGermanPeople
Challenge
SELECT DISTINCT ?p ?o WHERE
{ dbpedia:Barack_Obama ?p ?o }
p o
rdf:type owl:Thing
rdf:type dbpedia:Person
rdf:type yago:Person100007846
... ...
rdf:type dbpedia:Politician
... ...
dbpedia:spouse dbpedia:Michelle_Obama
Web Documents
p o
owl:orderInOffice President of the United States
dbpedia:type dbpedia:Politician
dbpedia:spouse dbpedia:Michelle_Obama
owl:birthPlace dbpedia:Honolulu
dbpprop:residence dbpedia:White_House
.... .....
rdf:type owl:Thing
Challenges
■ Big data: DBpedia 3.8, ClueWeb corpus
■ Architecture: text extraction, score computation/ranking, query processing
■ Evaluation: conducting user studies
■ Ranking strategies: improve the ranking results
Overview
Languages:
• Python
• Java
• SPARQL
• JavaScript
Frameworks:
• Django
• Lucene
Components (architecture diagram):
• Web application (Django): user studies, querying
• DBpedia endpoint (Apache Jena)
• Application data (Postgres)
• Web corpus (Lucene index)
• Ranking strategies: intra-DBpedia strategies, web corpus strategies
Ranking Facts
■ Query types:
□ Subject queries - return all physicists
SELECT ?s { ?s type Physicist }
□ Property queries - return all facts related to Einstein
SELECT ?p ?o { Albert_Einstein ?p ?o }
■ Ranking strategies
□ Ranking by frequency and document frequency
□ Ranking by information diversity
□ Random walk
□ Web-based co-occurrence statistics
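The two query types can be illustrated on a tiny in-memory set of triples. This is only a sketch: the toy triples and the helper names `subject_query` and `property_query` are assumptions made for illustration, and the actual system queries a DBpedia endpoint with SPARQL.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Toy triples (illustrative only); real data would come from the DBpedia endpoint.
TRIPLES: List[Triple] = [
    ("Albert_Einstein", "type", "Physicist"),
    ("Albert_Einstein", "spouse", "Mileva_Maric"),
    ("Isaac_Newton", "type", "Physicist"),
    ("Barack_Obama", "type", "Politician"),
]


def subject_query(triples: List[Triple], predicate: str, obj: str) -> List[str]:
    """Subject query: all subjects s with a triple (s, predicate, obj)."""
    return [s for s, p, o in triples if p == predicate and o == obj]


def property_query(triples: List[Triple], subject: str) -> List[Tuple[str, str]]:
    """Property query: all (predicate, object) pairs of the given subject."""
    return [(p, o) for s, p, o in triples if s == subject]


physicists = subject_query(TRIPLES, "type", "Physicist")
einstein_facts = property_query(TRIPLES, "Albert_Einstein")
```

Both helpers return results in input order, without any ranking. The ranking strategies listed above are what would reorder these result lists.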
Ranking by frequency and document frequency
Subject document of "Albert Einstein":
<Albert_Einstein>
<topic> <Nobel_laureates>;
<topic> <Theoretical_physicists>;
<topic> <German_physicists>;
<topic> <American_inventors>;
<type> <Scientist>;
<type> <Person>;
<type> <Thing>;
<residence> "Switzerland";
<residence> "Austria-Hungary";
<residence> "German Empire";
<spouse> "Mileva Maric";
...
Predicate document of "topic":
<Newton> <topic> <Theoretical_physicists>.
<Newton> <topic> <Nobel_laureates>.
<Newton> <topic> <Mathematicians>.
<Newton> <topic> <Optical_physicists>.
<Newton> <topic> <History_of_calculus>.
<Newton> <topic> <English_alchemists>.
<Einstein> <topic> <Theoretical_physicists>.
<Einstein> <topic> <Nobel_laureates>.
<Einstein> <topic> <German_physicists>.
<Einstein> <topic> <American_inventors>.
Object document of "Theoretical physicists":
<Isaac_Newton> <topic> <Theoretical_physicists>.
<Albert_Einstein> <topic> <Theoretical_physicists>.
<Bruno_Coppi> <topic> <Theoretical_physicists>.
<Ravi_Gomatam> <topic> <Theoretical_physicists>.
...
[Shady et al ESWC’11]
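A minimal sketch of how such subject, predicate, and object documents could be built from triples, and how simple frequency counts could be read off them. The toy data, the function names, and the two count definitions are assumptions for illustration; the slides do not give the exact scoring formula, so none is reproduced here.

```python
from collections import defaultdict

# Toy triples loosely following the slide; the "citizenship" fact is invented.
TRIPLES = [
    ("Albert_Einstein", "topic", "Theoretical_physicists"),
    ("Albert_Einstein", "topic", "Nobel_laureates"),
    ("Albert_Einstein", "residence", "Switzerland"),
    ("Albert_Einstein", "citizenship", "Switzerland"),
    ("Isaac_Newton", "topic", "Theoretical_physicists"),
    ("Bruno_Coppi", "topic", "Theoretical_physicists"),
]


def build_documents(triples):
    """Group triples into subject, predicate, and object documents."""
    docs = {
        "subject": defaultdict(list),
        "predicate": defaultdict(list),
        "object": defaultdict(list),
    }
    for s, p, o in triples:
        docs["subject"][s].append((s, p, o))
        docs["predicate"][p].append((s, p, o))
        docs["object"][o].append((s, p, o))
    return docs


def frequency(docs, obj):
    """Assumed definition: number of triples in the object document."""
    return len(docs["object"].get(obj, []))


def document_frequency(docs, obj):
    """Assumed definition: number of distinct subjects mentioning the object."""
    return len({s for s, _, _ in docs["object"].get(obj, [])})


docs = build_documents(TRIPLES)
# Objects ordered by how often they occur, most frequent first.
ranked_objects = sorted(docs["object"], key=lambda o: frequency(docs, o), reverse=True)
```

In this toy data, "Switzerland" occurs in two triples of a single subject, so its frequency and document frequency differ.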
Ranking by frequency and document frequency
■ Subject queries:
□ Global relevance
Isaac Newton
academicAdvisor ...; birthDate ...; birthPlace ...; comment ...; ethnicity ...; field ...; influenced ...; influencedBy ...; knownFor ...; label ...; notableStudent ...; subject ...; subject ...; type ...;
Ravi Gomatam
subject ...; subject ...; subject ...; subject ...; subject ...;
Limitations for Property Queries
■ Property queries:
□ Facts should be globally relevant but also distinctive for the given subject
– type Person vs. type Scientist
Ranking by diversity
■ Following a probabilistic model
□ Property queries:
– Properties and objects that are as discriminative as possible
□ Subject queries:
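One way to make "as discriminative as possible" concrete for property queries is to score a fact by how rare it is across all subjects. The sketch below uses the self-information -log(n_fact / n_subjects) as that score. This is an assumed stand-in chosen for illustration, not the probabilistic model of the talk, whose formulas are not part of the extracted slides.

```python
import math
from collections import Counter

# Toy triples (illustrative only).
TRIPLES = [
    ("Albert_Einstein", "type", "Person"),
    ("Albert_Einstein", "type", "Scientist"),
    ("Isaac_Newton", "type", "Person"),
    ("Isaac_Newton", "type", "Scientist"),
    ("Barack_Obama", "type", "Person"),
    ("Barack_Obama", "type", "Politician"),
]


def discriminativeness(triples):
    """Map each (predicate, object) fact to -log of the share of subjects having it."""
    unique = set(triples)  # count each fact at most once per subject
    n_subjects = len({s for s, _, _ in unique})
    fact_counts = Counter((p, o) for _, p, o in unique)
    return {fact: -math.log(count / n_subjects) for fact, count in fact_counts.items()}


def rank_facts(triples, subject):
    """Order the facts of one subject from most to least discriminative."""
    scores = discriminativeness(triples)
    facts = [(p, o) for s, p, o in triples if s == subject]
    return sorted(facts, key=lambda fact: scores[fact], reverse=True)


einstein_ranking = rank_facts(TRIPLES, "Albert_Einstein")
```

Under this assumed score, a fact shared by every subject (type Person) gets a score of zero and falls behind a rarer one (type Scientist).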
Random Walk Model
■ Consider the knowledge base as a directed graph
□ Already applied in [Kasneci CIKM’09]
□ Problem: literals have no outgoing link
■ Use Wiki Pagelinks and Infobox Property Mappings
□ Entities with high indegree, such as countries, are favored
– Good for subject queries
– Bad for property queries
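The effect of indegree can be seen in a small PageRank-style random walk. This is a generic sketch, not the model from the talk: the edge list is a hypothetical stand-in for page links, the damping factor of 0.85 is a conventional default, and nodes without outgoing links are handled by spreading their score uniformly, which is one standard treatment and an assumption here.

```python
def random_walk_scores(edges, damping=0.85, iterations=50):
    """PageRank-style power iteration over a directed edge list."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    score = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing links is spread uniformly.
        dangling = sum(score[node] for node in nodes if not out_links[node])
        new_score = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_score[dst] += damping * score[src] / len(out_links[src])
        score = new_score
    return score


# Hypothetical link graph: three people link to a country, one also links to a person.
EDGES = [
    ("Albert_Einstein", "Germany"),
    ("Albert_Einstein", "Max_Planck"),
    ("Max_Planck", "Germany"),
    ("Werner_Heisenberg", "Germany"),
]

scores = random_walk_scores(EDGES)
top_entity = max(scores, key=scores.get)
```

In this toy graph the country node collects score from every person that links to it, which mirrors the observation that high-indegree entities are favored.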
Web Documents
Co-occurrence statistics
■ Lemur Project ClueWeb09 Category-B web corpus
□ 50 million web documents (1.5 TB)
□ Only English-language documents
□ Includes approx. 2.7 million Wikipedia articles
■ Create an inverted index
■ Treat text windows with different word-distance limits as documents
■ Rank subject-object pairs
□ "Albert Einstein" and "Physicist"
□ Store only pairwise co-occurrence:
□ Compute frequency of s:
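Counting pairwise co-occurrences within a word-distance limit can be sketched as follows. The tokenizer, the function names, and the sample text are assumptions for illustration. The actual system works on a Lucene index over the web corpus, and the normalization applied to these counts is not given in the extracted slides, so only raw counts are computed.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def positions(tokens, phrase):
    """Start indexes at which the tokenized phrase occurs in tokens."""
    words = tokenize(phrase)
    if not words:
        return []
    k = len(words)
    return [i for i in range(len(tokens) - k + 1) if tokens[i:i + k] == words]


def frequency(text, phrase):
    """Number of occurrences of the phrase in the text."""
    return len(positions(tokenize(text), phrase))


def cooccurrences(text, subject, obj, max_distance):
    """Count subject/object occurrence pairs at most max_distance tokens apart."""
    tokens = tokenize(text)
    subject_positions = positions(tokens, subject)
    object_positions = positions(tokens, obj)
    return sum(
        1
        for i in subject_positions
        for j in object_positions
        if abs(i - j) <= max_distance
    )


TEXT = (
    "Albert Einstein was a German-born theoretical physicist. "
    "The physicist Albert Einstein developed relativity."
)
near = cooccurrences(TEXT, "Albert Einstein", "Physicist", max_distance=5)
far = cooccurrences(TEXT, "Albert Einstein", "Physicist", max_distance=10)
```

Distances are measured between the start positions of the two phrases. A larger limit admits more pairs, which is why different limits yield different statistics.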
Evaluation
■ User study 1
□ 8 queries
□ all results
□ 12 users
□ 19 approaches/configurations
□ Ratings from 1 (irrelevant) to 4 (highly relevant)
■ User study 2
□ 8+20 queries
□ top-10 results of best 4 approaches side-by-side
□ 10 users
□ Best 3 approaches from user study 1
Top 4 Approaches in User study 1
User study 2
Results Example: Theoretical Physicists
Subject
Albert Einstein
Isaac Newton
Galileo Galilei
James Clerk Maxwell
Richard Feynman
Stephen Hawking
Max Planck
Enrico Fermi
Werner Heisenberg
Pierre-Simon Laplace
DBpedia Random Walk Model
Results Example: Albert Einstein
DBpedia
predicate object
rdf:type owl:Thing
rdf:type dbpedia:Agent
rdf:type dbpedia:Person
rdf:type dbpedia:Scientist
rdf:type umbel:Scientist
rdf:type schema:Person
rdf:type yago:Astronomer109818343
rdf:type foaf:Person
rdf:type 19th-centuryAmericanPeople
rdf:type 19th-centuryGermanPeople
Co-occurrence statistics
predicate object
fields Physics
field Physics
deathPlace United States
placeOfDeath United States
shortDescription Physicists
description Physicist
type Scientist
ethnicity Jewish
subject Einstein family
residence Switzerland
Conclusions
■ Investigated multiple approaches to rank DBpedia facts
□ Information theory, statistical reasoning, random walk, and co-occurrence statistics in web documents
■ The DBpedia knowledge base already provides enough information to improve the ranking of results
■ Improvement of property queries through web-based co-occurrence statistics
■ We provide the annotated datasets at
□ https://www.hpi.uni-potsdam.de/naumann/sites/dbpedia/