openhpi 6.7 - semantic search
This file is licensed under the Creative Commons Attribution-NonCommercial 3.0 (CC BY-NC 3.0)
Dr. Harald Sack
Hasso Plattner Institute for IT Systems Engineering
University of Potsdam
Spring 2013
Semantic Web Technologies
Lecture 6: Applications in the Web of Data
07: Semantic Search
Semantic Web Technologies , Dr. Harald Sack, Hasso-Plattner-Institut, Universität Potsdam
Meaning
[Figure: semiotic triangle. The Symbol (example: "Armstrong") symbolizes a Concept, the Concept refers to an Object, and the Symbol stands for the Object. Sender and receiver each interpret the symbol through their own Experience, within Context and Pragmatics.]
Ogden, Richards: The Meaning of Meaning: A Study of the Influence of Language upon Thought and of the Science of Symbolism (1923)
http://commons.wikimedia.org/wiki/User:McSmit
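[Figure: the symbol "Armstrong" is resolved to the entity Neil Armstrong (http://dbpedia.org/resource/Neil_Armstrong), which is placed in an ontology. Layer labels: Entities, Ontologies. Node labels: Astronaut, Person, Science Occupation, Employment, Kosmonaut. Edge labels: is a, is NOT a, has an, same as, subClassOf.]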
Classical Information Retrieval
(acc. to Salton, G., McGill, M.J.: Introduction to Modern Information Retrieval. McGraw-Hill, New York 1983)
Set of Documents
files of records
Set of Queries
Information requests
indexing language
similarity
indexing
query formulation
Classical Information Retrieval (simplified version)
Set of documents
search index
[Example document: excerpt from a German dictionary entry (vol. 20, col. 835) for the verb "suchen" (to search): attested in all Germanic languages, with Latin and Old Irish cognates; its core meaning is 'to track something down, to make an effort to find it' (and then also 'to seek someone out, to threaten or attack them').]
keywords
„search“?
search query
search term(s)
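The simplified pipeline above (set of documents, search index, search terms) can be illustrated with a minimal inverted index. This is only a sketch under assumptions: the sample documents, the naive whitespace tokenizer, and the helper names `tokenize`, `build_index` and `search` are hypothetical and do not come from the lecture.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return text.lower().split()


def build_index(documents):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every search term (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, for illustration only.
documents = {
    1: "search engines build a search index",
    2: "keywords form the search query",
    3: "documents are files of records",
}
index = build_index(documents)
```

A query such as `search(index, "search index")` only finds documents that literally contain both terms, which is the limitation the semantic search techniques in this unit address.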
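Evaluation of Information Retrieval Systems
R: relevant documents
P: retrieved documents
R ∩ P: relevant documents that have been retrieved
\[ \mathrm{Recall} = \frac{|R \cap P|}{|R|} \]
\[ \mathrm{Precision} = \frac{|R \cap P|}{|P|} \]
\[ F_\alpha = \frac{(1+\alpha) \cdot (\mathrm{Recall} \cdot \mathrm{Precision})}{\alpha \cdot (\mathrm{Recall} + \mathrm{Precision})} \]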
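As a worked example of the measures above, the following sketch computes recall, precision and the F measure from two sets of document ids. The function names and the sample ids are assumptions made for illustration. The F measure follows the formula as written on the slide; for α = 1 it reduces to the harmonic mean of recall and precision.

```python
def recall(relevant, retrieved):
    """|R ∩ P| / |R|: share of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(relevant & retrieved) / len(relevant)


def precision(relevant, retrieved):
    """|R ∩ P| / |P|: share of the retrieved documents that are relevant."""
    if not retrieved:
        return 0.0
    return len(relevant & retrieved) / len(retrieved)


def f_measure(relevant, retrieved, alpha=1.0):
    """F measure as given on the slide: (1+α)(R·P) / (α(R+P))."""
    r = recall(relevant, retrieved)
    p = precision(relevant, retrieved)
    denominator = alpha * (r + p)
    if denominator == 0:
        return 0.0
    return (1 + alpha) * (r * p) / denominator


# Hypothetical document ids, for illustration only.
relevant = {1, 2, 3, 4}   # R
retrieved = {3, 4, 5}     # P
```

Here two of the four relevant documents are retrieved (recall 0.5) and two of the three retrieved documents are relevant (precision 2/3).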
Semantic Search
(One of many definitions...)
• Annotation of (text-based) metadata with semantic entities
• Entity-based Information Retrieval
• Make use of semantic relations, e.g. content-based similarities or relationships
• Interoperable metadata via semantic annotations
  • for content-based description
  • for structural / technical description (Multimedia Ontologies)
Overall Goal: Quantitative and qualitative improvement of Information Retrieval
Semantic Search
Semantic metadata enable improvement of traditional keyword-based retrieval by
(1) Query String Extension/Refinement: enables more precise or more complete search results
(2) Cross Referencing: complements search results with additional associated or similar information
(3) Exploratory Search: enables visualization and navigation of the search space
(4) Reasoning: complements search results with implicitly given information
[Image: The Tower of Babel, Pieter Brueghel, 1563]
Semantic Search: Query String Extension
• Keyword-based search does not deliver all search results that are relevant for a query, because documents might describe the queried content with synonyms or metaphors.
• Extension of the original query string (Query Extension)
  • from dictionaries and thesauri
    • extend query with synonyms, hyponyms, etc.
  • from domain ontologies
    • extend query with meronyms, related concepts, etc.
Original query string: Bank
Possible extensions: Bank ∨ depository financial institution ∨ credit union ∨ acquirer ∨ federal reserve ∨ ...
→ increases recall
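The query extension for "Bank" can be sketched as an OR over the original term and its thesaurus entries. This is a minimal illustration, not a definitive implementation: the `THESAURUS` dictionary only contains the extensions listed above, and the sample documents, the substring matching and the helper names `expand_query` and `match_any` are assumptions.

```python
# Thesaurus entries taken from the example above; a real system would
# consult a dictionary, thesaurus or domain ontology instead.
THESAURUS = {
    "bank": [
        "depository financial institution",
        "credit union",
        "acquirer",
        "federal reserve",
    ],
}

# Hypothetical sample documents, for illustration only.
documents = {
    1: "the bank raised interest rates",
    2: "the credit union offers loans",
    3: "the federal reserve sets policy",
    4: "astronauts walked on the moon",
}


def expand_query(term, thesaurus):
    """Return the original term followed by its known extensions."""
    return [term] + thesaurus.get(term.lower(), [])


def match_any(documents, terms):
    """Ids of documents containing at least one term (OR semantics)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if any(term.lower() in text.lower() for term in terms)
    }


original_hits = match_any(documents, ["Bank"])
extended_hits = match_any(documents, expand_query("Bank", THESAURUS))
```

With these sample documents the original query finds one document and the extended query finds three, which is the recall gain the slide refers to. Precision may drop if an extension is itself ambiguous.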
Semantic Search: Query String Refinement
• Keyword-based search also delivers search results that are not relevant for a query, because query terms and document terms might be ambiguous.
• Refinement of the original query string (Query Refinement)
  • from dictionaries and thesauri
    • disambiguate polysemic terms with hypernyms
  • from domain ontologies
    • disambiguate polysemic terms with holonyms
Original query string: Bank
Possible refinements:
(1) Bank ∧ financial institution
(2) Bank ∧ incline ∧ slope ∧ side
(3) Bank ∧ container
(4) Bank ∧ deposit ∧ repository
→ increases precision
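The refinements for "Bank" can be sketched as an AND over the original term and the disambiguating terms of one chosen sense. The sense labels, the sample documents and the helper name `match_all` are hypothetical; only the refinement terms come from the example above.

```python
# Disambiguating terms per sense, taken from refinements (1) to (4) above.
# The sense labels used as keys are hypothetical names.
REFINEMENTS = {
    "finance": ["financial institution"],
    "slope": ["incline", "slope", "side"],
    "container": ["container"],
    "repository": ["deposit", "repository"],
}

# Hypothetical sample documents, for illustration only.
documents = {
    1: "the bank is a financial institution that grants loans",
    2: "the river bank is a steep incline, a slope on the side of the water",
    3: "a piggy bank is a container for coins",
}


def match_all(documents, terms):
    """Ids of documents containing every term (AND semantics)."""
    return {
        doc_id
        for doc_id, text in documents.items()
        if all(term.lower() in text.lower() for term in terms)
    }


unrefined_hits = match_all(documents, ["Bank"])
finance_hits = match_all(documents, ["Bank"] + REFINEMENTS["finance"])
slope_hits = match_all(documents, ["Bank"] + REFINEMENTS["slope"])
```

The unrefined query returns all three documents, while each refined query returns only the document for the intended sense. Strict AND matching can also reduce recall, so real systems often weight the added terms rather than require all of them.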
Semantic Search: Cross Referencing
• Provide search results that do not literally contain the query string but are closely related to the query by content
• Apply domain ontologies for determining related concepts
• Apply statistical analysis of large (text) document corpora
[Figure: the query string "Neil Armstrong" is mapped via NER to dbpedia:Neil_Armstrong; dbprop:mission links connect dbpedia:Neil_Armstrong, dbpedia:Buzz_Aldrin and dbpedia:Michael_Collins with dbpedia:Apollo_11.]
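The cross-referencing example can be sketched with a small in-memory triple list. The triples below are an assumed reading of the figure (each astronaut linked to dbpedia:Apollo_11 via dbprop:mission); the label lookup stands in for a real NER step, and the names `ENTITY_LABELS`, `recognize_entity` and `cross_references` are hypothetical.

```python
# Assumed reading of the figure: subject, predicate, object.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Buzz_Aldrin", "dbprop:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Michael_Collins", "dbprop:mission", "dbpedia:Apollo_11"),
]

# Stand-in for named entity recognition: a plain label lookup.
ENTITY_LABELS = {"neil armstrong": "dbpedia:Neil_Armstrong"}


def recognize_entity(query):
    """Map a query string to an entity, or None if it is unknown."""
    return ENTITY_LABELS.get(query.strip().lower())


def cross_references(entity, triples):
    """Objects linked from `entity`, plus other subjects sharing that link."""
    related = set()
    for subject, predicate, obj in triples:
        if subject != entity:
            continue
        related.add(obj)
        for other, other_predicate, other_obj in triples:
            if other != entity and other_predicate == predicate and other_obj == obj:
                related.add(other)
    return sorted(related)


entity = recognize_entity("Neil Armstrong")
related = cross_references(entity, TRIPLES)
```

None of the related entities contains the query string, yet all are connected to it through the shared mission, which is what allows them to complement the search results.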
Semantic Search: Exploratory Search
• Provide additional search results that do not necessarily contain the query string but are related by content to the query or to the results of the direct query
• Apply domain ontologies and heuristics to determine the relevance of facts
[Figure: RDF graph around dbpedia:Neil_Armstrong. Nodes: dbpedia:Neil_Armstrong, dbpedia:Apollo_11, dbpedia:Apollo_13, dbpedia:Space_Shuttle_Challenger, category:Apollo_program, yago:Space_accidents_and_incidents. Edge labels: dbpedia-owl:mission, dcterms:subject, rdf:type.]
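One simple relevance heuristic for exploratory search is graph distance: the fewer links between the queried entity and another resource, the more relevant that resource is assumed to be. The sketch below uses a breadth-first search with a hop limit. The edges are an assumed reading of the figure, and the hop-count heuristic and the function name `explore` are illustrative choices, not the method of the lecture.

```python
from collections import deque

# Assumed reading of the figure: subject, predicate, object.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "dcterms:subject", "category:Apollo_program"),
    ("dbpedia:Apollo_13", "rdf:type", "yago:Space_accidents_and_incidents"),
    ("dbpedia:Space_Shuttle_Challenger", "rdf:type", "yago:Space_accidents_and_incidents"),
]


def explore(start, triples, max_hops):
    """Return {resource: hop distance} for resources within max_hops of start.

    Links are followed in both directions, because a shared category
    connects two resources regardless of edge direction.
    """
    neighbours = {}
    for subject, _predicate, obj in triples:
        neighbours.setdefault(subject, set()).add(obj)
        neighbours.setdefault(obj, set()).add(subject)

    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbour in neighbours.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)

    del distance[start]
    return distance


nearby = explore("dbpedia:Neil_Armstrong", TRIPLES, max_hops=3)
```

With a limit of three hops, Apollo 13 is suggested via the shared Apollo program category, while the Space Shuttle Challenger (five hops away in this assumed graph) is not.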
Semantic Search: Reasoning
• Provide additional search results (and information) that do not necessarily contain the query string but are related to the query by content, where the relation need not be direct but can be derived via entailment.
• Apply domain ontologies, reasoning algorithms and heuristics to find new facts and determine the relevance of facts
[Figure: RDF graph around dbpedia:Neil_Armstrong. Nodes: dbpedia:Neil_Armstrong, dbpedia:Apollo_11, dbpedia:Moon, category:Missions_to_the_Moon, category:Exploration_of_the_Moon, category:Moon, category:Spaceflight, category:Animals_in_Space. Edge labels: dbpedia-owl:mission, dcterms:subject, skos:broader.]
Example: query string = Neil Armstrong
(Hard) questions to solve via reasoning:
• Will there be the Moon or documents about the Moon in the search results?
• How is Neil Armstrong related to the Moon? (is he?)
• Was Neil Armstrong (really) on the Moon?
• ...
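A minimal sketch of deriving implicit facts for this example: if a resource has a dcterms:subject category, it is also treated as being about every category reachable through skos:broader. Several things here are assumptions: the exact edges are a guessed reading of the flattened figure, treating skos:broader as transitive is a simplification (SKOS itself reserves transitivity for skos:broaderTransitive), and the function names are hypothetical.

```python
# Assumed reading of the figure: subject, predicate, object.
TRIPLES = [
    ("dbpedia:Neil_Armstrong", "dbpedia-owl:mission", "dbpedia:Apollo_11"),
    ("dbpedia:Apollo_11", "dcterms:subject", "category:Missions_to_the_Moon"),
    ("category:Missions_to_the_Moon", "skos:broader", "category:Exploration_of_the_Moon"),
    ("category:Exploration_of_the_Moon", "skos:broader", "category:Moon"),
    ("category:Exploration_of_the_Moon", "skos:broader", "category:Spaceflight"),
    ("category:Animals_in_Space", "skos:broader", "category:Spaceflight"),
    ("dbpedia:Moon", "dcterms:subject", "category:Moon"),
]


def broader_closure(category, triples):
    """All categories reachable from `category` via one or more skos:broader steps."""
    seen = set()
    stack = [category]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == "skos:broader" and obj not in seen:
                seen.add(obj)
                stack.append(obj)
    return seen


def inferred_subjects(resource, triples):
    """Direct dcterms:subject categories plus all their broader categories."""
    direct = {
        obj
        for subject, predicate, obj in triples
        if subject == resource and predicate == "dcterms:subject"
    }
    inferred = set(direct)
    for category in direct:
        inferred |= broader_closure(category, triples)
    return inferred


apollo_subjects = inferred_subjects("dbpedia:Apollo_11", TRIPLES)
shared = apollo_subjects & inferred_subjects("dbpedia:Moon", TRIPLES)
```

Under these assumptions Apollo 11 and the Moon share category:Moon, so documents about the Moon could be added to the results for Neil Armstrong via his mission. The derivation says nothing about whether he was actually on the Moon, and the same closure also reaches the very general category:Spaceflight, which is why heuristics for the relevance of derived facts are needed.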
08 - Exploratory Semantic Search