a compositional-distributional semantic model over structured data
TRANSCRIPT
![Page 1: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/1.jpg)
A Compositional-distributional Semantic Model over Structured
DataAndré Freitas, Edward CurryInsight @ NUI Galway
Insight Workshop on Latent Space Methods (UCD)
![Page 2: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/2.jpg)
Talking to your Data
André Freitas, Edward CurryInsight @ NUI Galway
![Page 3: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/3.jpg)
Outline
Motivation & Context
Compositional-distributional Semantic Model: Distributional Relational Networks (DRNs)
Use Case: Question Answering over Linked Data
Conclusions
![Page 4: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/4.jpg)
Motivation & Context
![Page 5: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/5.jpg)
Big Data
Vision: More complete data-based picture of the world for systems and users.
![Page 6: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/6.jpg)
Shift in the Database Landscape
Heterogeneous, complex and large-scale Knowledge Bases (KBs).
Very-large and dynamic “schemas”.
10s-100s attributes1,000s-1,000,000s attributescirca 2000
circa 2013
![Page 7: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/7.jpg)
Semantic Heterogeneity Decentralized content generation. Multiple perspectives (conceptualizations) of the
reality. Ambiguity, vagueness, inconsistency.
![Page 8: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/8.jpg)
Databases for a Complex World
How do you query and do reasoning over data on this scenario?
![Page 9: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/9.jpg)
Vocabulary Problem for DatabasesQuery: Who is the daughter of Bill Clinton married to?
Semantic approximationSemantic Gap
Possible representations = Commonsense Knowledge
![Page 10: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/10.jpg)
Semantics for a Complex World “Most semantic models have dealt with particular
types of constructions, and have been carried out under very simplifying assumptions, in true lab conditions.”
“If these idealizations are removed it is not clear at all that modern semantics can give a full account of all but the simplest sentences.”Formal World Real World
Baroni et al. Frege in Space
![Page 11: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/11.jpg)
Compositional-Distributional Semantic Models Principled and effective semantic models for
coping with real world semantic conditions.
Based on vector space models.
Practical way to automatically harvest word “meanings” on a large-scale.
Focus on semantic approximation.
Applications- Semantic search.- Approximate semantic inference.- Paraphrase detection.- Semantic anomaly detection.
![Page 12: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/12.jpg)
Addressing the Vocabulary Problem for Databases (with Distributional Semantics)
![Page 13: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/13.jpg)
Solution (Video)
![Page 14: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/14.jpg)
More Complex Queries (Video)
![Page 15: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/15.jpg)
Treo Answers Jeopardy Queries (Video)
![Page 16: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/16.jpg)
Evaluation
102 natural language queries (Test Collection: QALD 2011).
Avg. query execution time: 8.53 s.
Dataset (DBpedia 3.7 + YAGO): 45,767 predicates, 5,556,492 classes and 9,434,677 instances
![Page 17: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/17.jpg)
Evaluation
![Page 18: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/18.jpg)
Compositional-Distributional Model: Distributional Relational Networks
(DRNs)
![Page 19: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/19.jpg)
Distributional Hypothesis
“Words occurring in similar (linguistic) contexts are semantically similar.”
If we can equate meaning with context, we can simply record the contexts in which a word occurs in a collection of texts (a corpus).
This can then be used as a surrogate of its semantic representation.
![Page 20: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/20.jpg)
Vector Space Model
c1
child
husbandspouse
cn
c2
function (number of times that the words occur in c1)
0.7
0.5
![Page 21: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/21.jpg)
Semantic Similarity/Relatedness
θ
c1
child
husbandspouse
cn
c2
![Page 22: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/22.jpg)
Approach Overview
Querying & Reasoning
Distributional Relational
Network (DRN)
Large-scale unstructured data
Commonsense knowledge
Structured/logic Knowledge Bases
Distributional semantics
Core semantic approximation &
composition operations
![Page 23: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/23.jpg)
Approach Overview
Querying & Reasoning
Distributional Relational
Network (DRN)
Wikipedia
Commonsense knowledge
Graph Data (RDF)
Explicit Semantic Analysis (ESA)
Core semantic approximation &
composition operations
![Page 24: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/24.jpg)
Approach Overview
Querying & Reasoning
Distributional Relational Networks
Large-scale unstructured data
Commonsense knowledge
Structured/logic Knowledge Bases
Core semantic approximation &
composition operations
![Page 25: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/25.jpg)
DRN (Ƭ-Space)
instancepredicate
e
p
r
KB
DRN
![Page 26: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/26.jpg)
DRN (Ƭ-Space) Space is segmented by the instances
![Page 27: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/27.jpg)
Core Operations
DRN
Query
![Page 28: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/28.jpg)
Core Operations
DRN Core Operations
Query
Query & Reasoning Algorithms
![Page 29: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/29.jpg)
Core Principles
Minimize the impact of Ambiguity, Vagueness, Synonymy.
Address the simplest matchings first (heuristics).
Semantic Relatedness as a primitive operation.
Distributional semantics as commonsense knowledge.
![Page 30: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/30.jpg)
Specificity Ordering
Use specificity (grammatical class + (IDF)) as a heuristc measure
Low High
Ambiguity, vagueness, synonimy
![Page 31: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/31.jpg)
Core Operations
Instance search - Proper nouns- String similarity + node cardinality
Class (unary predicate) search- Nouns, adjectives and adverbs- String similarity + Distributional semantic relatedness
Property (binary predicate) search- Nouns, adjectives, verbs and adverbs- Distributional semantic relatedness
![Page 32: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/32.jpg)
Core Operations
Navigation
Extensional expansion- Expands the instances associated with a class.
Disambiguation dialog (instance, predicate)
![Page 33: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/33.jpg)
Use Case: Question Answering over Linked Data
![Page 34: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/34.jpg)
Question Analysis
Transform natural language queries into triple patterns
“Who is the daughter of Bill Clinton married to?”
Bill Clinton daughter married to
(INSTANCE) (PREDICATE) (PREDICATE) Query Features
PODS
![Page 35: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/35.jpg)
Query Plan
Map query features into a query plan.
A query plan contains a sequence of core operations.
(INSTANCE) (PREDICATE) (PREDICATE) Query Features
Query Plan
(1) INSTANCE SEARCH (Bill Clinton) (2) p1 <- SEARCH PREDICATE (Bill Clintion, daughter)
(3) e1 <- NAVIGATE (Bill Clintion, p1)
(4) p2 <- SEARCH PREDICATE (e1, married to)
(5) e2 <- NAVIGATE (e1, p2)
![Page 36: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/36.jpg)
Instance Search
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
Instance Search
![Page 37: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/37.jpg)
Predicate Search
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
:Chelsea_Clinton
:child
:Baptists:religion
:Yale_Law_School
:almaMater
...(PIVOT ENTITY)
(ASSOCIATED TRIPLES)
![Page 38: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/38.jpg)
Predicate Search
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
:Chelsea_Clinton
:child
:Baptists:religion
:Yale_Law_School
:almaMater
...
sem_rel(daughter,child)=0.054
sem_rel(daughter,child)=0.004
sem_rel(daughter,alma mater)=0.001
Which properties are semantically related to ‘daughter’?
![Page 39: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/39.jpg)
Navigate
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
:Chelsea_Clinton
:child
![Page 40: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/40.jpg)
Navigate
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
:Chelsea_Clinton
:child
(PIVOT ENTITY)
![Page 41: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/41.jpg)
Predicate Search
Bill Clinton daughter married to
:Bill_Clinton
Query:
Linked Data:
:Chelsea_Clinton
:child
(PIVOT ENTITY)
:Mark_Mezvinsky:spouse
![Page 42: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/42.jpg)
Results
![Page 43: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/43.jpg)
From Exact to Approximate: User Feedback
![Page 44: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/44.jpg)
Distributional Semantics @ NUI Galway
• Random Projection is one solution:
• Estimate a VSM by a random projection matrix consists a set of randomly created vectors.• i.e. based on the Johnson-Lindenstrauss lemma• verified by the results reported in (Hecht-Nielsen, 1994)
* The above figure is copyrighted by Alex Clemmer (http://nullspace.io/)
Vector Spaces in Information Extraction, B. Qasemizadeh
![Page 45: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/45.jpg)
Distributional Semantics @ NUI Galway
• Application: Extraction of Technology Terms (term classification)
• Random Projection• Data Size: 10,000 publicationd
• Contexts: words and their position in the neighbourhood of terms
• Original Dimension: • approximately 5 million
• Reducing the dimension to 2000 using Random Projection
Behrang’s research evolves around classification and finding the optimal contexts in random vector spaces for the extraction of
technology terms and their relation. If you are interested please email him at [email protected]
Vector Spaces in Information Extraction, B. Qasemizadeh
![Page 46: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/46.jpg)
Conclusions
Distributional Relational Networks (DRNs) provide:
- A Knowledge Representation model.- Built-in semantic approximation.- Associated large-scale commonsense knowledge base (KB)
with no manual construction effort.
The compositional-distributional model supports a schema-agnostic QA system over a database:
- Better recall and query coverage compared to baseline systems.
![Page 47: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/47.jpg)
References
QueryingFreitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs: A Distributional-Compositional Semantics Approach, IUI 2014.
ReasoningPereira da Silva & Freitas, Towards An Approximative Ontology-Agnostic Approach for Logic Programs, In FoIKS 2014.
KRNovacek et al. Getting the Meaning Right, ISWC 2011.
Freitas et al. Distributional Relational Networks, AAAI Fall Symposium 2013.
![Page 48: A Compositional-distributional Semantic Model over Structured Data](https://reader036.vdocuments.us/reader036/viewer/2022062405/554e9af5b4c90573338b53a5/html5/thumbnails/48.jpg)
http://andrefreitas.org