![Page 1: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/1.jpg)
A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web
by Livia Predoiu, Heiner Stuckenschmidt
Institute of Computer Science, University of Mannheim, Germany
presented by Thomas Packer
![Page 2: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/2.jpg)
Sources of Uncertainty in Automated Processes in the Semantic Web
• Uncertain Document Classification
• Uncertain Ontology Learning from Text
• Uncertain Ontology Matching
• Leads to uncertain, unreliable or contradictory information.
• Traditional logic cannot handle inconsistency.
![Page 3: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/3.jpg)
Motivational Example
• Domain: Bibliography
• Use Case: Find publications with keyword “AI”.
• Complication: Second ontology does not include the concept of “topic” or “keywords”.
• Solution: Use machine learning to categorize documents from the second collection.
![Page 4: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/4.jpg)
Motivational Example (Continued)
• Domain: Bibliography
• Use Case: Find publications with keyword “AI”.
• Complication: The “Report” concept in one ontology roughly corresponds to “Publication” in the other.
• Solution: Map concepts between ontologies.
![Page 5: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/5.jpg)
Approach
• Start with a more standard approach, Description Logic Programs.
• Extend them with probabilistic information.
• Call the result Bayesian Description Logic Programs (BDLPs).
• BDLPs are a subset of Bayesian Logic Programs.
• They also integrate logic programming and description logic knowledge bases.
![Page 6: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/6.jpg)
BDLP Pedigree
[Diagram: Bayesian Description Logic Programs (BDLPs) derive from Description Logic Programs (DLPs) and Bayesian Logic Programs (BLPs). DLPs derive from Description Logic (DL) and Logic Programs (LPs); BLPs derive from Logic Programs (LPs) and Bayesian Networks (BNs).]
![Page 7: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/7.jpg)
Uses of Bayesian Description Logic Programs
• Framework for information retrieval and information integration across heterogeneous ontologies.
![Page 8: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/8.jpg)
Description Logic Programs (Background)
• Intersection of:
– Description Logics (knowledge representation)
– Logic Programming (automated theorem proving)
• DLP program contains:
– Set of rules
– Set of facts
• Rules have the form:
– Conjunction of predicates implies some other predicate.
– H (the head) and the B’s (the body) are atomic formulae.
– Predicate arguments are called terms.
– Terms are constants or variables.
– A ground atom’s terms are all constants.
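The rule vocabulary above can be made concrete with a minimal sketch. The class names, the `substitute` helper, and the convention that variables start with an uppercase letter (borrowed from Prolog) are assumptions for illustration; the example rule reuses the Report/Publication concepts from the motivational example.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Atom:
    predicate: str
    terms: Tuple[str, ...]

    def is_ground(self) -> bool:
        # Assumed convention: a term starting with an uppercase letter is a variable.
        return not any(t[:1].isupper() for t in self.terms)


@dataclass(frozen=True)
class Rule:
    head: Atom               # H
    body: Tuple[Atom, ...]   # B1, ..., Bn (read as a conjunction)


def substitute(atom: Atom, binding: Dict[str, str]) -> Atom:
    """Replace variables in an atom by the constants given in `binding`."""
    return Atom(atom.predicate, tuple(binding.get(t, t) for t in atom.terms))


# Hypothetical rule: report(X) implies publication(X).
rule = Rule(head=Atom("publication", ("X",)), body=(Atom("report", ("X",)),))
# Hypothetical fact: a ground atom.
fact = Atom("report", ("doc1",))
# Binding X to doc1 turns the rule head into a ground atom.
derived = substitute(rule.head, {"X": "doc1"})
```

Here `rule.head` is not ground because it contains the variable `X`, while `fact` and `derived` are ground.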
![Page 9: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/9.jpg)
Description Logic Programs (Background)
![Page 10: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/10.jpg)
Description Logic Programs (Background)
• Restricted expressivity
• Many existing DL ontologies fit DLP restrictions.
• Reasoning in DLP is decidable.
• Reasoning has much lower complexity than DL reasoning in general (in theory and in practice).
![Page 11: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/11.jpg)
Bayesian Description Logic Programs
• BDLP program contains:
– Set of rules
– Set of facts
• Rules have the form:
– Conjunction of predicates implies some other predicate.
– “|” replaces the implication symbol, to indicate conditional probability.
– Each rule has a probability distribution specifying the probability of each state of the head atom given the states of the body atoms.
– Each ground atom corresponds to a BN node.
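One way to picture how a probabilistic rule becomes BN nodes is the sketch below. It is an illustration only, not the authors' implementation: the rule, the constants, and the probabilities are invented, atoms are assumed Boolean, and the `ground` helper handles only unary predicates that share one variable.

```python
from itertools import product

# Hypothetical rule: publication(X) | report(X)
rule = {
    "head": ("publication", "X"),
    "body": [("report", "X")],
    # P(head = True | states of the body atoms); illustrative numbers only.
    "cpt": {(True,): 0.9, (False,): 0.0},
}
constants = ["doc1", "doc2"]  # hypothetical individuals


def ground(rule, constants):
    """Create one BN node per grounding of the rule head."""
    # The table must cover every combination of body-atom states.
    expected = set(product([True, False], repeat=len(rule["body"])))
    assert set(rule["cpt"]) == expected, "incomplete conditional table"
    nodes = {}
    for c in constants:
        head = f"{rule['head'][0]}({c})"
        parents = [f"{pred}({c})" for pred, _ in rule["body"]]
        nodes[head] = {"parents": parents, "cpt": rule["cpt"]}
    return nodes


bn = ground(rule, constants)
```

Each ground head atom, such as `publication(doc1)`, becomes a node whose parents are the ground body atoms and whose conditional table is copied from the rule.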
![Page 12: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/12.jpg)
Example BDLP
![Page 13: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/13.jpg)
Example Bayesian Network
• Blue: Ontology 2
• Cyan: Learned from Ontology 2
• Black & White: Ontology 1
• Red arcs: Mappings
![Page 14: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/14.jpg)
Where do Probabilities Come From?
• Deterministic ontologies
– true = 1.0
– false = 0.0
• Probabilistic tools
– Naïve Bayes document categorization
– Probabilistic ontology mapping
• Subjective estimates
– Some argue that people are inconsistent in their judgment of probabilities.
– Using subjective probabilities is still more accurate than forcing people to use Boolean judgments.
![Page 15: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/15.jpg)
Example Query
• Query for publications about AI.
• Non-ground query.
• Two valid groundings.
• Query BN for probabilities (IR with ranking).
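The query step could be sketched as follows: each valid grounding of the query corresponds to a BN node, and results are ranked by that node's marginal probability. Brute-force enumeration is used only because the network is tiny. The predicate names, the two documents, and every probability are invented for illustration and do not come from the paper.

```python
from itertools import product


def marginal(bn, target):
    """P(target = True), computed by enumerating all joint states of the BN."""
    names = list(bn)
    total = 0.0
    for states in product([True, False], repeat=len(names)):
        world = dict(zip(names, states))
        p = 1.0
        for name, node in bn.items():
            key = tuple(world[parent] for parent in node["parents"])
            p_true = node["cpt"][key]
            p *= p_true if world[name] else 1.0 - p_true
        if world[target]:
            total += p
    return total


def make_doc(doc, p_ai):
    """Hypothetical three-node fragment for one document."""
    return {
        f"report({doc})": {"parents": [], "cpt": {(): 1.0}},      # deterministic fact
        f"about_ai({doc})": {"parents": [], "cpt": {(): p_ai}},   # e.g. classifier output
        f"ai_publication({doc})": {
            "parents": [f"report({doc})", f"about_ai({doc})"],
            "cpt": {(True, True): 0.9, (True, False): 0.0,
                    (False, True): 0.0, (False, False): 0.0},
        },
    }


bn = {**make_doc("doc1", 0.8), **make_doc("doc2", 0.3)}

# Non-ground query ai_publication(X): evaluate both groundings and rank them.
ranking = sorted(
    ((doc, marginal(bn, f"ai_publication({doc})")) for doc in ["doc1", "doc2"]),
    key=lambda pair: pair[1],
    reverse=True,
)
```

With these made-up numbers `doc1` ranks above `doc2`. Enumeration is exponential in the number of nodes, so a real system would need a proper inference algorithm.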
![Page 16: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/16.jpg)
Conclusion
• Strengths:
– Actually explains how Bayesian Networks relate to predicates.
– Handles integration (which others do not).
– Handles IR.
• Weaknesses:
– DLPs don’t allow for negation or equivalence.
– No measured evaluation.
– Size of model and therefore BN can be exponential in size of KB.
– Intractable exact inference in BNs with cycles.
• Future work:
– Learn BLP programs from data.
– Prune BN to portion relevant to query.
– Approximate probabilistic inference.
– Parallel/distributed programming.
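One simple reading of "prune BN to portion relevant to query" is sketched below. It assumes no evidence has been observed, in which case the marginal of a query node depends only on the node and its ancestors. The slides do not say which pruning method the authors intend, and the node names are hypothetical.

```python
def relevant_nodes(parents, query):
    """Return the query node together with all of its ancestors."""
    seen = set()
    stack = [query]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, []))
    return seen


# Hypothetical parent lists for a small ground network.
parents = {
    "ai_publication(doc1)": ["report(doc1)", "about_ai(doc1)"],
    "ai_publication(doc2)": ["report(doc2)", "about_ai(doc2)"],
    "report(doc1)": [],
    "about_ai(doc1)": [],
    "report(doc2)": [],
    "about_ai(doc2)": [],
}

kept = relevant_nodes(parents, "ai_publication(doc1)")
```

Here `kept` contains only the three `doc1` nodes, so inference for that query can ignore everything about `doc2`.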
![Page 17: A Probabilistic Framework for Information Integration and Retrieval on the Semantic Web](https://reader033.vdocuments.us/reader033/viewer/2022042703/56814851550346895db563c7/html5/thumbnails/17.jpg)
Questions