Learning Probabilistic Relational Models
Nir Friedman (Hebrew University), Lise Getoor (Stanford University), Daphne Koller (Stanford University), Avi Pfeffer (Stanford University)
![Page 1: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/1.jpg)
Learning Probabilistic Relational Models
Daphne Koller, Stanford University
Nir Friedman, Hebrew University
Lise Getoor, Stanford University
Avi Pfeffer, Stanford University
![Page 2: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/2.jpg)
Learning from Relational Data
• Data sources
  – relational and object-oriented databases
  – frame-based knowledge bases
  – World Wide Web
• Traditional approaches
  – work well with flat representations
  – fixed-length attribute-value vectors
  – assume IID samples
• Problem:
  – must fix attributes in advance, so can represent only some limited set of structures
  – IID assumption may not hold
![Page 3: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/3.jpg)
Our Approach
• Probabilistic Relational Models (PRMs)
  – rich representation language that models
    • relational dependencies
    • probabilistic dependencies
• Learning PRMs from data stored in relational databases
  – parameter estimation
  – model selection
![Page 4: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/4.jpg)
Outline
• Motivation
• Probabilistic relational models
  – Probabilistic Logic Programming [Poole 1993]; [Ngo & Haddawy 1994]
  – Probabilistic object-oriented knowledge [Koller & Pfeffer 1997; 1998]; [Koller, Levy & Pfeffer 1997]
• Learning PRMs
• Experimental results
• Conclusions
![Page 5: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/5.jpg)
Probabilistic Relational Models
• Combine advantages of predicate logic & BNs:
  – natural domain modeling: objects, properties, relations;
  – generalization over a variety of situations;
  – compact, natural probability models.
• Integrate uncertainty with relational model:
  – properties of domain entities can depend on properties of related entities;
  – uncertainty over relational structure of domain.
![Page 6: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/6.jpg)
Relational Schema
• Describes the types of objects and relations in the database
• Classes and their attributes:
  – Student: Intelligence, Performance
  – Registration: Grade, Satisfaction
  – Course: Difficulty, Rating
  – Professor: Popularity, Teaching-Ability, Stress-Level
• Relationships: Teach, In, Take
![Page 7: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/7.jpg)
Example instance I
• Professor Prof. Gump: Popularity high, Teaching Ability medium, Stress-Level low
• Course Phil142: Difficulty low, Rating high
• Course Phil101: Difficulty low, Rating high
• Three registrations, each shown as Reg #5639: Grade A, Satisfaction 3
• Student John Doe: Intelligence high, Performance average
• Student Jane Doe: Intelligence high, Performance average
![Page 8: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/8.jpg)
What’s Uncertain?
• Relations
• Attribute Values
• Objects (for example, a new Student, Judy Dunn: Intelligence high, Performance high)
[Figure: the example instance (Professor Prof. Gump; Courses Phil142 and Phil101; three registrations; Students John Doe and Jane Doe), annotated with these three sources of uncertainty]
![Page 9: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/9.jpg)
Attribute Uncertainty
• Fixed skeleton
  – set of objects in each class
  – relations between them
• Uncertainty
  – over assignments of values to attributes
[Figure: the skeleton of the example instance with unknown attribute values shown as ???: Professor Prof. Gump (Popularity, Teaching Ability, Stress-Level); Courses Phil142 and Phil101 (Difficulty, Rating); Students John Deer and Jane Doe (Intelligence, Performance); three registrations shown as Reg #5639, two with Grade A and Satisfaction 3 and one with Grade ??? and Satisfaction ???]
![Page 10: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/10.jpg)
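PRM: Dependencies
[Figure: schema diagram of Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating), and Professor (Popularity, Teaching-Ability, Stress-Level), with dependency edges between attributes]

$$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty}, \text{Reg.Taker.Intell})$$

| D, I | A | B | C |
|---|---|---|---|
| h, h | 0.5 | 0.4 | 0.1 |
| h, l | 0.1 | 0.5 | 0.4 |
| l, h | 0.8 | 0.1 | 0.1 |
| l, l | 0.3 | 0.6 | 0.1 |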
![Page 11: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/11.jpg)
PRM: Dependencies (cont.)
[Figure: the example instance (Professor Prof. Gump; Courses Phil142 and Phil101; Students John Doe, Jane Doe, and John Deer, the last with Intelligence low and Performance average; four registrations, two of them with Grade ?). The same Reg.Grade CPT, with rows indexed by D, I and columns A, B, C, is attached to each registration whose grade is unknown.]
![Page 12: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/12.jpg)
PRM: aggregate dependencies
[Figure: schema diagram of Student, Reg, Course, and Professor; Student Jane Doe (Intelligence high, Performance average) with three registrations: Reg #5077 (Grade C, Satisfaction 2), Reg #5054 (Grade C, Satisfaction 1), Reg #5639 (Grade A, Satisfaction 3)]
• Problem: need CPTs of varying sizes
• avg: one CPT conditioned on the aggregated value:

| avg | l | m | h |
|---|---|---|---|
| A | 0.1 | 0.2 | 0.7 |
| B | 0.2 | 0.4 | 0.4 |
| C | 0.6 | 0.3 | 0.1 |
![Page 13: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/13.jpg)
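PRM: aggregate dependencies
[Figure: schema diagram of Student (Intelligence, Performance), Reg (Grade, Satisfaction), Course (Difficulty, Rating), and Professor (Popularity, Teaching-Ability, Stress-Level), with dependency edges labeled avg, avg, and count]
• Aggregates: sum, min, max, avg, mode, count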
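A minimal Python sketch of an aggregate dependency. It assumes letter grades map to points (A=3, B=2, C=1) and that the average is rounded to the nearest grade; the mapping, the function names, and the choice of a student-level child attribute with values l/m/h are illustrative assumptions, not taken from the slides. The table values follow the avg CPT on the slide as best it can be recovered. However many registrations a student has, the aggregate collapses them into one parent value, so a single fixed-size CPT is enough.

```python
from collections import Counter

# Assumed grade scale, used only for this illustration.
GRADE_POINTS = {"A": 3, "B": 2, "C": 1}
POINTS_GRADE = {points: grade for grade, points in GRADE_POINTS.items()}


def avg_grade(grades):
    """Collapse a non-empty list of letter grades into one letter grade."""
    mean = sum(GRADE_POINTS[g] for g in grades) / len(grades)
    return POINTS_GRADE[int(mean + 0.5)]  # round half up to a grade point


def mode_grade(grades):
    """Most frequent grade; another aggregate from the slide's list."""
    return Counter(grades).most_common(1)[0][0]


# CPT of a student-level attribute given avg(Grade); columns l, m, h.
CPT_GIVEN_AVG = {
    "A": {"l": 0.1, "m": 0.2, "h": 0.7},
    "B": {"l": 0.2, "m": 0.4, "h": 0.4},
    "C": {"l": 0.6, "m": 0.3, "h": 0.1},
}

jane_grades = ["C", "C", "A"]  # the three registrations shown for Jane Doe
distribution = CPT_GIVEN_AVG[avg_grade(jane_grades)]
```

The same lookup works unchanged for a student with one registration or with twenty, which is the point of introducing the aggregate.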
![Page 14: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/14.jpg)
PRM: Summary
• A PRM specifies
  – a probabilistic dependency structure S
    • a set of parents for each attribute X.A
  – a set of local probability models θ
• Given a skeleton structure σ, a PRM specifies a probability distribution over instances I:
  – over attribute values of all objects in σ

$$P(I \mid \sigma, S, \theta) = \prod_{X} \prod_{x \in \sigma(X)} \prod_{A \in \mathcal{A}(X)} P(I_{x.a} \mid I_{\text{parents}(x.a)})$$

The products range over classes X, objects x in σ(X), and attributes A of X; $I_{x.a}$ is the value of attribute A in object x.
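The product over classes, objects, and attributes can be sketched in Python on a tiny skeleton. This is an illustration under stated assumptions: the object names, the uniform priors on intelligence and difficulty, and the dictionary layout are all hypothetical; only the Reg.Grade table follows the CPT shown on the dependencies slide.

```python
import math

# Skeleton: the objects of each class and the relations between them.
students = {"jane": {"intelligence": "h"}, "john": {"intelligence": "l"}}
courses = {"phil101": {"difficulty": "l"}}
regs = {
    "r1": {"taker": "jane", "in_course": "phil101", "grade": "A"},
    "r2": {"taker": "john", "in_course": "phil101", "grade": "B"},
}

# Class-level parameters, shared by every object of the class.
P_INTELLIGENCE = {"h": 0.5, "l": 0.5}  # hypothetical prior
P_DIFFICULTY = {"h": 0.5, "l": 0.5}    # hypothetical prior
P_GRADE = {  # keyed by (difficulty, intelligence)
    ("h", "h"): {"A": 0.5, "B": 0.4, "C": 0.1},
    ("h", "l"): {"A": 0.1, "B": 0.5, "C": 0.4},
    ("l", "h"): {"A": 0.8, "B": 0.1, "C": 0.1},
    ("l", "l"): {"A": 0.3, "B": 0.6, "C": 0.1},
}


def log_prob_instance():
    """Sum log P(x.a | parents(x.a)) over all objects and attributes."""
    total = 0.0
    for s in students.values():
        total += math.log(P_INTELLIGENCE[s["intelligence"]])
    for c in courses.values():
        total += math.log(P_DIFFICULTY[c["difficulty"]])
    for r in regs.values():
        # Follow the relations to find the parents of this Reg.Grade.
        difficulty = courses[r["in_course"]]["difficulty"]
        intelligence = students[r["taker"]]["intelligence"]
        total += math.log(P_GRADE[(difficulty, intelligence)][r["grade"]])
    return total
```

Both registrations use the same `P_GRADE` table; only the parent values reached through the relations differ.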
![Page 15: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/15.jpg)
Learning PRMs
[Figure: a relational schema and a database (instance I over Course, Student, and Reg) are the inputs to learning]
• Parameter estimation
• Structure selection
![Page 16: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/16.jpg)
Parameter estimation in PRMs
• Assume known dependency structure S
• Goal: estimate PRM parameters θ
  – entries in local probability models, $\theta_{x.A \mid \text{parents}(x.A)}$
• A parameterization θ is good if it is likely to generate the observed data, instance I.
• MLE Principle: choose θ so as to maximize l

$$l(\theta : I, \sigma, S) = \log P(I \mid \sigma, S, \theta)$$

• Crucial property: decomposition
  – separate terms for different X.A
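The decomposition can be written out as follows, with θ denoting the parameters and σ the skeleton. This is a sketch that assumes all objects of a class share the parameters of X.A; the count notation $N_{X.A}[v, u]$ (the number of objects x with x.A = v and parent values u) is introduced here for illustration.

$$
\begin{aligned}
l(\theta : I, \sigma, S)
  &= \sum_{X} \sum_{A \in \mathcal{A}(X)} \sum_{x \in \sigma(X)} \log P(I_{x.A} \mid I_{\text{parents}(x.A)}) \\
  &= \sum_{X} \sum_{A \in \mathcal{A}(X)} \sum_{v, u} N_{X.A}[v, u] \, \log \theta_{X.A = v \mid u}
\end{aligned}
$$

Each inner sum involves only the parameters of one attribute X.A, so it can be maximized on its own, which gives $\theta^{*}_{X.A = v \mid u} = N_{X.A}[v, u] / \sum_{v'} N_{X.A}[v', u]$.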
![Page 17: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/17.jpg)
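ML parameter estimation

$$P(\text{Reg.Grade} \mid \text{Reg.In.Difficulty}, \text{Reg.Taker.Intell})$$

[Figure: schema fragment with Student (Intelligence, Performance), Reg (Grade, Satisfaction), and Course (Difficulty, Rating)]

$$\theta^{*}_{R.G=A \mid C.D=l,\, S.I=h} = \frac{N(R.G=A,\ C.D=l,\ S.I=h)}{N(C.D=l,\ S.I=h)}$$

• DB technology is well suited to the computation of sufficient statistics: combine the Course, Reg, and Student tables, then count the rows for each combination of C.Diff, R.Grade, and S.Int.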
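A sketch of computing the sufficient statistics with a database query, using Python's standard `sqlite3` module. The table layout, column names, and sample rows are assumptions made for illustration; the slides do not specify a concrete schema or query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id TEXT PRIMARY KEY, intelligence TEXT);
CREATE TABLE course  (id TEXT PRIMARY KEY, difficulty TEXT);
CREATE TABLE reg (student_id TEXT, course_id TEXT, grade TEXT);
INSERT INTO student VALUES ('jane', 'h'), ('judy', 'h'), ('john', 'l');
INSERT INTO course  VALUES ('phil101', 'l'), ('phil142', 'h');
INSERT INTO reg VALUES
  ('jane', 'phil101', 'A'), ('jane', 'phil142', 'B'),
  ('judy', 'phil101', 'A'), ('judy', 'phil142', 'A'),
  ('john', 'phil101', 'B'), ('john', 'phil142', 'C');
""")

# One join plus GROUP BY yields every count N(C.Diff, S.Int, R.Grade).
QUERY = """
SELECT c.difficulty, s.intelligence, r.grade, COUNT(*)
FROM reg r
JOIN student s ON r.student_id = s.id
JOIN course c ON r.course_id = c.id
GROUP BY c.difficulty, s.intelligence, r.grade
"""
counts = {(d, i, g): n for d, i, g, n in conn.execute(QUERY)}

# Totals per parent configuration, N(C.Diff, S.Int).
parent_totals = {}
for (d, i, _grade), n in counts.items():
    parent_totals[(d, i)] = parent_totals.get((d, i), 0) + n

# Maximum likelihood estimate: the ratio of the two counts.
theta = {(g, d, i): n / parent_totals[(d, i)] for (d, i, g), n in counts.items()}
```

Here `theta[("A", "l", "h")]` is the estimate for Grade = A given Difficulty = l and Intelligence = h. Grade values never observed for a parent configuration simply have no entry, which corresponds to an estimate of zero.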
![Page 18: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/18.jpg)
Model Selection
• Idea:
  – define scoring function
  – do local search over legal structures
• Key Components:
  – scoring models
  – legal models
  – searching model space
![Page 19: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/19.jpg)
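Scoring Models
• Bayesian approach:

$$\text{Score}(S : I) = \log P(S \mid I) = \log\big[\underbrace{P(I \mid S)}_{\text{marginal likelihood}} \; \underbrace{P(S)}_{\text{prior}}\big] + \text{const}$$

  where the constant, $-\log P(I)$, does not depend on S.
• closed form solution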
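The slide does not spell out the closed form. A common choice, assumed here, is multinomial local models with Dirichlet priors, for which the marginal likelihood of one attribute is a ratio of Gamma functions. The sketch below uses a symmetric Dirichlet with hyperparameter `alpha` per child value; the function name and the count layout are hypothetical.

```python
from math import lgamma, log


def log_marginal_likelihood(counts, alpha=1.0):
    """Log marginal likelihood of one attribute under Dirichlet priors.

    counts maps each parent configuration to {child_value: N}; every
    child value must be listed, with 0 for values that were not observed.
    """
    total = 0.0
    for child_counts in counts.values():
        prior_mass = alpha * len(child_counts)
        observed = sum(child_counts.values())
        total += lgamma(prior_mass) - lgamma(prior_mass + observed)
        for n in child_counts.values():
            total += lgamma(alpha + n) - lgamma(alpha)
    return total


def structure_score(per_attribute_counts, log_prior=0.0, alpha=1.0):
    """Log prior of the structure plus one term per attribute."""
    return log_prior + sum(
        log_marginal_likelihood(c, alpha) for c in per_attribute_counts
    )
```

Because the score is a sum of per-attribute terms, changing the parents of one attribute only requires recomputing that attribute's term.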
![Page 20: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/20.jpg)
Legal Models
• Dependency ordering over attributes: $y.b \prec x.a$ if X.A depends on Y.B
[Figure: nodes x.a and y.b joined by a dependency edge; example with Paper (Accepted) and Researcher (Reputation) linked by author-of]
• PRM defines a coherent probability model over skeleton σ if $\prec$ is acyclic
![Page 21: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/21.jpg)
Guaranteeing Acyclicity
How do we guarantee that a PRM is acyclic for every skeleton?
• From the PRM dependency structure S, build the dependency graph: an edge from Y.B to X.A if X.A depends directly on Y.B
• Attribute stratification: dependency graph acyclic ⇒ $\prec$ acyclic for any σ
![Page 22: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/22.jpg)
Limitation of stratification
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations]
[Figure: dependency graph over Person.M-chrom, Person.P-chrom, and Person.B-type, marked ???]
![Page 23: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/23.jpg)
Guaranteed acyclic relations
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations]
• Prior knowledge: the Father-of relation is acyclic
  – dependence of Person.A on Person.Father.B cannot induce cycles
![Page 24: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/24.jpg)
Guaranteeing acyclicity
• With guaranteed acyclic relations, some cycles in the dependency graph are guaranteed to be safe.
• We color the edges in the dependency graph:
  – yellow: within single object (X.B → X.A)
  – green: via g.a. (guaranteed acyclic) relation (Y.B → X.A)
  – red: via other relations (Y.B → X.A)
• A cycle is safe if
  – it has a green edge
  – it has no red edge
[Figure: colored dependency graph over Person.M-chrom, Person.P-chrom, and Person.B-type]
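The safety rule for colored cycles can be checked without enumerating cycles. The sketch below is one way to do it, not necessarily the authors' procedure: a red edge lies on a cycle exactly when both endpoints are in the same strongly connected component, and every cycle contains a green edge exactly when the graph with green edges removed is acyclic. The edge format `(parent, child, color)` and the function names are assumptions, and the example edges for the genetics domain are an illustrative guess at its dependencies.

```python
from collections import Counter, defaultdict


def scc_ids(nodes, edges):
    """Kosaraju's algorithm: map each node to an id of its component."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v, _color in edges:
        fwd[u].append(v)
        rev[v].append(u)
    order, seen = [], set()

    def visit(u):  # first pass: record finishing order
        seen.add(u)
        for v in fwd[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in nodes:
        if u not in seen:
            visit(u)
    comp = {}

    def assign(u, cid):  # second pass on the reversed graph
        comp[u] = cid
        for v in rev[u]:
            if v not in comp:
                assign(v, cid)

    for u in reversed(order):
        if u not in comp:
            assign(u, u)
    return comp


def has_cycle(nodes, edges):
    sizes = Counter(scc_ids(nodes, edges).values())
    return any(u == v for u, v, _ in edges) or any(n > 1 for n in sizes.values())


def all_cycles_safe(nodes, edges):
    comp = scc_ids(nodes, edges)
    # A red edge inside one component lies on some cycle.
    if any(c == "red" and comp[u] == comp[v] for u, v, c in edges):
        return False
    # Every cycle needs a green edge: removing green must leave no cycle.
    return not has_cycle(nodes, [e for e in edges if e[2] != "green"])


nodes = ["M-chrom", "P-chrom", "B-type"]
genetics = [
    ("M-chrom", "M-chrom", "green"), ("P-chrom", "M-chrom", "green"),
    ("M-chrom", "P-chrom", "green"), ("P-chrom", "P-chrom", "green"),
    ("M-chrom", "B-type", "yellow"), ("P-chrom", "B-type", "yellow"),
]
```

With these edges every cycle passes only through green edges, so `all_cycles_safe(nodes, genetics)` holds; recoloring any of the chromosome edges red makes the check fail.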
![Page 25: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/25.jpg)
Searching Model Space
Phase 0: consider only dependencies within a class
[Figure: starting from the current structure over Student, Course, and Reg, candidate moves such as Add C.A → C.B and Delete S.I → S.P each produce a new structure, which is scored]
![Page 26: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/26.jpg)
Phased structure search
Phase 1: consider dependencies from “neighboring” classes, via schema relations
[Figure: starting from the current structure over Student, Course, and Reg, candidate moves such as Add C.A → R.B and Add S.I → R.C each produce a new structure, which is scored]
![Page 27: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/27.jpg)
Phased structure search
Phase 2: consider dependencies from “further” classes, via relation chains
[Figure: starting from the current structure over Student, Course, and Reg, candidate moves such as Add C.A → S.P and Add S.I → C.B each produce a new structure, which is scored]
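The phased search can be sketched as greedy hill-climbing in which each phase widens the set of candidate parents. This is a simplified illustration, not the authors' implementation: it measures "distance" between classes by their position on an assumed Student, Reg, Course chain, checks only plain acyclicity of the attribute graph, and uses a toy score that rewards a hypothetical target structure. All names are assumptions.

```python
from itertools import product


def is_acyclic(edges):
    """Kahn-style check on a set of (parent, child) pairs."""
    nodes = {n for e in edges for n in e}
    indegree = {n: 0 for n in nodes}
    for _, child in edges:
        indegree[child] += 1
    frontier = [n for n in nodes if indegree[n] == 0]
    removed = 0
    while frontier:
        n = frontier.pop()
        removed += 1
        for parent, child in edges:
            if parent == n:
                indegree[child] -= 1
                if indegree[child] == 0:
                    frontier.append(child)
    return removed == len(nodes)


def hill_climb(structure, candidates, score):
    """Apply the best add/delete move until no move improves the score."""
    structure = set(structure)
    best = score(structure)
    while True:
        moves = [structure | {e} for e in candidates - structure]
        moves += [structure - {e} for e in structure]
        moves = [m for m in moves if is_acyclic(m)]
        top = max(moves, key=score, default=None)
        if top is None or score(top) <= best:
            return structure
        structure, best = top, score(top)


def phased_search(attrs, distance, score, max_phase=2):
    """Phase k allows parents from classes at schema distance <= k."""
    structure = set()
    for phase in range(max_phase + 1):
        candidates = {
            (p, c) for p, c in product(attrs, attrs)
            if p != c and distance(p.split(".")[0], c.split(".")[0]) <= phase
        }
        structure = hill_climb(structure, candidates, score)
    return structure


CHAIN = {"Student": 0, "Reg": 1, "Course": 2}  # assumed schema layout
attrs = ["Student.Intelligence", "Student.Performance", "Reg.Grade", "Course.Difficulty"]
target = {  # hypothetical structure that the toy score rewards
    ("Student.Intelligence", "Student.Performance"),
    ("Student.Intelligence", "Reg.Grade"),
    ("Course.Difficulty", "Reg.Grade"),
}
learned = phased_search(
    attrs,
    lambda a, b: abs(CHAIN[a] - CHAIN[b]),
    lambda s: len(s & target) - len(s - target),
)
```

In a real learner the toy score would be replaced by the Bayesian score, and the legality test by the colored-cycle check rather than plain acyclicity.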
![Page 28: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/28.jpg)
Experimental Results: Movie Domain (real data)
• 11,000 movies, 7,000 actors
• Classes and attributes:
  – Actor: Gender
  – Appears: Role-type
  – Movie: Process, Decade, Genre
• source: http://www-db.stanford.edu/movies/doc.html
![Page 29: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/29.jpg)
Genetics domain (synthetic data)
[Figure: three Person objects, each with M-chromosome, P-chromosome, and Blood-type, linked by Father and Mother relations, plus a Blood-Test class with attributes Contaminated and Result]
![Page 30: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/30.jpg)
Experimental Results
[Figure: Score (y-axis, from -32000 to -18000) versus Dataset Size (x-axis, from 200 to 800), with two series: Median Likelihood and Gold Standard]
![Page 31: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/31.jpg)
Future directions
• Learning in complex real-world domains
  – drug treatment regimes
  – collaborative filtering
• Missing data
• Learning with structural uncertainty
• Discovery
  – hidden variables
  – causal structure
  – class hierarchy
![Page 32: Learning Probabilistic Relational Models](https://reader031.vdocuments.us/reader031/viewer/2022013122/56814342550346895dafba2e/html5/thumbnails/32.jpg)
Conclusions
• PRMs are a natural extension of BNs:
  – well-founded (probabilistic) semantics
  – compact representation of complex models
• Powerful learning techniques
  – build on BN learning techniques
  – can learn directly from relational data
• Parameter estimation
  – efficient, effective exploitation of DB technology
• Structure identification
  – builds on well-understood theory
  – major issues:
    • guaranteeing coherence
    • search heuristics