Anthony Coutant, Philippe Leray, Hoel Le Capitaine
DUKe (Data, User, Knowledge) Team, LINA
26th June, 2014
Learning Probabilistic Relational Models using Non-Negative Matrix Factorization
7ème Journées Francophones sur les Réseaux Bayésiens et les Modèles Graphiques Probabilistes
Context
• Probabilistic Relational Models (PRM): attribute uncertainty in relational datasets
• Relational datasets: attributes + links
• PRMs with Reference Uncertainty (PRM-RU) model link uncertainty
• Partitioning individuals is necessary in PRM-RU
Problem & Proposal
• PRM-RU partitions individuals based on attributes only
• We propose to cluster the relationship information instead
• We show that:
– Attributes partitioning does not explain all relationships
– Relational partitioning can explain attributes-oriented relationships
Flat datasets – Bayesian Networks
• Individuals are assumed to be i.i.d.
[Figure: Bayesian network Grade 1 → Ranking ← Grade 2]

P(G1)   A      B
        0.25   0.75

P(G2)   A      B
        0.25   0.75

P(R | G1, G2)   A,A   A,B   B,A   B,B
1st division    0.8   0.5   0.5   0.2
2nd division    0.2   0.5   0.5   0.8

Dataset
G1   G2   R
A    B    1st
B    A    1st
B    B    2nd
B    B    2nd
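As a worked illustration of how the tables on this slide combine, the sketch below marginalizes the two grades out to obtain P(R). The probabilities are the ones shown on the slide; the variable and function names are illustrative assumptions only.

```python
from itertools import product

# Priors and conditional table copied from the slide
# (decimal commas written as decimal points).
p_g1 = {"A": 0.25, "B": 0.75}
p_g2 = {"A": 0.25, "B": 0.75}
p_r_given_g = {
    ("A", "A"): {"1st": 0.8, "2nd": 0.2},
    ("A", "B"): {"1st": 0.5, "2nd": 0.5},
    ("B", "A"): {"1st": 0.5, "2nd": 0.5},
    ("B", "B"): {"1st": 0.2, "2nd": 0.8},
}


def ranking_marginal():
    """Return P(R) by summing P(G1) P(G2) P(R | G1, G2) over both grades."""
    marginal = {"1st": 0.0, "2nd": 0.0}
    for g1, g2 in product(p_g1, p_g2):
        for ranking, p in p_r_given_g[(g1, g2)].items():
            marginal[ranking] += p_g1[g1] * p_g2[g2] * p
    return marginal


p_r = ranking_marginal()
print(p_r)
```

With these numbers the marginal comes out at 0.35 for the 1st division and 0.65 for the 2nd division.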
Relational datasets – Relational schema
[Figure: relational schema and one instance of it]
Schema
• Student: Intelligence, Ranking
• Registration: Grade, Satisfaction
• Course: Difficulty, Evaluation
• Each Registration references 1 Student and 1 Course; each Student and each Course has 1,n Registrations
Instance
• Student Jane Doe: Intelligence = high, Ranking = 1st division
• Registration #4563: Grade = A, Satisfaction = high
• Course Phil101: Difficulty = high, Evaluation = high
Probabilistic Relational Models (PRM)
[Figure: the schema (Student, Registration, Course); the PRM dependency structure over Difficulty, Evaluation, Grade, Satisfaction, Intelligence and Ranking, with two MEAN aggregators; and an instance whose attribute values are all unknown (???)]
Instance: Courses Math and Phil; Registrations #4563, #5621 and #6251; Students Jane Doe and John Smith

P(R | MEAN(G))   A     B
1st division     0.8   0.2
2nd division     0.2   0.8
Probabilistic Relational Models (PRM), continued
[Figure: the PRM applied to the instance yields a GBN (Ground Bayesian Network)]
GBN nodes
• Courses: Math.Diff, Math.Eval, Phil.Diff, Phil.Eval
• Registrations: #4563.Grade, #5621.Grade, #6251.Grade, #4563.Satis, #5621.Satis, #6251.Satis
• Students: JD.Int, JS.Int, JD.Rank, JS.Rank
• MEAN aggregator nodes feed the aggregated dependencies
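A minimal sketch of how a PRM is unrolled into a ground Bayesian network follows. The transcript keeps the node names but not the arrows or the links between individuals, so both the class-level dependencies (Grade depends on Difficulty and Intelligence, Satisfaction on Grade, Ranking on MEAN of Grade, Evaluation on MEAN of Satisfaction) and the registration links below are assumptions made for illustration.

```python
# Hypothetical skeleton: which student and course each registration links.
registrations = {
    "#4563": {"student": "JD", "course": "Phil"},
    "#5621": {"student": "JD", "course": "Math"},
    "#6251": {"student": "JS", "course": "Math"},
}


def ground_network(registrations):
    """Unroll the assumed class-level dependencies into ground parent lists."""
    parents = {}
    for reg, link in registrations.items():
        student, course = link["student"], link["course"]
        # Attribute-level dependencies inside and across linked objects.
        parents[f"{reg}.Grade"] = [f"{course}.Diff", f"{student}.Int"]
        parents[f"{reg}.Satis"] = [f"{reg}.Grade"]
        # Aggregated dependencies: MEAN is taken over these parent lists.
        parents.setdefault(f"{student}.Rank", []).append(f"{reg}.Grade")
        parents.setdefault(f"{course}.Eval", []).append(f"{reg}.Satis")
        # Root attributes have no parents.
        parents.setdefault(f"{course}.Diff", [])
        parents.setdefault(f"{student}.Int", [])
    return parents


gbn = ground_network(registrations)
for node, node_parents in sorted(gbn.items()):
    print(node, "<-", node_parents)
```

The size of the ground network grows with the instance, while the PRM itself (one rule per attribute) stays fixed.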
Uncertainty in Relational datasets
[Figure: two copies of the Student / Registration / Course instance (Jane Doe, #4563, Phil101)]
• Attributes uncertainty (PRM): attribute values are unknown (???), links are known
• Attributes and link uncertainty (PRM extensions): attribute values and links are both unknown (?)
PRM with reference uncertainty
• Reference uncertainty: P(r.Course = ci, r.Student = sj | r.exists = true)
• A random variable for each individual id? Not generalizable
• Solution: partitioning
[Figure: Course (Difficulty, Evaluation), Registration with reference slots Course and Student, Student (Intelligence, Ranking); open questions: P(Student | Course.Difficulty)? P(Course)?]
PRM with reference uncertainty (continued)
• P(Student | ClusterStudent) follows a uniform law
[Figure: the reference slots Course and Student of Registration each depend on a cluster variable, ClusterCourse and ClusterStudent]

P(CStudent | S.Intelligence)   low   high
C1                             0     1
C2                             1     0

P(Student | CStudent)   C1   C2
s1                      0    1
s2                      1    0
PRM with reference uncertainty (continued)
• P(Student | ClusterStudent) follows a uniform law
[Figure: partition function assigning students to clusters from Intelligence (high → C1, low → C2); student population statistics: 50% / 50%]
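The attribute-based partition function and the uniform law P(Student | ClusterStudent) can be sketched as below. The four students, their attribute values and the function names are hypothetical; only the two ideas (one cluster per attribute value, uniform choice inside a cluster) come from the slides.

```python
from collections import defaultdict

# Hypothetical students described by a single attribute, Intelligence.
students = {"s1": "low", "s2": "high", "s3": "high", "s4": "low"}


def attribute_partition(students):
    """Partition function: one cluster per attribute value."""
    clusters = defaultdict(list)
    for name, intelligence in students.items():
        clusters[intelligence].append(name)
    return dict(clusters)


def p_student_given_cluster(clusters):
    """Uniform law inside each cluster: P(s | C) = 1 / |C| for s in C."""
    return {
        cluster: {s: 1.0 / len(members) for s in members}
        for cluster, members in clusters.items()
    }


clusters = attribute_partition(students)
cpd = p_student_given_cluster(clusters)
print(clusters)
print(cpd)
```

Because the model only stores a distribution over clusters plus a uniform law inside each cluster, it does not need one random variable value per individual id.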
Attributes-oriented Partition Functions in PRM-RU
• PRM-RU: clustering from attributes
• Assumption: attributes explain the relationship
• Not generalizable: relationship information is not used for partitioning
[Figure: two Course–Student bipartite graphs with nodes coloured by attribute value]
• First graph: P(Green | Red) = 1, P(Purple | Blue) = 1. Do attributes explain the relationship? YES
• Is that so? Second graph: P(Green | Red) = 0.5, P(Purple | Red) = 0.5. NO
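The two cases on this slide can be reproduced by estimating P(student colour | course colour) from edge lists. The edge lists below are hypothetical and chosen only to match the probabilities shown; the function name is likewise illustrative.

```python
from collections import Counter


def p_student_colour_given_course_colour(edges):
    """Estimate P(student colour | course colour) from observed edges.

    Each edge is a (course colour, student colour) pair; the result is keyed
    by (student colour, course colour) to read like P(Green | Red).
    """
    pair_counts = Counter(edges)
    course_counts = Counter(course for course, _ in edges)
    return {
        (student, course): n / course_counts[course]
        for (course, student), n in pair_counts.items()
    }


# Hypothetical edge lists for the two cases.
explained = [("Red", "Green")] * 2 + [("Blue", "Purple")] * 2
not_explained = [("Red", "Green")] * 2 + [("Red", "Purple")] * 2

p_yes = p_student_colour_given_course_colour(explained)
p_no = p_student_colour_given_course_colour(not_explained)
print(p_yes)
print(p_no)
```

In the second case the course attribute carries no information about which student is linked, which is the situation where attributes partitioning fails.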
Relationship-oriented Partitioning
• Objective: find a partitioning that maximizes intra-partition edges
[Figure: the same Course–Student graph partitioned in two ways]
• Relational partitions p1 and p2: P(Student.p1 | Course.p1) = 1, P(Student.p2 | Course.p2) = 1
• Attribute colours: P(Green | Red) = 0.5, P(Purple | Red) = 0.5
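The objective can be made concrete by counting, for a candidate partitioning, the edges whose two endpoints fall in the same partition. This is only a sketch of the scoring idea with a hypothetical four-edge instance; it is not the optimization procedure used by the authors.

```python
def intra_partition_edges(edges, course_part, student_part):
    """Count edges whose course and student lie in the same partition."""
    return sum(1 for c, s in edges if course_part[c] == student_part[s])


# Hypothetical instance: the links form two groups, c1 with s1 and s2,
# and c2 with s3 and s4.
edges = [("c1", "s1"), ("c1", "s2"), ("c2", "s3"), ("c2", "s4")]

# A partitioning that follows the links.
relational = (
    {"c1": "p1", "c2": "p2"},
    {"s1": "p1", "s2": "p1", "s3": "p2", "s4": "p2"},
)
# A partitioning that cuts across the links.
mixed = (
    {"c1": "p1", "c2": "p2"},
    {"s1": "p1", "s2": "p2", "s3": "p1", "s4": "p2"},
)

score_relational = intra_partition_edges(edges, *relational)
score_mixed = intra_partition_edges(edges, *mixed)
print(score_relational, score_mixed)
```

A partitioning with a higher count makes P(Student partition | Course partition) closer to deterministic, as in the p1 / p2 example of the slide.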
Experiments – Protocol – Dataset generation
Schema
• Entity 1: Att 1, …, Att n
• Entity 2: Att 1, …, Att n
• R: each R references 1 Entity 1 and 1 Entity 2; each entity individual has 1,n R
Instance
[Figure: two generated Entity 1 / R / Entity 2 instances]
• Attributes partitioning favorable case
• Relationship partitioning favorable case
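The slides do not give the generation procedure, so the following is only a simplified sketch of how the two cases could be produced. It assumes n individuals per entity, k hidden clusters, one attribute per individual, and links that only join individuals of the same cluster; all names and parameters are hypothetical.

```python
import random


def generate_instance(n, k, attributes_explain, seed=0):
    """Generate two entity sets of size n and one link per Entity 1 individual.

    Every individual has a hidden cluster in range(k). Its single attribute
    equals the cluster when attributes_explain is True (attributes
    partitioning favorable case) and is drawn at random otherwise
    (relationship partitioning favorable case). Requires n >= k.
    """
    rng = random.Random(seed)

    def make_entities():
        entities = []
        for i in range(n):
            cluster = i % k  # every cluster is non-empty when n >= k
            attribute = cluster if attributes_explain else rng.randrange(k)
            entities.append({"id": i, "cluster": cluster, "att": attribute})
        return entities

    e1, e2 = make_entities(), make_entities()
    by_cluster = {c: [e for e in e2 if e["cluster"] == c] for c in range(k)}
    # Each link joins an Entity 1 individual to a same-cluster Entity 2 one.
    links = [(a["id"], rng.choice(by_cluster[a["cluster"]])["id"]) for a in e1]
    return e1, e2, links


e1, e2, links = generate_instance(n=25, k=4, attributes_explain=True)
print(links[:5])
```

Calling the function with `attributes_explain=False` keeps the same link structure but makes the attribute uninformative about it.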
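Experiments – Protocol – Learning
[Figure: PRM-RU structure used for learning: Entity 1 and Entity 2 with attributes Att 1 … Att n, and Relation with reference slots E1 and E2 and cluster variables CE1 and CE2]
• Parameter learning on a fixed, predefined structure
• 2 PRMs compared:
– Either with attributes partitioning
– Or with relational partitioning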
Experiments – Protocol – Evaluation
• For each generated dataset D
– Split D into 10 subsets {D1, …, D10}
– Perform 10-fold CV, each fold using one Di for testing and the others for training
  • For the PRM with attributes partitioning: store the 10 log-likelihoods PattsLL[i]
  • For the PRM with relationship partitioning: store the 10 log-likelihoods PrelLL[i]
– Compute the mean and sd of PattsLL[i] and PrelLL[i]
– Evaluate the significance of relationship partitioning over attributes partitioning
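The protocol above can be sketched with the standard library as follows. The slides do not name the significance test, so the paired t statistic over the folds is an assumption, as are the function names; the log-likelihood lists would come from the two learned PRMs.

```python
from math import sqrt
from statistics import mean, stdev


def k_fold_splits(items, k=10):
    """Yield (train, test) pairs; fold i tests on every k-th item from i."""
    for i in range(k):
        test = items[i::k]
        train = [x for j, x in enumerate(items) if j % k != i]
        yield train, test


def summarize(patts_ll, prel_ll):
    """Mean and sd of each score list, plus a paired t statistic.

    The statistic is computed on the per-fold differences PrelLL[i] -
    PattsLL[i]; it is undefined when all differences are identical.
    """
    diffs = [rel - atts for atts, rel in zip(patts_ll, prel_ll)]
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))
    return {
        "atts": (mean(patts_ll), stdev(patts_ll)),
        "rel": (mean(prel_ll), stdev(prel_ll)),
        "paired_t": t_stat,
    }


folds = list(k_fold_splits(list(range(100)), k=10))
print(len(folds), len(folds[0][0]), len(folds[0][1]))
```

The t statistic would then be compared with the critical value of a Student distribution with 9 degrees of freedom for 10 folds.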
Experiments – Results
[Figure: two result grids, one coloured cell per (n, k), with n ∈ {25, 50, 100, 200} and k ∈ {2, 4, 16}]
• First grid: random clusters (independent from attributes)
• Second grid: Attributes => Cluster (fully dependent on attributes)
• Legend: Green: Relational > Attributes partitioning; Red: Attributes > Relational partitioning; Orange: partitionings not significantly comparable
Experiments – About the NMF choice for partitioning
• NMF
– Find low-dimension factor matrices whose product approximates the original matrix
– A relationship between two entities is an adjacency matrix
• Motivation for NMF usage
– (Restrictively) captures latent information from both rows and columns: co-clustering
– Several extensions dedicated to more accurate co-clustering (NMTF)
– Extensions for Laplacian regularization
  • Allow capturing both attributes and relationship information for clustering
– Extensions for tensor factorization
  • Allow modeling n-ary relationships, n >= 2
– NMF = good starting choice for the long-term needs?
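A minimal sketch of NMF-based co-clustering on an adjacency matrix follows. The slides do not say which NMF algorithm was used; this example uses the classic multiplicative update rules for the squared-error objective, in pure Python, on a hypothetical 4 x 4 matrix. The small `eps` added to the denominators is an assumption meant to avoid divisions by zero.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Approximate v (n x m, non-negative) by w (n x rank) times h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def co_clusters(w, h):
    """Assign each row and each column to its strongest latent factor."""
    rows = [max(range(len(r)), key=r.__getitem__) for r in w]
    cols = [max(range(len(h)), key=lambda f: h[f][j])
            for j in range(len(h[0]))]
    return rows, cols


# Hypothetical adjacency matrix (courses x students) with two link blocks.
adjacency = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
w, h = nmf(adjacency, rank=2)
row_clusters, col_clusters = co_clusters(w, h)
print(row_clusters, col_clusters)
```

Rows and columns are clustered at the same time because both factor matrices share the same latent factors, which is the co-clustering property mentioned above.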
Experiments – About the NMF choice for partitioning
• But
– Performance problems in the experiments
– Very sensitive to initialization: crashes whenever it reaches a singular state
– Moving toward large-scale methods: graph-based relational clustering?
Conclusion
• PRM-RU define a probability structure in relational datasets
• Need for partitioning
• PRM-RU use attributes-oriented partitioning
• We propose to cluster the relationship information instead
• Experiments show that:
– Attributes partitioning does not explain all relationships
– Relational partitioning can explain attributes-oriented relationships
Perspectives
• Experiments on real life datasets
• Towards large scale partitioning methods
• PRM-RU Structure Learning using clustering algorithms
• What about other link uncertainty representations?
Questions?
(anthony.coutant | philippe.leray | hoel.lecapitaine)@univ-nantes.fr