statistical learning from relational data
![Page 1: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/1.jpg)
Statistical Learning from Relational Data
Daphne Koller
Stanford University
Joint work with many many people
![Page 2: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/2.jpg)
Relational Data is Everywhere
The web: webpages (& the entities they represent), hyperlinks
Social networks: people, institutions, friendship links
Biological data: genes, proteins, interactions, regulation
Bibliometrics: papers, authors, journals, citations
Corporate databases: customers, products, transactions
![Page 3: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/3.jpg)
Relational Data is Different
Data instances are not independent: topics of linked webpages are correlated
Data instances are not identically distributed: heterogeneous instances (papers, authors)
No IID assumption
This is a good thing
![Page 4: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/4.jpg)
New Learning Tasks
Collective classification of related instances: labeling an entire website of related webpages
Relational clustering: finding coherent clusters in the genome
Link prediction & classification: predicting when two people are likely to be friends
Pattern detection in network of related objects: finding groups (research groups, terrorist groups)
![Page 5: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/5.jpg)
Probabilistic Models
Uncertainty model: a space of “possible worlds” and a probability distribution over this space
Worlds: often defined via a set of state variables (medical diagnosis: diseases, symptoms, findings, …)
Each world: an assignment of values to variables
Number of worlds is exponential in the number of variables: $2^n$ if we have $n$ binary variables
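The exponential growth in the number of worlds can be checked directly. The following is a minimal sketch; the variable names are hypothetical stand-ins for a tiny medical-diagnosis domain and do not come from the slides.

```python
from itertools import product


def possible_worlds(variables):
    """Yield every assignment of True/False to the given binary variables."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


# Hypothetical variable names, chosen only for illustration.
variables = ["flu", "fever", "cough"]
worlds = list(possible_worlds(variables))

# With n binary variables there are 2**n worlds.
print(len(worlds), 2 ** len(variables))
```

A probability distribution over this space assigns one non-negative number to each world, which is why an explicit table quickly becomes infeasible as variables are added.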
![Page 6: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/6.jpg)
Outline
Relational Bayesian networks*
Relational Markov networks
Collective Classification
Relational clustering
* with Avi Pfeffer, Nir Friedman, Lise Getoor
![Page 7: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/7.jpg)
Bayesian Networks
nodes = variables; edges = direct influence
Graph structure encodes independence assumptions: Letter conditionally independent of Intelligence given Grade
[Figure: network over Difficulty, Intelligence, Grade, SAT, Job; bar chart (0% to 100%) of the CPD P(G|D,I) over grades A, B, C for each (difficulty, intelligence) combination: hard,high; hard,low; easy,high; easy,low]
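The independence claim on this slide can be verified numerically on a small example. The sketch below uses only four of the variables (Difficulty, Intelligence, Grade, Letter) and assumes the edges Difficulty → Grade, Intelligence → Grade, Grade → Letter; every probability value is made up for illustration and is not taken from the slide's chart.

```python
from itertools import product

# All numbers below are hypothetical illustration values.
P_D = {"easy": 0.6, "hard": 0.4}                      # P(Difficulty)
P_I = {"low": 0.7, "high": 0.3}                       # P(Intelligence)
P_G = {                                               # CPD P(G | D, I)
    ("easy", "low"): {"A": 0.3, "B": 0.4, "C": 0.3},
    ("easy", "high"): {"A": 0.9, "B": 0.08, "C": 0.02},
    ("hard", "low"): {"A": 0.05, "B": 0.25, "C": 0.7},
    ("hard", "high"): {"A": 0.5, "B": 0.3, "C": 0.2},
}
P_L = {"A": 0.9, "B": 0.6, "C": 0.1}                  # P(Letter = strong | G)


def joint(d, i, g, strong):
    """Joint probability as the product of the local CPDs."""
    p_letter = P_L[g] if strong else 1.0 - P_L[g]
    return P_D[d] * P_I[i] * P_G[(d, i)][g] * p_letter


def prob_letter_given(g, i=None):
    """P(Letter = strong | Grade = g [, Intelligence = i]) by summing the joint."""
    num = den = 0.0
    for d, ii, strong in product(P_D, P_I, [True, False]):
        if i is not None and ii != i:
            continue
        p = joint(d, ii, g, strong)
        den += p
        if strong:
            num += p
    return num / den


# Once Grade is known, also learning Intelligence does not change the answer.
print(prob_letter_given("B"), prob_letter_given("B", "high"))
```

Both printed values equal the CPD entry for Letter given Grade = B, which is what conditional independence of Letter and Intelligence given Grade means.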
![Page 8: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/8.jpg)
Bayesian Networks: Problem
Bayesian nets use a propositional representation
Real world has objects, related to each other
[Figure: the fragment Intelligence, Difficulty → Grade is copied once per student and course: (Intell_Jane, Diffic_CS101, Grade_Jane_CS101), (Intell_George, Diffic_Geo101, Grade_George_Geo101), (Intell_George, Diffic_CS101, Grade_George_CS101); grade values A and C are shown]
These “instances” are not independent
![Page 9: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/9.jpg)
Relational Schema
Specifies types of objects in domain, attributes of each type of object & types of relations between objects
[Figure: schema diagram]
Classes and attributes: Professor (Teaching-Ability), Course (Difficulty), Student (Intelligence), Registration (Grade, Satisfaction)
Relations: Teach, In, Take
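One way to make the schema concrete is as a set of typed records. This is a minimal sketch, not the representation used in the talk: the mapping of the relations Teach, In, and Take onto fields is an assumption read off the diagram, and the professor assigned to the course below is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Professor:
    name: str
    teaching_ability: Optional[str] = None   # attribute


@dataclass
class Course:
    name: str
    instructor: Professor                    # relation Teach (assumed direction)
    difficulty: Optional[str] = None         # attribute


@dataclass
class Student:
    name: str
    intelligence: Optional[str] = None       # attribute


@dataclass
class Registration:
    student: Student                         # relation Take (assumed mapping)
    course: Course                           # relation In (assumed mapping)
    grade: Optional[str] = None              # attribute
    satisfaction: Optional[str] = None       # attribute


# Hypothetical instance: which professor teaches CS101 is not stated in the source.
smith = Professor("Prof. Smith")
cs101 = Course("CS101", instructor=smith)
jane = Student("Jane")
reg = Registration(student=jane, course=cs101, grade="A")
print(reg.course.instructor.name, reg.grade)
```

Attributes left as `None` are the ones a probabilistic model would treat as unobserved random variables.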
![Page 10: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/10.jpg)
St. Nordaf University
[Figure: an example world. Prof. Smith and Prof. Jones (Teaching-ability) are linked by Teaches to the courses CS101 and Geo101 (Difficulty); the students George and Jane (Intelligence) are linked by Registered to registrations (Grade, Satisfaction), which are linked by In-course to the courses]
![Page 11: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/11.jpg)
Relational Bayesian Networks
Universals: Probabilistic patterns hold for all objects in class
Locality: Represent direct probabilistic dependencies
Links define potential interactions
[Figure: template model over Professor (Teaching-Ability), Course (Difficulty), Student (Intelligence), Reg (Grade, Satisfaction); bar chart (0% to 100%) of a CPD over grades A, B, C for hard,high; hard,low; easy,high; easy,low]
[K. & Pfeffer; Poole; Ngo & Haddawy]
![Page 12: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/12.jpg)
RBN Semantics
Ground model:
variables: attributes of all objects
dependencies: determined by relational links & template model
[Figure: ground network for the example world: Prof. Smith and Prof. Jones (Teaching-ability), courses CS101 and Geo101 (Difficulty), students George and Jane (Intelligence), and Grade and Satisfaction nodes for each registration]
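The ground model can be produced mechanically from a template and a set of relational links. The sketch below unrolls a single template dependency, Grade depending on the course's Difficulty and the student's Intelligence, over the three registrations that appear in the talk's example; the function name and the string encoding of ground variables are illustrative choices.

```python
# Registrations from the running example: (student, course).
registrations = [("Jane", "CS101"), ("George", "Geo101"), ("George", "CS101")]


def ground_network(registrations):
    """Map each ground variable to the list of its parent variables."""
    parents = {}
    for student, course in registrations:
        # Attributes of students and courses have no parents in this fragment.
        parents.setdefault(f"Intelligence({student})", [])
        parents.setdefault(f"Difficulty({course})", [])
        # Template: Registration.Grade <- Course.Difficulty, Student.Intelligence
        parents[f"Grade({student},{course})"] = [
            f"Difficulty({course})",
            f"Intelligence({student})",
        ]
    return parents


network = ground_network(registrations)
for variable, pars in sorted(network.items()):
    print(variable, "<-", pars)
```

The two CS101 grades share the parent `Difficulty(CS101)`, and George's two grades share `Intelligence(George)`; these shared parents are what make the instances dependent.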
![Page 13: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/13.jpg)
The Web of Influence
[Figure: ground network for CS101 and Geo101 with grades A and C; bar charts (0% to 100%) over low / high and over easy / hard]
![Page 14: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/14.jpg)
Outline
Relational Bayesian networks*
Relational Markov networks†
Collective Classification
Relational clustering
* with Avi Pfeffer, Nir Friedman, Lise Getoor
† with Ben Taskar, Pieter Abbeel
![Page 15: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/15.jpg)
Why Undirected Models?
Symmetric, non-causal interactions
E.g., web: categories of linked pages are correlated
Cannot introduce directed edges because of cycles
Patterns involving multiple entities
E.g., web: “triangle” patterns
Directed edges not appropriate
“Solution”: Impose arbitrary direction
Not clear how to parameterize CPD for variables involved in multiple interactions
Very difficult within a class-based parameterization
[Taskar, Abbeel, K. 2001]
![Page 16: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/16.jpg)
Markov Networks
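[Figure: Markov network over Laura (L), Noah (N), Mary (M), James (J), Kyle (K)]
$$P(J,K,L,M,N) = \frac{1}{Z}\,\phi(J,K)\,\phi(J,L)\,\phi(K,L)\,\phi(J,M)\,\phi(M,N)\,\phi(L,N)$$
[Figure: template potential, a bar chart (scale 0 to 2) over the value pairs AA, AB, AC, BA, BB, BC, CA, CB, CC]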
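A Markov network defines the joint distribution as a normalized product of potentials. The following sketch assumes the six pairwise cliques (J,K), (J,L), (K,L), (J,M), (M,N), (L,N) and a single shared template potential; the potential's values are hypothetical, chosen so that linked people tend to agree.

```python
from itertools import product

VALUES = "ABC"
PEOPLE = ["J", "K", "L", "M", "N"]  # James, Kyle, Laura, Mary, Noah
# Assumed edge list for the network on this slide.
EDGES = [("J", "K"), ("J", "L"), ("K", "L"), ("J", "M"), ("M", "N"), ("L", "N")]


def phi(a, b):
    """Hypothetical template potential: equal values are twice as compatible."""
    return 2.0 if a == b else 1.0


def unnormalized(assignment):
    """Product of the template potential over all edges."""
    score = 1.0
    for u, v in EDGES:
        score *= phi(assignment[u], assignment[v])
    return score


worlds = [dict(zip(PEOPLE, vals)) for vals in product(VALUES, repeat=len(PEOPLE))]
Z = sum(unnormalized(w) for w in worlds)  # partition function


def prob(assignment):
    return unnormalized(assignment) / Z


all_a = dict.fromkeys(PEOPLE, "A")
print(Z, prob(all_a))
```

Computing `Z` by enumeration is only feasible for toy networks; it already sums over 3^5 = 243 worlds here.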
![Page 17: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/17.jpg)
Relational Markov Networks
Universals: Probabilistic patterns hold for all groups of objects
Locality: Represent local probabilistic dependencies
Sets of links give us possible interactions
[Figure: template over Student1 (Intelligence), Reg1 (Grade), Student2 (Intelligence), Reg2 (Grade), Course (Difficulty), and Study Group; template potential, a bar chart (scale 0 to 2) over the grade pairs AA, AB, AC, BA, BB, BC, CA, CB, CC]
![Page 18: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/18.jpg)
RMN Semantics
[Figure: ground Markov network for the courses CS101 and Geo101 (Difficulty), the students George, Jane, and Jill (Intelligence), their Grade nodes, and the Geo Study Group and CS Study Group]
![Page 19: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/19.jpg)
Outline
Relational Bayesian Networks
Relational Markov Networks
Collective Classification*
Discriminative training
Web page classification
Link prediction
Relational clustering
* with Ben Taskar, Carlos Guestrin, Ming Fai Wong, Pieter Abbeel
![Page 20: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/20.jpg)
Collective Classification
[Figure: pipeline. Training Data (features $\mathbf{x}$, labels $\mathbf{y}^*$) → Learning → Probabilistic Relational Model (model structure over Course, Student, Reg) → Inference on New Data (features $\mathbf{x}'$, labels $\mathbf{y}'$) → Conclusions]
Example:
Train on one year of student intelligence, course difficulty, and grades
Given only grades in following year, predict all students’ intelligence
![Page 21: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/21.jpg)
Learning RMN Parameters
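[Figure: template over Student1 (Intelligence), Reg1 (Grade), Student2 (Intelligence), Reg2 (Grade), Course (Difficulty), and Study Group; template potential over the grade pairs AA, AB, AC, BA, BB, BC, CA, CB, CC]
Parameterize potentials as log-linear model
$$P_{\mathbf{w}}(\mathbf{x}) = \frac{1}{Z} \exp\left(\mathbf{w}^\top \mathbf{f}(\mathbf{x})\right)$$
$$\phi(R_1.G, R_2.G) = \exp\left(w_{AA} f_{AA} + \dots + w_{CC} f_{CC}\right)$$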
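The log-linear parameterization of a pairwise potential can be sketched with indicator features, one per grade pair. The weights below are hypothetical; only the form exp(w · f) follows the slide.

```python
import math
from itertools import product

GRADES = "ABC"
PAIRS = [a + b for a, b in product(GRADES, repeat=2)]  # AA, AB, ..., CC


def features(g1, g2):
    """Indicator features: f_k = 1 iff (g1, g2) is the k-th grade pair."""
    return [1.0 if g1 + g2 == pair else 0.0 for pair in PAIRS]


def potential(w, g1, g2):
    """Log-linear potential exp(w . f(g1, g2))."""
    return math.exp(sum(wk * fk for wk, fk in zip(w, features(g1, g2))))


# Hypothetical weights: study partners who get the same grade are favored.
w = [0.5 if pair[0] == pair[1] else 0.0 for pair in PAIRS]

for g1, g2 in [("A", "A"), ("A", "B")]:
    print(g1 + g2, potential(w, g1, g2))
```

With indicator features, each table entry of the potential is simply the exponential of one weight, so learning the potential reduces to learning the weight vector.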
![Page 22: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/22.jpg)
Max Likelihood Estimation
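Estimation: maximize over $\mathbf{w}$ the joint log-likelihood $\log P_{\mathbf{w}}(\mathbf{x}, \mathbf{y}^*)$ of the training features $\mathbf{x}$ and labels $\mathbf{y}^*$
Classification: $\arg\max_{\mathbf{y}'} \log P_{\mathbf{w}}(\mathbf{y}' \mid \mathbf{x}')$
We don’t care about the joint distribution $P(\mathbf{x}, \mathbf{y})$; the conditional objective is $\log P_{\mathbf{w}}(\mathbf{y}^* \mid \mathbf{x})$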
![Page 23: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/23.jpg)
Web KB
[Figure: Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), connected by the relations Advisor-of, Project-of, and Member]
[Craven et al.]
![Page 24: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/24.jpg)
Web Classification Experiments
WebKB dataset
Four CS department websites
Bag of words on each page
Links between pages
Anchor text for links
Experimental setup
Trained on three universities
Tested on fourth
Repeated for all four combinations
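The experimental setup amounts to leave-one-group-out cross-validation with universities as the groups. A minimal sketch follows; the university identifiers are placeholders, not the dataset's actual site names.

```python
def leave_one_group_out(groups):
    """Yield (training_groups, held_out_group) for every group in turn."""
    unique = sorted(set(groups))
    for held_out in unique:
        train = [g for g in unique if g != held_out]
        yield train, held_out


# Placeholder names for the four department websites.
universities = ["univ1", "univ2", "univ3", "univ4"]
folds = list(leave_one_group_out(universities))
for train, test in folds:
    print("train on", train, "test on", test)
```

Splitting by whole website rather than by page keeps every link inside a single fold, which matters when the classifier uses links between pages.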
![Page 25: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/25.jpg)
Standard Classification
Categories: faculty, course, project, student, other
[Figure: a Page with a Category node and word nodes Word1 ... WordN; example words: professor, department, extract, information, computer, science, machine, learning, …]
![Page 26: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/26.jpg)
Standard Classification
[Figure: a Page with a Category node, word nodes Word1 ... WordN, and link-word nodes ... LinkWordN; example anchor text: “working with Tom Mitchell …”]
Discriminatively trained naïve Markov = Logistic Regression
4-fold CV: trained on 3 universities, tested on 4th
[Figure: bar chart of test set error (axis 0 to 0.18) for Logistic]
![Page 27: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/27.jpg)
Power of Context
Professor? Student? Post-doc?
![Page 28: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/28.jpg)
Collective Classification
[Figure: a From-Page and a To-Page, each with a Category and words Word1 ... WordN, joined by a Link]
Compatibility $\phi$(From, To) over the category pairs CC, CF, CP, CS, FC, FF, FP, FS, PC, PF, PP, PS, SC, SF, SP, SS
![Page 29: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/29.jpg)
Collective Classification
[Figure: a From-Page and a To-Page, each with a Category and words Word1 ... WordN, joined by a Link]
Classify all pages collectively, maximizing the joint label probability
[Figure: bar chart of test set error (axis 0 to 0.18) for Logistic and Links]
[Taskar, Abbeel, K., 2002]
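The difference between classifying pages one at a time and classifying them collectively shows up even on a three-page toy graph. In the sketch below every score is hypothetical, and the joint maximization is done by brute-force enumeration, which stands in for the approximate inference a real system would need.

```python
from itertools import product

CATEGORIES = ["faculty", "student"]

# Hypothetical per-page scores (log-potentials) from a local word-based classifier.
local_score = {
    "p0": {"faculty": 2.0, "student": 0.0},
    "p1": {"faculty": 0.4, "student": 0.5},   # locally ambiguous page
    "p2": {"faculty": 0.0, "student": 2.0},
}
links = [("p0", "p1")]                        # p0 links to p1
# Hypothetical link compatibility (log-potential); unlisted pairs score 0.
compat = {("faculty", "faculty"): 1.0}


def joint_score(labels):
    """Sum of local scores plus link compatibilities for a full labeling."""
    score = sum(local_score[p][labels[p]] for p in local_score)
    score += sum(compat.get((labels[a], labels[b]), 0.0) for a, b in links)
    return score


def classify_independently():
    return {p: max(CATEGORIES, key=lambda c: local_score[p][c]) for p in local_score}


def classify_collectively():
    pages = list(local_score)
    best = max(
        product(CATEGORIES, repeat=len(pages)),
        key=lambda assignment: joint_score(dict(zip(pages, assignment))),
    )
    return dict(zip(pages, best))


print(classify_independently())
print(classify_collectively())
```

The ambiguous page `p1` flips from student to faculty once the link from a confidently labeled faculty page is taken into account.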
![Page 30: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/30.jpg)
More Complex Structure
![Page 31: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/31.jpg)
More Complex Structure
[Figure: a page with category C and words W1 ... Wn; a Faculty page with sections (S) labeled Students and Courses]
![Page 32: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/32.jpg)
Collective Classification: Results
[Figure: bar chart of test set error (axis 0 to 0.18) for Logistic, Links, Section, and Link+Section]
35.4% error reduction over logistic
[Taskar, Abbeel, K., 2002]
![Page 33: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/33.jpg)
Max Conditional Likelihood
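Estimation: maximize over $\mathbf{w}$ the conditional log-likelihood $\log P_{\mathbf{w}}(\mathbf{y}^* \mid \mathbf{x})$ of the training labels $\mathbf{y}^*$ given the training features $\mathbf{x}$
Classification: $\arg\max_{\mathbf{y}'} \log P_{\mathbf{w}}(\mathbf{y}' \mid \mathbf{x}') = \arg\max_{\mathbf{y}'} \mathbf{w}^\top \mathbf{f}(\mathbf{x}', \mathbf{y}')$
$$P_{\mathbf{w}}(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z_{\mathbf{w}}(\mathbf{x})} \exp\left(\mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y})\right)$$
$$\log P_{\mathbf{w}}(\mathbf{y} \mid \mathbf{x}) = \mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y}) - \log Z_{\mathbf{w}}(\mathbf{x})$$
We don’t care about the conditional distribution $P(\mathbf{y} \mid \mathbf{x})$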
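For reference, the gradient of the conditional log-likelihood has a standard closed form for log-linear models. The derivation below is a sketch in generic notation (weights $\mathbf{w}$, features $\mathbf{f}$, partition function $Z_{\mathbf{w}}(\mathbf{x})$); it is a textbook identity rather than something stated on the slide.

```latex
\begin{align*}
\log P_{\mathbf{w}}(\mathbf{y}^* \mid \mathbf{x})
  &= \mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y}^*) - \log Z_{\mathbf{w}}(\mathbf{x}),
  \qquad
  Z_{\mathbf{w}}(\mathbf{x}) = \sum_{\mathbf{y}} \exp\left(\mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y})\right) \\
\nabla_{\mathbf{w}} \log Z_{\mathbf{w}}(\mathbf{x})
  &= \sum_{\mathbf{y}} P_{\mathbf{w}}(\mathbf{y} \mid \mathbf{x})\, \mathbf{f}(\mathbf{x}, \mathbf{y}) \\
\nabla_{\mathbf{w}} \log P_{\mathbf{w}}(\mathbf{y}^* \mid \mathbf{x})
  &= \mathbf{f}(\mathbf{x}, \mathbf{y}^*)
   - \mathbb{E}_{\mathbf{y} \sim P_{\mathbf{w}}(\cdot \mid \mathbf{x})}\left[\mathbf{f}(\mathbf{x}, \mathbf{y})\right]
\end{align*}
```

The expectation is over all joint labelings, so each gradient evaluation requires inference in the network, which is why learning cost is dominated by inference cost.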
![Page 34: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/34.jpg)
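Max Margin Estimation
What we really want: correct class labels
Estimation: maximize the margin $\gamma$ over $\|\mathbf{w}\| = 1$, subject to
$$\mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y}^*) - \mathbf{w}^\top \mathbf{f}(\mathbf{x}, \mathbf{y}) \ge \gamma\,\Delta(\mathbf{y}^*, \mathbf{y}) \quad \forall\, \mathbf{y} \ne \mathbf{y}^*$$
where $\gamma$ is the margin and $\Delta(\mathbf{y}^*, \mathbf{y})$ is the number of labeling mistakes in $\mathbf{y}$
Classification: $\arg\max_{\mathbf{y}'} \mathbf{w}^\top \mathbf{f}(\mathbf{x}', \mathbf{y}')$
Quadratic program
Exponentially many constraints
[Taskar, Guestrin, K., 2003] (see also [Collins, 2002; Hoffman 2003])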
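The margin constraints can be checked by brute force on a tiny chain of binary labels. The sketch below assumes a constraint of the form score(y*) − score(y) ≥ γ · (number of labeling mistakes in y) for every y ≠ y*; the feature map is hypothetical, and enumeration over all labelings is exactly the exponential cost that a practical method must avoid.

```python
from itertools import product

LABELS = [0, 1]


def hamming(y_true, y):
    """Number of labeling mistakes in y relative to y_true."""
    return sum(a != b for a, b in zip(y_true, y))


def features(x, y):
    """Hypothetical joint features: [labels matching inputs, equal adjacent labels]."""
    node = sum(1.0 for xi, yi in zip(x, y) if xi == yi)
    edge = sum(1.0 for a, b in zip(y, y[1:]) if a == b)
    return [node, edge]


def score(w, x, y):
    return sum(wk * fk for wk, fk in zip(w, features(x, y)))


def achieved_margin(w, x, y_true):
    """Largest gamma satisfying every constraint, found by enumerating all y."""
    return min(
        (score(w, x, y_true) - score(w, x, y)) / hamming(y_true, y)
        for y in product(LABELS, repeat=len(x))
        if tuple(y) != tuple(y_true)
    )


x = (1, 1, 0)
y_true = (1, 1, 0)
print(achieved_margin([1.0, 0.0], x, y_true))  # relies on node features only
print(achieved_margin([0.0, 1.0], x, y_true))  # relies on edge features only
```

A negative result means some wrong labeling outscores the true one, so that weight vector would misclassify the example.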
![Page 35: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/35.jpg)
Max Margin Markov Networks
We use the structure of the Markov network to provide an equivalent formulation of the QP
Exponential only in tree width of network
Complexity = max-likelihood classification
Can solve approximately in networks where induced width is too large
Analogous to loopy belief propagation
Can use kernel-based features!
SVMs meet graphical models
[Taskar, Guestrin, K., 2003]
![Page 36: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/36.jpg)
WebKB Revisited
[Figure: bar chart of test error (axis 0 to 0.2) for Logistic, likelihood, and max margin]
16.1% relative reduction in error relative to cond. likelihood RMNs
![Page 37: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/37.jpg)
Predicting Relationships
Even more interesting: relationships between objects
[Figure: Tom Mitchell (Professor), WebKB (Project), and Sean Slattery (Student), connected by the relations Advisor-of, Member, and Member]
![Page 38: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/38.jpg)
Predicting Relations
Introduce exists/type attribute for each potential link
Learn discriminative model for this attribute
Collectively predict its value in new world
[Figure: a From-Page and a To-Page (each with Category and Word1 ... WordN) joined by a Relation with an Exists/Type attribute and link words LinkWord1 ... LinkWordN]
[Figure: bar chart (axis 0 to 30) comparing Flat and Collective]
72.9% error reduction over flat
[Taskar, Wong, Abbeel, K., 2003]
![Page 39: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/39.jpg)
Outline
Relational Bayesian Networks
Relational Markov Networks
Collective Classification
Relational clustering
Movie data*
Biological data†
* with Ben Taskar, Eran Segal
† with Eran Segal, Nir Friedman, Aviv Regev, Dana Pe’er, Haidong Wang, Micha Shapira, David Botstein
![Page 40: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/40.jpg)
Relational Clustering
[Figure: pipeline. Unlabeled Relational Data → Learning → Probabilistic Relational Model (model structure over Course, Student, Reg) → Clustering of instances]
Example: given only students’ grades, cluster similar students
![Page 41: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/41.jpg)
Learning w. Missing Data: EM
EM algorithm applies essentially unchanged
E-step computes expected sufficient statistics, aggregated over all objects in class
M-step uses ML (or MAP) parameter estimation
Key difference:
In general, the hidden variables are not independent
Computation of expected sufficient statistics requires inference over entire network
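The E-step and M-step can be illustrated on a deliberately simplified clustering problem. The sketch below clusters students by how many high grades they received, using a mixture of Bernoulli models; it assumes the hidden cluster variables are independent given the parameters, which is exactly the simplification that does not hold in the general relational case, and the data and initialization are hypothetical.

```python
import math


def em_bernoulli_mixture(counts, n_clusters=2, n_iters=50):
    """EM for a Bernoulli mixture; counts is a list of (successes, trials)."""
    # Deterministic, spread-out initialization (an arbitrary choice).
    pi = [1.0 / n_clusters] * n_clusters
    theta = [(k + 1) / (n_clusters + 1) for k in range(n_clusters)]
    log_liks = []
    resp = []
    for _ in range(n_iters):
        # E-step: responsibilities, i.e. posterior over each hidden cluster.
        resp = []
        log_lik = 0.0
        for s, n in counts:
            weights = [
                pi[k] * theta[k] ** s * (1.0 - theta[k]) ** (n - s)
                for k in range(n_clusters)
            ]
            total = sum(weights)
            log_lik += math.log(total)  # up to a constant binomial coefficient
            resp.append([w / total for w in weights])
        log_liks.append(log_lik)
        # M-step: ML estimates from expected sufficient statistics,
        # aggregated over all students.
        for k in range(n_clusters):
            mass = sum(r[k] for r in resp)
            successes = sum(r[k] * s for r, (s, n) in zip(resp, counts))
            trials = sum(r[k] * n for r, (s, n) in zip(resp, counts))
            pi[k] = mass / len(counts)
            # Clamp to keep probabilities strictly inside (0, 1).
            theta[k] = min(max(successes / trials, 1e-6), 1.0 - 1e-6)
    return pi, theta, resp, log_liks


# Hypothetical data: (number of high grades, number of courses) per student.
counts = [(9, 10), (8, 10), (1, 10), (2, 10)]
pi, theta, resp, log_liks = em_bernoulli_mixture(counts)
print([round(t, 3) for t in theta])
```

In a relational model the responsibilities cannot be computed one student at a time, because shared course attributes couple the hidden variables; the per-student loop in the E-step would be replaced by inference over the whole ground network.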
![Page 42: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/42.jpg)
Learning w. Missing Data: EM
[Figure: EM on Courses and Students: repeated bar charts (0% to 100%) of the CPD P(Registration.Grade | Course.Difficulty, Student.Intelligence) over grades A, B, C for hard,high; hard,low; easy,high; easy,low, with hidden values low / high for students and easy / hard for courses]
[Dempster et al. 77]
![Page 43: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/43.jpg)
Movie Data
Internet Movie Database: http://www.imdb.com
![Page 44: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/44.jpg)
Discovering Hidden Types
[Figure: classes Actor, Director, and Movie, each with a hidden Type attribute; Movie attributes: Genres, Rating, Year, #Votes, MPAA Rating]
Learn model using EM
[Taskar, Segal, K., 2001]
![Page 45: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/45.jpg)
Discovering Hidden Types
Directors
Steven Spielberg, Tim Burton, Tony Scott, James Cameron, John McTiernan, Joel Schumacher
Alfred Hitchcock, Stanley Kubrick, David Lean, Milos Forman, Terry Gilliam, Francis Coppola
Actors
Anthony Hopkins, Robert De Niro, Tommy Lee Jones, Harvey Keitel, Morgan Freeman, Gary Oldman
Sylvester Stallone, Bruce Willis, Harrison Ford, Steven Seagal, Kurt Russell, Kevin Costner, Jean-Claude Van Damme, Arnold Schwarzenegger
…
Movies
Wizard of Oz, Cinderella, Sound of Music, The Love Bug, Pollyanna, The Parent Trap, Mary Poppins, Swiss Family Robinson
…
Terminator 2, Batman, Batman Forever, GoldenEye, Starship Troopers, Mission: Impossible, Hunt for Red October
[Taskar, Segal, K., 2001]
![Page 46: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/46.jpg)
Biology 101: Gene Expression
[Figure: DNA → RNA → Protein; Gene 1 and Gene 2, each with a Control region and a Coding region; Swi5 transcription factor]
Cells express different subsets of their genes in different tissues and under different conditions
![Page 47: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/47.jpg)
Gene Expression Microarrays
Measure mRNA level for all genes in one condition
Hundreds of experiments
Highly noisy
[Figure: expression matrix with Genes on one axis and Experiments on the other; each entry is the expression of gene i in experiment j, shown as Induced or Repressed]
![Page 48: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/48.jpg)
Standard Analysis
Cluster genes by similarity of expression profiles
Manually examine clusters to understand what’s common to genes in cluster
[Figure: Clustering]
![Page 49: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/49.jpg)
General Approach
Expression level is a function of gene properties and experiment properties
Learn model that best explains the data
• Observed properties: gene sequence, array condition, …
• Hidden properties: gene cluster
• Assignment to hidden variables (e.g., module assignment)
• Expression level as function of properties
[Figure: Gene (Attributes: properties of Gene i) and Experiment (Attributes: properties of Experiment j) determine Expression (Level: expression level of Gene i in Experiment j)]
![Page 50: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/50.jpg)
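Clustering as a PRM
[Figure: PRM with classes Gene (Cluster), Experiment (ID), and Expression (Level); the CPD P(Ei.L | g.C) is indexed by the cluster value g.C = 1, 2, 3]
[Figure: the ground model is Naïve Bayes: g.C is the parent of g.E1, g.E2, ..., g.Ek, with CPD 1, CPD 2, ..., CPD k]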
![Page 51: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/51.jpg)
Modular Regulation
Learn functional modules: clusters of genes that are similarly controlled
Learn control program for modules: expression as function of control genes
[Figure: tree with true/false splits on the control genes HAP4 and CMK1]
![Page 52: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/52.jpg)
Module Network PRM
[Figure: PRM with classes Gene (Cluster), Experiment (Control1, Control2, ..., Controlk: activity level of control gene in experiment), and Expression (Level)]
[Figure: one tree with true/false splits per cluster (Cluster 1, Cluster 2); control genes shown include HAP4, CMK1, BMH1, Yer184c, GIC2, USV1, FAR1, APG1]
[Segal, Regev, Pe’er, Koller, Friedman, 2003]
![Page 53: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/53.jpg)
Experimental Results
Yeast Stress Data (Gasch et al.)
2355 genes that showed activity
173 experiments (microarrays): diverse environmental stress conditions (e.g. heat shock)
Learned module network with 50 modules:
Cluster assignments are hidden variables
Structure of dependency trees unknown
Learned model using structural EM algorithm
Segal et al., Nature Genetics, 2003
![Page 54: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/54.jpg)
Biological Evaluation
Find sets of co-regulated genes (regulatory module)
Find the regulators of each module
[Segal et al., Nature Genetics, 2003]
46/50
30/50
![Page 55: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/55.jpg)
Experimental Results
Hypothesis: Regulator ‘X’ regulates process ‘Y’
Experiment: Knock out ‘X’ and rerun the experiment
[Figure: tree with true/false splits on HAP4 and CMK1, with regulator X marked “X?”]
[Segal et al., Nature Genetics, 2003]
![Page 56: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/56.jpg)
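Differentially Expressed Genes
[Figure: time-course comparisons of wild type (wt) against each knocked-out regulator]
Ypl230w: time points 0, 3, 5, 7, 9, 24 (hrs.); >16x; 341 differentially expressed genes
Ppt1: time points 0, 7, 15, 30, 60 (min.); >4x; 602 differentially expressed genes
Kin82: time points 0, 5, 15, 30, 60 (min.); >4x; 281 differentially expressed genes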
[Segal et al., Nature Genetics, 2003]
![Page 57: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/57.jpg)
Biological Experiments Validation
Were the differentially expressed genes predicted as targets?
Rank modules by enrichment for diff. expressed genes

Ppt1
| # | Module | Significance |
|---|---|---|
| 14 | Ribosomal and phosphate metabolism | 8/32, 9e-3 |
| 11 | Amino acid and purine metabolism | 11/53, 1e-2 |
| 15 | mRNA, rRNA and tRNA processing | 9/43, 2e-2 |
| 39 | Protein folding | 6/23, 2e-2 |
| 30 | Cell cycle | 7/30, 2e-2 |

Ypl230w
| # | Module | Significance |
|---|---|---|
| 39 | Protein folding | 7/23, 1e-4 |
| 29 | Cell differentiation | 6/41, 2e-2 |
| 5 | Glycolysis and folding | 5/37, 4e-2 |
| 34 | Mitochondrial and protein fate | 5/37, 4e-2 |

Kin82
| # | Module | Significance |
|---|---|---|
| 3 | Energy and osmotic stress I | 8/31, 1e-4 |
| 2 | Energy, osmolarity & cAMP signaling | 9/64, 6e-3 |
| 15 | mRNA, rRNA and tRNA processing | 6/43, 2e-2 |

All regulators regulate predicted modules
[Segal et al., Nature Genetics, 2003]
![Page 58: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/58.jpg)
Biology 102: Pathways
Pathways are sets of genes that act together to achieve a common function
![Page 59: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/59.jpg)
Finding Pathways: Attempt I
Use protein-protein interaction data
Problems:
Data is very noisy
Structure is lost: large connected component in interaction graph (3527/3589 genes)
![Page 62: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/62.jpg)
Finding Pathways: Attempt II
Use expression microarray clusters
[Figure: expression clusters labeled Pathway I and Pathway II]
Problems:
Expression is only ‘weak’ indicator of interaction
Interacting pathways are not separable
![Page 63: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/63.jpg)
Finding Pathways: Our Approach
Use both types of data to find pathways
Find “active” interactions using gene expression
Find pathway-related co-expression using interactions
[Figure: Pathway I, Pathway II, Pathway III, Pathway IV]
[Segal, Wang, K., 2003]
![Page 64: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/64.jpg)
Probabilistic Model
[Figure: Gene1 and Gene2, each with a Pathway attribute and expression levels Exp1 ... ExpN (expression level in N arrays); an Interacts link (protein product interaction) carries a compatibility potential $\phi$(g1.C, g2.C) over the pathway assignments 1, 2, 3 of the two genes]
Cluster all genes collectively, maximizing the joint model likelihood
[Segal, Wang, K., 2003]
![Page 65: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/65.jpg)
Capturing Protein Complexes
Independent data set of interacting proteins
[Figure: number of complexes (0 to 400) against complex coverage (0% to 100%), for our method and for standard expression clustering]
124 complexes covered at 50% for our method
46 complexes covered at 50% for clustering
[Segal, Wang, K., 2003]
![Page 66: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/66.jpg)
RNAse Complex Pathway
[Figure: pathway over the genes YHR081W, RRP40, RRP42, MTR3, RRP45, RRP4, RRP43, DIS3, TRM7, SKI6, RRP46, CSL4]
Includes all 10 known pathway genes
Only 5 genes found by clustering
[Segal, Wang, K., 2003]
![Page 67: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/67.jpg)
Interaction Clustering
RNAse complex found by interaction clustering as part of cluster with 138 genes
[Segal, Wang, K., 2003]
![Page 68: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/68.jpg)
Truth in Advertising
Huge graphical models:
3000-50,000 hidden variables
Hundreds of thousands of observed nodes
Very densely connected
Learning:
Multiple iterations of model updates
Each requires running inference on the model
Inference:
Exact inference is intractable
Use belief propagation
Single inference iteration: 1-6 hours
Algorithmic ideas key to scaling
![Page 69: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/69.jpg)
Relational Data: A New Challenge
Data consists of different types of instances
Instances are related in complex networks
Instances are not independent
New tasks for machine learning:
Collective classification
Relational clustering
Link prediction
Group detection
Opportunity
![Page 70: Statistical Learning from Relational Data](https://reader036.vdocuments.us/reader036/viewer/2022062323/5681584e550346895dc5a73f/html5/thumbnails/70.jpg)
http://robotics.stanford.edu/~koller/