Bayesian Machine Learning and Its Application (Alan Qi, Feb. 23, 2009)
![Page 1: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/1.jpg)
Bayesian Machine learning and its application
Alan Qi
Feb. 23, 2009
![Page 2: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/2.jpg)
Motivation
• Massive data from various sources: web pages, Facebook, high-throughput biological data, high-throughput chemical data, etc.
• Challenging goal: how to model complex systems and extract knowledge from data.
![Page 3: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/3.jpg)
Bayesian machine learning
Bayesian learning method
Principled way to fuse prior knowledge and new evidence in data
Key issues: model design and computation
Wide-range applications
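As a minimal sketch of what "fusing prior knowledge and new evidence" can look like, the snippet below applies a conjugate Beta-Bernoulli update. The Beta prior, the toy 0/1 observations, and the function names are illustrative assumptions, not something taken from the talk.

```python
from fractions import Fraction

def beta_bernoulli_update(alpha, beta, observations):
    """Return the posterior Beta(alpha', beta') after observing 0/1 outcomes."""
    successes = sum(observations)
    failures = len(observations) - successes
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution, kept exact with Fraction."""
    return Fraction(alpha, alpha + beta)

prior = (2, 2)                    # assumed prior belief: success rate near 1/2
data = [1, 1, 0, 1, 1, 1, 0, 1]   # hypothetical new evidence: 6 successes, 2 failures
posterior = beta_bernoulli_update(*prior, data)

print("prior mean:", beta_mean(*prior))
print("posterior mean:", beta_mean(*posterior))
```

The posterior mean moves from the prior toward the observed success rate, and more data would move it further.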
![Page 4: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/4.jpg)
Bayesian learning in practice
Applications:
Recommendation systems (Amazon, Netflix)
Text parsing (finding latent topics in documents)
Systems biology (where computation meets biology)
Computer vision (parsing handwritten diagrams automatically)
Wireless communications
Computational finance ....
![Page 5: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/5.jpg)
Learning for biology: understanding gene regulation during organism development
[Diagram labels: protein (product of Gene B), DNA, Gene A]
Learning functionalities of genes for development
Inferring high-resolution protein-DNA binding locations from low-resolution measurement
Learning regulatory cascades during embryonic stem cell development
![Page 6: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/6.jpg)
[Figure: gene expression heat maps (axes: genes, time; color scale marked "High") for three conditions: wild-type lineage, no C lineage, extra ‘C’ lineages]
Data: gene expression profiles from wild types & mutants
(Baugh et al, 2005)
![Page 7: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/7.jpg)
Bayesian semisupervised classification for finding tissue-specific genes
BGEN: (Bayesian GENeralization from examples, Qi et al., Bioinformatics 2006)
[Diagram: labeled expression → classifier]
Graph-based kernels
(F. Chung, 1997; Zhu et al., 2003; Zhou et al., 2004)
Gaussian process classifier that is trained by expectation propagation (EP) and classifies the whole genome efficiently
Estimating noise and probe quality by approximate leave-one-out error
Gene expression
![Page 8: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/8.jpg)
Biological experiments support our predictions
[Figure: panels for K01A2.5 and R11A5.4, labeled C vs. Non C and Muscle vs. Epidermis. Credit: Ge’s lab]
![Page 9: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/9.jpg)
Data: genomic sequences
![Page 10: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/10.jpg)
![Page 11: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/11.jpg)
RNA: messenger
![Page 12: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/12.jpg)
Consensus Sequences
Useful for publication
IUPAC symbols for degenerate sites
Not very amenable to computation
Nature Biotechnology 24, 423 - 425 (2006)
![Page 13: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/13.jpg)
Probabilistic Model
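Position Frequency Matrix (PFM): columns M1 … MK, one per motif position 1 … K; rows are the bases A, C, G, T.

A  .2  .7  .2  .2  .1  .3
C  .2  .1  .2  .4  .5  .4
G  .5  .1  .2  .2  .2  .2
T  .1  .1  .4  .1  .2  .1

Pk(S|M): probability of base S at position k under the motif model M
To build the PFM: count frequencies, then add pseudocounts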
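The "count frequencies, add pseudocounts" recipe can be sketched as follows. The aligned sites, the pseudocount of 1 per base, and the function name are assumptions chosen for illustration.

```python
BASES = "ACGT"

def position_frequency_matrix(sites, pseudocount=1.0):
    """Build a PFM from equal-length aligned sites.

    Returns a list with one dict {base: probability} per motif position.
    """
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    pfm = []
    for k in range(width):
        # Start every base at the pseudocount so no probability is exactly zero.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[k]] += 1
        total = sum(counts.values())
        pfm.append({base: counts[base] / total for base in BASES})
    return pfm

sites = ["ACGT", "ACGA", "TCGT", "ACCT"]  # hypothetical aligned binding sites
pfm = position_frequency_matrix(sites)
for k, column in enumerate(pfm, start=1):
    print(k, {base: round(p, 3) for base, p in column.items()})
```

Without pseudocounts, a base never seen at a position would get probability zero and any sequence containing it there would be ruled out entirely.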
![Page 14: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/14.jpg)
Bayesian learning: Estimating motif models by Gibbs sampling
[Figure: likelihood surface P(Sequences | params1, params2) over Parameter1 and Parameter2]
In theory, Gibbs sampling is less likely to get stuck at a local maximum
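A simplified Gibbs site sampler gives a feel for the procedure. This is a sketch under stated assumptions (exactly one site per sequence, fixed motif width, uniform background of 0.25, hypothetical toy sequences), not the sampler used in the talk or in any particular tool.

```python
import random

BASES = "ACGT"

def column_probs(sites, pseudocount=1.0):
    """Per-position base probabilities estimated from aligned sites."""
    probs = []
    for k in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[k]] += 1
        total = sum(counts.values())
        probs.append({base: counts[base] / total for base in BASES})
    return probs

def gibbs_motif_search(seqs, width, sweeps=200, seed=0):
    """Sample one motif start per sequence; returns the final start positions."""
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(sweeps):
        for i, seq in enumerate(seqs):
            # Build the motif model from every sequence except the held-out one.
            others = [s[p:p + width]
                      for j, (s, p) in enumerate(zip(seqs, starts)) if j != i]
            model = column_probs(others)
            # Weight each candidate start by its likelihood ratio vs. background.
            weights = []
            for p in range(len(seq) - width + 1):
                ratio = 1.0
                for k in range(width):
                    ratio *= model[k][seq[p + k]] / 0.25
                weights.append(ratio)
            # Draw the new start at random, in proportion to the weights.
            starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

# Hypothetical sequences that share the planted pattern TGACTCA.
seqs = ["GATTACATGACTCAGG", "CCTGACTCATTGCAAT",
        "AGGCTTGACTCACGTA", "TTGACTCAGCGATACC"]
starts = gibbs_motif_search(seqs, width=7)
print([s[p:p + 7] for s, p in zip(seqs, starts)])
```

Because each new start is drawn at random rather than chosen greedily, the sampler can step away from a poor alignment, which is the intuition behind the remark about local maxima.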
![Page 15: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/15.jpg)
Bayesian learning: Estimating motif models by expectation maximization
[Figure: likelihood surface P(Sequences | params1, params2) over Parameter1 and Parameter2]
To minimize the effects of local maxima, you should search multiple times from different starting points
![Page 16: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/16.jpg)
Scoring A Sequence
$$
\text{Score} = \log\frac{P(S \mid \mathrm{PFM})}{P(S \mid B)}
= \log\prod_{i=1}^{N}\frac{P(S_i \mid \mathrm{PFM}_i)}{P(S_i \mid B)}
= \sum_{i=1}^{N}\log\frac{P(S_i \mid \mathrm{PFM}_i)}{P(S_i \mid B)}
$$
To score a sequence, we compare to a null model
Background DNA (B): A: 0.25, T: 0.25, G: 0.25, C: 0.25
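Position Frequency Matrix (PFM); rows are the bases A, C, G, T:

A  .2  .7  .2  .2  .1  .3
C  .2  .1  .2  .4  .5  .4
G  .5  .1  .2  .2  .2  .2
T  .1  .1  .4  .1  .2  .1

Taking the log likelihood ratio of each entry against the background gives the Position Weight Matrix (PWM):

A  -0.3   1.4  -0.3  -0.3  -1.3   0.3
C  -0.3  -1.3  -0.3   0.6   1     0.6
G   1    -1.3  -0.3   0.3  -0.3  -0.3
T  -1.3  -1.3   0.6  -1.3  -0.3  -1.3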
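The log likelihood ratio score can be sketched in a few lines. The three-column PFM below is a hypothetical example, and the base-2 logarithm is an assumption (the slide's PWM values look consistent with log base 2, but the base is not stated).

```python
import math

BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# Hypothetical 3-position PFM; each column sums to 1.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.4},
    {"A": 0.1, "C": 0.5, "G": 0.2, "T": 0.2},
]

def to_pwm(pfm, background=BACKGROUND):
    """Convert a PFM into a PWM of per-position log2 likelihood ratios."""
    return [{b: math.log2(col[b] / background[b]) for b in col} for col in pfm]

def score(site, pwm):
    """Sum the per-position log ratios for a site as wide as the motif."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(col[base] for base, col in zip(site, pwm))

pwm = to_pwm(PFM)
print(round(score("ATC", pwm), 2), round(score("GGA", pwm), 2))
```

A positive score means the motif model explains the site better than the background does; a negative score means the opposite.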
![Page 17: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/17.jpg)
Scoring a Sequence
MacIsaac & Fraenkel (2006) PLoS Comp Bio
Common threshold = 60% of maximum score
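One way to apply the 60% threshold is to slide the PWM along a sequence and keep windows scoring at least 0.6 times the best possible score. The PFM, the test sequence, and the function names below are hypothetical, and the log base (2) is an assumption.

```python
import math

# Hypothetical 3-position PFM; each column sums to 1.
PFM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.2, "C": 0.2, "G": 0.2, "T": 0.4},
    {"A": 0.1, "C": 0.5, "G": 0.2, "T": 0.2},
]
# PWM of log2 ratios against a uniform background of 0.25 per base.
PWM = [{b: math.log2(p / 0.25) for b, p in col.items()} for col in PFM]

def max_score(pwm):
    """Best achievable score: the largest entry in every column."""
    return sum(max(col.values()) for col in pwm)

def scan(sequence, pwm, fraction=0.6):
    """Return (start, window, score) for windows at or above the threshold."""
    threshold = fraction * max_score(pwm)
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        s = sum(col[base] for base, col in zip(window, pwm))
        if s >= threshold:
            hits.append((start, window, s))
    return hits

hits = scan("GGATCAATT", PWM)
print(hits)
```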
![Page 18: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/18.jpg)
Visualizing Motifs – Motif Logos
Represent both base frequency and conservation at each position
Height of letter proportional to frequency of base at that position
Height of stack proportional to conservation at that position
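The slide does not say how conservation is measured. A common convention for sequence logos is information content in bits (2 minus the column's Shannon entropy for DNA), and the sketch below assumes that convention, without any small-sample correction.

```python
import math

def logo_heights(column):
    """Return (stack_height_bits, {base: letter_height}) for one PFM column."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    information = math.log2(len(column)) - entropy  # 2 bits maximum for DNA
    letters = {base: p * information for base, p in column.items()}
    return information, letters

conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}   # hypothetical columns
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

print(logo_heights(conserved)[0], logo_heights(uniform)[0])
```

A perfectly conserved position gets a full-height stack, while a position that looks like background gets a stack of height zero.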
![Page 19: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/19.jpg)
Software implementation: AlignACE
http://atlas.med.harvard.edu/cgi-bin/alignace.pl
• Implements Gibbs sampling for motif discovery
  – Several enhancements
• ScanAce – look for motifs in a sequence given a model
• CompareAce – calculate “similarity” between two motifs (i.e. for clustering motifs)
![Page 20: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/20.jpg)
Data: biological networks
![Page 21: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/21.jpg)
Network Decomposition
• Infinite Non-negative Matrix Factorization
1. Formulate the discovery of network legos as a non-negative factorization problem
2. Develop a novel Bayesian model that automatically learns the number of bases.
![Page 22: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/22.jpg)
Network Decomposition
• Synthetic Network Decomposition
![Page 23: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/23.jpg)
Network Decomposition
![Page 24: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/24.jpg)
Data: Movie rating
• User-item Matrix of Ratings
• Recommend: 5
• Not Recommend: 1
X = [figure: user-item rating matrix]
![Page 25: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/25.jpg)
Task: how to predict user preference
• “Based on the premise that people looking for information should be able to make use of what others have already found and evaluated.” (Maltz & Ehrlich, 1995)
• E.g., suppose you like movies A, B, C, D, and E, and I like A, B, C, and D but have not seen E yet. What rating would I be likely to give E?
![Page 26: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/26.jpg)
Collaborative filtering for recommendation systems
• Matrix factorization as a collaborative filtering approach:
X ≈ Z A, where X is N by D, Z is N by K, and A is K by D.
x_{i,j}: user i’s rating of movie j
z_{i,k}: user i’s interest in movie category k (e.g., action, thriller, comedy, romance, etc.)
A_{k,j}: how likely movie j is to belong to movie category k
Such that x_{i,j} ≈ z_{i,1} A_{1,j} + z_{i,2} A_{2,j} + … + z_{i,K} A_{K,j}
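The sum over categories is just one entry of the matrix product Z A. A small sketch with made-up factors (two users, two categories, three movies; all values are hypothetical):

```python
def predict_rating(Z, A, i, j):
    """Predicted rating of user i for movie j: sum over k of Z[i][k] * A[k][j]."""
    return sum(Z[i][k] * A[k][j] for k in range(len(A)))

# Z is N x K (user interest per category); A is K x D (category weight per movie).
Z = [[1.0, 0.0],
     [0.5, 0.5]]
A = [[5.0, 1.0, 4.0],
     [1.0, 5.0, 2.0]]

print(predict_rating(Z, A, 1, 0))
```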
![Page 27: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/27.jpg)
Bayesian learning of matrix factorization
• Training: use probability theory, in particular Bayesian inference, to learn the model parameters Z and A given data X, which contains missing elements, i.e., unknown ratings
• Prediction: use the estimated Z and A to predict unknown ratings in X
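The talk does not spell out the inference algorithm. As a stand-in, the sketch below computes a MAP point estimate of Z and A under assumed Gaussian priors (equivalent to L2-regularized least squares) by stochastic gradient descent over the observed ratings only. That is simpler than full Bayesian inference, and the data, hyperparameters, and function names are all hypothetical.

```python
import math
import random

def predict(Z, A, i, j):
    return sum(Z[i][k] * A[k][j] for k in range(len(A)))

def fit_map_factorization(X, K=2, epochs=3000, lr=0.01, reg=0.05, seed=0):
    """MAP estimate of Z (N x K) and A (K x D); None marks a missing rating."""
    rng = random.Random(seed)
    N, D = len(X), len(X[0])
    Z = [[rng.uniform(0.1, 1.0) for _ in range(K)] for _ in range(N)]
    A = [[rng.uniform(0.1, 1.0) for _ in range(D)] for _ in range(K)]
    observed = [(i, j) for i in range(N) for j in range(D) if X[i][j] is not None]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j in observed:
            err = X[i][j] - predict(Z, A, i, j)
            for k in range(K):
                z, a = Z[i][k], A[k][j]
                # Gradient step on squared error plus the Gaussian-prior penalty.
                Z[i][k] += lr * (err * a - reg * z)
                A[k][j] += lr * (err * z - reg * a)
    return Z, A

def observed_rmse(X, Z, A):
    errs = [(X[i][j] - predict(Z, A, i, j)) ** 2
            for i in range(len(X)) for j in range(len(X[0])) if X[i][j] is not None]
    return math.sqrt(sum(errs) / len(errs))

# Hypothetical ratings from two kinds of users; None marks an unknown rating.
X = [[5, 5, 1, 1],
     [5, None, 1, 1],
     [1, 1, 5, 5],
     [1, 1, None, 5]]
Z, A = fit_map_factorization(X)
print(round(observed_rmse(X, Z, A), 3), round(predict(Z, A, 1, 1), 2))
```

A fully Bayesian treatment would keep a distribution over Z and A rather than a single estimate, which also yields uncertainty for each predicted rating.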
![Page 28: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/28.jpg)
Test results
• ‘Jester’ dataset:
  • Ratings mapped from [-10,10] to [0,20]
  • 10 randomly chosen datasets, each with 1000 users. For each user we randomly hold out 10 ratings for testing
• IMF, INMF and NMF (K=2…9)
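The preprocessing described above might look like the following sketch. The shift by +10 is the obvious way to map [-10,10] onto [0,20] but is an assumption, as are the toy ratings and the function names.

```python
import random

def shift_rating(rating):
    """Map a rating from [-10, 10] to [0, 20] by adding 10 (assumed mapping)."""
    return rating + 10.0

def hold_out(user_ratings, n_test=10, seed=0):
    """Split one user's {item: rating} dict into (train, test) dicts."""
    rng = random.Random(seed)
    test_items = set(rng.sample(sorted(user_ratings), n_test))
    train = {item: r for item, r in user_ratings.items() if item not in test_items}
    test = {item: r for item, r in user_ratings.items() if item in test_items}
    return train, test

# Hypothetical user who rated 25 items on the original [-10, 10] scale.
raw = {item: -10.0 + 0.8 * item for item in range(25)}
shifted = {item: shift_rating(r) for item, r in raw.items()}
train, test = hold_out(shifted)
print(len(train), len(test))
```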
![Page 29: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/29.jpg)
Collaborative Filtering
![Page 30: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/30.jpg)
Task
• How to find latent topics and group documents, such as emails, papers, or news into different clusters?
![Page 31: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/31.jpg)
Data: text documents
X = [figure: data matrix for two groups of documents, computer science papers and biology papers]
![Page 32: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/32.jpg)
Assumptions
1. Keywords are shared across different documents on the same topic.
2. The more important a keyword is, the more frequently it appears.
![Page 33: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/33.jpg)
Matrix factorization models (again)
X = Z A
x_{i,j}: how often word j appears in document i
z_{i,k}: how much of the content of document i is related to topic k (e.g., biology, computer science, etc.)
A_{k,j}: how important word j is to topic k
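For text, X can be built by counting words per document. A minimal sketch, assuming whitespace tokenization and two tiny made-up documents:

```python
from collections import Counter

def document_word_matrix(docs):
    """Return (vocab, X) where X[i][j] counts word vocab[j] in document i."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({word for tokens in tokenized for word in tokens})
    X = []
    for tokens in tokenized:
        counts = Counter(tokens)
        X.append([counts[word] for word in vocab])
    return vocab, X

docs = ["gene protein gene", "kernel algorithm kernel kernel"]  # hypothetical
vocab, X = document_word_matrix(docs)
print(vocab)
print(X)
```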
![Page 34: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/34.jpg)
Bayesian Matrix Factorization
• We will use Bayesian methods again to estimate Z and A.
• Once they are estimated, we can identify hidden topics by examining A and cluster the documents.
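One plausible way to read topics and clusters off the factors is to list the highest-weight words in each row of A and to assign each document to the topic with its largest entry in Z. The slide does not specify the assignment rule, so that rule is an assumption, and the factor values below are made up.

```python
def top_words(A, vocab, n=2):
    """For each topic (row of A), return its n highest-weight words."""
    topics = []
    for row in A:
        ranked = sorted(range(len(vocab)), key=lambda j: row[j], reverse=True)
        topics.append([vocab[j] for j in ranked[:n]])
    return topics

def cluster_documents(Z):
    """Assign each document (row of Z) to its highest-loading topic."""
    return [max(range(len(row)), key=lambda k: row[k]) for row in Z]

vocab = ["gene", "protein", "kernel", "algorithm"]
A = [[0.9, 0.7, 0.1, 0.0],    # hypothetical topic 0: biology-like words
     [0.0, 0.1, 0.8, 0.6]]    # hypothetical topic 1: computer-science-like words
Z = [[2.0, 0.1],
     [0.3, 1.5],
     [1.2, 0.4]]

print(top_words(A, vocab))
print(cluster_documents(Z))
```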
![Page 35: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/35.jpg)
Text Clustering
• ‘20 newsgroup’ dataset
• A subset of 815 articles and 477 words.
![Page 36: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/36.jpg)
Discovered hidden topics
![Page 37: Bayesian Machine learning and its application Alan Qi Feb. 23, 2009](https://reader034.vdocuments.us/reader034/viewer/2022051517/5697bfe81a28abf838cb6358/html5/thumbnails/37.jpg)
Summary
• Bayesian machine learning: a powerful tool that enables computers to learn hidden relations from massive data and make sensible predictions.
• Applications in computational biology, e.g., gene expression analysis and motif discovery, and information extraction, e.g., text modeling.