Biomedical Applications and the Probabilistic Framework
Peter Sykacek
http://www.sykacek.net
Probabilistic Methods with Applications, Peter Sykacek, 2004 – p.1/23
Talk Overview

- Motivation of Probabilistic Concepts
- BCI, current practice & shortcomings
- Probabilistic Kalman Filter
- Adaptive BCI
- Gene Discovery
- DAG for Bayesian Marker Identification
- Gene Selection
- Discussion of Model Selection
Probabilistic Motivations

Thomas Bayes (1701 - 1763): learning from data using a decision theoretic framework.

First consequence: we must revise beliefs according to Bayes' theorem

p(θ | D) = p(D | θ) p(θ) / p(D), where p(D) = ∫ p(D | θ) p(θ) dθ.

Second consequence: decisions by maximising expected utilities.
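The belief revision above can be made concrete with a small numerical sketch (hypothetical coin-flip data, not from the talk): a discrete grid of hypotheses is revised by Bayes' theorem exactly as stated on the slide.

```python
from math import comb

def posterior_grid(prior, likelihood):
    """Revise beliefs over a discrete grid of hypotheses via Bayes' theorem:
    p(theta|D) = p(D|theta) p(theta) / p(D)."""
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)                      # p(D) = sum_theta p(D|theta) p(theta)
    return [j / evidence for j in joint]

# Three hypotheses about a coin's heads probability, uniform prior.
thetas = [0.2, 0.5, 0.8]
prior = [1 / 3] * 3
# Observed D: 7 heads in 10 flips -> binomial likelihood per hypothesis.
lik = [comb(10, 7) * t**7 * (1 - t)**3 for t in thetas]
post = posterior_grid(prior, lik)
print([round(p, 3) for p in post])  # posterior mass shifts toward theta = 0.8
```

The same normalization over an evidence term reappears later in the talk when gene models are ranked by marginal likelihood.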
Brain Computer Interface
Computer is controlled directly by cortical activity.
Classification of BCIs

Neural activity → cognitive events: voluntary control (mostly motor tasks) or involuntary events (P300); recorded as scalp EEG or intracranial EEG.

- Intracranial EEG → high spatial and temporal resolution; highly invasive!; allows 2-d control of an artificial limb.
- Surface EEG → low spatial and temporal resolution; no permanent interference with the patient; slow! At most 20 bit per minute and task.

→ Focus on BCIs based on scalp recordings.
→ Low bit rates; last resort if no other communication is possible.
BCI with almost no adaptation

- P300 based (L. A. Farwell and E. Donchin): user intention is embedded within a sequence of symbols. The correct symbol leads to "surprise" and triggers a P300.
- Filter & threshold: N. Birbaumer et al. threshold slow cortical potentials; J. R. Wolpaw et al. threshold a moving average in an appropriate pass band, e.g. the μ-rhythm.

These principles rely mostly on user training.
BCI & static pattern recognition

- Extract a representation of EEG "waveforms" (e.g. low pass filtered time series; spectral representation).
- Parameterize supervised classification, implicitly assuming stationarity.

What if the technical setup changes during operation (e.g. electrolyte changes impedance)? What if the user learns from feedback? What if the user shows fatigue?

Assuming stationarity must be wrong!
→ Probabilistic method for "adaptive" BCI.
Probabilistic Kalman Filter

[DAG: weights w_{n-1} (precision Λ_{n-1}) evolve to w_n with drift covariance λI; hyperparameters β and α_w; w_n generates observation y_n at step n.]

Key: get λ right (λ may be regarded as a learning rate).

Classification: non-linear and non-Gaussian, some eqns.

[Figure: simulations using σ_λ = 1e+003 with window sizes 1, 5, 10, 15 and 20; illustration of λ and the "instantaneous" generalization error for B. D. Ripley's synthetic data with artificial non-stationarity (labels swapped after sample 500).]
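For the linear-Gaussian case the predict/update cycle behind the DAG can be sketched directly; the function below is a minimal illustration (variable names and the β/λ parameterization are assumptions based on the slide's key, not the talk's exact equations). Note how λ inflates the covariance each step and so acts like a learning rate.

```python
import numpy as np

def kalman_step(w, P, x, y, lam=1e-3, beta=1.0):
    """One predict/update cycle for regression weights w that drift as a
    random walk: w_n = w_{n-1} + eps, eps ~ N(0, lam * I).

    lam acts like a learning rate: larger lam discounts old data faster."""
    # Predict: inflate the covariance by the drift term lam * I.
    P = P + lam * np.eye(len(w))
    # Update with observation y = x.w + noise, noise variance 1/beta.
    S = x @ P @ x + 1.0 / beta          # innovation variance
    K = P @ x / S                        # Kalman gain
    w = w + K * (y - x @ w)              # corrected mean
    P = P - np.outer(K, x @ P)           # corrected covariance
    return w, P

# Track a weight vector that flips sign halfway (cf. the label swap at sample 500).
rng = np.random.default_rng(0)
w_true = np.array([1.0, -1.0])
w, P = np.zeros(2), np.eye(2)
for n in range(1000):
    if n == 500:
        w_true = -w_true
    x = rng.normal(size=2)
    y = x @ w_true + 0.1 * rng.normal()
    w, P = kalman_step(w, P, x, y, lam=0.01, beta=100.0)
print(np.round(w, 2))  # close to the post-swap weights [-1, 1]
```

With lam = 0 the filter would freeze after enough data and could never recover from the swap; the drift term is what makes the tracking adaptive.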
Adaptive BCI

By variational Kalman filtering. BCI: data driven prediction of the cognitive state from EEG measurements.

Working hypothesis: EEG dynamics during a cognitive task are subject to temporal variation (learning effects, fatigue, ...).

Represent EEG segments by z-transformed reflection coefficients.

Mutual information of the adaptive method and of the identical "stationary" model.
[Figure: empirical cdfs of D(P(y|z)||P(y)) in bit for navigation/auditory, navigation/movement, auditory/movement, navig./audit./move (3 class), rest/move without and with feedback, and move/math without and with feedback.]
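The quantity on the figure axes, D(P(y|z)||P(y)) in bit, measures how far a prediction moves away from the class prior. A minimal sketch (the distributions below are made up for illustration):

```python
from math import log2

def kl_bits(p, q):
    """Kullback-Leibler divergence D(p || q) in bit (base-2 logarithm).
    Terms with p_i = 0 contribute nothing by convention."""
    return sum(pi * log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

prior = [0.5, 0.5]                  # P(y): uninformative two-class prior
posterior = [0.9, 0.1]              # P(y|z): a confident prediction
print(round(kl_bits(posterior, prior), 3))  # ≈ 0.531 bit
# A prediction identical to the prior carries no information:
print(kl_bits(prior, prior))  # 0.0
```

Averaged over observations this divergence estimates the mutual information between EEG features and the cognitive state, which is why its empirical cdf is a natural comparison between the adaptive and stationary models.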
Communication Bandwidth

Bit rates [bit/s]:

task              | vkf | vsi
rest/move no fb.  |     |
rest/move fb.     |     |
move/math no fb.  |     |
move/math fb.     |     |
nav./aud./move    |     |
audit./move       |     |
navig./move       |     |
navig./audit.     |     |

Conclusion: adaptive methods increase BCI bandwidths even on short time scales.
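Bit rates like those in the table are conventionally derived from classification accuracy; the sketch below uses the standard Wolpaw bit-rate formula common in the BCI literature (the talk's exact estimator is not stated here, so this is illustrative):

```python
from math import log2

def bits_per_trial(n_classes, accuracy):
    """Wolpaw bit rate: information per decision for an N-class task
    answered correctly with probability P (errors spread uniformly)."""
    n, p = n_classes, accuracy
    if p <= 1 / n:
        return 0.0                       # no better than guessing
    b = log2(n) + p * log2(p)
    if p < 1.0:
        b += (1 - p) * log2((1 - p) / (n - 1))
    return b

# e.g. a 2-class BCI at 90% accuracy, one decision per second:
rate = bits_per_trial(2, 0.9)
print(round(rate, 3))  # ≈ 0.531 bit per decision
```

Multiplying bits per decision by decisions per second gives a bit/s figure comparable across tasks and models.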
Gene discovery

Discovering "important" genes (or proteins) from microarray datasets can be classified as

- identification of all differentially expressed genes;
- identification of reliable (sets of) marker genes.

Current practice for the first: classical methods (e.g. t-test on differences of means) or probabilistic approaches with one indicator variable for each gene.

The second is typically done by conventional feature subset selection. As a result we obtain a set of genes that was found by heuristic search.
Bayesian Marker Identification

Missing in FSS: how good are other explanations?

Interpret microarray data as a classification problem of "genetic" regressors w.r.t. a discrete response.

→ Bayesian variable selection provides this information. However: hopeless, unless we constrain the dimensionality.

Simplified attempt: → find a distribution over individual genes.

Probabilities result from the marginal likelihood of each model:

P(M_i | D) = p(D | M_i) P(M_i) / Σ_j p(D | M_j) P(M_j).
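Turning per-gene marginal likelihoods into a distribution over genes is a normalization; since marginal likelihoods are tiny, it is done in log space. A sketch with made-up log marginal likelihoods (the values are illustrative, not results from the talk):

```python
from math import exp, log

def model_posteriors(log_ml, log_prior=None):
    """Posterior P(M_i | D) proportional to p(D | M_i) P(M_i), computed
    from log marginal likelihoods with the log-sum-exp trick."""
    k = len(log_ml)
    if log_prior is None:
        log_prior = [log(1.0 / k)] * k       # uniform prior over genes
    logj = [a + b for a, b in zip(log_ml, log_prior)]
    m = max(logj)
    z = sum(exp(v - m) for v in logj)         # stable normalizer
    return [exp(v - m) / z for v in logj]

# Hypothetical log marginal likelihoods for four candidate genes:
log_ml = [-120.0, -118.5, -125.0, -118.9]
post = model_posteriors(log_ml)
print([round(p, 3) for p in post])  # gene 2 gets the largest posterior
```

Shifting by the maximum before exponentiating keeps the computation finite even when log marginal likelihoods are in the hundreds.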
DAG for Marker Identification I

[DAG: prior precision Λ_0, latent variable z_n, weights w, regressor x_n and class prior Π generate label y_n for observation n.]

Latent variable probit GLM: z_n is a one dimensional Gaussian random variable with mean w'x_n and fixed precision.

y_n = 1 if z_n ≥ 0, y_n = 0 if z_n < 0.

Inference can be done by a variational method (systematic error) or by sampling (random error). The latter allows to integrate over the weights w analytically, so we draw from the remaining latent variables only.
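The latent-variable construction of the probit GLM can be checked by simulation; the sketch below assumes the standard convention z ~ N(w·x, 1), y = 1[z ≥ 0] (notation assumed, not taken verbatim from the slide), and confirms that thresholding the latent draw reproduces the probit link Φ(w·x).

```python
import numpy as np
from math import erf, sqrt

def probit(a):
    """Standard normal cdf Phi(a): P(z >= 0) when z ~ N(a, 1)."""
    return 0.5 * (1.0 + erf(a / sqrt(2.0)))

rng = np.random.default_rng(1)
w = np.array([0.8, -0.5])
x = np.array([1.0, 2.0])
a = w @ x                               # latent mean, here -0.2

# Monte Carlo: draw the latent z and threshold it at zero.
z = rng.normal(loc=a, scale=1.0, size=200_000)
mc = np.mean(z >= 0)

print(round(probit(a), 4), round(mc, 4))  # the two estimates agree
```

This equivalence is what makes Gibbs-style samplers for the probit model convenient: conditioned on z, the rest of the model is linear-Gaussian.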
Asymptotic Behaviour

[Figure: cumulative probability over gene index (Bayes posterior) and ascending p-values (test based ranking) for Leukaemia with 3, 18 and 38 slides and for Colon cancer with 3, 30 and 62 slides.]
Comparison with ML

[Figure: cumulative probability over gene index for Bayesian prob. vs. max. likelihood, Leukaemia and Colon cancer.]

Results differ since Bayesian model posteriors take "complexity" (ref. Hochreiter's "flat minima") into account.
Selection and Gen. Accuracy

Most probable regressors selected at a threshold:

Acc. no. | description     | posterior
Colon Cancer (Alon et al.)
Z50753   | Uroguanylin     | 0.76
R87126   | Myosin          | 0.21
M63391   | desmin gene     | 0.01
M36634   | vasoact. pept.  | 0.01
Leukaemia (Golub et al.)
X95735   | Zyxin           | 0.93
M55150   | FAH Fumarylac.  | 0.05
M27891   | CST3 Cystatin C | 0.01

Generalization accuracy:

Dataset   | B. probit | "indifference"
Colon     | 84%       | 74% to 94%
Leukaemia | 88%       | 91% to 96%

No "better" results in literature → confirms model.

Biology confirms Uroguanylin (cell apoptosis) as important in colon cancer development.

But: meaning of the probabilities?
Discussion

Quoting Bernardo & Smith: this is M-closed model selection with zero-one utility.

Our approach should assume an M-open scenario. Under asymptotic normality, P(M_i | D) degenerates on the model that minimizes the Kullback-Leibler divergence to the data generating distribution.

If the predictive distribution of a new observation is of interest, B&S suggest to use a logarithmic score function for M-open model comparison:

E[log p(y_new | x_new, D, M_i)]

(e.g. cross validation estimate, still to be done)
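The logarithmic score for M-open comparison is simply the average log predictive density on held-out data; a minimal sketch comparing two hypothetical Gaussian predictive distributions (data and models made up for illustration):

```python
from math import log, pi

def log_score(ys, mu, sigma):
    """Average log predictive density of held-out observations under a
    Gaussian predictive N(mu, sigma^2); higher is better."""
    return sum(-0.5 * log(2 * pi * sigma**2)
               - (y - mu)**2 / (2 * sigma**2) for y in ys) / len(ys)

held_out = [0.1, -0.3, 0.2, 0.05, -0.1]
# Model A predicts N(0, 0.25^2); model B is badly over-dispersed, N(0, 2^2).
score_a = log_score(held_out, 0.0, 0.25)
score_b = log_score(held_out, 0.0, 2.0)
print(round(score_a, 3), round(score_b, 3))  # the sharper model scores higher
```

Unlike zero-one utility, the log score rewards calibration of the whole predictive distribution, which is exactly why it suits the M-open setting.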
A simple idea: the world is one probabilistic model.

- Applications often require hierarchical structure: a feature extraction part and a probabilistic model.
- Classical approach: treat both parts separately and thus regard features as a sufficient statistic of the data. → Features are deterministic variables.
- Our suggestion: treat such hierarchical settings as one probabilistic model. → Feature extraction is a representation in a latent space.
Bayes' Consistent Models

[DAG: data X_a, X_b with indicators I_a, I_b and feature variables φ_a, φ_b feeding target t.]

Expected utility requires to integrate over all unknown variables, including φ_a, φ_b, I_a and I_b, that represent a feature space.

[Figure "Probabilistic sensor fusion": posterior probabilities P(2) for latent features vs. conditioning, plotted against the precision of p(φ_a|X_a) on a logarithmic scale.]

Decisions depend on (un)certainty and may thus change.
Time Series Classification

ROC Curves
[Figure: sensitivity vs. 1 - specificity for the Bayes and cond. models on Navig. vs. Aud., Left vs. Right, Spindle and Synthetic.]

Kullback Leibler Divergence
[Figure: empirical cdfs of D(P(t|x)||P(t)) in bit for the same four tasks, Bayes vs. cond.]
More Results

Expected feature values
[Figure: expected feature values (dim. 1 vs. dim. 2 for Spindle and Navig. vs. Aud.; dim. 2 vs. dim. 3 for Synthetic) under the Bayes and cond. models.]

Kullback Leibler Divergence for "Artefacts"
[Figure: empirical cdfs of D(P(t|x)||P(t)) in bit for correct and wrong classifications, Bayes vs. cond.]
Variational Kalman Filter

The logarithmic model evidence for a window of size W is

log p(y_{n-W+1}, ..., y_n | D_{n-W}) = log ∫ p(w_{n-W} | D_{n-W}) ∏_{i=n-W+1}^{n} p(y_i | w_i) p(w_i | w_{i-1}) dw_{n-W} ... dw_n.

This is not a probabilistic structure! (need Rauch-Tung-Striebel smoother)

Plug in the distributions and integrate over the weights w: the evidence decomposes into one-step-ahead predictive terms log p(y_i | D_{i-1}), each obtained by integrating p(y_i | w_i) against the predicted weight distribution p(w_i | D_{i-1}).
Lower Bounds

By Jensen's inequality, for any distribution q over the unknowns w,

log p(y) = log ∫ p(y, w) dw ≥ ∫ q(w) log [ p(y, w) / q(w) ] dw.

The bound becomes tight when q(w) equals the posterior p(w | y); the gap is the Kullback-Leibler divergence D(q(w) || p(w | y)).

back to vkf
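The Jensen-type bound can be verified numerically on a conjugate example where the exact evidence is available in closed form; the sketch below (a one-observation Gaussian mean model, all numbers illustrative) shows the bound never exceeding the log evidence and becoming tight at the true posterior.

```python
from math import log, pi, e

def elbo(y, m, s2):
    """Variational lower bound on log p(y) for the model
    w ~ N(0, 1), y | w ~ N(w, 1), with variational q(w) = N(m, s2)."""
    eq_loglik = -0.5 * log(2 * pi) - ((y - m) ** 2 + s2) / 2    # E_q log p(y|w)
    eq_logpri = -0.5 * log(2 * pi) - (m ** 2 + s2) / 2          # E_q log p(w)
    entropy = 0.5 * log(2 * pi * e * s2)                        # -E_q log q(w)
    return eq_loglik + eq_logpri + entropy

y = 1.3
log_evidence = -0.5 * log(2 * pi * 2) - y ** 2 / 4   # exact: p(y) = N(y; 0, 2)

# An arbitrary q stays below the evidence; the exact posterior
# q(w) = N(y/2, 1/2) closes the gap completely.
print(round(elbo(y, 0.0, 1.0), 6), round(log_evidence, 6))
print(round(elbo(y, y / 2, 0.5), 6), round(log_evidence, 6))  # equal at optimum
```

In the variational Kalman filter the same construction is applied per window, with the gap measuring how much the factorized posterior approximation loses.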