GP CVPR 2012 Tutorial, Session 1: GP and Regression
Posted on 02-Jun-2018
TRANSCRIPT
8/11/2019 Gp Cvpr12 Session1
Book

Urtasun and Lawrence, Session 1: GP and Regression, CVPR Tutorial (slide 2 / 74)
Outline

1 The Gaussian Density
2 Covariance from Basis Functions
3 Basis Function Representations
4 Constructing Covariance
5 GP Limitations
6 Conclusions
The Gaussian Density

Perhaps the most common probability density:

p(y \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right) = \mathcal{N}(y \mid \mu, \sigma^2)

The Gaussian density.
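The density above is easy to check numerically. A minimal NumPy sketch (not from the slides; the height example values are taken from the next slide):

```python
import numpy as np

def gaussian_density(y, mu, sigma2):
    """Evaluate the univariate Gaussian density N(y | mu, sigma^2)."""
    return np.exp(-(y - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

# Height example from the next slide: mu = 1.7 m, variance sigma^2 = 0.0225 m^2.
mu, sigma2 = 1.7, 0.0225
heights = np.linspace(1.0, 2.4, 2001)
density = gaussian_density(heights, mu, sigma2)

# A density integrates to (approximately) one over its support.
area = density.sum() * (heights[1] - heights[0])
```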
Gaussian Density

[Figure: the Gaussian PDF p(h | \mu, \sigma^2) plotted over height h (in metres), with \mu = 1.7 and variance \sigma^2 = 0.0225. Mean shown as red line. It could represent the heights of a population of students.]
Gaussian Density

\mathcal{N}(y \mid \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right)
Two Important Gaussian Properties

1 Sum of (independent) Gaussian variables is also Gaussian.

y_i \sim \mathcal{N}(\mu_i, \sigma_i^2) \implies \sum_{i=1}^{n} y_i \sim \mathcal{N}\left(\sum_{i=1}^{n} \mu_i, \sum_{i=1}^{n} \sigma_i^2\right)

(Aside: as n increases, the sum of independent non-Gaussian, finite-variance variables also becomes Gaussian [central limit theorem].)

2 Scaling a Gaussian leads to a Gaussian.

y \sim \mathcal{N}(\mu, \sigma^2) \implies wy \sim \mathcal{N}(w\mu, w^2\sigma^2)
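Both properties can be verified by sampling. A small NumPy sketch with illustrative (assumed) parameter values:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Two independent Gaussians with illustrative (assumed) parameters.
y1 = rng.normal(1.0, 2.0, size=n)   # N(1, 4)
y2 = rng.normal(-2.0, 3.0, size=n)  # N(-2, 9)

# Property 1: the sum has mean 1 + (-2) = -1 and variance 4 + 9 = 13.
s = y1 + y2

# Property 2: scaling by w gives mean w * mu and variance w^2 * sigma^2.
w = 0.5
scaled = w * y1  # should be close to N(0.5, 1)
```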
Two Simultaneous Equations

A system of two simultaneous equations with two unknowns:

y_1 = m x_1 + c
y_2 = m x_2 + c
Two Simultaneous Equations

Subtracting one equation from the other:

y_1 - y_2 = m (x_1 - x_2)
Two Simultaneous Equations

\frac{y_1 - y_2}{x_1 - x_2} = m
Two Simultaneous Equations

m = \frac{y_2 - y_1}{x_2 - x_1}, \quad c = y_1 - m x_1

[Figure: the line y = mx + c through (x_1, y_1) and (x_2, y_2), with the intercept c and slope m = (y_2 - y_1)/(x_2 - x_1) marked.]
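The closed-form solution above can be sketched in a few lines (the two points used here are hypothetical):

```python
def fit_line_two_points(x1, y1, x2, y2):
    """Solve the pair y = m*x + c exactly from two points."""
    m = (y2 - y1) / (x2 - x1)
    c = y1 - m * x1
    return m, c

# Hypothetical points for illustration: (1, 3) and (2, 5).
m, c = fit_line_two_points(1.0, 3.0, 2.0, 5.0)
```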
Two Simultaneous Equations

How do we deal with three simultaneous equations with only two unknowns?

y_1 = m x_1 + c
y_2 = m x_2 + c
y_3 = m x_3 + c

[Figure: the same line plot; a third point will not, in general, lie on the line through the first two.]
Overdetermined System

With two unknowns and two observations:

y_1 = m x_1 + c
y_2 = m x_2 + c

An additional observation leads to an overdetermined system:

y_3 = m x_3 + c

This problem is solved through a noise model \epsilon \sim \mathcal{N}(0, \sigma^2):

y_1 = m x_1 + c + \epsilon_1
y_2 = m x_2 + c + \epsilon_2
y_3 = m x_3 + c + \epsilon_3
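Under Gaussian noise, maximum likelihood for this model is ordinary least squares. A sketch on hypothetical data (the true slope, intercept, and noise level below are assumed for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: noisy observations of y = 2x + 1.
true_m, true_c, sigma = 2.0, 1.0, 0.1
x = np.array([0.0, 1.0, 2.0, 3.0])
y = true_m * x + true_c + rng.normal(0.0, sigma, size=x.shape)

# Design matrix with a column of ones for the intercept c.
X = np.column_stack([x, np.ones_like(x)])

# Least squares: the maximum likelihood estimate under Gaussian noise.
(m_hat, c_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
```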
Noise Models

We aren't modelling the entire system.
The noise model captures the mismatch between model and data.
The Gaussian model is justified by appeal to the central limit theorem; other models are also possible (e.g. Student-t for heavy tails).
Maximum likelihood with Gaussian noise leads to least squares.
Underdetermined System

What about two unknowns and one observation?

y_1 = m x_1 + c

[Figure: a single observed point in the (x, y) plane; many lines pass through it.]
Underdetermined System

Can compute m given c:

m = \frac{y_1 - c}{x_1}
Underdetermined System

Each assumed value of the intercept c gives a different slope m, e.g.:

c = 1.75 \implies m = 1.25
c = -0.777 \implies m = 3.78
c = -4.01 \implies m = 7.01
c = 2.45 \implies m = 0.545
c = -0.657 \implies m = 3.66
c = -3.13 \implies m = 6.13

[Figure: each (c, m) pair corresponds to a different line through the single observed point.]

Can compute m given c. Assume

c \sim \mathcal{N}(0, 4),

then we find a distribution of solutions.
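This can be simulated: draw c from its prior and solve for m. The observed point (x_1, y_1) = (1, 3) below is an assumed value for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical single observation for illustration.
x1, y1 = 1.0, 3.0

# Prior on the intercept: c ~ N(0, 4), i.e. standard deviation 2.
c_samples = rng.normal(0.0, 2.0, size=100_000)

# Each sampled c implies a slope m = (y1 - c) / x1, so one observation
# induces a whole distribution over lines.
m_samples = (y1 - c_samples) / x1
```

Because m is a linear function of a Gaussian c, the implied distribution over m is itself Gaussian, here with mean y1/x1 = 3 and variance 4/x1^2 = 4.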
Probability for Under- and Overdetermined

To deal with the overdetermined system we introduced a probability distribution for a variable, \epsilon_i.
For the underdetermined system we introduced a probability distribution for a parameter, c.
This is known as a Bayesian treatment.
For general Bayesian inference we need multivariate priors. E.g. for multivariate linear regression:

y_i = \sum_{j} w_j x_{i,j} + \epsilon_i = \mathbf{w}^\top \mathbf{x}_{i,:} + \epsilon_i

(where we've dropped c for convenience), we need a prior over \mathbf{w}. This motivates a multivariate Gaussian density.

We will use the multivariate Gaussian to put a prior directly on the function (a Gaussian process).
Multivariate Regression Likelihood

Recall the multivariate regression likelihood:

p(\mathbf{y} \mid \mathbf{X}, \mathbf{w}) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{n} \left(y_i - \mathbf{w}^\top \mathbf{x}_{i,:}\right)^2\right)

Now use a multivariate Gaussian prior:

p(\mathbf{w}) = \frac{1}{(2\pi\alpha)^{p/2}} \exp\left(-\frac{1}{2\alpha} \mathbf{w}^\top \mathbf{w}\right)
Posterior Density

Once again we want to know the posterior:

p(\mathbf{w} \mid \mathbf{y}, \mathbf{X}) \propto p(\mathbf{y} \mid \mathbf{X}, \mathbf{w})\, p(\mathbf{w})

And we can compute it by completing the square:

\log p(\mathbf{w} \mid \mathbf{y}, \mathbf{X}) = -\frac{1}{2\sigma^2}\sum_{i=1}^{n} y_i^2 + \frac{1}{\sigma^2}\sum_{i=1}^{n} y_i \mathbf{x}_{i,:}^\top \mathbf{w} - \frac{1}{2\sigma^2}\sum_{i=1}^{n} \mathbf{w}^\top \mathbf{x}_{i,:}\mathbf{x}_{i,:}^\top \mathbf{w} - \frac{1}{2\alpha}\mathbf{w}^\top\mathbf{w} + \text{const.}

p(\mathbf{w} \mid \mathbf{y}, \mathbf{X}) = \mathcal{N}(\mathbf{w} \mid \boldsymbol{\mu}_w, \mathbf{C}_w)

\mathbf{C}_w = \left(\sigma^{-2}\mathbf{X}^\top\mathbf{X} + \alpha^{-1}\mathbf{I}\right)^{-1} \quad \text{and} \quad \boldsymbol{\mu}_w = \mathbf{C}_w \sigma^{-2}\mathbf{X}^\top\mathbf{y}
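The posterior covariance and mean can be computed directly from these formulas. A sketch on hypothetical data (w_true, sigma2 and alpha are assumed values for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical regression problem.
n, p = 50, 2
X = rng.normal(size=(n, p))
w_true = np.array([1.0, -0.5])
sigma2, alpha = 0.01, 1.0  # noise variance and prior variance
y = X @ w_true + rng.normal(0.0, np.sqrt(sigma2), size=n)

# Posterior from the slide: C_w = (X^T X / sigma^2 + I / alpha)^(-1),
# mu_w = C_w X^T y / sigma^2.
C_w = np.linalg.inv(X.T @ X / sigma2 + np.eye(p) / alpha)
mu_w = C_w @ X.T @ y / sigma2
```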
Bayesian vs Maximum Likelihood

Note the similarity between the posterior mean

\boldsymbol{\mu}_w = \left(\sigma^{-2}\mathbf{X}^\top\mathbf{X} + \alpha^{-1}\mathbf{I}\right)^{-1} \sigma^{-2}\mathbf{X}^\top\mathbf{y}

and the maximum likelihood solution

\hat{\mathbf{w}} = \left(\mathbf{X}^\top\mathbf{X}\right)^{-1}\mathbf{X}^\top\mathbf{y}
Marginal Likelihood is Computed as Normalizer

p(\mathbf{w} \mid \mathbf{y}, \mathbf{X})\, p(\mathbf{y} \mid \mathbf{X}) = p(\mathbf{y} \mid \mathbf{w}, \mathbf{X})\, p(\mathbf{w})
Two Dimensional Gaussian

Consider height, h/m, and weight, w/kg.

Could sample height from a distribution:

p(h) = \mathcal{N}(1.7, 0.0225)

And similarly weight:

p(w) = \mathcal{N}(75, 36)
Height and Weight Models

[Figure: marginal Gaussian distributions p(h) over height h/m and p(w) over weight w/kg.]
Sampling Two Dimensional Variables: Marginal Distributions

[Figure: joint distribution of w/kg against h/m, with the marginal distributions p(h) and p(w) alongside.]

Sample height and weight one after the other and plot against each other.
Sampling Two Dimensional Variables: Marginal Distributions

[Figure: joint distribution of w/kg against h/m, with the marginal distributions p(h) and p(w) alongside.]
Independent Gaussians

p(w, h) = p(w)\, p(h)
Independent Gaussians

p(w, h) = \frac{1}{\sqrt{2\pi\sigma_1^2}\sqrt{2\pi\sigma_2^2}} \exp\left(-\frac{1}{2}\left(\frac{(w - \mu_1)^2}{\sigma_1^2} + \frac{(h - \mu_2)^2}{\sigma_2^2}\right)\right)
Independent Gaussians

p(w, h) = \frac{1}{2\pi\sqrt{\sigma_1^2\sigma_2^2}} \exp\left(-\frac{1}{2} \left(\begin{pmatrix} w \\ h \end{pmatrix} - \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}\right)^{\!\top} \begin{pmatrix} \sigma_1^2 & 0 \\ 0 & \sigma_2^2 \end{pmatrix}^{-1} \left(\begin{pmatrix} w \\ h \end{pmatrix} - \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}\right)\right)
Independent Gaussians

p(\mathbf{y}) = \frac{1}{2\pi|\mathbf{D}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^\top \mathbf{D}^{-1} (\mathbf{y} - \boldsymbol{\mu})\right)
Correlated Gaussian

Form a correlated Gaussian from the original by rotating the data space using a rotation matrix \mathbf{R}:

p(\mathbf{y}) = \frac{1}{2\pi|\mathbf{D}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{R}^\top\mathbf{y} - \mathbf{R}^\top\boldsymbol{\mu})^\top \mathbf{D}^{-1} (\mathbf{R}^\top\mathbf{y} - \mathbf{R}^\top\boldsymbol{\mu})\right)
Correlated Gaussian

p(\mathbf{y}) = \frac{1}{2\pi|\mathbf{D}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^\top \mathbf{R}\mathbf{D}^{-1}\mathbf{R}^\top (\mathbf{y} - \boldsymbol{\mu})\right)

This gives a covariance matrix:

\mathbf{C}^{-1} = \mathbf{R}\mathbf{D}^{-1}\mathbf{R}^\top
Correlated Gaussian

p(\mathbf{y}) = \frac{1}{2\pi|\mathbf{C}|^{1/2}} \exp\left(-\frac{1}{2}(\mathbf{y} - \boldsymbol{\mu})^\top \mathbf{C}^{-1} (\mathbf{y} - \boldsymbol{\mu})\right)

This gives a covariance matrix:

\mathbf{C} = \mathbf{R}\mathbf{D}\mathbf{R}^\top
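The construction C = R D R^T can be checked numerically; the diagonal matrix D and the rotation angle below are arbitrary illustrative choices:

```python
import numpy as np

# Independent (diagonal) covariance D and a rotation matrix R.
D = np.diag([4.0, 0.25])
theta = np.pi / 6  # arbitrary rotation angle for illustration
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

# Rotating the data space correlates the variables: C = R D R^T.
C = R @ D @ R.T
```

Rotation preserves volume, so C has the same determinant (and eigenvalues) as D, but a nonzero off-diagonal entry: the variables are now correlated.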
Recall Univariate Gaussian Properties

1 Sum of Gaussian variables is also Gaussian.

y_i \sim \mathcal{N}(\mu_i, \sigma_i^2) \implies \sum_{i=1}^{n} y_i \sim \mathcal{N}\left(\sum_{i=1}^{n} \mu_i, \sum_{i=1}^{n} \sigma_i^2\right)

2 Scaling a Gaussian leads to a Gaussian.

y \sim \mathcal{N}(\mu, \sigma^2) \implies wy \sim \mathcal{N}(w\mu, w^2\sigma^2)
Multivariate Consequence

If

\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})

and

\mathbf{y} = \mathbf{W}\mathbf{x}

then

\mathbf{y} \sim \mathcal{N}(\mathbf{W}\boldsymbol{\mu}, \mathbf{W}\boldsymbol{\Sigma}\mathbf{W}^\top)
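A short sketch of this affine property (mu, Sigma, and W are arbitrary illustrative values):

```python
import numpy as np

# If x ~ N(mu, Sigma) and y = W x, then y ~ N(W mu, W Sigma W^T).
mu = np.array([1.0, -1.0])
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
W = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, -1.0]])  # maps 2-d x to 3-d y

y_mean = W @ mu
y_cov = W @ Sigma @ W.T
```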
Sampling a Function

Multi-variate Gaussians

We will consider a Gaussian with a particular structure of covariance matrix.
Generate a single sample from this 25 dimensional Gaussian distribution, \mathbf{f} = [f_1, f_2, \ldots, f_{25}].
We will plot these points against their index.
Gaussian Distribution Sample

[Figure: (a) a 25 dimensional correlated random variable (values plotted against index); (b) colormap showing the correlations between dimensions.]

Figure: A sample from a 25 dimensional Gaussian distribution.
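A sample like the one in the figure can be generated as follows; the input locations and lengthscale are assumptions, since the slide does not state them:

```python
import numpy as np

rng = np.random.default_rng(4)

# 25 equally spaced inputs; the lengthscale ell = 0.3 is an assumption,
# chosen only to give a visibly smooth sample.
x = np.linspace(-1.0, 1.0, 25)
ell = 0.3
K = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * ell ** 2))

# One sample f = [f_1, ..., f_25] from N(0, K); the jitter term keeps
# the Cholesky factorization numerically stable.
L = np.linalg.cholesky(K + 1e-6 * np.eye(25))
f = L @ rng.normal(size=25)
```

Plotting f against its index produces a smooth-looking curve because nearby dimensions are highly correlated under this covariance.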
Prediction of f_2 from f_1

\mathbf{K} = \begin{pmatrix} 1 & 0.96587 \\ 0.96587 & 1 \end{pmatrix}

[Figure: a single contour of the joint Gaussian density over (f_1, f_2).]

The single contour of the Gaussian density represents the joint distribution, p(f_1, f_2).
We observe that f_1 = 0.313.
Conditional density: p(f_2 \mid f_1 = 0.313).
Prediction with Correlated Gaussians

Prediction of f_2 from f_1 requires the conditional density.
The conditional density is also Gaussian:

p(f_2 \mid f_1) = \mathcal{N}\!\left(f_2 \,\middle|\, \frac{k_{1,2}}{k_{1,1}} f_1,\; k_{2,2} - \frac{k_{1,2}^2}{k_{1,1}}\right)

where the covariance of the joint density is given by

\mathbf{K} = \begin{pmatrix} k_{1,1} & k_{1,2} \\ k_{2,1} & k_{2,2} \end{pmatrix}
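The conditional mean and variance above, evaluated with the joint covariance from the earlier slide and the observed f_1 = 0.313:

```python
# Joint covariance of (f_1, f_2) from the earlier slide.
k11, k12, k22 = 1.0, 0.96587, 1.0
f1 = 0.313  # the observed value

# p(f_2 | f_1) = N(f_2 | (k12 / k11) * f1, k22 - k12^2 / k11)
cond_mean = (k12 / k11) * f1
cond_var = k22 - k12 ** 2 / k11
```

The strong correlation (0.96587) leaves only a small conditional variance: observing f_1 tells us a lot about f_2.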
Prediction of f_5 from f_1

\mathbf{K} = \begin{pmatrix} 1 & 0.57375 \\ 0.57375 & 1 \end{pmatrix}

[Figure: a single contour of the joint Gaussian density over (f_1, f_5).]

The single contour of the Gaussian density represents the joint distribution, p(f_1, f_5).
We observe that f_1 = 0.313.
Conditional density: p(f_5 \mid f_1 = 0.313).
Prediction with Correlated Gaussians

Prediction of \mathbf{f}_* from \mathbf{f} requires the multivariate conditional density.
The multivariate conditional density is also Gaussian:

p(\mathbf{f}_* \mid \mathbf{f}) = \mathcal{N}\!\left(\mathbf{f}_* \,\middle|\, \mathbf{K}_{*,\mathbf{f}} \mathbf{K}_{\mathbf{f},\mathbf{f}}^{-1} \mathbf{f},\; \mathbf{K}_{*,*} - \mathbf{K}_{*,\mathbf{f}} \mathbf{K}_{\mathbf{f},\mathbf{f}}^{-1} \mathbf{K}_{\mathbf{f},*}\right)

Here the covariance of the joint density is given by

\mathbf{K} = \begin{pmatrix} \mathbf{K}_{\mathbf{f},\mathbf{f}} & \mathbf{K}_{\mathbf{f},*} \\ \mathbf{K}_{*,\mathbf{f}} & \mathbf{K}_{*,*} \end{pmatrix}
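The same formula as a small sketch; solving against K_{f,f} rather than forming its inverse is a standard numerical choice. The tiny example reuses the earlier two-variable covariance:

```python
import numpy as np

def gp_conditional(K_ff, K_sf, K_ss, f):
    """Mean and covariance of p(f* | f) for a zero-mean joint Gaussian.

    K_ff: covariance of the observed f; K_sf: cross-covariance K_{*,f};
    K_ss: covariance of the prediction points f*.
    """
    # Solve against K_ff rather than inverting it explicitly.
    A = np.linalg.solve(K_ff, K_sf.T).T   # A = K_{*,f} K_{f,f}^{-1}
    mean = A @ f
    cov = K_ss - A @ K_sf.T
    return mean, cov

# Tiny example reusing the two-variable covariance from earlier slides.
K_ff = np.array([[1.0]])
K_sf = np.array([[0.96587]])
K_ss = np.array([[1.0]])
mean, cov = gp_conditional(K_ff, K_sf, K_ss, np.array([0.313]))
```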
Covariance Functions
Where did this covariance matrix come from?

Exponentiated Quadratic Kernel Function (RBF, Squared Exponential, Gaussian)

k(\mathbf{x}, \mathbf{x}') = \alpha \exp\left(-\frac{\|\mathbf{x} - \mathbf{x}'\|_2^2}{2\ell^2}\right)

The covariance matrix is built using the inputs to the function, \mathbf{x}.
For the example above it was based on Euclidean distance.
The covariance function is also known as a kernel.
Covariance Functions
Where did this covariance matrix come from?

k(x_i, x_j) = α exp( −‖x_i − x_j‖² / (2ℓ²) )

x1 = −3.0, x2 = 1.20, and x3 = 1.40 with ℓ = 2.00 and α = 1.00.

Filling in entry by entry, e.g.

k1,2 = 1.00 × exp( −(−3.0 − 1.20)² / (2 × 2.00²) ) = 0.110,

gives the full covariance matrix

    1.00    0.110   0.0889
    0.110   1.00    0.995
    0.0889  0.995   1.00

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 36 / 74
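The worked example above is easy to reproduce. This is a minimal NumPy sketch (not code from the tutorial; the function name `exp_quad` is my own choice):

```python
import numpy as np

def exp_quad(x, alpha=1.0, ell=2.0):
    """Exponentiated quadratic covariance, k(x_i, x_j) = alpha * exp(-(x_i - x_j)^2 / (2 ell^2))."""
    x = np.asarray(x, dtype=float)
    sq_dist = (x[:, None] - x[None, :]) ** 2
    return alpha * np.exp(-sq_dist / (2.0 * ell**2))

# The three inputs from the slide, with ell = 2.00 and alpha = 1.00.
K = exp_quad([-3.0, 1.20, 1.40])
```

Rounding `K` to three figures reproduces the matrix on the slide.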
Covariance Functions
Where did this covariance matrix come from?

k(x_i, x_j) = α exp( −‖x_i − x_j‖² / (2ℓ²) )

x1 = −3, x2 = 1.2, x3 = 1.4, and x4 = 2.0 with ℓ = 2.0 and α = 1.0.

Adding a fourth input extends the same matrix, e.g.

k4,1 = 1.0 × exp( −(2.0 − (−3))² / (2 × 2.0²) ) = 0.044,

giving

    1.0    0.11   0.089  0.044
    0.11   1.0    1.0    0.92
    0.089  1.0    1.0    0.96
    0.044  0.92   0.96   1.0

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 36 / 74
Covariance Functions
Where did this covariance matrix come from?

k(x_i, x_j) = α exp( −‖x_i − x_j‖² / (2ℓ²) )

x1 = −3.0, x2 = 1.20, and x3 = 1.40 with ℓ = 5.00 and α = 4.00.

A larger amplitude and length scale, e.g.

k3,2 = 4.00 × exp( −(1.40 − 1.20)² / (2 × 5.00²) ) = 4.00 (to 3 s.f.),

give

    4.00  2.81  2.72
    2.81  4.00  4.00
    2.72  4.00  4.00

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 36 / 74
Outline

1 The Gaussian Density
2 Covariance from Basis Functions
3 Basis Function Representations
4 Constructing Covariance
5 GP Limitations
6 Conclusions

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 37 / 74
Basis Function Form

Radial basis functions commonly have the form

φ_k(x) = exp( −|x − μ_k|² / (2ℓ²) ).

The basis function maps data into a feature space in which a linear sum is a nonlinear function.

Figure: A set of radial basis functions with width ℓ = 2 and location parameters μ = [−4 0 4]ᵀ.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 38 / 74
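The basis set in the figure can be evaluated on a grid with NumPy; a sketch (grid resolution is my own choice):

```python
import numpy as np

# Width ell = 2 and centres mu = [-4, 0, 4], matching the figure.
ell = 2.0
centres = np.array([-4.0, 0.0, 4.0])
x = np.linspace(-8, 8, 161)

# Phi[i, k] = phi_k(x_i) = exp(-|x_i - mu_k|^2 / (2 ell^2))
Phi = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * ell**2))
```

Each column of `Phi` is one bump, peaking at 1 at its centre and decaying away from it.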
Random Functions

Functions derived using

f(x) = Σ_{k=1}^m w_k φ_k(x),

where the weights w are sampled from a Gaussian density,

w_k ∼ N(0, α).

Figure: Functions sampled using the basis set from figure 2. Each line is a separate sample, generated by a weighted sum of the basis set. The weights, w, are sampled from a Gaussian density with variance α = 1.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 40 / 74
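Sampling such random functions is one matrix product: draw weights, multiply by the basis matrix. A self-contained sketch (seed and sample count are my own choices):

```python
import numpy as np

rng = np.random.default_rng(0)
ell, alpha = 2.0, 1.0
centres = np.array([-4.0, 0.0, 4.0])
x = np.linspace(-8, 8, 161)

# Basis matrix: Phi[i, k] = exp(-(x_i - mu_k)^2 / (2 ell^2)).
Phi = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * ell**2))

# Sample weights w_k ~ N(0, alpha) and form f(x) = sum_k w_k phi_k(x).
W = rng.normal(0.0, np.sqrt(alpha), size=(centres.size, 10))
F = Phi @ W  # each column is one sampled function
```

With many samples, the variance of f at any input matches α Σ_k φ_k(x)², foreshadowing the covariance construction on the next slides.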
Direct Construction of Covariance Matrix

Use matrix notation to write the function,

f(x_i; w) = Σ_{k=1}^m w_k φ_k(x_i),

which, computed at the training data, gives a vector

f = Φw.

w and f are related only by an inner product.
Φ is fixed and non-stochastic for a given training set.
f is Gaussian distributed.
It is straightforward to compute the distribution for f.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 41 / 74
Expectations

We use ⟨·⟩ to denote expectations under prior distributions. We have

⟨f⟩ = Φ⟨w⟩.

The prior mean of w was zero, giving

⟨f⟩ = 0.

The prior covariance of f is

K = ⟨ffᵀ⟩ − ⟨f⟩⟨f⟩ᵀ,

where ⟨ffᵀ⟩ = Φ⟨wwᵀ⟩Φᵀ, giving

K = αΦΦᵀ.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 42 / 74
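The identity K = αΦΦᵀ can be checked numerically: the empirical second moment of many sampled f vectors converges to it. A sketch (seed, grid, and sample count are my own choices):

```python
import numpy as np

rng = np.random.default_rng(1)
ell, alpha = 2.0, 1.0
centres = np.array([-4.0, 0.0, 4.0])
x = np.linspace(-2, 2, 5)
Phi = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * ell**2))

K = alpha * Phi @ Phi.T                       # implied prior covariance
W = rng.normal(0.0, np.sqrt(alpha), size=(centres.size, 200000))
F = Phi @ W                                   # samples of f = Phi w
K_emp = (F @ F.T) / W.shape[1]                # empirical second moment (prior mean is zero)
```

Note that K here is 5 × 5 but built from only m = 3 basis functions, so its rank is at most 3 — a point the "degenerate covariance" slides return to later.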
Covariance between Two Points

The prior covariance between two points x_i and x_j is

k(x_i, x_j) = α Σ_{k=1}^m φ_k(x_i) φ_k(x_j),

or in vector form

k(x_i, x_j) = α φ(x_i)ᵀ φ(x_j).

For the radial basis used this gives

k(x_i, x_j) = α Σ_{k=1}^m exp( −(|x_i − μ_k|² + |x_j − μ_k|²) / (2ℓ²) ).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 43 / 74
Selecting Number and Location of Basis

Need to choose
1 location of centres
2 number of basis functions

Consider uniform spacing over a region. Expanding the squares,

k(x_i, x_j) = α Σ_{k=1}^m exp( −(x_i² + x_j² − 2μ_k(x_i + x_j) + 2μ_k²) / (2ℓ²) ).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 44 / 74
Uniform Basis Functions

Set each centre location to

μ_k = a + Δμ·(k − 1).

Specifying the bases in terms of their indices,

k(x_i, x_j) = α Σ_{k=1}^m exp( −(x_i² + x_j²) / (2ℓ²) + (2(a + Δμ·(k−1))(x_i + x_j) − 2(a + Δμ·(k−1))²) / (2ℓ²) ).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 45 / 74
Infinite Basis Functions

Take μ_1 = a and μ_m = b, so b = a + Δμ·(m − 1).

Take the limit as Δμ → 0, so m → ∞. Completing the square in μ,

k(x_i, x_j) = α′ ∫_a^b exp( −(x_i² + x_j²)/(2ℓ²) + (2(½(x_i + x_j))² − 2(μ − ½(x_i + x_j))²)/(2ℓ²) ) dμ,

where we have used μ_k → μ and α′ absorbs the scaling by Δμ.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 46 / 74
Result

Performing the integration leads to

k(x_i, x_j) = α′ (√π ℓ / 2) exp( −(x_i − x_j)² / (4ℓ²) ) × [ erf( (b − ½(x_i + x_j)) / ℓ ) − erf( (a − ½(x_i + x_j)) / ℓ ) ].

Now take the limit as a → −∞ and b → ∞:

k(x_i, x_j) = α exp( −(x_i − x_j)² / (4ℓ²) ),

where α = α′ √(πℓ²).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 47 / 74
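The limit can be sanity-checked numerically: a dense, wide, uniform grid of RBF bases, scaled by the spacing Δμ, should reproduce the exponentiated quadratic with α = α′√(πℓ²). A sketch (grid extent and resolution are my own choices):

```python
import numpy as np

ell, alpha_p = 2.0, 1.0                 # alpha_p plays the role of alpha' above
a, b, m = -100.0, 100.0, 20001
mu = np.linspace(a, b, m)
dmu = mu[1] - mu[0]

def k_sum(xi, xj):
    """Sum over a dense uniform grid of RBF bases, scaled by the spacing dmu."""
    phi_i = np.exp(-(xi - mu) ** 2 / (2.0 * ell**2))
    phi_j = np.exp(-(xj - mu) ** 2 / (2.0 * ell**2))
    return alpha_p * dmu * np.sum(phi_i * phi_j)

def k_limit(xi, xj):
    """The limiting exponentiated quadratic, alpha = alpha' * sqrt(pi * ell^2)."""
    return alpha_p * np.sqrt(np.pi * ell**2) * np.exp(-(xi - xj) ** 2 / (4.0 * ell**2))
```

For inputs well inside [a, b] the two agree to high precision.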
Infinite Feature Space

An RBF model with infinite basis functions is a Gaussian process.
The covariance function is the exponentiated quadratic.
Note: the functional forms of the covariance function and the basis functions are similar.
This is a special case; in general they are very different.
Similar results can be obtained for multi-dimensional input networks (Williams, 1998).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 48 / 74
The Parametric Bottleneck

Parametric models have a representation that does not respond to increasing training set size.
Bayesian posterior distributions over parameters contain the information about the training data.
Use Bayes' rule on the training data to obtain p(w|y, X), then make predictions on the test data,

p(y*|X*, y, X) = ∫ p(y*|w, X*) p(w|y, X) dw.

w becomes a bottleneck for information about the training set to pass to the test set.
Solution: increase m so that the bottleneck is so large that it no longer presents a problem.
How big is big enough for m? Non-parametrics says m → ∞.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 50 / 74
The Parametric Bottleneck

It is now no longer possible to manipulate the model through the standard parametric form given in (1).
However, it is possible to express the parametric model as a GP:

k(x_i, x_j) = α φ(x_i)ᵀ φ(x_j).

These are known as degenerate covariance matrices.
Their rank is at most m; non-parametric models have full-rank covariance matrices.
Most well known is the linear kernel, k(x_i, x_j) = x_iᵀ x_j.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 51 / 74
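The rank statement is easy to see numerically: a Gram matrix built from m basis functions has rank at most m, however many inputs there are, while the exponentiated quadratic on the same inputs has higher (numerically, not necessarily full) rank. A sketch with my own choice of grid:

```python
import numpy as np

ell, alpha = 2.0, 1.0
centres = np.array([-4.0, 0.0, 4.0])       # m = 3 basis functions
x = np.linspace(-6, 6, 50)                 # n = 50 inputs

Phi = np.exp(-(x[:, None] - centres[None, :]) ** 2 / (2.0 * ell**2))
K_deg = alpha * Phi @ Phi.T                # degenerate: rank at most m = 3

# Compare with the exponentiated quadratic on the same inputs.
K_full = alpha * np.exp(-(x[:, None] - x[None, :]) ** 2 / (2.0 * ell**2))
```

(In floating point even K_full has limited *numerical* rank, because the exponentiated quadratic's eigenvalues decay quickly; but it comfortably exceeds m.)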
Making Predictions

For non-parametrics, prediction at new points f* is made by conditioning on f in the joint distribution.
In GPs this involves combining the training data with the covariance function and the mean function.
Parametric is a special case in which the conditional prediction can be summarized in a fixed number of parameters.
The complexity of a parametric model remains fixed regardless of the size of our training data set.
For a non-parametric model the required number of parameters grows with the size of the training data.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 52 / 74
Covariance Functions

RBF Basis Functions

k(x, x′) = α φ(x)ᵀ φ(x′)

φ_i(x) = exp( −‖x − μ_i‖² / (2ℓ²) )

μ = [−1 0 1]ᵀ

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 53 / 74
Covariance Functions and Mercer Kernels

Mercer kernels and covariance functions are similar; the kernel perspective does not make a probabilistic interpretation of the covariance function.
Algorithms can be simpler, but a probabilistic interpretation is crucial for kernel parameter optimization.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 54 / 74
Constructing Covariance Functions

The sum of two covariance functions is also a covariance function:

k(x, x′) = k₁(x, x′) + k₂(x, x′).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 56 / 74
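The closure property is easy to check on a Gram matrix: the sum of two valid covariance matrices stays symmetric and positive semi-definite. A sketch combining an exponentiated quadratic with a linear kernel (parameter values are my own):

```python
import numpy as np

x = np.linspace(-1, 1, 30)

def exp_quad(xi, xj, alpha=1.0, ell=0.3):
    return alpha * np.exp(-(xi[:, None] - xj[None, :]) ** 2 / (2.0 * ell**2))

def linear(xi, xj, alpha=1.0):
    return alpha * xi[:, None] * xj[None, :]

# k = k1 + k2 is again a covariance: the Gram matrix stays symmetric PSD.
K = exp_quad(x, x) + linear(x, x)
eigs = np.linalg.eigvalsh(K)
```

Samples from N(0, K) then exhibit both behaviours: a smooth wiggle riding on a linear trend.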
Covariance Functions

MLP Covariance Function

k(x, x′) = α asin( (w xᵀx′ + b) / √( (w xᵀx + b + 1)(w x′ᵀx′ + b + 1) ) )

Based on the infinite neural network model.

w = 40, b = 4

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 59 / 74
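For scalar inputs the MLP covariance above is a few lines of NumPy; a sketch using the slide's parameter values (the function name is my own, and the arcsine argument is bounded in magnitude by 1 by a Cauchy–Schwarz argument, so `arcsin` is always defined):

```python
import numpy as np

def mlp_cov(xi, xj, alpha=1.0, w=40.0, b=4.0):
    """MLP (arcsine) covariance for scalar inputs, from the infinite neural network limit."""
    num = w * xi[:, None] * xj[None, :] + b
    den = np.sqrt((w * xi[:, None] ** 2 + b + 1.0) * (w * xj[None, :] ** 2 + b + 1.0))
    return alpha * np.arcsin(num / den)
```

Sampling from N(0, K) with this K gives the sigmoid-like functions shown on the slide.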
Covariance Functions

Linear Covariance Function

k(x, x′) = α xᵀx′

Bayesian linear regression.

α = 1

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 60 / 74
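Sampling from a GP with the linear covariance makes the connection to Bayesian linear regression concrete: every draw is a straight line through the origin, f(x) = wx with w ∼ N(0, α). A sketch (the jitter value and seed are my own choices; the jitter only makes the Cholesky factorisation numerically stable):

```python
import numpy as np

rng = np.random.default_rng(4)
alpha = 1.0
x = np.linspace(-1, 1, 50)
K = alpha * np.outer(x, x) + 1e-10 * np.eye(x.size)   # jitter for the Cholesky

# Samples from N(0, K) are straight lines f(x) = w x with w ~ N(0, alpha).
L = np.linalg.cholesky(K)
f = L @ rng.normal(size=(x.size, 3))
```

Each column of `f` deviates from a perfect line only by the tiny jitter.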
Gaussian Process Interpolation

Figure: Real example: BACCO (see e.g. Oakley and O'Hagan, 2002). Interpolation through outputs from slow computer simulations (e.g. atmospheric carbon levels).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 61 / 74
Noise Models

Graph of a GP

Relates input variables, X, to the vector, y, through f given the kernel parameters θ.
Plate notation indicates the independence of y_i|f_i.
The noise model, p(y_i|f_i), can take several forms.
Simplest is Gaussian noise.

Figure: The Gaussian process depicted graphically.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 62 / 74
Gaussian Process Regression

Figure: Examples include WiFi localization, C14 calibration curve.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 64 / 74
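With Gaussian noise, regression reduces to conditioning a joint Gaussian, and the standard Cholesky-based implementation (see e.g. Rasmussen and Williams, 2006, Algorithm 2.1) fits in a few lines. A sketch on toy data of my own making:

```python
import numpy as np

rng = np.random.default_rng(5)

def exp_quad(A, B, alpha=1.0, ell=0.5):
    return alpha * np.exp(-(A[:, None] - B[None, :]) ** 2 / (2.0 * ell**2))

# Toy training data: noisy observations of a sine.
X = np.linspace(-2, 2, 15)
y = np.sin(2.0 * X) + 0.1 * rng.normal(size=X.size)
noise = 0.1**2

# GP regression via the Cholesky factorisation.
K = exp_quad(X, X) + noise * np.eye(X.size)
L = np.linalg.cholesky(K)
a = np.linalg.solve(L.T, np.linalg.solve(L, y))

Xs = np.linspace(-2, 2, 101)
Ks = exp_quad(X, Xs)
mean = Ks.T @ a                                          # predictive mean
v = np.linalg.solve(L, Ks)
var = np.diag(exp_quad(Xs, Xs)) - np.sum(v * v, axis=0)  # predictive variance
```

`mean` and `var` give the curve and the error bars shown on the slide's figures; the variance shrinks near observations and grows away from them.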
Learning Covariance Parameters
Can we determine length scales and noise levels from the data?

N(y|0, K) = 1 / ( (2π)^(n/2) |K|^(1/2) ) exp( −yᵀK⁻¹y / 2 )

Taking the log,

log N(y|0, K) = −(n/2) log 2π − (1/2) log|K| − yᵀK⁻¹y / 2,

so, dropping the constant, we minimize the negative log likelihood

E(θ) = (1/2) log|K| + yᵀK⁻¹y / 2.

The parameters are inside the covariance function (matrix),

k_{i,j} = k(x_i, x_j; θ).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 65 / 74
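E(θ) is cheap to evaluate with a Cholesky factor: log|K| = 2 Σ log L_ii and K⁻¹y comes from two triangular solves. A sketch that scans the length scale on toy data (data, grid, and the function name `neg_log_lik` are my own choices):

```python
import numpy as np

rng = np.random.default_rng(6)

def neg_log_lik(theta, X, y):
    """E(theta) = 0.5 log|K| + 0.5 y^T K^{-1} y  (constant n/2 log 2pi dropped)."""
    ell, noise = theta
    K = np.exp(-(X[:, None] - X[None, :]) ** 2 / (2.0 * ell**2)) + noise * np.eye(X.size)
    L = np.linalg.cholesky(K)
    half_logdet = np.sum(np.log(np.diag(L)))          # 0.5 log|K|
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y
    return half_logdet + 0.5 * y @ a

X = np.linspace(-2, 2, 20)
y = np.sin(2.0 * X) + 0.1 * rng.normal(size=X.size)

# Scan the length scale: E(theta) trades data fit against capacity control.
ells = np.logspace(-1, 1, 25)
E = np.array([neg_log_lik((ell, 0.01), X, y) for ell in ells])
best = ells[np.argmin(E)]
```

The scan typically shows the U-shape of the slides that follow: both very short and very long length scales are penalized.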
Capacity control: log|K|

For a diagonal covariance

Λ = [ λ₁ 0 ; 0 λ₂ ],

the determinant is the product of the eigenvalues,

|Λ| = λ₁λ₂.

Rotating the covariance by R leaves it unchanged:

|RΛRᵀ| = λ₁λ₂.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 67 / 74
Data fit: yᵀK⁻¹y / 2

Figure: the data-fit term illustrated on a two-dimensional example, with axes y₁ and y₂ and eigenvalues λ₁, λ₂ of K.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 68 / 74
Learning Covariance Parameters
Can we determine length scales and noise levels from the data?

E(θ) = (1/2) log|K| + yᵀK⁻¹y / 2

Figure: the fitted function (left, against x) and the error E(θ) (right) as the length scale ℓ varies.

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 69 / 74
Gene Expression Example

Data from Della Gatta et al. (2008). Figure from Kalaitzis and Lawrence (2011).

Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 70 / 74
Outline
1 The Gaussian Density
2 Covariance from Basis Functions
3 Basis Function Representations
4 Constructing Covariance
5 GP Limitations
6 Conclusions
Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 71 / 74
Limitations of Gaussian Processes
Inference is O(n³) due to the matrix inverse (in practice, use a Cholesky factorization).

Gaussian processes don't deal well with discontinuities (financial crises, phosphorylation, collisions, edges in images).

The widely used exponentiated quadratic (RBF) covariance can be too smooth in practice (but there are many alternatives!).
Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 72 / 74
Summary
Broad introduction to Gaussian processes.

Started with the Gaussian distribution.

Motivated Gaussian processes through the multivariate density.
Emphasized the role of the covariance (not the mean).
Performs nonlinear regression with error bars.
Parameters of the covariance function (kernel) are easily optimized with maximum likelihood.
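That optimization can be sketched with a standard 1-D optimizer over the log length scale, minimizing the same E(θ) objective from earlier in the session. Everything here (data, kernel, noise level) is illustrative, not from the tutorial:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def rbf(a, b, ell):
    """Exponentiated quadratic covariance between 1-D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

def neg_log_marginal(log_ell, x, y, noise=1e-2):
    """E(theta) = 0.5 log|K| + 0.5 y^T K^{-1} y as a function of log length scale."""
    K = rbf(x, x, np.exp(log_ell)) + noise * np.eye(x.size)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * logdet + 0.5 * y @ np.linalg.solve(K, y)

x = np.linspace(-2.0, 2.0, 25)
y = np.sin(2.0 * x)
res = minimize_scalar(neg_log_marginal, bounds=(np.log(0.1), np.log(10.0)),
                      args=(x, y), method="bounded")
ell_hat = float(np.exp(res.x))  # maximum likelihood length scale
```

Optimizing in log space keeps the length scale positive and tends to make the objective better behaved.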
Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 73 / 74
References I
G. Della Gatta, M. Bansal, A. Ambesi-Impiombato, D. Antonini, C. Missero, and D. di Bernardo. Direct targets of the trp63 transcription factor revealed by a combination of gene expression profiling and reverse engineering. Genome Research, 18(6):939–948, Jun 2008. [URL]. [DOI].

A. A. Kalaitzis and N. D. Lawrence. A simple approach to ranking differentially expressed gene expression time courses through Gaussian process regression. BMC Bioinformatics, 12(180), 2011. [DOI].
R. M. Neal. Bayesian Learning for Neural Networks. Springer, 1996. Lecture Notes in Statistics 118.
J. Oakley and A. O'Hagan. Bayesian inference for the uncertainty distribution of computer model outputs. Biometrika, 89(4):769–784, 2002.

C. E. Rasmussen and C. K. I. Williams. Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2006. [Google Books].

C. K. I. Williams. Computation with infinite neural networks. Neural Computation, 10(5):1203–1216, 1998.
Urtasun and Lawrence () Session 1: GP and Regression CVPR Tutorial 74 / 74