
Page 1

Study of Sparse Online Gaussian Process for Regression
EE645 Final Project, May 2005
Eric Saint Georges

Page 2

Contents

A. Introduction
B. OGP
   1. Definition of Gaussian Process
   2. Sparse Online GP algorithm (OGP)
C. Simulation Results
   1. Comparison with LS-SVM on the Boston Housing data set (batch)
   2. Time Series Prediction using OGP
   3. Optical Beam Position Optimization
D. Conclusion

Page 3

Introduction

Possible application of OGP to optical free-space communication,
for monitoring and optimization in a noisy environment,
using the sparse OGP algorithm developed by Lehel Csató et al.

Page 4

Contents

A. Introduction
B. OGP
   1. Definition of Gaussian Process
   2. Sparse Online GP algorithm (OGP)
C. Simulation Results
   1. Comparison with LS-SVM on the Boston Housing data set (batch)
   2. Time Series Prediction using OGP
   3. Optical Beam Position Optimization
D. Conclusion

Page 5

Gaussian Process Definition

A Gaussian process is a collection of indexed random variables, specified by:
– a mean
– a covariance defined by a kernel function
  • The kernel can be any positive semi-definite function
  • It encodes the assumptions on the prior distribution
  • Wide scope of choices
  • Popular kernels are stationary functions: f(x − x′)
– The index can be time, space, or anything else

Page 6

Online GP Process

• Bayesian process:

  Prior distribution (GP) + Likelihood function → Posterior distribution (using Bayes' rule)
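Written out (a standard statement of Bayes' rule for the GP model, added here for clarity; $\mathbf{t}$ denotes the noisy measurements defined on the next slide):

$$p(f \mid \mathbf{t}) \;=\; \frac{p(\mathbf{t} \mid f)\, p_{\mathrm{GP}}(f)}{p(\mathbf{t})} \;\propto\; \underbrace{p(\mathbf{t} \mid f)}_{\text{likelihood}} \; \underbrace{p_{\mathrm{GP}}(f)}_{\text{prior}}$$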

Page 7

Solving a Gaussian Process:

Given $n$ inputs $x_i$ and $n$ measurements $t_i$, with $t_i = y_i + e_i$ and $e_i$ zero-mean noise of variance $\sigma^2$:

The prior distribution over the $y_i$ is given by the covariance matrix $K_{ij} = C(x_i, x_j)$.

The prior distribution over the measurements $t_i$ is given by $K + \sigma^2 I_n$.

Prediction of the function value $y^*$ at an input $x^*$ consists in calculating the mean and variance:

$$y^*(x^*) = \sum_i \alpha_i \, C(x_i, x^*)$$

and

$$\sigma^2(x^*) = C(x^*, x^*) - k(x^*)^T (K + \sigma^2 I_n)^{-1} k(x^*)$$

with $\alpha = (K + \sigma^2 I_n)^{-1} t$ and $k(x^*) = [C(x_1, x^*), \dots, C(x_n, x^*)]^T$.
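As a concrete illustration, here is a minimal NumPy sketch I added of these two formulas (the kernel, its parameters and the toy data are my own choices, not from the slides):

```python
import numpy as np

def rbf_kernel(X1, X2, a=1.0, s=10.0):
    """C(x, x') = a * exp(-(x - x')^2 / s): the 1-D form of the kernel on Page 9."""
    d2 = (X1[:, None] - X2[None, :]) ** 2
    return a * np.exp(-d2 / s)

def gp_predict(X, t, Xstar, noise_var=0.1, a=1.0, s=10.0):
    """Full (non-sparse) GP regression, computing the mean and variance above."""
    K = rbf_kernel(X, X, a, s)                  # prior covariance of the y_i
    A = K + noise_var * np.eye(len(X))          # prior covariance of the measurements t
    alpha = np.linalg.solve(A, t)               # alpha = (K + sigma^2 I_n)^{-1} t
    ks = rbf_kernel(X, Xstar, a, s)             # k(x*) for every test input
    mean = ks.T @ alpha                         # y*(x*) = sum_i alpha_i C(x_i, x*)
    v = np.linalg.solve(A, ks)
    var = rbf_kernel(Xstar, Xstar, a, s).diagonal() - np.sum(ks * v, axis=0)
    return mean, var

# Toy usage: noisy samples of a smooth function
X = np.linspace(-20, 20, 30)
t = np.sin(0.3 * X) + 0.1 * np.random.randn(30)
mean, var = gp_predict(X, t, np.linspace(-20, 20, 200))
```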

Page 8

Solving the Gaussian Process:

Solving requires inverting $(K + \sigma^2 I_n)$, an $n \times n$ matrix, where $n$ is the number of training inputs.

Memory grows as $n^2$ and CPU time as $n^3$.
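For scale (my own arithmetic, not from the slides): with $n = 10{,}000$ training points, $K + \sigma^2 I_n$ has $10^8$ entries, roughly 800 MB in double precision, and a direct solve costs on the order of $10^{12}$ floating-point operations. This is the motivation for the sparse approximation that follows.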

Page 9

Sampling from a Gaussian Process:

• Example of kernel:

$$K(\mathbf{x}, \mathbf{x}') = a \exp\left(-\sum_{i=1}^{n} \frac{(x_i - x_i')^2}{s_i}\right)$$

where $a$ = amplitude and $s_i$ = scale (smoothness).
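A minimal sketch I added of drawing prior samples with this kernel; the amplitude is my own choice, and the scale values mirror the figures on the next slides:

```python
import numpy as np

def kernel(X1, X2, a=9.0, s=10.0):
    # 1-D case of the kernel above: K(x, x') = a * exp(-(x - x')^2 / s)
    return a * np.exp(-(X1[:, None] - X2[None, :]) ** 2 / s)

x = np.linspace(-20, 20, 200)
for s in (1.0, 10.0, 100.0):
    K = kernel(x, x, s=s) + 1e-8 * np.eye(len(x))   # jitter for numerical stability
    samples = np.random.multivariate_normal(np.zeros(len(x)), K, size=3)
    # small s: rough, fast-varying samples; large s: smooth, slowly varying samples
```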

Page 10

Sampling from a GP: before Training

[Figure: samples drawn with scale = 10, shown with the mean and ± one standard deviation.]

Page 11

Sampling from a GP: before Training — Effect of Scale

[Figure: two panels of prior samples with mean and ± one standard deviation; left: small scale = 1, right: large scale = 100.]


Page 13

Sampling from a GP: After Training

[Figure: samples drawn with scale = 50, with mean and ± one standard deviation, after 3 training samples.]

Page 14

Sampling from a GP: After Training

[Figure: samples drawn with scale = 50, with mean and ± one standard deviation, after 10 training samples.]

Page 15

Online Gaussian Process: Issues

Two major issues with the GP approach:

1. Data set size is limited by memory and CPU
2. The posterior distribution is usually not Gaussian

Page 16

Sparse Online Gaussian Algorithm

Algorithm developed by Csató et al.:

• Data set size limited by memory and CPU → sparsity created using a limited number of SVs
• Posterior distribution not usually Gaussian → Gaussian approximation

Matlab software available on the web.

Page 17

Sparse Online Gaussian Algorithm

The SOGP process is defined by:
– Kernel parameters: an (m + 2)-vector for the RBF kernel
– Support vectors: a d × 1 vector (indexes)
– GP parameters:
  • α: a d × 1 vector
  • K: a d × n matrix

where m is the dimension of the input space and d is the number of support vectors.
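To make the update loop concrete, here is a schematic Python sketch (my own addition, with my own naming; it is not the Matlab toolbox API). It is deliberately simplified: it refits the weights from scratch at every step instead of using the O(d²) rank-one updates of Csató and Opper [3], and it deletes the basis vector with the smallest weight rather than performing the paper's KL-optimal removal.

```python
import numpy as np

def rbf(X1, X2, a=1.0, s=10.0):
    """K(x, x') = a * exp(-|x - x'|^2 / s) for row-wise inputs."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return a * np.exp(-d2 / s)

class SimpleSOGP:
    """Schematic sparse online GP regression, loosely following Csato & Opper [3]."""

    def __init__(self, max_bv=50, noise_var=0.1, a=1.0, s=10.0):
        self.max_bv, self.noise_var, self.a, self.s = max_bv, noise_var, a, s
        self.BV, self.t = None, np.empty(0)         # basis vectors and their targets

    def _refit(self):
        K = rbf(self.BV, self.BV, self.a, self.s)
        self.alpha = np.linalg.solve(K + self.noise_var * np.eye(len(self.t)), self.t)

    def update(self, x, t):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        self.BV = x if self.BV is None else np.vstack([self.BV, x])
        self.t = np.append(self.t, t)
        if len(self.t) > self.max_bv:               # enforce sparsity
            self._refit()
            i = int(np.argmin(np.abs(self.alpha)))  # crude 'least useful' score
            self.BV = np.delete(self.BV, i, axis=0)
            self.t = np.delete(self.t, i)
        self._refit()

    def predict(self, Xs):
        Xs = np.atleast_2d(np.asarray(Xs, dtype=float))
        return rbf(Xs, self.BV, self.a, self.s) @ self.alpha

# Online usage: points arrive one at a time
model = SimpleSOGP(max_bv=20)
for xi, ti in zip(np.linspace(-20, 20, 200), np.sin(np.linspace(-20, 20, 200))):
    model.update(xi, ti)
```

The later sketches in this transcript reuse this SimpleSOGP class as a stand-in for the OGP toolbox.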

Page 18

Contents

A. Introduction
B. OGP
   1. Definition of Gaussian Process
   2. Sparse Online GP algorithm (SOGP)
C. Simulation Results
   1. Comparison with LS-SVM on the Boston Housing data set (batch)
   2. Time Series Prediction using OGP
   3. Optical Beam Position Optimization
D. Conclusion

Page 19

LS-SVM on Boston Housing Data Set

RBF kernel (C = 10, kernel width = 4)
304 training samples, averaged over 10 random draws

                                   Mean
Average MSE on training:           3.0
Average MSE on test:               6.5
Standard deviation on training:    0.4
Standard deviation on test:        1.6

Average CPU time: 3 sec per run

Page 20

OGP on Boston Housing Data Set

• Kernel: $K(\mathbf{x}, \mathbf{x}') = a \exp\left(-\sum_{i=1}^{n} \frac{(x_i - x_i')^2}{s_i}\right)$

• Initial hyper-parameters: $a$ and $s_i$ (i = 1 to 13 for Boston Housing)

• Number of hyper-parameter optimization iterations: tried between 3 and 6

• Max number of support vectors: variable
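A sketch of the evaluation protocol used on the following slides (random train/test splits, MSE averaged over draws); `SimpleSOGP` is the stand-in sketch defined earlier, the slides' hyper-parameter optimization iterations are omitted, and all names here are mine:

```python
import numpy as np

def evaluate(X, y, n_train=304, n_draws=5, max_bv=50):
    """Average train/test MSE over random draws, as reported on these slides.
    X, y: the 506 x 13 Boston Housing inputs and targets (not loaded here)."""
    tr_mse, te_mse = [], []
    for _ in range(n_draws):
        idx = np.random.permutation(len(X))
        tr, te = idx[:n_train], idx[n_train:]
        model = SimpleSOGP(max_bv=max_bv)
        for xi, ti in zip(X[tr], y[tr]):        # single online pass over the training set
            model.update(xi, ti)
        tr_mse.append(np.mean((model.predict(X[tr]) - y[tr]) ** 2))
        te_mse.append(np.mean((model.predict(X[te]) - y[te]) ** 2))
    return np.mean(tr_mse), np.std(tr_mse), np.mean(te_mse), np.std(te_mse)
```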

Page 21

OGP on Boston Housing Data Set

6 iterations, MaxBV between 10 and 250

Page 22

OGP on Boston Housing Data Set

3 iterations, MaxBV between 10 and 150

[Figure: train and test MSE versus max number of support vectors, maxHyp = 3. MSE averaged over 5 draws; hyper-parameters updated 3 times; max numbers of support vectors = 10, 20, 50, 100, 150; 304 training samples; total elapsed time = 2733 sec. Run 5120_4, 30-Apr-2005.]

Page 23

OGP on Boston Housing Data Set

4 iterations, MaxBV between 10 and 150

[Figure: train and test MSE versus max number of support vectors, maxHyp = 4. MSE averaged over 5 draws; hyper-parameters updated 4 times; max numbers of support vectors = 10, 20, 50, 100, 150; 304 training samples; total elapsed time = 3694 sec. Run 5120_5, 30-Apr-2005.]

Page 24

OGP on Boston Housing Data Set

6 iterations, MaxBV between 10 and 150

[Figure: train and test MSE versus max number of support vectors, maxHyp = 6. MSE averaged over 5 draws; hyper-parameters updated 6 times; max numbers of support vectors = 10, 20, 50, 100, 150; 304 training samples; total elapsed time = 5785 sec. Run 5120_6, 30-Apr-2005.]

Page 25

OGP on Boston Housing Data Set: CPU Time

[Figures: processing time (sec) versus maximum number of SVs for MaxHyp = 3, 4 and 6; and processing time normalized by maxBV and maxHyp, which follows a curve of the form a·(b + SVs²)/SVs as a function of the number of support vectors.]

Page 26

OGP on Boston Housing Data Set

Run with 4 iterations, MaxBV between 10 and 60

[Figure: train and test MSE versus max number of support vectors. MSE averaged over 10 draws; hyper-parameters updated 4 times; max numbers of support vectors = 10, 20, 30, 40, 50, 60; 304 training samples; total elapsed time = 5749 sec. Run 5121_1, 01-May-2005.]

Page 27

OGP on Boston Housing Data Set

Final run with 4 iterations, MaxBV of 30 and 40, averaged over 50 random draws
(hyper-parameters updated 4 times; 304 training samples; total elapsed time = 8335 sec; run 5122_1, 02-May-2005)

Max number of SVs                  30     40
Average MSE on training:           3.80   3.20
Average MSE on test:               7.10   6.90
Standard deviation on training:    0.23   0.24
Standard deviation on test:        1.20   1.10

Page 28

OGP on Boston Housing Data Set: Conclusion

MSE not as good as LS-SVM (6.9 versus 6.5),
but standard deviation better than LS-SVM (1.1 versus 1.6).

CPU time much longer (90 sec versus 3 sec per run),
but it grows more slowly with the number of samples than LS-SVM,
so OGP might do better on large data sets.

Page 29

OGP on TSP (Time Series Prediction)

[Figure: the TSP data plotted against time (samples 0–5000, values roughly −600 to 800), split into training data and prediction data.]

Page 30

OGP on TSP: Initial Runs

Run 10: 980 training samples; kpar(1) = 0.0100, kpar(2) = 2000; overlap between sections = 0 training samples; max number of support vectors = 50.
Initial kpar(1) = 1.00e-2, final kpar(1) = 1.30e-3; MSE on prediction = 2489.7. Run 5128_18, 08-May-2005.

[Figure: training data, test data, GP estimation and OGP prediction over samples 0–1000.]

Page 31

OGP on TSP: Initial Runs

Run 10: kpar(1) = 0.0100, kpar(2) = 2000; no overlap between sections; max 50 support vectors.
Initial kpar(1) = 1.00e-2, final kpar(1) = 1.61e-2; MSE on prediction = 1400.2. Run 5128_13, 08-May-2005.

[Figure: training data, test data, GP estimation and OGP prediction over samples 800–1000.]

Page 32

OGP on TSP: Initial Runs

Run 10: kpar(1) = 0.0100, kpar(2) = 2000; no overlap between sections; max 50 support vectors.
Initial kpar(1) = 1.00e-2, final kpar(1) = 2.43e-3; MSE on prediction = 91.1. Run 5128_15, 08-May-2005.

[Figure: training data, test data, GP estimation and OGP prediction over samples 700–1000.]

Page 33

OGP on TSP: Local Minimum

Two runs with identical settings (281 training samples; kpar(1) = 0.0100, kpar(2) = 2000; no overlap; max 50 support vectors) and the same initial kpar(1) = 1e-2 converge to different optima:

– Run 5128_33: final kpar(1) = 1.46e-2, MSE on prediction = 1131.6
– Run 5128_24: final kpar(1) = 2.45e-3, MSE on prediction = 95.1

[Figures: training data, test data, GP estimation, OGP prediction and support vectors for both runs, samples 700–1000. 08-May-2005.]

Page 34

OGP on TSP: Impact of Over-fitting

Run 7: kpar(1) = 0.3000, kpar(2) = 2000; overlap between sections = 30 training samples; max number of support vectors = 50.

[Figure: prediction over samples 940–1040.]

Page 35

OGP on TSP: Impact of Number of Samples on Prediction

Page 36

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 81 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 6 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.42e-3; MSE on prediction = 124.6. Run 5128_46, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 900–1000.]

Page 37

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 181 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 16 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.25e-3; MSE on prediction = 91.7. Run 5128_52, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 800–1000.]

Page 38

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 281 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 27 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.21e-3; MSE on prediction = 85.9. Run 5128_42, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 700–1000.]

Page 39

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 381 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 45 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.66e-3; MSE on prediction = 104.8. Run 5128_44, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 600–1000.]

Page 40

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 481 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 109 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 3.66e-3; MSE on prediction = 99.1. Run 5128_45, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 500–1000.]

Page 41

OGP on TSP: Impact of Number of Samples on Prediction

Run 11: 581 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 233 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 3.58e-3; MSE on prediction = 632.6. Run 5128_51, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 400–1000.]

Page 42

OGP on TSP: Impact of Number of SVs on Prediction

Page 43

OGP on TSP: Impact of Number of SVs on Prediction

Run 11: 181 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 10 support vectors. CPU = 19 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.23e-3; MSE on prediction = 101.2. Run 5128_54, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 800–1000.]

Page 44

OGP on TSP: Impact of Number of SVs on Prediction

Run 11: 181 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 50 support vectors. CPU = 16 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.25e-3; MSE on prediction = 91.7. Run 5128_52, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 800–1000.]

Page 45

OGP on TSP: Impact of Number of SVs on Prediction

Run 11: 181 training samples; kpar(1) = 0.0010, kpar(2) = 2000; no overlap; max 100 support vectors. CPU = 16 sec.
Initial kpar(1) = 1.00e-3, final kpar(1) = 2.24e-3; MSE on prediction = 88.9. Run 5128_53, 08-May-2005.

[Figure: training data, test data, GP estimation, OGP prediction and support vectors, samples 800–1000.]

Page 46

OGP on TSP: Final Runs

Page 47

OGP on TSP: Final Runs

Running 200 samples at a time, with a 30-sample overlap between sections.
Run 11: kpar(1) = 0.0010, kpar(2) = 2000; overlap between sections = 30 training samples; max 50 support vectors; 200–230 training samples per section. Runs 5128_69 through 5128_81, 08-May-2005.

[Figure: the full series (samples 0–2500, values −600 to 800) with the section-by-section predictions.]
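A sketch of this sectioned scheme (my own illustration; `SimpleSOGP` again stands in for the OGP toolbox, and the time-delay embedding of the series into input/target pairs is an assumption, since the slides do not spell out the input representation):

```python
import numpy as np

def windowed_predict(series, section=200, overlap=30, lag=10, max_bv=50):
    """Process the series one section at a time; each section's model also sees
    `overlap` samples from the previous section, then predicts the next block."""
    preds = np.full(len(series), np.nan)
    start = 0
    while start + section < len(series):
        lo = max(0, start - overlap)                    # overlap with previous section
        model = SimpleSOGP(max_bv=max_bv)
        for i in range(lo + lag, start + section):
            model.update(series[i - lag:i], series[i])  # last `lag` values -> next value
        for i in range(start + section, min(start + 2 * section, len(series))):
            preds[i] = model.predict(series[i - lag:i][None, :])[0]
        start += section
    return preds
```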

Page 48

OGP on TSP: Why an overlap?

Run 6: kpar(1) = 1.0000, kpar(2) = 2200; overlap between sections = 0 training samples.

[Figure: samples 1880–2080; with no overlap, the prediction degrades at the section boundary.]

Page 49

OGP on TSP: Final Runs

Does not always behave!...

Run 11: 230 training samples; kpar(1) = 0.0010, kpar(2) = 2000; 30-sample overlap; max 50 support vectors. Initial kpar(1) = 1.00e-3, final kpar(1) = 9.24e-4. Run 5128_100, 08-May-2005.

[Figure: a misbehaving prediction over samples 2000–5000.]

Page 50

OGP on TSP: Conclusion

It is difficult to find the right set of parameters:

• Initial kernel parameter
• Number of support vectors
• Number of training samples per run

Page 51

Beam Position Optimization

Page 52

Gaussian Beam

With Small Noise

[Figure: 3-D surface of intensity (0 to −30) versus x and y (−200 to 200).]
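A sketch of a simulated beam of this kind (my own reconstruction; the beam width, the multiplicative noise model and the log-scale intensity are guesses from the figures):

```python
import numpy as np

def beam_intensity(x, y, x0=0.0, y0=0.0, width=100.0, noise=1.0):
    """Noisy Gaussian beam profile on a log scale, peaking at (x0, y0)."""
    power = np.exp(-((x - x0) ** 2 + (y - y0) ** 2) / width ** 2)
    power = power * (1.0 + noise * np.random.randn(*np.shape(x)))  # multiplicative noise
    return 10.0 * np.log10(np.clip(power, 1e-3, None))             # clip to avoid log(0)
```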

Pages 53-56

Gaussian Beam

With Noise

[Figures: four slides showing noisy realizations of the same intensity surface.]

Page 57

Gaussian Beam Position Optimization

Sampling the beam at a given position and measuring the power.

Objective: find the top of the beam.

[Figure: 3-D surface of intensity versus x and y.]

Page 58

Beam Position Optimization

• Idea (sketched in code below):
  – With a few initial samples, use the OGP to get an estimate of the beam profile and position
  – Move toward the max of the estimate
  – Add this new sample to the training set
  – Iterate
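A minimal sketch of that loop (my own illustration, pairing the `beam_intensity` model above with the `SimpleSOGP` sketch; the grid search over the current estimate, the kernel scale and the true peak location are my assumptions):

```python
import numpy as np

def optimize_beam(n_init=5, n_steps=20):
    """GP-guided search for the beam peak: fit, move to the estimated max, resample."""
    grid = np.stack(np.meshgrid(np.linspace(-200, 200, 41),
                                np.linspace(-200, 200, 41)), axis=-1).reshape(-1, 2)
    # kernel scale roughly matched to the assumed beam width (my choice)
    model = SimpleSOGP(max_bv=50, s=1e4)
    X0 = np.random.uniform(-200, 200, size=(n_init, 2))    # a few initial samples
    for x in X0:
        model.update(x, beam_intensity(x[0], x[1], x0=50.0, y0=-30.0))
    for _ in range(n_steps):
        x_next = grid[np.argmax(model.predict(grid))]      # max of the current estimate
        model.update(x_next, beam_intensity(x_next[0], x_next[1], x0=50.0, y0=-30.0))
    return x_next                                          # last sampled position
```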

Pages 59-71

Beam Position Optimization

[Figures: thirteen slides stepping through the optimization. Each shows the current OGP estimate of the beam (normalized 0–1) alongside the true noisy intensity surface (0 to −30) over x, y in −200 to 200, as the sampled positions converge toward the beam maximum.]

Page 72

OGP for Beam Optimization: Conclusion

Works faster than the current algorithm (finds the top in fewer steps).

Does not work well when there is no noise.

Can be improved.

Page 73

OGP for Beam Optimization: Possible Improvements

• Specific kernel: s1 = s2 (the beam is symmetric in x, y)

• Use the known beam divergence to set the initial kernel parameters

• Optimize the choice of sample:
  – Going directly to the estimated top might not be the best
    (because it does not help to improve the estimate)
  – Improve robustness by minimizing the probability of sampling at lower power

Page 74

Contents

A. Introduction
B. OGP
   1. Definition of Gaussian Process
   2. Sparse Online GP algorithm (OGP)
C. Simulation Results
   1. Comparison with LS-SVM on the Boston Housing data set (batch)
   2. Time Series Prediction using OGP
   3. Optical Beam Position Optimization
D. Conclusion

Page 75

Conclusion

• OGP is an interesting tool
• Complex software
• Many tunings needed to ensure stability and convergence
• Not easy to use

• Next steps: more comparison with online LS-SVM
  – Performance
  – CPU time

Page 76

References

• [1] C. K. Williams, "Gaussian Processes," March 1, 2002.
• [2] M. Gibbs and D. J. C. MacKay, "Efficient Implementation of Gaussian Processes," May 28, 1997.
• [3] L. Csató and M. Opper, "Sparse Online Gaussian Processes," October 9, 2002.
• [4] C. M. Bishop, "Neural Networks for Pattern Recognition."
• [5] Time Series Competition data, downloaded from http://www.esat.kuleuven.ac.be/sista/workshop/competition.html
• [6] Csató OGP toolbox for Matlab and demo program tutorial, from http://www.kyb.tuebingen.mpg.de/bs/people/csatol/ogp/