Principal Component Analysis: Preliminary Studies Émille E. O. Ishida IF - UFRJ First Rio-Saclay Meeting: Physics Beyond the Standard Model Rio de Janeiro - dec/2006

Page 1

Principal Component Analysis: Preliminary Studies

Émille E. O. Ishida, IF - UFRJ

First Rio-Saclay Meeting: Physics Beyond the Standard Model, Rio de Janeiro - dec/2006

Page 2

The main objective of Physics, Statistics, Science: Simplification

Statistics is the art of extracting simple comprehensible facts that tell us what we want to know for practical reasons

Principal Component Analysis (PCA) is a tool for simplifying one particular class of data......

–astro-ph/9905079

Page 3

For example... n objects and p things we know about them...

-height; -n° publications; -flier miles; -fuel consumption;

(the same four quantities are known for each of the objects)

n=6 objects and p=4 things we know about them...

How are these parameters related to each other?

–astro-ph/9905079

Page 4

For example...

Do people who spend most of their lives in airports publish more?

Do people with inefficient cars fly more..... or just the ones with lots of publications do?

Do these correlations represent any real causal connection?

or... is it just that, once you buy a car, you stop publishing and give lots of talks in exotic foreign locations?

–astro-ph/9905079

Page 5

First try: Plot everything against everything else...

...as the number of parameters increases this becomes impossibly complicated!
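To get a feeling for how fast "everything against everything" grows, one can count the distinct pairs of parameters, p(p-1)/2. The following is only a small sketch: the helper name `pairwise_plots` is ours, and the parameter names are borrowed from the example on the earlier slide.

```python
from itertools import combinations

def pairwise_plots(parameters):
    """Return every distinct pair of parameters, one scatter plot per pair."""
    return list(combinations(parameters, 2))

params = ["height", "publications", "flier miles", "fuel consumption"]
print(len(pairwise_plots(params)))      # p = 4  -> 6 plots
print(len(pairwise_plots(range(50))))   # p = 50 -> 1225 plots
```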

PCA looks for sets of parameters that always correlate together

The first application of PCA was in social science....

Ex: give a sample of n people a set of p exams testing their creativity, memory, math skills... and look for correlations...

Result: nearly all tests correlate with each other, indicating that one underlying variable could predict the performance in all tests

IQ... an infamous beginning...!!

–astro-ph/9905079

Page 6

General Idea:

Given a sample of: n objects; p measured quantities - $x_i$ (i=1,2,3,...,p)

Find a new set of p orthogonal variables $\xi_1, \ldots, \xi_p$, each a linear combination of the original ones

$$\xi_i = a_{i1} x_1 + \ldots + a_{ij} x_j + \ldots + a_{ip} x_p$$

Determine aij such that the smallest number of new variables account for as much of the sample variance as possible.

Principal Components

–astro-ph/9905079
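The idea that a few new variables can carry most of the sample variance can be illustrated numerically. This is a minimal sketch on synthetic data (the sample, the hidden variable and the weights are invented for illustration), using the singular value decomposition of the centred data as one common way to obtain the coefficients of the linear combinations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample (an assumption for illustration): n = 200 objects, p = 4
# measured quantities, all driven by one hidden variable plus small noise.
n, p = 200, 4
hidden = rng.normal(size=(n, 1))
weights = np.array([[1.0, 0.8, -0.6, 0.4]])
X = hidden @ weights + 0.1 * rng.normal(size=(n, p))

# Centre each quantity, then take the singular value decomposition.
Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

# Row i of A holds the coefficients of the i-th new variable.
A = Vt
xi = Xc @ A.T                       # new variables for every object

# Fraction of the total sample variance carried by each new variable.
explained = s**2 / np.sum(s**2)
print(explained)
```

Because the four quantities here share a single hidden driver, the first new variable should carry nearly all of the variance, which is the situation PCA is designed to detect.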

Page 7

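Basic Statistics

Sample 1: $x = (x_1, \ldots, x_n)$

Sample 2: $\{x_1, y_1\}, \ldots, \{x_n, y_n\}$

Mean Value:

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

Variance:

$$\sigma^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$

Covariance:

$$\mathrm{cov}(x, y) = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)\left(y_i - \bar{y}\right)$$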

http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
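As a quick check of the mean, variance and covariance on this slide, here is a minimal sketch with two small made-up samples. It assumes the 1/(n-1) normalisation for the variance and covariance, which in NumPy corresponds to `ddof=1`.

```python
import numpy as np

# Two small made-up samples of equal length n (illustrative values only).
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 6.0])
n = len(x)

mean_x = x.sum() / n
var_x = ((x - mean_x) ** 2).sum() / (n - 1)
cov_xy = ((x - mean_x) * (y - y.mean())).sum() / (n - 1)

# NumPy equivalents; ddof=1 selects the 1/(n-1) normalisation,
# which is also the default used by np.cov.
print(mean_x, np.mean(x))
print(var_x, np.var(x, ddof=1))
print(cov_xy, np.cov(x, y)[0, 1])
```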

Page 8

Covariance Matrix in 2-D

$$C = \begin{pmatrix} \mathrm{cov}(x,x) & \mathrm{cov}(x,y) \\ \mathrm{cov}(y,x) & \mathrm{cov}(y,y) \end{pmatrix}$$

Eigenvectors → New axes (new uncorrelated variables)

Eigenvalues → variances in the direction of the Principal Components

The largest eigenvalue → First Principal Component

$$x_i = \sum_{j=1}^{p} b_j\, \xi_j$$

http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
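The link between the covariance matrix, its eigenvectors and the principal components can be sketched in a few lines. The 2-D sample below is synthetic (an assumption for illustration); `numpy.linalg.eigh` is used because the covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic 2-D sample (illustrative): y follows x with some scatter.
x = rng.normal(size=500)
y = 0.5 * x + 0.2 * rng.normal(size=500)
data = np.column_stack([x, y])

C = np.cov(data, rowvar=False)            # 2x2 covariance matrix

# eigh is meant for symmetric matrices and returns ascending eigenvalues.
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]     # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]     # column k is the k-th new axis

# Project the centred data on the new axes.
scores = (data - data.mean(axis=0)) @ eigenvectors
print(np.cov(scores, rowvar=False))       # approximately diag(eigenvalues)
```

In the new axes the off-diagonal covariance vanishes, and the variance along each axis equals the corresponding eigenvalue, so the first column of `eigenvectors` is the first principal component.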

Page 9

But... that's not our case...

We want to make inferences about a model using a sample of data...

Parameter Estimation

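Estimator: $\hat{\theta}$

Consistency:

$$\lim_{\mathrm{data}\to\infty} \hat{\theta} = \theta$$

Bias:

$$b = E[\hat{\theta}] - \theta$$

Efficiency:

$$\sigma^2_{\min} = \frac{\left(1 + \partial b/\partial\theta\right)^2}{I(\theta)}, \qquad I(\theta) = E\left[\left(\frac{\partial}{\partial\theta}\sum_{i=1}^{n} \ln f(x_i;\theta)\right)^2\right]$$

Robustness: Independence of initial assumptions in the pdf (noise)

http://pdg.lbl.gov/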

Page 10

The Method of Maximum Likelihood

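sample: $x = (x_1, \ldots, x_m)$

parameters: $\theta = (\theta_1, \ldots, \theta_n)$

pdf: $f(x_i; \theta)$

$\hat{\theta}_i$: values of $\theta_i$ that maximize the Likelihood Function

$$L(\theta) = \prod_{i=1}^{m} f(x_i; \theta)$$

http://pdg.lbl.gov/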
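A minimal sketch of the method of maximum likelihood, assuming an exponential pdf f(x; tau) = exp(-x/tau)/tau chosen only because its maximum is known in closed form (the sample mean). The sample is synthetic and the grid scan is the simplest possible maximiser, not a recommendation.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic sample (illustrative): m draws from an exponential pdf
# f(x; tau) = exp(-x / tau) / tau with true tau = 2.
tau_true = 2.0
x = rng.exponential(scale=tau_true, size=1000)

def log_likelihood(tau, sample):
    """ln L(tau) = sum_i ln f(x_i; tau) for the exponential pdf."""
    return np.sum(-np.log(tau) - sample / tau)

# Scan a grid of trial values and keep the one with the largest ln L.
grid = np.linspace(1.0, 3.0, 2001)
lnL = np.array([log_likelihood(t, x) for t in grid])
tau_hat = grid[np.argmax(lnL)]

# For this pdf the maximum is known in closed form: tau_hat = sample mean.
print(tau_hat, x.mean())
```

Working with ln L instead of L avoids numerical underflow in the product over many data points and leaves the position of the maximum unchanged.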

Page 11

For an unbiased estimator....

$$\left(\hat{C}^{-1}\right)_{ij} = -\left.\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}\right|_{\hat{\theta}}$$

We can calculate the covariance between the parameters of the theory

Fisher Matrix:

$$F_{ij} \equiv C^{-1}_{ij} = -\frac{\partial^2 \ln L}{\partial\theta_i\,\partial\theta_j}$$

http://pdg.lbl.gov/
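The relation between the Fisher matrix and the covariance of the parameters can be checked on a toy problem. The sketch below assumes a straight-line model with Gaussian noise of known sigma (our choice for illustration, not a model from the slides) and evaluates minus the second derivatives of ln L by central finite differences.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy model (an assumption for illustration): straight line y = t0 + t1 * x
# measured at fixed x with Gaussian noise of known sigma.
x = np.linspace(0.0, 1.0, 20)
sigma = 0.1
theta_true = np.array([1.0, 2.0])
y = theta_true[0] + theta_true[1] * x + sigma * rng.normal(size=x.size)

def log_likelihood(theta):
    """ln L up to an additive constant for Gaussian errors."""
    residual = y - (theta[0] + theta[1] * x)
    return -0.5 * np.sum(residual**2) / sigma**2

def fisher_matrix(lnL, theta_hat, step=1e-3):
    """F_ij = -d^2 lnL / d theta_i d theta_j by central finite differences."""
    k = len(theta_hat)
    basis = step * np.eye(k)
    F = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            ei, ej = basis[i], basis[j]
            second = (lnL(theta_hat + ei + ej) - lnL(theta_hat + ei - ej)
                      - lnL(theta_hat - ei + ej) + lnL(theta_hat - ei - ej))
            F[i, j] = -second / (4.0 * step**2)
    return F

# Maximum-likelihood estimate for this linear model (ordinary least squares).
design = np.column_stack([np.ones_like(x), x])
theta_hat, *_ = np.linalg.lstsq(design, y, rcond=None)

F = fisher_matrix(log_likelihood, theta_hat)
C = np.linalg.inv(F)             # covariance between the parameters
print(np.sqrt(np.diag(C)))       # 1-sigma uncertainties on t0 and t1
```

For this linear Gaussian case the result can be compared with the analytic expression `design.T @ design / sigma**2`; for non-linear models the finite-difference step would need more care.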

Page 12

What about Cosmology?

Direct evidence for an accelerated expansion:

Can we get information out of SN Ia observations without the assumption of General Relativity?

$$ds^2 = dt^2 - a^2(t)\left[\frac{dr^2}{1 - kr^2} + r^2\left(d\theta^2 + \sin^2\theta\, d\phi^2\right)\right]$$

Page 13

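Definitions...

$$q(z) \equiv -\frac{\ddot{a}}{aH^2} = \frac{d}{dt}\left(\frac{1}{H}\right) - 1$$

$$H(z) = H_0 \exp\left[\int_0^z \left[1 + q(u)\right] d\ln(1+u)\right]$$

$$d_L(z) = (1+z)\int_0^z \frac{du}{H(u)}$$

$$m_B(z) = M_B + 5\log_{10}\left(\frac{d_L}{\mathrm{Mpc}}\right) + 25$$

$$\mu(z) = 25 + 5\log_{10}\left[\frac{1+z}{H_0\,\mathrm{Mpc}}\int_0^z du\, \exp\left(-\int_0^u \left[1 + q(v)\right] d\ln(1+v)\right)\right]$$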
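The chain from q(z) to H(z), the luminosity distance and the distance modulus can be followed numerically. This is a rough sketch assuming a spatially flat geometry, a simple trapezoidal integration, and an illustrative H0 = 70 km/s/Mpc with c restored in km/s (the slides work with c = 1); the function names are ours.

```python
import numpy as np

C_KM_S = 299792.458   # speed of light in km/s (the slides use c = 1)
H0 = 70.0             # km/s/Mpc, an illustrative value not taken from the slides

def cumulative_trapezoid(y, x):
    """Running trapezoidal integral of y(x), starting from 0 at x[0]."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate([[0.0], np.cumsum(steps)])

def distance_modulus(z, q):
    """mu(z) on a redshift grid z (with z[0] = 0) for deceleration values q(z)."""
    # H(z)/H0 = exp( int_0^z [1 + q(u)] d ln(1 + u) )
    E = np.exp(cumulative_trapezoid(1.0 + q, np.log1p(z)))
    # d_L(z) = (1 + z) * (c / H0) * int_0^z du / E(u), in Mpc
    d_L = (1.0 + z) * (C_KM_S / H0) * cumulative_trapezoid(1.0 / E, z)
    # mu = 5 log10(d_L / Mpc) + 25; skip z = 0, where d_L vanishes
    return 25.0 + 5.0 * np.log10(d_L[1:])

z = np.linspace(0.0, 1.5, 3001)
mu = distance_modulus(z, np.full_like(z, 0.5))   # constant q = 1/2
print(mu[-1])
```

A constant q = 1/2 is a convenient test case because H(z) = H0 (1+z)^(3/2) and the distance integral can then be done by hand.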

Page 14

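As proposed by Shapiro & Turner (2006)...

Distance Modulus:

$$\mu(z) = 25 + 5\log_{10}\left[\frac{1+z}{H_0\,\mathrm{Mpc}}\int_0^z du\, \exp\left(-\int_0^u \left[1 + q(v)\right] d\ln(1+v)\right)\right]$$

$$q(z) = \sum_{i=1}^{N} \beta_i\, c_i(z), \quad \text{where} \quad c_i(z) = \begin{cases} 1, & z \text{ inside the } i\text{-th bin} \\ 0, & \text{otherwise} \end{cases}$$

$\Delta z = 0.05$; Data from Gold Sample (Riess et al.)

Gaussian probability distribution in each bin...

$$f(\mu_i; \mu, \sigma) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{\left(\mu_i - \mu\right)^2}{2\sigma^2}\right]$$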
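A piecewise-constant q(z) built from top-hat basis functions can be written compactly. In this sketch the bin-edge convention (bin i covers i*dz <= z < (i+1)*dz, counting from zero) and the amplitude values are assumptions made for illustration; only the bin width 0.05 comes from the slide.

```python
import numpy as np

def q_binned(z, beta, dz=0.05):
    """q(z) = sum_i beta_i c_i(z) with top-hat bins of width dz.

    Bin i (counting from 0) is taken to cover i*dz <= z < (i+1)*dz; this
    edge convention is an assumption, the slides do not fix it.
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    beta = np.asarray(beta, dtype=float)
    index = np.floor(z / dz).astype(int)
    inside = (index >= 0) & (index < beta.size)
    q = np.zeros_like(z)
    q[inside] = beta[index[inside]]   # c_i(z) = 1 in one bin, 0 elsewhere
    return q

beta = np.array([-0.5, -0.3, 0.1])    # illustrative amplitudes
print(q_binned([0.01, 0.07, 0.12, 0.40], beta))
```

Redshifts beyond the last bin return 0 here simply because no basis function covers them; a real analysis would choose N so that the bins span the whole sample.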

Page 15

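The Fisher Matrix

$$F_{kl} = \sum_{j=1}^{N} \sum_{i=1}^{N_j} \frac{1}{\sigma_j^2}\, \frac{\partial \mu(x_i)}{\partial \beta_k}\, \frac{\partial \mu(x_i)}{\partial \beta_l}$$

where $x_j = (x_1, x_2, \ldots, x_{N_j})$ and $N_j$ denotes the number of SN inside the $j$-th bin.

Observation about $\sigma$...

$$\sigma^2_{j\text{-th bin}} = \frac{1}{N_j} \sum_{i \in x_j} \sigma_i^2$$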
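The whole procedure (bin amplitudes, Fisher matrix, principal components) can be sketched end to end. Everything numerical below is an assumption for illustration: mock redshifts and errors instead of the Gold Sample, six bins, H0 = 70 km/s/Mpc, a fiducial q = -0.5 in every bin, one error per supernova, and the standard Gaussian form of the Fisher matrix with finite-difference derivatives. It is not the code behind the slides.

```python
import numpy as np

DZ = 0.05                       # bin width quoted on the slides
N_BINS = 6                      # illustrative; gives six principal components
C_OVER_H0 = 299792.458 / 70.0   # Mpc, assuming H0 = 70 km/s/Mpc

def running_integral(y, x):
    """Running trapezoidal integral of y(x), starting from 0 at x[0]."""
    steps = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    return np.concatenate([[0.0], np.cumsum(steps)])

def mu_model(z_sn, beta, n_grid=4001):
    """Distance modulus at redshifts z_sn for piecewise-constant q(z)."""
    grid = np.linspace(0.0, DZ * N_BINS, n_grid)
    index = np.minimum((grid / DZ).astype(int), N_BINS - 1)
    q = np.asarray(beta)[index]
    E = np.exp(running_integral(1.0 + q, np.log1p(grid)))
    d_L = (1.0 + grid) * C_OVER_H0 * running_integral(1.0 / E, grid)
    return 25.0 + 5.0 * np.log10(np.interp(z_sn, grid, d_L))

def fisher_matrix(z_sn, sigma, beta, step=1e-4):
    """F_kl = sum over SN of (dmu/dbeta_k)(dmu/dbeta_l) / sigma^2."""
    derivs = []
    for k in range(N_BINS):
        shift = np.zeros(N_BINS)
        shift[k] = step
        derivs.append((mu_model(z_sn, beta + shift)
                       - mu_model(z_sn, beta - shift)) / (2.0 * step))
    J = np.array(derivs) / sigma           # shape (N_BINS, number of SN)
    return J @ J.T

rng = np.random.default_rng(4)
z_sn = rng.uniform(0.01, DZ * N_BINS, size=150)   # mock redshifts
sigma = np.full(z_sn.size, 0.2)                   # mock errors in magnitudes
beta_fid = np.full(N_BINS, -0.5)                  # fiducial q in each bin

F = fisher_matrix(z_sn, sigma, beta_fid)
eigenvalues, eigenvectors = np.linalg.eigh(F)
order = np.argsort(eigenvalues)[::-1]
principal_components = eigenvectors[:, order].T   # row 0 is PC1
pc_errors = 1.0 / np.sqrt(eigenvalues[order])     # best-measured mode first
print(pc_errors)
```

Sorting by decreasing eigenvalue puts the best-constrained combination of bin amplitudes first, since the expected error on each mode scales as the inverse square root of its eigenvalue.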

Page 16

[Figure: principal components PC1 to PC6]

Page 17

Reconstruction of q(z)

We need more data!

–arXiv:astro-ph/0512586

Page 18

Next Steps....

Small corrections in the present code (optimization);

Change the observable;

Get used to this procedure and be able to handle large data sets in a model independent way

Page 19

References

– D. Huterer and G. Starkman, Parametrization of dark energy properties: A Principal-Component Approach, Physical Review Letters 90 (3), January 2003

– C. Shapiro and M. S. Turner, What do we really know about cosmic acceleration?, arXiv:astro-ph/0512586

– G. Cowan, Statistical Data Analysis, Clarendon Press, Oxford (1998)

– P. J. Francis and B. J. Wills, Introduction to Principal Component Analysis, arXiv:astro-ph/9905079

– W.-M. Yao et al., Journal of Physics G 33, 1 (2006), available on the PDG WWW pages (URL: http://pdg.lbl.gov/)

Page 20

–arXiv:astro-ph/0512586

Shapiro & Turner (2006) Principal Components