ASCA: analysis of multivariate data from an experimental design, Biosystems Data Analysis group Universiteit van Amsterdam

Upload: brendan-park

Post on 01-Jan-2016



Contents

• ANOVA

• SCA

• ASCA

• Conclusions

ANOVA

• different design factors contribute to the variation

For two treatments A and B the total sum of squares can be split into several contributions:

$SS_{\mathrm{total}} = SS_{A} + SS_{B} + SS_{AB} + SS_{\mathrm{within}}$

$x_{cdq} = \mu_{q} + \alpha_{cq} + \beta_{dq} + (\alpha\beta)_{cdq}$
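The split of the total sum of squares can be checked numerically. The following is a minimal sketch, assuming a balanced two-way layout with replicates; the array shape, the random data and the function name `two_way_ss` are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def two_way_ss(x):
    """Split the total sum of squares of a balanced two-way layout.

    x has shape (levels of A, levels of B, replicates per cell).
    """
    n_a, n_b, n_rep = x.shape
    grand = x.mean()
    mean_a = x.mean(axis=(1, 2))   # one mean per level of A
    mean_b = x.mean(axis=(0, 2))   # one mean per level of B
    mean_ab = x.mean(axis=2)       # one mean per (A, B) cell

    ss_total = ((x - grand) ** 2).sum()
    ss_a = n_b * n_rep * ((mean_a - grand) ** 2).sum()
    ss_b = n_a * n_rep * ((mean_b - grand) ** 2).sum()
    interaction = mean_ab - mean_a[:, None] - mean_b[None, :] + grand
    ss_ab = n_rep * (interaction ** 2).sum()
    ss_within = ((x - mean_ab[:, :, None]) ** 2).sum()
    return {"total": ss_total, "A": ss_a, "B": ss_b,
            "AB": ss_ab, "within": ss_within}

# Hypothetical data: 3 levels of A, 4 levels of B, 5 replicates per cell.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4, 5))
ss = two_way_ss(x)
print(ss)
```

For a balanced design the four contributions add up to the total; with unequal cell sizes this simple additive split no longer holds in general.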

Example

Experiment:

Rats are given bromobenzene (BB), which affects the liver

Experimental Design:

Time: 6, 24 and 48 hours

Groups: 3 doses of BB, vehicle group, control group

Animals: 3 rats per dose per time point

[Figure: design grid of rats, Rat 111 to Rat 313, arranged by time point: 6 hours, 24 hours, 48 hours]

[Figure: NMR spectrum; x-axis: chemical shift (ppm), 10 to 0; labeled peaks at 2.055, 2.075, 2.7175, 2.93, 3.0275, 3.0475, 3.285, 3.675, 3.7525 and 5.38 ppm]

Measurements: NMR spectroscopy of urine

NMR Spectroscopy

- Each type of H atom has a specific chemical shift

- The peak height reflects the number of H atoms at that chemical shift, and hence the metabolite concentration

- NMR measures ‘concentrations’ of different types of H atoms

[Figure: NMR spectrum; x-axis: chemical shift (ppm), 10 to 0; y-axis: 0 to 0.7; labeled peaks at 2.055, 2.075, 2.7175, 2.93, 3.0275, 3.0475, 3.285, 3.675, 3.7525 and 5.38 ppm]

Different contributions

[Figure: metabolite concentration versus time; panels: Trajectories, Time, Dose, Animal; label: Experimental Design]

The Method I: ANOVA

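Data:

$x_{hki_h} = \mu + \alpha_k + (\alpha\beta)_{hk} + (\alpha\beta\gamma)_{hki_h}$

Symbol    Meaning
k         Time
h         Dose group
i_h       Individual

Constraints:

$\sum_k \alpha_k = 0$

$\sum_h (\alpha\beta)_{hk} = 0$

$\sum_{i_h} (\alpha\beta\gamma)_{hki_h} = 0$

[Figure: three panels plotted against time, one per constraint]

Estimates of these factors (a dot marks averaging over that index):

$\hat{\mu} = x_{\cdot\cdot\cdot}$

$\hat{\alpha}_k = x_{\cdot k \cdot} - x_{\cdot\cdot\cdot}$

$\widehat{(\alpha\beta)}_{hk} = x_{hk\cdot} - x_{\cdot k \cdot}$

$\widehat{(\alpha\beta\gamma)}_{hki_h} = x_{hki_h} - x_{hk\cdot}$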
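The estimates can be computed as plain means over a data array for one variable. This is a minimal sketch, assuming a balanced design stored as `x[h, k, i]`; the sizes (5 groups, 3 time points, 3 animals) and the random values are hypothetical stand-ins for the real measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical balanced data for ONE variable: x[h, k, i]
# h = dose group, k = time point, i = individual within the (h, k) cell
x = rng.normal(size=(5, 3, 3))

mu = x.mean()                      # overall mean
time_mean = x.mean(axis=(0, 2))    # mean per time point k
cell_mean = x.mean(axis=2)         # mean per (dose group, time) cell

alpha = time_mean - mu                          # time effect
alpha_beta = cell_mean - time_mean[None, :]     # dose effect within each time point
alpha_beta_gamma = x - cell_mean[:, :, None]    # individual deviation from its cell

# Putting the estimated terms back together reproduces the data exactly.
recon = (mu + alpha[None, :, None]
         + alpha_beta[:, :, None] + alpha_beta_gamma)
print(np.allclose(recon, x))
```

With these estimates the constraints are satisfied automatically: `alpha` sums to zero over time, `alpha_beta` sums to zero over dose groups at each time point, and `alpha_beta_gamma` sums to zero over the individuals in each cell.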

The Method II: ANOVA is a univariate technique

x → X

[Figure: NMR spectrum with labeled peaks]

$x_{hki_h} = \mu + \alpha_k + (\alpha\beta)_{hk} + (\alpha\beta\gamma)_{hki_h}$

For all values in the ANOVA equation, e.g.: $\alpha_k \rightarrow \mathbf{X}_{\alpha}$

Matrices:

$\mathbf{X} = \mathbf{1}\mathbf{m}^T + \mathbf{X}_{\alpha} + \mathbf{X}_{\alpha\beta} + \mathbf{X}_{\alpha\beta\gamma}$

$\|\mathbf{X}\|^2 = \|\mathbf{1}\mathbf{m}^T\|^2 + \|\mathbf{X}_{\alpha}\|^2 + \|\mathbf{X}_{\alpha\beta}\|^2 + \|\mathbf{X}_{\alpha\beta\gamma}\|^2$

Structured!
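The matrix version of the decomposition can be sketched by applying the same means to every column of X at once. This is an illustrative sketch assuming a balanced design; the label vectors `dose` and `time`, the matrix sizes and the random data are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n_dose, n_time, n_rep, n_vars = 5, 3, 3, 8   # hypothetical sizes
# One design label per row (sample) of X.
dose = np.repeat(np.arange(n_dose), n_time * n_rep)
time = np.tile(np.repeat(np.arange(n_time), n_rep), n_dose)
X = rng.normal(size=(dose.size, n_vars))

m = X.mean(axis=0)  # overall mean of every variable
time_means = np.array([X[time == k].mean(axis=0) for k in range(n_time)])
cell_means = np.array([[X[(dose == h) & (time == k)].mean(axis=0)
                        for k in range(n_time)] for h in range(n_dose)])

X_mean = np.tile(m, (X.shape[0], 1))              # 1 m^T
X_a = time_means[time] - m                        # X_alpha
X_ab = cell_means[dose, time] - time_means[time]  # X_alphabeta
X_abg = X - cell_means[dose, time]                # X_alphabetagamma

def ss(A):
    """Sum of squares of all elements (squared Frobenius norm)."""
    return float((A ** 2).sum())

parts = ss(X_mean) + ss(X_a) + ss(X_ab) + ss(X_abg)
print(ss(X), parts)
```

The two printed numbers agree because, for a balanced design, the four matrices have mutually orthogonal column spaces, so all cross terms vanish.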

Multivariate Data: NMR Spectroscopy

[Figure: NMR spectrum; x-axis: chemical shift (ppm), 10 to 0; y-axis: 0 to 0.7; labeled peaks at 2.055, 2.075, 2.7175, 2.93, 3.0275, 3.0475, 3.285, 3.675, 3.7525 and 5.38 ppm]

Covariance between the variables

[Figure: scatter plot of two variables; x-axis: 2.05 ppm; y-axis: 6.01 ppm]

Or: relationship between the columns of X
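The covariance between the columns of X can be computed directly from the mean-centered matrix. This is a small sketch on made-up data, in which two columns are constructed to covary and a third is unrelated; none of the numbers come from the experiment.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical data: 45 samples, 3 variables (columns).
base = rng.normal(size=45)
X = np.column_stack([
    base,                                   # variable 1
    0.5 * base + 0.1 * rng.normal(size=45), # variable 2, built to covary with variable 1
    rng.normal(size=45),                    # variable 3, unrelated
])

Xc = X - X.mean(axis=0)               # center every column
cov = Xc.T @ Xc / (X.shape[0] - 1)    # variables-by-variables covariance matrix
print(np.round(cov, 3))
```

A univariate ANOVA run per column would ignore the off-diagonal entries of this matrix.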

The Method III: Principal Component Analysis

3D → 2D … Imagine! 350D → 2D !!!

[Figure: 3D scatter plot of X with axes x1, x2 and x3; labels: Loading PC 1, Loading PC 2]

$\mathbf{X} = \mathbf{1}\mathbf{m}^T + \mathbf{T}\mathbf{P}^T + \mathbf{E}$

T: scores, P: loadings, E: residuals

[Figure: scores plot, PC 2 versus PC 1]
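The PCA model with mean, scores, loadings and residuals can be sketched with a singular value decomposition. This is one common way to compute PCA and not necessarily the routine used for the slides; the function name `pca` and the 45 × 350 random matrix are assumptions for illustration.

```python
import numpy as np

def pca(X, n_components):
    """Return mean m, scores T, loadings P and residuals E of a PCA model."""
    m = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - m, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # scores
    P = Vt[:n_components].T                      # loadings (orthonormal columns)
    E = X - m - T @ P.T                          # part not described by the model
    return m, T, P, E

# Hypothetical data: 45 samples, 350 variables, reduced to 2 components.
rng = np.random.default_rng(4)
X = rng.normal(size=(45, 350))
m, T, P, E = pca(X, n_components=2)
print(T.shape, P.shape, E.shape)
```

Plotting the two columns of `T` against each other gives a scores plot, and each column of `P` can be plotted against the variable axis as a loading.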

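The Method IV: ANOVA and PCA → ASCA

$\mathbf{X} = \mathbf{1}\mathbf{m}^T + \mathbf{X}_{\alpha} + \mathbf{X}_{\alpha\beta} + \mathbf{X}_{\alpha\beta\gamma}$

$\mathbf{X} = \mathbf{1}\mathbf{m}^T + \mathbf{T}_{\alpha}\mathbf{P}_{\alpha}^T + \mathbf{T}_{\alpha\beta}\mathbf{P}_{\alpha\beta}^T + \mathbf{T}_{\alpha\beta\gamma}\mathbf{P}_{\alpha\beta\gamma}^T + \mathbf{E}$

Column spaces are orthogonal

E: parts of the data not explained by the component models

[Figure: block diagram of X with score matrices T_α, T_αβ, T_αβγ and loading matrices P_α, P_αβ, P_αβγ]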

In Words:

• ASCA models the different contributions to the variation in the data

• ASCA takes the covariance between the variables into account

• ASCA gives a solution for the problem at hand.
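The points above can be sketched in code: split X according to the design, then fit a component model to each part. This is a simplified sketch, assuming a balanced design and fitting each submodel with a plain SVD of its effect matrix; the function names, the keys `"a"`, `"ab"`, `"abg"` (for α, αβ, αβγ) and the random data are illustrative assumptions.

```python
import numpy as np

def effect_matrices(X, dose, time):
    """ANOVA step: split X into time, dose-within-time and individual parts."""
    m = X.mean(axis=0)
    X_time = np.zeros_like(X)
    X_cell = np.zeros_like(X)
    for k in np.unique(time):
        X_time[time == k] = X[time == k].mean(axis=0)
        for h in np.unique(dose):
            rows = (dose == h) & (time == k)
            X_cell[rows] = X[rows].mean(axis=0)
    return m, {"a": X_time - m, "ab": X_cell - X_time, "abg": X - X_cell}

def components(Xe, n):
    """Component step: scores T and loadings P of one effect matrix."""
    U, s, Vt = np.linalg.svd(Xe, full_matrices=False)
    return U[:, :n] * s[:n], Vt[:n].T

def asca(X, dose, time, n_components):
    X = np.asarray(X, dtype=float)
    m, effects = effect_matrices(X, dose, time)
    model, E = {}, X - m
    for name, Xe in effects.items():
        T, P = components(Xe, n_components[name])
        model[name] = (T, P)
        E = E - T @ P.T          # what the submodels leave unexplained
    return m, model, E

# Hypothetical balanced design: 5 groups x 3 time points x 3 animals, 20 variables.
rng = np.random.default_rng(5)
dose = np.repeat(np.arange(5), 9)
time = np.tile(np.repeat(np.arange(3), 3), 5)
X = rng.normal(size=(45, 20))
m, model, E = asca(X, dose, time, {"a": 2, "ab": 2, "abg": 2})
T_a, P_a = model["a"]
T_ab, P_ab = model["ab"]
```

In this sketch all animals measured at the same time point share one row of `T_a`, and all animals in the same dose group and time point share one row of `T_ab`, which is why submodel scores can be plotted per group against time.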

Results I

[Figure: the submodels X_α, X_αβ and X_αβγ; label: 40 %]

[Figure: αβ-scores versus Time (Hours) at 6, 24 and 48; groups: control, vehicle, low, medium, high]

Results II

• Quantitative effect!

• No effect of vehicle

• Scores are in agreement with visual inspection

[Figure: scores versus Time (Hours) at 6, 24 and 48; groups: control, vehicle, low, medium, high]

Results III: biomarkers

[Figure: three panels (αβγ, αβ, α) against chemical shift (ppm), 10 to 0, with labeled peaks between 2.055 and 5.38 ppm]

Differences between submodels

Interesting for Biology

Interesting for Diagnostics

Unique to the α submodel

Conclusions

• Metabolomics (and other –omics) techniques give multivariate datasets with an underlying experimental design

• For this type of data, ASCA can be used

• The results observed for this experiment are in accordance with clinical observations

• The metabolites that are responsible for this variation can be found using ASCA → BIOMARKERS

Discussion

1. How can I perform statistics on the ASCA model? (e.g. Significance testing)

2. Are there other constraints possible for this model? (e.g. stochastic independence)

3. Are there alternative methods for solving this problem?