Investigating the utility of clinical outcome-guided mutual information network in network-based Cox regression

Hyun-hwan Jeong, So Yeon Kim, Kyubum Wee, Kyung-Ah Sohn

Department of Information and Computer Engineering, Ajou University

Upload: hyun-hwan-jeong

Post on 05-Aug-2015


Outline

• Motivation

• Methods

• Results

• Conclusions


MOTIVATION


Cox regression model (Cox, 1972) (1/2)

• Commonly used model in survival analysis

– Proportional hazard model

– Parameter estimation using the partial log-likelihood

• Pros and Cons

– Pros: able to handle censored patients

– Cons: not feasible when 𝑝 ≫ 𝑛 (far more features than patients)
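
To make the partial log-likelihood mentioned above more concrete, here is a minimal NumPy sketch. It assumes no tied event times, and the function name `cox_partial_loglik` and the toy data are made up for illustration; they are not taken from the slides.

```python
import numpy as np

def cox_partial_loglik(X, t, delta, beta):
    """Cox partial log-likelihood, assuming no tied event times.

    X: (n, p) features, t: (n,) follow-up times,
    delta: (n,) 1 = event observed, 0 = censored, beta: (p,) coefficients.
    """
    eta = X @ beta  # linear predictor X'beta for each patient
    loglik = 0.0
    for i in np.flatnonzero(delta == 1):  # only observed events contribute a term
        at_risk = t >= t[i]  # risk set: patients still under observation at t_i
        loglik += eta[i] - np.log(np.sum(np.exp(eta[at_risk])))
    return loglik

# Toy data (hypothetical): 4 patients, 2 features; the third patient is censored.
X = np.array([[0.5, 1.0], [1.5, -0.5], [-1.0, 0.3], [0.2, 0.8]])
t = np.array([5.0, 3.0, 8.0, 6.0])
delta = np.array([1, 1, 0, 1])

ll_zero = cox_partial_loglik(X, t, delta, np.zeros(2))
```

The censored patient adds no term of its own but still appears in the risk sets of earlier events, which is how the model makes use of censored observations.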


Cox regression model (Cox, 1972) (2/2)

• Several variants of the Cox model have been proposed to resolve this issue

– 𝐿1-regularization (Tibshirani, 1997)

– 𝐿2-regularization in Hilbert space (Li and Luan, 2002)

• However, these models are still prone to noise and to over-fitting when the sample size 𝑛 is small

– The models consider only the marginal effect of each individual feature


A solution – use of prior information

• Network-based Cox regression (Net-Cox) (Zhang et al., 2013)

– An extension of 𝐿2-Cox regression

– A network regularization term in the penalty reflects the effect of pairwise feature interactions


Our first investigation

• Two types of networks used in Zhang et al.’s study

– Co-expression network

– Functional linkage network

• A potential limitation of these networks

– No consideration of the association between features and outcomes

• Our assumption

– “A network that reflects this association may improve prediction performance”


Summary of this study

• Applying the outcome-guided mutual information network to Net-Cox

• Comparing the prediction power of different types of networks

• Demonstrating the utility of the network on three genomic profiles of ovarian cancer patients in TCGA


METHODS

1. Overview of survival analysis

2. Outcome-guided network construction

3. Applying outcome-guided mutual information to Net-Cox


Overview of survival analysis for genomic profile

[Figure: three-stage pipeline: Data preparation → Analysis using a survival model → Prediction & detection]

Data preparation

• Information of patients

– For $n$ patients over $p$ genes

– Three types of information

  • expression profile: $X \in \mathbb{R}^{n \times p}$

  • follow-up time: $t \in \mathbb{R}^{n \times 1}$

  • observed status: $\delta \in \{0, 1\}^{n \times 1}$

Analysis using a survival model

• Model selection

– Proportional hazard model: $h(t \mid X)$

• Estimation

– Baseline hazard: $h_0(t_i)$

– Regression coefficient: $\beta$

Prediction & detection

• Prediction of survivability

• Detection of important features

METHODS

1. Overview of survival analysis

2. Outcome-guided network construction

3. Applying outcome-guided mutual information to Net-Cox


Construction of the outcome-guided mutual information networks (1/3)

• Mutual information

– An association measure in information theory

  • based on Shannon’s entropy

  • able to measure both linear and non-linear association between two random variables

– Widely used in GWAS to measure the strength of association between SNPs and traits (Leem et al. 2014, Hu et al. 2011)

$$I(X_1, X_2; Y) = H(X_1, X_2) + H(Y) - H(X_1, X_2, Y)$$

where $(X_1, X_2)$ is a pair of genomic features and $Y$ is the binary outcome.
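
As an illustration of the entropy-based formula above, the following sketch computes the mutual information between a pair of discrete features and a binary outcome. It assumes the features are already discrete (continuous profiles would need to be binned first), and the helper names `entropy` and `pair_outcome_mi` are hypothetical.

```python
import numpy as np
from collections import Counter

def entropy(*columns):
    """Shannon entropy (in bits) of the joint distribution of discrete columns."""
    joint = list(zip(*columns))  # one tuple per sample
    n = len(joint)
    probs = np.array([count / n for count in Counter(joint).values()])
    return float(-np.sum(probs * np.log2(probs)))

def pair_outcome_mi(x1, x2, y):
    """I(X1, X2; Y) = H(X1, X2) + H(Y) - H(X1, X2, Y)."""
    return entropy(x1, x2) + entropy(y) - entropy(x1, x2, y)

# Toy example: the outcome is the XOR of two binary features.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [0, 1, 1, 0]

mi_pair = pair_outcome_mi(x1, x2, y)                    # pair vs. outcome
mi_single = entropy(x1) + entropy(y) - entropy(x1, y)   # single feature vs. outcome
```

In this toy case the pair carries one full bit of information about the outcome while each feature alone carries none, which is the kind of pairwise effect a marginal, per-feature analysis would miss.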


Construction of the outcome-guided mutual information networks (2/3)

• Two parameters: $\theta$ and $\sigma$

[Figure: distribution of mutual information values (x-axis: mutual information; y-axis: # of edges), with the threshold $\theta$ marked]

$$\theta = \max_{i \neq j} I_{\mathrm{avg}}(i, j)$$

$$I_{\mathrm{avg}}(i, j) = \frac{1}{30} \sum_{p=1}^{30} I(g_i, g_j; Y_p)$$


Construction of the outcome-guided mutual information networks (3/3)

• Two parameters: $\theta$ and $\sigma$

[Figure: distribution of mutual information values (x-axis: mutual information; y-axis: # of edges), with the thresholds $\theta$ and $\theta(1 + \sigma)$ marked]

$$G_\sigma = \{ (g_i, g_j) \mid (g_i, g_j) \in P \text{ and } I(g_i, g_j; Y) \geq \theta (1 + \sigma) \}$$
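
The thresholding rule can be sketched as follows. This is only an illustration under stated assumptions: the pairwise mutual information values are taken as precomputed matrices, $Y_p$ is read as the $p$-th permuted outcome vector, the candidate set $P$ is taken to be all pairs with $i \neq j$, and the function name `build_outcome_guided_edges` is hypothetical. The toy input uses 2 permutations instead of 30 to stay short.

```python
import numpy as np

def build_outcome_guided_edges(mi_true, mi_perm, sigma):
    """Keep gene pairs whose MI with the true outcome is >= theta * (1 + sigma).

    mi_true: (p, p) symmetric matrix of I(g_i, g_j; Y).
    mi_perm: (k, p, p) array of I(g_i, g_j; Y_p), one matrix per outcome vector Y_p.
    """
    p = mi_true.shape[0]
    off_diag = ~np.eye(p, dtype=bool)          # restrict to i != j
    i_avg = mi_perm.mean(axis=0)               # I_avg(i, j)
    theta = i_avg[off_diag].max()              # theta = max over i != j
    keep = off_diag & (mi_true >= theta * (1.0 + sigma))
    rows, cols = np.nonzero(np.triu(keep, k=1))  # each undirected pair once
    return theta, list(zip(rows.tolist(), cols.tolist()))

# Toy values (made up): 3 genes, 2 stand-in permutations.
mi_true = np.array([[0.0, 0.5, 0.1],
                    [0.5, 0.0, 0.3],
                    [0.1, 0.3, 0.0]])
mi_perm = np.array([[[0.0, 0.1, 0.05], [0.1, 0.0, 0.2], [0.05, 0.2, 0.0]],
                    [[0.0, 0.3, 0.15], [0.3, 0.0, 0.2], [0.15, 0.2, 0.0]]])

theta, edges_loose = build_outcome_guided_edges(mi_true, mi_perm, sigma=0.25)
_, edges_strict = build_outcome_guided_edges(mi_true, mi_perm, sigma=0.75)
```

Raising $\sigma$ raises the cut-off above $\theta$, so the network becomes sparser and keeps only the pairs most strongly associated with the outcome.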


METHODS

1. Overview of survival analysis

2. Outcome-guided network construction

3. Applying outcome-guided mutual information to Net-Cox


Overview of Net-Cox (1/3)

• Log-likelihood in 𝐿2-Cox

Overview of Net-Cox (2/3)

• Log-likelihood in 𝐿2-Cox + network regularization

Overview of Net-Cox (3/3)

• Feature network

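Applying outcome-guided mutual information to Net-Cox (1/2)

• Penalized total log-likelihood of Net-Cox

$$l_{pen}(\beta, h_0) = l(\beta, h_0) - \frac{1}{2} \lambda \beta' [(1 - \alpha) L + \alpha I] \beta$$

– The second term is the penalty term; the outcome-guided mutual information network enters through $L$

• $L = I - S$, where $I$ is the identity matrix and $S$ is the normalized network matrix (so $L$ is the normalized Laplacian)

• $\lambda$: control parameter

• $\alpha \in (0, 1]$: network contribution parameter; if $\alpha = 1$, Net-Cox ≡ $L_2$-Cox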
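
A small numerical sketch of the penalty term may help show what the network part does. It assumes $S$ is an already-normalized symmetric network matrix; the function name `netcox_penalty` and the two-feature toy network are hypothetical.

```python
import numpy as np

def netcox_penalty(beta, S, lam, alpha):
    """0.5 * lam * beta' [(1 - alpha) * L + alpha * I] beta, with L = I - S."""
    identity = np.eye(len(beta))
    laplacian = identity - S
    penalty_matrix = (1.0 - alpha) * laplacian + alpha * identity
    return 0.5 * lam * (beta @ penalty_matrix @ beta)

# Toy network: two features joined by a single normalized edge.
S = np.array([[0.0, 1.0],
              [1.0, 0.0]])
beta_similar = np.array([1.0, 1.0])    # connected features, equal coefficients
beta_opposite = np.array([1.0, -1.0])  # connected features, opposite coefficients

pen_similar = netcox_penalty(beta_similar, S, lam=1.0, alpha=0.5)
pen_opposite = netcox_penalty(beta_opposite, S, lam=1.0, alpha=0.5)
pen_ridge = netcox_penalty(beta_opposite, S, lam=1.0, alpha=1.0)
```

With $\alpha < 1$, coefficients that differ across a connected pair are penalized more than coefficients that agree, so the network encourages smoothness along edges; with $\alpha = 1$ the network term vanishes and only the ridge penalty remains.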


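Applying outcome-guided mutual information to Net-Cox (2/2)

• Definition of edge weight in $G_\sigma$

$$(G_\sigma)_{ij} = \frac{I(g_i, g_j; Y)}{\sum_{(g_i, g_k) \in G_\sigma} I(g_i, g_k; Y) \, \sum_{(g_j, g_k) \in G_\sigma} I(g_j, g_k; Y)}$$

• Matrix normalization for Laplacian constraint

– $S = R^{-1/2} G_\sigma C^{-1/2}$

  • $R_{ii} = \sum_j (G_\sigma)_{ij}$

  • $C_{ii} = \sum_j (G_\sigma')_{ij}$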
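
The matrix normalization step can be sketched in NumPy as below, assuming a non-negative weight matrix for the network. The function name `normalize_network`, the handling of isolated nodes, and the toy path graph are assumptions made for illustration.

```python
import numpy as np

def normalize_network(G):
    """S = R^{-1/2} G C^{-1/2}, with R and C the diagonal row and column sums of G."""
    r = G.sum(axis=1)  # R_ii: row sums
    c = G.sum(axis=0)  # C_ii: column sums (row sums of G')
    r_scale = np.zeros_like(r, dtype=float)
    c_scale = np.zeros_like(c, dtype=float)
    # Assumption: isolated nodes (zero sums) keep zero rows/columns.
    r_scale[r > 0] = 1.0 / np.sqrt(r[r > 0])
    c_scale[c > 0] = 1.0 / np.sqrt(c[c > 0])
    return r_scale[:, None] * G * c_scale[None, :]

# Toy network (made up): a 3-node path graph with edge weight 2.
G = np.array([[0.0, 2.0, 0.0],
              [2.0, 0.0, 2.0],
              [0.0, 2.0, 0.0]])

S = normalize_network(G)
L = np.eye(3) - S  # Laplacian used in the Net-Cox penalty
```

For a symmetric network the eigenvalues of $S$ lie in $[-1, 1]$, so $L = I - S$ is positive semi-definite and the penalty term cannot become negative.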


RESULTS


Dataset - Ovarian Cancer data in TCGA

• Measurements for 10,022 genes of 340 cancer patients at three different genomic levels

• Survival month classification:

– Short-term (< 36 months), long-term (otherwise)

Genomic profile   Platform                               Data type
CNA               Affymetrix SNP 6                       Discrete (GISTIC), Continuous
mRNA              Agilent microarray                     Continuous
methylation       Illumina Infinium HumanMethylation27   Continuous


Performance comparison of feature networks

• Outcome-guided mutual information network

• Co-expression network (Zhang et al. 2013)

• Functional linkage network (Zhang et al. 2013)

• Without feature network (𝐿2-Cox model)


Best parameter selection (1/2)

• 5-fold cross validation

– 80% of samples for training and testing

– 20% of samples for validation

• Parameters

– $\sigma \in \{0.00, 0.05, 0.10, 0.20, 0.25, 0.30\}$

– $\lambda \in \{10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\}$

– $\alpha \in \{0.1, 0.3, 0.5, 0.7, 0.9, 1.0\}$
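
The parameter search over these values can be laid out as a plain grid. The sketch below only enumerates the combinations and picks the best one under a scoring function; `select_best` and `toy_score` are hypothetical stand-ins, and in practice the score would be something like the mean cross-validated AUC.

```python
from itertools import product

sigmas = [0.00, 0.05, 0.10, 0.20, 0.25, 0.30]
lambdas = [1e-4, 1e-3, 1e-2, 1e-1]
alphas = [0.1, 0.3, 0.5, 0.7, 0.9, 1.0]

# Every (sigma, lambda, alpha) combination to evaluate.
grid = list(product(sigmas, lambdas, alphas))

def select_best(grid, score_fn):
    """Return the parameter triple with the highest score."""
    return max(grid, key=score_fn)

def toy_score(params):
    """Placeholder score (NOT a real model fit): peaks at (0.10, 1e-4, 0.1)."""
    sigma, lam, alpha = params
    return -abs(sigma - 0.10) - abs(lam - 1e-4) - abs(alpha - 0.1)

best = select_best(grid, toy_score)
```

The grid has 6 × 4 × 6 = 144 combinations, each of which would need one model fit per cross-validation fold.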


Best parameter selection (2/2)

• Performance measure

– Time-dependent $AUC[t \mid f(X)]$, where $f(X) = X'\beta$

– $AUC[t \mid f(X)]$: area under the $ROC[t \mid f(X)]$ curve

– $ROC[t \mid f(X)]$: sensitivity vs. 1 − specificity

– $\mathrm{sensitivity}[c, t \mid f(X)] = \Pr\{f(X) > c \mid \delta(t) = 1\}$

– $\mathrm{specificity}[c, t \mid f(X)] = \Pr\{f(X) \le c \mid \delta(t) = 0\}$
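
The quantities above can be sketched for a single time point as follows. This is a simplified illustration: it assumes every patient's status at time $t$ is known (it ignores censoring before $t$), it uses the standard convention that specificity counts controls scoring at or below the threshold, and the function names and toy scores are made up.

```python
import numpy as np

def sensitivity(scores, status, c):
    """Fraction of cases (status 1 at time t) with risk score above c."""
    return float(np.mean(scores[status == 1] > c))

def specificity(scores, status, c):
    """Fraction of controls (status 0 at time t) with risk score at or below c."""
    return float(np.mean(scores[status == 0] <= c))

def auc(scores, status):
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    cases = scores[status == 1][:, None]
    controls = scores[status == 0][None, :]
    return float(np.mean((cases > controls) + 0.5 * (cases == controls)))

# Toy risk scores f(X) = X'beta and event status at a fixed time t (hypothetical).
scores = np.array([2.1, 0.4, 1.3, -0.7, 0.9])
status = np.array([1, 0, 0, 0, 1])

sens = sensitivity(scores, status, c=1.0)
spec = specificity(scores, status, c=1.0)
area = auc(scores, status)
```

Sweeping the threshold `c` over all score values traces out the ROC curve, and the pairwise formula gives the same area without building the curve explicitly.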


Assessment of significance

• Log-rank test for validity of the group assessment

– Prognostic indices $PI = X'\beta$

– Top 40% of patients as high-risk, bottom 40% of patients as low-risk, ranked by PI in descending order

• Network analysis

– For the sub-network of the 100 largest-coefficient genes

– Measurement of network properties

• Enrichment test for biological terms

– ToppGene (https://toppgene.cchmc.org)

– Gene Ontology (GO), disease, pathway
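
The high-risk / low-risk split by prognostic index described above can be sketched as below. The function name `risk_groups`, the rounding of the group size, and the toy data are assumptions; the log-rank test itself is not shown.

```python
import numpy as np

def risk_groups(X, beta, fraction=0.4):
    """Split patients into high- and low-risk groups by prognostic index PI = X'beta."""
    pi = X @ beta
    order = np.argsort(-pi)                  # patient indices, PI in descending order
    k = int(round(fraction * len(pi)))       # group size (assumed rounding rule)
    return order[:k], order[-k:]             # top fraction, bottom fraction

# Toy data (hypothetical): 10 patients, 1 feature, unit coefficient.
X = np.array([[0.3], [2.0], [-1.2], [0.8], [1.5],
              [-0.4], [0.1], [-2.0], [1.1], [0.6]])
beta = np.array([1.0])

high_risk, low_risk = risk_groups(X, beta)
```

The middle 20% of patients are left out of both groups, and the survival times of the two groups would then be compared with the log-rank test.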


RESULTS

1. Cross validation & statistical assessment

2. Network analysis


Prediction accuracy for 𝐺𝜎


Optimal parameters for each network

Profile   Network              σ      λ       α     Mean(AUC)
CNA       Mutual Information   0.30   10^-3   0.1   0.5875
CNA       Correlation          -      10^-3   0.9   0.5817
CNA       Functional Linkage   -      10^-3   0.3   0.5786
CNA       𝐿2-Cox               -      10^-3   1.0   0.5810
mRNA      Mutual Information   0.10   10^-4   0.1   0.6317
mRNA      Correlation          -      10^-4   0.9   0.6280
mRNA      Functional Linkage   -      10^-4   0.3   0.6242
mRNA      𝐿2-Cox               -      10^-4   1.0   0.6288
METH      Mutual Information   0.30   10^-4   0.5   0.5912
METH      Correlation          -      10^-4   0.7   0.5894
METH      Functional Linkage   -      10^-4   0.3   0.5860
METH      𝐿2-Cox               -      10^-4   1.0   0.5899

Performance comparison of different network types in validation set


Kaplan-Meier survival curves with log-rank test

[Figure: grid of Kaplan-Meier survival curves; rows: Mutual information, Co-expression network, Functional linkage, 𝐿2-Cox; columns: CNA, mRNA, METH]

Significant genes for each profile


Enrichment test of 100 largest coefficient genes for each profile

Profile   Category   Name                       p-value    Adjusted p-value
CNA       Disease    Ovarian Neoplasms          5.24E-04   2.84E-02
CNA       Disease    Carcinoma                  7.42E-03   3.66E-02
mRNA      GO:MF      chitinase activity         4.80E-06   1.78E-03
mRNA      GO:BP      chitin catabolic process   4.83E-06   4.94E-03
mRNA      GO:BP      chitin metabolic process   4.83E-06   4.94E-03

RESULTS

1. Cross validation & statistical assessment

2. Network analysis


Sub-network of large-coefficient genes for each profile

[Figure: three sub-network panels: CNA, METH, mRNA]

Network properties of mutual information sub-network

Properties                        CNA     mRNA    METH
Nodes                             78      154     149
Connected components              7       54      50
R^2 of node degree distribution   0.672   0.922   0.909
Network centralization            0.312   0.058   0.052
Characteristic path length        2.117   2.187   2.164
Average number of neighbors       2.564   1.299   1.342
Network density                   0.033   0.008   0.009
Network heterogeneity             1.383   0.828   0.809

Enrichment test for each sub-network of profiles

Profile   Category   Name                                          p-value    Adjusted p-value
CNA       Disease    Hyperlipidemias                               1.76E-05   2.24E-03
CNA       Disease    Obesity                                       1.98E-03   1.21E-02
CNA       Disease    Insulin Resistance                            8.96E-03   2.42E-02
CNA       Disease    Neoplasms                                     1.05E-02   2.78E-02
mRNA      Disease    Carcinoma                                     1.19E-04   1.23E-02
mRNA      Disease    Neoplasm Recurrence, Local                    1.25E-02   4.96E-02
METH      GO:MF      G-protein coupled peptide receptor activity   4.15E-05   1.21E-02
METH      GO:MF      peptide receptor activity                     4.86E-05   1.21E-02
METH      GO:MF      anaphylatoxin receptor activity               1.78E-04   1.78E-02
METH      GO:MF      calcitonin receptor activity                  1.78E-04   1.78E-02
METH      GO:MF      thyroxine 5'-deiodinase activity              1.78E-04   1.78E-02

CONCLUSIONS


Conclusions (1/2)

• Outcome-guided mutual information network

– Can further improve prediction performance

– The permutation testing scheme also contributes to the improvement

• In the analysis of genomic profiles of ovarian cancer patients,

– Sub-networks of the largest-coefficient genes have topologies typical of biological networks and show biological significance


Conclusions (2/2)

• However, the proposed method shows only a marginal performance improvement.

– This seems to be due to a mismatch between the high mutual information values and the small value of the penalty term in the Net-Cox model.

– We plan to modify the penalty term, which we expect to further improve prediction power.


Thank you.


Q & A
