pathway based omics data classification

Post on 22-Jan-2018


Pathway-based OMICs data classification

Bioinformatics - 2016/2017

Goals

• Classification with pathways: groups of genes that are involved in the same biological functions

• Identify relations among pathways

• Build a graph of interactions between pathways and miRNAs

Data

• Breast Cancer (BC)
  • 151 patients
  • RNA (20501 genes)
  • miRNA (1046)
  • 4 classes
    • LumA - 55
    • LumB - 59
    • Basal - 24
    • Her2 - 13

• Glioma
  • 167 patients
  • RNA (12042 genes)
  • miRNA (534)
  • 4 classes
    • Proneural - 52
    • Classical - 37
    • Mesenchymal - 54
    • Neural - 24

Pathways by MSigDB → KEGG, Reactome, Biocarta, C6…

Introduction

[Figure: pipeline overview. Labels: Genes, miRNAs, MSigDB, Discriminant Fuzzy Patterns, Enrichment Analysis, Classification (linear SVM), Permutation Test, Interaction Graph.]

First step

• BC dataset → training set (70%) and test set (30%)

• Glioma dataset → training set (75%) and test set (25%)
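The split above can be sketched as follows. The slides do not say whether the split was random or stratified by class, so this minimal sketch assumes a plain random shuffle; `split_patients` and the seed are hypothetical names chosen for illustration.

```python
import random

def split_patients(patient_ids, train_fraction, seed=0):
    """Shuffle patient IDs and split them into a training and a test set."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)          # reproducible shuffle
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# 151 BC patients at 70/30, 167 glioma patients at 75/25 (sizes from the slides).
bc_train, bc_test = split_patients(range(151), 0.70)
glioma_train, glioma_test = split_patients(range(167), 0.75)
print(len(bc_train), len(bc_test), len(glioma_train), len(glioma_test))
```

With four unbalanced classes (for example only 13 Her2 patients), a stratified split would keep class proportions similar in both sets; that would be a variation on this sketch, not something the slides state.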

Feature selection

• Discriminant fuzzy pattern (DFP)
  • Too many features → identify discriminant genes

• Enrichment → grouping genes in pathways (MSigDB)
  • Identify which pathways are significantly represented by the genes selected with the DFP algorithm

Discriminant Fuzzy Pattern - Grid search (BC)

• Skip factor → 0, 1, 2, 3
  • Factor to skip outliers. Lower value → more values skipped (0: don't skip)

• Zeta → 0.35, 0.4, 0.45, 0.5
  • Threshold used in the membership functions to label the float values with a discrete value

• piVal → 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8
  • Percentage of values of a class used to determine the fuzzy patterns

• Overlapping → 1, 2
  • Determines the number of discrete labels

• Genes after DFP → 578 with Skip Factor 1, Zeta 0.4, piVal 0.65 and Overlapping 1

Discriminant Fuzzy Pattern - Grid search (Glioma)

• Skip factor → 1, 2, 3

• Zeta → 0.35, 0.4, 0.45, 0.5

• piVal → 0.6, 0.65, 0.7

• Overlapping → 1, 2

• Genes after DFP → 635 with Skip Factor 1, Zeta 0.35, piVal 0.65 and Overlapping 1
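To give a sense of the size of the two grids, the sketch below enumerates every parameter combination. It only expands the grids; it does not call any DFP implementation, and the dictionary keys and `expand_grid` are hypothetical names. How each combination was scored is not stated on the slides.

```python
from itertools import product

# Parameter values copied from the slides.
bc_grid = {
    "skip_factor": [0, 1, 2, 3],
    "zeta": [0.35, 0.4, 0.45, 0.5],
    "pi_val": [0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8],
    "overlapping": [1, 2],
}
glioma_grid = {
    "skip_factor": [1, 2, 3],
    "zeta": [0.35, 0.4, 0.45, 0.5],
    "pi_val": [0.6, 0.65, 0.7],
    "overlapping": [1, 2],
}

def expand_grid(grid):
    """Return one dict per combination of parameter values."""
    keys = list(grid)
    return [dict(zip(keys, values)) for values in product(*(grid[k] for k in keys))]

bc_settings = expand_grid(bc_grid)
glioma_settings = expand_grid(glioma_grid)
print(len(bc_settings), len(glioma_settings))  # 288 72
```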

Enrichment

• Breast Cancer
  • Number of pathways selected through enrichment: 1585
  • Number of pathways with more than 10 genes: 859

• Glioma
  • Number of pathways by enrichment: 1612 (p-value 0.0001)
  • First 1000 pathways with more than 10 genes and lowest p-value
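The filtering described above (keep pathways with more than 10 genes, then take those with the lowest p-values) can be sketched like this. The enrichment statistics are assumed to be already computed; `select_pathways`, the input layout and the toy pathway names are hypothetical, and whether "genes" means all genes of the pathway or only the DFP-selected ones is not specified on the slide.

```python
def select_pathways(enrichment, min_genes=10, top_n=None):
    """Filter and rank enriched pathways.

    enrichment: dict mapping pathway name -> (n_genes, p_value).
    Keeps pathways with MORE than `min_genes` genes, sorts them by
    ascending p-value and optionally keeps only the first `top_n`.
    """
    kept = [
        (name, n_genes, p_value)
        for name, (n_genes, p_value) in enrichment.items()
        if n_genes > min_genes
    ]
    kept.sort(key=lambda item: item[2])
    return kept if top_n is None else kept[:top_n]

# Toy input, not real results.
toy = {
    "PATHWAY_A": (25, 1e-6),
    "PATHWAY_B": (8, 1e-7),   # dropped: not more than 10 genes
    "PATHWAY_C": (40, 5e-5),
    "PATHWAY_D": (12, 1e-3),
}
selected = [name for name, _, _ in select_pathways(toy, top_n=2)]
print(selected)  # ['PATHWAY_A', 'PATHWAY_C']
```

For the glioma setting described above, `top_n` would be 1000.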

Classification with SVM

• Linear SVM

• Two-level cross-validation
  • 3 outer folds
  • 2 inner folds

• C: 1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1e0, 1e1, 1e2, 1e3, 1e4, 1e5, 1e6
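The two-level cross-validation can be sketched as a skeleton: the inner folds pick C, the outer folds estimate accuracy with the chosen C. This is a structural sketch only; `fit_and_score` is a hypothetical callback standing in for "train a linear SVM with this C on the training indices and return accuracy on the held-out indices", and the toy callback below trains nothing.

```python
import math
import random

C_GRID = [10.0 ** k for k in range(-5, 7)]  # 1e-5 ... 1e6, as on the slide

def k_folds(indices, k, seed=0):
    """Shuffle indices and deal them into k roughly equal folds."""
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def nested_cv(n_samples, fit_and_score, outer_k=3, inner_k=2):
    """Return one (best_C, outer_accuracy) pair per outer fold."""
    results = []
    all_idx = range(n_samples)
    for held_out in k_folds(all_idx, outer_k):
        held = set(held_out)
        outer_train = [i for i in all_idx if i not in held]
        inner_folds = k_folds(outer_train, inner_k)

        def inner_mean(C):
            # Mean validation score of this C over the inner folds.
            scores = []
            for val in inner_folds:
                val_set = set(val)
                train = [i for i in outer_train if i not in val_set]
                scores.append(fit_and_score(train, val, C))
            return sum(scores) / len(scores)

        best_C = max(C_GRID, key=inner_mean)
        results.append((best_C, fit_and_score(outer_train, held_out, best_C)))
    return results

# Toy callback (hypothetical): pretends accuracy peaks at C = 1.
def toy_score(train_idx, test_idx, C):
    return 1.0 - abs(math.log10(C)) / 10

results = nested_cv(151, toy_score)
print(results)
```

Selecting C only on the inner folds keeps the outer test folds untouched by tuning, which is the point of the two-level scheme.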

First step of classification

[Figure: for each pathway (1, 2, 3, ..., i), the Patients × genes matrix restricted to that pathway's genes is fed to its own classifier (Linear SVM 1, Linear SVM 2, Linear SVM 3, ..., Linear SVM i); each classifier outputs class probabilities for every patient.]

Size vs Pathways Accuracy (BC)

Correlation: 0.028

Size vs Pathways Accuracy (Glioma)

Correlation: 0.342

Pathways after permutation test

• 1000 permutation tests on the pathways with best accuracies

• Breast cancer
  • Number of pathways that passed the permutation test: 36
  • Lowest accuracy 77.9%
  • Highest accuracy 84.6%

• Glioma
  • Number of pathways that passed the permutation test: 278
  • Lowest accuracy 80%
  • Highest accuracy 88%
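A label-permutation test of the kind summarised above can be sketched as follows. The slides do not give the exact procedure or the significance threshold, so this assumes the common scheme: shuffle the class labels, recompute the score, and count how often the shuffled score reaches the observed one. `score_fn` is a hypothetical callback that would retrain and evaluate the pathway classifier for a given labelling; the toy version below only measures agreement with fixed predictions.

```python
import random

def permutation_p_value(labels, score_fn, n_permutations=1000, seed=0):
    """Empirical p-value of score_fn(labels) under random label shuffles."""
    rng = random.Random(seed)
    observed = score_fn(labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if score_fn(shuffled) >= observed:
            at_least_as_good += 1
    # +1 in numerator and denominator avoids reporting a p-value of zero.
    return (at_least_as_good + 1) / (n_permutations + 1)

# Toy example (not real data): predictions that match the true labels exactly.
predictions = ["A"] * 10 + ["B"] * 10
true_labels = ["A"] * 10 + ["B"] * 10

def agreement(labels):
    return sum(p == t for p, t in zip(predictions, labels)) / len(labels)

p_value = permutation_p_value(true_labels, agreement)
print(p_value)
```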

BC Pathways

• ACEVEDO_FGFR1_TARGETS_IN_PROSTATE_CANCER_MODEL_UP

• DEBIASI_APOPTOSIS_BY_REOVIRUS_INFECTION_DN

• DELACROIX_RARG_BOUND_MEF

• ENK_UV_RESPONSE_EPIDERMIS_UP

• ENK_UV_RESPONSE_KERATINOCYTE_DN

• FARMER_BREAST_CANCER_APOCRINE_VS_BASAL

• GO_CELLULAR_RESPONSE_TO_LIPID

• GO_CIRCULATORY_SYSTEM_PROCESS

• GO_GLAND_DEVELOPMENT

• GO_REGIONALIZATION

• GO_REGULATION_OF_CELL_CYCLE_PHASE_TRANSITION

• GO_REGULATION_OF_PROTEIN_SERINE_THREONINE_KINASE_ACTIVITY

• GO_RESPONSE_TO_ALCOHOL

• GO_RESPONSE_TO_ESTROGEN

• GO_RESPONSE_TO_STEROID_HORMONE

• GO_UROGENITAL_SYSTEM_DEVELOPMENT

• GSE1460_NAIVE_CD4_TCELL_ADULT_BLOOD_VS_THYMIC_STROMAL_CELL_DN

• GSE21927_SPLEEN_VS_4T1_TUMOR_MONOCYTE_BALBC_DN

• GSE23502_WT_VS_HDC_KO_MYELOID_DERIVED_SUPPRESSOR_CELL_COLON_TUMOR_DN

• GSE26351_WNT_VS_BMP_PATHWAY_STIM_HEMATOPOIETIC_PROGENITORS_UP

• HALLMARK_ESTROGEN_RESPONSE_LATE

• LEI_MYB_TARGETS

• LIU_PROSTATE_CANCER_DN

• MODULE_18

• MODULE_255

• MODULE_52

• NFE2L2.V2

• SATO_SILENCED_BY_METHYLATION_IN_PANCREATIC_CANCER_1

• SHEN_SMARCA2_TARGETS_DN

• SMID_BREAST_CANCER_RELAPSE_IN_BONE_DN

• V$ALPHACP1_01

• V$TEF1_Q6

• V$ZIC2_01

• VANTVEER_BREAST_CANCER_ESR1_DN

• VECCHI_GASTRIC_CANCER_EARLY_DN

• ZHANG_BREAST_CANCER_PROGENITORS_UP

Glioma Pathways

• MEISSNER_NPC_HCP_WITH_H3K4ME2

• YYCATTCAWW_UNKNOWN

• RIGGI_EWING_SARCOMA_PROGENITOR_UP

• DEURIG_T_CELL_PROLYMPHOCYTIC_LEUKEMIA_DN

• MODULE_169

• GO_REGULATION_OF_MEMBRANE_POTENTIAL

• GSE24574_BCL6_LOW_TFH_VS_NAIVE_CD4_TCELL_UP

• GSE25677_MPL_VS_R848_STIM_BCELL_DN

• REACTOME_AXON_GUIDANCE

• MODULE_19

• HELLER_HDAC_TARGETS_SILENCED_BY_METHYLATION_UP

• GO_ACTIN_BINDING

• GSE3982_EOSINOPHIL_VS_BASOPHIL_UP

• GSE3982_MAC_VS_TH2_UP

• V$TATA_C

• GO_REGULATION_OF_ANATOMICAL_STRUCTURE_SIZE

• MODULE_52

• SCHAEFFER_PROSTATE_DEVELOPMENT_48HR_UP

• DAVICIONI_TARGETS_OF_PAX_FOXO1_FUSIONS_UP

• DEURIG_T_CELL_PROLYMPHOCYTIC_LEUKEMIA_UP

• GSE21063_CTRL_VS_ANTI_IGM_STIM_BCELL_NFATC1_KO_16H_UP

• KAECH_NAIVE_VS_MEMORY_CD8_TCELL_DN

• SANSOM_APC_TARGETS_DN

• GO_SINGLE_ORGANISM_CELL_ADHESION

• HOLLMANN_APOPTOSIS_VIA_CD40_DN

• GSE22025_TGFB1_VS_TGFB1_AND_PROGESTERONE_TREATED_CD4_TCELL_DN

• GO_CELL_SUBSTRATE_JUNCTION

• GSE3982_NEUTROPHIL_VS_EFF_MEMORY_CD4_TCELL_UP

• HIRSCH_CELLULAR_TRANSFORMATION_SIGNATURE_UP

• GSE21927_SPLENIC_C26GM_TUMOROUS_VS_BONE_MARROW_MONOCYTES_DN

• GSE3982_BASOPHIL_VS_CENT_MEMORY_CD4_TCELL_UP

• MCBRYAN_PUBERTAL_BREAST_4_5WK_UP

• GO_CELL_CELL_JUNCTION

• GSE13411_NAIVE_BCELL_VS_PLASMA_CELL_UP

• GO_AXON

• GO_REGULATION_OF_INTRACELLULAR_PROTEIN_TRANSPORT

• GO_TELENCEPHALON_DEVELOPMENT

• GSE13484_UNSTIM_VS_12H_YF17D_VACCINE_STIM_PBMC_DN

• LEF1_UP.V1_DN

• CASORELLI_ACUTE_PROMYELOCYTIC_LEUKEMIA_UP

• GO_ACTIVATION_OF_IMMUNE_RESPONSE

• GO_EPITHELIAL_CELL_DIFFERENTIATION

• GO_POSITIVE_REGULATION_OF_CELL_ADHESION

• GSE15735_2H_VS_12H_HDAC_INHIBITOR_TREATED_CD4_TCELL_UP

• MODULE_8

• BLALOCK_ALZHEIMERS_DISEASE_INCIPIENT_UP

• GO_DENDRITE

• GSE3982_CENT_MEMORY_CD4_TCELL_VS_TH2_UP

• KIM_WT1_TARGETS_UP

• GO_REGULATION_OF_NEURON_PROJECTION_DEVELOPMENT


Graph - (1)

• Build interaction graph between pathway and miRNA communities

• We first compute interactions between pathways
  • Interaction Score matrix

• We then add miRNAs, connecting them to pathways
  • Correlation matrix between miRNAs and genes
  • Fisher's exact test

• We add edges between miRNAs
  • Weighted network projection
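The last step above, adding miRNA-miRNA edges by projecting the miRNA-pathway graph, can be sketched as below. The slides do not say how the projected edges are weighted; this sketch assumes the simplest choice, the number of pathways two miRNAs share. `project_onto_mirnas` and the toy node names are hypothetical.

```python
from itertools import combinations

def project_onto_mirnas(mirna_to_pathways):
    """Project a bipartite miRNA-pathway graph onto the miRNA side.

    mirna_to_pathways: dict mapping miRNA -> collection of linked pathways.
    Returns dict (miRNA_a, miRNA_b) -> number of shared pathways,
    omitting pairs that share none. Weighting scheme is an assumption.
    """
    weights = {}
    for a, b in combinations(sorted(mirna_to_pathways), 2):
        shared = len(set(mirna_to_pathways[a]) & set(mirna_to_pathways[b]))
        if shared:
            weights[(a, b)] = shared
    return weights

# Toy graph, not real data.
toy_links = {
    "mirna_1": {"P1", "P2"},
    "mirna_2": {"P2", "P3"},
    "mirna_3": {"P4"},
}
mirna_edges = project_onto_mirnas(toy_links)
print(mirna_edges)  # {('mirna_1', 'mirna_2'): 1}
```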

Graph - (2)

• Group miRNAs in communities
  • Walktrap algorithm

• We replace miRNAs with nodes representing miRNA communities

• We finally identify communities in the whole interaction graph
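Replacing miRNAs with community nodes can be sketched as a node contraction. The community membership is assumed to have been computed already (for example by a Walktrap implementation); this sketch does not implement Walktrap. Summing the weights of merged edges is an assumption, and `collapse_mirnas` and the node labels are hypothetical.

```python
def collapse_mirnas(edges, community_of):
    """Replace each miRNA by its community node in miRNA-pathway edges.

    edges: iterable of (mirna, pathway, weight) tuples.
    community_of: dict mapping miRNA -> community id.
    Returns dict (community_node, pathway) -> summed weight.
    """
    collapsed = {}
    for mirna, pathway, weight in edges:
        node = f"miRNA_community_{community_of[mirna]}"
        collapsed[(node, pathway)] = collapsed.get((node, pathway), 0.0) + weight
    return collapsed

# Toy input, not real data.
toy_edges = [("m1", "P1", 1.0), ("m2", "P1", 2.0), ("m3", "P2", 1.0)]
toy_membership = {"m1": 0, "m2": 0, "m3": 1}
community_edges = collapse_mirnas(toy_edges, toy_membership)
print(community_edges)
```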

Interaction Score

Relations among pathways: interaction score (IS)

$$IS = \frac{|M_x - M_y|}{S_x + S_y}$$

M and S are, respectively, the mean and standard deviation of the two pathways x and y.

We apply a cutoff on the resulting interaction matrix.
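As a worked instance of the score, the sketch below computes IS = |M_x - M_y| / (S_x + S_y) for every pair of pathways. Which per-pathway values are summarised by M and S is not specified on the slide, so the inputs are just lists of numbers; the use of the sample standard deviation is an assumption, and the function names are hypothetical. The cutoff is left out because neither its value nor its direction is given.

```python
from itertools import combinations
from statistics import mean, stdev

def interaction_score(values_x, values_y):
    """IS = |M_x - M_y| / (S_x + S_y), with sample standard deviations."""
    return abs(mean(values_x) - mean(values_y)) / (stdev(values_x) + stdev(values_y))

def interaction_matrix(values_by_pathway):
    """Return dict (pathway_a, pathway_b) -> IS for every unordered pair."""
    return {
        (a, b): interaction_score(values_by_pathway[a], values_by_pathway[b])
        for a, b in combinations(sorted(values_by_pathway), 2)
    }

# Toy values: means 2 and 5, standard deviations 1 and 1 -> IS = 3 / 2.
toy_values = {"P1": [1.0, 2.0, 3.0], "P2": [4.0, 5.0, 6.0]}
scores = interaction_matrix(toy_values)
print(scores)  # {('P1', 'P2'): 1.5}
```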

miRNA and Pathways interaction

• We evaluate the Pearson correlation between the miRNA and all the genes in the pathway. We then apply a cutoff to select strong correlations.

• Then, for each miRNA and pathway, we use Fisher's exact test to determine whether the miRNA is significantly linked to the pathway (i.e. we check whether there is a significant number of genes in common)
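The two steps above can be sketched with the standard library alone. The cutoff value (0.8), the use of the absolute correlation, and the one-sided form of Fisher's exact test are assumptions made for illustration; the slides give none of them. All function and variable names are hypothetical.

```python
from math import comb, sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def correlated_genes(mirna_expr, gene_expr, cutoff=0.8):
    """Genes whose |Pearson r| with the miRNA reaches the cutoff (assumed rule)."""
    return {g for g, expr in gene_expr.items() if abs(pearson(mirna_expr, expr)) >= cutoff}

def fisher_exact_greater(overlap, n_correlated, n_pathway, n_total):
    """One-sided Fisher p-value: P(at least `overlap` shared genes)."""
    upper = min(n_correlated, n_pathway)
    favourable = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_correlated - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(n_total, n_correlated)

# Toy data, not real expression values.
mirna = [1.0, 2.0, 3.0, 4.0]
genes = {"g1": [2.0, 4.0, 6.0, 8.0], "g2": [8.0, 6.0, 4.0, 2.0], "g3": [1.0, 3.0, 2.0, 2.0]}
strong = correlated_genes(mirna, genes)
# 20 genes in total, 5 in the pathway, 5 correlated with the miRNA, 4 in common.
p_value = fisher_exact_greater(overlap=4, n_correlated=5, n_pathway=5, n_total=20)
print(sorted(strong), p_value)
```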

The Goal

The Goal (2)

Final classification

[Figure: two stacking schemes. (1) Patients × genes of each pathway → one linear SVM per pathway → class probabilities per patient → stacking with pathways (SVM) → classes for each patient. (2) Patients × miRNAs and connected pathways → one linear SVM each → class probabilities per patient → stacking with pathways and miRNAs (SVM) → classes for each patient.]
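The stacking input can be sketched as follows: the class probabilities produced by each first-level SVM are concatenated into one feature row per patient, which a second-level SVM would then classify. This sketch only builds the stacked feature matrix; the second-level classifier is not implemented, and `stack_features`, the model names and the probabilities are hypothetical.

```python
def stack_features(prob_by_model):
    """Concatenate per-model class probabilities into one row per patient.

    prob_by_model: dict mapping model name -> list with one entry per
    patient, each entry being that model's class-probability list.
    Returns (model_names, rows); columns follow sorted model-name order.
    """
    names = sorted(prob_by_model)
    n_patients = len(prob_by_model[names[0]])
    rows = []
    for i in range(n_patients):
        row = []
        for name in names:
            row.extend(prob_by_model[name][i])
        rows.append(row)
    return names, rows

# Toy probabilities for 2 patients, 4 classes, 2 first-level models.
toy_probs = {
    "pathway_svm_1": [[0.7, 0.1, 0.1, 0.1], [0.2, 0.5, 0.2, 0.1]],
    "pathway_svm_2": [[0.6, 0.2, 0.1, 0.1], [0.1, 0.6, 0.2, 0.1]],
}
model_names, stacked = stack_features(toy_probs)
print(model_names, len(stacked), len(stacked[0]))
```

In the second scheme of the figure, the miRNA-based models would simply contribute additional columns to each row.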

The End
