
DOI: 10.1161/CIRCGENETICS.113.000384


Identification of Genetic Markers for Treatment Success in Heart Failure

Patients: Insight from Cardiac Resynchronization Therapy

Running title: Schmitz et al.; Genetics in HF treatment success

Boris Schmitz, PhD1,2; Renata DeMaria, MD3; Dimitris Gatsios, BSc4; Theodora

Chrysanthakopoulou, BSc, MSc5; Maurizio Landolina, MD6; Maurizio Gasparini, MD7;

Jonica Campolo, MSc3; Marina Parolini, BStat3; Antonio Sanzo, MD6; Paola Galimberti, MD7;

Michele Bianchi, MD8; Malte Lenders, PhD2; Eva Brand, MD, PhD2; Oberdan Parodi, MD3;

Maurizio Lunati, MD8; Stefan-Martin Brand, MD, PhD1

1Institute of Sports Medicine, Molecular Genetics of Cardiovascular Disease, 2Internal Medicine D, Department of Nephrology, Hypertension and Rheumatology, University Hospital Münster, Münster, Germany; 3CNR Institute of Clinical Physiology, Cardiothoracic and Vascular Department, Niguarda

Ca’ Granda Hospital, Milan, Italy; 4University of Ioannina, Ioannina University Campus; 5Neuron Energy Solutions G.P., Science & Technology Park of Epirus, Ioannina, Greece; 6Department of

Cardiology, Fondazione IRCCS Policlinico San Matteo, Pavia; 7Department of Cardiology, Humanitas Research Hospital IRCCS, Rozzano-Milan; 8Cardiothoracic and Vascular Department,

Niguarda Ca’ Granda Hospital, Milan, Italy

Correspondence:

Dr. rer. nat. Boris Schmitz

University Hospital Münster

Institute of Sports Medicine

Molecular Genetics of Cardiovascular Disease

Horstmarer Landweg 39

48149 Münster, Germany

Tel: +49/251/83-52996

Fax: +49/251/83-35387

E-mail: [email protected]

Journal Subject codes: [11] Other heart failure, [33] Other diagnostic testing, [27] Other treatment


Downloaded from http://circgenetics.ahajournals.org/ by guest on May 17, 2018


Abstract:

Background – Cardiac resynchronization therapy (CRT) can improve ventricular size, shape and

mass and reduce mitral regurgitation by reverse remodelling of the failing ventricle. About 30%

of patients do not respond to this therapy for unknown reasons. In this study, we aimed to

identify and classify CRT responders using genetic variants and clinical

parameters.

Methods and Results – Out of 1,421 CRT patients, 207 subjects were consecutively selected and

CRT responders and non-responders were matched for their baseline parameters before CRT.

Treatment success of CRT was defined as a decrease in left ventricular end systolic volume

(LVESV) >15% at follow-up echocardiography compared to LVESV at baseline. All other

changes classified the patient as a CRT non-responder. A genetic association study was performed,

which identified 4 genetic variants associated with the CRT responder phenotype at the

allelic (p<0.035) and genotypic (p<0.031) level: rs3766031 (ATP1B1), rs5443 (GNB3), rs5522

(NR3C2) and rs7325635 (TNFSF11). Machine learning algorithms were used for the

classification of CRT patients into responder and non-responder status, including combinations

of the identified genetic variants and clinical parameters.

Conclusions – We demonstrated that rule induction algorithms can successfully be applied for

the classification of heart failure patients into CRT responder and non-responder status using

clinical and genetic parameters. Our analysis included information on alleles and genotypes of 4

genetic loci, rs3766031 (ATP1B1), rs5443 (GNB3), rs5522 (NR3C2) and rs7325635 (TNFSF11),

pathophysiologically associated with remodelling of the failing ventricle.

Key words: heart failure, cardiovascular disease, risk factor, resynchronization, reverse remodeling, data mining, machine learning


Introduction

The concept of individually optimized therapy, often referred to as personalized medicine, is

rapidly advancing in the field of modern health care,1 in particular for common diseases.

Personalized medicine is expected to improve the treatment of cardiovascular disease (CVD),

including prognosis of treatment outcomes.2 As a novel integrative approach, personalized

medicine in treatment of CVD will have to collect and selectively evaluate a patient’s unique

clinical and anthropometric parameters as well as information on genetic predisposition. It is

well known that CVD is a highly heritable trait,3 with individual combinations of multiple

genetic variants accounting for different CVD phenotypes4 in combination with classic risk

factors. Classic risk factors alone explain a large proportion (>50%) of CVD risk, while an

estimated 15% to 20% of myocardial infarction (MI) patients have none of the traditional risk

factors.5,6 Increased knowledge of the molecular mechanisms involved as well as insight into the

additive and interactive effects of multiple genetic variants and environmental factors have been

postulated as the foundation for novel therapeutic strategies.7 Even at the current state of

knowledge, genetic information allows clinicians to stratify individuals who are at intermediate

risk by generation of clinically useful treatment recommendations if interpreted correctly.7

We have most recently developed a data mining approach including rule-based machine

learning algorithms for the classification of CVD patients and the extraction of potential risk

predictors including genetic variants.8 In the current study, we have applied this methodology on

top of a genetic association study to extract potential combinations of genetic variants and

clinical parameters as markers for treatment success in patients with chronic systolic heart failure

(HF) treated by cardiac resynchronization therapy (CRT).

CRT combines right atrial and ventricular pacing with pacing of the left ventricular (LV)


free wall by a third lead, introduced through the coronary sinus in the great cardiac vein to

resynchronize contraction between and within ventricles. CRT has been shown to ameliorate

ventricular size, shape and pump function, and reduce mitral regurgitation by reverse

remodelling (RR) of dilated failing ventricles and to improve survival in patients with moderate

to severe HF and intraventricular conduction delay.9 However, it is estimated that over one third

of patients do not respond to this therapy.10

Many criteria to define a positive response to CRT have been used with a wide variability

between studies.11,12 Proposed measures include (1) primary clinical end points such as mortality

due to progressive pump dysfunction or CV events and cardiac transplantation; (2) secondary

clinical end points such as re-hospitalization for worsening HF, and (3) subjective or objective

changes in functional capacity expressed as improved New York Heart Association (NYHA)

class or the increase in the distance walked in 6 minutes, respectively, 3 to 6 months after CRT

implantation. Echocardiographic criteria include changes observed 3 to 6 months after the

procedure in left ventricular ejection fraction (LVEF) or left ventricular end-systolic (LVESV) or

end-diastolic (LVEDV) volume, using different cut-off values. RR has been shown to start early

after CRT, to peak between 6 and 12 months and to be sustained in the long term, up to 5 years,

with only little further improvement.13,14

Agreement between clinical and echocardiographic criteria has been shown to be modest

at best.11 In general, the rate of response using clinical criteria is higher compared to the rate of

response when remodelling markers are considered but clinical measures of response are poorly

correlated to long-term prognosis. Conversely, death from CV causes or progressive pump

failure has been shown to be dependent on RR, and changes in LVESV are acknowledged as a

reliable surrogate end point. RR and CV mortality appear to correlate in the medium-term and


the relationship is sustained up to 5 years.15-17

Predictors of CRT success have been extensively investigated and include female gender,

non-ischemic etiology of HF, symptom severity, myocardial scar burden, QRS morphology and

duration and technical factors such as LV lead placement or proportion of time paced.10,18-21

Whether genetic variants associated with CVD may be differentially associated with CRT

success has been hitherto poorly investigated.

Our approach aimed at more specifically classifying CRT responders by inclusion of

predictive genetic markers within a study of CRT patients recently published by our group.22

Methods

Study design and patient selection

The CRT study for identification of predictive genetic markers was a retrospective multicenter

case-control study conducted at 3 Italian centers.22 The study was approved by the institutional

ethics committees of the participating centers and patients expressed their written informed

consent to participate. The study included HF patients who had undergone CRT to correct

mechanical dyssynchrony represented by a sequence abnormality in atrio-ventricular, or inter- or

intra-ventricular contraction according to guideline indications: any etiology of HF, NYHA

classification II - IV, a QRS duration on surface electrocardiogram

and LV end-diastolic diameter >55 mm.23 Assessment of scar burden was performed prior to

patient selection for the procedure; patients with extensive scar burden were excluded from CRT.

Further study entry criteria were stable positioning of the left lead at the lateral or postero-lateral

wall level and proportion of time paced >97%. In patients with atrial fibrillation, atrioventricular

(AV) node ablation was performed to achieve this percent pacing target and AV delay was

optimized under echocardiographic guidance immediately post implant.


Out of 1,421 patients (18% deceased), implanted with CRT since 2002, the study enrolled

207 consenting subjects who had undergone the procedure at least 6 to 12 months earlier, had a

valid echocardiographic study to define the remodelling status at 6 to 12 months (median 9

months), and were consecutively reviewed in the electrophysiology outpatient clinic for routine

follow-up between March and December 2009 (figure 1).

Definition of treatment success

CRT treatment success, designated as reverse remodelling (RR+), was defined as a

decrease in LVESV of >15% (i.e. a reduction in LV size) at follow-up compared to LVESV at

baseline determined by echocardiography. All other changes classified the patient as a CRT non-responder

(RR-). For each RR+ patient, an RR- subject was enrolled, matched by gender, age,

NYHA functional class, HF etiology and baseline LVEF.
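
The response definition above can be written as a short rule. The following is a minimal sketch, assuming LVESV values in ml; the function names and the example volumes are hypothetical and are not taken from the study.

```python
def lvesv_change_percent(baseline_ml: float, followup_ml: float) -> float:
    """Relative LVESV change from baseline to follow-up in percent (negative = decrease)."""
    if baseline_ml <= 0:
        raise ValueError("baseline LVESV must be positive")
    return (followup_ml - baseline_ml) / baseline_ml * 100.0


def classify_remodelling(baseline_ml: float, followup_ml: float,
                         threshold_percent: float = 15.0) -> str:
    """Return 'RR+' if LVESV decreased by more than the threshold, otherwise 'RR-'."""
    decrease = -lvesv_change_percent(baseline_ml, followup_ml)
    # Any change that is not a decrease above the threshold counts as non-response,
    # including smaller decreases, no change, and increases.
    return "RR+" if decrease > threshold_percent else "RR-"


# Hypothetical patients (volumes are illustrative only).
print(classify_remodelling(200.0, 160.0))  # 20% decrease
print(classify_remodelling(200.0, 180.0))  # 10% decrease
print(classify_remodelling(200.0, 210.0))  # 5% increase
```

The strict inequality mirrors the ">15%" wording of the definition.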

Echocardiography

LVESV was measured by transthoracic echocardiography examinations at rest using

conventional methods with commercially available ultrasound devices (Sonos 7500 and IE33,

Philips Medical Systems, Andover, USA; Sequoia C256 Acuson, Siemens, Mountain View,

USA; Famiglia Mylab25, Esaote, Genoa, Italy; Vivid System 7, GE/Vingmed, Milwaukee, USA)

equipped with a 2.5 - 3.5 MHz-phased-array sector scan probe. Parameters were obtained by 2-

and 4-chamber view using the biplane discs' summation method (Simpson's rule).24

Genotyping

Patients’ blood was sampled during a follow-up outpatient visit. Genomic DNA was extracted at

the University Hospital of Münster. Genotyping was performed, blinded to patients’ remodelling

status, using TaqMan SNP genotyping assays on the real-time PCR System ABI7900 (Life

Technologies Corporation, Carlsbad, USA) in a 384 well format. For detailed PCR conditions


see supplemental information. Replicate samples and samples without template were used as

controls. Genotyping call rates were >95%. Hardy-Weinberg equilibrium was tested by

calculating the expected genotype frequencies from the allele frequencies. Deviation from the

observed genotype frequencies was determined by chi-square test. Genotype distributions of the

6 analyzed genes were compatible with Hardy-Weinberg equilibrium, except for rs5723 within

SCNN1G.
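
The Hardy-Weinberg check described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it assumes a biallelic variant, a chi-square test with 1 degree of freedom, and hypothetical genotype counts. The function name is likewise hypothetical.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test of observed genotype counts against Hardy-Weinberg expectations."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype must be observed")
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A from genotype counts
    q = 1.0 - p                              # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: the first set is in exact equilibrium, the second has no heterozygotes.
print(hwe_chi_square(25, 50, 25))
print(hwe_chi_square(50, 0, 50))
```

One degree of freedom is used because, with two alleles, one parameter (the allele frequency) is estimated from the three genotype classes.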

Selection of genes and genetic variants

With respect to the selection of appropriate genetic variants, we conducted a literature search

including different combinations of the terms “genetic variant”, “single nucleotide

polymorphism”, “cardiovascular disease” and “vascular remodel(l)ing”

(http://www.ncbi.nlm.nih.gov/pubmed; last date of access 28.02.2010). The main focus of the

search was on variants for which functional data was available. The results were evaluated for

appropriate and reasonable quality of the report and reproducibility. Due to the small sample

size of our study group, genetic variants with a reported minor allele frequency <10% in

Caucasian populations were not included. Data on gene regulation from our own lab were also

taken into account.25 The final set of genetic variants tested included the common GNB3

(guanine nucleotide-binding beta polypepti

26 enhanced activity of

atrial inward rectifier potassium currents27 and increased response to vasoactive hormones.28

ATP1B1 encodes the Na+/K+-ATPase β1-subunit, an oligomeric membrane-bound protein

essential for maintenance of the myocardial resting membrane potential.29 Total Na+/K+-ATPase

concentration has been reported to be decreased by 40% in endomyocardial biopsies from

patients with compromised cardiac function.30 The ATP1B1 locus has repeatedly been associated


with CVD.31 TNFSF11 encodes the osteoprotegerin ligand (OPGL; receptor activator of nuclear

factor κ-B ligand, RANKL). Enhanced myocardial expression of the OPG/RANKL/RANK axis

has been reported to contribute to LV remodelling32 while circulating OPG levels have been

suggested as independent predictors for CV mortality.33,34 The analysis also included NR3C2

rs5522, which has been shown previously to be associated with successful CRT.23 In addition,

genetic variants of the epithelial sodium channel (ENaC) alpha/gamma (SCNN1A [rs3759324],

SCNN1G [rs5723]) have been tested since ENaC has been suggested as a mediator of aldosterone

in the vascular endothelium.35

Statistical analysis

Variables are presented as number (frequency percent) or median [interquartile range]. Chi-

square test for categorical variables and Student’s t-test or Mann-Whitney test for continuous

variables were used to compare the baseline characteristics of both groups. Relative allele and

genotype frequencies were compared by chi-square test (Fisher’s exact test, where appropriate).

Recessive/dominant associations were tested by comparing allele and genotype frequencies

between RR- and RR+ groups using contingency table and chi-square or Fisher’s exact test.

Given the group sample sizes of 80 RR+ and 76 RR- patients, the power to detect differences in

allele frequencies of 0.16 for an allele of 34% frequency exceeded 80%. P-values <0.05 were

considered statistically significant. To correct for multiple comparisons, we used the Benjamini

and Yekutieli36 false discovery rate method,

using the formula $p = \alpha / \sum_{i=1}^{N} (1/i)$, where α = 0.05, i ranges from 1 to N and N represents the number

of comparisons including clinical and genetic variables (N=20). The associations between RR+

and genetic variants were assessed by multivariable logistic regression, after adjustment for

clinically-relevant potential confounders. The incremental predictive performance for RR+ of the


predicted probability risk was determined by C statistic for 1) clinical variables and 2) the

combination of clinical variables and genetic variants. The areas under the Receiver Operating

Characteristic (ROC) curve (AUC) with their 95% confidence interval were determined and

compared by the method of DeLong et al.37 The Statistical Package for the Social Sciences

(SPSS) v 17 was used.
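
As a worked illustration of the multiple-comparison correction described in this section, the sketch below evaluates the stated formula, read here as α divided by the sum of 1/i for i = 1 to N. That reading, and the function name, are assumptions. The sketch computes only this single threshold and does not implement a full step-up false discovery rate procedure.

```python
def benjamini_yekutieli_threshold(alpha: float, n_tests: int) -> float:
    """Significance threshold alpha / sum_{i=1..N}(1/i) for N comparisons."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    harmonic_sum = sum(1.0 / i for i in range(1, n_tests + 1))
    return alpha / harmonic_sum


# alpha = 0.05 and N = 20 comparisons, as stated in the text.
threshold = benjamini_yekutieli_threshold(0.05, 20)
print(round(threshold, 4))  # about 0.0139
```

Under this reading, the threshold lies between the uncorrected 0.05 and the Bonferroni value of 0.05/20 = 0.0025.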

Data mining and machine learning

Patients were grouped in the two categories RR+ and RR-. After data adjustment for

simultaneous analysis of heterogeneous datasets, 5 independent classifiers (supplemental table 1)

including either patients’ clinical (n = 207; RR+ = 107, RR- = 100; supplemental table 2) or

genetic information (n = 156) or a combination of both (n = 156) were subjected to a set

of 15 machine learning algorithms (supplemental methods; supplemental table 3). For each

classifier, we used the 10-fold cross-validation approach to evaluate the general accuracy of the

algorithm. Data were randomly partitioned into ten separate sets and each algorithm was

provided with nine of the sets as training data, while the remaining set was used as test cases.

The process was repeated ten times using the different possible test sets. The resulting accuracies

were averaged. For the Decision Table and Voting Feature Intervals algorithms, the leave-one-out

cross-validation method was used. For this method, the dataset containing N observations is

split into two subsets: one containing N-1 observations, which is used as the training set, and one

containing 1 observation, which is used for validation. The process is repeated in all possible

ways until all observations have been used for validation. Random Forest, C4.5, PART, Decision

Table, Bayes Network and Multilayer Perceptron, which proved to be the most reliable (i.e. not

overtrained) and accurate algorithms after the initial testing, were further analyzed; they were

applied several times, with different values for the parameters to identify the most efficient


configuration in terms of specificity, sensitivity and accuracy for the detection of RR+ and RR-

individuals (supplemental tables 4 - 8).
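
The two validation schemes described above, 10-fold and leave-one-out cross-validation, can be sketched in plain Python. This is a minimal illustration, not the toolkit used in the study: all function names are hypothetical, and the majority-label classifier is only a stand-in for the algorithms evaluated in the paper.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # random partition, reproducible via seed
    folds = [indices[i::k] for i in range(k)]   # k disjoint folds of near-equal size
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def leave_one_out_splits(n_samples):
    """Yield N splits, each with N-1 training observations and 1 validation observation."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]


def cross_validated_accuracy(features, labels, splits, fit, predict):
    """Average the per-split accuracies of a classifier given as fit/predict callables."""
    accuracies = []
    for train, test in splits:
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, features[i]) == labels[i] for i in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / len(accuracies)


# Hypothetical stand-in classifier: always predicts the most frequent training label.
def majority_fit(train_features, train_labels):
    return max(sorted(set(train_labels)), key=train_labels.count)


def majority_predict(model, feature_row):
    return model


# Toy data: 100 samples labelled "A" and 20 labelled "B" (illustrative only).
toy_labels = ["A"] * 100 + ["B"] * 20
toy_features = [[0]] * 120
print(cross_validated_accuracy(toy_features, toy_labels,
                               k_fold_splits(120, k=10), majority_fit, majority_predict))
```

Passing `leave_one_out_splits(len(labels))` instead of `k_fold_splits(...)` gives the leave-one-out variant with the same evaluation loop.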

PART is a blend of C4.538 and RIPPER39. Both methods adopt a two-stage approach: a

set of rules is produced and subsequently refined by omission (C4.5) or adjustment (RIPPER).

Like C4.5, PART generates rules from decision trees, and like RIPPER it uses the ‘divide and conquer’ rule

learning method, inferring rules by repetitive generation of partial decision

trees. Initially, a rule is produced, then the covered instances are removed and PART continues

building rules recursively for the residual instances until none is left. As the name suggests,

PART generates partial decision trees with branches to undefined sub-trees instead of fully

explored trees, integrating building and pruning stages to identify a stable sub-tree that cannot be

further cut down. When this sub-tree has been created, tree building ceases and a single rule is

produced. For missing values, PART adopts the approach of C4.5: in case an instance cannot be

assigned deterministically to a branch because of a missing attribute value, it is assigned to each

of the branches with a weight proportional to the number of training instances going down that

branch, normalized by the total number of training instances with known values at the node.
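
The missing-value handling described above amounts to splitting an instance fractionally across the branches of a node. The sketch below illustrates that weighting only; the function name, the genotype labels and the counts are hypothetical, and this is not the implementation used in the study.

```python
def missing_value_weights(branch_counts, instance_weight=1.0):
    """Distribute an instance with a missing attribute value across all branches.

    branch_counts maps each branch to the number of training instances with a
    known attribute value that go down that branch at this node.
    """
    total_known = sum(branch_counts.values())
    if total_known == 0:
        raise ValueError("no training instances with known values at this node")
    # Each branch receives a share proportional to its training count,
    # normalized by the number of instances with known values at the node.
    return {branch: instance_weight * count / total_known
            for branch, count in branch_counts.items()}


# Hypothetical node splitting on a genotype with three branches.
weights = missing_value_weights({"CC": 30, "CT": 50, "TT": 20})
print(weights)
```

The weights sum to the incoming instance weight, so an instance that meets several missing attributes further down the tree is divided into progressively smaller fractions.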

Results

Study population

The characteristics of the HF patient study population before CRT are shown in table 1. All

patients suffered from severe pump dysfunction and advanced symptoms. The RR status with

CRT at follow-up was determined after a median of 9 [7-12] months. No significant differences

existed in the clinical variables used for matching (age, atrial fibrillation, NYHA class, LVEF

and LVESV) between the patients analyzed and those not included in the study (figure 1). The RR-

and RR+ groups included 76 and 80 patients, respectively. Consistent with clinical matching,


baseline parameters and medication were similar between groups, except for a slightly higher

prevalence of type 2 diabetes mellitus (p=0.057). Significant differences, resulting from the

defined remodelling phenotypes, were found between RR+ and RR- subjects for volume

(p<0.001) and function (p<0.001) changes (figure 2). In RR+ patients, LVEDV decreased by 22

ml [-37 to -16 ml] and LVEF improved by 11% [6 to 16%] to a clinically relevant extent,

whereas changes in LV volume (ΔLVEDV 2 ml [-4 to +10 ml]) and LVEF (ΔLVEF 2.5% [-2 to

+5%]) were slight in RR- patients.

Genetic association study

Information on genetic variants was available for 156 CRT study participants. Out of 6

previously established genetic variants that had been associated with CVD phenotypes, 4 were

associated with the RR+ phenotype (table 2) at the allelic (p<0.035) and genotypic (p<0.031)

level: rs3766031 (ATP1B1), rs5443 (GNB3), rs5522 (NR3C2) and rs7325635 (TNFSF11).

Identified associations remained significant after correction for multiple testing by the Benjamini

and Yekutieli false discovery rate method36 for rs3766031 (ATP1B1), rs5443 (GNB3)

and rs5522 (NR3C2).

By multivariable logistic regression analysis after adjustment for age, gender, LVEF,

atrial fibrillation, NYHA class, type 2 diabetes mellitus, baseline LVEDV and etiology of HF,

GNB3, ATP1B1 and NR3C2 remained independently associated with RR+ (table 3), whereas

TNFSF11 was of borderline significance (p=0.051). Minor allele carriage appeared to be

significantly associated with CRT success for both GNB3 rs5443 (OR 3.155 [95% CI 1.434 –

6.941], p=0.004) and ATP1B1 rs3766031 (OR 2.853 [95% CI 1.149 – 7.084], p=0.024). By

contrast, minor allele carriers of NR3C2 rs5522 showed a lower chance of RR+ (OR 0.320 [95%

CI 0.120 – 0.851], p=0.022) than major allele carriers. Female gender (OR 3.855 [95% CI 1.010


– 14.721], p=0.048), type 2 diabetes mellitus (OR 0.227 [95% CI 0.078 – 0.660], p<0.006) and

valvular heart disease (OR 0.109 [95% CI 0.018 – 0.675], p<0.017) were also independently

associated with the RR phenotype. The concordance index, a measure of model fit, was 76.6%.

The C statistic (figure 3) documented the incremental predictive value of the model combining

clinical and genetic information (AUC 0.794 [95% CI 0.720 – 0.855]) vs. the clinical model

(AUC 0.678 [95% CI 0.597 – 0.751]), p=0.002.
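As a plausibility check on the regression results, the two-sided p-value implied by an odds ratio and its 95% confidence interval can be recovered on the log scale. The sketch below assumes Wald-type intervals (symmetric around log(OR)); the paper does not state how the intervals were constructed, and the helper name `wald_p_from_ci` is hypothetical. The three input triples are the odds ratios and intervals reported above.

```python
import math


def wald_p_from_ci(odds_ratio: float, lower: float, upper: float) -> float:
    """Two-sided Wald p-value implied by an odds ratio and its 95% CI."""
    z_975 = 1.959964  # 97.5th percentile of the standard normal distribution
    # On the log scale a Wald CI is log(OR) +/- z_975 * SE, so SE follows from its width.
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_975)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


reported = {
    "GNB3 rs5443": (3.155, 1.434, 6.941),
    "ATP1B1 rs3766031": (2.853, 1.149, 7.084),
    "NR3C2 rs5522": (0.320, 0.120, 0.851),
}
for name, (or_value, ci_low, ci_high) in reported.items():
    print(f"{name}: implied p = {wald_p_from_ci(or_value, ci_low, ci_high):.3f}")
```

Under this assumption the implied values agree with the reported p-values to within rounding of the published figures.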

Data mining and machine learning

When comparing specificity, sensitivity and accuracy of the different algorithms applied within

each classifier, we observed that some algorithms performed generally better than others (table

4). Approximation of 100% accuracy (based on the 10-fold cross-validation method) as detected

for K Nearest Neighbors, Non Nested Generalised Exemplars and Random Tree in some

classifiers indicated artificial overtraining of the applied method. Within the classifiers “Clinical

& Genotypes” and “Clinical & Alleles” the rule-based methods C4.5 and PART performed well,

exceeding 82.5% accuracy. Since rule-based methods produce lower complexity classification

results with higher transparency, which may be used to generate expert consensus in a modified

Delphi method40, we identified the PART algorithm41 as appropriate for the generation of

efficient and interpretable rules (table 5) in this series.
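The link between near-perfect accuracy and overtraining can be illustrated with a small sketch. It does not reproduce the evaluation used in the study: the data are random and hypothetical, the labels are unrelated to the features, and all function names are invented for the example. A one-nearest-neighbour classifier scores 100% when tested on its own training records, whereas held-out folds in a 10-fold cross-validation give a more honest estimate.

```python
import random
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def predict_1nn(train_x: Sequence[Point], train_y: Sequence[int], x: Point) -> int:
    """Label of the closest training point (squared Euclidean distance)."""
    best = min(
        range(len(train_x)),
        key=lambda i: (train_x[i][0] - x[0]) ** 2 + (train_x[i][1] - x[1]) ** 2,
    )
    return train_y[best]


def resubstitution_accuracy(xs: Sequence[Point], ys: Sequence[int]) -> float:
    """Accuracy when the classifier is tested on its own training records."""
    hits = sum(predict_1nn(xs, ys, x) == y for x, y in zip(xs, ys))
    return hits / len(xs)


def cross_validated_accuracy(xs: Sequence[Point], ys: Sequence[int],
                             k: int = 10, seed: int = 0) -> float:
    """Accuracy when every record is predicted from the other k-1 folds only."""
    hits = 0
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        hits += sum(predict_1nn(train_x, train_y, xs[i]) == ys[i] for i in fold)
    return hits / len(xs)


# Hypothetical data: two random features, labels independent of the features.
rng = random.Random(1)
features = [(rng.random(), rng.random()) for _ in range(100)]
labels = [rng.randint(0, 1) for _ in range(100)]
print("resubstitution accuracy:", resubstitution_accuracy(features, labels))
print("10-fold CV accuracy:", cross_validated_accuracy(features, labels))
```

An accuracy that stays close to 100% even on held-out folds of a small data set is therefore better read as a warning sign than as proof of a good model.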

Discussion

In the current study, we demonstrated that machine learning algorithms can successfully be

applied for the classification of HF patients treated with CRT into responders and non-

responders using clinical and genetic parameters to model prediction of RR. Our analysis

included information on alleles and genotypes newly associated with the CRT responder

phenotype.


Markers and determinants of CRT response

Predicting whether a patient will benefit from CRT has long been an issue of interest and

surrogate end points of response at mid-term follow-up have been used repeatedly.12,13,18 The

correlation between primary clinical measures of response, such as cardiac death, and

symptomatic improvement has been observed to be poor, whereas RR after CRT strongly

correlates with clinical outcome.16-18 Consistently, as a marker of CRT success, we used

echocardiographic RR after a median follow-up of 9 months, a time interval coincident with

peak changes in trials with repeated echocardiographic assessments.13

We selected a well-balanced data set of RR+ and RR- patients matched for known

clinical parameters that have been associated with a different incidence of RR after CRT such as

ischemic etiology of HF, lower LVEF, atrial fibrillation, shorter QRS duration and female

gender.18-21 As extensive myocardial scarring and procedural factors, including LV lead position

and percent pacing, are important technical determinants of CRT success,17,20,42,43 a limited scar

burden and technical success were prerequisites for enrolment in the study. Furthermore,

post-implant AV delay optimization, which also affects response,44 was routinely performed.

However, we observed significant differences in patients’ outcome, potentially based on

unknown interactions of clinical parameters such as type 2 diabetes mellitus45 and undetected

genetic predispositions.

The data-mining approach

Machine learning algorithms have already been used to model the pathobiology of complex

CVD such as IHD,46 based on the combination of classic risk factors and genotype information.

This approach has mainly been used in large population data sets to identify subpopulations of

individuals at increased risk for the analyzed trait.46,47 A genetic profile in a disease model may


be superior to a single measurement of risk factors if the included functional variants lead to a

life-time exposure to the affected condition.48 Following the assumption that CV risk factors

have diverse and interdependent effects in individuals with a plurality of unknown parameters

and variables, we applied 15 different machine learning algorithms to datasets of HF patients

treated with CRT to discriminate RR- from RR+ individuals and included combinations of

phenotypic risk factors and genetic information. We identified the PART algorithm as

appropriate for the generation of efficient and interpretable rules in this series.

Rule deduction and patient classification using PART

Using PART, rules of lower complexity with a maximum of ten variables were generated, which

could be applied to a sufficient number of patients with adequate accuracy and transparency. The

method of rule induction generates a set of “if(combined)-then” rules that can be used to

discover interesting patterns in the data set (knowledge extraction) or, as a classification rule, to

predict the outcome of subjects. PART generated rules with up to 100% accuracy using each of

the five classifiers “Clinical”, “Genotypes”, “Alleles”, “Clinical & Genotypes” and “Clinical &

Alleles”. Although these rules were generated for computational classification of CRT patients

and may be too complex for an individual straightforward analysis, an interpretation of some

patients correctly (93.75% accuracy) as RR+, which translates into the finding that younger

female patients respond well to CRT, consistent with common clinical observations. Lack of

type 2 diabetes mellitus was not a high-accuracy classifier in our model, even when combined

with other clinical parameters (<91.7% accuracy). In combination with the allele information on

rs5443, the rule [diabetes = No AND rs5443 = T AND LVEDV > 197] exceeded 96% accuracy,

pointing towards a protective role of the GNB3 rs5443 T allele in this setting. Female gender and


the minor T allele of GNB3 rs5443 were also associated with CRT success by multivariable

logistic regression analysis.
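An "if-then" rule such as [diabetes = No AND rs5443 = T AND LVEDV > 197] can be written as a simple predicate and scored on the patients it covers. The sketch below is illustrative only: the `Patient` fields, the reading of "rs5443 = T" as carriage of the T allele, the definition of rule accuracy as the fraction of covered patients who are RR+, and all patient records are assumptions made for the example, not data or definitions taken from the study.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Patient:
    diabetes: bool    # type 2 diabetes mellitus present
    carries_t: bool   # carries the GNB3 rs5443 T allele (assumed reading of "rs5443 = T")
    lvedv_ml: float   # baseline LVEDV in ml
    responder: bool   # observed RR+ status


def rule_rr_plus(p: Patient) -> bool:
    """Rule quoted in the text: diabetes = No AND rs5443 = T AND LVEDV > 197."""
    return (not p.diabetes) and p.carries_t and p.lvedv_ml > 197


def coverage_and_accuracy(rule: Callable[[Patient], bool],
                          patients: List[Patient]) -> Tuple[int, float]:
    """Number of patients the rule fires on and the fraction of those that are RR+."""
    covered = [p for p in patients if rule(p)]
    if not covered:
        return 0, 0.0
    correct = sum(p.responder for p in covered)
    return len(covered), correct / len(covered)


# Hypothetical records, invented for illustration only.
cohort = [
    Patient(False, True, 230.0, True),
    Patient(False, True, 210.0, True),
    Patient(False, True, 250.0, False),
    Patient(True, True, 240.0, False),
    Patient(False, False, 220.0, False),
    Patient(False, True, 180.0, True),
]
n_covered, accuracy = coverage_and_accuracy(rule_rr_plus, cohort)
print(f"covered: {n_covered}, accuracy among covered: {accuracy:.2f}")
```

Reporting coverage alongside accuracy matters because a rule that fires on very few patients can reach a high accuracy by chance.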

Study limitations

The population studied in this investigation, although phenotypically well characterized, was

retrospectively enrolled; consequently, the timing of follow-up echocardiography to define RR was

not fixed but ranged from 6 to 12 months. Variability in timing of echocardiographic assessment

is widely accepted in clinical trials of CRT, where a range of 45 days around the scheduled

follow-up is generally used, and is probably unavoidable in “real world designs” such as in our

study. However, although RR is known to occur even later than the first year,13 longer follow-up

is also likely to include intercurrent events unrelated to pump failure that may halt or invert an

established favorable remodelling. Therefore, the median distance of 9 months observed in our

series represents an appropriate time point. The study was relatively small and potentially not

adequately powered to detect all genotype/phenotype interactions and not all genetic variants

potentially associated with RR status after CRT have been included in the analysis. All our

patients were Caucasians, so genetic findings might not be extendable to other races. As the

dataset was relatively small, the results obtained by multivariable logistic regression analysis

may be of limited accuracy. The current study should therefore be considered as a pilot study

that could be the basis for a larger and prospective study.

Although the sample was balanced across many clinical confounders, additional

parameters may be missing in the investigation. In particular, the groups were not matched for

QRS morphology, an important predictor of CRT response, alone and in conjunction with a QRS

17,18,49 However, less than 10% of our patients had neither LBB nor a QRS

The current models


may present some features of so-called model overtraining. This effect is mainly marked by

accuracy values approximating 100% and results from data overfitting. Testing sensitivity and

specificity in an additional and independent data set will be needed to prove broad practicability

of the model. The model might perform less accurately when used on a data set containing specific

records that were not included in the original data set.

Conclusion

Our data mining approach has identified combinations of different factors including genetic

variants with impact on HF treatment outcomes, pointing to so far unknown underlying

biological mechanisms. These findings underscore that an effective and efficient model for HF

has to be based on a multi-parameter model, including numerous known potential modifiers, to

meet the needs for the high complexity of the disease.

As any treatment of disease has certain risks and costs, there will always be treatment risk

thresholds.50 Current clinical decision-making in HF patients is based on well-established

conventional measures and treatment is recommended if the individual risk is acceptable, even if

treatment success is not fully predictable. Our study on CRT response in HF patients may help to

guide appropriate therapy and improve clinical outcomes, at least in otherwise uncertain cases

since it provides additional individual risk information.

Funding Sources: This study was supported by the European Union, FP7-ICT-2007-2, project

number 224635, “VPH2-Virtual Pathological Heart of the Virtual Physiological Human”. EB is

supported by a Heisenberg professorship from the Deutsche Forschungsgemeinschaft (Br1589/8-

2).

Conflict of Interest Disclosures: None


References:

1. Chan IS, Ginsburg GS. Personalized medicine: progress and promise. Annu Rev Genomics Hum Genet. 2011;12:217-244.

2. Thanassoulis G, Vasan RS. Genetic cardiovascular risk prediction: will we get there? Circulation. 2010;122:2323-2334.

3. Marenberg ME, Risch N, Berkman LF, Floderus B, de Faire U. Genetic susceptibility to death from coronary heart disease in a study of twins. N Engl J Med. 1994;330:1041-1046.

4. Brand-Herrmann SM. Where do we go for atherothrombotic disease genetics? Stroke. 2008;39:1070-1075.

5. Yusuf S, Hawken S, Ounpuu S, Dans T, Avezum A, Lanas F, et al. INTERHEART Study Investigators. Effect of potentially modifiable risk factors associated with myocardial infarction in 52 countries (the INTERHEART study): case-control study. Lancet. 2004;364:937-952.

6. Khot UN, Khot MB, Bajzer CT, Sapp SK, Ohman EM, Brener SJ, et al. Prevalence of conventional risk factors in patients with coronary heart disease. JAMA. 2003;290:898-904.

7. Humphries SE, Drenos F, Ken-Dror G, Talmud PJ. Coronary heart disease risk prediction in the era of genome-wide association studies: current status and what the future holds. Circulation. 2010;121:2235-2248.

8. Gatsios D, Garofalakis J, Chrysanthakopoulou T, Tripoliti E, De Maria R, Franzosi MG, et al. Knowledge extraction in a population suffering from heart failure. ITAB. 2010;1-6.

9. Holzmeister J, Leclercq C. Implantable cardioverter defibrillators and cardiac resynchronisation therapy. Lancet. 2011;378:722-730.

10. Birnie DH, Tang ASL. The problem of non-response to cardiac resynchronization therapy. Curr Opin Cardiol. 2006;21:20-26.

11. Fornwalt BK, Sprague WW, BeDell P, Suever JD, Gerritse B, Merlino JD, et al. Agreement is poor among current criteria used to define response to cardiac resynchronization therapy. Circulation. 2010;121:1985-1991.

12. St John Sutton MG, Plappert T, Abraham WT, Smith AL, DeLurgio DB, Leon AR, et al, Multicenter In-Sync Randomized Clinical Evaluation (MIRACLE) Study Group. Effect of cardiac resynchronization therapy on left ventricular size and function in chronic heart failure. Circulation. 2003;107:1985-1990.

13. Ghio S, Freemantle N, Scelsi L, Serio A, Magrini G, Pasotti M, et al. Long-term left ventricular reverse remodeling with cardiac resynchronization therapy: results from the CARE-HF trial. Eur J Heart Fail. 2009;11:480-488.


14. Yu CM, Bleeker GB, Fung JW, Schalij MJ, Zhang Q, van der Wall EE, et al. Left ventricular reverse remodeling but not clinical improvement predicts long-term survival after cardiac resynchronization therapy. Circulation. 2005;112:1580-1586.

15. Ypenburg C, van Bommel RJ, Borleffs CJ, Bleeker GB, Boersma E, Schalij MJ, et al. Long-term prognosis after cardiac resynchronization therapy is related to the extent of left ventricular reverse remodeling at midterm follow-up. J Am Coll Cardiol. 2009;53:483-490.

16. Foley PW, Chalil S, Khadjooi K, Irwin N, Smith RE, Leyva F. Left ventricular reverse remodeling, long-term clinical outcome, and mode of death after cardiac resynchronization therapy. Eur J Heart Fail. 2011;13:43-51.

17. Yu CM, Hayes DL. Cardiac resynchronization therapy: state of the art 2013. Eur Heart J. 2013;34:1396-1403.

18. van Bommel RJ, Bax JJ, Abraham WT, Chung ES, Pires LA, Tavazzi L, et al. Characteristics of heart failure patients associated with good and poor response to cardiac resynchronization therapy: a PROSPECT (Predictors of Response to CRT) sub-analysis. Eur Heart J. 2009;30:2470-2477.

19. Wikstrom G, Blomström-Lundqvist C, Andren B, Lönnerholm S, Blomström P, Freemantle N, et al. The effects of aetiology on outcome in patients treated with cardiac resynchronization therapy in the CARE-HF trial. Eur Heart J. 2009;30:782-788.

20. Adelstein EC, Tanaka H, Soman P, Miske G, Haberman SC, Saba SF, et al. Impact of scar burden by single-photon emission computed tomography myocardial perfusion imaging on patient outcomes following cardiac resynchronization therapy. Eur Heart J. 2011;32:93-103.

21. Linde C, Abraham WT, Gold MR, Daubert C; REVERSE Study Group. Cardiac resynchronization therapy in asymptomatic or mildly symptomatic heart failure patients in relation to etiology: results from the REVERSE (REsynchronization reVErses remodelling in Systolic Left vEntricular Dysfunction) study. J Am Coll Cardiol. 2010;56:1826-1831.

22. De Maria R, Landolina M, Gasparini M, Schmitz B, Campolo J, Parolini M, et al. Genetic variants of the renin-angiotensin-aldosterone system and reverse remodeling after cardiac resynchronization therapy. J Card Fail. 2012;18:762-768.

23. Dickstein K, Vardas PE, Auricchio A, Daubert JC, Linde C, McMurray J, et al. 2010 Focused update of ESC Guidelines on device therapy in heart failure. Eur Heart J. 2010;31:2677-2687.

24. Rudski LG, Lai WW, Afilalo J, Hua L, Handschumacher MD, Chandrasekaran K, et al. Guidelines for the echocardiographic assessment of the right heart in adults: a report from the American Society of Echocardiography endorsed by the European Association of Echocardiography, a registered branch of the European Society of Cardiology, and the Canadian


Society of Echocardiography. J Am Soc Echocardiogr. 2010;23:685-713.

25. Schmitz B, Nedele J, Guske K, Maase M, Lenders M, Schelleckes M, et al. Soluble Adenylyl Cyclase in Vascular Endothelium: Gene Expression Control of Epithelial Sodium Channel-Na+/K+-ATPase- Hypertension. 2014;[Epub ahead of print].

26. Siffert W, Rosskopf D, Siffert G, Busch S, Moritz A, Erbel R, et al. Association of a human G-protein beta3 subunit variant with hypertension. Nat Genet. 1998;18:45-48.

27. Dobrev D, Wettwer E, Himmel HM, Kortner A, Kuhlisch E, Schuler S, et al. G-Protein beta(3)-subunit 825T allele is associated with enhanced human atrial inward rectifier potassium currents. Circulation. 2000;102:692-697.

28. Wenzel RR, Siffert W, Bruck H, Philipp T, Schäfers RF. Enhanced vasoconstriction to endothelin-1, angiotensin II and noradrenaline in carriers of the GNB3 825T allele in the skin microcirculation. Pharmacogenetics. 2002;12:489-495.

29. Smith JG, Avery CL, Evans DS, Nalls MA, Meng YA, Smith EN, et al. Impact of ancestry and common genetic variants on QT interval in African Americans. Circ Cardiovasc Genet. 2012;5:647-655.

30. Schwinger RH, Bundgaard H, Müller-Ehmsen J, Kjeldsen K. The Na, K-ATPase in the failing human heart. Cardiovasc Res. 2003;57:913-920.

31. Newton-Cheh C, Eijgelsheim M, Rice KM, de Bakker PI, Yin X, Estrada K, et al. Common variants at ten loci influence QT interval duration in the QTGEN Study. Nat Genet. 2009;41:399-406.

32. Ueland T, Yndestad A, Øie E, Florholmen G, Halvorsen B, Frøland SS, et al. Dysregulated osteoprotegerin/RANK ligand/RANK axis in clinical and experimental heart failure. Circulation. 2005;111:2461-2468.

33. Røysland R, Masson S, Omland T, Milani V, Bjerre M, Flyvbjerg A, et al. Prognostic value of osteoprotegerin in chronic heart failure: The GISSI-HF trial. Am Heart J. 2010;160:286-293.

34. Ueland T, Dahl CP, Kjekshus J, Hulthe J, Böhm M, Mach F, et al. Osteoprotegerin predicts progression of chronic heart failure: results from CORONA. Circ Heart Fail. 2011;4:145-152.

35. Kusche-Vihrog K, Callies C, Fels J, Oberleithner H. The epithelial sodium channel (ENaC): Mediator of the aldosterone response in the vascular endothelium? Steroids. 2010;75:544-549.

36. Benjamini Y, Yekutieli D. The control of the false discovery rate in multiple testing under dependency. Ann Stat. 2001;29:1165-1188.

37. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more


correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44:837-845.

38. Quinlan RJ. C4.5: programs for machine learning. San Francisco, CA: Morgan Kaufmann; 1993.

39. Cohen W. Fast effective rule induction. In Morgan Kaufmann. 1995;115-123.

40. Murphy MK, Black NA, Lamping DL, McKee CM, Sanderson CF, Askham J, et al. Consensus development methods, and their use in clinical guideline development. Health Technol Assess. 1998;2:1-88.

41. Frank E, Witten IH. Generating Accurate Rule Sets Without Global Optimization. Machine Learning: Proceedings of the Fifteenth International Conference, Morgan Kaufmann Publishers, San Francisco. 1998;144-151.

42. Derval N, Steendijk P, Gula LJ, Deplagne A, Laborderie J, Sacher F, et al. Optimizing hemodynamics in heart failure patients by systematic screening of left ventricular pacing sites: the lateral left ventricular wall and the coronary sinus are rarely the best sites. J Am Coll Cardiol. 2010;55:566-575.

43. Mullens W, Grimm RA, Verga T, Dresing T, Starling RC, Wilkoff BL, et al. Insights from a cardiac resynchronization optimization clinic as part of a heart failure disease management program. J Am Coll Cardiol. 2009;53:765-773.

44. Bertini M, Delgado V, Bax JJ, Van de Veire NR. Why, how and when do we need to optimize the setting of cardiac resynchronization therapy? Europace. 2009;Suppl5:v46-57.

45. Höke U, Thijssen J, van Bommel RJ, van Erven L, van der Velde ET, Holman ER, et al. Influence of diabetes on left ventricular systolic and diastolic function and on long-term outcome after cardiac resynchronization therapy. Diabetes Care. 2013;36:985-991.

46. Stengård JH, Dyson G, Frikke-Schmidt R, Tybjærg-Hansen A, Nordestgaard BG, Sing CF. Context-dependent associations between variation in risk of ischemic heart disease and variation in the 5’ promoter region of the Apolipoprotein E gene in Danish women. Circ Cardiovasc Interv. 2010;3:22-30.

47. Austin PC, Tu JV, Ho JE, Levy D, Lee DS. Using methods from the data-mining and machine-learning literature for disease classification and prediction: a case study examining classification of heart failure subtypes. J Clin Epidemiol. 2013;66:398-407.

48. Kathiresan S, Melander O, Anevski D, Guiducci C, Burtt NP, Roos C, et al. Polymorphisms associated with cholesterol and risk of cardiovascular events. N Engl J Med. 2008;358:1240-1249.

49. Stavrakis S, Lazzara R, Thadani U. The benefit of cardiac resynchronization therapy and


QRS duration: a meta-analysis. J Cardiovasc Electrophysiol. 2012;23:163-168.

50. Hlatky MA, Greenland P, Arnett DK, Ballantyne CM, Criqui MH, Elkind MS, et al. Criteria for evaluation of novel markers of cardiovascular risk: a scientific statement from the American Heart Association. Circulation. 2009;119:2408-2416.




Table 1: Baseline characteristics of the study population

Values are expressed as n (frequency percent) or median [interquartile range]. P-values for categorical variables were calculated by chi-square or Fisher’s exact test, p-values for non-categorical variables were calculated by Student’s t- or Mann-Whitney test. IHD, ischaemic heart disease; IDC, idiopathic dilated cardiomyopathy; VALV, valvular defect; LVEF, left ventricular ejection fraction; LVEDV, left ventricular end diastolic volume; LVESV, left ventricular end systolic volume; MI, myocardial infarction; RAS, renin-angiotensin system; RR, reverse remodelling.

Characteristic               All (n=156)     RR+ (n=80)      RR- (n=76)      P-value

Anthropometry
Gender (male)                136 (87%)       67 (84%)        69 (91%)        0.234
Age (years)                  62 [56-70]      64 [57-71]      61 [56-70]      0.681
Type 2 diabetes mellitus     27 (17%)        9 (11%)         18 (24%)        0.057
History of hypertension      43 (27%)        21 (28%)        22 (31%)        0.717
Previous MI                  63 (41%)        30 (39%)        33 (44%)        0.514
Atrial fibrillation          25 (16%)        11 (14%)        14 (18%)        0.514
Aetiology                                                                    0.157
IHD                          79 (51%)        39 (49%)        40 (53%)        -
IDC                          66 (42%)        38 (47%)        28 (37%)        -
VALV                         11 (7%)         3 (4%)          8 (10%)         -

Medications
Beta-blockers                126 (82%)       63 (82%)        63 (83%)        1.000
RAS inhibitors               149 (96%)       74 (94%)        75 (99%)        0.210
Aldosterone antagonists      97 (64%)        46 (61%)        51 (67%)        0.500

Echocardiography
NYHA class II (vs III-IV)    45 (29%)        22 (28%)        23 (30%)        0.727
LVEF (%)                     27 [22-30]      27 [22-30]      27 [23-30]      0.665
LVEDV (ml)                   227 [190-310]   230 [200-330]   227 [174-295]   0.253
LVESV (ml)                   170 [135-231]   178 [140-240]   164 [121-222]   0.253
QRS duration (msec)          160 [140-180]   169 [150-188]   160 [140-180]   0.163
Follow-up (months)           9 [7-12]        10 [7-12]       9 [7-12]        0.879


Table 2: Genotype and allele frequencies

Gene SNP Minor allele Alleles or genotype RR+ (n=80) RR- (n=76) P-value (allele) Genotype model P-value (genotype)

SCNN1A rs3759324 C T/C 125/33 129/21 0.134 CT+CC vs. TT 0.123

CC 2 (2%) 1 (1%)

TT 48 (61%) 55 (73%)

CT 29 (37%) 19 (25%)

SCNN1G rs5723 G C/G 158/2 138/12 0.005 GG+CG vs. CC 0.057

CC 79 (99%) 69 (92%)

GG 1 (1%) 6 (8%)

CG 0 (0%) 0 (0%)

ATP1B1 rs3766031 T C/T 131/29 138/12 0.011 TT+CT vs. CC 0.005

CC 52 (65%) 64 (85%)

TT 1 (1%) 1 (1%)

CT 27 (34%) 10 (13%)

GNB3 rs5443 T C/T 93/67 110/38 0.004 TT+CT vs. CC 0.006

CC 26 (33%) 41 (55%)

TT 13 (16%) 5 (7%)

CT 41 (51%) 28 (38%)

TNFSF11 rs7325635 A G/A 108/52 83/67 0.035 AA+AG vs. GG 0.031

GG 36 (45%) 21 (28%)

AA 8 (10%) 13 (17%)

AG 36 (45%) 41 (55%)

NR3C2 rs5522 C T/C 150/10 126/24 0.006 CT+CC vs. TT 0.014

TT 71 (89%) 54 (72%)

CC 1 (1%) 3 (4%)

CT 8 (10%) 18 (24%)

Values are expressed as n (frequency percent). P-values for categorical variables were calculated by chi-square or Fisher’s exact test. SCNN1A, epithelial sodium channel alpha subunit; SCNN1G, epithelial sodium channel gamma subunit; ATP1B1, sodium/potassium-transporting ATPase subunit beta-1; GNB3, guanine nucleotide binding protein (G protein), beta polypeptide 3; TNFSF11, tumor necrosis factor (ligand) superfamily, member 11 (RANKL); NR3C2, mineralocorticoid receptor. Underlined p-values mark associations that remained significant after correction for multiple testing (clinical and genetic variant comparisons combined) by the Benjamini and Yekutieli false discovery rate method (p ).
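The allele counts in Table 2 follow directly from the genotype counts: each homozygote contributes two copies of one allele and each heterozygote one copy of each. The sketch below checks this for GNB3 rs5443 and also computes a crude (unadjusted) odds ratio for minor-allele carriers (TT+CT) versus CC. The function and variable names are illustrative assumptions, and the crude odds ratio is not expected to equal the covariate-adjusted estimate reported in Table 3.

```python
def allele_counts(genotype_counts: dict) -> dict:
    """Count alleles from a mapping of two-letter genotypes to patient counts."""
    counts: dict = {}
    for genotype, n_patients in genotype_counts.items():
        for allele in genotype:  # "CT" contributes one C and one T per patient
            counts[allele] = counts.get(allele, 0) + n_patients
    return counts


# GNB3 rs5443 genotype counts as printed in Table 2.
gnb3_rr_pos = {"CC": 26, "TT": 13, "CT": 41}
gnb3_rr_neg = {"CC": 41, "TT": 5, "CT": 28}

print(allele_counts(gnb3_rr_pos))  # C/T counts in the RR+ group
print(allele_counts(gnb3_rr_neg))  # C/T counts in the RR- group

# Dominant model for the minor allele: carriers (TT+CT) versus CC.
carriers_pos = gnb3_rr_pos["TT"] + gnb3_rr_pos["CT"]
carriers_neg = gnb3_rr_neg["TT"] + gnb3_rr_neg["CT"]
crude_odds_ratio = (carriers_pos / gnb3_rr_pos["CC"]) / (carriers_neg / gnb3_rr_neg["CC"])
print(f"crude odds ratio for RR+ in T carriers: {crude_odds_ratio:.2f}")
```

The recomputed allele counts agree with the C/T entries of the GNB3 row (93/67 and 110/38), which is a quick consistency check on the extracted table.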


Table 3: Multivariate logistic regression analysis of genotypes associated with RR+

Variable P-value Odds ratio 95% confidence interval

Age 0.950 0.999 0.957-1.042

Female gender 0.048 3.855 1.010-14.721

LVEF 0.402 1.037 0.952-1.130

LVEDV 0.109 1.005 0.999-1.011

Atrial fibrillation 0.781 1.182 0.365-3.832

NYHA class II vs. III-IV 0.698 1.176 0.518-2.673

Type 2 diabetes mellitus 0.006 0.227 0.078-0.660

Ischemic aetiology (reference) 0.043

Idiopathic dilated cardiomyopathy 0.898 0.945 0.395-2.259

Valvular heart disease 0.017 0.109 0.018-0.675

GNB3 (TT+CT vs. CC) 0.004 3.155 1.434-6.941

ATP1B1 (TT+CT vs. CC) 0.024 2.853 1.149-7.084

TNFSF11 (AA+AG vs. GG) 0.051 0.436 0.189-1.005

NR3C2 (CC+CT vs. TT) 0.022 0.320 0.120-0.851

LVEF, left ventricular ejection fraction; LVEDV, left ventricular end diastolic volume; ATP1B1, Sodium/potassium-transporting ATPase subunit beta-1; GNB3, guanine nucleotide binding protein (G protein), beta polypeptide 3; TNFSF11, tumor necrosis factor (ligand) superfamily, member 11 (RANKL); NR3C2, mineralocorticoid receptor.
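The three columns of Table 3 are linked: in a logistic regression the odds ratio is exp(β), and a Wald-type 95% confidence interval is exp(β ± 1.96·SE). The sketch below works backwards from a reported odds ratio and interval to β, its standard error and an approximate two-sided p-value. It assumes Wald intervals, which the table does not state explicitly, and the function name is illustrative.

```python
from math import erfc, log, sqrt

Z_975 = 1.959964  # 97.5th percentile of the standard normal distribution


def wald_from_odds_ratio(odds_ratio: float, ci_low: float, ci_high: float):
    """Recover beta, SE, z and a two-sided p-value from an OR and its 95% CI.

    Assumes the interval was built as exp(beta +/- Z_975 * SE).
    """
    beta = log(odds_ratio)
    se = (log(ci_high) - log(ci_low)) / (2 * Z_975)
    z = beta / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p_value


# GNB3 (TT+CT vs. CC) row of Table 3.
beta, se, z, p_value = wald_from_odds_ratio(3.155, 1.434, 6.941)
print(f"beta={beta:.3f} SE={se:.3f} z={z:.2f} p={p_value:.4f}")
```

For the GNB3 row the recovered p-value is close to the tabulated 0.004, which suggests the printed interval and p-value are mutually consistent under this assumption.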


Table 4: Specificity, sensitivity and accuracy* of the applied machine learning algorithms

*Accuracy results are based on the 10-fold cross-validation approach except for Decision Table and Voting Feature Intervals in which the “Leave One Out” method was used.

Dataset “Clinical” “Genotypes” “Alleles” “Clinical & Genotypes” “Clinical & Alleles”

Method specificity sensitivity accuracy specificity sensitivity accuracy specificity sensitivity accuracy specificity sensitivity accuracy specificity sensitivity accuracy

Bayes Network 49.00% 70.09% 59.90% 75.00% 53.95% 64.74% 75.00% 55.26% 65.38% 68.75% 76.32% 72.44% 73.75% 67.11% 70.51%

Naive Bayes 62.00% 58.88% 60.39% 75.00% 55.26% 65.38% 75.00% 57.89% 66.67% 76.25% 76.32% 76.28% 77.50% 75.00% 76.28%

Multilayer Perceptron 85.00% 87.85% 86.47% 58.75% 93.42% 75.64% 51.25% 96.05% 73.08% 98.75% 100.00% 99.36% 98.75% 98.68% 98.72%

RBF Network 51.00% 67.29% 59.42% 67.50% 65.79% 66.67% 63.75% 72.37% 67.95% 77.50% 69.74% 73.72% 77.50% 69.74% 73.72%

K Nearest Neighbors 98.00% 97.20% 97.58% 62.50% 89.47% 75.64% 62.50% 89.47% 75.64% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00%

HyperPipes 100.00% 5.61% 51.21% 100.00% 0.00% 51.28% 100.00% 0.00% 51.28% 100.00% 5.26% 53.85% 100.00% 5.26% 53.85%

Voting Feature Intervals 74.00% 50.47% 61.84% 73.75% 56.58% 65.38% 75.00% 56.58% 66.03% 73.75% 76.32% 75.00% 77.50% 69.74% 73.72%

Decision Table 34.00% 84.11% 59.90% 57.50% 75.00% 66.03% 61.25% 68.42% 64.74% 67.50% 65.79% 66.67% 68.75% 67.11% 67.95%

Decision Table Naive Bayes Combination 65.00% 63.55% 64.25% 52.50% 85.53% 68.59% 53.75% 81.58% 67.31% 77.50% 77.63% 77.56% 76.25% 73.68% 75.00%

RIPPER 44.00% 73.83% 59.42% 62.50% 65.79% 64.10% 68.75% 63.16% 66.03% 73.75% 47.37% 60.90% 67.50% 53.95% 60.90%

Non Nested Generalised Exemplars 100.00% 98.13% 99.03% 63.75% 75.00% 69.23% 63.75% 75.00% 69.23% 100.00% 98.68% 99.36% 100.00% 100.00% 100.00%

PART 69.00% 90.65% 80.19% 53.75% 89.47% 71.15% 50.00% 93.42% 71.15% 87.50% 81.58% 84.62% 83.75% 97.37% 90.38%

C4.5 66.00% 89.72% 78.26% 60.00% 84.21% 71.79% 61.25% 78.95% 69.87% 87.50% 85.53% 86.54% 77.50% 88.16% 82.69%

Random Forest 99.00% 100.00% 99.52% 57.50% 94.74% 75.64% 57.50% 94.74% 75.64% 98.75% 100.00% 99.36% 100.00% 98.68% 99.36%

Random Tree 100.00% 100.00% 100.00% 60.00% 92.11% 75.64% 60.00% 92.11% 75.64% 100.00% 100.00% 100.00% 100.00% 100.00% 100.00%
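The three metrics in Table 4 are all derived from the same confusion matrix, with RR+ as the positive class. The sketch below shows the relationship using the Bayes Network result on the “Clinical” data set. The raw counts are an assumption: they are back-calculated from the printed percentages and from the group sizes of the data-mining cohort (107 RR+ and 100 RR-, Supplemental table 2) and are not reported in the table itself.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) as fractions."""
    sensitivity = tp / (tp + fn)   # share of RR+ patients predicted RR+
    specificity = tn / (tn + fp)   # share of RR- patients predicted RR-
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Assumed counts for Bayes Network, "Clinical" data set:
# 75 of 107 RR+ and 49 of 100 RR- patients classified correctly.
sens, spec, acc = classification_metrics(tp=75, fn=32, tn=49, fp=51)
print(f"specificity={spec:.2%} sensitivity={sens:.2%} accuracy={acc:.2%}")
```

Because the two groups are of nearly equal size, accuracy sits close to the mean of sensitivity and specificity. This is why a classifier such as HyperPipes can reach about 51% accuracy while identifying almost no RR+ patients.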


Table 5: Rules for CRT patient classification generated by the PART algorithm

Rule Class Patients Correct Wrong Accuracy

Based on clinical parameters

LVEF <= 30 AND aetiology = IHD RR- 10 8 2 80.00%

aetiology = VALV AND Age > 45 RR- 10 8 2 80.00%

NYHA = 3 AND aetiology = IHD AND LVESV <= 220 RR- 21 18 3 85.71%

aetiology = IDC AND sex = F AND LVESV <= 156 RR- 7 6 1 85.71%

aetiology = IDC AND diabetes = Yes RR- 8 7 1 87.50%

NYHA = 2 AND chronic_AF = Yes RR- 9 8 1 88.89%

sex = M AND aetiology = IDC AND NYHA = 3 RR- 10 9 1 90.00%

aetiology = IHD AND diabetes = No AND NYHA = 3 AND LVESV > 125 AND LVESV <= 220 AND LVEF > 25 AND LVEDV <= 268 RR+ 12 11 1 91.67%

sex = F AND Age <= 63 RR+ 16 15 1 93.75%

diabetes = Yes AND NYHA = 3 AND aetiology = IHD AND LVEDV <= 230 RR- 9 9 0 100.00%

aetiology = IHD AND sex = M AND NYHA = 2 AND sustained_VA = Yes AND LVESV <= 150 RR- 5 5 0 100.00%

Based on alleles

rs5522 = T AND rs3766031 = T AND rs7325635 = G RR+ 21 17 4 80.95%

rs5723 = G RR- 7 6 1 85.71%

rs5723 = C AND rs5443 = T AND rs7325635 = G RR+ 24 23 1 95.83%

Based on genotypes

rs3766031 = CT AND rs5522 = TT RR+ 31 25 6 80.65%

rs5723 = CC AND rs5443 = TT RR+ 11 9 2 81.82%

rs5723 = CC AND rs5443 = TT AND rs5522 = TT RR+ 10 9 1 90.00%

rs5723 = CC AND rs7325635 = GG AND rs5522 = TT RR+ 10 9 1 90.00%

rs3766031 = CT AND rs5522 = TT AND rs7325635 = GG RR+ 13 12 1 92.31%

rs5723 = GG RR- 6 6 0 100.00%

Based on clinical parameters and alleles

diabetes = No AND aetiology = IDC AND rs7325635 = G AND sex = M AND rs5443 = C AND LVEDV > 262 RR- 5 4 1 80.00%

LVESV <= 266 AND sex = M AND aetiology = IHD AND rs5443 = T RR- 7 6 1 85.71%

rs5723 = C AND rs5522 = C AND LVEF <= 34 AND NYHA = 3 RR- 17 15 2 88.24%

rs5723 = C AND diabetes = Yes AND sex = M AND LVEF > 15 RR- 12 11 1 91.67%

rs5723 = C AND diabetes = No AND rs7325635 = G AND aetiology = IDC RR+ 15 14 1 93.33%

diabetes = No AND rs5443 = T AND LVEDV > 197 RR+ 26 25 1 96.15%

Based on clinical parameters and genotypes

LVEF <= 31 AND aetiology = IDC AND diabetes = No AND rs7325635 = GA RR- 10 8 2 80.00%

rs5723 = CC AND rs5443 = CT AND diabetes = No AND LVEDV > 190 RR+ 37 30 7 81.08%

rs5723 = CC AND NYHA = 3 AND rs5522 = CT RR- 16 13 3 81.25%

rs5723 = CC AND rs3766031 = CC AND rs5522 = TT AND sex = M AND diabetes = Yes RR- 10 9 1 90.00%

rs7325635 = AA AND NYHA = 3 RR- 4 4 0 100.00%

NYHA = 2 AND rs5522 = TT RR- 8 8 0 100.00%

The table presents selected rules based on different classifiers using the PART algorithm. Only rules with accuracy

LVEF, left ventricular ejection fraction; LVEDV, left ventricular end diastolic volume; LVESV, left ventricular end systolic volume; RR, reverse remodelling, VA, ventricular arrhythmias, AF, atrial fibrillation.
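Each row of Table 5 is a conjunction of conditions with a predicted class, and its accuracy is the number of correctly classified patients divided by the number of patients the rule covers (for example 15/16 = 93.75% for "sex = F AND Age <= 63"). The sketch below shows how such a rule could be represented and scored. The `Rule` class, the `evaluate` helper and the four patient records are hypothetical and serve only to illustrate the bookkeeping; they are not study data or the PART implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Patient = Dict[str, object]


@dataclass
class Rule:
    text: str                                     # human-readable rule
    predicted: str                                # "RR+" or "RR-"
    conditions: List[Callable[[Patient], bool]]   # all must hold (AND)

    def covers(self, patient: Patient) -> bool:
        return all(condition(patient) for condition in self.conditions)


def evaluate(rule: Rule, patients: List[Patient]) -> Tuple[int, int, int, Optional[float]]:
    """Return (patients covered, correct, wrong, accuracy in percent)."""
    covered = [p for p in patients if rule.covers(p)]
    correct = sum(1 for p in covered if p["outcome"] == rule.predicted)
    wrong = len(covered) - correct
    accuracy = 100.0 * correct / len(covered) if covered else None
    return len(covered), correct, wrong, accuracy


rule = Rule(
    text="sex = F AND Age <= 63",
    predicted="RR+",
    conditions=[lambda p: p["sex"] == "F", lambda p: p["age"] <= 63],
)

# Hypothetical patient records, invented for this example only.
patients = [
    {"sex": "F", "age": 58, "outcome": "RR+"},
    {"sex": "F", "age": 61, "outcome": "RR-"},
    {"sex": "F", "age": 70, "outcome": "RR+"},
    {"sex": "M", "age": 55, "outcome": "RR-"},
]

result = evaluate(rule, patients)
print(rule.text, "->", rule.predicted, result)
```

Rules that cover very few patients, such as those with 4 to 6 patients and 100% accuracy, carry little evidence on their own, so the "Patients" column is worth reading alongside the accuracy.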


Figure Legends

Figure 1: Flow chart of the CRT study analysis

Figure 2: Median changes in LVEDV and LVEF for RR+ and RR- groups. Changes in LVEDV and LVEF were compared between baseline and follow-up and are presented as box plots. Significant differences, resulting from the defined remodelling phenotypes, were found between RR+ and RR- for volume (p<0.001) and function changes (p<0.001).

Figure 3: Receiver operating characteristic (ROC) curves for clinical data alone and for clinical data combined with genetic data. The two models resulted in significantly different ROC curves (p=0.002). The C statistic documented the incremental predictive value of the model combining clinical data with genetic information.
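The C statistic mentioned in the legend equals the area under the ROC curve, which can be read as the probability that a randomly chosen RR+ patient receives a higher predicted score than a randomly chosen RR- patient. The sketch below computes it by direct pairwise comparison. The scores are hypothetical values chosen for illustration, and the statistical comparison of two curves (the reported p=0.002) is not covered here.

```python
from typing import Sequence


def c_statistic(scores_pos: Sequence[float], scores_neg: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    concordant = 0.0
    for score_pos in scores_pos:
        for score_neg in scores_neg:
            if score_pos > score_neg:
                concordant += 1.0
            elif score_pos == score_neg:
                concordant += 0.5
    return concordant / (len(scores_pos) * len(scores_neg))


# Hypothetical predicted probabilities of reverse remodelling.
scores_rr_pos = [0.9, 0.8, 0.6, 0.4]
scores_rr_neg = [0.7, 0.5, 0.3, 0.2]

auc = c_statistic(scores_rr_pos, scores_rr_neg)
print(f"C statistic: {auc:.4f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so the incremental value of genetic data shows up as an increase in this number.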



Identification of Genetic Markers for Treatment Success in Heart Failure Patients: Insight from Cardiac Resynchronization Therapy

Boris Schmitz, Renata DeMaria, Dimitris Gatsios, Theodora Chrysanthakopoulou, Maurizio Landolina, Maurizio Gasparini, Jonica Campolo, Marina Parolini, Antonio Sanzo, Paola Galimberti, Michele Bianchi, Malte Lenders, Eva Brand, Oberdan Parodi, Maurizio Lunati and Stefan-Martin Brand

Circ Cardiovasc Genet. published online September 10, 2014

Circulation: Cardiovascular Genetics is published by the American Heart Association, 7272 Greenville Avenue, Dallas, TX 75231. Copyright © 2014 American Heart Association, Inc. All rights reserved. Print ISSN: 1942-325X. Online ISSN: 1942-3268

The online version of this article, along with updated information and services, is located on the World Wide Web at: http://circgenetics.ahajournals.org/content/early/2014/09/08/CIRCGENETICS.113.000384

Data Supplement (unedited) at: http://circgenetics.ahajournals.org/content/suppl/2014/09/10/CIRCGENETICS.113.000384.DC1



SUPPLEMENTAL MATERIAL

Supplemental Methods

Genotyping PCR conditions

TaqMan SNP genotyping assays were performed on the ABI7900 real-time PCR system (Life Technologies Corporation, Carlsbad, USA) in a 384-well format (2.5 μl TaqMan Genotyping Master Mix [2x], 0.125 μl TaqMan SNP Genotyping Assay [40x], 2.375 μl DNase-free water and 2 ng DNA). Real-time PCR conditions were as follows: initial denaturation at 95°C for 10 min; 40 cycles of 95°C for 15 sec and 60°C for 1 min.
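The listed volumes add up to a 5 μl reaction in which both the 2x master mix and the 40x assay end up at 1x. The sketch below checks that arithmetic and scales the mix to a full plate. It assumes that the 2 ng of DNA contributes no liquid volume (the text gives a mass only), and the 10% pipetting overage is a hypothetical default rather than a value from the protocol.

```python
# Per-reaction volumes in microlitres, as given in the protocol.
PER_REACTION_UL = {
    "TaqMan Genotyping Master Mix (2x)": 2.5,
    "TaqMan SNP Genotyping Assay (40x)": 0.125,
    "DNase-free water": 2.375,
}

# Stock concentrations as fold values (water has none).
STOCK_FOLD = {
    "TaqMan Genotyping Master Mix (2x)": 2.0,
    "TaqMan SNP Genotyping Assay (40x)": 40.0,
}

# Assumption: the 2 ng DNA adds no volume to the reaction.
reaction_volume_ul = sum(PER_REACTION_UL.values())


def final_fold(component: str) -> float:
    """Final concentration (fold) of a stock reagent in the reaction."""
    return STOCK_FOLD[component] * PER_REACTION_UL[component] / reaction_volume_ul


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Total volume of each component for n reactions plus pipetting overage."""
    factor = n_reactions * (1 + overage)
    return {name: round(volume * factor, 3) for name, volume in PER_REACTION_UL.items()}


print(f"reaction volume: {reaction_volume_ul} ul")
for reagent in STOCK_FOLD:
    print(f"{reagent}: {final_fold(reagent)}x final")
print(master_mix(384))
```

Preparing the mix for the whole plate at once, instead of pipetting 0.125 μl volumes into individual wells, keeps the assay concentration uniform across wells.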

Machine learning algorithms used in the study

Within each classifier, 15 different machine learning algorithms were applied. We used Random Forest,1 Decision Tables,2 Bayesian Network,3,4 Naive Bayes,3,4 Multilayer Perceptron,3,4 RBF Network,3 K Nearest Neighbors,4,5 HyperPipes,6 Voting Feature Intervals,7 Decision Table Naive Bayes Combination,8 Repeated Incremental Pruning to Produce Error Reduction (RIPPER),9 Non Nested Generalised Exemplars (NNGE),10 PART,11 Decision Tree Induction (C4.5)12 and Random Tree.6 The different methods were evaluated for their specificity, sensitivity and accuracy for the detection of RR+ and RR- individuals.
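The evaluation described above can be sketched as a stratified k-fold cross-validation loop in which every patient is held out exactly once and the pooled predictions yield sensitivity, specificity and accuracy. The example below uses a one-nearest-neighbour classifier on toy data in plain Python. The data, the fold assignment scheme, the seed and all function names are assumptions for illustration and do not reproduce the software or settings used in the study.

```python
import random
from typing import List, Sequence, Tuple


def stratified_folds(labels: Sequence[str], k: int, seed: int = 0) -> List[List[int]]:
    """Deal sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    by_class: dict = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds: List[List[int]] = [[] for _ in range(k)]
    slot = 0
    for indices in by_class.values():
        rng.shuffle(indices)
        for index in indices:
            folds[slot % k].append(index)
            slot += 1
    return folds


def nearest_neighbor(train_x, train_y, x) -> str:
    """Label of the training sample with the smallest squared Euclidean distance."""
    best = min(range(len(train_x)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    return train_y[best]


def cross_validate(features, labels, k: int = 10, positive: str = "RR+") -> Tuple[float, float, float]:
    """Return pooled (sensitivity, specificity, accuracy) over k folds."""
    tp = fn = tn = fp = 0
    for test_indices in stratified_folds(labels, k):
        held_out = set(test_indices)
        train = [i for i in range(len(labels)) if i not in held_out]
        train_x = [features[i] for i in train]
        train_y = [labels[i] for i in train]
        for i in test_indices:
            predicted = nearest_neighbor(train_x, train_y, features[i])
            if labels[i] == positive:
                tp, fn = (tp + 1, fn) if predicted == positive else (tp, fn + 1)
            else:
                tn, fp = (tn + 1, fp) if predicted != positive else (tn, fp + 1)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)


# Toy, well-separated data: ten samples per class with one feature each.
toy_features = [[float(i)] for i in range(10)] + [[100.0 + i] for i in range(10)]
toy_labels = ["RR-"] * 10 + ["RR+"] * 10

print(cross_validate(toy_features, toy_labels, k=10))
```

Setting k to the number of samples turns the same loop into the leave-one-out scheme. The essential point is that a sample is never part of the training set used to predict it.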


Supplemental References

1. Breiman L. Random Forests. Machine Learning. 2001;45:5-32.

2. Kohavi R. The Power of Decision Tables. 8th European Conference on Machine Learning. 1995;174-189.

3. Bishop C. Pattern Recognition and Machine Learning. 1st ed. New York: Springer; 2006.

4. Tan PN, Steinbach M, Kumar V. Introduction to Data Mining. Pearson Education Inc; 2006.

5. Aha D, Kibler D, Albert M. Instance-Based Learning Algorithms. Machine Learning. 1991;6:37-66.

6. Witten I, Frank E. Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann; 2000.

7. Demiroz G, Guvenir H. Classification by Voting Feature Intervals. Lecture Notes In Computer Science. 1997;1224:85-92.

8. Hall M, Frank E. Combining Naive Bayes and Decision Tables. Association for the Advancement of Artificial Intelligence; 2008.

9. Cohen W. Fast effective rule induction. In Morgan Kaufmann. 1995;115-123.

10. Martin B. Instance-Based Learning: Nearest Neighbor With Generalization. Thesis to the Department of Computer Science, University of Waikato, Hamilton, New Zealand; 1995.

11. Frank E, Witten IH. Generating Accurate Rule Sets Without Global Optimization. 1998;144-151.

12. Quinlan RJ. C4.5: programs for machine learning. San Francisco, CA: Morgan Kaufmann; 1993.


Supplemental table 1: List of independent classifiers used for the classification of heart failure patients into CRT responders and non-responders

“Clinical” “Genotypes” “Alleles” “Clinical & Genotypes” “Clinical & Alleles”

sex rs5443 (CC/TT/CT) rs5443 (C/T) sex sex

age rs3766031 (CC/TT/CT) rs3766031 (C/T) age age

aetiology of heart failure rs5723 (CC/GG/CG) rs5723 (C/G) aetiology of HF aetiology of HF

LVEF (Left Ventricular Ejection Fraction) rs5522 (TT/CC/CT) rs5522 (C/T) LVEF LVEF

LVESV (LV End Systolic Volume) rs7325635 (GG/AA/AG) rs7325635 (A/G) LVESV LVESV

chronic AF (Atrial Fibrillation) chronic AF chronic AF

NYHA classification NYHA classification NYHA classification

LVEDV (LV End Diastolic Volume) LVEDV LVEDV

diabetes diabetes diabetes

sustained VA (Ventricular Arrhythmias) sustained VA sustained VA

rs5443 (CC/TT/CT) rs5443 (C/T)

rs3766031 (CC/TT/CT) rs3766031 (C/T)

rs5723 (CC/GG/CG) rs5723 (C/G)

rs5522 (TT/CC/CT) rs5522 (C/T)

rs7325635 (GG/AA/AG) rs7325635 (A/G)

NYHA, New York Heart Association.


Supplemental table 2: Clinical baseline parameters of patients available for data mining

Characteristic All (n=207) RR+ (n=107) RR- (n=100) P-value

Gender (male) 174 (84%) 85 (79%) 89 (89%) 0.086

Age (years) 63 [57-70] 64 [57-69] 63 [56-70] 0.740

Type 2 diabetes mellitus 33 (16%) 12 (11%) 21 (21%) 0.060

Atrial fibrillation 35 (17%) 15 (14%) 20 (20%) 0.271

NYHA class 0.550

class II 64 (31%) 31 (29%) 33 (33%) -

class III-IV 143 (69%) 76 (71%) 67 (67%) -

Aetiology 0.169

IHD 98 (47%) 48 (45%) 50 (50%) -

IDC 94 (45%) 54 (51%) 40 (40%) -

VALV 15 (7%) 5 (5%) 10 (10%) -

Medication

Beta-blockers 158 (81%) 80 (80%) 78 (82%) 0.719

RAS inhibitors 186 (95%) 94 (94%) 92 (97%) 0.499

Echocardiography

LVEF (%) 26 [22-30] 27 [22-30] 26 [22-30] 0.979

LVEDV (ml) 224 [182-286] 225 [194-293] 223 [172-284] 0.429

LVESV (ml) 170 [131-222] 170 [135-232] 163 [121-220] 0.279

Follow-up (month) 9 [7-12] 9 [7-12] 10 [7-13] 0.246

Values are expressed as n (frequency percent) or median [interquartile range]. P-values for categorical variables were calculated by Chi-square or Fisher’s exact test, p-values for non-categorical variables were calculated by Student’s t- or Mann-Whitney test. IHD, ischaemic heart disease; IDC, idiopathic dilated cardiomyopathy; VALV, valvular defect; LVEF, left ventricular ejection fraction; LVEDV, left ventricular end diastolic volume; LVESV, left ventricular end systolic volume; MI, myocardial infarction; RAS, renin-angiotensin system; RR, reverse remodelling.


Supplemental table 3: Parameter settings for data mining algorithms used in the study

Algorithm Parameter settings

Bayesian Network# Search method: K2 algorithm

Maximum number of parents of a node: 1

Naive Bayes* none applied

Multilayer Perceptron

Hidden Layers: (number of attributes + number of classes)/2

Learning Rate: 0.3

Bias: 0.2

Normalization: From -1 to 1

All nominal attributes were converted into binary numeric attributes. An attribute with k values was transformed into k binary attributes if the class was nominal (using the one-attribute-per-value approach)

Epochs: 500

RBF Network

Minimum Standard Deviation: 0.1

The number of clusters generated by K means: 2

Ridge value for the logistic or linear regression: 1.00E-08

K Nearest Neighbors

No Distance Weighting

Search Algorithm: Linear Search

Distance Function: Euclidean Distance

HyperPipes Bias: 0.6

Voting Feature Intervals Weight feature intervals by confidence

Cross Validation: Leave One Out

Decision Table

Evaluation of attribute combinations using: Accuracy

Search method used to find good attribute combinations: Best First; Direction: Forward; Maximum size of the lookup cache: 1; Number of backtracks: 5

Cross Validation: Leave One Out

Decision Table Naive Bayes Combination

Measure used to evaluate the performance of attribute combinations: Accuracy

Evaluation of attribute combinations using forward selection (naive Bayes)/backward elimination (decision table)

Number of folds used for pruning: 3

RIPPER

Minimum total weight of the instances in a rule: 2

Number of optimization runs: 2

Number of attempts for generalization: 5

Non Nested Generalised Exemplars

Number of folders for mutual information: 2

Confidence factor for pruning: 0.25

PART

Minimum number of instances per rule: 2

Number of folds used for pruning: 3

Confidence factor for pruning: 0.25

C4.5


Number of folds used for pruning: 3

Maximum depth of the trees: Unlimited

Random Forest

Number of attributes to be used in random selection: Unlimited

Number of trees to be generated: 10

Maximum depth of the trees: Unlimited

Random Tree Number of attributes to be used in random selection: log_2(number of attributes) + 1


Modeling of continuous variables: #discretization by minimization heuristic; *assuming a Gaussian distribution.
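Two of the preprocessing steps listed for the Multilayer Perceptron can be shown concretely: rescaling a numeric attribute to the range -1 to 1 and expanding a nominal attribute with k values into k binary attributes (one attribute per value). The sketch below is a plain-Python illustration with invented input values. The function names are assumptions, and the exact behaviour of the data-mining software used in the study may differ in details such as category ordering.

```python
from typing import List, Sequence, Tuple


def scale_to_minus_one_one(values: Sequence[float]) -> List[float]:
    """Linearly rescale values so the minimum maps to -1 and the maximum to 1."""
    low, high = min(values), max(values)
    if high == low:
        return [0.0 for _ in values]  # constant attribute: no spread to scale
    return [2 * (value - low) / (high - low) - 1 for value in values]


def one_attribute_per_value(values: Sequence[str]) -> Tuple[List[str], List[List[int]]]:
    """Expand a nominal attribute with k values into k binary attributes."""
    categories = sorted(set(values))
    rows = [[1 if value == category else 0 for category in categories] for value in values]
    return categories, rows


# Invented example values (not study data).
lvef_percent = [20, 25, 30]
aetiology = ["IHD", "IDC", "VALV", "IHD"]

print(scale_to_minus_one_one(lvef_percent))
categories, encoded = one_attribute_per_value(aetiology)
print(categories)
print(encoded)
```

Rescaling keeps attributes with large numeric ranges, such as ventricular volumes in ml, from dominating attributes with small ranges when network weights are trained.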


Supplemental table 4: Results of Random Forest, C4.5, PART, Decision Table, Bayes Network and Multilayer Perceptron using different parameter values in the “Clinical” data set

Dataset “Clinical”

Method specificity sensitivity accuracy

Random Forest (2 Trees) 97.00% 91.59% 94.20%






C4.5 (min number of instances/leaf: 2) 58.00% 76.64% 67.63%





PART (min number of instances/rule: 2) 71.00% 77.57% 74.40%


PART (min number of instances/rule: 10) 68.00% 46.73% 57.00%



Decision Table (search method: BestFirst) 34.00% 84.11% 59.90%

Decision Table (search method: GreedyStepwise) 34.00% 84.11% 59.90%

Decision Table (search method: LinearForwardSelection) 34.00% 84.11% 59.90%

Decision Table (search method: RankSearch) 45.00% 76.64% 61.35%

Decision Table (search method: ScatterSearchV1) 45.00% 76.64% 61.35%

Decision Table (search method: SubsetSizeForwardSelection) 34.00% 84.11% 59.90%

Bayes Network (method for searching network structures: ICSSearchAlgorithm) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: Naive Bayes) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: gHillClimber) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: gK2) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: gRepeatedHillClimber) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: gSimulatedAnnealing) 60.00% 71.03% 65.70%

Bayes Network (method for searching network structures: gTabuSearch) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: lHillClimber) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: lK2) 49.00% 70.09% 59.90%

Bayes Network (method for searching network structures: lLAGDHillClimber) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: lRepeatedHillClimber) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: lSimulatedAnnealing) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: lTabuSearch) 0.00% 100.00% 51.69%

Bayes Network (method for searching network structures: lTAN) 45.00% 71.03% 58.45%

Multilayer Perceptron (1 hidden layer neurons = [number of attributes + number of classes]/2) 85.00% 87.85% 86.47%

Multilayer Perceptron (1 hidden layer 2 neurons) 39.00% 94.39% 67.63%

Multilayer Perceptron (1 hidden layer neurons = number of attributes) 82.00% 96.26% 89.37%

Multilayer Perceptron (1 hidden layer neurons = number of attributes + number of classes) 86.00% 92.52% 89.37%


Supplemental table 5: Results of Random Forest, C4.5, PART, Decision Table, Bayes Network and Multilayer Perceptron using different parameter values in the “Alleles” data set

Dataset “Alleles”





































Bayes Network (method for searching network structures: lTAN) 63.75% 71.05% 67.31%

Multilayer Perceptron (1 hidden layer neurons = [number of attributes + number of classes]/2) 51.25% 96.05% 73.08%


Multilayer Perceptron (1 hidden layer neurons = number of attributes) 57.50% 94.74% 75.64%

Multilayer Perceptron (1 hidden layer neurons = number of attributes + number of classes) 57.50% 94.74% 75.64%


Supplemental table 6: Results of Random Forest, C4.5, PART, Decision Table, Bayes Network and Multilayer Perceptron using different parameter values in the “Genotypes” data set

Dataset “Genotypes”









































Supplemental table 7: Results of Random Forest, C4.5, PART, Decision Table, Bayes Network and Multilayer Perceptron using different parameter values in the “Clinical & Alleles” data set

Dataset “Clinical & Alleles”









































Supplemental table 8: Results of Random Forest, C4.5, PART, Decision Table, Bayes Network and Multilayer Perceptron using different parameter values in the “Clinical & Genotypes” data set

Dataset “Clinical & Genotypes”







































