Original Paper
Asthma Exacerbation Prediction and Interpretation based on Time-sensitive Attentive Neural Network: A Retrospective Cohort Study
Yang Xiang (1), Ph.D.
Hangyu Ji (1,2), M.D.
Yujia Zhou (1), M.B.B.S.
Fang Li (1), Ph.D.
Jingcheng Du (1), Ph.D.
Laila Rasmy (1), M.S.
Stephen Wu (1), Ph.D.
Wenjin Jim Zheng (1), Ph.D.
Hua Xu (1), Ph.D.
Degui Zhi (1), Ph.D.
Yaoyun Zhang (1), Ph.D.
Cui Tao (1)*, Ph.D.
(1) School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, U.S.
(2) Division of Gastroenterology, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing, China
* corresponding author
medRxiv preprint doi: https://doi.org/10.1101/19012161; this version posted November 29, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.
Abstract
Background: Asthma exacerbation is an acute or sub-acute episode of progressive worsening of asthma symptoms and can have significant impacts on patients' daily life. In 2016, 12.4 million current asthmatics (46.9%) in the U.S. had at least one asthma exacerbation in the previous year.
Objective: The objectives of this study were to predict the risk of asthma exacerbations and to explore potential risk factors involved in progressive asthma.
Methods: We proposed a time-sensitive attentive neural network to predict asthma exacerbation using clinical variables from electronic health records (EHRs). The clinical variables were collected from the Cerner Health Facts® database between 1992 and 2015 and covered 31,433 adult patients with asthma. Interpretations on both the patient level and the cohort level were investigated based on the model parameters.
Results: The proposed model obtains an AUC value of 0.7003 through 5-fold cross-validation, which outperforms the baseline methods. The results also demonstrate that the addition of elapsed time embeddings considerably improves the performance on this dataset. Through further analysis, we observed that risk factors behaved distinctly along the timeline and across patients. We also found supporting evidence from peer-reviewed literature for some possible cohort-level risk factors such as respiratory syndromes and esophageal reflux.
Conclusions: The proposed time-sensitive attentive neural network is superior to traditional machine learning methods and performs better than state-of-the-art deep learning methods in realizing effective predictive models for the prediction of asthma exacerbation. We believe that the interpretation and visualization of risk factors can help the clinical community to better understand the underlying mechanisms of the disease progression.
Keywords: asthma exacerbation; predictive model; time-sensitive; elapsed time embedding; deep learning; attention mechanism
Introduction
Asthma is a common and serious health problem that affects 235 million people worldwide [1] and an estimated 26.5 million people in the U.S. (8.3% of the U.S. population) [2]. Asthma takes a significant toll on the population and imposes an unacceptable burden on health care systems, with a total annual cost of $81.9 billion in the U.S. in 2013 [3,4]. Asthma may develop into exacerbation if it is not well controlled or if it is stimulated by specific risk factors [3]. In 2016, 12.4 million current asthmatics (46.9%) in the U.S. had at least one asthma exacerbation in the previous year [2]. Exacerbations of asthma can be severe and require immediate medical intervention, either as an emergency department visit or as an admission to hospital [5]. Serious asthma exacerbations may even result in death [6]. Therefore, it is of practical significance to make early predictions so that interventions can be carried out in advance to reduce the probability of exacerbation.
Prediction and risk factor recognition for asthma exacerbation have been studied extensively, mostly with traditional statistical methods such as logistic regression [7,8], proportional-hazards regression [9], and generalized linear mixed models [10]. However, most of these studies explored only a small group of candidate risk factors, and their models are usually hard to extend to other datasets and hard to use for personalized predictions. With the explosion of health care data in recent years, machine learning methods have grown in prominence for this domain, due to their superiority over statistical methods in processing larger numbers of variables and their capacity for mining more possible correlations between them [11]. Typical models include Naïve Bayes [13], Bayesian networks [12–14], artificial neural networks [12], Gaussian processes [12], and support vector machines [12,13]. Although different attempts have been made, there are still several deficiencies in applying these traditional machine learning methods. For example, ignoring temporal dependencies between variables might not provide a meaningful risk estimation of future exacerbations for individual patients [15]. Furthermore, most approaches concentrate only on performance and pay little attention to personalized risk factors.
Recent predictive modeling studies have focused more on deep learning, which has the upper hand in health care prediction because of its flexibility in dealing with longitudinal data [16], powerful learning capabilities [17], and ability to tackle the problem of data irregularity [18]. One of the most popular architectures for deep learning-based predictive models is the recurrent neural network (RNN), which takes a patient's visit sequence as the input and makes predictions according to the encoded representations. Multiple successes have been achieved in applying deep learning to disease prediction [19], mostly using variants of RNNs with distinct network components, such as attention mechanisms for evaluating the weight of each variable [20–22] and special configurations for tackling the problem of time decay [20,21,23,24]. Typical prediction tasks include the prediction of diabetes mellitus [18], Parkinson's disease [18,25], chronic heart failure [21], sepsis [26], and mortality and readmission [20].
Inspired by previous studies, we applied Long Short-term Memory (LSTM), a popular RNN variant used by dozens of previous studies [19,20,23,27], for asthma exacerbation prediction. We proposed the Time-Sensitive Attentive Neural Network (TSANN), which employs a self-attention mechanism [28] to help model the context of both visit-level and code-level variables. Meanwhile, to incorporate the impact of elapsed time, we projected the visit time of each clinical variable into a low-dimensional space and assigned a numeric vector to each time. Making use of the attention weights of TSANN, data analysis was then conducted to investigate personalized and cohort-level risk factors.
As far as we know, this is the first data-driven study to predict asthma exacerbation using deep learning and EHR data, the first effort to introduce elapsed time embeddings into clinical predictive modeling, and the first attempt to visualize risk factors of asthma exacerbation on both the individual level and the cohort level, which have been insufficiently explored in most previous studies. We believe that the outcomes of this study can help the clinical community to better understand the underlying mechanisms of the disease progression and can assist in decision-making. Although this project focuses on asthma exacerbation, the proposed approach can also be adopted in risk prediction for other chronic diseases.
Methods
Database
This study used Cerner Health Facts®, a HIPAA-compliant database collected from multiple enrolled clinical facilities, containing mostly in-patient data. Data in Health Facts were extracted directly from the EHRs of hospitals with which Cerner has a data use agreement. Encounters may include pharmacy, clinical and microbiology laboratory, admission, and billing information from affiliated patient care locations. All personal identifying information of the patients was anonymized. In our study, we primarily focused on the impact of clinical factors on asthma exacerbation, so we extracted diagnoses, medications, and demographic characteristics such as gender, race, and age from the database. The University of Texas Health Science Center (UTHealth) had agreements with Cerner to use this data for research purposes. The institutional review board at UTHealth approved the study protocol.
Study Design
We conducted a retrospective study to predict the risk of asthma exacerbation. We extracted patients' records between 1992 and 2015 from the Cerner database. For clarity, we define several terms in advance (Table 1).
Table 1. Defined terms for asthma exacerbation prediction.
| Term | Definition |
| --- | --- |
| index date | the date of the first diagnosis of asthma in a patient's EHR |
| exacerbation date | the date of the first diagnosis of asthma exacerbation after the index date |
| case group | patients with asthma and later asthma exacerbations within 365 days who satisfy the inclusion and exclusion criteria (see Multimedia Appendix 1 for more details) |
| control group | patients with asthma but without exacerbations within 365 days who satisfy the inclusion and exclusion criteria (see Multimedia Appendix 1 for more details) |
| prediction date | training set: for the case group, the visit date before the exacerbation date; for the control group, the penultimate visit date within 365 days. testing set A: the 5th visit starting from the index date. testing set B: defined analogously to the training set (following [21]) |
| observed time window | the time window between the index date and the prediction date |
Intuitively, a testing sample should be defined in the same way as the training samples (as in testing set B). However, since we cannot foresee when an exacerbation will happen in real-world deployment, we can only make predictions at each visit. In our study, we selected the 5th visit from the asthma index date as the prediction date (testing set A), balancing the use of more visits against keeping more patients for the experiments (see Multimedia Appendix 1 for more details). We also defined testing set B, with the penultimate visit as the prediction date, which serves as the upper bound of the classifier performance since it provides more complete visit information. The TSANN model was trained to predict the onset of asthma exacerbation given the observed time window. The main outcomes of the method are: (1) a score that measures the risk of asthma exacerbation for each patient; (2) visualization of the results, including a personalized heatmap identifying the importance of each clinical variable in the observed time window, cohort-level risk factors, and their temporal distributions among patients. Based on the outcomes, further data mining or clinical trials can be carried out for validation. The workflow of this research is shown in Figure 1.
Figure 1. The workflow of risk prediction of asthma exacerbation.
Selection of Study Subjects
The subjects in the study were patients with a diagnosis of asthma. Inclusion and exclusion criteria, including the diagnosis of asthma and asthma exacerbation, were decided based on previous work [29,30]. The current study focused only on adult patients aged between 18 and 80. In the end, 31,433 individuals remained, including 2,262 cases and 29,171 controls (≈1:13). The cohort selection process is shown in Figure 2. More details of the cohort selection are given in Multimedia Appendix 1.
Time-sensitive Attention Neural Network
Model Overview
TSANN accepts the whole sequence of clinical variables in the observed time window as input and outputs the probability of asthma exacerbation (see Figure 3). The architecture of TSANN is based on LSTM with the addition of hierarchical attention and elapsed time embeddings.
Figure 2. The cohort selection process for the study of asthma exacerbation.
For each visit, multiple clinical variables are encoded in the input layer and combined into a weighted average by the code-level attention mechanism. The elapsed time embedding is concatenated as complementary information to indicate the relative time interval between each visit date and the prediction date. The LSTM accepts the sequence of encoded visits as input and outputs a further encoding for each visit. The visit-level attention layer is then applied to the outputs of the LSTM to summarize all the visits of each patient. Finally, by feeding the output of the visit-level attention into the softmax function, a probability indicating the risk of disease onset is generated.
Figure 3. The overview of the TSANN model for asthma exacerbation prediction.
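To make the data flow of the overview concrete, the following NumPy sketch runs one forward pass through the stages named above: code-level attention, concatenation of the elapsed time embedding, an LSTM, visit-level attention, and a softmax. It is a shape-level illustration with random weights, not the authors' TensorFlow implementation. The embedding sizes (100 and 20) follow the hyperparameters reported in the Evaluation section; the hidden size, the toy vocabulary, the gate ordering, and all function and parameter names are our assumptions. Demographic inputs, the fully connected layer, and batch normalization are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)


def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()


def attention_pool(items, W, b, context):
    """Score each row of `items` against a context vector; return weighted sum and weights."""
    u = np.tanh(items @ W.T + b)      # (n, d_att)
    weights = softmax(u @ context)    # (n,)
    return weights @ items, weights


def lstm_step(x, h, c, W, b):
    """One LSTM step; W maps [h, x] to four stacked gate pre-activations."""
    z = W @ np.concatenate([h, x]) + b
    f, i, o, g = np.split(z, 4)

    def sig(a):
        return 1.0 / (1.0 + np.exp(-a))

    c = sig(f) * c + sig(i) * np.tanh(g)
    h = sig(o) * np.tanh(c)
    return h, c


def init_params(n_codes=50, n_gaps=365, d_code=100, d_time=20, hidden=64):
    s = 0.1  # scale of the random initialisation (arbitrary)
    return {
        "hidden": hidden,
        "code_emb": rng.normal(0, s, (n_codes, d_code)),
        "time_emb": rng.normal(0, s, (n_gaps, d_time)),
        "Wv": rng.normal(0, s, (d_code, d_code)),
        "bv": np.zeros(d_code),
        "uv": rng.normal(0, s, d_code),
        "W_lstm": rng.normal(0, s, (4 * hidden, hidden + d_code + d_time)),
        "b_lstm": np.zeros(4 * hidden),
        "Wp": rng.normal(0, s, (hidden, hidden)),
        "bp": np.zeros(hidden),
        "up": rng.normal(0, s, hidden),
        "Wc": rng.normal(0, s, (2, hidden)),
        "bc": np.zeros(2),
    }


def tsann_forward(visits, gaps, p):
    """visits: list of integer code-id arrays; gaps: days before the prediction date."""
    h = np.zeros(p["hidden"])
    c = np.zeros(p["hidden"])
    states, code_weights = [], []
    for codes, gap in zip(visits, gaps):
        emb = p["code_emb"][codes]                                  # code embeddings
        v, beta = attention_pool(emb, p["Wv"], p["bv"], p["uv"])    # code-level attention
        x = np.concatenate([v, p["time_emb"][gap]])                 # append time embedding
        h, c = lstm_step(x, h, c, p["W_lstm"], p["b_lstm"])
        states.append(h)
        code_weights.append(beta)
    r, alpha = attention_pool(np.stack(states), p["Wp"], p["bp"], p["up"])  # visit-level attention
    prob = softmax(p["Wc"] @ r + p["bc"])
    return prob, alpha, code_weights


params = init_params()
visits = [np.array([3, 7]), np.array([1]), np.array([4, 9, 12])]  # toy code ids
gaps = [120, 45, 0]                                               # days before prediction
prob, alpha, betas = tsann_forward(visits, gaps, params)
```

The function returns the visit-level weights `alpha` and the per-visit code-level weights `betas` alongside the probability, since these are the quantities that the interpretation analyses later draw on.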
Input
The inputs of the model consist of two types of features. One type is clinical variables, including ICD codes, medications, and demographic features. The ICD-10 codes are all converted into ICD-9 based on predefined mappings [31]. All the medications are normalized to their generic names. The demographic features include age, gender, and race, which are only taken as inputs at the visit on the prediction date. Using a projection matrix $W_{cemb} \in \mathbb{R}^{m_c \times n_c}$, we mapped each clinical variable into a concept embedding vector:

$$C_{ij} = W_{cemb} \cdot x_{ij} \qquad (1)$$

where $C_{ij}$ is the generated concept embedding vector and $x_{ij} \in \mathbb{R}^{n_c}$ is the one-hot vector denoting the existence of each variable.
The other feature type is time features, which indicate the occurrence time of each clinical variable. Intuitively, variables with different timestamps behave differently in prediction. For instance, in many cases, a clinical event that happened several days ago plays a more important role than one that happened several months ago. Meanwhile, because EHR data are irregular and incomplete, successive visits have diverse time intervals [23], which makes it indispensable to consider the elapsed time in predictive modeling. Inspired by position embeddings in natural language processing, which were introduced to model the positional information of each word in, e.g., relation classification [32] and the transformer structure in neural language modeling [28,33,34], we introduced elapsed time embeddings to represent the relative time gap for each clinical variable. Specifically, taking the time of the prediction date $T_0$ as a pivot, the time feature of each variable is the absolute difference between its occurrence time $T_i$ and $T_0$, i.e. $T_0 - T_i$. Since the observed time window has an upper bound of 365 days, the vocabulary size of the time embeddings was set to 365. We applied a matrix $W_{temb} \in \mathbb{R}^{m_t \times n_t}$ to project each time value to an $m_t$-dimensional vector. Unlike the code embeddings, elapsed time embeddings are fed into the model after the code-level attention and assigned to each visit. The equation for the elapsed time embedding of each visit is analogous to that for concept embeddings, where $t_{ij} \in \mathbb{R}^{n_t}$:

$$T_{ij} = W_{temb} \cdot t_{ij} \qquad (2)$$
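As a hedged illustration of the elapsed time embedding, the sketch below converts a pair of dates into the day gap between the prediction date and the event, then projects its one-hot encoding with a time projection matrix (called `W_temb` here, our name). The vocabulary size of 365 comes from the text and the dimension of 20 from the reported hyperparameters; the random initialisation and the clipping of out-of-range gaps are our assumptions, and in the actual model the matrix would be learned.

```python
import numpy as np
from datetime import date

N_GAPS = 365  # vocabulary size of elapsed-time values (from the text)
D_TIME = 20   # time embedding dimension (from the reported hyperparameters)

rng = np.random.default_rng(0)
W_temb = rng.normal(0.0, 0.1, size=(D_TIME, N_GAPS))  # random stand-in for learned weights


def elapsed_days(prediction_date, event_date):
    """Day gap between prediction date and event, clipped to the vocabulary (assumption)."""
    gap = (prediction_date - event_date).days
    return min(max(gap, 0), N_GAPS - 1)


def time_embedding(gap):
    """Project the one-hot encoding of a day gap, as in the embedding equation."""
    one_hot = np.zeros(N_GAPS)
    one_hot[gap] = 1.0
    return W_temb @ one_hot  # same as selecting column `gap` of W_temb


gap = elapsed_days(date(2015, 6, 30), date(2015, 6, 1))
vec = time_embedding(gap)
```

Because the input is one-hot, the matrix product reduces to a column lookup, which is how embedding layers are usually implemented in practice.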
For easier description, we denote each visit as $v_i, i \in \{1,2,\ldots,T\}$ and each clinical variable in each visit as $v_{ij}, j \in \{1,2,\ldots,M\}$, where $T$ is the maximum number of visits and $M$ is the maximum number of events in each visit.
Code-level Attention
Attention is a mechanism designed for deep neural networks that acts as an information filter and can alleviate information loss when dealing with long sequences. It selects important sequence spans by assigning weights to different elements in a sequence [35,36]. Through attention, each variable is assigned a weight so that important variables have larger weights than the others. We adopted the attention mechanism from [37], in which the weight of each variable is generated according to the sequence and a context vector. Concretely, given the set of codes $\{v_{ij}\}, j \in \{1,2,\ldots,M\}$ in the $i$th visit, the encoded representation of $v_i$ can be generated by:

$$u_{ij} = \tanh(W_v v_{ij} + b_v) \qquad (3)$$

$$\beta_{ij} = \frac{\exp(u_{ij}^{\top} u_v)}{\sum_k \exp(u_{ik}^{\top} u_v)} \qquad (4)$$

$$v_i = \sum_j \beta_{ij} \cdot v_{ij} \qquad (5)$$

where $W_v$ and $b_v$ are the weight and bias for matrix transformation, $u_{ij}$ is the attention vector for each code $j$ in $v_i$, $u_v$ is the context vector for $v_i$ and is updated during training, and $\beta_{ij}$ is the attention weight for the event $v_{ij}$, based on which we can generate the final weight for this variable.
Visit-level Attentive LSTM Layer
Taking the encoded representation of each visit as input, the LSTM models the sequential information in the observed time window and produces a summary at the final step (the prediction date). The advantage of LSTMs over basic RNNs is that they can alleviate the vanishing gradient problem and are thus able to retain "memories" of prior timestamps [38,39]. LSTMs are implemented by several matrix multiplications and nonlinear transformations that aim to mimic the memory mechanism of the human brain. These operations are called gates, signifying that the network can select effective information and abandon useless information. The equations of LSTMs are listed as follows:

$$f_t = \sigma(W_f \cdot [h_{t-1}, v_t] + b_f) \qquad (6)$$

$$i_t = \sigma(W_i \cdot [h_{t-1}, v_t] + b_i) \qquad (7)$$

$$\tilde{C}_t = \tanh(W_c \cdot [h_{t-1}, v_t] + b_c) \qquad (8)$$

$$C_t = f_t * C_{t-1} + i_t * \tilde{C}_t \qquad (9)$$

$$o_t = \sigma(W_o \cdot [h_{t-1}, v_t] + b_o) \qquad (10)$$

$$h_t = o_t * \tanh(C_t) \qquad (11)$$

where the $W$s and $b$s are weights and biases for different gates or cells ($f_t$: forget gate, $i_t$: input gate, $C_t$: memory cell, $o_t$: output gate, $h_t$: hidden cell), and $\sigma$ is the activation function such as tanh or sigmoid.
We applied self-attention again on the visit level to measure the risk scores for each clinical variable in each visit. By assigning attention weights to the outputs of the LSTM from each step, we can weight each visit in the observed time window.

$$u_i = \tanh(W_p v_i + b_p) \qquad (12)$$
$$\alpha_i = \frac{\exp(u_i^{\top} u_p)}{\sum_k \exp(u_k^{\top} u_p)} \qquad (13)$$

$$r_p = \sum_j \alpha_j \cdot v_j \qquad (14)$$

where $W_p$ and $b_p$ are the weight and bias for matrix transformation, $u_i$ is the attention vector for each visit $i$ given $v_i$, $u_p$ is the context vector, and $\alpha_j$ is the attention weight for $v_j$. This process can be seen as a simulation of the diagnostic procedure of a clinic visit, during which a physician looks back into a patient's EHR, measures the impact of each historical clinical event, and makes a final decision.
Output
The visit-level attention layer compresses all the information in the observed time window into a fixed-size vector. The output of attention further goes through a fully connected layer with a nonlinear activation. A softmax function is finally applied to generate the predicted probability $p$:

$$p = \mathrm{softmax}(W_c \cdot r_p + b_c) \qquad (15)$$

where $r_p$ stands for the output of the visit-level attention-LSTM, and $W_c$ and $b_c$ here denote the weight and bias of the output layer (not the LSTM cell parameters of Eq. 8). The value $p$ is used as the score for the risk of developing an asthma exacerbation.
Evaluation
AUROC/AUC (Area Under the Receiver Operating Characteristic Curve) is widely used as an evaluation metric for predictive models and reflects the balance between sensitivity and specificity [40]. According to the predicted probability $p$ (between 0 and 1) for each instance, the AUC value is generated by setting different cut-offs. The methods listed in Table 2 were compared in our experiments.
Table 2. The methods used for comparisons.
| Method | Note |
| --- | --- |
| LR-sparse | A popular conventional machine learning algorithm [41] that usually behaves as a strong baseline in predictive modeling [42]. The input of LR-sparse for each sample is a fixed-length feature vector whose length is the number of distinct variables (the vocabulary size) and whose value in each dimension is the number of occurrences of that variable. LR suffered from the data imbalance problem on this dataset, so we employed SMOTE [43] for over-sampling to help reduce its impact. |
| LR-dense | A simplified version of the Multi-Layer Perceptron [44] with only one input layer and one softmax layer. In LR-dense, the representations of all the codes are averaged after being projected to the embedding space. The differences between LR-dense and LR-sparse are two-fold: a) the input of LR-dense is the average of code embeddings, while the input of LR-sparse is a fixed-length vector denoting the occurrence of each clinical variable; b) the input embeddings of LR-dense can be fine-tuned during training, but the input of LR-sparse cannot. |
| LSTM | The basic LSTM algorithm, taking the sequence of clinical variables ordered by time as input; the variables in each visit are averaged. |
| ALSTM | Attention LSTM with one layer of LSTM and one layer of attention. |
| TLSTM [23] | The time-aware LSTM model, which is one of the state-of-the-art predictive models. In TLSTM, the time gap is used to compute the information decay in the LSTM unit. |
| RETAIN [21] | A two-layer attention model, which is another state-of-the-art model for disease onset prediction. In RETAIN, the time features are not embedded as vectors but are real values denoting the gaps from the first visit. |
| TSANN-I | The proposed TSANN model with the second attention layer removed; the prediction is based on the final state of the LSTM. |
| TSANN-I-step | Uses the time encoding method from [34] on TSANN-I. Although time is also encoded as a vector in that method, the vector only reflects the order of each visit and not the actual elapsed time, which is insufficient for modeling the irregularity in EHRs. |
| TSANN-II | A complete version of the proposed TSANN model. |

For evaluation, we first split the data into a training set and a held-out testing set with a ratio of 8:2. Further, 5-fold cross-validation was performed on the training set for parameter tuning. During cross-validation, grid search was applied to tune the hyperparameters. Finally, the hyperparameters for our best model, TSANN-I, were batch size = 32, code embedding dimension = 100, time embedding dimension = 20, learning rate = 0.001, L2 penalty = 0.0001 for all layers, Leaky_ReLU [45] as the activation function for LSTM, batch normalization before softmax, and Adam [46] as the optimizer. A more detailed parameter tuning process is shown in Multimedia Appendix 1. Code for RETAIN and TLSTM was provided by the respective authors, and all other deep learning models were implemented with TensorFlow [47] and tested on Nvidia Tesla V100, Quadro P6000, and Titan XP GPUs.
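For readers who want to reproduce the metric, the following is a minimal NumPy sketch of the AUC computed from predicted probabilities. It uses the rank-based (Mann-Whitney) formulation, which gives the same value as sweeping cut-offs over the ROC curve; this is a generic illustration rather than the evaluation code used in the study, and the function name is ours.

```python
import numpy as np


def auc_score(labels, scores):
    """AUC via average ranks: the probability that a random positive
    receives a higher score than a random negative (ties count as 0.5)."""
    labels = np.asarray(labels).ravel()
    scores = np.asarray(scores, dtype=float).ravel()
    n_pos = int((labels == 1).sum())
    n_neg = int((labels == 0).sum())
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # Average 1-based rank of each distinct score value, mapped back to samples.
    _, inverse, counts = np.unique(scores, return_inverse=True, return_counts=True)
    avg_rank = np.cumsum(counts) - (counts - 1) / 2.0
    ranks = avg_rank[inverse]
    u_stat = ranks[labels == 1].sum() - n_pos * (n_pos + 1) / 2.0
    return float(u_stat / (n_pos * n_neg))


# Toy example: two controls (0) and two cases (1).
example_auc = auc_score([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

With roughly one case for every 13 controls in this cohort, a rank-based metric such as AUC is less sensitive to the class ratio than accuracy would be.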
Results
AUC Values
Since time information is of critical importance in modeling EHR data, we conducted experiments both with and without it. We did not implement LR-sparse with time embeddings using day as the time unit, since this would have introduced a much greater number of variables (i.e. 12,390*365), which would have been too sparse and computationally difficult. Instead, we reduced the vocabulary of the time variable by setting month as the time unit, which generated 148,680 distinct clinical variables. For TLSTM, we only considered a with-time version since it is defined as a time-aware variant of LSTM. For LR-dense, LSTM, ALSTM, TSANN-I, and TSANN-II, we used the elapsed time embeddings introduced in this study. The AUC values on the testing set for all the methods are shown in Table 3.
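The feature counts quoted for LR-sparse follow from simple multiplication, sketched below. The use of 12 month bins is inferred from 148,680 / 12,390 = 12, and the column layout in `sparse_feature_index` is a hypothetical illustration, not the indexing used in the study.

```python
N_VARIABLES = 12_390  # distinct clinical variables (from the text)
DAY_BINS = 365        # one bin per day in the observed time window
MONTH_BINS = 12       # inferred: 148,680 / 12,390

features_by_day = N_VARIABLES * DAY_BINS      # size of the rejected day-level vocabulary
features_by_month = N_VARIABLES * MONTH_BINS  # size of the month-level vocabulary


def sparse_feature_index(variable_id, months_before_prediction):
    """Column of a (variable, month) pair in the count vector (hypothetical layout)."""
    return variable_id * MONTH_BINS + months_before_prediction
```

The day-level encoding would require about 4.5 million columns for roughly 31,000 patients, which explains why the month-level encoding was preferred.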
Table 3. AUC values of the proposed models compared with baselines (+/- stands for the improvement, in percentage points, from adding time information).
| Method | Without time | With time | +/- (%) |
| --- | --- | --- | --- |
| LR-sparse | 0.5685 | 0.5825 | +1.4 |
| LR-dense | 0.6545 | 0.6753 | +2.08 |
| LSTM | 0.6045 | 0.6567 | +5.22 |
| ALSTM | 0.6346 | 0.6714 | +3.68 |
| TLSTM | - | 0.6548 | - |
| RETAIN | 0.6455 | 0.6882 | +4.27 |
| TSANN-I | 0.6692 | **0.7003** | +3.11 |
| TSANN-I-step | 0.6463 | - | - |
| TSANN-II | **0.6827** | 0.6855 | +0.28 |

*The optimal value for each column is marked in bold.
Comparing the rows, we notice that TSANN-I with time information achieves the optimal AUC value, improving on the strongest baseline (RETAIN) by 1.21%. TSANN-II achieves performance comparable with RETAIN. All the results show that the conventional machine learning method LR-sparse performs worse than the deep learning methods. LR-dense performs better than LSTM and ALSTM both with and without time. TSANN-I-step, which only used time embeddings to denote the relative position of each visit, does not achieve good results.
When comparing the results with and without time information, considerable improvements were observed after adding time information for most methods, signifying that data irregularity is pronounced in the studied problem and that time information plays an important role in modeling visits for predicting asthma exacerbation. For example, TSANN-I, which encodes the time interval with elapsed time embeddings, obtained a 3.11% improvement. Similar cases can also be observed in other models, including RETAIN. Surprisingly, TSANN-II did not improve much when time embeddings were integrated. In addition, even without time information, our proposed methods perform much better than the others, showing that they retain their strengths when temporal information is lost. Considerable improvements can also be observed on testing set B (RETAIN: 0.7761, TSANN-I: 0.8202), as can the contribution of adding time information. As expected, the general results were much better on testing set B (see Multimedia Appendix 1).
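The improvement figures can be checked directly from the AUC values in Table 3. The short sketch below recomputes them as differences in AUC multiplied by 100, which is our reading of how the "+/-" column and the 1.21% margin over RETAIN were derived (absolute percentage points rather than relative change).

```python
# (without time, with time) AUC pairs copied from Table 3.
auc = {
    "LR-sparse": (0.5685, 0.5825),
    "LR-dense": (0.6545, 0.6753),
    "LSTM": (0.6045, 0.6567),
    "ALSTM": (0.6346, 0.6714),
    "RETAIN": (0.6455, 0.6882),
    "TSANN-I": (0.6692, 0.7003),
    "TSANN-II": (0.6827, 0.6855),
}

# Gain from adding time information, in percentage points.
gains = {name: round((w - wo) * 100, 2) for name, (wo, w) in auc.items()}

best_with_time = max(auc, key=lambda name: auc[name][1])
best_without_time = max(auc, key=lambda name: auc[name][0])
margin_over_retain = round((auc["TSANN-I"][1] - auc["RETAIN"][1]) * 100, 2)
```

Under this reading, every recomputed gain matches the value printed in the table, and the largest gain from time information belongs to the plain LSTM.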
Personalized Heatmap
In our study, a heatmap conveys the interpretations and serves as a visualization tool for identifying personalized risk factors. A heatmap illustrates how each candidate risk factor behaves in each visit during the progression of asthma. Each cell in the heatmap is colored according to the attention weights derived from the model. The darker a cell is, the more important the clinical variable is and the more likely it is to act as a risk factor. For example, Figure 4 shows a case where the symptoms of hypoxemia, shortness of breath, and wheezing (799.02, 786.05, and 786.07 in ICD-9), among others, are recognized as possible risk factors. A possible explanation is that the patient's hypoxemia worsened the asthma, breathing symptoms followed, and asthma exacerbation was then diagnosed.
Figure 4. An example heatmap with the most likely risk factors denoted by the clinical variables hypoxemia (D_799.02), shortness of breath (D_786.05), wheezing (D_786.07), etc.
It is hard to get a clear overview of the disease progression since we can depend only on structured data without any clinical notes. As a result, we can hardly confirm that the discovered factors are real risk factors; we know only that they might either be factors triggering exacerbations or be highly associated with the event. Nevertheless, our method may serve as an important complement in disease prediction and clinical decision-making.
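As a rough illustration of how such a heatmap grid could be filled, the sketch below multiplies a visit-level attention weight by the code-level attention weight of each variable in that visit. This product rule is our assumption for illustration only; the actual scoring is defined in Multimedia Appendix 1 and may differ. The weights are made-up numbers, and only the variable labels are taken from Figure 4.

```python
import numpy as np


def heatmap_matrix(visit_weights, code_weights, variables):
    """Rows are variables, columns are visits; each cell holds the visit weight
    times the code weight (an assumed scoring rule, not taken from the paper)."""
    grid = np.zeros((len(variables), len(visit_weights)))
    row = {name: r for r, name in enumerate(variables)}
    for i, (alpha, betas) in enumerate(zip(visit_weights, code_weights)):
        for name, beta in betas.items():
            grid[row[name], i] = alpha * beta
    return grid


variables = ["D_799.02", "D_786.05", "D_786.07"]
visit_weights = [0.2, 0.3, 0.5]  # illustrative visit-level weights
code_weights = [                 # illustrative code-level weights per visit
    {"D_786.07": 1.0},
    {"D_786.05": 0.4, "D_786.07": 0.6},
    {"D_799.02": 0.7, "D_786.05": 0.3},
]
grid = heatmap_matrix(visit_weights, code_weights, variables)
```

Because both sets of weights are normalized by a softmax, the cells of such a grid sum to one for each patient, so darker cells can be read as larger shares of the model's attention.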
Cohort-level Risk Factors
We also proposed a method to discover common risk factors at the population level so that clinicians can better understand the disease progression and patients can pay more attention to risk factors in their daily lives. The top-ranked risk factors recognized are shown in Table 4. The details of the method and a clinical discussion of the factors are given in Multimedia Appendix 1.
Table 4. Clinical variables with the top-ranked weights (/N means that the variable occurred N months prior to the prediction date). We regard both * and ▲ as containing valuable information.
| Rank | ICD-9 (Diagnosis) | Medication |
| --- | --- | --- |
| 1 | 493.9x/0-5 (asthma)* (meaning diagnosed with asthma multiple times before exacerbation) | methylprednisolone/0,1** |
| 2 | 786.07/0-2 (wheezing)△ | prednisone/0,1,2** |
| 3 | 496.0/0,1 (chronic airway obstruction not elsewhere classified)▲ | ipratropium/0,1,2** |
| 4 | 530.81/0 (esophageal reflux)* | midazolam/0,1,2△ |
| 5 | V46.2/0 (dependence on supplemental oxygen)△ | hydromorphone/0-2▲ |
| 6 | 787.02/0 (nausea alone)△ | heparin/0,1△ |
| 7 | 786.50/0 (unspecified chest pain)△ | acetaminophen-oxycodone/0* |
| 8 | V08/042/0 (HIV related)▲ | fentanyl/0▲ |
| 9 | 786.59/0 (other chest pain)△ | methylprednisolone/2-4** |
| 10 | 786.05/0 (shortness of breath)△ | glycopyrrolate/0* |
| 11 | V58.69/0 (long-term (current) use of other medications)▲ | lidocaine/0△ |
| 12 | 784.0/0 (headache)▲ | dexamethasone/0△ |
| 13 | 346.90/0 (migraine, unspecified, without mention of intractable migraine without mention of status migrainosus)▲ | promethazine/0△ |
| 14 | V58.66/0 (long-term (current) use of aspirin)* | atorvastatin/0△ |
| 15 | 491.21/0 (obstructive chronic bronchitis with (acute) exacerbation)▲ | furosemide/0** |

* Possible risk factors of asthma exacerbations.
△ These factors were comorbidities or combined medications. We believe they were not risk factors of asthma exacerbations.
▲ It could hardly be determined whether these factors caused asthma exacerbations, but they were highly associated with them.
** These medications can be used to treat asthma or control asthma symptoms. It could hardly be determined whether these medications are risk factors, since we were unable to investigate their dosage in the current study. Inappropriate use of medications such as Short-Acting Beta Agonists (SABA) or Inhaled Corticosteroids (ICS) could, however, also lead to asthma exacerbations.
Discussion
Principal Results
Our proposed method obtains the optimal AUC value on the prediction task, boosted by hierarchical attention and elapsed time embeddings. The visualizations also provide useful clues for better understanding the disease progression.
Regarding the AUC values, since we selected only 5 visits as the observed time window, LSTM and ALSTM may not be as powerful as they are in modeling longer sequences, which is reflected in their unsatisfying results compared with LR-dense. However, for TSANN, where more complex attentive structures were added, the results are comparable with or better than those of LR-dense. TLSTM, although it also integrates time decay information, does not achieve satisfying results, perhaps because its heuristic decay function is not suited to the current dataset. TSANN-I and TSANN-II obtain results comparable to or better than those of RETAIN, signifying the effectiveness of the model structure. Besides, the hierarchical attention architecture makes further interpretation easier, e.g. the personalized heatmap. LR-sparse, although tuned thoroughly in our experiment, still performs worse than the others, which may be partly due to its insufficiency in modeling complex sequential patterns.
A typical characteristic of EHR data is irregularity, which means that clinic visits may be randomly and sparsely distributed along the timeline and are sometimes even missing. Thus, the predictive model is responsible for serializing the visits of each patient while considering the time elapsed between consecutive visits. The comparisons between results with and without time information in Table 3 demonstrate the effectiveness of considering elapsed time on this study cohort. It can be concluded that the prediction of asthma exacerbation is quite time-sensitive and that most of the critical risk factors should be timestamped. For instance, even for the visit immediately preceding the prediction date, if it occurred long before that date, its impact would still be reduced. A similar case can be found in [23], with an improvement of 6% from LSTM to TLSTM. For TSANN-I-step, although time embeddings were also used, they only denoted the relative position of each visit in the sequence and lacked the ability to represent time decay, so they could hardly produce good results here. Adding time to TSANN-II did not yield as much improvement as in other methods, which we think might be because the addition of visit-level attention weakens the contribution of the time embeddings.
Apart from the personalized heatmaps and cohort-level risk factors, we can also use the weights generated by Eq. 2 and Eq. 3 in Multimedia Appendix 1 to visualize how each clinical variable contributes across time, e.g. a variable may behave distinctly among individuals, with different action times or different incidences. Figures 5 and 6 show two examples in which the time distributions of the clinical variables are displayed as scatter plots. In these plots, each circle represents a patient, and its size and color depth denote the importance of the corresponding variable for that patient. The x-axis represents the time gap between the occurrence date of the variable and the prediction date, while the y-axis is used merely for visual separation. We randomly selected a maximum of 2,000 patients to plot each figure. Figures 5 and 6 are derived from an ICD code (esophageal reflux: 530.81 in ICD-9) and a medication (fentanyl), respectively. We observe different effective time ranges for these two factors: the first tends to be distributed between the previous 250 and 50 days, while the second is distributed within the previous 150 days. We hope these visualizations can help identify the distributions of more possible risk factors to aid asthma control.
Fig 5. The time distribution of the clinical variable ICD-9: 530.81 (gastro-esophageal reflux disease) as a risk factor.
Fig 6. The time distribution of the clinical variable medication: fentanyl as a risk factor.
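The construction of such a scatter plot can be sketched as follows. This is a minimal illustration under stated assumptions: the function name, the size and opacity ranges, and the input format (one pair of time gap and importance weight per patient) are hypothetical, and the weights are assumed to be non-negative. Only the cap of 2,000 randomly selected patients, the x-axis meaning, and the purely cosmetic y-axis follow the description above.

```python
import random

def scatter_points(patient_weights, max_patients=2000, seed=0):
    """Build plot-ready points from (days_before_prediction, weight) pairs.

    Each returned point has an x position (time gap in days), a random y
    position used only to spread circles apart, and a size and opacity
    that grow with the variable's importance for that patient.
    """
    rng = random.Random(seed)
    sample = list(patient_weights)
    if len(sample) > max_patients:
        sample = rng.sample(sample, max_patients)  # random subset of patients
    if not sample:
        return []
    w_max = max(w for _, w in sample) or 1.0  # avoid division by zero
    points = []
    for days, w in sample:
        rel = w / w_max  # importance relative to the largest weight
        points.append({
            "x": days,                  # gap between occurrence and prediction date
            "y": rng.random(),          # jitter only, carries no information
            "size": 10 + 90 * rel,      # hypothetical marker-size range
            "alpha": 0.2 + 0.8 * rel,   # hypothetical color-depth range
        })
    return points

# Toy input: two patients with the variable recorded 100 and 50 days before prediction.
demo = scatter_points([(100, 0.5), (50, 1.0)])
```

The resulting dictionaries can be passed to any plotting library; the y values are random by design, so only the horizontal position, size, and color depth should be read from such a figure.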
Comparison with Prior Work
As far as we know, this is the first study on deep learning-based prediction of asthma exacerbation, and Table 3 shows that our proposed method outperforms both conventional machine learning and deep learning methods. The performance gains mainly come from the hierarchical attention architecture, which can efficiently capture information from distinct medical elements, and from encoding time with elapsed time embeddings, which enables learning temporal patterns from different perspectives. In general, deep learning-based clinical predictive modeling has applied RNN-style structures with minor structural modifications. For example, [20] used a one-layer RNN, and [21] used a two-layer RNN with attention weights, one of which evaluates each embedding dimension. In comparison, we applied attention weights at both the code level and the visit level, which also makes the results easy to interpret.
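The idea of attending at both the code level and the visit level can be illustrated with a minimal sketch. The dot-product scoring against a single query vector, the function names, and the toy vectors are assumptions made for illustration; the actual TSANN attention equations are those given in Multimedia Appendix 1.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(vectors, weights):
    dim = len(vectors[0])
    return [sum(w * v[k] for v, w in zip(vectors, weights)) for k in range(dim)]

def hierarchical_attention(visits, code_query, visit_query):
    """visits: list of visits, each a list of code embedding vectors.

    Returns the patient vector, the visit-level weights, and the
    code-level weights of every visit.
    """
    visit_vectors, code_weights = [], []
    for codes in visits:
        alpha = softmax([dot(c, code_query) for c in codes])  # code-level attention
        code_weights.append(alpha)
        visit_vectors.append(weighted_sum(codes, alpha))
    beta = softmax([dot(v, visit_query) for v in visit_vectors])  # visit-level attention
    return weighted_sum(visit_vectors, beta), beta, code_weights

# Toy patient: a visit with two codes followed by a visit with one code.
visits = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5]]]
patient, beta, alphas = hierarchical_attention(visits, [1.0, 0.0], [0.0, 1.0])

# Product of the two weights: one contribution score per code, summing to 1.
contributions = [[b * a for a in alpha] for b, alpha in zip(beta, alphas)]
```

In this sketch, multiplying a visit weight by a code weight gives a per-code contribution, which is the kind of quantity a personalized heatmap could display.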
Strategies for encoding time in previous studies can be broadly categorized into learning a subspace decomposition of the cell memory in RNNs to enable time decay [20,23] and taking the time values as features [21,48]. Because these methods use time only as a single value, they limit the representation ability of time when multiple possible patterns exist in the timeline, e.g. a clinical event c that happened at time t might be modeled jointly in causal-like patterns such as a_t1 -> c_t and c_t -> b_t2, in which t may behave differently.
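As an illustration of what using time as a single value means, the sketch below implements a logarithmic decay of the form g(Δt) = 1 / log(e + Δt), a heuristic of the kind used in time-aware LSTMs. The function name and the example gaps are assumptions; the point is only that every gap is mapped to one scalar that shrinks monotonically, regardless of which pattern the event takes part in.

```python
import math

def log_decay(delta_t_days):
    """Scalar decay weight g(dt) = 1 / log(e + dt); equals 1 when dt is 0."""
    return 1.0 / math.log(math.e + delta_t_days)

# Hypothetical gaps in days between a visit and the prediction date.
gaps = [0, 7, 30, 365]
weights = [log_decay(g) for g in gaps]
print(weights)
```

A fixed function like this applies the same discount to an event in every context, whereas a learned multi-dimensional time embedding can, in principle, encode different temporal behavior along different dimensions.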
Limitations and Future Work
By using deep learning, we offered a novel way of identifying possible risk factors and predicting the risk of asthma exacerbation. However, the current work still has some limitations. First, for the model interpretation part, how multiple clinical variables interact with each other needs further exploration. Simply considering each variable independently may lose the dependency patterns between them, e.g. the prescription of a drug might be closely associated with a disease or symptom. Second, EHRs have their own drawbacks, such as data irregularity, sparsity, and noise. Some potential risk factors of asthma exacerbation might not be recorded in EHRs, so the completeness of the information cannot be guaranteed. We may need to find ways to make the data more complete, such as including information from textual reports or patient surveys. Finally, the performance of the model still has room for improvement. It might be boosted further by designing more powerful structures or by incorporating background knowledge.
Conclusion
In this paper, we proposed an attentive deep learning network for asthma exacerbation prediction and employed elapsed time embeddings to model time decay. By leveraging the weights of the model, we not only generated personalized heatmaps and specific risk scores at the individual level, but also identified possible risk factors of asthma exacerbation at the cohort level. Compared with previous studies, our model is effective in modeling time information and obtains better overall AUCs. Since the model is completely data-driven and relies little on feature engineering, it can easily be generalized to other prediction tasks. To the best of our knowledge, this is the first study to predict asthma exacerbation risk using a deep learning model and to include elapsed time embeddings. Some of the top-ranked risk factors identified are corroborated by previous medical research, which supports the reliability and accuracy of our method.
Acknowledgements
CT conceived the research project. YX, HJ and CT designed the pipeline and method. YX implemented the deep learning model of the study and prepared the manuscript. HJ completed the clinical part of the manuscript. WJZ and HX provided valuable suggestions on the cohort selection and experiment design. YZhou and YZhang extracted and cleaned the data and performed the statistical analysis. LR helped reorganize the data and performed normalization for the revised version. FL, JD, SW, DZ and CT proofread the paper and provided valuable suggestions. All the authors have read and approved the final manuscript.
We thank Dr. Irmgard Willcockson for proofreading. We also thank Cerner for providing the valuable Health Facts EMR data. We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Quadro P6000 and TITAN XP GPUs used for this research. This research was partially supported by the National Library of Medicine of the National Institutes of Health under award number R01LM011829, the National Institute of Allergy and Infectious Diseases of the National Institutes of Health under award number 1R01AI130460, the National Center for Advancing Translational Sciences of the National Institutes of Health under award number U01TR02062, and the Cancer Prevention Research Institute of Texas (CPRIT) Training Grant #RP160015.
Conflicts of Interest
None declared.
Abbreviations
ALSTM: Attention Long Short-Term Memory
AUROC/AUC: Area Under the Receiver Operating Characteristic Curve
EHRs: Electronic Health Records
ICS: Inhaled Corticosteroids
LR: Logistic Regression
LSTM: Long Short-Term Memory
RETAIN: REverse Time AttentIoN model
RNNs: Recurrent Neural Networks
SABA: Short-Acting Beta Agonists
SMOTE: Synthetic Minority Over-sampling Technique
TLSTM: Time-aware Long Short-Term Memory
TSANN: Time-Sensitive Attentive Neural Network
References
1 World Health Organization. Asthma. http://www.who.int/mediacentre/factsheets/fs307/en/.
2 CDC. Most Recent Asthma Data. https://www.cdc.gov/asthma/most_recent_data.htm.
3 GINA. Pocket Guide for Asthma Management. Pocket Guid asthma Manag Prev 2018.
4 Nurmagambetov T, Kuwahara R, Garbe P. The economic burden of asthma in the United States, 2008-2013. Ann Am Thorac Soc 2018;15:348–56. doi:10.1513/AnnalsATS.201703-259OC
5 Wark PAB, Gibson PG. Asthma exacerbations · 3: Pathogenesis. Thorax 2006;61:909–15. doi:10.1136/thx.2005.045187
6 Levy ML. The national review of asthma deaths: What did we learn and what needs to change? Breathe 2015;11:15–24. doi:10.1183/20734735.008914
7 Fleming L. Asthma exacerbation prediction. Curr Opin Allergy Clin Immunol 2018;:1. doi:10.1097/ACI.0000000000000428
8 Azizpour Y, Delpisheh A, Montazeri Z, et al. Effect of childhood BMI on asthma: A systematic review and meta-analysis of case-control studies. BMC Pediatr 2018;18:1–13. doi:10.1186/s12887-018-1093-z
9 Lieu TA, Quesenberry CP, Sorel ME, et al. Computer-based models to identify high-risk children with asthma. Am J Respir Crit Care Med 1998;157:1173–80. doi:10.1164/ajrccm.157.4.9708124
10 Stanford RH, Nagar S, Lin X, et al. Use of ICS/LABA on Asthma Exacerbation Risk in Patients Within a Medical Group. J Manag care Spec Pharm 2015;21:1014–9. doi:10.18553/jmcp.2015.21.11.1014
11 Bzdok D, Altman N, Krzywinski M. Statistics versus machine learning. Nat Publ Gr 2018;15:233–4. doi:10.1038/nmeth.4642
12 Dexheimer JW, Brown LE, Leegon J, et al. Comparing Decision Support Methodologies for Identifying Asthma Exacerbations. 2007;:880–4.
13 Jeong JF, Cheol I. Machine learning approaches to personalize early prediction of asthma exacerbations. Ann NY Acad Sci 2017;1387:153–65. doi:10.1016/j.coviro.2015.09.001.Human
14 Sanders DL, Aronsky D. Detecting Asthma Exacerbations in a Pediatric Emergency Department Using a Bayesian Network. 2006;:684–8.
15 Loymans RJB, Debray TPA, Honkoop PJ, et al. Exacerbations in Adults with Asthma: A Systematic Review and External Validation of Prediction Models. J Allergy Clin Immunol Pract 2018;6:1942-1952.e15. doi:10.1016/j.jaip.2018.02.004
16 Bae SH, Choi I, Kim NS. Acoustic Scene Classification Using Parallel Combination of LSTM and CNN. Proc Detect Classif Acoust Scenes Events 2016 Work 2016;:11–5.
17 LeCun Y, Bengio Y, Hinton G. Deep learning. Nature 2015;521:436–44. doi:10.1038/nature14539
18 Baytas IM, Xiao C, Zhang X, et al. Patient Subtyping via Time-Aware LSTM Networks. Proc 23rd ACM SIGKDD Int Conf Knowl Discov Data Min - KDD ’17 2017;:65–74. doi:10.1145/3097983.3097997
19 Xiao C, Choi E, Sun J. Opportunities and challenges in developing deep learning models using electronic health records data: a systematic review. J Am Med Informatics Assoc 2018;00:1–10. doi:10.1093/jamia/ocy068
20 Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. Published Online First: 2018. doi:10.1038/s41746-018-0029-1
21 Choi E, Bahadori MT, Kulas JA, et al. RETAIN: An Interpretable Predictive Model for Healthcare using Reverse Time Attention Mechanism. Published Online First: 2016. http://arxiv.org/abs/1608.05745
22 Ma F. Dipole: Diagnosis Prediction in Healthcare via Attention-based Bidirectional Recurrent Neural Networks. 2017.
23 Baytas IM, Xiao C, Zhang X, et al. Patient Subtyping via Time-Aware LSTM Networks. Proc 23rd ACM SIGKDD Int Conf Knowl Discov Data Min - KDD ’17 2017;:65–74. doi:10.1145/3097983.3097997
24 Wu S, Liu S, Sohn S, et al. Modeling Asynchronous Event Sequences with RNNs. J Biomed Inform Published Online First: 2018. doi:10.1016/j.jbi.2018.05.016
25 Che C. An RNN Architecture with Dynamic Temporal Matching for Personalized Predictions of Parkinson’s Disease.
26 Jin H, Young H. Learning representations for the early detection of sepsis with deep neural networks. Comput Biol Med 2017;89:248–55. doi:10.1016/j.compbiomed.2017.08.015
27 Jin H, Young H. Learning representations for the early detection of sepsis with deep neural networks. Comput Biol Med 2017;89:248–55. doi:10.1016/j.compbiomed.2017.08.015
28 Vaswani A, Shazeer N, Parmar N, et al. Attention Is All You Need. Published Online First: 2017. doi:10.1017/S0952523813000308
29 GINA. Global Strategy For Asthma Management and Prevention. Glob Initiat Asthma 2017;:http://ginasthma.org/2017-gina-report-global-strat. doi:10.1183/09031936.00138707
30 Bai TR, Vonk JM, Postma DS, et al. Severe exacerbations predict excess lung function decline in asthma. Eur Respir J 2007;30:452–6. doi:10.1183/09031936.00165106
31 Mapping between ICD-10 and ICD-9. https://www.health.govt.nz/nz-health-statistics/data-references/mapping-tools/mapping-between-icd-10-and-icd-9 (accessed 1 Jan 2019).
32 Zeng D, Liu K, Lai S, et al. Relation Classification via Convolutional Deep Neural Network. In: Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers. 2014. 2335–44. doi:10.1021/bi990527s
33 Radford A, Narasimhan K, Salimans T, et al. Improving Language Understanding by Generative Pre-Training. OpenAI 2018;:1–10. doi:10.1093/aob/mcp031
34 Song H, Rajan D, Thiagarajan JJ, et al. Attend and diagnose: Clinical time series analysis using attention models. 32nd AAAI Conf Artif Intell AAAI 2018 2018;:4091–8.
35 Bahdanau D, Cho K, Bengio Y. Neural Machine Translation by Jointly Learning to Align and Translate. 2014;:1–15. doi:10.1146/annurev.neuro.26.041002.131047
36 Xiang Y, Chen Q, Wang X, et al. Answer Selection in Community Question Answering via Attentive Neural Networks. IEEE Signal Process Lett 2017;24:505–9. doi:10.1109/LSP.2017.2673123
37 Yang Z, Yang D, Dyer C, et al. Hierarchical Attention Networks for Document Classification. Proc 2016 Conf North Am Chapter Assoc Comput Linguist Hum Lang Technol 2016;:1480–9. doi:10.18653/v1/N16-1174
38 Hochreiter S, Schmidhuber J. Long Short-Term Memory. Neural Comput 1997;9:1735–80. doi:10.1162/neco.1997.9.8.1735
39 Sak H, Senior A, Beaufays F. Long short-term memory recurrent neural network architectures for large scale acoustic modeling. Interspeech 2014 2014;:338–42. doi:arXiv:1402.1128
40 Mandic S, Go C, Aggarwal I, et al. Relationship of predictive modeling to receiver operating characteristics. J Cardiopulm Rehabil Prev 2008;28:415–9. doi:10.1097/HCR.0b013e31818c3c78
41 Hosmer Jr DW, Lemeshow S, Sturdivant RX. Applied logistic regression. 2013.
42 Choi E, Bahadori MT, Kulas JA, et al. RETAIN: an interpretable predictive model for healthcare using reverse time attention mechanism. Published Online First: 2016. http://arxiv.org/abs/1608.05745
43 Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: Synthetic Minority Over-sampling Technique. J Artif Intell Res 2002;16:321–57. doi:10.1613/jair.953
44 Pal SK, Mitra S. Multilayer Perceptron, Fuzzy Sets, and Classification. IEEE Trans Neural Networks 1992;3:683–696.
45 Xu B, Wang N, Chen T. Empirical Evaluation of Rectified Activations in Convolution Network. 2015.
46 Kingma DP, Ba J. Adam: a method for stochastic optimization. In: ICLR. 2015. doi:http://doi.acm.org.ezproxy.lib.ucf.edu/10.1145/1830483.1830503
47 Abadi M, Agarwal A, Barham P, et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. 2015. doi:10.1093/library/s4-X.3.339
48 Rajkomar A, Oren E, Chen K, et al. Scalable and accurate deep learning for electronic health records. npj Digit Med 2018;:1–10. doi:10.1038/s41746-018-0029-1