PACIFIC SYMPOSIUM ON BIOCOMPUTING 2020
ABSTRACT BOOK
Poster Presenters: Poster space is assigned by abstract page number. Please find the page that your abstract is on and put your poster on the poster board with the corresponding number (e.g., if your abstract is on page 50, put your poster on board #50).
Proceedings papers with oral presentations #2-39 are not assigned poster space.
Abstracts are organized first by session, then the last name of the first author. Presenting authors’ names are underlined in the Table of Contents and in bold text on the abstracts.
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS

ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE ..... 1
PREDICTING LONGITUDINAL OUTCOMES OF ALZHEIMER'S DISEASE VIA A TENSOR-BASED JOINT CLASSIFICATION AND REGRESSION MODEL ..... 2
Lodewijk Brand, Kai Nichols, Hua Wang, Heng Huang, Li Shen, for the ADNI
ROBUSTLY EXTRACTING MEDICAL KNOWLEDGE FROM EHRS: A CASE STUDY OF LEARNING A HEALTH KNOWLEDGE GRAPH ..... 3
Irene Y. Chen, Monica Agrawal, Steven Horng, David Sontag
INCREASING CLINICAL TRIAL ACCRUAL VIA AUTOMATED MATCHING OF BIOMARKER CRITERIA ..... 4
Jessica W. Chen, Christian A. Kunder, Nam Bui, James L. Zehnder, Helio A. Costa, Henning Stehr
ADDRESSING THE CREDIT ASSIGNMENT PROBLEM IN TREATMENT OUTCOME PREDICTION USING TEMPORAL DIFFERENCE LEARNING ..... 5
Sahar Harati, Andrea Crowell, Helen Mayberg, Shamim Nemati
FROM GENOME TO PHENOME: PREDICTING MULTIPLE CANCER PHENOTYPES BASED ON SOMATIC GENOMIC ALTERATIONS VIA THE GENOMIC IMPACT TRANSFORMER ..... 6
Yifeng Tao, Chunhui Cai, William W. Cohen, Xinghua Lu
AUTOMATED PHENOTYPING OF PATIENTS WITH NON-ALCOHOLIC FATTY LIVER DISEASE REVEALS CLINICALLY RELEVANT DISEASE SUBTYPES ..... 7
Maxence Vandromme, Tomi Jun, Ponni Perumalswami, Joel T. Dudley, Andrea Branch, Li Li
MONITORING ICU MORTALITY RISK WITH A LONG SHORT-TERM MEMORY RECURRENT NEURAL NETWORK ..... 8
Ke Yu, Mingda Zhang, Tianyi Cui, Milos Hauskrecht

INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS ..... 9
DISORDERED FUNCTION CONJUNCTION: ON THE IN-SILICO FUNCTION ANNOTATION OF INTRINSICALLY DISORDERED REGIONS ..... 10
Sina Ghadermarzi, Akila Katuwawala, Christopher J. Oldfield, Amita Barik, Lukasz Kurgan
DE NOVO ENSEMBLE MODELING SUGGESTS THAT AP2-BINDING TO DISORDERED REGIONS CAN INCREASE STERIC VOLUME OF EPSIN BUT NOT EPS15 ..... 11
N. Suhas Jagannathan, Christopher W. V. Hogue, Lisa Tucker-Kellogg
MODULATION OF P53 TRANSACTIVATION DOMAIN CONFORMATIONS BY LIGAND BINDING AND CANCER-ASSOCIATED MUTATIONS ..... 12
Xiaorong Liu, Jianhan Chen
EXPLORING RELATIONSHIPS BETWEEN THE DENSITY OF CHARGED TRACTS WITHIN DISORDERED REGIONS AND PHASE SEPARATION ..... 13
Ramiz Somjee, Diana M. Mitrea, Richard W. Kriwacki

MUTATIONAL SIGNATURES ..... 14
PHYSIGS: PHYLOGENETIC INFERENCE OF MUTATIONAL SIGNATURE DYNAMICS ..... 15
Sarah Christensen, Mark D. M. Leiserson, Mohammed El-Kebir
TRACKSIGFREQ: SUBCLONAL RECONSTRUCTIONS BASED ON MUTATION SIGNATURES AND ALLELE FREQUENCIES ..... 16
Caitlin F. Harrigan, Yulia Rubanova, Quaid Morris, Alina Selega
DNA REPAIR FOOTPRINT UNCOVERS CONTRIBUTION OF DNA REPAIR MECHANISM TO MUTATIONAL SIGNATURES ..... 17
Damian Wojtowicz, Mark D. M. Leiserson, Roded Sharan, Teresa M. Przytycka

PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK ..... 18
CLINICAL CONCEPT EMBEDDINGS LEARNED FROM MASSIVE SOURCES OF MULTIMODAL MEDICAL DATA ..... 19
Andrew L. Beam, Benjamin Kompa, Allen Schmaltz, Inbar Fried, Griffin Weber, Nathan Palmer, Xu Shi, Tianxi Cai, Isaac S. Kohane
ASSESSMENT OF IMPUTATION METHODS FOR MISSING GENE EXPRESSION DATA IN META-ANALYSIS OF DISTINCT COHORTS OF TUBERCULOSIS PATIENTS ..... 20
Carly A. Bobak, Lauren McDonnell, Matthew D. Nemesure, Justin Lin, Jane E. Hill
TOWARDS IDENTIFYING DRUG SIDE EFFECTS FROM SOCIAL MEDIA USING ACTIVE LEARNING AND CROWD SOURCING ..... 21
Sophie Burkhardt, Julia Siekiera, Josua Glodde, Miguel A. Andrade-Navarro, Stefan Kramer
MICROVASCULAR DYNAMICS FROM 4D MICROSCOPY USING TEMPORAL SEGMENTATION ..... 22
Shir Gur, Lior Wolf, Lior Golgher, Pablo Blinder
USING TRANSCRIPTIONAL SIGNATURES TO FIND CANCER DRIVERS WITH LURE ..... 23
David Haan, Ruikang Tao, Verena Friedl, Ioannis N. Anastopoulos, Christopher K. Wong, Alana S. Weinstein, Joshua M. Stuart
PAGE-NET: INTERPRETABLE AND INTEGRATIVE DEEP LEARNING FOR SURVIVAL ANALYSIS USING HISTOPATHOLOGICAL IMAGES AND GENOMIC DATA ..... 24
Jie Hao, Sai Chandra Kosaraju, Nelson Zange Tsaku, Dae Hyun Song, Mingon Kang
MACHINE LEARNING ALGORITHMS FOR SIMULTANEOUS SUPERVISED DETECTION OF PEAKS IN MULTIPLE SAMPLES AND CELL TYPES ..... 25
Toby Dylan Hocking, Guillaume Bourque
GRAPH-BASED INFORMATION DIFFUSION METHOD FOR PRIORITIZING FUNCTIONALLY RELATED GENES IN PROTEIN-PROTEIN INTERACTION NETWORKS ..... 26
Minh Pham, Olivier Lichtarge
A LITERATURE-BASED KNOWLEDGE GRAPH EMBEDDING METHOD FOR IDENTIFYING DRUG REPURPOSING OPPORTUNITIES IN RARE DISEASES ..... 27
Daniel N. Sosa, Alexander Derry, Margaret Guo, Eric Wei, Connor Brinton, Russ B. Altman
TWO-STAGE ML CLASSIFIER FOR IDENTIFYING HOST PROTEIN TARGETS OF THE DENGUE PROTEASE ..... 28
Jacob T. Stanley, Alison R. Gilchrist, Alex C. Stabell, Mary A. Allen, Sara L. Sawyer, Robin D. Dowell
ENHANCING MODEL INTERPRETABILITY AND ACCURACY FOR DISEASE PROGRESSION PREDICTION VIA PHENOTYPE-BASED PATIENT SIMILARITY LEARNING ..... 29
Yue Wang, Tong Wu, Yunlong Wang, Gao Wang

PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE ..... 30
INTEGRATED CANCER SUBTYPING USING HETEROGENEOUS GENOME-SCALE MOLECULAR DATASETS ..... 31
Suzan Arslanturk, Sorin Draghici, Tin Nguyen
ASSESSMENT OF COVERAGE FOR ENDOGENOUS METABOLITES AND EXOGENOUS CHEMICAL COMPOUNDS USING AN UNTARGETED METABOLOMICS PLATFORM ..... 32
Sek Won Kong, Carles Hernandez-Ferrer
COVERAGE PROFILE CORRECTION OF SHALLOW-DEPTH CIRCULATING CELL-FREE DNA SEQUENCING VIA MULTI-DISTANCE LEARNING ..... 33
Nicholas B. Larson, Melissa C. Larson, Jie Na, Carlos P. Sosa, Chen Wang, Jean-Pierre Kocher, Ross Rowsey
PGXMINE: TEXT MINING FOR CURATION OF PHARMGKB ..... 34
Jake Lever, Julia M. Barbarino, Li Gong, Rachel Huddart, Katrin Sangkuhl, Ryan Whaley, Michelle Whirl-Carrillo, Mark Woon, Teri E. Klein, Russ B. Altman
THE POWER OF DYNAMIC SOCIAL NETWORKS TO PREDICT INDIVIDUALS' MENTAL HEALTH ..... 35
Shikang Liu, David Hachen, Omar Lizardo, Christian Poellabauer, Aaron Striegel, Tijana Milenkovic
IMPLEMENTING A CLOUD BASED METHOD FOR PROTECTED CLINICAL TRIAL DATA SHARING ..... 36
Gaurav Luthria, Qingbo Wang
PATHWAY AND NETWORK EMBEDDING METHODS FOR PRIORITIZING PSYCHIATRIC DRUGS ..... 37
Yash Pershad, Margaret Guo, Russ B. Altman
ROBUST-ODAL: LEARNING FROM HETEROGENEOUS HEALTH SYSTEMS WITHOUT SHARING PATIENT-LEVEL DATA ..... 38
Jiayi Tong, Rui Duan, Ruowang Li, Martijn J. Scheuemie, Jason H. Moore, Yong Chen
COMPUTATIONALLY EFFICIENT, EXACT, COVARIATE-ADJUSTED GENETIC PRINCIPAL COMPONENT ANALYSIS BY LEVERAGING INDIVIDUAL MARKER SUMMARY STATISTICS FROM LARGE BIOBANKS ..... 39
Jack Wolf, Martha Barnard, Xueting Xia, Nathan Ryder, Jason Westra, Nathan Tintle

PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS

ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE ..... 40
MULTICLASS DISEASE CLASSIFICATION FROM MICROBIAL WHOLE-COMMUNITY METAGENOMES ..... 41
Saad Khan, Libusha Kelly
LITGEN: GENETIC LITERATURE RECOMMENDATION GUIDED BY HUMAN EXPLANATIONS ..... 42
Allen Nie, Arturo L. Pineda, Matt W. Wright, Hannah Wand, Bryan Wulf, Helio A. Costa, Ronak Y. Patel, Carlos D. Bustamante, James Zou
MULTILEVEL SELF-ATTENTION MODEL AND ITS USE ON MEDICAL RISK PREDICTION ..... 43
Xianlong Zeng, Yunyi Feng, Soheil Moosavinasab, Deborah Lin, Simon Lin, Chang Liu
IDENTIFYING TRANSITIONAL HIGH COST USERS FROM UNSTRUCTURED PATIENT PROFILES WRITTEN BY PRIMARY CARE PHYSICIANS ..... 44
Haoran Zhang, Elisa Candido, Andrew S. Wilton, Raquel Duchen, Liisa Jaakkimainen, Walter Wodchis, Quaid Morris
OBTAINING DUAL-ENERGY COMPUTED TOMOGRAPHY (CT) INFORMATION FROM A SINGLE-ENERGY CT IMAGE FOR QUANTITATIVE IMAGING ANALYSIS OF LIVING SUBJECTS BY USING DEEP LEARNING ..... 45
Wei Zhao, Tianling Lv, Rena Lee, Yang Chen, Lei Xing

INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS ..... 46
MANY-TO-ONE BINDING BY INTRINSICALLY DISORDERED PROTEIN REGIONS ..... 47
Wei-Lun Alterovitz, Eshel Faraggi, Christopher J. Oldfield, Jingwei Meng, Bin Xue, Fei Huang, Pedro Romero, Andrzej Kloczkowski, Vladimir N. Uversky, A. Keith Dunker

MUTATIONAL SIGNATURES ..... 48
IMPACT OF MUTATIONAL SIGNATURES ON MICRORNA AND THEIR RESPONSE ELEMENTS ..... 49
Eirini Stamoulakatou, Pietro Pinoli, Stefano Ceri, Rosario Piro
GENOME GERRYMANDERING: OPTIMAL DIVISION OF THE GENOME INTO REGIONS WITH CANCER TYPE SPECIFIC DIFFERENCES IN MUTATION RATES ..... 50
Adamo Young, Jacob Chmura, Yoonsik Park, Quaid Morris, Gurnit Atwal

PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK ..... 51
LEARNING A LATENT SPACE OF HIGHLY MULTIDIMENSIONAL CANCER DATA ..... 52
Benjamin Kompa, Beau Coker
SCALING STRUCTURAL LEARNING WITH NO-BEARS TO INFER CAUSAL TRANSCRIPTOME NETWORKS ..... 53
Hao-Chih Lee, Matteo Danieletto, Riccardo Miotto, Sarah T. Cherng, Joel T. Dudley
PATHFLOWAI: A HIGH-THROUGHPUT WORKFLOW FOR PREPROCESSING, DEEP LEARNING AND INTERPRETATION IN DIGITAL PATHOLOGY ..... 54
Joshua J. Levy, Lucas A. Salas, Brock C. Christensen, Aravindhan Sriharan, Louis J. Vaickus
IMPROVING SURVIVAL PREDICTION USING A NOVEL FEATURE SELECTION AND FEATURE REDUCTION FRAMEWORK BASED ON THE INTEGRATION OF CLINICAL AND MOLECULAR DATA* ..... 55
Lisa Neums, Richard Meier, Devin C. Koestler, Jeffrey A. Thompson
BAYESIAN SEMI-NONNEGATIVE MATRIX TRI-FACTORIZATION TO IDENTIFY PATHWAYS ASSOCIATED WITH CANCER PHENOTYPES ..... 56
Sunho Park, Nabhonil Kar, Jae-Ho Cheong, Tae Hyun Hwang
TREE-WEIGHTING FOR MULTI-STUDY ENSEMBLE LEARNERS ..... 57
Maya Ramchandran, Prasad Patil, Giovanni Parmigiani
PTR EXPLORER: AN APPROACH TO IDENTIFY AND EXPLORE POST TRANSCRIPTIONAL REGULATORY MECHANISMS USING PROTEOGENOMICS ..... 58
Arunima Srivastava, Michael Sharpnack, Kun Huang, Parag Mallick, Raghu Machiraju
NETWORK REPRESENTATION OF LARGE-SCALE HETEROGENEOUS RNA SEQUENCES WITH INTEGRATION OF DIVERSE MULTI-OMICS, INTERACTIONS, AND ANNOTATIONS DATA ..... 59
Nhat Tran, Jean Gao
HADOOP AND PYSPARK FOR REPRODUCIBILITY AND SCALABILITY OF GENOMIC SEQUENCING STUDIES ..... 60
Nicholas R. Wheeler, Penelope Benchek, Brian W. Kunkle, Kara L. Hamilton-Nelson, Mike Warfe, Jeremy R. Fondran, Jonathan L. Haines, William S. Bush
CERENKOV3: CLUSTERING AND MOLECULAR NETWORK-DERIVED FEATURES IMPROVE COMPUTATIONAL PREDICTION OF FUNCTIONAL NONCODING SNPS ..... 61
Yao Yao, Stephen A. Ramsey

PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE ..... 62
ANOMIGAN: GENERATIVE ADVERSARIAL NETWORKS FOR ANONYMIZING PRIVATE MEDICAL DATA ..... 63
Ho Bae, Dahuin Jung, Hyun-Soo Choi, Sungroh Yoon
FREQUENCY OF CLINVAR PATHOGENIC VARIANTS IN CHRONIC KIDNEY DISEASE PATIENTS SURVEYED FOR RETURN OF RESEARCH RESULTS AT A CLEVELAND PUBLIC HOSPITAL ..... 64
Dana C. Crawford, John Lin, Jessica N. Cooke Bailey, Tyler Kinzy, John R. Sedor, John F. O'Toole, William S. Bush
NETWORK-BASED MATCHING OF PATIENTS AND TARGETED THERAPIES FOR PRECISION ONCOLOGY ..... 65
Qingzhi Liu, Min Jin Ha, Rupam Bhattacharyya, Lana Garmire, Veerabhadran Baladandayuthapani
PHENOME-WIDE ASSOCIATION STUDIES ON CARDIOVASCULAR HEALTH AND FATTY ACIDS CONSIDERING PHENOTYPE QUALITY CONTROL PRACTICES FOR EPIDEMIOLOGICAL DATA ..... 66
Kristin Passero, Xi He, Jiayan Zhou, Bertram Mueller-Myhsok, Marcus E. Kleber, Winfried Maerz, Molly A. Hall
ATEMPO: PATHWAY-SPECIFIC TEMPORAL ANOMALIES FOR PRECISION THERAPEUTICS ..... 67
Christopher Michael Pietras, Liam Power, Donna K. Slonim
FEATURE SELECTION AND DIMENSION REDUCTION OF SOCIAL AUTISM DATA ..... 68
Peter Washington, Kelley Marie Paskov, Haik Kalantarian, Nathaniel Stockham, Catalin Voss, Aaron Kline, Ritik Patnaik, Brianna Chrisman, Maya Varma, Qandeel Tariq, Kaitlyn Dunlap, Jessey Schwartz, Nick Haber, Dennis P. Wall

POSTER PRESENTATIONS

ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE ..... 69
PRIORITIZING COPY NUMBER VARIANTS USING PHENOTYPE AND GENE FUNCTIONAL SIMILARITY ..... 70
Azza Althagafi, Jun Chen, Robert Hoehndorf
INFERRING THE REWARD FUNCTIONS THAT GUIDE CANCER PROGRESSION ..... 71
John Kalantari, Heidi Nelson, Nicholas Chia
PREDICTING DISEASE-ASSOCIATED MUTATION OF METAL-BINDING SITES IN PROTEINS USING A DEEP LEARNING APPROACH ..... 72
Mohamad Koohi-Moghadam, Haibo Wang, Yuchuan Wang, Xinming Yang, Hongyan Li, Junwen Wang, Hongzhe Sun

GENERAL ..... 73
RANKING RAS PATHWAY MUTATIONS USING EVOLUTIONARY HISTORY OF MEK1 ..... 74
Katia Andrianova, Igor Jouline
INTEGRATIVE ANALYSIS OF COPD AND LUNG CANCER METADATA REVEALS SHARED ALTERATIONS IN IMMUNE RESPONSE, PTEN AND PI3K-AKT PATHWAYS ..... 75
Dannielle Skander, Arda Durmaz, Mohammed Orloff, Gurkan Bebek
INVESTIGATING SOURCES OF IRREPRODUCIBILITY IN ANALYSIS OF GENE EXPRESSION DATA ..... 76
Carly A. Bobak, Jane E. Hill
ETHEREUM AND MULTICHAIN BLOCKCHAINS AS SECURE TOOLS FOR INDIVIDUALIZED MEDICINE ..... 77
Charlotte Brannon, Gamze Gursoy, Sarah Wagner, Mark Gerstein
GENOMIC PREDICTORS OF L-ASPARAGINASE-INDUCED PANCREATITIS IN PEDIATRIC CANCER PATIENTS ..... 78
Britt Drogemoller, Galen E. B. Wright, Shahrad Rassekh, Shinya Ito, Bruce Carleton, Colin Ross, The Canadian Pharmacogenomics Network for Drug Safety Consortium
NITECAP: A NOVEL METHOD AND INTERFACE FOR THE IDENTIFICATION OF CIRCADIAN BEHAVIOR IN HIGHLY PARALLEL TIME-COURSE DATA ..... 79
Thomas G. Brooks, Cris W. Lawrence, Nicholas F. Lahens, Soumyashant Nayak, Dimitra Sarantopoulou, Garret A. FitzGerald, Gregory R. Grant
THE INTERPLAY OF OBESITY AND RACE/ETHNICITY ON MAJOR PERINATAL COMPLICATIONS ..... 80
Yaadira Brown, MPH; Olubode A. Olufajo, MD, MPH; Edward E. Cornwell III, MD; William Southerland, PhD
A COMPARISON OF PHARMACOGENOMIC INFORMATION IN FDA-APPROVED DRUG LABELS AND CPIC GUIDELINES ..... 81
Katherine I. Carrillo, Teri E. Klein
XTEA: A TRANSPOSABLE ELEMENT INSERTION ANALYZER FOR GENOME SEQUENCING DATA FROM MULTIPLE TECHNOLOGIES ..... 82
Chong Chu, Rebeca Monroy, Soohyun Lee, E. Alice Lee, Peter J. Park
GO GET DATA (GGD): SIMPLE, REPRODUCIBLE ACCESS TO SCIENTIFIC DATA ..... 83
Michael Cormier, Jon Belyeu, Brent Pedersen, Joe Brown, Johannes Koster, Aaron R. Quinlan
GLOBAL EPIGENOMIC REGULATION OF GENE EXPRESSION AND CELLULAR PROLIFERATION IN T-CELL LEUKEMIA ..... 84
Sinisa Dovat, Yali Ding, Bo Zhang, Jonathon L. Payne, Feng Yue
A PHARMACOGENOMIC INVESTIGATION OF THE CARDIAC SAFETY PROFILE OF ONDANSETRON IN CHILDREN AND IN PREGNANT WOMEN ..... 85
Galen E. B. Wright, Britt I. Drögemöller, Jessica Trueman, Kaitlyn Shaw, Michelle Staub, Shahnaz Chaudhry, Sholeh Ghayoori, Fudan Miao, Michelle Higginson, Gabriella S. S. Groeneweg, James Brown, Laura A Magee, Simon D. Whyte, Nicholas West, Sonia Brodie, Geert ’t Jong, Howard Berger, Shinya Ito, Shahrad R. Rassekh, Shubhayan Sanatani, Colin J. D. Ross, Bruce C. Carleton
TREND: A PLATFORM FOR EXPLORING PROTEIN FUNCTION IN PROKARYOTES USING PHYLOGENETICS, DOMAIN ARCHITECTURES, AND GENE NEIGHBORHOODS INFORMATION ..... 86
Vadim M. Gumerov, Igor B. Zhulin
TRACKSIGFREQ: SUBCLONAL RECONSTRUCTIONS BASED ON MUTATION SIGNATURES AND ALLELE FREQUENCIES ..... 87
Caitlin F. Harrigan, Yulia Rubanova, Quaid Morris, Alina Selega
A FLEXIBLE PIPELINE FOR THE PREDICTION OF BIOMARKERS RELEVANT TO DRUG SENSITIVITY ..... 88
V. Keith Hughitt, Sayeh Gorjifard, Aleksandra M. Michalowski, John K. Simmons, Ryan Dale, Eric C. Polley, Jonathan J. Keats, Beverly A. Mock
CREATING A METABOLIC SYNDROME RESEARCH RESOURCE (METSRR) ..... 89
Willysha Jenkins, Christian Richardson, ClarLynda Williams-DeVane PhD
UTILIZING COHORT INFORMATION TO FIND CAUSATIVE VARIANTS ..... 90
Senay Kafkas, Robert Hoehndorf
INTEGRATED ANALYSIS OF JAK-STAT PATHWAY IN HOMEOSTASIS, SIMULATED INFLAMMATION AND TUMOUR ..... 91
Milica Krunic, Anzhelika Karjalainen, Mojoyinola Joanna Ola, Stephen Shoebridge, Sabine Macho-Maschler, Caroline Lassnig, Andrea Poelzl, Matthias Farlik, Nikolaus Fortelny, Christoph Bock, Birgit Strobl, Mathias Mueller
BEERS2: THE NEXT GENERATION OF RNA-SEQ SIMULATOR ..... 92
Nicholas F. Lahens, Thomas G. Brooks, Dimitra Sarantopoulou, Soumyashant Nayak, Cris Lawrence, Anand Srinivasan, Jonathan Schug, Garret A. FitzGerald, John B. Hogenesch, Yoseph Barash, Gregory R. Grant
EFFECT MODIFICATION BY AGE ON A DIAGNOSTIC THREE-GENE-SIGNATURE IN PATIENTS WITH ACTIVE TUBERCULOSIS ..... 93
Lauren McDonnell, Carly Bobak, Matthew Nemesure, Justin Lin, Jane Hill
CLASSIFICATION AND MUTATION PREDICTION FROM GASTROINTESTINAL CANCER HISTOPATHOLOGY IMAGES USING DEEP LEARNING ..... 94
Sung Hak Lee, Hyun-Jong Jang
MAPPING THE EMERGENCE AND MIGRATION OF HEMATOPOIETIC STEM CELLS AND PROGENITORS DURING HUMAN DEVELOPMENT AT SINGLE CELL RESOLUTION ..... 95
Feiyang Ma, Vincenzo Calvanese, Sandra Capellera-Garcia, Sophia Ekstrand, Matteo Pellegrini, Hanna K. A. Mikkola
LARGE-SCALE MACHINE LEARNING AND GRAPH ANALYTICS FOR FUNCTIONAL PREDICTION OF PATHOGEN PROTEINS ..... 96
Jason McDermott, Song Feng, William Nelson, Joon-Yong Lee, Sayan Ghosh, Ariful Khan, Mahantesh Halappanavar, Justine Nguyen, Jonathan Pruneda, David Baltrus, Joshua Adkins
GENE-SET ANALYSIS USING GWAS SUMMARY STATISTICS AND GTEX DATABASE ..... 97
Masahiro Nakatochi
TARGETING CANCER VIA SIGNALING PATHWAYS: A NOVEL APPROACH TO THE DISCOVERY OF GENE CCDC191'S DOUBLE-AGENT FUNCTION USING DIFFERENTIAL GENE EXPRESSION, HEAT MAP ANALYSES THROUGH AI DEEP LEARNING, AND MATHEMATICAL MODELING ..... 98
Annie Ostojic
RFEX: SIMPLE RANDOM FOREST MODEL AND SAMPLE EXPLAINER FOR NON-MACHINE LEARNING EXPERTS ..... 99
Dragutin Petkovic, Ali Alavi, DanDan Cai, Jizhou Yang, Sabiha Barlaskar
APPARENT BIAS TOWARD LONG GENE MISREGULATION IN MECP2 SYNDROMES DISAPPEARS AFTER CONTROLLING FOR BASELINE VARIATIONS ..... 100
Ayush T. Raman, Amy E Pohodich, Ying-Wooi Wan, Hari Krishna Yalamanchili, William E. Lowry, Huda Y. Zoghbi, Zhandong Liu
PREDICTION OF CHRONOLOGICAL AND BIOLOGICAL AGE FROM LABORATORY DATA ..... 101
Luke Sagers, Luke Melas-Kyriazi, Chirag J. Patel, Arjun K. Manrai
WHOLE GENOME SEQUENCING ANALYSIS OF INFLUENZA C VIRUS IN KOREA ..... 102
Sooyeon Lim, Han Sol Lee, Ji Yun Noh, Joon Young Song, Hee Jin Cheong, Woo Joo Kim
MINING THE HUMUHUMUNUKUNUKUAPUAA AND THE SHAKA OF AUTISM WITH BIG DATA BIOMEDICAL DATA SCIENCE ..... 103
Peter Washington, Brianna Chrisman, Kaiti Dunlap, Aaron Kline, Arman Husic, Michael Ning, Kelley Paskov, Nate Stockham, Maya Varma, Emilie LeBlanc, Jack Kent, Yordan Penev, Min Woo Sun, Jae-Yoon Jung, Catalin Voss, Nick Haber, Dennis P. Wall
DEVELOPMENT OF A RECURRENCE PREDICTION MODEL FOR EARLY LUNG ADENOCARCINOMA USING RADIOMICS-BASED ARTIFICIAL INTELLIGENCE ..... 104
Hee Chul Yang, Gunseok Park, Ji Eun Oh
DRLPC: DIMENSION REDUCTION OF SEQUENCING DATA USING LOCAL PRINCIPAL COMPONENTS ..... 105
Yun Joo Yoo, Fatemeh Yavartanu, Shelley B. Bull
META-ANALYSIS IN EXHAUSTED T CELLS FROM HOMO SAPIENS AND MUS MUSCULUS PROVIDES NOVEL TARGETS FOR IMMUNOTHERAPY ..... 106
Lin Zhang, Yicheng Guo, Hafumi Nishi

INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS ..... 107
DISORDERED FUNCTION CONJUNCTION: ON THE IN-SILICO FUNCTION ANNOTATION OF INTRINSICALLY DISORDERED REGIONS ..... 108
Sina Ghadermarzi, Akila Katuwawala, Christopher J. Oldfield, Amita Barik, Lukasz Kurgan

MUTATIONAL SIGNATURES ..... 109
TRANSCRIPTION-ASSOCIATED REGIONAL MUTATION RATES AND SIGNATURES IN REGULATORY ELEMENTS ACROSS 2,500 WHOLE CANCER GENOMES ..... 110
Jüri Reimand
COMPLEX MOSAIC STRUCTURAL VARIATIONS IN HUMAN FETAL BRAINS ..... 111
Shobana Sekar, Livia Tomasini, Maria Kalyva, Taejeong Bae, Logan Manlove, Bo Zhou, Jessica Mariani, Fritz Sedlazeck, Alexander E. Urban, Christos Proukakis, Flora M. Vaccarino, Alexej Abyzov
PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK ..... 112
STRATIFICATION OF KIDNEY TRANSPLANT RECIPIENTS BASED ON TEMPORAL DISEASE TRAJECTORIES ..... 113
Isabella Friis Jørgensen PhD, Søren Schwartz Sørensen PhD, Søren Brunak PhD
MODELING GENE EXPRESSION LEVELS FROM EPIGENETIC MARKERS USING A DYNAMICAL SYSTEMS APPROACH ..... 114
James Brunner, Jacob Kim, Kord M. Kober
TRANSLATING BIG DATA NEUROIMAGING FINDINGS INTO MEASUREMENTS OF INDIVIDUAL VULNERABILITY ..... 115
Peter Kochunov, Paul Thompson, Neda Jahanshad, Elliot Hong
AUTOMATING NEW-USER COHORT CONSTRUCTION WITH INDICATION EMBEDDINGS ..... 116
Rachel D. Melamed
REPRODUCIBILITY-OPTIMIZED STATISTICAL TESTING FOR OMICS STUDIES ..... 117
Tomi Suomi, Laura Elo
DATA INTEGRATION EXPECTATION MAPS: TOWARDS MORE INFORMED 'OMIC DATA INTEGRATION ..... 118
Tia Tate, Christian Richardson, ClarLynda Williams-DeVane

PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE ..... 119
INTEGRATED OMICS DATA MINING OF SYNERGISTIC GENE PAIRS FOR CANCER PRECISION MEDICINE ..... 120
Euna Jeong, Choa Park, Sukjoon Yoon
THE POWER OF DYNAMIC SOCIAL NETWORKS TO PREDICT INDIVIDUALS' MENTAL HEALTH ..... 121
Shikang Liu, David Hachen, Omar Lizardo, Christian Poellabauer, Aaron Striegel, Tijana Milenkovic
ROBUST-ODAL: LEARNING FROM HETEROGENEOUS HEALTH SYSTEMS WITHOUT SHARING PATIENT-LEVEL DATA ..... 122
Jiayi Tong, Rui Duan, Ruowang Li, Martijn J. Scheuemie, Jason H. Moore, Yong Chen
PHARMGKB: AUTOMATED LITERATURE ANNOTATIONS ..... 123
Michelle Whirl-Carrillo, Li Gong, Rachel Huddart, Katrin Sangkuhl, Ryan Whaley, Mark Woon, Julia Barbarino, Jake Lever, Russ B. Altman, Teri E. Klein

WORKSHOPS WITH POSTER PRESENTATIONS

PACKAGING BIOCOMPUTING SOFTWARE TO MAXIMIZE DISTRIBUTION AND REUSE ..... 124
APOLLO PROVIDES COLLABORATIVE GENOME ANNOTATION EDITING WITH THE POWER OF JBROWSE ..... 125
Nathan Dunn, Colin Diesh, Robert Buels, Helena Rasche, Anthony Bretaudeau, Nomi Harris, Ian Holmes
G:PROFILER - ONE FUNCTIONAL ENRICHMENT ANALYSIS TOOL, MANY INTERFACES SERVING LIFE SCIENCE COMMUNITIES ..... 126
Liis Kolberg, Uku Raudvere, Ivan Kuzmin, Jaak Vilo, Hedi Peterson
INCREASING USABILITY AND DISSEMINATION OF THE PATHFX ALGORITHM USING WEB APPLICATIONS AND DOCKER SYSTEMS ..... 127
Jennifer Wilson, Nicholas Stepanov, Ajinkya Chalke, Mike Wong, Dragutin Petkovic, Russ B. Altman

TRANSLATIONAL BIOINFORMATICS WORKSHOP: BIOBANKS IN THE PRECISION MEDICINE ERA ..... 128
IDENTIFICATION OF BIOMARKERS RELATED TO AUTISM SPECTRUM DISORDER USING GENOMIC INFORMATION ..... 129
Leena Sait, Martha Gizaw, and Iosif Vaisman
A PAN-CANCER 3-GENE SIGNATURE TO PREDICT DORMANCY ..... 130
Ivy Tran, Anchal Sharma, Subhajyoti De

AUTHOR INDEX ..... 131
1
ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS
2
Artificial Intelligence for Enhancing Clinical Medicine
Predicting Longitudinal Outcomes of Alzheimer's Disease via a Tensor-Based Joint Classification and Regression Model
Lodewijk Brand¹, Kai Nichols¹, Hua Wang¹, Heng Huang², Li Shen³, for the ADNI
¹Colorado School of Mines, ²University of Pittsburgh, ³University of Pennsylvania
Hua Wang
Alzheimer's disease (AD) is a serious neurodegenerative condition that affects millions of people across the world. Recently, machine learning models have been used to predict the progression of AD, although they frequently do not take advantage of the longitudinal and structural components associated with multi-modal medical data. To address this, we present a new algorithm that uses the multi-block alternating direction method of multipliers to optimize a novel objective that combines multi-modal longitudinal clinical data of various modalities to simultaneously predict the cognitive scores and diagnoses of the participants in the Alzheimer's Disease Neuroimaging Initiative cohort. Our new model is designed to leverage the structure associated with clinical data that is not incorporated into standard machine learning optimization algorithms. This new approach shows state-of-the-art predictive performance and validates a collection of brain and genetic biomarkers that have been recorded previously in AD literature.
3
Artificial Intelligence for Enhancing Clinical Medicine
Robustly Extracting Medical Knowledge from EHRs: A Case Study of Learning a Health Knowledge Graph
Irene Y. Chen¹, Monica Agrawal¹, Steven Horng², David Sontag¹
¹Massachusetts Institute of Technology, ²Beth Israel Deaconess Medical Center
Irene Chen
Increasingly large electronic health records (EHRs) provide an opportunity to algorithmically learn medical knowledge. In one prominent example, a causal health knowledge graph could learn relationships between diseases and symptoms and then serve as a diagnostic tool to be refined with additional clinical input. Prior research has demonstrated the ability to construct such a graph from over 270,000 emergency department patient visits. In this work, we describe methods to evaluate a health knowledge graph for robustness. Moving beyond precision and recall, we analyze for which diseases and for which patients the graph is most accurate. We identify sample size and unmeasured confounders as major sources of error in the health knowledge graph. We introduce a method to leverage non-linear functions in building the causal graph to better understand existing model assumptions. Finally, to assess model generalizability, we extend to a larger set of complete patient visits within a hospital system. We conclude with a discussion on how to robustly extract medical knowledge from EHRs.
4
Artificial Intelligence for Enhancing Clinical Medicine
Increasing Clinical Trial Accrual via Automated Matching of Biomarker Criteria
Jessica W. Chen, Christian A. Kunder, Nam Bui, James L. Zehnder, Helio A. Costa, Henning Stehr
Stanford University School of Medicine
Successful implementation of precision oncology requires both the deployment of nucleic acid sequencing panels to identify clinically actionable biomarkers and the efficient screening of patient biomarker eligibility for ongoing clinical trials and therapies. This process is typically performed manually by biocurators, geneticists, pathologists, and oncologists; however, it is a time-intensive and inconsistent process amongst healthcare providers. We present the development of a feature matching algorithmic pipeline that identifies patients who meet eligibility criteria of precision medicine clinical trials via genetic biomarkers and apply it to patients undergoing treatment at the Stanford Cancer Center. This study demonstrates, through our patient eligibility screening algorithm that leverages clinical sequencing derived biomarkers with precision medicine clinical trials, the successful use of an automated algorithmic pipeline as a feasible, accurate and effective alternative to the traditional manual clinical trial curation.
5
Artificial Intelligence for Enhancing Clinical Medicine
Addressing the Credit Assignment Problem in Treatment Outcome Prediction using Temporal Difference Learning
Sahar Harati¹, Andrea Crowell², Helen Mayberg³, Shamim Nemati⁴
¹Stanford University, ²Emory University, ³Mount Sinai, ⁴University of California San Diego
Sahar Harati
Mental health patients often undergo a variety of treatments before finding an effective one. Improved prediction of treatment response can shorten the duration of trials. A key challenge of applying predictive modeling to this problem is that often the effectiveness of a treatment regimen remains unknown for several weeks, and therefore immediate feedback signals may not be available for supervised learning. Here we propose a machine learning approach to extracting audio-visual features from weekly video interview recordings for predicting the likely outcome of Deep Brain Stimulation (DBS) treatment several weeks in advance. In the absence of immediate treatment-response feedback, we utilize a joint state-estimation and temporal difference learning approach to model both the trajectory of a patient's response and the delayed nature of feedback. Our results based on longitudinal recordings from 12 patients with depression show that the learned state values are predictive of the long-term success of DBS treatments. We achieve an area under the receiver operating characteristic curve of 0.88, beating all baseline methods.
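As a rough illustration of the temporal difference idea named in this abstract, the sketch below runs tabular TD(0) on toy trajectories in which the only feedback arrives at the end of each episode. The discrete state indices, learning rate, discount factor, and toy episodes are all assumptions made for illustration; the abstract does not specify the authors' state-estimation model or audio-visual features.

```python
def td0_values(episodes, n_states, alpha=0.1, gamma=1.0, sweeps=200):
    """Tabular TD(0) value estimation with delayed (terminal-only) rewards.

    episodes: list of (states, final_reward) pairs, where `states` is the
    sequence of visited state indices and `final_reward` is observed only
    after the last state. Intermediate rewards are taken to be zero.
    """
    values = [0.0] * n_states
    for _ in range(sweeps):
        for states, final_reward in episodes:
            for t, s in enumerate(states):
                is_last = t == len(states) - 1
                # Bootstrapped target: the next state's current value, or the
                # delayed outcome once the episode ends.
                target = final_reward if is_last else gamma * values[states[t + 1]]
                values[s] += alpha * (target - values[s])
    return values


# Hypothetical toy data: states 0..3 stand in for coarse weekly patient states.
episodes = [
    ([0, 1, 3], 1.0),  # trajectory that ends in a treatment response
    ([0, 2], 0.0),     # trajectory that ends in non-response
]
values = td0_values(episodes, n_states=4)
```

In this toy setting the learned value of state 1 approaches the delayed outcome of the trajectory it lies on, even though no reward is ever observed in state 1 itself, which is the credit assignment effect the abstract refers to.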
6
Artificial Intelligence for Enhancing Clinical Medicine
From genome to phenome: Predicting multiple cancer phenotypes based on somatic genomic alterations via the genomic impact transformer
Yifeng Tao¹, Chunhui Cai², William W. Cohen¹, Xinghua Lu²
¹Carnegie Mellon University, ²University of Pittsburgh
Yifeng Tao
Cancers are mainly caused by somatic genomic alterations (SGAs) that perturb cellular signaling systems and eventually activate oncogenic processes. Therefore, understanding the functional impact of SGAs is a fundamental task in cancer biology and precision oncology. Here, we present a deep neural network model with encoder-decoder architecture, referred to as genomic impact transformer (GIT), to infer the functional impact of SGAs on cellular signaling systems through modeling the statistical relationships between SGA events and differentially expressed genes (DEGs) in tumors. The model utilizes a multi-head self-attention mechanism to identify SGAs that likely cause DEGs, in other words, to differentiate potential driver SGAs from passenger ones in a tumor. The GIT model learns a vector (gene embedding) as an abstract representation of functional impact for each SGA-affected gene. Given SGAs of a tumor, the model can instantiate the states of the hidden layer, providing an abstract representation (tumor embedding) reflecting characteristics of perturbed molecular/cellular processes in the tumor, which in turn can be used to predict multiple phenotypes. We apply the GIT model to 4,468 tumors profiled by The Cancer Genome Atlas (TCGA) project. The attention mechanism enables the model to better capture the statistical relationship between SGAs and DEGs than conventional methods, and distinguishes cancer drivers from passengers. The learned gene embeddings capture the functional similarity of SGAs perturbing common pathways. The tumor embeddings are shown to be useful for tumor status representation, and phenotype prediction including patient survival time and drug response of cancer cell lines.
7
Artificial Intelligence for Enhancing Clinical Medicine
Automated phenotyping of patients with non-alcoholic fatty liver disease reveals clinically relevant disease subtypes
Maxence Vandromme, Tomi Jun, Ponni Perumalswami, Joel T. Dudley, Andrea Branch, Li Li
Icahn School of Medicine at Mount Sinai, Sema4
Maxence Vandromme
Non-alcoholic fatty liver disease (NAFLD) is a complex heterogeneous disease which affects more than 20% of the population worldwide. Some subtypes of NAFLD have been clinically identified using hypothesis-driven methods. In this study, we used data mining techniques to search for subtypes in an unbiased fashion. Using electronic signatures of the disease, we identified a cohort of 13,290 patients with NAFLD from a hospital database. We gathered clinical data from multiple sources and applied unsupervised clustering to identify five subtypes among this cohort. Descriptive statistics and survival analysis showed that the subtypes were clinically distinct and were associated with different rates of death, cirrhosis, hepatocellular carcinoma, chronic kidney disease, cardiovascular disease, and myocardial infarction. Novel disease subtypes identified in this manner could be used to risk-stratify patients and guide management.
8
Artificial Intelligence for Enhancing Clinical Medicine
Monitoring ICU Mortality Risk with A Long Short-Term Memory Recurrent Neural Network
Ke Yu¹, Mingda Zhang², Tianyi Cui², Milos Hauskrecht²
¹Intelligent Systems Program, University of Pittsburgh; ²Department of Computer Science, University of Pittsburgh
Ke Yu
In intensive care units (ICU), mortality prediction is a critical factor not only for effective medical intervention but also for allocation of clinical resources. Structured electronic health records (EHR) contain valuable information for assessing mortality risk in ICU patients, but current mortality prediction models usually require laborious human-engineered features. Furthermore, substantial missing data in EHR is a common problem for both the construction and implementation of a prediction model. Inspired by language-related models, we design a new framework for dynamic monitoring of patients’ mortality risk. Our framework uses the bag-of-words representation for all relevant medical events based on most recent history as inputs. By design, it is robust to missing data in EHR and can be easily implemented as an instant scoring system to monitor the medical development of all ICU patients. Specifically, our model uses latent semantic analysis (LSA) to encode the patients’ states into low-dimensional embeddings, which are further fed to long short-term memory networks for mortality risk prediction. Our results show that the deep learning based framework performs better than the existing severity scoring system, SAPS-II. We observe that bidirectional long short-term memory demonstrates superior performance, probably due to the successful capture of both forward and backward temporal dependencies.
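The bag-of-words input described in this abstract can be pictured with a small sketch: count how often each event code occurs in a recent time window and lay the counts out in a fixed vocabulary order. The event codes, the 24-hour window, and the function name are hypothetical choices for illustration, not details taken from the paper; the LSA and LSTM stages are not shown.

```python
from collections import Counter


def bag_of_events(events, vocabulary, now, window_hours=24.0):
    """Count vector of event codes seen in the window (now - window_hours, now].

    events: list of (timestamp_in_hours, event_code) pairs.
    Codes outside the vocabulary are ignored, and codes that never occur
    simply get a zero count, so no imputation step is required.
    """
    recent = Counter(
        code for t, code in events if now - window_hours < t <= now
    )
    return [recent.get(code, 0) for code in vocabulary]


# Hypothetical event codes and timestamps (hours since ICU admission).
vocabulary = ["HR_HIGH", "LACTATE_HIGH", "VASOPRESSOR", "VENTILATION"]
events = [
    (1.0, "HR_HIGH"),        # too old for the window ending at hour 36
    (20.0, "HR_HIGH"),
    (30.0, "VASOPRESSOR"),
    (33.5, "HR_HIGH"),
    (34.0, "UNKNOWN_CODE"),  # not in the vocabulary, ignored
]
vector = bag_of_events(events, vocabulary, now=36.0)
```

Because an unobserved measurement just contributes a zero count, a representation of this kind needs no special handling for missing values, which matches the robustness claim in the abstract.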
9
INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS
10
Intrinsically Disordered Proteins (IDPs) and Their Functions
Disordered Function Conjunction: On the in-silico function annotation of intrinsically disordered regions
Sina Ghadermarzi, Akila Katuwawala, Christopher J. Oldfield, Amita Barik, Lukasz Kurgan
Department of Computer Science, Virginia Commonwealth University, 401 West Main Street, Richmond, VA 23284, USA
Lukasz Kurgan
Intrinsically disordered regions (IDRs) lack a stable structure, yet perform biological functions. The functions of IDRs include mediating interactions with other molecules, including proteins, DNA, or RNA, and entropic functions, including domain linkers. Computational predictors provide residue-level indications of function for disordered proteins, which contrasts with the need to functionally annotate the thousands of experimentally and computationally discovered IDRs. In this work, we investigate the feasibility of using residue-level prediction methods for region-level function predictions. For an initial examination of the multiple function region-level prediction problem, we constructed a dataset of (likely) single function IDRs in proteins that are dissimilar to the training datasets of the residue-level function predictors. We find that available residue-level prediction methods are only modestly useful in predicting multiple region-level functions. Classification is enhanced by simultaneous use of multiple residue-level function predictions and is further improved by inclusion of amino acid content extracted from the protein sequence. We conclude that multifunction prediction for IDRs is feasible and benefits from the results produced by current residue-level function predictors; however, it has to accommodate inaccuracy in functional annotations.
11
Intrinsically Disordered Proteins (IDPs) and Their Functions
De novo ensemble modeling suggests that AP2-binding to disordered regions can increase steric volume of Epsin but not Eps15
N. Suhas Jagannathan¹, Christopher W.V. Hogue², Lisa Tucker-Kellogg³
¹Duke-NUS Medical School; ²National University of Singapore, 600 Epic Way Unit 345 San Jose CA 95134; ³Cancer & Stem Cell Biology and Centre for Computational Biology, Duke-NUS Medical School
Lisa Tucker-Kellogg
Proteins with intrinsically disordered regions (IDRs) have large hydrodynamic radii, compared with globular proteins of equivalent weight. Recent experiments showed that IDRs with large radii can create steric pressure to drive membrane curvature during Clathrin-mediated endocytosis (CME). Epsin and Eps15 are two CME proteins with IDRs that contain multiple motifs for binding the adaptor protein AP2, but the impact of AP2-binding on these IDRs is unknown. Some IDRs acquire binding-induced function by forming a folded quaternary structure, but we hypothesize that the IDRs of Epsin and/or Eps15 acquire binding-induced function by increasing their steric volume. We explore this hypothesis in silico by generating conformational ensembles of the IDRs of Epsin (4 million structures) or Eps15 (3 million structures), then estimating the impact of AP2-binding on Radius of Gyration (RG). Results show that the ensemble of Epsin IDR conformations that accommodate AP2 binding has a right-shifted distribution of RG (larger radii) relative to the unbound Epsin ensemble. In contrast, the ensemble of Eps15 IDR conformations has a comparable RG distribution between the AP2-bound and unbound states. We speculate that AP2 triggers the Epsin IDR to function through binding-induced expansion, which could increase steric pressure and membrane bending during CME.
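For readers unfamiliar with the quantity compared in this abstract, the sketch below computes a radius of gyration from a list of 3D coordinates using the standard mass-weighted definition. The two four-point "conformations" are made-up toy inputs, not structures from the study, and equal masses are assumed when none are given.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration of a set of 3D points.

    RG = sqrt( sum_i m_i * |r_i - r_cm|^2 / sum_i m_i ),
    where r_cm is the center of mass. Equal masses are assumed by default.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    ]
    weighted_sq = sum(
        m * sum((c[axis] - center[axis]) ** 2 for axis in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(weighted_sq / total)


# Toy conformations (arbitrary units): a compact square and an extended line.
compact = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)]
extended = [(0, 0, 0), (2, 0, 0), (4, 0, 0), (6, 0, 0)]
rg_compact = radius_of_gyration(compact)
rg_extended = radius_of_gyration(extended)
```

Applying such a function to every member of a bound and an unbound ensemble yields two RG distributions of the kind the abstract compares; the extended toy chain has the larger value here.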
12
Intrinsically Disordered Proteins (IDPs) and Their Functions
Modulation of p53 Transactivation Domain Conformations by Ligand Binding and Cancer-Associated Mutations
Xiaorong Liu, Jianhan Chen
University of Massachusetts Amherst
Jianhan Chen
Intrinsically disordered proteins (IDPs) are important functional proteins, and their deregulation is linked to numerous human diseases including cancers. Understanding how disease-associated mutations or drug molecules can perturb the sequence-disordered ensemble-function-disease relationship of IDPs remains challenging, because it requires detailed characterization of the heterogeneous structural ensembles of IDPs. In this work, we combine the latest atomistic force field a99SB-disp, the enhanced sampling technique replica exchange with solute tempering, and GPU-accelerated molecular dynamics simulations to investigate how four cancer-associated mutations, K24N, N29K/N30D, D49Y, and W53G, and binding of an anti-cancer molecule, epigallocatechin gallate (EGCG), modulate the disordered ensemble of the transactivation domain (TAD) of tumor suppressor p53. Through extensive sampling, in excess of 1.0 μs per replica, well-converged structural ensembles of wild-type and mutant p53-TAD as well as WT p53-TAD in the presence of EGCG were generated. The results reveal that mutants could induce local structural changes and affect secondary structural properties. Interestingly, both EGCG binding and N29K/N30D could also induce long-range structural reorganizations and lead to more compact structures that could shield key binding sites of p53-TAD regulators. Further analysis reveals that the effects of EGCG binding are mainly achieved through nonspecific interactions. These observations are generally consistent with on-going NMR studies and binding assays. Our studies suggest that induced conformational collapse of IDPs may be a general mechanism for shielding functional sites, thus inhibiting recognition of their targets. The current study also demonstrates that atomistic simulations provide a viable approach for studying the sequence-disordered ensemble-function-disease relationships of IDPs and developing new drug design strategies targeting regulatory IDPs.
13
Intrinsically Disordered Proteins (IDPs) and Their Functions
Exploring Relationships between the Density of Charged Tracts within Disordered Regions and Phase Separation
Ramiz Somjee¹,², Diana M. Mitrea¹, Richard W. Kriwacki¹,³
¹St. Jude Children's Research Hospital; ²Rhodes College; ³University of Tennessee Health Sciences Center
Ramiz Somjee
Biomolecular condensates form through a process termed phase separation and play diverse roles throughout the cell. Proteins that undergo phase separation often have disordered regions that can engage in weak, multivalent interactions; however, our understanding of the sequence grammar that defines which proteins phase separate is far from complete. Here, we show that proteins that display a high density of charged tracts within intrinsically disordered regions are likely to be constituents of electrostatically organized biomolecular condensates. We scored the human proteome using an algorithm termed ABTdensity that quantifies the density of charged tracts and observed that proteins with more charged tracts are enriched in particular Gene Ontology annotations and, based upon analysis of interaction networks, cluster into distinct biomolecular condensates. These results suggest that electrostatically-driven, multivalent interactions involving charged tracts within disordered regions serve to organize certain biomolecular condensates through phase separation.
14
MUTATIONAL SIGNATURES
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS
15
Mutational Signatures
PhySigs: Phylogenetic Inference of Mutational Signature Dynamics
Sarah Christensen¹, Mark D.M. Leiserson², Mohammed El-Kebir¹
¹University of Illinois at Urbana-Champaign, ²University of Maryland
Sarah Christensen
Distinct mutational processes shape the genomes of the clones comprising a tumor. These processes result in distinct mutational patterns, summarized by a small number of mutational signatures. Current analyses of clone-specific exposures to mutational signatures do not fully incorporate a tumor’s evolutionary context, either inferring identical exposures for all tumor clones, or inferring exposures for each clone independently. Here, we introduce the Tree-constrained Exposure problem to infer a small number of exposure shifts along the edges of a given tumor phylogeny. Our algorithm, PhySigs, solves this problem and includes model selection to identify the number of exposure shifts that best explain the data. We validate our approach on simulated data and identify exposure shifts in lung cancer data, including at least one shift with a matching subclonal driver mutation in the mismatch repair pathway. Moreover, we show that our approach enables the prioritization of alternative phylogenies inferred from the same sequencing data. PhySigs is publicly available at https://github.com/elkebir-group/PhySigs
16
Mutational Signatures
TrackSigFreq: subclonal reconstructions based on mutation signatures and allele frequencies
Caitlin F. Harrigan¹,²,⁴, Yulia Rubanova¹,²,⁴, Quaid Morris¹,²,³,⁴,⁵,⁶, Alina Selega²,⁴
¹Department of Computer Science, University of Toronto, Toronto, Canada; ²Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada; ³Department of Molecular Genetics, University of Toronto, Toronto, Canada; ⁴Vector Institute, Toronto, Canada; ⁵Ontario Institute for Cancer Research, Toronto, Canada; ⁶Memorial Sloan Kettering Cancer Centre, New York, USA (pending)
Cait Harrigan
Mutational signatures are patterns of mutation types, many of which are linked to known mutagenic processes. Signature activity represents the proportion of mutations a signature generates. In cancer, cells may gain advantageous phenotypes through mutation accumulation, causing rapid growth of that subpopulation within the tumour. The presence of many subclones can make cancers harder to treat and have other clinical implications. Reconstructing changes in signature activities can give insight into the evolution of cells within a tumour. Recently, we introduced a new method, TrackSig, to detect changes in signature activities across time from a single bulk tumour sample. By design, TrackSig is unable to identify mutation populations with different frequencies but little to no difference in signature activity. Here we present an extension of this method, TrackSigFreq, which enables trajectory reconstruction based on both observed density of mutation frequencies and changes in mutational signature activities. TrackSigFreq preserves the advantages of TrackSig, namely optimal and rapid mutation clustering through segmentation, while extending it so that it can identify distinct mutation populations that share similar signature activities.
17
Mutational Signatures
DNA Repair Footprint Uncovers Contribution of DNA Repair Mechanism to Mutational Signatures
Damian Wojtowicz¹, Mark D.M. Leiserson², Roded Sharan³, Teresa M. Przytycka¹
¹NIH, ²University of Maryland, ³Tel Aviv University
Teresa Przytycka
Cancer genomes accumulate a large number of somatic mutations resulting from imperfections of DNA processing during the normal cell cycle as well as from carcinogenic exposures or cancer-related aberrations of DNA maintenance machinery. These processes often lead to distinctive patterns of mutations, called mutational signatures. Several computational methods have been developed to uncover such signatures from catalogs of somatic mutations. However, cancer mutational signatures are the end-effect of several interplaying factors including carcinogenic exposures and potential deficiencies of the DNA repair mechanism. To fully understand the nature of each signature, it is important to disambiguate the atomic components that contribute to the final signature. Here, we introduce a new descriptor of mutational signatures, DNA Repair FootPrint (RePrint), and show that it can capture common properties of deficiencies in repair mechanisms contributing to diverse signatures. We validate the method with published mutational signatures from cell lines targeted with CRISPR-Cas9-based knockouts of DNA repair genes.
18
PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS
19
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data
Andrew L. Beam¹, Benjamin Kompa², Allen Schmaltz¹, Inbar Fried³, Griffin Weber², Nathan Palmer², Xu Shi¹, Tianxi Cai¹, Isaac S. Kohane³
¹Harvard T.H. Chan School of Public Health, ²Harvard Medical School, ³University of North Carolina School of Medicine
Benjamin Kompa
Word embeddings are a popular approach to unsupervised learning of word relationships that is widely used in natural language processing. In this article, we present a new set of embeddings for medical concepts learned using an extremely large collection of multimodal medical data. Leaning on recent theoretical insights, we demonstrate how an insurance claims database of 60 million members, a collection of 20 million clinical notes, and 1.7 million full-text biomedical journal articles can be combined to embed concepts into a common space, resulting in the largest ever set of embeddings for 108,477 medical concepts. To evaluate our approach, we present a new benchmark methodology based on statistical power specifically designed to test embeddings of medical concepts. Our approach, called cui2vec, attains state-of-the-art performance relative to previous methods in most instances. Finally, we provide a downloadable set of pre-trained embeddings for other researchers to use, as well as an online tool for interactive exploration of the cui2vec embeddings.
20
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Assessment of Imputation Methods for Missing Gene Expression Data in Meta-Analysis of Distinct Cohorts of Tuberculosis Patients
Carly A. Bobak, Lauren McDonnell, Matthew D. Nemesure, Justin Lin, Jane E. Hill
Dartmouth College
Carly Bobak
The growth of publicly available repositories, such as the Gene Expression Omnibus, has allowed researchers to conduct meta-analysis of gene expression data across distinct cohorts. In this work, we assess eight imputation methods for their ability to impute gene expression data when values are missing across an entire cohort of Tuberculosis (TB) patients. We investigate how varying proportions of missing data (across 10%, 20%, and 30% of patient samples) influence the imputation results, and test for significantly differentially expressed genes and enriched pathways in patients with active TB. Our results indicate that truncating to common genes observed across cohorts, which is the current method used by researchers, results in the exclusion of important biology and suggest that LASSO and LLS imputation methodologies can reasonably impute genes across cohorts when total missingness rates are below 20%.
21
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Towards identifying drug side effects from social media using active learning and crowdsourcing
Sophie Burkhardt, Julia Siekiera, Josua Glodde, Miguel A. Andrade-Navarro, Stefan Kramer
University of Mainz
Sophie Burkhardt
Motivation: Social media is a largely untapped source of information on side effects of drugs. Twitter in particular is widely used to report on everyday events and personal ailments. However, labeling this noisy data is a difficult problem because labeled training data is sparse and automatic labeling is error-prone. Crowdsourcing can help in such a scenario to obtain more reliable labels, but is expensive in comparison because workers have to be paid. To remedy this, semi-supervised active learning may reduce the amount of labeled data needed and focus the manual labeling process on important information.
Results: We extracted data from Twitter using the public API. We subsequently used Amazon Mechanical Turk in combination with a state-of-the-art semi-supervised active learning method to label tweets with their associated drugs and side effects in two stages. Our results show that our method is an effective way of discovering side effects in tweets, with an improvement from 53% F-measure to 67% F-measure as compared to a one-stage workflow. Additionally, we show the effectiveness of the active learning scheme in reducing the labeling cost in comparison to a non-active baseline.
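The F-measure quoted in this abstract combines precision and recall into a single score. The sketch below shows the usual F-beta formula computed from raw counts; the abstract does not state which beta was used, so the balanced case (beta = 1) is assumed, and the counts in the example are invented.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true positive, false positive and false negative counts.

    With beta = 1 this is the harmonic mean of precision and recall.
    Returns 0.0 when there are no true positives.
    """
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Invented counts: 6 side-effect mentions found correctly, 2 spurious, 4 missed.
score = f_measure(tp=6, fp=2, fn=4)
```

Here precision is 0.75 and recall is 0.6, so the balanced F-measure is 2/3, which falls between the two and sits closer to the lower value.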
22
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Microvascular Dynamics from 4D Microscopy Using Temporal Segmentation
Shir Gur, Lior Wolf, Lior Golgher, Pablo Blinder
Tel Aviv University
Lior Wolf
Recently developed methods for rapid continuous volumetric two-photon microscopy facilitate the observation of neuronal activity in hundreds of individual neurons and changes in blood flow in adjacent blood vessels across a large volume of living brain at unprecedented spatio-temporal resolution. However, the high imaging rate necessitates fully automated image analysis, whereas tissue turbidity and photo-toxicity limitations lead to extremely sparse and noisy imagery. In this work, we extend a recently proposed deep learning volumetric blood vessel segmentation network, such that it supports temporal analysis. With this technology, we are able to track changes in cerebral blood volume over time and identify spontaneous arterial dilations that propagate towards the pial surface. This new capability is a promising step towards characterizing the hemodynamic response function upon which functional magnetic resonance imaging (fMRI) is based.
23
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Using Transcriptional Signatures to Find Cancer Drivers with LURE
David Haan, Ruikang Tao, Verena Friedl, Ioannis N. Anastopoulos, Christopher K. Wong, Alana S. Weinstein, Joshua M. Stuart
Dept. of Biomolecular Engineering and UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064 USA
David Haan
Cancer genome projects have produced multidimensional datasets on thousands of samples. Yet, depending on the tumor type, 5-50% of samples have no known driving event. We introduce a semi-supervised method called Learning UnRealized Events (LURE) that uses a progressive label learning framework and minimum spanning analysis to predict cancer drivers based on their altered samples sharing a gene expression signature with the samples of a known event. We demonstrate the utility of the method on the TCGA Pan-Cancer Atlas dataset for which it produced a high-confidence result relating 59 new connections to 18 known mutation events including alterations in the same gene, family, and pathway. We give examples of predicted drivers involved in TP53, telomere maintenance, and MAPK/RTK signaling pathways. LURE identifies connections between genes with no known prior relationship, some of which may offer clues for targeting specific forms of cancer. Code and Supplemental Material are available on the LURE website: https://sysbiowiki.soe.ucsc.edu/lure.
24
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
PAGE-Net: Interpretable and Integrative Deep Learning for Survival Analysis Using Histopathological Images and Genomic Data
Jie Hao1, Sai Chandra Kosaraju2, Nelson Zange Tsaku3, Dae Hyun Song4, Mingon Kang2
1University of Pennsylvania, 2University of Nevada Las Vegas, 3Kennesaw State University, 4Gyeongsang National University Changwon Hospital
Jie Hao
The integration of multi-modal data, such as histopathological images and genomic data, is essential for understanding cancer heterogeneity and complexity for personalized treatments, as well as for enhancing survival predictions in cancer study. Histopathology, as a clinical gold-standard tool for diagnosis and prognosis in cancers, allows clinicians to make precise decisions on therapies, whereas high-throughput genomic data have been investigated to dissect the genetic mechanisms of cancers. We propose a biologically interpretable deep learning model (PAGE-Net) that integrates histopathological images and genomic data, not only to improve survival prediction, but also to identify genetic and histopathological patterns that cause different survival rates in patients. PAGE-Net consists of pathology/genome/demography-specific layers, each of which provides comprehensive biological interpretation. In particular, we propose a novel patch-wise texture-based convolutional neural network, with a patch aggregation strategy, to extract global survival-discriminative features, without manual annotation for the pathology-specific layers. We adapted the pathway-based sparse deep neural network, named Cox-PASNet, for the genome-specific layers. The proposed deep learning model was assessed with the histopathological images and the gene expression data of Glioblastoma Multiforme (GBM) at The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA). PAGE-Net achieved a C-index of 0.702, which is higher than the results achieved with only histopathological images (0.509) and Cox-PASNet (0.640). More importantly, PAGE-Net can simultaneously identify histopathological and genomic prognostic factors associated with patients’ survivals. The source code of PAGE-Net is publicly available at https://github.com/DataX-JieHao/PAGE-Net
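The C-index reported above is a ranking metric: it measures how often a model assigns the higher risk to the patient who experiences the event earlier. The sketch below is a minimal, quadratic-time illustration of Harrell's concordance index; it is not taken from the PAGE-Net repository, and the function name and argument layout are assumptions made for illustration.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times:  observed follow-up time per patient
    events: 1 if the event was observed, 0 if the patient was censored
    risks:  predicted risk score per patient (higher means earlier event)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's recorded time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to a perfect ranking, which is why 0.702 versus 0.509 is a meaningful gap.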
25
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Machine learning algorithms for simultaneous supervised detection of peaks in multiple samples and cell types
Toby Dylan Hocking1, Guillaume Bourque2
1Northern Arizona University, 2McGill University
Toby Hocking
Joint peak detection is a central problem when comparing samples in epigenomic data analysis, but current algorithms for this task are unsupervised and limited to at most two sample types. We propose PeakSegPipeline, a new genome-wide multi-sample peak calling pipeline for epigenomic data sets. It performs peak detection using a constrained maximum likelihood segmentation model with essentially only one free parameter that needs to be tuned: the number of peaks. To select the number of peaks, we propose to learn a penalty function based on user-provided labels that indicate genomic regions with or without peaks in specific samples. In comparisons with state-of-the-art peak detection algorithms, PeakSegPipeline achieves similar or better accuracy, and a more interpretable model with overlapping peaks that occur in exactly the same positions across all samples. Our novel approach is able to learn that predicted peak sizes vary by experiment type.
26
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Graph-based information diffusion method for prioritizing functionally related genes in protein-protein interaction networks
Minh Pham, Olivier Lichtarge
Baylor College of Medicine
Minh Pham
Shortest path length methods are routinely used to validate whether genes of interest are functionally related to each other based on biological network information. However, the methods are computationally intensive, impeding extensive utilization of network information. In addition, the non-weighted shortest path length approach, which is more frequently used, often treats all network connections equally without taking into account the confidence levels of the associations. On the other hand, the graph-based information diffusion method, which employs both the presence and confidence weights of network edges, can efficiently explore large networks and has previously detected meaningful biological patterns. Therefore, in this study, we hypothesized that the graph-based information diffusion method could prioritize genes with relevant functions more efficiently and accurately than the shortest path length approaches. We demonstrated that the graph-based information diffusion method substantially differentiated not only genes participating in the same biological pathways (p<<0.0001) but also genes associated with specific human drug-induced clinical symptoms (p<<0.0001) from random. Furthermore, the diffusion method prioritized these functionally related genes faster and more accurately than the shortest path length approaches (pathways: p=2.7e-28, clinical symptoms: p=0.032). These data show the graph-based information diffusion method can be routinely used for robust prioritization of functionally related genes, facilitating efficient network validation and hypothesis generation, especially for human phenotype-specific genes.
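To make the contrast with unweighted shortest paths concrete, the sketch below spreads a signal from seed genes across a confidence-weighted network. It uses random walk with restart, one common diffusion variant chosen here for illustration; the abstract does not specify the authors' exact diffusion scheme, and the function name, the `restart` value, and the toy gene labels in the usage note are all assumptions.

```python
def diffuse(edges, seeds, restart=0.5, iterations=50):
    """Random walk with restart on an undirected, weighted network.

    edges: dict mapping (gene_u, gene_v) to a positive confidence weight
    seeds: genes of interest where the signal starts
    Returns a dict of gene -> diffusion score (scores sum to 1).
    """
    neighbors = {}
    for (u, v), weight in edges.items():
        neighbors.setdefault(u, {})[v] = weight
        neighbors.setdefault(v, {})[u] = weight

    known_seeds = set(seeds) & set(neighbors)
    if not known_seeds:
        raise ValueError("no seed gene is present in the network")

    start = {node: 0.0 for node in neighbors}
    for seed in known_seeds:
        start[seed] = 1.0 / len(known_seeds)

    # Total edge weight per node, used to normalize outgoing flow.
    strength = {node: sum(nbrs.values()) for node, nbrs in neighbors.items()}

    score = dict(start)
    for _ in range(iterations):
        updated = {node: restart * start[node] for node in neighbors}
        for node, nbrs in neighbors.items():
            for nbr, weight in nbrs.items():
                # High-confidence edges carry proportionally more signal.
                updated[nbr] += (1.0 - restart) * score[node] * weight / strength[node]
        score = updated
    return score
```

With edges `{("A", "B"): 0.9, ("A", "D"): 0.1}` and seed `"A"`, gene `B` ends up with a higher score than `D` even though both are one hop away, which is exactly the information an unweighted shortest path discards.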
27
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
A Literature-Based Knowledge Graph Embedding Method for Identifying Drug Repurposing Opportunities in Rare Diseases
Daniel N. Sosa, Alexander Derry, Margaret Guo, Eric Wei, Connor Brinton, Russ B. Altman
Stanford University
Daniel Sosa
Millions of Americans are affected by rare diseases, many of which have poor survival rates. However, the small market size of individual rare diseases, combined with the time and capital requirements of pharmaceutical R&D, has hindered the development of new drugs for these cases. A promising alternative is drug repurposing, whereby existing FDA-approved drugs might be used to treat diseases different from their original indications. In order to generate drug repurposing hypotheses in a systematic and comprehensive fashion, it is essential to integrate information from across the literature of pharmacology, genetics, and pathology. To this end, we leverage a newly developed knowledge graph, the Global Network of Biomedical Relationships (GNBR). GNBR is a large, heterogeneous knowledge graph comprising drug, disease, and gene (or protein) entities linked by a small set of semantic “themes” derived from the abstracts of biomedical literature. We apply a knowledge graph embedding method that explicitly models the uncertainty associated with literature-derived relationships and uses link prediction to generate drug repurposing hypotheses. This approach achieves high performance on a gold-standard test set of known drug indications (AUROC=0.89) and is capable of generating novel repurposing hypotheses, which we independently validate using external literature sources and protein interaction networks. Finally, we demonstrate the ability of our model to produce explanations of its predictions.
28
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Two-stage ML Classifier for Identifying Host Protein Targets of the Dengue Protease
Jacob T. Stanley, Alison R. Gilchrist, Alex C. Stabell, Mary A. Allen, Sara L. Sawyer, Robin D. Dowell
Department of Molecular, Cellular and Developmental Biology; BioFrontiers Institute; University of Colorado Boulder (all authors have the same affiliation)
Jacob Stanley
Flaviviruses such as dengue encode a protease that is essential for viral replication. The protease functions by cleaving well-conserved positions in the viral polyprotein. In addition to the viral polyprotein, the dengue protease cleaves at least one host protein involved in immune response. This raises the question, what other host proteins are targeted and cleaved? Here we present a new computational method for identifying putative host protein targets of the dengue virus protease. Our method relies on biochemical and secondary structure features at the known cleavage sites in the viral polyprotein in a two-stage classification process to identify putative cleavage targets. The accuracy of our predictions scaled inversely with evolutionary distance when we applied it to the known cleavage sites of several other flaviviruses, a good indication of the validity of our predictions. Ultimately, our classifier identified 257 human protein sites possessing both a similar target motif and accessible local structure. These proteins are promising candidates for further investigation. As the number of viral sequences expands, our method could be adopted to predict host targets of other flaviviruses.
29
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Enhancing Model Interpretability and Accuracy for Disease Progression Prediction via Phenotype-Based Patient Similarity Learning
Yue Wang1, Tong Wu1,2, Yunlong Wang1, Gao Wang3
1IQVIA Inc., 2University of Minnesota, 3University of Chicago
Yue Wang
Models have been proposed to extract temporal patterns from longitudinal electronic health records (EHR) for clinical predictive models. However, the common relations among patients (e.g., receiving the same medical treatments) were rarely considered. In this paper, we propose to learn patient similarity features as phenotypes from the aggregated patient-medical service matrix using non-negative matrix factorization. On real-world medical claim data, we show that the learned phenotypes are coherent within each group, and also explanatory and indicative of targeted diseases. We conducted experiments to predict the diagnoses for Chronic Lymphocytic Leukemia (CLL) patients. Results show that the phenotype-based similarity features can improve prediction over multiple baselines, including logistic regression, random forest, convolutional neural network, and more.
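Non-negative matrix factorization approximates a patient-by-service count matrix V as the product W H, where each row of H can be read as a phenotype (a group of services) and each row of W gives a patient's loading on those phenotypes. The sketch below uses the classic multiplicative update rules for the Frobenius objective in plain Python; it is an illustrative assumption, not the authors' implementation, and the helper names and default parameters are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (n x m) into w (n x rank) and h (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


def frobenius_error(v, w, h):
    """Reconstruction error ||V - WH|| in the Frobenius norm."""
    approx = matmul(w, h)
    return sum((v[i][j] - approx[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0]))) ** 0.5
```

Because every factor stays non-negative, each phenotype is an additive combination of services, which is what makes the learned groups readable by a clinician.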
30
PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE
PROCEEDINGS PAPERS WITH ORAL PRESENTATIONS
31
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Integrated Cancer Subtyping using Heterogeneous Genome-Scale Molecular Datasets
Suzan Arslanturk1, Sorin Draghici1, Tin Nguyen2
1Wayne State University, 2University of Nevada
Sorin Draghici
Vast repositories of heterogeneous data from existing sources present unique opportunities. Taken individually, each of the data sets offers solutions to important domain and source-specific questions. Collectively, they represent complementary views of related data entities with an aggregate information value often well exceeding the sum of its parts. Integration of heterogeneous data is therefore paramount to i) obtain a more unified picture and comprehensive view of the relations, ii) achieve more robust results, iii) improve the accuracy and integrity, and iv) illuminate the complex interactions among data features. In this paper, we have proposed a data integration methodology to identify subtypes of cancer using multiple data types (mRNA, methylation, microRNA and somatic variants) and different data scales that come from different platforms (microarray, sequencing, etc.). The Cancer Genome Atlas (TCGA) dataset is used to build the data integration and cancer subtyping framework. The proposed data integration and disease subtyping approach accurately identifies novel subgroups of patients with significantly different survival profiles. With the current availability of vast genomics and variant data for cancer, the proposed data integration system will better differentiate cancer and patient subtypes for risk and outcome prediction and targeted treatment planning without additional cost and precious lost time.
32
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Assessment of coverage for endogenous metabolites and exogenous chemical compounds using an untargeted metabolomics platform
Sek Won Kong1, Carles Hernandez-Ferrer2
1Computational Health Informatics Program, Boston Children’s Hospital, 300 Longwood Avenue, Boston, MA 02115, USA; 2Department of Pediatrics, Harvard Medical School, Boston, MA 02115, USA
Sek Won Kong
Physiological status and pathological changes in an individual can be captured by the metabolic state, which reflects the influence of both genetic variants and environmental factors such as diet, lifestyle and gut microbiome. The totality of environmental exposure throughout a lifetime (i.e., the exposome) is difficult to measure with current technologies. However, targeted measurement of exogenous chemicals and untargeted profiling of endogenous metabolites have been widely used to discover biomarkers of pathophysiologic changes and to understand functional impacts of genetic variants. To investigate the coverage of chemical space and interindividual variation related to demographic and pathological conditions, we profiled 169 plasma samples using an untargeted metabolomics platform. On average, 1,009 metabolites were quantified in each individual (range 906–1,038) out of 1,244 total chemical compounds detected in our cohort. Of note, age was positively correlated with the total number of detected metabolites in both males and females. Using the robust Qn estimator, we found metabolite outliers in each sample (mean 22, range from 7 to 86). A total of 50 metabolites were outliers in a patient with phenylketonuria, including those known for the phenylalanine pathway, suggesting that multiple metabolic pathways are perturbed in this patient. The largest number of outliers (N=86) was found in a 5-year-old boy with alpha-1-antitrypsin deficiency who was waiting for liver transplantation due to cirrhosis. Xenobiotics including drugs, diets and environmental chemicals were significantly correlated with diverse endogenous metabolites, and the use of antibiotics significantly changed gut microbial products detected in host circulation. Several challenges such as annotation of features, reference range and variance for each feature per age group and gender, and population scale reference data sets need to be addressed; however, untargeted metabolomics could be immediately deployed as a biomarker discovery platform and to evaluate the impact of genomic variants and exposures on metabolic pathways for some diseases.
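The Qn estimator mentioned above (Rousseeuw and Croux) is a robust scale estimate built from pairwise differences, so a few extreme values barely move it. The sketch below is a naive quadratic-time version that omits the small-sample correction factors; the outlier rule (distance from the median in Qn units, with a cutoff of 3) is an assumption for illustration, since the abstract does not state the threshold the authors used.

```python
from statistics import median


def qn_scale(values):
    """Naive O(n^2) Qn scale estimate (no finite-sample correction)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    h = n // 2 + 1
    k = h * (h - 1) // 2  # index of the order statistic to take
    diffs = sorted(abs(values[i] - values[j])
                   for i in range(n) for j in range(i + 1, n))
    # 2.2219 makes Qn consistent with the standard deviation for Gaussian data.
    return 2.2219 * diffs[k - 1]


def flag_outliers(values, threshold=3.0):
    """Return indices whose distance from the median exceeds threshold * Qn."""
    center = median(values)
    scale = qn_scale(values)
    if scale == 0:
        return []
    return [i for i, x in enumerate(values)
            if abs(x - center) / scale > threshold]
```

Applied per metabolite across the cohort, a rule of this kind would yield the per-sample outlier counts that the abstract summarizes.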
33
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Coverage profile correction of shallow-depth circulating cell-free DNA sequencing via multi-distance learning
Nicholas B. Larson, Melissa C. Larson, Jie Na, Carlos P. Sosa, Chen Wang, Jean-Pierre Kocher, Ross Rowsey
Mayo Clinic College of Medicine and Sciences
Nicholas Larson
Shallow-depth whole-genome sequencing (WGS) of circulating cell-free DNA (ccfDNA) is a popular approach for non-invasive genomic screening assays, including liquid biopsy for early detection of invasive tumors as well as non-invasive prenatal screening (NIPS) for common fetal trisomies. In contrast to nuclear DNA WGS, ccfDNA WGS exhibits extensive inter- and intra-sample coverage variability that is not fully explained by typical sources of variation in WGS, such as GC content. This variability may inflate false positive and false negative screening rates of copy-number alterations and aneuploidy, particularly if these features are present at a relatively low proportion of total sequenced content. Herein, we propose an empirically-driven coverage correction strategy that leverages prior annotation information in a multi-distance learning context to improve within-sample coverage profile correction. Specifically, we train a weighted k-nearest neighbors-style method on non-pregnant female donor ccfDNA WGS samples, and apply it to NIPS samples to evaluate coverage profile variability reduction. We additionally characterize improvement in the discrimination of positive fetal trisomy cases relative to normal controls, and compare our results against a more traditional regression-based approach to profile coverage correction based on GC content and mappability. Under cross-validation, performance measures indicated benefit to combining the two feature sets relative to either in isolation. We also observed substantial improvement in coverage profile variability reduction in leave-out clinical NIPS samples, with variability reduced by 26.5-53.5% relative to the standard regression-based method as quantified by median absolute deviation. Finally, we observed improved discrimination for screening positive trisomy cases, reducing ccfDNA WGS coverage variability while additionally improving NIPS trisomy screening assay performance. Overall, our results indicate that machine learning approaches can substantially improve ccfDNA WGS coverage profile correction and downstream analyses.
34
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
PGxMine: Text mining for curation of PharmGKB
Jake Lever1, Julia M. Barbarino2, Li Gong2, Rachel Huddart2, Katrin Sangkuhl2, Ryan Whaley2, Michelle Whirl-Carrillo2, Mark Woon2, Teri E. Klein2,3, Russ B. Altman1,2,3
1Department of Bioengineering, Stanford University, Stanford, CA, 94305; 2Department of Biomedical Data Science, Stanford University, Stanford, CA, 94305; 3Department of Medicine, Stanford University, Stanford, CA, 94305
Jake Lever
Precision medicine tailors treatment to individuals' personal data, including differences in their genome. The Pharmacogenomics Knowledgebase (PharmGKB) provides highly curated information on the effect of genetic variation on drug response and side effects for a wide range of drugs. PharmGKB’s scientific curators triage, review and annotate a large number of papers each year but the task is challenging. We present the PGxMine resource, a text-mined resource of pharmacogenomic associations from all accessible published literature to assist in the curation of PharmGKB. We developed a supervised machine learning pipeline to extract associations between a variant (DNA and protein changes, star alleles and dbSNP identifiers) and a chemical. PGxMine covers 452 chemicals and 2,426 variants and contains 19,930 mentions of pharmacogenomic associations across 7,170 papers. An evaluation by PharmGKB curators found that 57 of the top 100 associations not found in PharmGKB led to 83 curatable papers and a further 24 associations would likely lead to curatable papers through citations. The results can be viewed at https://pgxmine.pharmgkb.org/ and code can be downloaded at https://github.com/jakelever/pgxmine.
35
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
The power of dynamic social networks to predict individuals' mental health
Shikang Liu1, David Hachen1, Omar Lizardo2, Christian Poellabauer1, Aaron Striegel1, Tijana Milenkovic1
1University of Notre Dame, 2University of California Los Angeles
Shikang Liu
Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals' social interactions and their mental health to predict one's likelihood of being depressed or anxious from rich dynamic social network data. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine "correlation" between social networks and health but without making any predictions; or they study other individual traits but not mental health. In a comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data.
36
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Implementing a Cloud Based Method for Protected Clinical Trial Data Sharing
Gaurav Luthria, Qingbo Wang
Harvard University
Gaurav Luthria
Clinical trials generate a large amount of data that have been underutilized due to obstacles that prevent data sharing including risking patient privacy, data misrepresentation, and invalid secondary analyses. In order to address these obstacles, we developed a novel data sharing method which ensures patient privacy while also protecting the interests of clinical trial investigators. Our flexible and robust approach involves two components: (1) an advanced cloud-based querying language that allows users to test hypotheses without direct access to the real clinical trial data and (2) corresponding synthetic data for the query of interest that allows for exploratory research and model development. Both components can be modified by the clinical trial investigator depending on factors such as the type of trial or number of patients enrolled. To test the effectiveness of our system, we first implement a simple and robust permutation based synthetic data generator. We then use the synthetic data generator coupled with our querying language to identify significant relationships among variables in a realistic clinical trial data set.
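One plausible reading of a permutation based synthetic data generator is to shuffle each column independently: every variable keeps its real marginal distribution, but no synthetic row corresponds to a real patient and cross-variable relationships are broken. The sketch below illustrates that reading only; the abstract does not describe the authors' generator in detail, and the function name and record layout are assumptions.

```python
import random


def permute_columns(rows, seed=0):
    """Build synthetic records by shuffling every column independently.

    rows: list of dicts that all share the same keys (one dict per patient)
    Returns a new list of dicts with identical per-column values in a
    different, column-specific order.
    """
    rng = random.Random(seed)
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    for values in columns.values():
        rng.shuffle(values)  # each column gets its own permutation
    return [{name: columns[name][i] for name in columns}
            for i in range(len(rows))]
```

Synthetic data of this kind is suitable for developing and debugging a query or model, while the real associations would still have to be tested through the querying layer, which is the division of labor the two components above describe.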
37
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Pathway and network embedding methods for prioritizing psychiatric drugs
Yash Pershad1, Margaret Guo2, Russ B. Altman3
1Stanford University Department of Bioengineering, 2Stanford University Biomedical Informatics Program, 3Stanford University Departments of Bioengineering, Genetics, & Medicine
Yash Pershad
One in five Americans experience mental illness, and roughly 75% of psychiatric prescriptions do not successfully treat the patient’s condition. Extensive evidence implicates genetic factors and signaling disruption in the pathophysiology of these diseases. Changes in transcription often underlie this molecular pathway dysregulation; individual patient transcriptional data can improve the efficacy of diagnosis and treatment. Recent large-scale genomic studies have uncovered shared genetic modules across multiple psychiatric disorders, providing an opportunity for an integrated multi-disease approach for diagnosis. Moreover, network-based models informed by gene expression can represent pathological biological mechanisms and suggest new genes for diagnosis and treatment. Here, we use patient gene expression data from multiple studies to classify psychiatric diseases, integrate knowledge from expert-curated databases and publicly available experimental data to create augmented disease-specific gene sets, and use these to recommend disease-relevant drugs. From Gene Expression Omnibus, we extract expression data from 145 cases of schizophrenia, 82 cases of bipolar disorder, 190 cases of major depressive disorder, and 307 shared controls. We use pathway-based approaches to predict psychiatric disease diagnosis with a random forest model (78% accuracy) and derive important features to augment available drug and disease signatures. Using protein-protein-interaction networks and embedding-based methods, we build a pipeline to prioritize treatments for psychiatric diseases that achieves a 3.4-fold improvement over a background model. Thus, we demonstrate that gene-expression-derived pathway features can diagnose psychiatric diseases and that molecular insights derived from this classification task can inform treatment prioritization for psychiatric diseases.
38
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Robust-ODAL: Learning from heterogeneous health systems without sharing patient-level data
Jiayi Tong1, Rui Duan1, Ruowang Li1, Martijn J. Scheuemie2, Jason H. Moore1, Yong Chen1
1University of Pennsylvania, 2Janssen Research and Development LLC
Jiayi Tong
Electronic Health Records (EHR) contain extensive patient data on various health outcomes and risk predictors, providing an efficient and wide-reaching source for health research. Integrated EHR data can provide a larger sample size of the population to improve estimation and prediction accuracy. To overcome the obstacle of sharing patient-level data, distributed algorithms were developed to conduct statistical analyses across multiple clinical sites through sharing only aggregated information. However, the heterogeneity of data across sites is often ignored by existing distributed algorithms, which leads to substantial bias when studying the association between the outcomes and exposures. In this study, we propose a privacy-preserving and communication-efficient distributed algorithm which accounts for the heterogeneity caused by a small number of the clinical sites. We evaluated our algorithm through a systematic simulation study motivated by real-world scenarios and applied our algorithm to multiple claims datasets from the Observational Health Data Sciences and Informatics (OHDSI) network. The results showed that the proposed method performed better than the existing distributed algorithm ODAL and a meta-analysis method.
39
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Computationally efficient, exact, covariate-adjusted genetic principal component analysis by leveraging individual marker summary statistics from large biobanks
Jack Wolf1, Martha Barnard1, Xueting Xia2, Nathan Ryder3, Jason Westra4, Nathan Tintle4
1St. Olaf College, 2Texas Tech University, 3Colorado State University, 4Dordt University
Nathan Tintle
The popularization of biobanks provides an unprecedented amount of genetic and phenotypic information that can be used to research the relationship between genetics and human health. Despite the opportunities these datasets provide, they also pose many problems associated with computational time and costs, data size and transfer, and privacy and security. The publishing of summary statistics from these biobanks, and the use of them in a variety of downstream statistical analyses, alleviates many of these logistical problems. However, major questions remain about how to use summary statistics in all but the simplest downstream applications. Here, we present a novel approach to utilize basic summary statistics (estimates from single marker regressions on single phenotypes) to evaluate more complex phenotypes using multivariate methods. In particular, we present a covariate-adjusted method for conducting principal component analysis (PCA) utilizing only biobank summary statistics. We validate exact formulas for this method, as well as provide a framework of estimation when specific summary statistics are not available, through simulation. We apply our method to a real data set of fatty acid and genomic data.
40
ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE
PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS
41
Artificial Intelligence for Enhancing Clinical Medicine
Multiclass Disease Classification from Microbial Whole-Community Metagenomes
Saad Khan, Libusha Kelly
Albert Einstein College of Medicine
Saad Khan
The microbiome, the community of microorganisms living within an individual, is a promising avenue for developing non-invasive methods for disease screening and diagnosis. Here, we utilize 5643 aggregated, annotated whole-community metagenomes to implement the first multiclass microbiome disease classifier of this scale, able to discriminate between 18 different diseases and healthy. We compared three different machine learning models: random forests, deep neural nets, and a novel graph convolutional architecture which exploits the graph structure of phylogenetic trees as its input. We show that the graph convolutional model outperforms deep neural nets in terms of accuracy (achieving 75% average test-set accuracy), receiver-operator-characteristics (92.1% average area-under-ROC (AUC)), and precision-recall (50% average area-under-precision-recall (AUPR)). Additionally, the convolutional net's performance complements that of the random forest, showing a lower propensity for Type-I errors (false-positives) while the random forest makes fewer Type-II errors (false-negatives). Lastly, we are able to achieve over 90% average top-3 accuracy across all of our models. Together, these results indicate that there are predictive, disease-specific signatures across microbiomes that can be used for diagnostic purposes.
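Top-3 accuracy, as reported above, counts a prediction as correct when the true class is among the three classes the model scores highest, which suits a screening setting with many candidate diseases. The following is a minimal sketch of the metric; the function name and the dictionary-based score format are assumptions for illustration and do not come from the authors' code.

```python
def top_k_accuracy(score_rows, true_labels, k=3):
    """Fraction of samples whose true label is among the k highest-scored classes.

    score_rows:  one dict per sample, mapping class label to predicted score
    true_labels: the correct class label for each sample
    """
    if len(score_rows) != len(true_labels):
        raise ValueError("score_rows and true_labels must have equal length")
    hits = 0
    for scores, truth in zip(score_rows, true_labels):
        # Rank the class labels from highest to lowest predicted score.
        ranked = sorted(scores, key=scores.get, reverse=True)
        if truth in ranked[:k]:
            hits += 1
    return hits / len(true_labels)
```

Setting `k=1` recovers ordinary accuracy, so the gap between the 75% accuracy and the top-3 figure above indicates how often the correct diagnosis is ranked second or third.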
42
Artificial Intelligence for Enhancing Clinical Medicine
LitGen: Genetic Literature Recommendation Guided by Human Explanations
Allen Nie1, Arturo L. Pineda1, Matt W. Wright1, Hannah Wand1, Bryan Wulf1, Helio A. Costa1, Ronak Y. Patel2, Carlos D. Bustamante1, James Zou1
1Stanford University, 2Baylor College of Medicine
Allen Nie
As genetic sequencing costs decrease, the lack of clinical interpretation of variants has become the bottleneck in using genetics data. A major rate limiting step in clinical interpretation is the manual curation of evidence in the genetic literature by highly trained biocurators. What makes curation particularly time-consuming is that the curator needs to identify papers that study variant pathogenicity using different types of approaches and evidence (e.g., biochemical assays or case control analysis). In collaboration with the Clinical Genomic Resource (ClinGen), the flagship NIH program for clinical curation, we propose the first machine learning system, LitGen, that can retrieve papers for a particular variant and filter them by specific evidence types used by curators to assess for pathogenicity. LitGen uses semi-supervised deep learning to predict the type of evidence provided by each paper. It is trained on papers annotated by ClinGen curators and systematically evaluated on new test data collected by ClinGen. LitGen further leverages rich human explanations and unlabeled data to gain 7.9%-12.6% relative performance improvement over models learned only on the annotated papers. It is a useful framework to improve clinical variant curation.
43
Artificial Intelligence for Enhancing Clinical Medicine
Multilevel Self-Attention Model and its Use on Medical Risk Prediction
Xianlong Zeng1,2, Yunyi Feng1,2, Soheil Moosavinasab2, Deborah Lin2, Simon Lin2, Chang Liu1
1School of Electrical Engineering and Computer Science, Ohio University, Athens, OH, USA; 2The Research Institute at Nationwide Children’s Hospital, Columbus, OH, USA
Xianlong Zeng
Various deep learning models have been developed for different healthcare predictive tasks using Electronic Health Records and have shown promising performance. In these models, medical codes are often aggregated into a visit representation without considering their heterogeneity, e.g., the same diagnosis might imply different healthcare concerns with different procedures or medications. The visits are then often fed sequentially into deep learning models, such as recurrent neural networks, without considering the irregular temporal information and dependencies among visits. To address these limitations, we developed a Multilevel Self-Attention Model (MSAM) that can capture the underlying relationships between medical codes and between medical visits. We compared MSAM with various baseline models on two predictive tasks, i.e., future disease prediction and future medical cost prediction, with two large datasets, i.e., MIMIC-3 and PFK. In the experiments, MSAM consistently outperformed baseline models. Additionally, for future medical cost prediction, we used disease prediction as an auxiliary task, which not only guides the model to achieve a stronger and more stable financial prediction, but also allows managed care organizations to provide better care coordination.
44
Artificial Intelligence for Enhancing Clinical Medicine
Identifying Transitional High Cost Users from Unstructured Patient Profiles Written by Primary Care Physicians
Haoran Zhang1,2,3, Elisa Candido3, Andrew S. Wilton3, Raquel Duchen3, Liisa Jaakkimainen3, Walter Wodchis3,4,5, Quaid Morris1,2,6,7
1Department of Computer Science, University of Toronto; 2Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada; 3ICES, Toronto, Ontario, Canada; 4Institute of Health Policy, Management, and Evaluation, University of Toronto; 5Institute for Better Health, Trillium Health Partners, Mississauga, Ontario, Canada; 6Terrence Donnelly Center for Cellular and Biomolecular Research, University of Toronto; 7Department of Molecular Genetics, University of Toronto
Haoran Zhang
Identification and subsequent intervention of patients at risk of becoming High Cost Users (HCUs) presents the opportunity to improve outcomes while also providing significant savings for the healthcare system. In this paper, the 2016 HCU status of patients was predicted using free-form text data from the 2015 cumulative patient profiles within the electronic medical records of family care practices in Ontario. These unstructured notes make substantial use of domain-specific spellings and abbreviations; we show that word embeddings derived from the same context provide more informative features than pre-trained ones based on Wikipedia, MIMIC, and PubMed. We further demonstrate that a model using features derived from aggregated word embeddings (EmbEncode) provides a significant performance improvement over the bag-of-words representation (82.48±0.35% versus 81.85±0.36% held-out AUROC, p=3.2E-4), using far fewer input features (5,492 versus 214,750) and fewer non-zero coefficients (1,177 versus 4,284). The future HCUs of greatest interest are the transitional ones who are not already HCUs, because they provide the greatest scope for interventions. Predicting these new HCUs is challenging because most HCUs recur. We show that removing recurrent HCUs from the training set improves the ability of EmbEncode to predict new HCUs, while only slightly decreasing its ability to predict recurrent ones.
45
Artificial Intelligence for Enhancing Clinical Medicine
Obtaining dual-energy computed tomography (CT) information from a single-energy CT image for quantitative imaging analysis of living subjects by using deep learning
Wei Zhao1, Tianling Lv2, Rena Lee3, Yang Chen2, Lei Xing1
1Stanford University, 2Southeast University, 3Ewha Womans University
Lei Xing
Computed tomography (CT) is a fundamental imaging modality to generate cross-sectional views of internal anatomy in a living subject or interrogate material composition of an object, and it has been routinely used in clinical applications and nondestructive testing. In a standard CT image, pixels having the same Hounsfield Units (HU) can correspond to different materials, and it is therefore challenging to differentiate and quantify materials. Dual-energy CT (DECT) is desirable to differentiate multiple materials, but the costly DECT scanners are not as widely available as single-energy CT (SECT) scanners. Recent advancement in deep learning provides an enabling tool to map images between different modalities with incorporated prior knowledge. Here we develop a deep learning approach to perform DECT imaging by using standard SECT data. The end point of the approach is a model capable of providing the high-energy CT image for a given input low-energy CT image. The feasibility of the deep learning-based DECT imaging method using SECT data is demonstrated using contrast-enhanced DECT images and evaluated using clinically relevant indexes. This work opens new opportunities for numerous DECT clinical applications with standard SECT data and may enable significantly simplified hardware design, scanning dose, and image cost reduction for future DECT systems.
46
INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS
PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS
47
Intrinsically Disordered Proteins (IDPs) and Their Functions
Many-to-one binding by intrinsically disordered protein regions
Wei-Lun Alterovitz1*, Eshel Faraggi1,2,3*, Christopher J. Oldfield1, Jingwei Meng1, Bin Xue1, Fei Huang1, Pedro Romero1, Andrzej Kloczkowski2, Vladimir N. Uversky1, A. Keith Dunker1
1Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, 410 W. 10th St, HS 5000, Indianapolis, IN 46202, USA ([email protected]); 2Battelle Center for Mathematical Medicine, and the Nationwide Children’s Hospital, Department of Pediatrics, The Ohio State University, Columbus, OH 43210, USA; 3Research and Information Systems, LLC, 1620 E. 72nd St. Indianapolis, IN 46240 USA
*Contributed equally ([email protected], [email protected])
Keith Dunker
Disordered binding regions (DBRs), which are embedded within intrinsically disordered proteins or regions (IDPs or IDRs), enable IDPs or IDRs to mediate multiple protein-protein interactions. DBR-protein complexes were collected from the Protein Data Bank for which two or more DBRs having different amino acid sequences bind to the same (100% sequence identical) globular protein partner, a type of interaction herein called many-to-one binding. Two distinct binding profiles were identified: independent and overlapping. For the overlapping binding profiles, the distinct DBRs interact by means of almost identical binding sites (herein called “similar”), or the binding sites contain both common and divergent interaction residues (herein called “intersecting”). Further analysis of the sequence and structural differences among these three groups indicates how IDP flexibility allows different segments to adjust to similar, intersecting, and independent binding pockets.
48
MUTATIONAL SIGNATURES
PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS
49
Mutational Signatures
Impact of mutational signatures on microRNA and their response elements
Eirini Stamoulakatou1, Pietro Pinoli1, Stefano Ceri1, Rosario Piro2
1Politecnico di Milano, 2Freie Universität Berlin
Eirini Stamoulakatou
MicroRNAs are a class of small non-coding RNA molecules with great importance for regulating a large number of diverse biological processes in health and disease, mostly by binding to complementary microRNA response elements (MREs) on protein-coding messenger RNAs and other non-coding RNAs and subsequently inducing their degradation. A growing body of evidence indicates that the dysregulation of certain microRNAs may either drive or suppress oncogenesis. The seed region of a microRNA is of crucial importance for its target recognition. Mutations in these seed regions may disrupt the binding of microRNAs to their target genes. In this study, we investigate the theoretical impact of cancer-associated mutagenic processes and their mutational signatures on microRNA seeds and their MREs. To our knowledge, this is the first study that provides a probabilistic framework for microRNA and MRE sequence alteration analysis based on mutational signatures and computationally assesses the disruptive impact of mutational signatures on human microRNA–target interactions.
50
Mutational Signatures
Genome Gerrymandering: optimal division of the genome into regions with cancer type specific differences in mutation rates
Adamo Young, Jacob Chmura, Yoonsik Park, Quaid Morris, Gurnit Atwal
University of Toronto
Adamo Young
The activity of mutational processes differs across the genome, and is influenced by chromatin state and spatial genome organization. At the scale of one megabase-pair (Mb), regional mutation density correlates strongly with chromatin features, and mutation density at this scale can be used to accurately identify cancer type. Here, we explore the relationship between genomic region and mutation rate by developing an information theory driven, dynamic programming algorithm for dividing the genome into regions with differing relative mutation rates between cancer types. Our algorithm improves mutual information when compared to the naive approach, effectively reducing the average number of mutations required to identify cancer type. Our approach provides an efficient method for associating regional mutation density with mutation labels, and has future applications in exploring the role of somatic mutations in a number of diseases.
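The abstract does not spell out the algorithm, so the following is only a sketch of one way a dynamic program of this kind could be set up, under stated assumptions: the genome is pre-binned, each bin carries mutation counts per cancer type, and the objective is the mutual information between segment and cancer type. That objective is a sum of per-segment terms, so the best split into `k` contiguous segments can be found by dynamic programming. The function names, the toy counts, and the objective itself are illustrative assumptions, not the authors' implementation.

```python
import math

def segment_score(counts, total, type_totals):
    """Mutual-information contribution of one segment, given its per-type counts."""
    seg_total = sum(counts)
    score = 0.0
    for c, n in enumerate(counts):
        if n > 0:
            p_joint = n / total
            p_seg = seg_total / total
            p_type = type_totals[c] / total
            score += p_joint * math.log(p_joint / (p_seg * p_type))
    return score

def best_segmentation(bins, k):
    """Split `bins` (one list of per-type counts per bin) into k contiguous
    segments that maximize the mutual information between segment and type."""
    n, n_types = len(bins), len(bins[0])
    # prefix[i][c] = number of type-c mutations in bins[0:i]
    prefix = [[0] * n_types]
    for row in bins:
        prefix.append([p + x for p, x in zip(prefix[-1], row)])
    type_totals = prefix[n]
    total = sum(type_totals)

    def score(i, j):
        counts = [prefix[j][c] - prefix[i][c] for c in range(n_types)]
        return segment_score(counts, total, type_totals)

    neg_inf = float("-inf")
    # best[s][j] = best objective using s segments to cover bins[0:j]
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for s in range(1, k + 1):
        for j in range(s, n + 1):
            for i in range(s - 1, j):
                cand = best[s - 1][i] + score(i, j)
                if cand > best[s][j]:
                    best[s][j], back[s][j] = cand, i

    # Walk the back-pointers to recover the (start, end) bin ranges.
    segments, j = [], n
    for s in range(k, 0, -1):
        i = back[s][j]
        segments.append((i, j))
        j = i
    segments.reverse()
    return best[k][n], segments

# Toy data: 6 bins, 2 hypothetical cancer types (counts are made up).
bins = [[9, 1], [8, 2], [1, 9], [2, 8], [5, 5], [5, 5]]
mi, segments = best_segmentation(bins, k=3)
```

With these toy counts the boundaries fall where the relative rates of the two types change. The triple loop costs O(k·n²) segment evaluations, which the prefix sums keep cheap.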
51
PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK
PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS
52
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Learning a Latent Space of Highly Multidimensional Cancer Data
Benjamin Kompa1, Beau Coker2
1Harvard Medical School, 2Harvard School of Public Health
Benjamin Kompa
We introduce a Unified Disentanglement Network (UFDN) trained on The Cancer Genome Atlas (TCGA), which we refer to as UFDN-TCGA. We demonstrate that UFDN-TCGA learns a biologically relevant, low-dimensional latent space of high-dimensional gene expression data by applying our network to two classification tasks of cancer status and cancer type. UFDN-TCGA performs comparably to random forest methods. The UFDN allows for continuous, partial interpolation between distinct cancer types. Furthermore, we perform an analysis of differentially expressed genes between skin cutaneous melanoma (SKCM) samples and the same samples interpolated into glioblastoma (GBM). We demonstrate that our interpolations consist of relevant metagenes that recapitulate known glioblastoma mechanisms.
53
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Scaling structural learning with NO-BEARS to infer causal transcriptome networks
Hao-Chih Lee1,3, Matteo Danieletto1,2,3, Riccardo Miotto1,2,3, Sarah T. Cherng1,3, Joel T. Dudley1,2,3
1Institute for Next Generation Healthcare, 2Hasso Plattner Institute for Digital Health, 3Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10065, USA
Hao-Chih Lee
Constructing gene regulatory networks is a critical step in revealing disease mechanisms from transcriptomic data. In this work, we present NO-BEARS, a novel algorithm for estimating gene regulatory networks. The NO-BEARS algorithm is built on the basis of the NO-TEARS algorithm with two improvements. First, we propose a new constraint and its fast approximation to reduce the computational cost of the NO-TEARS algorithm. Next, we introduce a polynomial regression loss to handle non-linearity in gene expressions. Our implementation utilizes modern GPU computation that can decrease the time of hours-long CPU computation to seconds. Using synthetic data, we demonstrate improved performance, both in processing time and accuracy, on inferring gene regulatory networks from gene expression data.
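For context on the constraint that NO-BEARS replaces: the original NO-TEARS formulation enforces acyclicity on a weighted adjacency matrix W through the smooth function h(W) = tr(exp(W ∘ W)) − d, which is zero exactly when W describes a directed acyclic graph. The sketch below illustrates only that baseline constraint; the new NO-BEARS constraint and its fast approximation are not described in the abstract and are not reproduced here. The function name and the example matrices are assumptions for illustration.

```python
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(W):
    """NO-TEARS acyclicity measure h(W) = tr(exp(W * W)) - d.

    `W * W` is the elementwise (Hadamard) square, which makes every entry
    non-negative so that cycles cannot cancel out in the trace.
    """
    W = np.asarray(W, dtype=float)
    d = W.shape[0]
    return float(np.trace(expm(W * W)) - d)

# A 3-node chain 0 -> 1 -> 2 (acyclic); weights are made up.
W_dag = np.array([
    [0.0, 1.5, 0.0],
    [0.0, 0.0, -2.0],
    [0.0, 0.0, 0.0],
])

# The same graph with an extra edge 2 -> 0, which closes a cycle.
W_cyclic = W_dag.copy()
W_cyclic[2, 0] = 0.5

h_dag = notears_acyclicity(W_dag)
h_cyclic = notears_acyclicity(W_cyclic)
```

Here `h_dag` is zero up to floating-point error, while `h_cyclic` is clearly positive. The matrix exponential is the expensive part of this constraint for large networks, which is consistent with the abstract's motivation for a cheaper alternative.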
54
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
PathFlowAI: A High-Throughput Workflow for Preprocessing, Deep Learning and Interpretation in Digital Pathology
Joshua J. Levy1, Lucas A. Salas1, Brock C. Christensen1, Aravindhan Sriharan2, Louis J. Vaickus2
1Geisel School of Medicine at Dartmouth, 2Dartmouth Hitchcock Medical Center
Joshua Levy
The diagnosis of disease often requires analysis of a biopsy. Many diagnoses depend not only on the presence of certain features but on their location within the tissue. Recently, a number of deep learning diagnostic aids have been developed to classify digitized biopsy slides. Clinical workflows often involve processing of more than 500 slides per day. However, clinical use of deep learning diagnostic aids would require a preprocessing workflow that is cost-effective, flexible, scalable, rapid, interpretable, and transparent. Here, we present such a workflow, optimized using Dask and mixed precision training via APEX, capable of handling any patch-level or slide-level classification and prediction problem. The workflow uses a flexible and fast preprocessing and deep learning analytics pipeline, incorporates model interpretation, and has a highly storage-efficient audit trail. We demonstrate the utility of this package on the analysis of a prototypical anatomic pathology specimen, liver biopsies for evaluation of hepatitis from a prospective cohort. The preliminary data indicate that PathFlowAI may become a cost-effective and time-efficient tool for clinical use of Artificial Intelligence (AI) algorithms.
55
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Improving survival prediction using a novel feature selection and feature reduction framework based on the integration of clinical and molecular data*
Lisa Neums, Richard Meier, Devin C. Koestler, Jeffrey A. Thompson
Department of Biostatistics and Data Science, University of Kansas Medical Center, and University of Kansas Cancer Center
Lisa Neums
The accurate prediction of a cancer patient’s risk of progression or death can guide clinicians in the selection of treatment and help patients in planning personal affairs. Predictive models based on patient-level data represent a tool for determining risk. Ideally, predictive models will use multiple sources of data (e.g., clinical, demographic, molecular, etc.). However, there are many challenges associated with data integration, such as overfitting and redundant features. In this paper we aim to address those challenges through the development of a novel feature selection and feature reduction framework that can handle correlated data. Our method begins by computing a survival distance score for gene expression, which, in combination with a score for clinical independence, results in the selection of highly predictive genes that are non-redundant with clinical features. The survival distance score is a measure of variation of gene expression over time, weighted by the variance of the gene expression over all patients. Selected genes, in combination with clinical data, are used to build a predictive model for survival. We benchmark our approach against commonly used methods, namely lasso- as well as ridge-penalized Cox proportional hazards models, using three publicly available cancer data sets: kidney cancer (521 samples), lung cancer (454 samples) and bladder cancer (335 samples). Across all data sets, our approach built on the training set outperformed the clinical data alone in the test set in terms of predictive power, with a C-index of 0.773 vs 0.755 for kidney cancer, 0.695 vs 0.664 for lung cancer and 0.648 vs 0.636 for bladder cancer. Further, we were able to show increased predictive performance of our method compared to lasso-penalized models fit to both gene expression and clinical data, which had a C-index of 0.767, 0.677, and 0.645, as well as increased or comparable predictive power compared to ridge models, which had a C-index of 0.773, 0.668 and 0.650 for the kidney, lung, and bladder cancer data sets, respectively. Therefore, our score for clinical independence improves prognostic performance as compared to modeling approaches that do not consider combining non-redundant data. Future work will concentrate on optimizing the survival distance score in order to achieve improved results for all types of cancer.
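The C-index values quoted above measure how often a model ranks pairs of patients in the correct order of survival. As a rough illustration of the metric only, here is a minimal sketch of Harrell's concordance index in plain Python. It is not the authors' evaluation code, and the function name and toy data are hypothetical; production analyses would normally use an established survival-analysis library.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event (e.g., death) was observed, 0 if censored
    risks  : model risk scores, where a higher score means a shorter expected survival
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i had an observed event
            # strictly before subject j's follow-up time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # correctly ordered pair
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy cohort of four patients (all values are made up).
times = [5, 8, 12, 20]
events = [1, 0, 1, 1]
risks = [0.9, 0.3, 0.1, 0.2]

c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a sense of scale for differences such as 0.773 versus 0.755.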
56
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Bayesian semi-nonnegative matrix tri-factorization to identify pathways associated with cancer phenotypes
Sunho Park1, Nabhonil Kar1, Jae-Ho Cheong2, Tae Hyun Hwang1
1Cleveland Clinic, 2Yonsei University College of Medicine
Sunho Park
Accurate identification of pathways associated with cancer phenotypes (e.g., cancer subtypes and treatment outcome) could lead to discovering reliable prognostic and/or predictive biomarkers for better patient stratification and treatment guidance. In our previous work, we have shown that non-negative matrix tri-factorization (NMTF) can be successfully applied to identify pathways associated with specific cancer types or disease classes as a prognostic and predictive biomarker. However, one key limitation of non-negative factorization methods, including various non-negative bi-factorization methods, is their limited ability to handle negative input data. For example, many molecular data that consist of real values containing both positive and negative values (e.g., normalized/log-transformed gene expression data where a negative value represents down-regulated expression of genes) are not suitable input for these algorithms. In addition, most previous methods provide just a single point estimate and hence cannot deal with uncertainty effectively. To address these limitations, we propose a Bayesian semi-nonnegative matrix tri-factorization method to identify pathways associated with cancer phenotypes from a real-valued input matrix, e.g., gene expression values. Motivated by semi-nonnegative factorization, we allow one of the factor matrices, the centroid matrix, to be real-valued so that each centroid can express either the up- or down-regulation of the member genes in a pathway. In addition, we place structured spike-and-slab priors (which are encoded with the pathways and a gene-gene interaction (GGI) network) on the centroid matrix so that even a set of genes that is not initially contained in the pathways (due to the incompleteness of the current pathway database) can be involved in the factorization in a stochastic way, specifically if those genes are connected to the member genes of the pathways on the GGI network. We also present update rules for the posterior distributions in the framework of variational inference. As a full Bayesian method, our proposed method has several advantages over the current NMTF methods, which are demonstrated using synthetic datasets in experiments. Using The Cancer Genome Atlas (TCGA) gastric cancer and metastatic gastric cancer immunotherapy clinical-trial datasets, we show that our method could identify biologically and clinically relevant pathways associated with the molecular subtypes and immunotherapy response, respectively. Finally, we show that those pathways identified by the proposed method could be used as prognostic biomarkers to stratify patients with distinct survival outcome in two independent validation datasets. Additional information and codes can be found at https://github.com/parks-cs-ccf/BayesianSNMTF.
57
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Tree-Weighting for Multi-Study Ensemble Learners
Maya Ramchandran1, Prasad Patil1,2, Giovanni Parmigiani1,2
1Department of Biostatistics, Harvard T.H. Chan School of Public Health; 2Department of Data Sciences, Dana-Farber Cancer Institute
Maya Ramchandran
Multi-study learning uses multiple training studies, separately trains classifiers on each, and forms an ensemble with weights rewarding members with better cross-study prediction ability. This article considers novel weighting approaches for constructing tree-based ensemble learners in this setting. Using Random Forests as a single-study learner, we compare weighting each forest to form the ensemble, to extracting the individual trees trained by each Random Forest and weighting them directly. We find that incorporating multiple layers of ensembling in the training process by weighting trees increases the robustness of the resulting predictor. Furthermore, we explore how ensembling weights correspond to tree structure, to shed light on the features that determine whether weighting trees directly is advantageous. Finally, we apply our approach to genomic datasets and show that weighting trees improves upon the basic multi-study learning paradigm. Code and supplementary material are available at https://github.com/m-ramchandran/tree-weighting.
58
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
PTRExplorer: An approach to identify and explore Post Transcriptional Regulatory mechanisms using proteogenomics
Arunima Srivastava1, Michael Sharpnack1, Kun Huang2, Parag Mallick3, Raghu Machiraju1
1The Ohio State University, 2Indiana University School of Medicine, 3Stanford University
Arunima Srivastava
Integration of transcriptomic and proteomic data should reveal multi-layered regulatory processes governing cancer cell behaviors. Traditional correlation-based analyses have demonstrated limited ability to identify the post-transcriptional regulatory (PTR) processes that drive the non-linear relationship between transcript and protein abundances. In this work, we ideate an integrative approach to explore the variety of post-transcriptional mechanisms that dictate relationships between genes and corresponding proteins. The proposed workflow utilizes the intuitive technique of scatterplot diagnostics, or scagnostics, to characterize and examine the diverse scatterplots built from transcript and protein abundances in a proteogenomic experiment. The workflow includes representing gene-protein relationships as scatterplots, clustering on geometric scagnostic features of these scatterplots, and finally identifying and grouping the potential gene-protein relationships according to their disposition to various PTR mechanisms. Our study verifies the efficacy of the implemented approach to excavate possible regulatory mechanisms by utilizing comprehensive tests on a synthetic dataset. We also propose a variety of 2D pattern-specific downstream analysis methodologies, such as mixture modeling and mapping miRNA post-transcriptional effects, to explore each mechanism further. This work suggests that the proposed methodology has the potential for discovering and categorizing post-transcriptional regulatory mechanisms, manifesting in proteogenomic trends. These trends subsequently provide evidence for cancer specificity, miRNA targeting, and identification of regulation impacted by biological functionality and different types of degradation.
59
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Network Representation of Large-Scale Heterogeneous RNA Sequences with Integration of Diverse Multi-omics, Interactions, and Annotations Data
Nhat Tran, Jean Gao
The University of Texas at Arlington
Jean Gao
Long non-coding RNA (lncRNA), microRNA, and messenger RNA enable key regulations of various biological processes through a variety of diverse interaction mechanisms. Identifying the interactions and cross-talk between these heterogeneous RNA classes is essential in order to uncover the functional role of individual RNA transcripts, especially for unannotated and sparsely discovered RNA sequences with no known interactions. Recently, sequence-based deep learning and network embedding methods have been gaining traction as high-performing and flexible approaches that can either predict RNA-RNA interactions from sequence or infer missing interactions from patterns that may exist in the network topology. However, most of the current methods have several limitations, e.g., the inability to perform inductive predictions, to distinguish the directionality of interactions, or to integrate various sequence, interaction, expression, and genomic annotation datasets. We propose a novel deep learning framework, rna2rna, which learns from RNA sequences to produce a low-dimensional embedding that preserves proximities in both the interaction topology and the functional affinity topology. In this proposed embedding space, the two-part "source and target contexts" capture the receptive fields of each RNA transcript to encapsulate heterogeneous cross-talk interactions between lncRNAs and microRNAs. The proximity between RNAs in this embedding space also uncovers the second-order relationships that allow for accurate inference of novel directed interactions or functional similarities between any two RNA sequences. In a prospective evaluation, our method exhibits superior performance compared to state-of-the-art approaches at predicting missing interactions from several RNA-RNA interaction databases. Additional results suggest that our proposed framework can capture a manifold for heterogeneous RNA sequences to discover novel functional annotations.
60
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Hadoop and PySpark for reproducibility and scalability of genomic sequencing studies
Nicholas R. Wheeler1, Penelope Benchek1, Brian W. Kunkle2, Kara L. Hamilton-Nelson2, Mike Warfe1, Jeremy R. Fondran1, Jonathan L. Haines1, William S. Bush1
1Case Western Reserve University, 2University of Miami
William Bush
Modern genomic studies are rapidly growing in scale, and the analytical approaches used to analyze genomic data are increasing in complexity. Genomic data management poses logistic and computational challenges, and analyses are increasingly reliant on genomic annotation resources that create their own data management and versioning issues. As a result, genomic datasets are increasingly handled in ways that limit the rigor and reproducibility of many analyses. In this work, we examine the use of the Spark infrastructure for the management, access, and analysis of genomic data in comparison to traditional genomic workflows on typical cluster environments. We validate the framework by reproducing previously published results from the Alzheimer’s Disease Sequencing Project. Using the framework and analyses designed using Jupyter notebooks, Spark provides improved workflows, reduces user-driven data partitioning, and enhances the portability and reproducibility of distributed analyses required for large-scale genomic studies.
61
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
CERENKOV3: Clustering and molecular network-derived features improve computational prediction of functional noncoding SNPs
Yao Yao, Stephen A. Ramsey
Oregon State University
Yao Yao
Identification of causal noncoding single nucleotide polymorphisms (SNPs) is important for maximizing the knowledge dividend from human genome-wide association studies (GWAS). Recently, diverse machine learning-based methods have been used for functional SNP identification; however, this task remains a fundamental challenge in computational biology. We report CERENKOV3, a machine learning pipeline that leverages clustering-derived and molecular network-derived features to improve prediction accuracy of regulatory SNPs (rSNPs) in the context of post-GWAS analysis. The clustering-derived feature, locus size (number of SNPs in the locus), derives from our locus partitioning procedure and represents the sizes of clusters based on SNP locations. We generated two molecular network-derived features from representation learning on a network representing SNP-gene and gene-gene relations. Based on empirical studies using a ground-truth SNP dataset, CERENKOV3 significantly improves rSNP recognition performance in AUPRC, AUROC, and AVGRANK (a locus-wise rank-based measure of classification accuracy we previously proposed).
62
PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE
PROCEEDINGS PAPERS WITH POSTER PRESENTATIONS
63
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
AnomiGAN: Generative Adversarial Networks for Anonymizing Private Medical Data
Ho Bae, Dahuin Jung, Hyun-Soo Choi, Sungroh Yoon
Seoul National University
Ho Bae
Typical personal medical data contains sensitive information about individuals. Storing or sharing the personal medical data is thus often risky. For example, a short DNA sequence can provide information that can identify not only an individual, but also his or her relatives. Nonetheless, most countries and researchers agree on the necessity of collecting personal medical data. This stems from the fact that medical data, including genomic data, are an indispensable resource for further research and development regarding disease prevention and treatment. To prevent personal medical data from being misused, techniques to reliably preserve sensitive information should be developed for real world applications. In this paper, we propose a framework called anonymized generative adversarial networks (AnomiGAN), to preserve the privacy of personal medical data, while also maintaining high prediction performance. We compared our method to state-of-the-art techniques and observed that our method preserves the same level of privacy as differential privacy (DP) and provides better prediction results. We also observed that there is a trade-off between privacy and prediction results that depends on the degree of preservation of the original data. Here, we provide a mathematical overview of our proposed model and demonstrate its validation using UCI machine learning repository datasets in order to highlight its utility in practice. The code is available at https://github.com/hobae/AnomiGAN/
64
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Frequency of ClinVar pathogenic variants in chronic kidney disease patients surveyed for return of research results at a Cleveland public hospital
Dana C. Crawford1,2,3, John Lin1, Jessica N. Cooke Bailey1,2, Tyler Kinzy1, John R. Sedor4,5, John F. O'Toole5, William S. Bush1,2,3
1Cleveland Institute for Computational Biology, 2Departments of Population and Quantitative Health Sciences, and 3Genetics and Genome Sciences, Case Western Reserve University; 4Department of Physiology and Biophysics, Case Western Reserve University; and 5Department of Nephrology and Hypertension, Glickman Urology and Kidney and Lerner Research Institute, Cleveland Clinic
Dana Crawford
Return of results is not common in research settings as standards are not yet in place for what to return, how to return, and to whom. As a pioneer of large-scale return of research results, the Precision Medicine Initiative Cohort, now known as All of Us, plans to return pharmacogenomic results and variants of clinical significance to its participants starting late 2019. To better understand the local landscape of possibilities regarding return of research results, we assessed the frequency of pathogenic variants and APOL1 renal risk variants in a small diverse cohort of chronic kidney disease (CKD) patients ascertained from a public hospital in Cleveland, Ohio, genotyped on the Illumina Infinium MegaEX. Of the 23,720 ClinVar-designated variants directly assayed by the MegaEX, 8,355 (35%) had at least one alternate allele in the 130 participants genotyped. Of these, 18 ClinVar variants deemed pathogenic by multiple submitters with no conflicts in interpretation were distributed across 27 participants. The majority of these pathogenic ClinVar variants (14/18) were associated with autosomal recessive disorders. Of note were four African American carriers of TTR rs76992529 associated with amyloidogenic transthyretin amyloidosis, otherwise known as familial transthyretin amyloidosis (FTA). FTA, an autosomal dominant disorder with variable penetrance, is more common among African-descent populations compared with European-descent populations. Also common in this CKD population were APOL1 renal risk alleles G1 (rs73885319) and G2 (rs71785313), with 60% of the study population carrying at least one renal risk allele. Both pathogenic ClinVar variants and APOL1 renal risk alleles were distributed among participants who wanted actionable genetic results returned, wanted genetic results returned regardless of actionability, and wanted no results returned. Results from this local genetic study highlight challenges in which variants to report, how to interpret them, and the participant’s potential for follow-up, only some of the challenges in return of research results likely facing larger studies such as All of Us.
65
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Network-Based Matching of Patients and Targeted Therapies for Precision Oncology
Qingzhi Liu1, Min Jin Ha2, Rupam Bhattacharyya1, Lana Garmire3, Veerabhadran Baladandayuthapani1
1Department of Biostatistics, University of Michigan; 2Department of Biostatistics, The University of Texas MD Anderson Cancer Center; 3Department of Computational Medicine and Bioinformatics, University of Michigan
Qingzhi Liu
The extensive acquisition of high-throughput molecular profiling data across model systems (human tumors and cancer cell lines) and drug sensitivity data makes precision oncology possible, allowing clinicians to match the right drug to the right patient. Current supervised models for drug sensitivity prediction often use cell lines as exemplars of patient tumors and for model training. However, these models are limited in their ability to accurately predict drug sensitivity of individual cancer patients to a large set of drugs, given the paucity of patient drug sensitivity data used for testing and high variability across different drugs. To address these challenges, we developed a multilayer network-based approach to impute individual patients’ responses to a large set of drugs. This approach considers the triplet of patients, cell lines and drugs as one inter-connected holistic system. We first use the omics profiles to construct a patient-cell line network and determine best matching cell lines for patient tumors based on robust measures of network similarity. Subsequently, these results are used to impute the “missing link” between each individual patient and each drug, called Personalized Imputed Drug Sensitivity Score (PIDS-Score), which can be construed as a measure of the therapeutic potential of a drug or therapy. We applied our method to two subtypes of lung cancer patients, matched these patients with cancer cell lines derived from 19 tissue types based on their functional proteomics profiles, and computed their PIDS-Scores to 251 drugs and experimental compounds. We identified the best representative cell lines that conserve lung cancer biology and molecular targets. The PIDS-Score based top sensitive drugs for the entire patient cohort as well as individual patients are highly related to lung cancer in terms of their targets, and their PIDS-Scores are significantly associated with patient clinical outcomes. These findings provide evidence that our method is useful to narrow the scope of possible effective patient-drug matchings for implementing evidence-based personalized medicine strategies.
66
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Phenome-wide association studies on cardiovascular health and fatty acids considering phenotype quality control practices for epidemiological data
Kristin Passero1, Xi He1, Jiayan Zhou1, Bertram Mueller-Myhsok2,3,4, Marcus E. Kleber5, Winfried Maerz5,6,7, Molly A. Hall1
1Penn State; 2Max Planck Institute of Psychiatry; 3Munich Cluster of Systems Biology; 4University of Liverpool; 5Heidelberg University; 6SYNLAB Academy; 7Medical University of Graz
Kristin Passero
Phenome-wide association studies (PheWAS) allow agnostic investigation of common genetic variants in relation to a variety of phenotypes, but preserving the power of PheWAS requires careful phenotypic quality control (QC) procedures. While QC of genetic data is well-defined, no established QC practices exist for multi-phenotypic data. Manually imposing sample size restrictions, identifying variable types/distributions, and locating problems such as missing data or outliers is arduous in large, multivariate datasets. In this paper, we perform two PheWAS on epidemiological data and, utilizing the novel software CLARITE (CLeaning to Analysis: Reproducibility-based Interface for Traits and Exposures), showcase a transparent and replicable phenome QC pipeline which we believe is a necessity for the field. Using data from the Ludwigshafen Risk and Cardiovascular (LURIC) Health Study we ran two PheWAS, one on cardiac-related diseases and the other on polyunsaturated fatty acid levels. These phenotypes underwent a stringent quality control screen and were regressed on a genome-wide sample of single nucleotide polymorphisms (SNPs). Seven SNPs were significant in association with dihomo-γ-linolenic acid, of which five were within fatty acid desaturases FADS1 and FADS2. PheWAS is a useful tool to elucidate the genetic architecture of complex disease phenotypes within a single experimental framework. However, to reduce computational and multiple-comparisons burden, careful assessment of phenotype quality and removal of low-quality data is prudent. Herein we perform two PheWAS while applying a detailed phenotype QC process, for which we provide a replicable pipeline that is modifiable for application to other large datasets with heterogeneous phenotypes. As investigation of complex traits continues beyond traditional genome-wide association studies (GWAS), such QC considerations and tools such as CLARITE are crucial to the analysis of non-genetic big data such as clinical measurements, lifestyle habits, and polygenic traits.
67
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
aTEMPO: Pathway-Specific Temporal Anomalies for Precision Therapeutics
Christopher Michael Pietras, Liam Power, Donna K. Slonim
Tufts University
Christopher Pietras
Dynamic processes are inherently important in disease, and identifying disease-related disruptions of normal dynamic processes can provide information about individual patients. We have previously characterized individuals' disease states via pathway-based anomalies in expression data, and we have identified disease-correlated disruption of predictable dynamic patterns by modeling a virtual time series in static data. Here we combine the two approaches, using an anomaly detection model and virtual time series to identify anomalous temporal processes in specific disease states. We demonstrate that this approach can informatively characterize individual patients, suggesting personalized therapeutic approaches.
68
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Feature Selection and Dimension Reduction of Social Autism Data
Peter Washington1, Kelley Marie Paskov1, Haik Kalantarian1, Nathaniel Stockham1, Catalin Voss1, Aaron Kline1, Ritik Patnaik2, Brianna Chrisman1, Maya Varma1, Qandeel Tariq1, Kaitlyn Dunlap1, Jessey Schwartz1, Nick Haber1, Dennis P. Wall1
1Stanford University, 2Massachusetts Institute of Technology
Peter Washington
Autism Spectrum Disorder (ASD) is a complex neuropsychiatric condition with a highly heterogeneous phenotype. Following the work of Duda et al., which uses a reduced feature set from the Social Responsiveness Scale, Second Edition (SRS) to distinguish ASD from ADHD, we performed item-level question selection on answers to the SRS to determine whether ASD can be distinguished from non-ASD using a similarly small subset of questions. To explore feature redundancies between the SRS questions, we performed filter, wrapper, and embedded feature selection analyses. To explore the linearity of the SRS-related ASD phenotype, we then compressed the 65-question SRS into low-dimension representations using PCA, t-SNE, and a denoising autoencoder. We measured the performance of a multi-layer perceptron (MLP) classifier with the top-ranking questions as input. Classification using only the top-rated question resulted in an AUC of over 92% for SRS-derived diagnoses and an AUC of over 83% for dataset-specific diagnoses. High redundancy of features has implications towards replacing the social behaviors that are targeted in behavioral diagnostics and interventions, where digital quantification of certain features may be obfuscated due to privacy concerns. We similarly evaluated the performance of an MLP classifier trained on the low-dimension representations of the SRS, finding that the denoising autoencoder achieved slightly higher performance than the PCA and t-SNE representations.
69
ARTIFICIAL INTELLIGENCE FOR ENHANCING CLINICAL MEDICINE
POSTER PRESENTATIONS
70
Artificial Intelligence for Enhancing Clinical Medicine
Prioritizing Copy Number Variants using Phenotype and Gene Functional Similarity
Azza Althagafi, Jun Chen, Robert Hoehndorf
Computer, Electrical & Mathematical Science and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), 4700 KAUST, 23955-6900, Thuwal, Kingdom of Saudi Arabia
Azza Althagafi
There are many types of genetic variation in the human genome, ranging from large chromosome anomalies to Single Nucleotide Variants (SNVs). It is becoming necessary to develop methods for distinguishing disease-causing variants from the large number of neutral genetic variants in an individual. This problem is also relevant to Copy Number Variants (CNVs), which are a class of genetic variation where large segments of the genome differ in copy number amongst various individuals. Over the past several years, much progress has been made in the area of CNV detection and in understanding the role of CNVs in human diseases. We now understand that CNVs account for much of human variability. Correspondingly, there have been several methods introduced to find disease-associated genes and SNVs. Different methods have been developed for predicting and prioritizing pathogenicity of SNVs found within a genome. Constructing similar methods for CNVs is challenging due to the heterogeneity in variant size, type and the possibility of multiple genes being affected by large CNVs. CNV impact prediction methods should consider these factors in order to robustly prioritize pathogenic variants. We have built a method that incorporates biological background knowledge about the relation between phenotypes resulting from a loss of function in mouse genes, gene functions as described using the Gene Ontology (GO), as well as the anatomical site of gene expression along with a score that predicts the pathogenicity of CNVs (SVScore). We use this information to build a machine learning model that ranks CNVs based on their predicted pathogenicity and the relation between genes affected by the CNV and the phenotype we observe in affected individuals. Additionally, our approach considers several genomic features of each CNV, such as the length of the coding sequence overlapping with the CNV, haploinsufficiency and triplosensitivity scores to measure the dosage-sensitivity for genes/regions, and GC content. Our results show that incorporating this information leads to improvement over a baseline model which uses only similarity scores between gene-phenotype associations and disease-associated phenotypes, as well as improvement over using only pathogenicity prediction methods for CNVs. Our method achieves an F-score of 80.85%, with 82.05% precision and 79.67% recall in our evaluation set. The results demonstrate that incorporating phenotype, functional, and gene expression information may be utilized to identify causative CNVs. Future work is required to evaluate and improve our model using patient-derived WGS data.
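As a reading aid for the reported metrics, the sketch below recomputes an F-score from the stated precision and recall. It assumes the reported F-score is the balanced F1 (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is illustrative and not taken from the authors' code.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract, expressed as fractions.
reported_precision = 0.8205
reported_recall = 0.7967

f1 = f_score(reported_precision, reported_recall)
print(f"F1 from reported precision and recall: {f1:.4f}")
```

Under that assumption the result is roughly 0.808, close to the reported 80.85%; the small gap is of the size that rounding the published precision and recall would produce.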
71
Artificial Intelligence for Enhancing Clinical Medicine
Inferring the Reward Functions that Guide Cancer Progression
John Kalantari1, Heidi Nelson2, Nicholas Chia3
1Microbiome Program, Center for Individualized Medicine, Mayo Clinic, Rochester, MN, USA; 2Colon and Rectal Surgery, Mayo Clinic, Rochester, MN, USA; 3Division of Surgical Research, Department of Surgery, Mayo Clinic, Rochester, MN, USA
John Kalantari
Cancer can occur in patients with different genetic backgrounds via a multi-step evolutionary process, i.e., one driven by modification and selection, that can accumulate different genetic alterations. Despite these differences, many cancer subtypes are unified by similar mechanisms or types of genetic changes. In other words, there are multiple etiological paths tied together by specific events that share commonality in their causal mechanism. Understanding these common mechanisms will enable the development of better therapies and preventative measures. It will also enable improved prediction of recurrence and metastatic advancement of cancer, directly impacting the 606,880 annual cancer deaths in the United States alone. Our work is built upon the central proposition that the Markov Decision Process (MDP) can better represent the process by which cancer arises and progresses. More specifically, by encoding a cancer cell's complex behavior as an MDP, we seek to model the series of genetic changes, or evolutionary trajectory, that leads to cancer as an optimal decision process. We posit that using an Inverse Reinforcement Learning (IRL) approach will enable us to reverse engineer an optimal policy and reward function based on a set of expert demonstrations extracted from the DNA of patient tumors. The inferred reward function and optimal policy can subsequently be used to extrapolate the evolutionary trajectory of any tumor. We introduce a novel data-agnostic artificial intelligence framework which can infer reward functions describing the causal mechanisms that best explain the observed behavior of an 'optimally-behaving agent', the cancer cell. Using multi-omic data from 27 colorectal cancer (CRC) patients as proof-of-principle, we show that IRL provides a systematic and scalable approach to formally stating and solving the problem of cancer evolution. By providing a lineage path (i.e., sequences of alterations) obtained via subclonal reconstruction for each tumor, we are able to reduce this complex problem to the recovery of an associated reinforcement learning reward function. These reward functions have the potential to model unknown molecular mechanisms driving intratumor heterogeneity and to elucidate cancer etiologies.
72
Artificial Intelligence for Enhancing Clinical Medicine
Predicting disease-associated mutation of metal-binding sites in proteins using a deep learning approach
Mohamad Koohi-Moghadam, Haibo Wang, Yuchuan Wang, Xinming Yang, Hongyan Li, Junwen Wang, Hongzhe Sun
Department of Chemistry, The University of Hong Kong, Hong Kong, China; Department of Health Sciences, Mayo Clinic, Scottsdale, AZ, USA; Department of Molecular Pharmacology and Experimental Therapeutics, Mayo Clinic, Scottsdale, AZ, USA; Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ, USA; College of Health Solutions, Arizona State University, Scottsdale, AZ, USA
Junwen Wang
Metalloproteins play important roles in many biological processes. Mutations at the metal-binding sites may functionally disrupt metalloproteins, initiating severe diseases; however, until now there has been no effective approach to predict such mutations. Here we develop a deep learning approach to successfully predict disease-associated mutations that occur at the metal-binding sites of metalloproteins. We generate energy-based affinity grid maps and physicochemical features of the metal binding pockets (obtained from different databases as spatial and sequential features) and subsequently implement these features into a multichannel convolutional neural network. After training the model, the network can successfully predict disease-associated mutations that occur at the first and second coordination spheres of zinc-binding sites with an area under the curve of 0.90 and an accuracy of 0.82. Our approach is the first deep learning approach for the prediction of disease-associated metal-relevant site mutations in metalloproteins, providing a new platform to tackle human diseases.
73
Artificial Intelligence for Enhancing Clinical Medicine
GENERAL
POSTER PRESENTATIONS
74
General
Ranking RAS pathway mutations using evolutionary history of MEK1
Katia Andrianova, Igor Jouline
Ohio State University, Department of Microbiology, Columbus, Ohio 43210
Katia Andrianova
The Ras/MAPK (rat sarcoma/mitogen-activated protein kinase) signaling pathway is involved in essentially all aspects of organismal development, from the first cell divisions in the early embryo to postnatal development and growth. Given its critical function, it is not surprising that deregulated Ras/MAPK signaling, resulting from either genetic or environmental perturbations, can lead to cancer and developmental abnormalities. A large class of such abnormalities, known as RASopathies, is associated with activating germ-line mutations in many components of the Ras pathway. Over the past decade, as next generation sequencing (NGS) has become a valuable and cost-effective tool for research applications and clinical diagnostics of Mendelian diseases, simultaneous sequencing of multiple genes in MAPK signaling pathways has yielded many reports with hundreds of mutations possibly associated with RASopathies and cancer. In particular, multiple new mutations were identified in MEK1 kinase. The majority of newly discovered coding variations have neither been described in other individuals nor been studied or functionally analyzed in cellular or animal models, thus leaving clinicians to rely on in silico predictions of the consequences of “variants of uncertain significance” with computational software, such as PolyPhen and SIFT. Automated sequence searches used in these methods do not distinguish possible duplication events in the genes’ histories, hence multiple sequence alignment (MSA) sets usually include both ortholog and paralog copies. As purifying selection acts on one of the duplicate copies, it can become associated with a different phenotype compared to its paralogous sibling and/or to the parental gene. In most cases of Mendelian diseases only one specific duplicate of the gene in the human genome is associated with a disease. This indicates the importance of considering both common ancestors and any gene’s duplication history for variant interpretation. The presence of seven human MEK proteins increases the chances of including paralogs in the analysis and therefore substantially limits mutation interpretation. In this study we established the first precise description of an evolutionary history of MEK kinases and identified potential duplication events. We determined that MEK1 is an ancestor of the entire MEK family. In-depth analysis of the orthologous proteins showed that essentially all experimentally proven pathogenic mutations were predicted as “damaging” by our approach. By comparing our results with the predictions made by PolyPhen-2 and SIFT we showed how careful analysis of an evolutionary history of a gene may improve the accuracy of missense mutation outcome prediction.
75
General
Integrative Analysis of COPD and Lung Cancer Metadata Reveals Shared Alterations in Immune Response, PTEN and PI3K-AKT Pathways
Dannielle Skander1, Arda Durmaz1, Mohammed Orloff2, Gurkan Bebek1
1Case Western Reserve University, 2University of Arkansas for Medical Sciences
Gurkan Bebek
Chronic obstructive pulmonary disease (COPD) and lung cancer are among the leading causes of death worldwide. While it is believed the two diseases are related, the mechanisms behind this relationship remain unclear. We investigate the relationship between COPD and lung cancer using an integrative-omics approach. Integration of epigenetic and mRNA gene expression data allows us to discover the functionally relevant genes, i.e., the genes crucial for disease development. Using this approach, our study suggests that the mechanisms driving the development of both diseases are related to the interleukin immune response (IL4 and IL17), PTEN and PI3K-AKT pathways. Understanding this relationship between COPD and lung cancer is crucial for future prevention and treatment options of both COPD and lung cancer.
76
General
Investigating sources of irreproducibility in analysis of gene expression data
Carly A. Bobak, Jane E. Hill
Dartmouth College
Carly Bobak
The use of big data promises to change the landscape of biomedical research; however, irreproducibility of results remains a problem. In this work, we set out to investigate proposed methods to increase reproducibility of gene expression results. Specifically, we test the following three hypotheses: (1) results from pathway enrichment will be more similar across datasets than results on differentially expressed (DE) genes; (2) similarity across smaller datasets will be lower than similarity in larger datasets; (3) results from multi-cohort data will be more similar than results from single cohort data. We selected three unique datasets from the Gene Expression Omnibus that include active TB patients, spanning pediatric and adult patients. In each dataset we ranked DE genes as they were associated with TB vs other (healthy controls, other diseases, or latent tuberculosis infection). We then calculated the rank biased overlap (RBO) of the ranked genes across each dataset. RBO is a similarity measure scaled between 0 and 1 and can be interpreted as the average agreement between two lists. Gene set enrichment analysis (GSEA) was performed, and we calculated a rank for the pathway hits and compared RBO for associated pathways between datasets. On average, the RBO increased by a fold change of 1.83×10^4 when comparing similarity of associated pathways to similarity of DE genes. We then divided each dataset in half and repeated the analysis on all sub-datasets. Sub-datasets from the same parent dataset had similar results (mean RBO of 0.60, sd=0.24) as opposed to subsets from a different parent dataset (mean=0.10, sd=0.15). Contradicting our original hypothesis, overall RBO calculated between subsets from different parent datasets did not necessarily decrease compared to the initial RBO calculation; in fact, half of the RBO comparisons increased in the sub-datasets compared to using the whole datasets. To test the final hypothesis, we co-normalized, merged, and then randomly divided datasets into three approximately equal pieces. We repeated the DE analysis on each piece of the merged dataset. Across mixed datasets, the mean RBO was 0.023 (sd=0.43). Heterogeneous datasets were more alike than unique datasets, but less alike than a single divided dataset. However, the RBOs from mixed datasets compared to original datasets were not statistically significantly different from the RBOs comparing results from the original datasets. Thus, we demonstrated that associated pathways are greatly more reproducible than associated genes. Further study is necessary to investigate the conditions under which statistical power and heterogeneity of data influence reproducibility of findings from gene expression studies.
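To make the rank biased overlap measure used above more concrete, here is a minimal sketch of a truncated RBO between two ranked lists. It is an illustration only: the abstract does not say which RBO variant, persistence parameter, or software was used, so the function name `rbo_truncated`, the default `p=0.9`, and the toy gene labels are all assumptions. The sketch sums the weighted agreement only down to the depth of the shorter list (no extrapolation), and assumes each list has no repeated items.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank biased overlap of two ranked lists.

    At each depth d, the agreement is the fraction of items shared by the
    top-d prefixes of both lists; agreements are weighted by p**(d - 1)
    so that the top of the ranking counts most. The sum stops at the depth
    of the shorter list, so identical lists of length k score 1 - p**k.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    weighted_sum = 0.0
    for d in range(1, depth + 1):
        seen_a.add(ranking_a[d - 1])
        seen_b.add(ranking_b[d - 1])
        agreement = len(seen_a & seen_b) / d
        weighted_sum += (p ** (d - 1)) * agreement
    return (1 - p) * weighted_sum


# Hypothetical ranked gene lists from two datasets (labels are made up).
genes_dataset_1 = ["g1", "g2", "g3"]
genes_dataset_2 = ["g2", "g1", "g4"]
print(rbo_truncated(genes_dataset_1, genes_dataset_2))
```

A smaller `p` concentrates the weight on the first few ranks, so the choice of `p` changes how strongly disagreements deep in the list affect the score.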
77
General
Ethereum and MultiChain blockchains as secure tools for individualized medicine
Charlotte Brannon, Gamze Gursoy, Sarah Wagner, Mark Gerstein
Yale University Computational Biology and Bioinformatics Program
Charlotte Brannon
With the rapidly decreasing cost of genome sequencing and advent of individualized medicine, reliance on individual genomic data will soon be integral to medical treatment decisions. For example, a patient’s personal genomic sequence will provide physicians with information on which to base tests and diagnoses. Similarly, pharmacogenomics data will reveal the most effective prescriptions for a particular patient. Genomic data will need to be shared efficiently among multiple parties. However, because these are sensitive personal data which will directly impact medical treatment decisions, they must be maintained in a secure, high-integrity fashion. Blockchain technology is one way to achieve secure, high-integrity data storage. We present two proof-of-concept solutions, one for storing and querying personal genomic sequence data in a MultiChain blockchain designed for direct sharing with physicians; and one for storing and querying gene-drug interaction data in an Ethereum blockchain smart contract designed for shared access among permissioned researchers and physicians. Despite the high security and integrity that comes with blockchain data storage, there is a trade-off with data access efficiency and storage costs. We overcome these challenges by developing novel storage techniques. When storing personal genomic sequence data, we do not store the actual sequence data but rather a set of meta-data which can be used in combination with a reference genome to reconstruct the original sequences. When storing pharmacogenomics data, we use an index-based, multi-mapping approach to provide time- and space-efficient insertion and querying.
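The idea of storing metadata that reconstructs a sequence against a reference can be illustrated with a toy example. The sketch below is not the authors' storage format, which the abstract does not describe: it assumes equal-length sequences that differ from the reference only by single-base substitutions, and the function names `encode_against_reference` and `reconstruct` are hypothetical.

```python
def encode_against_reference(reference: str, sequence: str):
    """Return (position, base) pairs where the sequence differs from the reference.

    Simplifying assumption: both strings have the same length, so only
    substitutions are represented (no insertions or deletions).
    """
    if len(reference) != len(sequence):
        raise ValueError("this sketch only handles equal-length sequences")
    return [
        (position, base)
        for position, (ref_base, base) in enumerate(zip(reference, sequence))
        if ref_base != base
    ]


def reconstruct(reference: str, differences) -> str:
    """Rebuild the original sequence from the reference plus stored differences."""
    bases = list(reference)
    for position, base in differences:
        bases[position] = base
    return "".join(bases)


# Toy data: a short made-up reference and a sample with two substitutions.
reference = "ACGTACGTAC"
sample = "ACGAACGTTC"
stored = encode_against_reference(reference, sample)
print(stored)
print(reconstruct(reference, stored) == sample)
```

Only the differing positions are kept, so the stored record grows with the number of variants rather than with the length of the genome, which is the kind of saving that matters when on-chain storage is costly.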
78
General
Genomic predictors of L-asparaginase-induced pancreatitis in pediatric cancer patients
Britt I. Drögemöller, Galen E.B. Wright, Shahrad R. Rassekh, Shinya Ito, Bruce C. Carleton, Colin J.D. Ross, The Canadian Pharmacogenomics Network for Drug Safety Consortium
Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, BC, Canada; BC Children’s Hospital Research Institute, University of British Columbia, Vancouver, BC, Canada; Department of Pediatrics, Faculty of Medicine, University of British Columbia, Vancouver, BC, Canada; Clinical Pharmacology and Toxicology, The Hospital for Sick Children, University of Toronto, Toronto, ON, Canada; Pharmaceutical Outcomes Programme, BC Children’s Hospital, Vancouver, BC, Canada
Britt Drögemöller
Background: L-asparaginase is highly effective in the treatment of pediatric acute lymphoblastic leukemia. Unfortunately, the use of this treatment is limited by the occurrence of pancreatitis, a severe and potentially lethal adverse drug reaction, which occurs in 2-18% of patients. As previous studies have been unable to identify strong associations between clinical variables and susceptibility to L-asparaginase-induced pancreatitis, genetic factors are expected to play an important role in this adverse drug reaction. Objectives: We sought to explore the role of these genetic susceptibility factors in L-asparaginase-induced pancreatitis in pediatric cancer patients. Methods: Patients who were treated with L-asparaginase were recruited from 13 pediatric oncology units across Canada (n=284) and extensive clinical data were collected for all patients. Genotyping was performed using the Illumina HumanOmniExpress and Global Screening Arrays, and pancreatic gene expression profiles were imputed in these individuals using GTEx v7 and S-PrediXcan. Genome- and transcriptome-wide association studies (GWAS and TWAS) were performed to identify associations with L-asparaginase-induced pancreatitis. Results: GWAS analyses identified significant associations between genetic variants in HLA-DQA1 and -DRB1 and pancreatitis, while TWAS revealed that individuals experiencing L-asparaginase-induced pancreatitis exhibited lower expression levels of HLA-DRB5. Further interrogation of the TWAS data revealed an enrichment in genes involved in the somatic diversification of immune receptors. Conclusions: These analyses uncovered an association between genetic variation in immune-related genes and the development of L-asparaginase-induced pancreatitis. These associations mirror previous associations with the HLA region and (i) pancreatitis induced by other drugs and (ii) L-asparaginase-induced hypersensitivity.
79
General
NITECAP: A novel method and interface for the identification of circadian behavior in highly parallel time-course data
Thomas G. Brooks1, Cris W. Lawrence1, Nicholas F. Lahens1, Soumyashant Nayak1, Dimitra Sarantopoulou1, Garret A. FitzGerald1,2, Gregory R. Grant3
1Institute for Translational Medicine and Therapeutics (ITMAT), University of Pennsylvania; 2Systems Pharmacology and Translational Therapeutics; 3Department of Genetics, University of Pennsylvania
Thomas Brooks
We introduce a new tool called NITECAP for the task of identifying circadian behavior in massively parallel measurements of biological entities; for example, finding circadian genes from gene expression time course data measured by RNA-Seq or microarrays. NITECAP employs a permutation-based approach which uses a novel statistic designed to be sensitive to circadian behavior. NITECAP also uses an approach to multiple-testing which produces q-values directly without needing to first generate p-values which then need to be adjusted. Our approach has several advantages, particularly when individual p-values are underpowered or unreliable. Importantly, we have developed an intuitive user-friendly web-based interface which enables investigators to perform robust circadian analyses of this type directly without expert informatics support. Users can quickly scroll through time course profiles sorted by effect size, greatly facilitating the choice of significance thresholds that currently require making blind choices of numerical cutoffs. Putting this type of analysis in the hands of the investigators can significantly streamline their research. The website also enables the other standard significance tests such as JTK and ANOVA and provides tools to perform comparative studies, such as finding phase or amplitude differences between different conditions. NITECAP is freely available for public use at: http://www.nitecap.org
80
General
The Interplay of Obesity and Race/Ethnicity on Major Perinatal Complications
Yaadira Brown, MPH1; Olubode A. Olufajo, MD, MPH2; Edward E. Cornwell III, MD2; William Southerland, PhD3
1Research Centers in Minority Institutions: Howard University, Howard University College of Medicine; 2Research Centers in Minority Institutions: Howard University, Clive Callender Howard-Harvard Health Sciences Outcomes Research Center; 3Research Centers in Minority Institutions: Howard University
Yaadira Brown
Background: It has been established that a significant disparity exists in the rates of adverse perinatal outcomes across different racial/ethnic groups, with non-Hispanic Black women generally being most impacted. There is also evidence that obesity is associated with adverse perinatal outcomes. Although some studies have examined the impact of race/ethnicity and obesity on adverse perinatal outcomes, most studies have done so using local or statewide data. This study aims to use a national sample to determine the role of obesity in the racial/ethnic disparities seen in adverse perinatal outcomes in the United States. Methods: Data from the National Inpatient Sample was utilized in selecting pregnant women admitted for delivery between 2010 and 2014. Demographics (race/ethnicity, insurance type, household income, co-morbidities) and hospital characteristics were extracted. Race/ethnicity was categorized as Non-Hispanic Whites (NHW), Non-Hispanic Blacks (NHB), and Hispanics. Outcomes of interest were gestational diabetes, pre-eclampsia, pre-term birth, and hospital mortality. Multivariate logistic regressions were performed to determine the independent predictors of the outcomes, using two sets of models: one which included obesity as a variable in the model and one which did not. The differences between the two sets of models were compared by performing the Wald Test. Results: Our cohort consisted of 15,561,942 pregnant individuals admitted for delivery. There were 9,247,729 (59.43%) NHW, 2,552,569 (16.4%) NHB, and 3,761,644 (24.17%) Hispanic. Compared to other groups, NHB had significantly higher rates of pre-eclampsia (5.1%), pre-term birth (9.4%), and hospital mortality (0.11%). They also had the highest rates of obesity (9.0%). On multivariate analysis, NHB were more likely to have pre-eclampsia (Adjusted Odds Ratio [aOR] 1.26; 95% Confidence Interval [CI] 1.23-1.29), pre-term birth (aOR 1.38; 95% CI 1.34-1.41), and hospital mortality (aOR 2.05; 95% CI 1.2-3.38) when compared to NHW. However, they had a similar risk for gestational diabetes (aOR 0.94; 95% CI 0.91-0.96) as NHW. Obesity was significantly associated with gestational diabetes (aOR 3.08; 95% CI 3.02-3.15), pre-eclampsia (aOR 2.14; 95% CI 2.09-2.19), and pre-term birth (aOR 1.04; 95% CI 1.01-1.06). Although the differences were minimal, the regression models that included obesity as a variable better predicted the outcomes than those that did not when assessing gestational diabetes, pre-eclampsia, and pre-term birth. Conclusion: These findings further confirm that racial/ethnic disparities exist amongst adverse perinatal outcomes, with NHB being disproportionately affected. They also suggest that obesity plays a significant role in the racial/ethnic disparities that do exist for the adverse perinatal outcomes measured, other than hospital mortality. These data suggest that addressing obesity in the population may be beneficial in improving perinatal outcomes, but they also suggest that more research is needed to identify the major factors that drive the racial/ethnic disparities that exist amongst perinatal outcomes in the United States.
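How an adjusted odds ratio and its confidence interval relate to a logistic regression coefficient can be shown with a short worked sketch. It assumes the conventional Wald-type interval, exp(beta ± 1.96 × SE), which the abstract does not state explicitly. The coefficient and standard error below are back-calculated from the reported pre-eclampsia estimate (aOR 1.26; 95% CI 1.23-1.29) and are approximations, not values taken from the study.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def odds_ratio_ci(beta, se, z=Z_95):
    """Odds ratio and Wald-type confidence limits from a log-odds coefficient."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def back_calculate(odds_ratio, ci_low, ci_high, z=Z_95):
    """Approximate coefficient and standard error implied by a reported OR and CI.

    Assumes the interval is symmetric on the log scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


# Reported estimate for pre-eclampsia in NHB vs. NHW.
beta, se = back_calculate(1.26, 1.23, 1.29)
aor, low, high = odds_ratio_ci(beta, se)
print(f"beta={beta:.4f} SE={se:.4f} aOR={aor:.2f} 95% CI {low:.2f}-{high:.2f}")
```

The round trip reproduces the reported interval to two decimals, which is consistent with (but does not prove) a Wald-type interval having been used.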
81
General
A Comparison of Pharmacogenomic Information in FDA-Approved Drug Labels and CPIC Guidelines
Katherine I. Carrillo1, Teri E. Klein2
1Henry M. Gunn High School, Palo Alto, CA; 2Stanford University, Stanford, CA
Katherine Carrillo
Pharmacogenomics (PGx) is useful in helping to predict a patient’s likely reaction to a medication based on their genotype, allowing for personalized medicine. The FDA maintains a “Table of Pharmacogenomic Biomarkers in Drug Labeling” (https://www.fda.gov/drugs/science-and-research-drugs/table-pharmacogenomic-biomarkers-drug-labeling) consisting of pharmacogenomic information found in the drug labeling. However, many labels on the list do not contain advice for a clinician about how or when to use a patient’s genetic information. Guidelines created by the Clinical Pharmacogenetics Implementation Consortium (CPIC; https://cpicpgx.org/) contain information about how to use patient genetic information when prescribing drugs. Also, CPIC provides guidelines for some drugs not currently on the FDA biomarker list, though it does not provide guidelines for every drug on the biomarker list. Using PharmGKB annotated FDA-approved labels (through October 2019), we evaluated label information to determine (1) which labels contained any kind of prescribing information including a suggested alternate drug, dosing information or special considerations based on the patient’s genotype/metabolizer status, (2) which PharmGKB annotated labels were present on the FDA biomarker list, and (3) what genes were involved. We did not include FDA labels annotated for genetic variation in cancer cells; only germline variation was included. We compared all available CPIC guideline recommendations to the information from the labels. We identified where the labels and guidelines are similar or not. PharmGKB has 223 annotations (not including 82 annotations for cancer cell DNA variation) based on 219 FDA-approved drug labels. Of these, 199 labels are currently on the biomarker list and 17 were on the biomarker list at one time but have been removed by the FDA. Twenty labels have dosing information and 35 recommend an alternate drug based on genotype/metabolizer phenotype. Another 34 labels have some other special consideration, but most labels on the biomarker list (136) have no guidance for clinicians about what to do about the biomarker, if anything. There are 45 drugs with published CPIC guidelines (https://cpicpgx.org/genes-drugs/). Thirty-six of the drugs have a label on the FDA biomarker list but the information on the label does not always match the guideline. Only 21 of the CPIC drugs have labels with guidance. For some drugs, the PGx information on the labels is similar to the CPIC guidelines but different for many others. The FDA biomarker list has more drugs than CPIC guidelines written and in some cases the labels tell clinicians when they should test a patient while CPIC doesn’t talk about testing. However, for most drugs, the labels don’t give the clinicians a lot of information about what to do with their patients’ genetic test results. For the drugs with CPIC guidelines, there is more information about how to use genetic test results and why. Funded by NIH/NIGMS R24 GM61374.
82
General
xTEA: a transposable element insertion analyzer for genome sequencing data from multiple technologies
Chong Chu1, Rebeca Monroy2, Soohyun Lee1, E. Alice Lee2, Peter J. Park1
1Harvard Medical School, 2Boston Children's Hospital
E. Alice Lee
Transposable elements (TEs) comprise nearly 50% of the human genome. Although most of the TEs are now silent, several types of retrotransposons including LINE-1, Alu, and SVA are still active. Somatic TE insertions have been shown to occur frequently in multiple tumor types [1,2] and at a low rate in neurons of phenotypically normal individuals [3]. Multiple tools have been developed to call TE insertions from genome sequencing data, but an efficient tool that can identify both germline and somatic TE insertions with high sensitivity and specificity is still lacking. Moreover, newer technologies such as 10X Linked-Read and PacBio or Nanopore long read sequencing provide an unprecedented opportunity to study TEs; however, current methods do not take advantage of these data types. Here, we present a new computational tool xTEA, building on our previous algorithm TEA [1]. This tool identifies TE insertions from Illumina paired-end reads, 10X Linked-Reads, long reads, or a combined dataset. xTEA outperforms MELT [4] and Traffic-mem [5] on normal and tumor Illumina data, respectively. A comparison of different sequencing platforms reveals that the analysis of long reads had greater sensitivity and specificity, especially in repetitive regions. Both 10X Linked-Reads and long reads demonstrated clear advantages over short reads in constructing full length TE insertions. Better performance was achieved on hybrid data compared to single platform data. Using 22 human samples with either PacBio or Nanopore long reads and matched short reads, we uncovered LINE-1 internal SV hotspots and SVA internal VNTR expansion. xTEA is a comprehensive cross-platform TE insertion-calling tool. It can be deployed on a computing cluster, AWS, and Google Cloud, and is efficient for large cohort analysis. xTEA is publicly available at https://github.com/parklab/xTEA.
References
[1] Lee, Eunjung, et al. "Landscape of somatic retrotransposition in human cancers." Science 337.6097 (2012): 967-971.
[2] Rodriguez-Martin, Bernardo, et al. "Pan-cancer analysis of whole genomes reveals driver rearrangements promoted by LINE-1 retrotransposition in human tumours." BioRxiv (2017): 179705.
[3] Evrony, Gilad D., et al. "Cell lineage analysis in human brain using endogenous retroelements." Neuron 85.1 (2015): 49-59.
[4] Gardner, Eugene J., et al. "The Mobile Element Locator Tool (MELT): population-scale mobile element discovery and biology." Genome Research 27.11 (2017): 1916-1929.
[5] Tubio, Jose MC, et al. "Extensive transduction of nonrepetitive DNA mediated by L1 retrotransposition in cancer genomes." Science 345.6196 (2014): 1251343.
83
General
Go Get Data (GGD): simple, reproducible access to scientific data
Michael Cormier1, Jon Belyeu1, Brent Pedersen1, Joe Brown1, Johannes Koster2, Aaron R. Quinlan1
1Department of Human Genetics, University of Utah, Salt Lake City, UT, USA; 2Algorithms for reproducible bioinformatics, Institute of Human Genetics, University of Duisburg-Essen, Essen, NRW, Germany
Aaron Quinlan
Genomics research is complicated by the difficulty of identifying, collecting, and integrating the numerous datasets and annotations germane to our experiments. Furthermore, these data exist in disparate sources, and are stored in diverse, often abused formats pertaining to different genome builds. These complexities waste time, inhibit reproducibility, and curtail research creativity. Inspired by the success of software package managers, we have developed Go Get Data (GGD; https://gogetdata.github.io/) as a fast, reproducible approach to install standardized packages of data and annotations for genomics research.
84
General
Global epigenomic regulation of gene expression and cellular proliferation in T-cell leukemia
Sinisa Dovat, Yali Ding, Bo Zhang, Jonathon L. Payne, Feng Yue
Pennsylvania State University College of Medicine, Hershey, PA, USA
Sinisa Dovat
Ikaros encodes a DNA-binding protein that functions as a tumor suppressor in T-cell acute lymphoblastic leukemia (T-ALL). Deletion and/or functional inactivation of Ikaros results in the development of high-risk leukemia. The mechanisms through which Ikaros regulates gene expression and tumor suppression in T-ALL are unknown. Ikaros haplo-knockout mice develop T-ALL with 100% penetrance with arrest of T-cell differentiation. During the process of malignant transformation to T-ALL, Ikaros haploinsufficient thymocytes lose their remaining wild type Ikaros allele. Re-introduction of Ikaros into Ikaros-null T-ALL cells results in cessation of cellular proliferation and induction of T-cell differentiation. Thus, this is an optimal system for studying Ikaros tumor suppressor function because it captures the role of Ikaros in the transition from a malignant state (Ikaros-null T-ALL) to a non-malignant state (following Ikaros re-introduction). We used ATAC-seq and ChIP-seq of H3K4me1, H3K4me3, H3K27ac, and Ikaros to perform dynamic, global epigenomic and gene expression analyses at several time points in Ikaros-null T-ALL and following Ikaros re-introduction in order to determine the mechanisms of Ikaros’ tumor suppressor activity. Expression analysis identified a large number of novel signaling pathways that are directly regulated by Ikaros and Ikaros-induced enhancers, and that are responsible for the cessation of proliferation and induction of T-cell differentiation in T-ALL cells. Epigenomic analysis identified novel Ikaros functions in the epigenetic regulation of gene expression: Ikaros directly regulates de novo formation and depletion of enhancers; de novo formation of active enhancers and activation of poised enhancers; and Ikaros directly induces the formation of super-enhancers. Global analysis of chromatin accessibility revealed that Ikaros binding resulted in the opening of over 3400 previously-inaccessible chromatin sites. This is accompanied by de novo enrichment of H3K4me1 and H3K4me3 modifications and formation of de novo enhancers and promoters. These data demonstrate that Ikaros has pioneer activity and triggers coordinated regulation of gene expression. Ikaros pioneering activity was further determined by direct binding of Ikaros to reconstituted nucleosomes by electromobility shift assay. Dynamic analyses demonstrate the long-lasting effects of Ikaros’ DNA binding on enhancer activation, de novo formation of enhancers and super-enhancers, and chromatin accessibility. In conclusion, our results establish that Ikaros’ tumor suppressor function occurs via global regulation of the enhancer and super-enhancer landscape, along with regulation of chromatin accessibility, and identify novel tumor suppressor regulatory pathways in T-ALL.
85
General
A pharmacogenomic investigation of the cardiac safety profile of ondansetron in children and in pregnant women
Galen E.B. Wright, Britt I. Drögemöller, Jessica Trueman, Kaitlyn Shaw, Michelle Staub, Shahnaz Chaudhry, Sholeh Ghayoori, Fudan Miao, Michelle Higginson, Gabriella S.S. Groeneweg, James Brown, Laura A. Magee, Simon D. Whyte, Nicholas West, Sonia Brodie, Geert ’t Jong, Howard Berger, Shinya Ito, Shahrad R. Rassekh, Shubhayan Sanatani, Colin J.D. Ross, Bruce C. Carleton
British Columbia Children’s Hospital Research Institute, Vancouver, British Columbia, Canada; Pharmaceutical Outcomes Programme, British Columbia Children’s Hospital, Vancouver, British Columbia, Canada; Division of Translational Therapeutics, Department of Pediatrics, University of British Columbia, Vancouver, British Columbia, Canada; Faculty of Pharmaceutical Sciences, University of British Columbia, Vancouver, British Columbia, Canada; Clinical Research Unit, Children's Hospital Research Institute of Manitoba, Winnipeg, Manitoba, Canada; Division of Clinical Pharmacology and Toxicology, The Hospital for Sick Children, Toronto, Ontario, Canada; British Columbia Women’s Hospital and Health Centre, Vancouver, British Columbia, Canada; Department of Anesthesiology, Pharmacology and Therapeutics, University of British Columbia, Vancouver, British Columbia, Canada; School of Life Course Sciences, Faculty of Life Sciences and Medicine, King's College, London, United Kingdom; Department of Pediatric Anesthesia, British Columbia Children's Hospital, Vancouver, British Columbia, Canada; Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada; Department of Obstetrics and Gynecology, St. Michael's Hospital, Toronto, Ontario, Canada; EpiMethods Consulting, Toronto, Ontario, Canada; Division of Cardiology, Department of Pediatrics, Children's Heart Centre, BC Children's Hospital, University of British Columbia, Vancouver, Canada
Galen Wright
Background: 5-HT3 receptor antagonists, such as ondansetron, are highly effective medications for the treatment of nausea and vomiting. However, these medications are also associated with prolongation of the QT interval, placing patients at risk of cardiac adverse events. Pharmacogenomic information for therapeutic response to ondansetron exists, particularly pertaining to CYP2D6, but no study has been performed on genetic factors that influence the cardiac safety of this medication. Objectives: Determine ondansetron-induced cardiac electrophysiological changes in three unique patient cohorts and identify pharmacogenomic predictors of QT interval prolongation. Methods: Three patient groups receiving ondansetron for the prevention of nausea and vomiting were recruited and followed prospectively (pediatric post-surgical patients n=101; pediatric oncology patients n=98; pregnant women n=62). Electrocardiograms were conducted at baseline and post-ondansetron administration. Pharmacogenomic associations were then assessed via analyses of comprehensive CYP2D6 genotyping data and genome-wide association analyses. Results: In the entire cohort, 62 patients (24.1%) were defined as cases based on Bazett-corrected QTc values. The most significant shift from baseline occurred at five minutes post-administration (P = 9.8 × 10^-4). Genome-wide analyses identified novel candidate genes for this drug-induced phenotype. The two most significant associations were observed for a missense variant in TLR3 (rs3775291; P = 2.00 × 10^-7) and an eQTL for SLC36A1 (rs34124313; P = 1.97 × 10^-7). These genes are implicated in serotonin- and QT-related traits and therefore likely represent biologically relevant findings. CYP2D6 activity score was not associated with case-control status. Conclusions: The results of this study provide the first step towards understanding the genomic basis of cardiac changes occurring after ondansetron use in children and pregnant women, with the overall goal to improve the safety of these commonly used antiemetic medications.
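Bazett's correction, on which the case definition above is based, divides the measured QT interval by the square root of the RR interval expressed in seconds: QTc = QT / sqrt(RR). A minimal Python sketch follows. The function name and the example readings are illustrative assumptions, and because the abstract does not report the QTc threshold used to define cases, none is assumed here.

```python
import math


def bazett_qtc(qt_ms, heart_rate_bpm):
    """Bazett-corrected QT interval in milliseconds.

    qt_ms: measured QT interval in milliseconds.
    heart_rate_bpm: heart rate in beats per minute, used to derive RR in seconds.
    """
    if heart_rate_bpm <= 0:
        raise ValueError("heart rate must be positive")
    rr_seconds = 60.0 / heart_rate_bpm
    return qt_ms / math.sqrt(rr_seconds)


# Hypothetical readings: baseline and five minutes after administration.
baseline = bazett_qtc(qt_ms=400.0, heart_rate_bpm=60.0)
post_dose = bazett_qtc(qt_ms=410.0, heart_rate_bpm=75.0)
shift_ms = post_dose - baseline
print(f"baseline QTc={baseline:.1f} ms, post-dose QTc={post_dose:.1f} ms, shift={shift_ms:.1f} ms")
```

At 60 beats per minute the RR interval is exactly one second, so the corrected and uncorrected values coincide; at faster heart rates the correction increases the reported interval.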
86
General
TREND: a platform for exploring protein function in prokaryotes using phylogenetics, domain architectures, and gene neighborhoods information.
Vadim M. Gumerov, Igor B. Zhulin
The Ohio State University
Vadim Gumerov
Key steps in a computational study of protein function involve analysis of (i) relationships between homologous proteins, (ii) protein domain architecture, and (iii) gene neighborhoods the corresponding proteins are encoded in. Each of these steps requires a separate computational task and sets of tools. Combining the results into a complete analysis is usually done by hand, which is time-consuming and error-prone. Here we present a new platform, TREND (tree-based exploration of neighborhoods and domains), which can perform all the necessary steps in automated fashion and put the derived information into phylogenomic context, thus making evolutionary based protein function analysis more efficient. TREND is freely available at http://trend.zhulinlab.org. TREND consists of two pipelines: (1) Domains, which identifies protein domains, transmembrane regions and low-complexity segments, and maps this information on the phylogenetic tree, and (2) Neighborhoods, which identifies gene neighborhoods for the given set of protein sequences, clusters the genes based on shared domains of the encoded proteins, identifies operons and puts the derived data into phylogenomic context. Locally stored databases of the Pfam profile Hidden Markov models (HMMs) and CDD position-specific scoring matrices are used as a source of models for domains identification. Another source is a rich collection of signal-transduction specific profile HMMs derived from MiST database. The pipelines are highly customizable. On start, both pipelines first align provided proteins and build phylogenetic trees. These steps can be skipped if a researcher already has an alignment or a tree and would like to use them instead. Optionally redundancy of the sequences can be reduced. Instead of protein sequences, protein identifiers can be provided as input; corresponding sequences will be fetched from RefSeq and MiST databases. Results of the pipelines are presented as interactive pictures with cross-links to Pfam, CDD, RefSeq and MiST databases. All produced results can be downloaded for subsequent analysis.
87
General
TrackSigFreq: subclonal reconstructions based on mutation signatures and allele frequencies
Caitlin F. Harrigan1,2,4, Yulia Rubanova1,2,4, Quaid Morris1,2,3,4,5,6, Alina Selega2,4
1Department of Computer Science, University of Toronto, Toronto, Canada; 2Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada; 3Department of Molecular Genetics, University of Toronto, Toronto, Canada; 4Vector Institute, Toronto, Canada; 5Ontario Institute for Cancer Research, Toronto, Canada; 6Memorial Sloan Kettering Cancer Centre, New York, USA (pending)
Cait Harrigan
Mutational signatures are patterns of mutation types, many of which are linked to known mutagenic processes. Signature activity represents the proportion of mutations a signature generates. In cancer, cells may gain advantageous phenotypes through mutation accumulation, causing rapid growth of that subpopulation within the tumour. The presence of many subclones can make cancers harder to treat and have other clinical implications. Reconstructing changes in signature activities can give insight into the evolution of cells within a tumour. Recently, we introduced a new method, TrackSig, to detect changes in signature activities across time from a single bulk tumour sample. By design, TrackSig is unable to identify mutation populations with different frequencies but little to no difference in signature activity. Here we present an extension of this method, TrackSigFreq, which enables trajectory reconstruction based on both observed density of mutation frequencies and changes in mutational signature activities. TrackSigFreq preserves the advantages of TrackSig, namely optimal and rapid mutation clustering through segmentation, while extending it so that it can identify distinct mutation populations that share similar signature activities.
88
General
A Flexible Pipeline for the Prediction of Biomarkers Relevant to Drug Sensitivity
V. Keith Hughitt1, Sayeh Gorjifard1, Aleksandra M. Michalowski1, John K. Simmons2, Ryan Dale1, Eric C. Polley3, Jonathan J. Keats4, Beverly A. Mock1
1NCI, 2Personal Genome Diagnostics, 3Mayo Clinic, Rochester, 4TGen
V. Keith Hughitt
Recent years have seen an explosion in the availability of paired molecular profiling and drug screen data, providing an unprecedented opportunity for the development of targeted therapies based on an individual’s genetic background. Despite a number of recent successes in diseases ranging from cystic fibrosis to cancer, significant hurdles remain in our ability to accurately predict treatments based on molecular profiling data. In particular, few such tools exist that allow the integration of heterogeneous data types (e.g. genomic, transcriptomic, and somatic mutations), along with high-throughput drug screen data to make predictions about treatment efficacy. Here, we describe a generalized open-source pipeline developed for the analysis of precision medicine data, Pharmacogenomics Prediction Pipeline, or “P3”. The modular design of P3 enables the inclusion of arbitrary input data types and the selection from multiple alternative machine learning algorithms, while automated statistical and visualization reporting steps incorporated throughout the pipeline assist in parameter tuning and early detection of problematic data elements. By incorporating external biological annotations from sources such as The Molecular Signatures Database (MSigDB), Drug Signatures Database (DSigDB), and DrugBank, P3 is able to detect important pathways correlated with drug sensitivity, while the inclusion of molecular profiling and clinical data from external patient and cell lines datasets allows P3 to focus its efforts on genes which are most likely to play a role in therapeutic response. To demonstrate the use of P3 for preclinical biomarker prediction, we applied P3 to an unpublished multiple myeloma dataset consisting of exome, RNA-Seq, and drug screen data for 1900 compounds across 45 tumor cell lines. Furthermore, gene expression and clinical data from 20 additional publicly available patient and cell line multiple myeloma datasets (>5,500 samples in total), along with data from the GDSC and CCLE drug sensitivity experiments were also analyzed, providing a rich source of information with respect to the biological relevance of putative biomarkers detected by the pipeline.
89
General
Creating a Metabolic Syndrome Research Resource (MetSRR)
Willysha Jenkins1, Christian Richardson2, ClarLynda Williams-DeVane PhD1
1Fisk University Nashville TN, 2Duke University Durham NC
Willysha Jenkins
Metabolic syndrome (MetS) is a multifaceted syndrome. Risk factors include visceral adiposity, dyslipidemia, hyperglycemia, hypertension, and environmental factors. An established component of chronic disease sequela, MetS leads to an increased risk of cardiovascular disease and type 2 diabetes. MetS also leads to an increased risk of stroke. Comparative studies have identified heterogeneity in the pathology of MetS across groups; however, the etiology of these differences has yet to be elucidated. Despite the presence of public repositories of biological MetS-related data, the ability to access and work with said data has its challenges. The process of querying databases, wrestling with software and wrangling data into workable formats prior to analysis is both cumbersome and time consuming. The Metabolic Syndrome Research Resource (MetSRR) is a curated database that provides access to MetS associated biological and ancillary data. It is an amalgamation of current and potential biomarkers of MetS extracted from relevant National Health and Nutrition Examination Survey (NHANES) data from 1999-2016. Each potential biomarker selection was driven by insights elucidated by the review of over 100 peer-reviewed articles. It includes 28 demographic, survey and known MetS related variables. There are 9 curated categorical variables and 42 potentially novel biomarkers. All measures are captured from over 90,000 individuals. This biocuration effort will provide increased access to curated MetS related data. It will also serve as a hypothesis generation tool for disparate MetS etiology discovery, providing the ability to generate and export ethnic group/race, sex, and age-specific curated datasets. MetSRR seeks to broaden participation in research efforts to identify clinically evaluative disparate MetS biomarkers. To the best of our knowledge, MetSRR is the only MetS specific database targeted at uncovering the disparate etiology of MetS through biocuration.
90
General
Utilizing cohort information to find causative variants
Senay Kafkas, Robert Hoehndorf
Computational Bioscience Research Center, Computer, Electrical and Mathematical Sciences & Engineering Division, King Abdullah University of Science and Technology, 4700 KAUST, Thuwal, 23955-6900 Saudi Arabia
Senay Kafkas
Identification of causative variants in genomic data is challenging. Current studies focus on prioritizing variants within individual genomes, or apply statistical methods (e.g. GWAS) to large cohorts. With the rapid advancements and cost decrease in NGS, scientists are able to produce sequence data from large disease cohorts and healthy populations. For example, UK Biobank makes available genotype to phenotype relations for >500,000 individuals and whole exome sequencing (WES) data for 50,000 individuals. Patients with the same/similar set of phenotypes may share the same/biologically related genetic abnormalities and risk factors. The availability of these datasets may allow us to stratify individuals by their phenotype and use this information to identify causative variants within large cohorts. We propose a new method that stratifies patients by their phenotypes and identifies the set of causative variants which can explain phenotypes in most individuals within a cohort from WES/WGS. First, we generated and used synthetic disease cohorts to evaluate our method. We used the human genotype-phenotype associations from ClinVar and the sequence data from 1000 Genomes and generated synthetic cohorts with different population sizes for 200 randomly selected diseases from ClinVar. To generate a synthetic disease cohort of size N, first we randomly picked N individuals from 1000 Genomes and then, for each individual, we randomly picked one of the variants of the given disease and added it to the genotype of the given individual. We pre-processed the sequence data by annotating with CADD and selecting only the most deleterious variant of a given gene for each individual. Furthermore, we “normalize” pathogenicity scores based on their frequencies within a population in order to account for different distribution within genes based on their length. We then apply our method on UK Biobank. We developed a method that identifies causative variants by utilizing information about shared phenotypes within a cohort and compared it against individually prioritizing variants using WES/WGS data and average gene ranks. Our approach relies on a machine learning model trained on a pathogenicity prediction score (e.g. CADD), the frequency of observing a pathogenicity score above a certain threshold in the same gene within a population, and uses this cohort and phenotype-derived information as features to predict causative variants within individual genome sequences. Our method can identify causative variants in small and medium-sized cohorts (2 to 100 individuals). As the disease becomes more complex (i.e. involving harmful variants in multiple genes), our machine learning model improves over established methods, in particular in larger cohorts (>80 individuals). Currently, we applied our method on UK Biobank and suggest candidate causative variants for 1499 complex diseases.
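The synthetic cohort construction described above (pick N background individuals, then add one randomly chosen disease variant to each) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: genotypes are reduced to sets of variant identifiers, and the function name, sample names, and variant identifiers are hypothetical placeholders rather than 1000 Genomes or ClinVar records.

```python
import random


def make_synthetic_cohort(individuals, disease_variants, n, seed=0):
    """Build a synthetic disease cohort of size n.

    individuals: dict mapping a sample id to a set of background variant ids.
    disease_variants: list of variant ids associated with one disease.
    Returns a dict: sample id -> {"genotype": set, "spiked_variant": str}.
    """
    rng = random.Random(seed)
    # Step 1: randomly pick n background individuals (without replacement).
    chosen = rng.sample(sorted(individuals), n)
    cohort = {}
    for sample_id in chosen:
        # Step 2: randomly pick one variant of the disease for this individual.
        spiked = rng.choice(disease_variants)
        # Step 3: add it to a copy of the genotype, leaving the input untouched.
        cohort[sample_id] = {
            "genotype": set(individuals[sample_id]) | {spiked},
            "spiked_variant": spiked,
        }
    return cohort


# Hypothetical toy inputs.
background = {
    "sample_A": {"bg_1", "bg_2"},
    "sample_B": {"bg_2", "bg_3"},
    "sample_C": {"bg_4"},
}
disease_variants = ["dis_1", "dis_2"]

cohort = make_synthetic_cohort(background, disease_variants, n=2)
for sample_id, entry in cohort.items():
    print(sample_id, sorted(entry["genotype"]), entry["spiked_variant"])
```

Recording the spiked variant alongside each genotype gives a known ground truth against which a prioritization method can later be scored.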
91
General
Integrated analysis of JAK-STAT pathway in homeostasis, simulated inflammation and tumour
Milica Krunic1, Anzhelika Karjalainen1, Mojoyinola Joanna Ola1, Stephen Shoebridge1, Sabine Macho-Maschler1, Caroline Lassnig1, Andrea Poelzl1, Matthias Farlik2, Nikolaus Fortelny2, Christoph Bock2, Birgit Strobl1, Mathias Mueller1
1Institute of Animal Breeding and Genetics and Biomodels Austria, University of Veterinary Medicine Vienna, Austria; 2CeMM – Center for Molecular Medicine, Austrian Academy of Sciences, Vienna, Austria
Milica Krunic
Janus kinases (JAKs) and signal transducers and activators of transcription (STATs) play a key role in cytokine signalling and in the defence against infection and cancer. JAK-STAT signalling components interact with chromatin remodelling proteins and change chromatin architecture/landscape during cell differentiation and recognition and elimination of pathogens. Using different sequencing approaches (ATAC-Seq, ChIPmentation, single-cell RNA-Seq, RNA-Seq), our goal is to untangle the roles of JAK-STAT proteins in shaping chromatin landscapes of myeloid and lymphoid cells in homeostasis, sterile (simulated) inflammation and within the tumour microenvironment. Additionally, we are investigating how evolutionarily conserved STAT protein isoforms interact with chromatin and co-regulatory proteins to induce cell type- and gene-specific responses. The poster shows our summarised findings as a result of integration of different approaches.
92
General
BEERS2: The Next Generation of RNA-Seq Simulator
Nicholas F. Lahens1, Thomas G. Brooks1, Dimitra Sarantopoulou1, Soumyashant Nayak1, Cris W. Lawrence1, Anand Srinivasan2, Jonathan Schug3,4, Garret A. FitzGerald1,5, John B. Hogenesch6, Yoseph Barash4, Gregory R. Grant1,4
1Institute for Translational Medicine and Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; 2PMACS Enterprise Research Applications and High Performance Computing, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; 3Institute for Diabetes, Obesity, and Metabolism, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; 4Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; 5Department of System Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA; 6Division of Human Genetics, Department of Pediatrics, Center for Chronobiology, Cincinnati Children's Hospital Medical Center, Cincinnati, OH
Nicholas Lahens
The accurate interpretation of RNA-Seq data presents a moving target as scientists continue to introduce new experimental techniques and analysis algorithms. This challenge has led researchers to perform a substantial number of benchmarking studies in order to determine best analysis practices. Simulated datasets have proven to be an invaluable tool in these efforts. Despite this strong need for simulated data, only a few RNA-Seq simulators have been released in the public domain, and all of them are based on simplifying assumptions that limit their utility. To address these shortcomings and generate realistic simulated data we are developing the Benchmarker for Evaluating the Effectiveness of RNA-Seq Software (BEERS) 2: an open-source, modular simulator that models each step in the process of converting RNA molecules into sequencing reads. We take an empirical approach to generating realistic RNA samples reflecting biological variability, alternative splicing, and allele-specific expression, which uses real data to train the parameters. Next, we model biochemical reactions and biases from each step in library construction as separate modules. Using an object-oriented paradigm, each module has well-defined inputs and outputs allowing users to easily substitute new modules. This design gives BEERS2 the flexibility to model changes to library construction and sequencing protocols, evolving in parallel with sequencing technology. BEERS2 is open source, freely available, and will be a crucial tool for the community as we continue to develop standards for transcriptome analysis.
93
General
Effect Modification by Age on a Diagnostic Three-Gene-Signature in Patients with Active Tuberculosis
Lauren McDonnell¹, Carly A. Bobak¹,², Matthew Nemesure¹, Justin Lin¹, Jane E. Hill¹
¹Thayer School of Engineering at Dartmouth College, ²Geisel School of Medicine at Dartmouth College
Lauren McDonnell
Introduction
Tuberculosis (TB) is the leading cause of death from a single infectious agent worldwide (1). In 2017, there were 10 million reported cases of TB and another 1.3 million deaths from the disease (1). It is currently the leading killer for individuals who are HIV positive (1). In 2014, the WHO developed the ambitious Sustainable Development Goals (SDGs) which included "End TB", a major program aiming to eradicate the TB epidemic by 2030 (2). Accomplishing this will require more advanced diagnostics that are less invasive and determine the disease status more quickly and more reliably. In our analysis, we aim to model risk factors associated with the development of TB. Here, we are looking at demographic features from multi-cohort studies pulling data from thirty different countries from the Gene Expression Omnibus, examining patients with active TB, latent TB, other diseases, and healthy controls. The data is pulled predominantly from developing countries, but also includes samples from developed countries, including the UK, France, Germany, and the United States. In total, the dataset includes 3,096 participants. Meta-analyses of similar datasets have proposed a three-gene-score as a "global" tuberculosis metric (3). This type of analysis suggests that all active TB patients, regardless of other factors, will express this gene score. Our hypothesis is that this active TB will be additionally mediated by demographic factors such as age and HIV status that are associated with TB.
Methodology
We performed a multivariate logistic regression analysis to identify demographic features associated with culture-confirmed tuberculosis. The model features included age, HIV status, and gene expressions for each gene individually (GBP5, DUSP3, and KLF2), as well as an interaction term for HIV and age with each of the three genes.
Results
The results of our multivariate logistic regression suggest that age modifies all three genes in the proposed global gene signatures (p-values of 5.38e-05, 6.75e-05, and 0.01012 for GBP5, KLF2 and DUSP3, respectively). Initial findings also indicate that HIV status is a mediator of the effect of GBP5 (p-value of 0.03437). Knowing that the relationship between the gene expression of these three genes varies by demographics may change the way that a diagnostic is implemented in the clinic. Our hope is that this analysis will be used to further refine the three-gene signature for specific demographic groups where it may be most effective in diagnosing active TB.
Citations
(1) WHO Global Tuberculosis Report 2018 www.who.int/tb/publications/global_report/en/
(2) Ending Tuberculosis by 2030: Can We Do It? A. B. Suthar, R. Zachariah, Harries https://www.ingentaconnect.com/contentone/iuatld/ijtld/2016/00000020/00000009/art00007?crawler=true
(3) Genome-Wide Expression for Diagnosis of Pulmonary Tuberculosis: a Multicohort Analysis https://www.ncbi.nlm.nih.gov/pubmed/26907218
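As an illustration of what an age-by-gene interaction term means in a logistic model, the following minimal Python sketch shows how the per-unit effect of a gene on the log-odds of active TB can change with age. The function names and every coefficient value are hypothetical and chosen only for illustration; they are not the authors' fitted model or estimates.

```python
import math

def active_tb_probability(gene, age, intercept, b_gene, b_age, b_gene_age):
    """Logistic model with a gene-by-age interaction term (illustrative only)."""
    log_odds = intercept + b_gene * gene + b_age * age + b_gene_age * gene * age
    return 1.0 / (1.0 + math.exp(-log_odds))

def gene_log_odds_slope(age, b_gene, b_gene_age):
    """Change in log-odds per unit of gene expression at a given age."""
    return b_gene + b_gene_age * age

# Hypothetical coefficients, not fitted to any data.
coef = dict(intercept=-2.0, b_gene=1.0, b_age=0.01, b_gene_age=-0.01)

# With a non-zero interaction coefficient, the gene's effect differs by age.
slope_young = gene_log_odds_slope(20, coef["b_gene"], coef["b_gene_age"])
slope_old = gene_log_odds_slope(60, coef["b_gene"], coef["b_gene_age"])
```

In this sketch a significant interaction coefficient corresponds to the gene's diagnostic weight not being constant across ages, which is the sense in which age would "modify" a gene in the signature.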
94
General
Classification and mutation prediction from gastrointestinal cancer histopathology images using deep learning
Sung Hak Lee¹, Hyun-Jong Jang²
¹Department of Hospital Pathology, Seoul St. Mary’s Hospital, College of Medicine, The Catholic University of Korea, ²Department of Physiology, College of Medicine, The Catholic University of Korea
Sung Hak Lee
BACKGROUND: Although microscopic analysis of tissue slides has been the basis for disease diagnosis for decades, intra- and inter-observer variabilities remain issues to be resolved. The recent introduction of digital scanners has allowed researchers to use deep learning in the analysis of tissue images because many H&E whole slide images (WSIs) are available. In the present study, we investigated the possibility of a deep learning-based, fully automated, computer-aided diagnosis system with WSIs from a gastric adenocarcinoma (STAD) dataset. In addition, we trained the network to predict several commonly mutated genes in STAD. Furthermore, we showed that deep learning can predict MSI directly from H&E images.
MATERIALS AND METHODS: We studied the automatic classification of ‘normal’ and ‘tumor’ regions using a total of 432 H&E-stained WSIs from the TCGA gastric cancer image dataset. The slides were tiled in non-overlapping 360x360 pixel windows at a magnification of 20x. We used 70% of those tiles for training, 15% for validation, and 15% for final testing. Deep learning with convolutional neural networks was performed based on the Inception v3 architecture. To study the prediction of gene mutations from H&E images, average area under the curve (AUC) values for KRAS and SMAD4 mutation (93 and 88 cases, respectively) were calculated using our automatic tumor classification deep-learning approach. To study the prediction of MSI (MSS vs. MSI-H) from H&E images, 383 cases were enrolled using the same approach.
RESULTS: The performance of our method is comparable to that of pathologists, with an AUC of up to 0.999. Furthermore, we trained the network to predict two commonly mutated genes in STAD (KRAS and SMAD4) and investigated whether they can be predicted from pathology H&E images. We found that KRAS and SMAD4 mutation can be predicted from pathology images, with AUCs of 0.711 to 0.737, similar to results from previous studies with non-small cell lung cancer histopathology images using deep learning. For the prediction of MSI, patch-level and patient-level AUCs were 0.843 and 0.912, respectively, which is superior to the previous studies with TCGA-COAD and -STAD histopathology images.
CONCLUSIONS: These findings suggest that deep-learning models can assist pathologists in the detection of cancer subtypes and in the prediction of gene mutations and MSI status. After training on larger datasets and prospective validation, this approach has the potential to provide immunotherapy to a much broader subset of patients with STAD.
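The tiling and 70/15/15 split described in the methods can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions: the function names, the fixed random seed, the tile-level shuffle, and the example region size are all hypothetical, and the abstract does not say how partial tiles at the slide border or background tiles were handled (here, partial tiles are simply dropped).

```python
import random

def tile_origins(width, height, tile=360):
    """Top-left corners of non-overlapping tile x tile windows that fit fully in the image."""
    return [(x, y)
            for y in range(0, height - tile + 1, tile)
            for x in range(0, width - tile + 1, tile)]

def split_tiles(tiles, train_pct=70, val_pct=15, seed=0):
    """Shuffle tiles and split them into train/validation/test; test gets the remainder."""
    shuffled = list(tiles)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

# Hypothetical 3600 x 3600 pixel region, giving a 10 x 10 grid of tiles.
tiles = tile_origins(3600, 3600)
train_set, val_set, test_set = split_tiles(tiles)
```

A tile-level split like this one is only a sketch of the stated percentages; in practice one might prefer to split by patient so that tiles from the same slide do not appear in both training and test sets.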
95
General
Mapping the Emergence and Migration of Hematopoietic Stem Cells and Progenitors During Human Development at Single Cell Resolution
Feiyang Ma, Vincenzo Calvanese, Sandra Capellera-Garcia, Sophia Ekstrand, Matteo Pellegrini, Hanna K. A. Mikkola
Department of Molecular, Cell and Developmental Biology, UCLA, Los Angeles, CA, USA
Feiyang Ma
Hematopoiesis is established during development through multiple waves of blood cell production, starting with lineage-primed progenitors required for the embryo's needs, and culminating in the generation of self-renewing hematopoietic stem cells (HSCs) for life-long hematopoiesis. Although hematopoietic ontogeny has been studied extensively in mice, we lack knowledge of the anatomical, temporal and molecular map for hematopoietic development in humans. Prior studies suggest that HSCs emerge from hemogenic endothelium in the aorta-gonad-mesonephros (AGM) region between 4 and 6 weeks of human gestation. Extraembryonic sites including the placenta, umbilical and vitelline arteries, and the yolk sac have been proposed to generate HSCs in the mouse. However, whether the same sites generate HSCs in humans is unclear, mainly due to the limited access to developmental tissues and lack of reliable methods to identify developing human HSCs. We created a single-cell transcriptome map of hemato-vascular cells (CD34+ and/or CD31+) from human hematopoietic tissues at 1st and 2nd trimester. Using a molecular signature of self-renewing HSCs defined in our previous molecular and functional studies, we could identify CD34+Thy1+RUNX1+HOXA7+MLLT3+HLF+ cells as HSCs throughout development. Analyses of 5-wk AGM revealed a distinct population of newly emerged HSCs that vanished by 7 wks. HSCs colonized the fetal liver by 6 wks, where they expanded and differentiated beyond 15 wks. A small but distinct population expressing HSC molecular markers was reproducibly detected in 5-wk placentas. At this time, the heart, umbilical cord and fetal liver lacked clear HSC populations, implying minimal spreading through circulating blood. Interestingly, preceding HSC colonization, the 5-wk fetal liver already harbored CD34+Thy1-RUNX1+HOXA7-MLLT3-HLF- progenitors that co-expressed markers associated with erythro-myeloid and lympho-myeloid potential. Comparable populations were abundant in the yolk sac, suggestive of their origin. This dataset provides an unprecedented resource to dissect the dynamics and molecular pathways governing the emergence and progression of distinct waves of hematopoietic cells during human development, and serves as a reference map for the generation of HSCs in vitro for therapeutic purposes.
96
General
Large-scale Machine Learning and Graph Analytics for Functional Prediction of Pathogen Proteins
Jason McDermott¹, Song Feng¹, William Nelson¹, Joon-Yong Lee¹, Sayan Ghosh¹, Ariful Khan¹, Mahantesh Halappanavar¹, Justine Nguyen², Jonathan Pruneda², David Baltrus³, Joshua Adkins¹
¹Pacific Northwest National Laboratory, ²Oregon Health & Science University, ³University of Arizona
Jason McDermott
Proteins enact the functionality encoded by genomes and so understanding protein function is critical to many areas of biology. Prediction of protein function from sequence is possible because of evolutionary relationships between proteins with similar functions, and existing algorithms can identify the corresponding sequence similarity. However, many proteins have similar functions but diverse sequences, which thwart existing methods, and driven by advances in sequencing technology the number of protein sequences with no known function or similarity to proteins of known function is large and growing rapidly. We use reduced amino acid alphabet mapping and kmer-based protein sequence representation to detect functional similarities between proteins and apply this method to bacterial and viral proteins that mimic eukaryotic ubiquitin ligases and deubiquitinases and classes of bacteriocins. These models allow prediction of novel examples that are not detected by traditional sequence similarity, and can provide insight into active sites or other functional domains for the proteins. To explore sequence space in a more discovery-oriented way we have applied this approach to a very large set of bacterial protein sequences (>20 million sequences) and use a GPU-based algorithm to quickly calculate a similarity graph based on protein features beyond traditional sequence similarity. Exascale graph analytics methods are used to identify groups of closely related sequences from the similarity graph. We show that this method can recapitulate known relationships between proteins, highlight inconsistencies in the underlying protein database, and provide hypotheses for functions of novel proteins, thus providing a large-scale sequence landscape.
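To make the idea of a reduced amino acid alphabet with k-mer features concrete, here is a minimal Python sketch. The six-group alphabet, the choice of k = 3, and the Jaccard similarity are all assumptions made for illustration; the abstract does not state which mapping, k-mer length, or similarity measure the authors use.

```python
from collections import Counter

# Hypothetical reduced alphabet grouping residues by broad physicochemical class.
REDUCED_GROUPS = {
    "h": "AVLIMC",  # hydrophobic / aliphatic
    "a": "FWYH",    # aromatic
    "p": "STNQ",    # polar uncharged
    "+": "KR",      # positively charged
    "-": "DE",      # negatively charged
    "s": "GP",      # special conformation
}
RESIDUE_TO_GROUP = {aa: g for g, aas in REDUCED_GROUPS.items() for aa in aas}

def reduce_sequence(seq):
    """Map each residue to its group symbol; unknown residues become 'x'."""
    return "".join(RESIDUE_TO_GROUP.get(aa, "x") for aa in seq.upper())

def kmer_counts(seq, k=3):
    """Count overlapping k-mers of the reduced-alphabet sequence."""
    reduced = reduce_sequence(seq)
    return Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))

def jaccard(a, b):
    """Jaccard similarity between the sets of k-mers present in two profiles."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0
```

For example, the peptides `MKV` and `LRI` share no residues but both reduce to `h+h`, so their k-mer profiles are identical under this mapping. That is the kind of similarity a plain sequence comparison would miss.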
97
General
Gene-set analysis using GWAS summary statistics and GTEx database
Masahiro Nakatochi
Department of Nursing, Nagoya University Graduate School of Medicine
Masahiro Nakatochi
Recently, sample sizes of genome-wide association studies (GWASs) have been increasing rapidly. Consequently, many genetic loci associated with traits have been identified. It is difficult to interpret how the many loci identified by GWAS contribute to the traits. One function of a SNP is the regulation of gene expression levels; such SNPs are called expression quantitative trait loci (eQTLs). The GTEx project has revealed many eQTLs in many human tissues. In this study, I propose an approach to gene set analysis using GWAS summary statistics and the GTEx database to investigate how the genetic loci identified by GWAS contribute to the trait. This approach has three steps. First, trait-associated SNPs are identified by GWAS. Second, genes whose expression level is associated with trait-associated SNPs in at least one tissue in the GTEx database are searched. These genes are classified as either positively or negatively correlated genes. Finally, gene set enrichment analyses of positively correlated genes and negatively correlated genes are performed with the modified Fisher’s exact test to identify trait-associated pathways or gene sets. Using this approach, I found serum uric acid (SUA)-associated gene sets based on a SUA GWAS. Gene set enrichment analysis of UniProt terms found that the terms “Williams-Beuren syndrome”, “sodium”, “transport”, “sodium transport”, and “alternative splicing” were enriched for the positively correlated genes. This approach provides another insight into the SNPs identified by GWAS.
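The enrichment step can be sketched with a one-sided Fisher's exact test, which is equivalent to a hypergeometric tail probability. The Python sketch below uses the standard test only; the abstract does not specify what modification is applied, so none is assumed here, and the function and parameter names are hypothetical.

```python
from math import comb

def enrichment_p_value(overlap, set_size, list_size, background):
    """One-sided Fisher's exact (hypergeometric) p-value.

    Probability of seeing at least `overlap` gene-set members in a list of
    `list_size` genes drawn without replacement from `background` genes, of
    which `set_size` belong to the gene set.
    """
    max_overlap = min(set_size, list_size)
    total = comb(background, list_size)
    tail = sum(comb(set_size, k) * comb(background - set_size, list_size - k)
               for k in range(overlap, max_overlap + 1))
    return tail / total
```

In this setting, `list_size` would be the number of positively (or negatively) correlated genes, `set_size` the number of background genes annotated with a given term, and `overlap` the number of correlated genes carrying that term.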
98
General
Targeting Cancer via Signaling Pathways: A Novel Approach to the Discovery of Gene CCDC191's Double-agent Function using Differential Gene Expression, Heat Map Analyses through AI Deep Learning, and Mathematical Modeling
Annie Ostojic
Purdue University
Annie Ostojic
According to a recent Johns Hopkins University study posted in May of 2018, the number of total genes in the genome was recalculated to be 43,162 genes, comprising 21,306 protein-coding genes and 21,856 non-coding genes. With completion of base pair sequencing in the Human Genome Project back in 2003, hope existed for acceleration of new medical treatments and disease intervention. However, earlier bioinformatic processes were unable to produce results quickly enough, so many gene functions remain unknown to date. A need exists to analyze gene functions in pathways to meet a changing medical industry of pharmacogenomics, personalized medicine, and cancer treatments relative to gene expression patterns. New methodology for determining functions of unstudied genes to rapidly extrapolate, classify, and correlate their gene expressions to biological pathways is at the forefront of bioinformatic studies. This research discovered the function of gene CCDC191, a coiled-coil domain-containing protein-coding gene, whose function had not been fully studied or defined. A novel approach was utilized to determine the function of CCDC191 by combining gene expression analysis, patient survival analysis, differential gene expression, heat map with AI deep learning, and reverse engineering mathematical modeling. This study presents analyses and insights into gene CCDC191 which have not been performed previously, and it provides a replicable methodology which incorporates AI deep learning image classification and reverse engineering mathematical modeling to determine gene functions in pathways and cancer connectedness.
99
General
RFEX: Simple Random Forest Model and Sample Explainer for non-Machine Learning experts
Dragutin Petkovic, Ali Alavi, DanDan Cai, Jizhou Yang, Sabiha Barlaskar
San Francisco State University (all authors)
Dragutin Petkovic
Machine Learning (ML) is becoming an increasingly critical technology in many areas. However, its complexity and its frequent “non-transparency” create significant challenges, especially in the biomedical and health areas. One of the critical components in addressing the above challenges is the explainability or transparency of ML systems, which refers to the model (related to the whole data) and sample explainability (related to specific samples). Our research focuses on both model and sample explainability of Random Forest (RF) classifiers. Our RF explainer, RFEX, is designed from the ground up with non-ML experts in mind, and with simplicity and familiarity, e.g. providing a one-page tabular output and measures familiar to most users. In this paper we present significant improvement in RFEX Model explainer compared to the version published previously, a new RFEX Sample explainer that provides explanation of how the RF classifies a particular data sample and is designed to directly relate to RFEX Model explainer, and a RFEX Model and Sample explainer case study from our collaboration with the J. Craig Venter Institute (JCVI). We show that our approach offers a simple yet powerful means of explaining RF classification at the model and sample levels, and in some cases even points to areas of new investigation. RFEX is easy to implement using available RF tools and its tabular format offers easy-to-understand representations for non-experts, enabling them to better leverage the RF technology.
100
General
Apparent bias toward long gene misregulation in MeCP2 syndromes disappears after controlling for baseline variations
Ayush T. Raman¹,², Amy E. Pohodich², Ying-Wooi Wan², Hari Krishna Yalamanchili², William E. Lowry³, Huda Y. Zoghbi², Zhandong Liu²
¹Broad Institute of MIT and Harvard, ²Baylor College of Medicine, ³University of California Los Angeles
Ayush Raman
Background: Rett syndrome is a neurodevelopmental disorder caused by mutations in MECP2, a methyl-binding protein whose task is to orchestrate gene expression, and MeCP2 mutations disrupt the expression of several thousand genes. Over the past ten years, a number of studies observed that Rett syndrome and other disorders that affect neuronal synapses seem to preferentially dysregulate genes that are longer than 100 Kb. These length-dependent transcriptional changes in MeCP2-mutant samples are modest, but, given the low sensitivity of high-throughput transcriptome profiling technology, here we re-evaluate the statistical significance of these results.
Results: We develop a robust statistical approach to estimate noise accurately and identify statistically significant gene length-dependent changes. We find that the apparent length-dependent trends previously observed in MeCP2 microarray and RNA-sequencing datasets disappear after estimating baseline variability (i.e., intra-sample differences) from randomized control samples across 17 different publicly available MeCP2 datasets. We show that even MAQC/SEQC Phase-III benchmark datasets, which do not include MeCP2 or its effects on expression, are prone to the long gene bias, suggesting that the bias is not an inherent feature of gene expression following MeCP2 disruption. We hypothesized that PCR amplification, a process shared by both microarray and RNA-seq technologies, might introduce the observed bias in long gene expression. We find no bias with nanoString technology, a technique that does not use PCR amplification, for SEQC/MAQC samples or Mecp2 mutant samples. This confirmed our notion that the previous observations of long-gene bias resulted from amplification-based technologies and the failure to establish a proper baseline.
Conclusions: We conclude that accurate characterization of length-dependent (or other) trends requires establishing a baseline from randomized control samples. We propose that smaller fold changes in transcription observed after PCR amplification lead to an overestimation of long gene expression levels.
101
General
Prediction of chronological and biological age from laboratory data
Luke Sagers¹, Luke Melas-Kyriazi², Chirag J. Patel³, Arjun K. Manrai¹
¹Boston Children’s Hospital Computational Health Informatics Program, ²Harvard University Department of Mathematics, ³Harvard Medical School Department of Biomedical Informatics
Luke Sagers
Aging has pronounced effects on blood laboratory biomarkers used in the clinic. Prior studies have largely investigated a single biomarker or population at a time, limiting a comprehensive view of biomarker variation and aging across different populations. Here we develop a supervised machine learning approach to study the aging process using 356 blood biomarkers measured in 67,536 individuals across demographically diverse populations. Our model predicts age with a mean absolute error (MAE) in held-out data of 4.76 years and an R² value of 0.92. Age prediction was highly accurate for the pediatric cohort (MAE = 0.87, R² = 0.94) but inaccurate for ages 65+ (MAE = 4.30, R² = 0.25). Extensive variability was observed in which biomarkers carry the most predictive power across different age groups, genders, and race/ethnicity groups, and novel candidate biomarkers of aging were identified for specific age ranges (e.g. Vitamin E for ages 18-45). We further show that predictors accurate for one age group may fail to generalize to other groups, and find that nearly a third of all biomarkers exhibit non-linearity near adulthood. As populations worldwide undergo major demographic changes, it will be increasingly important to catalogue biomarker variation across age groups and discover new biomarkers to distinguish chronological and biological aging.
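The two reported metrics behave differently when the age range is narrow, which may help in reading the 65+ result (a similar MAE to the full cohort but a much lower R²). The Python sketch below computes both metrics on made-up numbers; the ages and predictions are hypothetical and are not taken from the study.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical wide age range: prediction errors of 2, 2, 3 and 3 years.
wide_true = [10.0, 30.0, 50.0, 70.0]
wide_pred = [12.0, 28.0, 53.0, 67.0]

# Hypothetical narrow age range with the same absolute errors.
narrow_true = [66.0, 70.0, 74.0, 78.0]
narrow_pred = [68.0, 68.0, 77.0, 75.0]

wide_mae, wide_r2 = mean_absolute_error(wide_true, wide_pred), r_squared(wide_true, wide_pred)
narrow_mae, narrow_r2 = mean_absolute_error(narrow_true, narrow_pred), r_squared(narrow_true, narrow_pred)
```

Both toy groups have the same MAE, but R² is lower in the narrow group because the total variance in age is smaller there, so the same errors account for a larger share of it.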
102
General
Whole genome sequencing analysis of influenza C virus in Korea
Sooyeon Lim, Han Sol Lee, Ji Yun Noh, Joon Young Song, Hee Jin Cheong, Woo Joo Kim
Division of Infectious Diseases, Department of Internal Medicine, Korea University College of Medicine, Seoul, South Korea; Division of Brain Korea 21 Program for Biomedicine Science, College of Medicine, Korea University, Seoul, South Korea; Asia Pacific Influenza Institute, Korea University College of Medicine, Seoul, South Korea
Sooyeon Lim
Through the Hospital-based Influenza Morbidity and Mortality (HIMM) surveillance system, 973 nasopharyngeal swab specimens from children under 2 years of age were collected and tested for influenza viruses using real-time PCR. Among the tested specimens, 383 were positive for influenza A and/or B virus. Influenza C virus was confirmed in five specimens. In this study, we used five influenza C virus positive specimens and a cell-cultured influenza C virus. Viral RNA was isolated using the QIAamp viral RNA mini kit (Qiagen, Hilden, Germany) following the manufacturer’s instructions. All isolated RNA was finally eluted with 60 µl of distilled water. The reverse transcription reaction was performed with the Primescript 1st strand cDNA synthesis kit (Takara, Shiga, Japan) using the uni-5’ primer. Genome-wide amplification of the influenza C virus was performed using Taq polymerase. Sequencing libraries were prepared from the amplified gene fragments using the Nextera XT DNA Library Prep kit (Illumina), according to the manufacturer’s protocol. This study is the first report of influenza C virus using NGS analysis in South Korea. In this study, young children with influenza C virus infections had acute respiratory illnesses with symptoms such as fever, rhinorrhea, and cough, but no pneumonia or severe respiratory illness was observed. Based on NGS analysis, we can expand our understanding of the various symptoms of influenza C virus.
103
General
Mining the Humuhumunukunukuapua and the Shaka of Autism with Big Data Biomedical Data Science
Peter Washington, Brianna Chrisman, Kaiti Dunlap, Aaron Kline, Arman Husic, Michael Ning, Kelley Marie Paskov, Nathaniel Stockham, Maya Varma, Emilie LeBlanc, Jack Kent, Yordan Penev, Min Woo Sun, Jae-Yoon Jung, Catalin Voss, Nick Haber, Dennis P. Wall
Departments of Pediatrics (Systems Medicine) and Biomedical Data Science, Stanford University
Dennis Wall
Mental health is arguably at the core of all health, and early childhood mental health predicts a long-term healthy life course. Yet finding, treating, and preventing mental health disorders in children are limited by reach and by the lack of scalable methods. Thankfully, advances in AI and ubiquitous technology have ushered in unparalleled opportunities for scalable mobile health. We have constructed a series of mobile solutions that treat and track while simultaneously building novel computer vision libraries for precision models. These solutions function as mobile games that are highly engaging and designed for the individual, encouraging compliance with the required “dose” while passively collecting metrics to measure, and ultimately predict, outcomes. We can quantify or digitize a child’s phenotype through these passively collected data, not just once, but many times, as the child plays our games and learns through playing. These games engender trust and, as they do, we “crowd” build a community of stakeholders that shares not only Phenome data, but also data on their Genome and the Environment. With the 3 modalities, we use data fusion multivariate techniques to resolve the G+E=P equation for autism and set the stage for doing the same in other spectrum disorders across mental health.
104
General
Development of a recurrence prediction model for early lung adenocarcinoma using radiomics-based artificial intelligence
Hee Chul Yang, Gunseok Park, Ji Eun Oh
Division of Convergence Technology, National Cancer Center Research Institute
Hee Chul Yang
Purpose: This study aimed at predicting the recurrence after curative resection for the patients with lung adenocarcinoma (ADC) using the phenotypic radiomics features obtained from the CT images.
Material: From January 1, 2010, to December 31, 2015, a total of 604 primary lung ADC patients who had the tumor size of 1-3 cm underwent curative resection at a single institution.
Method: A total of 604 patients’ preoperative CT images were used for feature extraction. The final dataset was randomized into a training set (n=424) and a test set (n=180) with the ratio of 7:3. Radiomics features were selected from t-test (P<0.05) and a radiomics signature was classified by the logistic regression model. The optimal model was evaluated through a ROC curve.
Result: In a logistic regression analysis, 6 radiomics features were finally selected from 51 features to build a radiomics signature that was significantly associated with recurrence. The optimal model was built with features associated with the dependent variable. They presented good performance in the prediction of recurrence alone with an AUC of 76.2% accuracy. The test set validated 72.2% accuracy.
Conclusion: The radiomics signature can be a useful recurrence prediction tool even in small-sized lung ADC.
105
General
DRLPC: Dimension Reduction of Sequencing Data using Local Principal Components
Yun Joo Yoo¹, Fatemeh Yavartanu¹, Shelley B. Bull²
¹Seoul National University, ²The Lunenfeld-Tanenbaum Research Institute
Yun Joo Yoo
Genome-wide association studies (GWAS) using single nucleotide polymorphism (SNP) data usually have millions of variables with complex correlation structure resulting from linkage disequilibrium. When multi-SNP joint analysis using multiple regression is applied, a dimension reduction method such as principal component analysis can be considered. Replacing SNP data with principal components can resolve multi-collinearity, which often occurs in regression using high-density sequencing or imputed SNP data. However, the principal components constructed from all SNP variables in a region are hard to interpret as a biological entity and are not useful for localization and fine mapping. In this study, we propose an algorithm, DRLPC (Dimension Reduction using Local Principal Components), to reduce the dimension for regression analysis by selecting clusters of SNPs in high correlation and replacing each cluster by a local principal component constructed from the SNPs in the cluster. The algorithm aims to resolve multicollinearity between updated variables by considering the variance inflation factor (VIF) and removing variables with high VIF. We examined the behaviour of DRLPC by applying the algorithm to the 1000 Genomes Project data. Chromosome 22 SNP sets of three populations (EUR, ASN, AFR) were dimension reduced for each gene region separately, comparing several choices of threshold values for clustering and principal components selection. When averaged across the genes, the ratio of the number of final variables over the number of original variables was 50% for the genes with 5~10 SNPs and as low as 10% for the genes with more than 1,000 SNPs. The reduction rate was smaller for the AFR population compared to the other populations EUR and ASN, possibly due to weaker LD in the African population. We also compared the power of multi-SNP tests constructed based on regression results obtained from the original data and dimension reduced data. These tests include generalized Wald, LC (linear combination) tests, and MLC (Multi-bins linear combination) tests. LC tests and MLC tests are also dimension reduction techniques in the sense that LC combines all individual effects into a one degree of freedom test and MLC combines the individual effects into a linear combination within a bin (cluster) and constructs a test with degrees of freedom equal to the number of clusters. Since DRLPC uses the same clustering algorithm based on clique partitioning as MLC, we compared results of MLC with original data to the DRLPC Wald test with processed data under the same clustering threshold and found that they yield similar power. We conclude that DRLPC can provide efficient dimension reduction while resolving multi-collinearity and can also lessen the problem of interpretability because these principal components represent smaller sized regions, possibly short haplotypes.
106
General
Meta-analysis in exhausted T cells from Homo sapiens and Mus musculus provides novel targets for immunotherapy
Lin Zhang¹, Yicheng Guo², Hafumi Nishi¹
¹Tohoku University Graduate School of Information Sciences, ²Columbia University, Department of Systems Biology
Lin Zhang
Using antibodies that target immune checkpoints to reverse T cell exhaustion is a promising approach for immunotherapy of cancers. However, the therapeutic efficacy is still low for known immune checkpoint targets, such as PD1 and CTLA4. T cell exhaustion is a state of T cell dysfunction during chronic infections and cancers. It exhibits several characteristic features, such as poor effector functions in a hierarchical manner, impaired memory T cell potential, and sustained upregulation and co-expression of multiple inhibitory receptors. The mechanism and pathways for T cell exhaustion remain to be fully described. In this study, we performed a meta-analysis with 7 datasets from both humans and mice to uncover the molecular mechanism of T cell dysfunction. Through gene set enrichment analysis, the predefined exhaustion gene sets were observed to be significantly enriched in the exhausted T cells. The differential expression analyses showed an overlap of 21 upregulated and 37 downregulated genes shared by exhausted T cells in humans and mice. These genes were significantly enriched in exhaustion response-related pathways, such as signal transduction, immune system process, and regulation of cytokine production. In addition, co-expression analysis identified 175 genes that were highly correlated with the exhaustion trait in humans and mice. Above all, our study revealed that TOX and CD200R1 might be considered as potential and high-efficiency targets for immunotherapy.
107
INTRINSICALLY DISORDERED PROTEINS (IDPS) AND THEIR FUNCTIONS
POSTER PRESENTATIONS
108
Intrinsically Disordered Proteins (IDPs) and Their Functions
Disordered Function Conjunction: On the in-silico function annotation of intrinsically disordered regions
Sina Ghadermarzi, Akila Katuwawala, Christopher J. Oldfield, Amita Barik, Lukasz Kurgan
Virginia Commonwealth University
Sina Ghadermarzi
Intrinsically disordered regions (IDRs) lack a stable structure, yet perform biological functions. The functions of IDRs include mediating interactions with other molecules, including proteins, DNA, or RNA, and entropic functions, including domain linkers. Computational predictors provide residue-level indications of function for disordered proteins, which contrasts with the need to functionally annotate the thousands of experimentally and computationally discovered IDRs. In this work, we investigate the feasibility of using residue-level prediction methods for region-level function predictions. For an initial examination of the multiple-function region-level prediction problem, we constructed a dataset of (likely) single-function IDRs in proteins that are dissimilar to the training datasets of the residue-level function predictors. We find that available residue-level prediction methods are only modestly useful in predicting multiple region-level functions. Classification is enhanced by simultaneous use of multiple residue-level function predictions and is further improved by inclusion of amino acid content extracted from the protein sequence. We conclude that multifunction prediction for IDRs is feasible and benefits from the results produced by current residue-level function predictors; however, it has to accommodate inaccuracy in functional annotations.
109
MUTATIONAL SIGNATURES
POSTER PRESENTATIONS
110
Mutational Signatures
Transcription-associated regional mutation rates and signatures in regulatory elements across 2,500 whole cancer genomes
Jüri Reimand
Ontario Institute for Cancer Research, University of Toronto
Jüri Reimand
The genomes of healthy and cancerous cells accumulate somatic mutations over time with complex variations across tissues and genomic contexts. Certain classes of functional elements of the genome are subject to differential mutation rates due to regionalized activities of mutational processes. To investigate regional mutations, we developed RM4RM, a statistical framework for detecting differential mutation rates and trinucleotide signatures in sets of genomic regulatory elements. To validate our model, we first analyzed CTCF binding sites across >2,500 whole cancer genomes of 39 cancer types of the ICGC-TCGA PCAWG cohort. We found significant mutation enrichments in CTCF sites in liver, esophageal, breast and other cancer types that were primarily driven by T>C/G mutations and multiple rare mutation signatures of unknown etiology. Transcription start sites (TSSs) of protein-coding genes and a broader set of experimentally defined regulatory elements derived from primary tumors of the TCGA project also showed significantly elevated regional mutation rates in multiple cancer types. TSS-specific regional mutation enrichment was particularly dominant in highly transcribed genes of matching tumors while none was apparent in silenced genes. In contrast, no mutation enrichment dependency on transcript abundance was observed in distal regulatory elements. These data indicate a transcription initiation-coupled mutational process active in multiple cancer types, supported by multiple mutational processes and trinucleotide signatures specifically enriched in highly transcribed TSSs. Our findings and statistical model enable detailed studies of the mechanisms of somatic mutagenesis and advance our understanding of genetic drivers of disease.
111
Mutational Signatures
Complex mosaic structural variations in human fetal brains
Shobana Sekar1, Livia Tomasini2, Maria Kalyva3, Taejeong Bae1, Logan Manlove1, Bo Zhou4, Jessica Mariani2, Fritz Sedlazeck5, Alexander E. Urban4, Christos Proukakis3, Flora M. Vaccarino2, Alexej Abyzov1
1Mayo Clinic, 2Yale University, 3University College London, 4Stanford University, 5Baylor College of Medicine
Alexej Abyzov
Somatic mosaicism in cells of the human brain is common and may have functional consequences that lead to diseases, including neurological ones. Mosaic variations in the brain can be point mutations, insertions of mobile elements, and structural changes. Previously we detected and described 200-400 mosaic point mutations per single-cell clone from cortices of three human fetuses (15 to 21 weeks post-conception). Here we describe four mosaic structural variations (SVs) in the same brains. The SVs were of kilobase scale and complex, i.e., consisting of deletion(s) and a few rearranged genomic fragments that sometimes originated from different chromosomes. Sequences at the breakpoints of the rearrangements had microhomologies, suggesting their origin from replication errors. One SV was found in two clones and we timed its origin to ~14 weeks post-conception. Our study reveals the existence of mosaic SVs, likely arising from cell proliferation, in the human brain in mid-neurogenesis.
112
PATTERN RECOGNITION IN BIOMEDICAL DATA: CHALLENGES IN PUTTING BIG DATA TO WORK
POSTER PRESENTATIONS
113
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Stratification of kidney transplant recipients based on temporal disease trajectories
Isabella Friis Jørgensen PhD1, Søren Schwartz Sørensen PhD2, Søren Brunak PhD1
1Novo Nordisk Foundation Center for Protein Research - Faculty of Health and Medical Sciences - University of Copenhagen - Blegdamsvej 3B - DK-2200 Copenhagen N - Denmark; 2Department of Nephrology - Rigshospitalet - Copenhagen University Hospital - Blegdamsvej 9 - DK-2100 Copenhagen Ø - Denmark
Isabella Friis Jørgensen
Organ transplantations often improve the lives of chronically sick patients. However, immune-suppressive medication given to transplant recipients increases the risk of complications, especially infections and infection-related death. One in five kidney transplant recipients dies from infection. We want to stratify kidney transplant recipients into groups of patients with different patterns of infectious diseases and mortality to predict which patients have a higher risk of specific infections. We use the Danish National Patient Registry (DNPR), which contains hospital diagnoses for 6.9 million patients from the entire Danish population from 1994 to 2018. We use a previously published method to identify significant time-dependent disease trajectories for all patients with a kidney transplantation. Subsequently, we use hierarchical clustering of Jaccard distances between the disease trajectories to find distinct groups of trajectories from kidney transplant recipients. In the DNPR, we identified 5,644 patients with a kidney transplantation, resulting in 43 significant disease trajectories that consist of three consecutive diseases, including several infection-related diagnoses. More than 87% of the kidney transplantation recipients follow at least one of these trajectories; hence, they are diagnosed with the three diseases in the order the trajectory specifies. Clustering reveals two main groups of temporal disease trajectories. We identify patients following the two groups of disease trajectories and discover significant differences in mortality after kidney transplantation between patients following different disease trajectories. This study used previous disease history from large-scale hospital diagnoses to stratify common temporal disease trajectories into two distinct groups. Depending on the type of trajectory that kidney transplantation recipients follow, significant differences in mortality are seen. These methods can be used to guide clinicians about higher risks of certain infections and mortality in certain groups of kidney transplant recipients.
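The clustering step named in this abstract (hierarchical clustering of Jaccard distances) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each trajectory is represented by the set of patients who follow it and that average linkage is used; the abstract specifies neither. The trajectory labels and patient IDs are hypothetical toy data, not DNPR results.

```python
from itertools import combinations
from typing import Dict, List, Set


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """1 - |A & B| / |A | B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def average_linkage(items: Dict[str, Set[int]], n_clusters: int) -> List[Set[str]]:
    """Agglomerative clustering with average linkage on Jaccard distances."""
    clusters: List[Set[str]] = [{name} for name in items]
    while len(clusters) > n_clusters:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            dists = [jaccard_distance(items[x], items[y])
                     for x in clusters[i] for y in clusters[j]]
            avg = sum(dists) / len(dists)
            if best is None or avg < best[0]:
                best = (avg, i, j)
        _, i, j = best
        # Merge the closest pair; j > i, so deleting j keeps index i valid.
        clusters[i] = clusters[i] | clusters[j]
        del clusters[j]
    return clusters


# Hypothetical toy data: trajectory label -> IDs of patients following it.
trajectories = {
    "T1": {1, 2, 3, 4},
    "T2": {2, 3, 4, 5},
    "T3": {10, 11, 12},
    "T4": {11, 12, 13},
}
groups = average_linkage(trajectories, n_clusters=2)
```

In this toy example the trajectories that share patients end up in the same group. A different set representation (for example, the diagnoses in each trajectory) would only change the sets passed to `jaccard_distance`.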
114
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Modeling Gene Expression Levels from Epigenetic Markers Using a Dynamical Systems Approach
James Brunner1, Jacob Kim2, Kord M. Kober3
1Mayo Clinic, Rochester, MN; 2Columbia University, New York, NY; 3University of California, San Francisco, CA
Kord Kober
Gene regulation is an important fundamental biological process and involves a number of complex biological processes that are essential for development and adaptation to the environment. Understanding the role of epigenetic changes in gene expression is a fundamental question of molecular biology. Predicting gene expression from epigenetic data is an active area of research, and previous studies have used statistical approaches for building prediction models. Dynamical systems can be used to generate a model to predict gene expression using epigenetic data and a gene regulatory network (GRN). By dynamically simulating hypothesized mechanisms of transcriptional regulation, we provide predictions based directly on these biological hypotheses. Furthermore, a stochastic dynamical system provides us with a distribution of gene expression estimates, representing the possibilities that may occur within the cell. The purpose of this study is to develop and evaluate a stochastic dynamical systems model predicting gene expression levels from epigenetic data for a given GRN. We model gene regulation using a piecewise-deterministic Markov process (PDMP) where transcription factor (TF) binding is a Boolean random variable representing the bound/unbound state of a binding site region of DNA. TF binding is given as the difference of two Poisson jump processes (i.e., binding and unbinding), so that the time between binding and unbinding events is exponentially distributed with propensities taken to be linear functions of the available TF. Epigenetic modification of the TF binding site impacts the binding propensity of TF and is measured as the percentage of methylated bases (i.e., beta). We use a linear ordinary differential equation based on the underlying GRN to determine the value of the transcript between TF binding or unbinding events. We include baseline transcription and decay and are able to solve exactly between jumps of binding/unbinding events. In a discrete-space, continuous-time Markov process, the equilibrium distribution can be estimated by sampling from a realization of the process. For our continuous-space PDMP we can estimate the equilibrium distribution in a similar manner using kernel density estimation with a Gaussian kernel. We estimate the marginal distributions of various gene variables with a 1-dimensional kernel. We use a GRN assumed to be known to create a model of gene regulation that includes TF binding dynamics. We associate binding sites with the genes that they regulate and use these associations to create a bipartite graph. The GRN and training/testing data are created from publicly available data. The epigenetic parameter is assumed to be measurable. The remaining parameters are estimated using a negative log-likelihood minimization procedure. We can compute a log-likelihood for a set of paired epigenetic and transcription samples by time averaging a sample path against a Gaussian kernel. We report on the design and evaluation of the model’s performance.
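A minimal Python sketch may help make the PDMP construction concrete. It simulates a single gene with a single binding site, which is far simpler than the GRN-based model in the abstract. The scaling of the binding propensity by `(1 - beta)`, the function name `simulate_pdmp`, and all parameter values are assumptions chosen for illustration. It also returns only a time average of the transcript level and does not perform the kernel density estimation the authors describe.

```python
import math
import random


def simulate_pdmp(beta, tf=1.0, k_bind=2.0, k_unbind=1.0,
                  base=0.1, gain=1.0, decay=0.5,
                  t_end=2000.0, seed=0):
    """Time-averaged transcript level of a one-gene, one-site PDMP (toy model)."""
    rng = random.Random(seed)
    t, x, bound = 0.0, 0.0, False
    integral = 0.0  # running integral of x(t) dt
    while t < t_end:
        # Jump propensity: linear in available TF; the (1 - beta) factor is an
        # assumed way for methylation to reduce the binding propensity.
        rate = k_unbind if bound else k_bind * tf * (1.0 - beta)
        wait = rng.expovariate(rate) if rate > 0 else float("inf")
        remaining = t_end - t
        jump = wait < remaining
        dt = wait if jump else remaining
        # Between jumps: dx/dt = base + gain*bound - decay*x, solved exactly.
        x_star = (base + gain * bound) / decay
        x_next = x_star + (x - x_star) * math.exp(-decay * dt)
        # Exact integral of x over the interval [t, t + dt].
        integral += x_star * dt + (x - x_next) / decay
        x = x_next
        t = t + dt if jump else t_end
        if jump:
            bound = not bound  # binding or unbinding event
    return integral / t_end


unmethylated = simulate_pdmp(beta=0.0)
methylated = simulate_pdmp(beta=0.9)
```

Under these assumptions, a more heavily methylated site binds less often, so the time-averaged transcript level drops toward the baseline `base / decay`.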
115
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Translating Big Data neuroimaging findings into measurements of individual vulnerability
Peter Kochunov1, Paul Thompson2, Neda Jahanshad2, Elliot Hong1
1University of Maryland School of Medicine, Maryland, USA; 2University of Southern California, California, USA
Peter Kochunov
We propose an intuitive, anatomically informed approach to derive an index of similarity between individual brain patterns and the expected patterns of neuropsychiatric disorders based on Big Data neuroimaging studies. Big Data neuroimaging studies, such as those performed by the Enhancing NeuroImaging Genetics through Meta-Analysis (ENIGMA) consortium, have provided the scientific community with the regional patterns of effect sizes in common neuropsychiatric disorders such as schizophrenia (SZ), bipolar and major depressive disorders (BP and MDD), epilepsy (EP), Alzheimer’s dementia (AD), mild cognitive impairment (MCI) and others. These patterns describe regional deficits using standardized sMRI, dMRI and rsfMRI workflows. They are derived from statistically powerful and inclusive samples and are highly reproducible (r=0.8-0.9) in independent samples. We developed the “Regional Vulnerability Index” (RVI) to measure similarity between an individual and the expected pattern of the patient-control differences. RVI can be calculated for a single imaging modality or across imaging modalities. A single-modality RVI, for example one using the fractional anisotropy (FA) measure from dMRI, is calculated as follows. FA for each of the 23 major white matter regions, as defined by the ENIGMA atlas, in an individual is converted to z-values by (A) calculating the residual values after regressing out age and sex effects for this region, (B) subtracting the average value for the region, and (C) dividing by the standard deviation calculated from the healthy controls. This produces a vector of 23 z-values (one per region) for each individual in the sample. RVI is calculated as the correlation coefficient between the 23 region-wise z-values for the subject and the patient-control effect sizes in ENIGMA. RVI takes values from 1 (individual pattern is aligned with the disorder pattern) to -1 (individual pattern is in anti-alignment). For cross-modality research, RVI can be expanded hierarchically by building a combined vector that includes multiple phenotypes. For example, the RVI-White Matter calculation uses a vector of 69 values that combines tract-wise FA, radial (RaD) and axial (AxD) diffusivity values per person. To merge effect sizes across diverse domains, we use a pseudo-ordinary transformation that maps effect sizes between 0 and 1 while preserving the relative distance between them. We first demonstrated that RVI-SZ values are significantly elevated in patients with SZ and are also predictive of treatment resistance. That is, subjects who developed resistance to modern antipsychotic medications had significantly higher RVI-SZ values than those who responded to treatment. We next demonstrated that RVI for SZ was significantly correlated with RVI for AD but not MCI, due to significant overlap in deficit patterns between these disorders. We next showed that calculating RVI across multiple modalities produces vulnerability measures that are more sensitive to patient-control differences in the independent datasets and show stronger sensitivity to cognitive deficits and negative symptoms. The RVI calculator tools are distributed with the solar-eclipse software (www.solar-eclipse-genetics.org).
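The single-modality RVI calculation described in this abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the solar-eclipse implementation: it assumes the regional values have already been residualized for age and sex (step A), uses four hypothetical regions instead of 23, and all numbers are made-up toy data rather than ENIGMA effect sizes.

```python
import math
from statistics import mean, stdev
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def regional_vulnerability_index(subject: Sequence[float],
                                 controls: List[Sequence[float]],
                                 effect_sizes: Sequence[float]) -> float:
    """Correlate a subject's region-wise z-values with disorder effect sizes."""
    z = []
    for region, value in enumerate(subject):
        reference = [c[region] for c in controls]
        # Steps (B) and (C): center and scale using the healthy controls.
        z.append((value - mean(reference)) / stdev(reference))
    return pearson(z, effect_sizes)


# Hypothetical toy data (age- and sex-adjusted FA values, 4 regions).
controls = [
    [0.50, 0.60, 0.40, 0.55],
    [0.52, 0.58, 0.42, 0.53],
    [0.48, 0.62, 0.38, 0.57],
]
effect_sizes = [-0.4, -0.2, -0.1, -0.3]  # assumed patient-control effect sizes
aligned = [0.46, 0.58, 0.39, 0.52]       # deficits follow the disorder pattern
mirrored = [0.54, 0.62, 0.41, 0.58]      # pattern opposite to the disorder

rvi_aligned = regional_vulnerability_index(aligned, controls, effect_sizes)
rvi_mirrored = regional_vulnerability_index(mirrored, controls, effect_sizes)
```

The two toy subjects are constructed to sit at the extremes of the index, one aligned with the assumed disorder pattern and one in anti-alignment. A cross-modality version would concatenate the z-values of several measures into one longer vector before the correlation.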
116
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Automating new-user cohort construction with indication embeddings
Rachel D. Melamed
Department of Computational Biomedicine and Biomedical Data, University of Chicago
Rachel Melamed
The electronic health record is a rising resource for quantifying medical practice and discovering adverse effects of drugs. One of the challenges of healthcare data is the high dimensionality of the health record. Any study of patterns in health data must account for tens of thousands of potentially relevant diagnoses or treatments. In this work, we develop indication embeddings, a way to reduce the dimensionality of health data while capturing the information relevant to treatment decisions. We demonstrate that these embeddings recover therapeutic uses of drugs. Then we use these embeddings as an informative representation of relationships between drugs, between health history events and drug prescriptions, and between patients at a particular time in their health history. We show the application of these embeddings in areas of current research. For drug safety studies, particularly retrospective cohort studies, our low-dimensional representation helps in finding comparator drugs and constructing comparator cohorts. This enables us to develop an automated approach to choose comparator cohorts for a treated population.
117
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Reproducibility-optimized statistical testing for omics studies
Tomi Suomi, Laura Elo
Turku Bioscience Centre, University of Turku and Åbo Akademi University, Turku, Finland
Laura Elo
Differential expression analysis is one of the most common types of analyses performed on various biological and biomedical data, including, e.g., RNA-sequencing and mass spectrometry proteomics. It is the process that detects features, such as genes or proteins, showing statistically significant differences between the sample groups under comparison. However, as different test statistics perform well in different datasets, the choice of an appropriate test statistic has remained a major challenge. To address the challenge, our reproducibility-optimized test statistic (ROTS) optimizes the statistic on the basis of the data by maximizing the reproducibility of the top-ranked features through a bootstrap procedure. Finally, it provides a ranking of the features according to their statistical evidence for differential expression between the sample groups. We have shown the robust performance of ROTS in a range of studies from transcriptomics to proteomics, covering both bulk and single cell measurements. ROTS is freely available as an R package in Bioconductor.
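The idea of choosing a test statistic by the bootstrap reproducibility of its top-ranked features can be illustrated with a simplified Python sketch. This is not the ROTS implementation from Bioconductor: the family of statistics (a mean difference divided by `a1 + a2 * standard error`), the overlap score, the candidate parameter pairs, and the toy data are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def statistic(g1, g2, a1, a2):
    """|mean difference| / (a1 + a2 * pooled standard error); 0.0 if undefined."""
    n1, n2 = len(g1), len(g2)
    pooled = ((n1 - 1) * variance(g1) + (n2 - 1) * variance(g2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    denom = a1 + a2 * se
    return abs(mean(g1) - mean(g2)) / denom if denom > 0 else 0.0


def top_k(data1, data2, a1, a2, k):
    """Indices of the k features with the largest statistic."""
    scored = sorted(
        ((statistic(r1, r2, a1, a2), i)
         for i, (r1, r2) in enumerate(zip(data1, data2))),
        reverse=True,
    )
    return {i for _, i in scored[:k]}


def resample(data, rng):
    """Bootstrap the samples (columns) of a features-by-samples table."""
    n = len(data[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return [[row[j] for j in cols] for row in data]


def reproducibility(data1, data2, a1, a2, k, n_boot=50, seed=0):
    """Mean top-k overlap between pairs of bootstrap datasets."""
    rng = random.Random(seed)
    overlaps = []
    for _ in range(n_boot):
        first = top_k(resample(data1, rng), resample(data2, rng), a1, a2, k)
        second = top_k(resample(data1, rng), resample(data2, rng), a1, a2, k)
        overlaps.append(len(first & second) / k)
    return mean(overlaps)


# Hypothetical toy data: 20 features, 5 samples per group; features 0 and 1 differ.
rng = random.Random(1)
group1 = [[rng.uniform(-1, 1) + (10 if f < 2 else 0) for _ in range(5)]
          for f in range(20)]
group2 = [[rng.uniform(-1, 1) for _ in range(5)] for f in range(20)]

# Candidate (a1, a2) pairs: mean difference, t-like, and a mixture.
candidates = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
scores = {a: reproducibility(group1, group2, a[0], a[1], k=2) for a in candidates}
best = max(scores, key=scores.get)
```

The statistic with the highest reproducibility score would then be used to rank the features of the full dataset. The sketch compares only three candidates and one list size, whereas a practical procedure would search a larger family.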
118
Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work
Data Integration Expectation Maps: Towards more informed 'omic data integration
Tia Tate1, Christian Richardson2, ClarLynda Williams-DeVane3
1University of North Carolina-Charlotte, 2Duke University, 3Fisk University
ClarLynda Williams-DeVane
Innovative data technologies and decreasing costs have expanded the scope of available data relating to various diseases. A vast amount of -omics data generated at diverse levels (DNA, RNA, protein, metabolite and epigenetic) has revealed relationships among various biological processes. Generally, these diverse data types are considered independently, while combinations of two or more data types are less explored. This narrow approach often fails to identify the intricate interactions responsible for the etiology of complex disease. Complete biological models of complex diseases are only likely to be discovered if the various levels of -omic mechanisms are considered from an integrative perspective. Integrative models often require the integration of biological, computational, mathematical, and statistical domains. However, a well-documented shortage of researchers with a command of multiple domains exists. Thus, we have proposed the use of Data Integration Expectation Maps (DIEMs) as visual tools for facilitating the understanding of integrating various -omic data types to understand complex diseases by filling in gaps in biological knowledge. DIEMs provide a user-friendly format for understanding integrative model development in complex diseases by 1) identifying data formats that can be and/or have been integrated, 2) providing guidance on the best method to integrate the data, and 3) providing an expectation of biological insight to be gained from the integration.
119
PRECISION MEDICINE: ADDRESSING THE CHALLENGES OF SHARING, ANALYSIS, AND PRIVACY AT SCALE
POSTER PRESENTATIONS
120
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Integrated omics data mining of synergistic gene pairs for cancer precision medicine
Euna Jeong, Choa Park, Sukjoon Yoon
Sookmyung Women's University
Euna Jeong
Current high-throughput technologies enable simultaneous acquisition of multi-level omics and RNAi/chemical screening data in cancers. Production and integration of these data help identify associations of drug targets and synergistic biomarkers (mutations or gene expression), thus accelerating their clinical applications and patient stratification. We have extensively carried out cancer big data mining and phenotypic siRNA library screening to find the optimal combination of targets and biomarkers for advanced cancer therapies such as regulating cancer stem-like cells (CSLCs) and oncogenic transcription factors. Our multiplexed screening dissects phenotypic responses into sensitivity and resistance to the target knockdown. Combined with mutaome and transcriptome data of screened cell lines, targetome-wide knockdown data reveal the functional aspect of synergistic effects between target siRNAs and mutation/transcription signatures, leading to the discovery of novel synthetic lethal gene pairs. Production and integration of these data enabled us to identify target-biomarker combinations for accelerating their clinical applications and patient stratification.
121
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
The power of dynamic social networks to predict individuals' mental health
Shikang Liu1, David Hachen1, Omar Lizardo2, Christian Poellabauer1, Aaron Striegel1, Tijana Milenkovic1
1University of Notre Dame, 2University of California at Los Angeles
Shikang Liu
Precision medicine has received attention both in and outside the clinic. We focus on the latter, by exploiting the relationship between individuals' social interactions and their mental health to predict one's likelihood of being depressed or anxious from rich dynamic social network data. Existing studies differ from our work in at least one aspect: they do not model social interaction data as a network; they do so but analyze static network data; they examine "correlation" between social networks and health but without making any predictions; or they study other individual traits but not mental health. In a comprehensive evaluation, we show that our predictive model that uses dynamic social network data is superior to its static network as well as non-network equivalents when run on the same data.
122
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
Robust-ODAL: Learning from heterogeneous health systems without sharing patient-level data
Jiayi Tong1, Rui Duan1, Ruowang Li1, Martijn J. Schuemie2, Jason H. Moore1, Yong Chen1
1University of Pennsylvania, 2Janssen Research and Development LLC
Jiayi Tong
Electronic Health Records (EHR) contain extensive patient data on various health outcomes and risk predictors, providing an efficient and wide-reaching source for health research. Integrated EHR data can provide a larger sample size of the population to improve estimation and prediction accuracy. To overcome the obstacle of sharing patient-level data, distributed algorithms were developed to conduct statistical analyses across multiple clinical sites through sharing only aggregated information. However, the heterogeneity of data across sites is often ignored by existing distributed algorithms, which leads to substantial bias when studying the association between the outcomes and exposures. In this study, we propose a privacy-preserving and communication-efficient distributed algorithm which accounts for the heterogeneity caused by a small number of the clinical sites. We evaluated our algorithm through a systematic simulation study motivated by real-world scenarios and applied our algorithm to multiple claims datasets from the Observational Health Data Sciences and Informatics (OHDSI) network. The results showed that the proposed method performed better than the existing distributed algorithm ODAL and a meta-analysis method.
123
Precision medicine: addressing the challenges of sharing, analysis, and privacy at scale
PharmGKB: Automated Literature Annotations
Michelle Whirl-Carrillo1, Li Gong1, Rachel Huddart1, Katrin Sangkuhl1, Ryan Whaley1, Mark Woon1, Julia M. Barbarino2, Jake Lever3, Russ B. Altman4, Teri E. Klein5
1Department of Biomedical Data Science, Stanford University; 2Formerly Department of Biomedical Data Science, Stanford University; 3Department of Bioengineering, Stanford University; 4Departments of Bioengineering, Medicine and Genetics, Stanford University; 5Departments of Biomedical Data Science and Medicine, Stanford University
Michelle Whirl-Carrillo
PharmGKB is the largest publicly available resource for pharmacogenomics (PGx) discovery and implementation. Its mission is to collect, curate, integrate and disseminate knowledge about how human genetic variation influences drug response. PharmGKB scientists manually curate the primary literature to capture details of published pharmacogenomic studies such as variant-gene-drug-phenotype associations, statistical significance, study size and population characteristics. PharmGKB refers to these manually created annotations as “Variant Annotations.”
124
PACKAGING BIOCOMPUTING SOFTWARE TO MAXIMIZE DISTRIBUTION AND REUSE
WORKSHOP POSTER PRESENTATIONS
125
Workshop: Packaging Biocomputing Software to Maximize Distribution and Reuse
Apollo provides Collaborative Genome Annotation Editing with the power of JBrowse
Nathan Dunn1, Colin Diesh2, Robert Buels2, Helena Rasche3, Anthony Bretaudeau4, Nomi Harris1, Ian Holmes2
1Lawrence Berkeley National Lab, 2UC Berkeley, 3University of Freiburg, 4INRA
Nathan Dunn
Genome annotation projects involve multi-step workflows that are largely automated. However, even with a fully automated annotation pipeline, visual inspection and refinement of diverse types of information such as genomic and transcriptome alignments and predictive models based on sequence elements are critical to assure and improve the accuracy of the genome annotations prior to publication. To this end, Apollo (https://github.com/GMOD/Apollo/) is a web application that provides responsive and customizable visualization and editing of genomic elements. Built on top of the JBrowse genome browser (http://jbrowse.org/) and its large registry of plugins (https://gmod.github.io/jbrowse-registry/), Apollo supports efficient annotation curation through drag-and-drop editing, a large suite of automated structural edit operations, the ability to pre-define curator comments and annotation status to maintain consistency, attribution of annotation authors, fine-grained user and group access and edit permissions, and a visual history of revertible annotation edits. Setting up a new genome annotation in Apollo is straightforward. Apollo can be run from Docker or from provided AWS instances, and genomes with feature evidence can be retrieved from an existing JBrowse directory. We have also recently enabled researchers to upload their genome sequence and features (in FASTA, VCF, BAM, or GFF3 format) directly to Apollo, minimizing the need for scripting or server access. It is also possible to create annotations on the fly from BLAT or BLAST search results, which provides a way to initiate a gene previously annotated on a closely related species. Apollo provides a Python library that wraps the web services (https://github.com/galaxy-genome-annotation/python-apollo) so that workflow environments such as Galaxy can be automated, allowing the output of an automated workflow to directly create genome projects, provide evidence, and manage access to an Apollo instance. Apollo supports several popular formats for data export. Structural genome annotations can be exported as FASTA, GFF3, or VCF (if annotating variants) along with any associated metadata. Functional annotations mapped to Gene Ontology terms can be exported in GPAD2 or GPI2 format. Apollo is an open-source tool used in over one hundred genome annotation projects around the world, ranging from the annotation of a single species to lineage-specific efforts supporting the annotation of dozens of genomes.
https://github.com/GMOD/Apollo/
https://genomearchitect.readthedocs.io/
126
Workshop: Packaging Biocomputing Software to Maximize Distribution and Reuse
g:Profiler - One functional enrichment analysis tool, many interfaces serving life science communities
Liis Kolberg, Uku Raudvere, Ivan Kuzmin, Jaak Vilo, Hedi Peterson
University of Tartu
Making sense of gene lists plays an important role in the majority of biological and biomedical experiments. Several methods and tools help scientists carry the computational load of these tasks. One such tool is g:Profiler (https://biit.cs.ut.ee/gprofiler), a widely used toolset for functional interpretation and conversion of gene lists from hundreds of species. g:Profiler has served the community since 2007 and continues to provide life scientists with the most up-to-date data and methods to this day. Keeping the service trustworthy and the results reproducible and transparent has been the main goal of the team developing g:Profiler. Success to this end is indicated by the increasing number of user requests per year, which in 2019 alone is already close to 9 million queries. These millions of queries originating across the world reflect the diversity of usage preferences, skill sets and research goals of the scientific community. We, as the developers of g:Profiler, have taken this into account by developing and supporting different access options, which, in hindsight, has been a huge factor in the increasing user traffic. On the one hand, the g:Profiler web application provides researchers who want quick and easily interpretable results with visualizations, searchable tables and data export options. On the other hand, there is a large bioinformatics community whose members prefer to analyze gene lists in an automated manner. We support them by offering standardized access through public APIs. And, as R and Python are the most popular programming languages among life scientists with informatics expertise, we have simplified the usage of the APIs by wrapping them into corresponding packages named gprofiler2 and gprofiler-official, respectively.
For the users somewhere in between, g:Profiler is also available from the Galaxy platform, which is a popular framework for data intensive biomedical research pipelines run in a graphical user interface. It is clear that the tools in such an interdisciplinary field need to be flexible in order to fully benefit the research community. However, from our experience, the complexity of providing a widely distributed toolset lies in the maintenance of the services rather than in the development, and this is the core reason for depreciation of tools. In g:Profiler the separate interfaces all use the data and methods from a shared hub making them reliable and consistent with each other even after the frequent data updates. We are positive that g:Profiler has been able to help thousands of researchers across the life science community because our priorities have been to reuse high quality and regularly updated data, and to maximize the access options so that we would not leave any life science subcommunity behind.
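As a rough illustration of the automated, API-based access described above, the following Python sketch prepares an enrichment request for a gene list without sending it. The endpoint path, the JSON field names (`organism`, `query`) and the helper `build_gost_request` are assumptions made for illustration and should be checked against the g:Profiler API documentation; the gprofiler2 and gprofiler-official packages wrap this kind of call so users do not have to write it themselves.

```python
import json
import urllib.request

# Assumed endpoint path under the service URL given in the abstract;
# verify against the g:Profiler API documentation before use.
GOST_URL = "https://biit.cs.ut.ee/gprofiler/api/gost/profile/"


def build_gost_request(genes, organism="hsapiens"):
    """Build (but do not send) a POST request for a functional enrichment query.

    Hypothetical helper: the payload field names are assumptions.
    """
    # Drop blanks and duplicates while keeping the caller's gene order.
    query = list(dict.fromkeys(g.strip() for g in genes if g.strip()))
    if not query:
        raise ValueError("gene list is empty")
    payload = {"organism": organism, "query": query}
    return urllib.request.Request(
        GOST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_gost_request(["NR1H4", "TRIP12", "UBC", "NR1H4"])
body = json.loads(request.data)
print(body["query"])  # ['NR1H4', 'TRIP12', 'UBC']
# Sending is left to the caller, e.g. urllib.request.urlopen(request).
```

Keeping request construction separate from sending makes the payload easy to inspect and test offline, which suits the reproducibility goal the abstract emphasizes.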
127
Workshop: Packaging Biocomputing Software to Maximize Distribution and Reuse
Increasing usability and dissemination of the PathFX algorithm using web applications and docker systems
Jennifer Wilson1, Nicholas Stepanov2, Ajinkya Chalke2, Mike Wong3, Dragutin Petkovic2, Russ B. Altman4
1Department of Chemical & Systems Biology at Stanford University; 2Computer Science Dept at San Francisco State University; 3COSE Computing for Life Sciences at San Francisco State University; 4Helix Group at Stanford University
Presenting author: Mike Wong
Limited efficacy and unacceptable safety confound therapeutic development. Identifying potential liabilities earlier in drug development could significantly improve success rates. Recently, in collaboration with the US FDA, we developed the PathFX algorithm and the openly available PathFX web application for better understanding pathway-level safety and efficacy phenotypes associated with a drug’s target(s). Running the PathFX algorithm locally would enable improved efficiency, security, and privacy; however, installation of PathFX and its dependencies is challenging for non-computational scientists and prevents dissemination. In addition, while PathFX-web quickly analyzes network associations, the phenotype clustering feature has high computational costs that limit the efficiency of the shared cloud server. To resolve these challenges, we developed the PathFX-web Docker container, which provides an easy-to-install, easy-to-use web interface and a standalone command-line formulation of PathFX, adds security and privacy, and allows leveraging the computational power of the user’s hardware.
128
TRANSLATIONAL BIOINFORMATICS WORKSHOP: BIOBANKS IN THE PRECISION MEDICINE ERA
WORKSHOP POSTER PRESENTATIONS
129
Workshop: TBI workshop
Identification of biomarkers related to autism spectrum disorder using genomic information
Leena Sait, Martha Gizaw, Iosif Vaisman
School of Systems Biology, George Mason University
Presenting author: Leena Sait
Autism spectrum disorder (ASD) is one of the most common neurodevelopmental disorders. Worldwide, ASD tends to have a prevalence of one per 132 persons, with an estimated prevalence of 1 in 59 children, according to CDC’s Autism and Developmental Disabilities Monitoring Network. To date, no effective medical treatments for the core symptoms of ASD exist. However, biomarkers capable of detecting and diagnosing ASD can help to translate experimental research results to bedside clinical practice. Biomarker discovery in ASD is complicated by the diversity of core symptoms, which comprise deficits in social communication, presence of rigid, repetitive and stereotypical behaviors, and comorbid medical (e.g., epilepsy) or psychiatric symptoms. The EU-AIMS Longitudinal European Autism Project (LEAP), the largest consortium, has made a great advance in the discovery of biomarkers for ASD. It seeks to identify stratification biomarkers using neurobiological or neurocognitive measures, neuroimaging, electrophysiology, biochemistry and genetics. This work is aimed at the identification of single nucleotide polymorphisms (SNPs) based on SNP genotyping in genomic DNA in a large cohort of ASD patients and unaffected related individuals to help understand the exact genetic causes of ASD. We hypothesized that ranking the genes based on distance in the space of the allele frequencies between affected and unaffected populations can be used to identify new putative biomarkers. The dataset retrieved from the Gene Expression Omnibus database (GSE6754) contains more than 6000 samples from 1,400 families. Our results show that the SNPs that are highly ranked by the distance in three-dimensional genotype count space between all the affected and unaffected subjects in the cohort are more likely to be linked to ASD. These results can open new possibilities for further investigation in identifying the genetic mechanisms of ASD.
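The ranking idea in this abstract can be sketched in a few lines of Python. The abstract does not state the distance metric or any normalization, so this sketch assumes Euclidean distance between genotype-count vectors (AA, AB, BB) that are first scaled to proportions so that groups of different sizes are comparable. The function names and the toy counts are hypothetical and are not taken from GSE6754.

```python
import math


def genotype_distance(affected, unaffected):
    """Euclidean distance between two genotype-count vectors (AA, AB, BB).

    Assumption: counts are scaled to proportions before comparison.
    """
    def proportions(counts):
        total = sum(counts)
        if total == 0:
            raise ValueError("no genotyped samples")
        return [c / total for c in counts]

    p, q = proportions(affected), proportions(unaffected)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def rank_snps(counts_by_snp):
    """Return SNP identifiers sorted from largest to smallest distance."""
    scored = {
        snp: genotype_distance(aff, unaff)
        for snp, (aff, unaff) in counts_by_snp.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Toy, made-up counts: (affected AA/AB/BB, unaffected AA/AB/BB).
toy_counts = {
    "snp_a": ((50, 30, 20), (50, 30, 20)),  # identical distributions
    "snp_b": ((70, 20, 10), (40, 40, 20)),  # strongly shifted
    "snp_c": ((45, 35, 20), (50, 30, 20)),  # slightly shifted
}
ranking = rank_snps(toy_counts)
print(ranking)  # ['snp_b', 'snp_c', 'snp_a']
```

A real analysis would also need to account for family structure and multiple testing, which this sketch leaves out.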
130
Workshop: TBI workshop
A pan-cancer 3-gene signature to predict dormancy
Ivy Tran1, Anchal Sharma2, Subhajyoti De2
1Rutgers University-Camden, 2Rutgers Cancer Institute of New Jersey
Presenting author: Ivy Tran
Tumor dormancy is characterized by the dissemination of hibernating tumor cells that do not proliferate until years after apparently successful removal of patients’ primary cancer, resulting in late relapse of the cancer. Distinguishing between the risk of early (≤8 months) and late (≥5 years) relapse in cancer patients is important for the targeted treatment of the tumor. In this study, we identified 53 genes that were significantly up-regulated or down-regulated in dormant cells, from which three genes, CD300LG, OCIAD2 and VSIG4, were determined by recursive feature elimination to be the most important features in predicting tumor dormancy. Using this three-gene signature, we trained a Random Forest algorithm on a cross-validated (10-fold, repeated 3 times) dataset (n=422) randomly subsetted into training data (75%) and test data (25%), consisting of seven different tumor types: testicular cancer, breast cancer, glioblastoma multiforme, lung cancer, colorectal cancer, kidney cancer and melanoma. The tuned prediction model yielded 80.19% prediction accuracy using confusion matrix analysis, and 82.74% when using the AUC of a ROC curve as the accuracy metric. When independently testing the model on a validation set (n=44) of liver cancer downloaded from ICGC, confusion matrix analysis yielded 67.44% accuracy and the AUC of a ROC curve yielded 60.48%. This 3-gene signature can be useful in predicting early or late relapse of cancer in patients in clinical practice.
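The abstract reports two different metrics, confusion-matrix accuracy and the area under the ROC curve, and the Python sketch below shows how each can be computed from a classifier's scores. It is not the authors' pipeline: the labels, scores, 0.5 threshold and function names are made up for illustration, with label 1 standing for late relapse.

```python
def confusion_accuracy(y_true, y_pred):
    """Accuracy from the confusion matrix: (TP + TN) / all samples."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return (tp + tn) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = late relapse, 0 = early relapse.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

accuracy = confusion_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(round(accuracy, 3), round(auc, 3))  # 0.667 0.889
```

Accuracy depends on the chosen threshold while AUC summarizes ranking quality across all thresholds, which is why the two figures in the abstract differ for the same model.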
AUTHOR INDEX
’
’t Jong, Geert · 85
A
Abyzov, Alexej · 111
Adkins, Joshua · 96
Agrawal, Monica · 3
Alavi, Ali · 99
Allen, Mary A. · 28
Alterovitz, Wei-Lun · 47
Althagafi, Azza · 70
Altman, Russ B. · 27, 34, 37, 123, 127
Anastopoulos, Ioannis N. · 23
Andrade-Navarro, Miguel A. · 21
Andrianova, Katia · 74
Arslanturk, Suzan · 31
Atwal, Gurnit · 50
B
Bae, Ho · 63
Bae, Taejeong · 111
Baladandayuthapani, Veerabhadran · 65
Baltrus, David · 96
Barash, Yoseph · 92
Barbarino, Julia M. · 34, 123
Barik, Amita · 10, 108
Barlaskar, Sabiha · 99
Barnard, Martha · 39
Beam, Andrew L. · 19
Bebek, Gurkan · 75
Belyeu, Jon · 83
Benchek, Penelope · 60
Berger, Howard · 85
Bhattacharyya, Rupam · 65
Blinder, Pablo · 22
Bobak, Carly A. · 20, 76, 93
Bock, Christoph · 91
Bourque, Guillaume · 25
Branch, Andrea · 7
Brand, Lodewijk · 2
Brannon, Charlotte · 77
Bretaudeau, Anthony · 125
Brinton, Connor · 27
Brodie, Sonia · 85
Brooks, Thomas G. · 79, 92
Brown, James · 85
Brown, Joe · 83
Brown, Yaadira · 80
Brunak, Søren · 113
Brunner, James · 114
Buels, Robert · 125
Bui, Nam · 4
Bull, Shelley B. · 105
Burkhardt, Sophie · 21
Bush, William S. · 60, 64
Bustamante, Carlos D. · 42
C
Cai, Chunhui · 6
Cai, DanDan · 99
Cai, Tianxi · 19
Calvanese, Vincenzo · 95
Candido, Elisa · 44
Capellera-Garcia, Sandra · 95
Carleton, Bruce C. · 78, 85
Carrillo, Katherine I. · 81
Ceri, Stefano · 49
Chalke, Ajinkya · 127
Chaudhry, Shahnaz · 85
Chen, Irene Y. · 3
Chen, Jessica W. · 4
Chen, Jianhan · 12
Chen, Jun · 70
Chen, Yang · 45
Chen, Yong · 38, 122
Cheong, Hee Jin · 102
Cheong, Jae-Ho · 56
Cherng, Sarah T. · 53
Chia, Nicholas · 71
Chmura, Jacob · 50
Choi, Hyun-Soo · 63
Chrisman, Brianna · 68, 103
Christensen, Brock C. · 54
Christensen, Sarah · 15
Chu, Chong · 82
Cohen, William W. · 6
Coker, Beau · 52
Cooke Bailey, Jessica N. · 64
Cormier, Michael · 83
Cornwell III, Edward E. · 80
Costa, Helio A. · 4, 42
Crawford, Dana C. · 64
Crowell, Andrea · 5
Cui, Tianyi · 8
D
Dale, Ryan · 88
Danieletto, Matteo · 53
De, Subhajyoti · 130
Derry, Alexander · 27
Diesh, Colin · 125
Ding, Yali · 84
Dovat, Sinisa · 84
Dowell, Robin D. · 28
Draghici, Sorin · 31
Drögemöller, Britt I. · 78, 85
Duan, Rui · 38, 122
Duchen, Raquel · 44
Dudley, Joel T. · 7, 53
Dunker, A. Keith · 47
Dunlap, Kaiti · 103
Dunlap, Kaitlyn · 68
Dunn, Nathan · 125
Durmaz, Arda · 75
E
Ekstrand, Sophia · 95
El-Kebir, Mohammed · 15
Elo, Laura · 117
F
Faraggi, Eshel · 47
Farlik, Matthias · 91
Feng, Song · 96
Feng, Yunyi · 43
FitzGerald, Garret A. · 79, 92
Fondran, Jeremy R. · 60
Fortelny, Matthias · 91
Fried, Inbar · 19
Friedl, Verena · 23
G
Gao, Jean · 59
Garmire, Lana · 65
Gerstein, Mark · 77
Ghadermarzi, Sina · 10, 108
Ghayoori, Sholeh · 85
Ghosh, Sayan · 96
Gilchrist, Alison R. · 28
Gizaw, Martha · 129
Glodde, Josua · 21
Golgher, Lior · 22
Gong, Li · 34, 123
Gorjifard, Sayeh · 88
Grant, Gregory R. · 79, 92
Groeneweg, Gabriella S. S. · 85
Gumerov, Vadim M. · 86
Guo, Margaret · 27, 37
Guo, Yicheng · 106
Gur, Shir · 22
Gursoy, Gamze · 77
H
Ha, Min Jin · 65
Haan, David · 23
Haber, Nick · 68, 103
Hachen, David · 35, 121
Haines, Jonathan L. · 60
Halappanavar, Mahantesh · 96
Hall, Molly A. · 66
Hamilton-Nelson, Kara L. · 60
Hao, Jie · 24
Harati, Sahar · 5
Harrigan, Caitlin F. · 16, 87
Harris, Nomi · 125
Hauskrecht, Milos · 8
He, Xi · 66
Hernandez-Ferrer, Carles · 32
Higginson, Michelle · 85
Hill, Jane E. · 20, 76, 93
Hocking, Toby Dylan · 25
Hoehndorf, Robert · 70, 90
Hogenesch, John B. · 92
Hogue, Christopher W. V. · 11
Holmes, Ian · 125
Hong, Elliot · 115
Horng, Steven · 3
Huang, Fei · 47
Huang, Heng · 2
Huang, Kun · 58
Huddart, Rachel · 34, 123
Hughitt, V. Keith · 88
Husic, Arman · 103
Hwang, Tae Hyun · 56
I
Ito, Shinya · 78, 85
J
Jaakkimainen, Liisa · 44
Jagannathan, N. Suhas · 11
Jahanshad, Neda · 115
Jang, Hyun-Jong · 94
Jenkins, Willysha · 89
Jeong, Euna · 120
Jørgensen, Isabella Friis · 113
Jouline, Igor · 74
Jun, Tomi · 7
Jung, Dahuin · 63
Jung, Jae-Yoon · 103
K
Kafkas, Senay · 90
Kalantari, John · 71
Kalantarian, Haik · 68
Kalyva, Maria · 111
Kang, Mingon · 24
Kar, Nabhonil · 56
Karjalainen, Anzhelika · 91
Katuwawala, Akila · 10, 108
Keats, Jonathan J. · 88
Kelly, Libusha · 41
Kent, Jack · 103
Khan, Ariful · 96
Khan, Saad · 41
Kim, Jacob · 114
Kim, Woo Joo · 102
Kinzy, Tyler · 64
Kleber, Marcus E. · 66
Klein, Teri E. · 34, 81, 123
Kline, Aaron · 68, 103
Kloczkowski, Andrzej · 47
Kober, Kord M. · 114
Kocher, Jean-Pierre · 33
Kochunov, Peter · 115
Koestler, Devin C. · 55
Kohane, Isaac S. · 19
Kolberg, Liis · 126
Kompa, Benjamin · 19, 52
Kong, Sek Won · 32
Koohi-Moghadam, Mohamad · 72
Kosaraju, Sai Chandra · 24
Koster, Johannes · 83
Kramer, Stefan · 21
Kriwacki, Richard W. · 13
Krunic, Milica · 91
Kunder, Christian A. · 4
Kunkle, Brian W. · 60
Kurgan, Lukasz · 10, 108
Kuzmin, Ivan · 126
L
Lahens, Nicholas F. · 79, 92
Larson, Melissa C. · 33
Larson, Nicholas B. · 33
Lassnig, Caroline · 91
Lawrence, Cris W. · 79, 92
LeBlanc, Emilie · 103
Lee, E. Alice · 82
Lee, Han Sol · 102
Lee, Hao-Chih · 53
Lee, Joon-Yong · 96
Lee, Rena · 45
Lee, Soohyun · 82
Lee, Sung Hak · 94
Leiserson, Mark D. M. · 15, 17
Lever, Jake · 34, 123
Levy, Joshua J. · 54
Li, Hongyan · 72
Li, Li · 7
Li, Ruowang · 38, 122
Lichtarge, Olivier · 26
Lim, Sooyeon · 102
Lin, Deborah · 43
Lin, John · 64
Lin, Justin · 20, 93
Lin, Simon · 43
Liu, Chang · 43
Liu, Qingzhi · 65
Liu, Shikang · 35, 121
Liu, Xiaorong · 12
Liu, Zhandong · 100
Lizardo, Omar · 35, 121
Lowry, William E. · 100
Lu, Xinghua · 6
Luthria, Gaurav · 36
Lv, Tianling · 45
M
Ma, Feiyang · 95
Machiraju, Raghu · 58
Macho-Maschler, Sabine · 91
Maerz, Winfried · 66
Magee, Laura A. · 85
Mallick, Parag · 58
Manlove, Logan · 111
Manrai, Arjun K. · 101
Mariani, Jessica · 111
Mayberg, Helen · 5
McDermott, Jason · 96
McDonnell, Lauren · 20, 93
Meier, Richard · 55
Melamed, Rachel D. · 116
Melas-Kyriazi, Luke · 101
Meng, Jingwei · 47
Miao, Fudan · 85
Michalowski, Aleksandra M. · 88
Mikkola, Hanna K. A. · 95
Milenkovic, Tijana · 35, 121
Miotto, Riccardo · 53
Mitrea, Diana M. · 13
Mock, Beverly A. · 88
Monroy, Rebeca · 82
Moore, Jason H. · 38, 122
Moosavinasab, Soheil · 43
Morris, Quaid · 16, 44, 50, 87
Mueller, Mathias · 91
Mueller-Myhsok, Bertram · 66
N
Na, Jie · 33
Nakatochi, Masahiro · 97
Nayak, Soumyashant · 79, 92
Nelson, Heidi · 71
Nelson, William · 96
Nemati, Shamim · 5
Nemesure, Matthew · 93
Nemesure, Matthew D. · 20
Neums, Lisa · 55
Nguyen, Justine · 96
Nguyen, Tin · 31
Nichols, Kai · 2
Nie, Allen · 42
Ning, Michael · 103
Nishi, Hafumi · 106
Noh, Ji Yun · 102
O
O'Toole, John F. · 64
Oh, Ji Eun · 104
Ola, Mojoyinola Joanna · 91
Oldfield, Christopher J. · 10, 47, 108
Olufajo, Olubode A. · 80
Orloff, Mohammed · 75
Ostojic, Annie · 98
P
Palmer, Nathan · 19
Park, Choa · 120
Park, Gunseok · 104
Park, Peter J. · 82
Park, Sunho · 56
Park, Yoonsik · 50
Parmigiani, Giovanni · 57
Paskov, Kelley Marie · 68, 103
Passero, Kristin · 66
Patel, Chirag J. · 101
Patel, Ronak Y. · 42
Patil, Prasad · 57
Patnaik, Ritik · 68
Payne, Jonathon L. · 84
Pedersen, Brent · 83
Pellegrini, Matteo · 95
Penev, Yordan · 103
Pershad, Yash · 37
Perumalswami, Ponni · 7
Peterson, Hedi · 126
Petkovic, Dragutin · 99, 127
Pham, Minh · 26
Pietras, Christopher Michael · 67
Pineda, Arturo L. · 42
Pinoli, Pietro · 49
Piro, Rosario · 49
Poellabauer, Christian · 35, 121
Poelzl, Andrea · 91
Pohodich, Amy E. · 100
Polley, Eric C. · 88
Power, Liam · 67
Proukakis, Christos · 111
Pruneda, Jonathan · 96
Przytycka, Teresa M. · 17
Q
Quinlan, Aaron R. · 83
R
Raman, Ayush T. · 100
Ramchandran, Maya · 57
Ramsey, Stephen A. · 61
Rasche, Helena · 125
Rassekh, Shahrad · 78
Rassekh, Shahrad R. · 85
Raudvere, Uku · 126
Reimand, Jüri · 110
Richardson, Christian · 89, 118
Romero, Pedro · 47
Ross, Colin J. D. · 78, 85
Rowsey, Ross · 33
Rubanova, Yulia · 16, 87
Ryder, Nathan · 39
S
Sagers, Luke · 101
Sait, Leena · 129
Salas, Lucas A. · 54
Sanatani, Shubhayan · 85
Sangkuhl, Katrin · 34, 123
Sarantopoulou, Dimitra · 79, 92
Sawyer, Sara L. · 28
Scheuemie, Martijn J. · 38, 122
Schmaltz, Allen · 19
Schug, Jonathan · 92
Schwartz, Jessey · 68
Sedlazeck, Fritz · 111
Sedor, John R. · 64
Sekar, Shobana · 111
Selega, Alina · 16, 87
Sharan, Roded · 17
Sharma, Anchal · 130
Sharpnack, Michael · 58
Shaw, Kaitlyn · 85
Shen, Li · 2
Shi, Xu · 19
Shoebridge, Stephen · 91
Siekiera, Julia · 21
Simmons, John K. · 88
Skander, Dannielle · 75
Slonim, Donna K. · 67
Somjee, Ramiz · 13
Song, Dae Hyun · 24
Song, Joon Young · 102
Sontag, David · 3
Sørensen, Søren Schwartz · 113
Sosa, Carlos P. · 33
Sosa, Daniel N. · 27
Southerland, William · 80
Sriharan, Aravindhan · 54
Srinivasan, Anand · 92
Srivastava, Arunima · 58
Stabell, Alex C. · 28
Stamoulakatou, Eirini · 49
Stanley, Jacob T. · 28
Staub, Michelle · 85
Stehr, Henning · 4
Stepanov, Nicholas · 127
Stockham, Nathaniel · 68, 103
Striegel, Aaron · 35, 121
Strobl, Birgit · 91
Stuart, Joshua M. · 23
Sun, Hongzhe · 72
Sun, Min Woo · 103
Suomi, Tomi · 117
T
Tao, Ruikang · 23
Tao, Yifeng · 6
Tariq, Qandeel · 68
Tate, Tia · 118
Thompson, Jeffrey A. · 55
Thompson, Paul · 115
Tintle, Nathan · 39
Tomasini, Livia · 111
Tong, Jiayi · 38, 122
Tran, Ivy · 130
Tran, Nhat · 59
Trueman, Jessica · 85
Tsaku, Nelson Zange · 24
Tucker-Kellogg, Lisa · 11
U
Urban, Alexander E. · 111
Uversky, Vladimir N. · 47
V
Vaccarino, Flora M. · 111
Vaickus, Louis J. · 54
Vaisman, Iosif · 129
Vandromme, Maxence · 7
Varma, Maya · 68, 103
Vilo, Jaak · 126
Voss, Catalin · 68, 103
W
Wagner, Sarah · 77
Wall, Dennis P. · 68, 103
Wan, Ying-Wooi · 100
Wand, Hannah · 42
Wang, Chen · 33
Wang, Gao · 29
Wang, Haibo · 72
Wang, Hua · 2
Wang, Junwen · 72
Wang, Qingbo · 36
Wang, Yuchuan · 72
Wang, Yue · 29
Wang, Yunlong · 29
Warfe, Mike · 60
Washington, Peter · 68, 103
Weber, Griffin · 19
Wei, Eric · 27
Weinstein, Alana S. · 23
West, Nicholas · 85
Westra, Jason · 39
Whaley, Ryan · 34, 123
Wheeler, Nicholas R. · 60
Whirl-Carrillo, Michelle · 34, 123
Whyte, Simon D. · 85
Williams-DeVane, ClarLynda · 89, 118
Wilson, Jennifer · 127
Wilton, Andrew S. · 44
Wodchis, Walter · 44
Wojtowicz, Damian · 17
Wolf, Jack · 39
Wolf, Lior · 22
Wong, Christopher K. · 23
Wong, Mike · 127
Woon, Mark · 34, 123
Wright, Galen E. B. · 78, 85
Wright, Matt W. · 42
Wu, Tong · 29
Wulf, Bryan · 42
X
Xia, Xueting · 39
Xing, Lei · 45
Xue, Bin · 47
Y
Yalamanchili, Hari Krishna · 100
Yang, Hee Chul · 104
Yang, Jizhou · 99
Yang, Xinming · 72
Yao, Yao · 61
Yavartanu, Fatemeh · 105
Yoo, Yun Joo · 105
Yoon, Sukjoon · 120
Yoon, Sungroh · 63
Young, Adamo · 50
Yu, Ke · 8
Yue, Feng · 84
Z
Zehnder, James L. · 4
Zeng, Xianlong · 43
Zhang, Bo · 84
Zhang, Haoran · 44
Zhang, Lin · 106
Zhang, Mingda · 8
Zhao, Wei · 45
Zhou, Bo · 111
Zhou, Jiayan · 66
Zhulin, Igor B. · 86
Zoghbi, Huda Y. · 100
Zou, James · 42