Standing on the shoulders of giants, German Demidov, Bioinformatics Summer School 2017

Post on 16-Mar-2021


Standing on the shoulders of giants

German Demidov, Bioinformatics Summer School 2017

Biology and Big Data

> Discovering truth by building on previous discoveries

Why is it useful?

Just one example:

Using data from consortia

> Which types of data can you obtain from consortia? How do you access and download the data?

> How do you work as part of a consortium? Which problems may you face?

Important Remark

> Workshops on “How to use consortium_name” usually take ~3 days (e.g. https://www.encodeproject.org/tutorials/encode-meeting-2016/); we will try to give an overview in 1 hour

> However, if you want to find more information, google “consortium_name workshop”

> There are separate papers (e.g. Ewan Birney, 2012, Nature, about ENCODE)

GWAS Consortia

> http://www.wikigenes.org/e/art/e/185.html

> 500,000 genotyped people in the UK

EWAS Consortia

Genomics Consortia

> The Exome Aggregation Consortium

> 1000 Genomes

> Human Reference Genome

> International Cancer Genome Consortium

> The Cancer Genome Atlas

> PanCancer Analysis of Whole Genomes

> GTEx

Epigenomics Consortia

> ENCODE

> Roadmap Epigenomics

> BluePrint

> International Human Epigenome Consortium

ExAC Overview

> http://exac.broadinstitute.org/about

> First thing to do: find and read the flagship paper!

> The data set provided on this website spans 60,706 unrelated individuals sequenced as part of various disease-specific and population genetic studies.

ExAC: Why it is useful

It is used to

> calculate objective metrics of pathogenicity for sequence variants,

> identify genes subject to strong selection against various classes of mutation: 3,230 genes show a near-complete reduction in the number of predicted protein-truncating variants, and 72% of these genes have no currently established human disease phenotype,

> filter candidate disease-causing variants efficiently
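The filtering use case in the last bullet can be sketched in a few lines of Python. This is only an illustration, not the ExAC pipeline: the `Variant` record, the `exac_af` field and the 0.1% cut-off are assumptions made for the example. A real analysis would read allele frequencies from the downloaded ExAC files and choose a threshold that fits the disease model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Variant:
    """Hypothetical candidate-variant record (not an ExAC data structure)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    exac_af: Optional[float]  # allele frequency in the reference panel; None if never observed


def filter_rare(variants: List[Variant], max_af: float = 0.001) -> List[Variant]:
    """Keep variants that are absent from the panel or no more frequent than max_af."""
    return [v for v in variants if v.exac_af is None or v.exac_af <= max_af]


# Made-up candidates from a single patient exome.
candidates = [
    Variant("1", 1000, "A", "G", 0.15),     # common: unlikely to cause a rare disease
    Variant("2", 2000, "C", "T", 0.00002),  # very rare: kept
    Variant("3", 3000, "G", "A", None),     # never observed in the panel: kept
]
rare = filter_rare(candidates)
```

Treating "not observed" as rare is a design choice; it assumes the site was well covered in the reference panel, which should be checked separately.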

ExAC: Results

•  ANNOVAR and ATAV were updated using ExAC data

•  CADD scores were re-calculated

•  Commercial tools such as Golden Helix and GeneTalk also incorporated ExAC data

ExAC: Download

> Download

ExAC: Methods

> Flagship paper, Methods section: a short description, with detailed pipelines in the Supplementary Information

> 91,796 individual exomes drawn from a wide range of primarily disease-focused consortia

ExAC Quality Assessment

> Comparison within trios: singleton transmission rate of 50.1% (~50%)

> >10,000 samples were checked with SNP arrays: 97-99% heterozygous concordance

> Platinum standard genome sequenced with 5 different technologies: 99.8% sensitivity, 0.056% FDR

> Comparison with 13 WGS samples (~30x, PCR-free)

> Indel FDR is higher (4.7%); singleton variants show a higher FDR

> FDR differs between annotation classes (missense, synonymous, protein-truncating)
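The two metrics quoted above follow directly from the counts of true positives, false positives and false negatives against a truth set. A minimal sketch, with hypothetical counts that are not taken from the ExAC paper:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that were called: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def false_discovery_rate(true_pos: int, false_pos: int) -> float:
    """Fraction of calls that are wrong: FP / (TP + FP)."""
    return false_pos / (true_pos + false_pos)


# Hypothetical comparison of a call set against a truth set.
tp, fp, fn = 9980, 6, 20
sens = sensitivity(tp, fn)          # 9980 / 10000
fdr = false_discovery_rate(tp, fp)  # 6 / 9986
```

Computing these per variant class (SNV vs. indel, singleton vs. common) is what reveals the differences listed in the last two bullets.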

ExAC Sample Filtering

> Only 60,706 samples out of 91,796 passed QC

> A set of common SNPs (5,400) was selected, and samples with outlier heterozygosity were removed prior to PCA

> Per-sample metrics: number of variants, transition/transversion (TiTv) ratio, alternate allele heterozygous/homozygous (Het/Hom) ratio and insertion/deletion (indel) ratio

> Close relatives were removed

> Final coverage: 80% of targeted bases >20x

> 77% were enriched with the Agilent kit (33 Mb target)
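Two of the per-sample metrics above are simple counts over a sample's calls. The sketch below assumes SNVs are given as (ref, alt) pairs and genotypes as VCF-style strings such as `0/1`; the function names are made up for this example and the outlier thresholds used by ExAC are not reproduced here.

```python
from typing import Iterable, Tuple

# A<->G and C<->T are transitions; every other substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Transition/transversion ratio for (ref, alt) single-nucleotide variants."""
    ti = tv = 0
    for ref, alt in snvs:
        if (ref.upper(), alt.upper()) in TRANSITIONS:
            ti += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ti / tv


def het_hom_ratio(genotypes: Iterable[str]) -> float:
    """Heterozygous / homozygous-alternate ratio from diploid genotype strings."""
    het = hom_alt = 0
    for gt in genotypes:
        a, b = gt.replace("|", "/").split("/")
        if "." in (a, b):
            continue            # missing call
        if a != b:
            het += 1
        elif a != "0":
            hom_alt += 1        # homozygous reference is ignored
    if hom_alt == 0:
        raise ValueError("no homozygous-alternate calls; ratio undefined")
    return het / hom_alt
```

For example, `titv_ratio([("A", "G"), ("C", "T"), ("A", "C")])` counts two transitions and one transversion. Samples whose values fall far from the cohort distribution are candidates for exclusion.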

1000GP

> http://www.internationalgenome.org

1000GP: Overview, goals

> http://www.internationalgenome.org/data-portal/sample

> Pretty convenient data portal that allows nice filtering!

> The goal of the 1000 Genomes Project was to find most genetic variants with frequencies of at least 1% in the populations studied.

> The project planned to sequence each sample to 4x genome coverage; at this depth, sequencing cannot discover all variants in each sample, but can allow the detection of most variants with frequencies as low as 1%.
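Why 4x is too little for one sample but enough for a cohort can be seen with a back-of-the-envelope Poisson model. The sketch below rests on simplifying assumptions that are not from the slides: read depth is Poisson-distributed, a heterozygous carrier contributes half of its reads to the variant allele, reads are error-free, and a variant counts as "seen" when at least two supporting reads exist. The figure of 20 carriers is what a 1% allele would give among roughly 1,000 diploid samples, a cohort size chosen for illustration.

```python
import math


def prob_at_least(k_min: int, mean: float) -> float:
    """P(X >= k_min) for X ~ Poisson(mean)."""
    below = sum(math.exp(-mean) * mean ** k / math.factorial(k) for k in range(k_min))
    return 1.0 - below


def expected_alt_reads(depth: float, het_carriers: int) -> float:
    """Expected variant-supporting reads when each carrier gives depth/2 of them."""
    return depth / 2.0 * het_carriers


# One heterozygous carrier at 4x: on average 2 reads support the variant.
single = prob_at_least(2, expected_alt_reads(4, 1))

# Pooled evidence from 20 carriers at 4x each: on average 40 supporting reads.
pooled = prob_at_least(2, expected_alt_reads(4, 20))
```

Under this model a single carrier shows two or more supporting reads only about 59% of the time, while the pooled evidence is essentially certain, which is the intuition behind low-coverage, many-sample designs.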

1000GP: Main Publications

> Pilot: A map of human genome variation from population-scale sequencing. Nature 467, 1061–1073 (28 October 2010)

> Phase 1: An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56–65 (01 November 2012)

> Phase 3: A global reference for human genetic variation. Nature 526, 68–74 (01 October 2015)

> An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81 (01 October 2015)

1000GP: Pipeline

1000GP: Power of Detection, Heterozygous Discordance, Sequencing Depth

1000GP: Results

1000GP: Variant Calling

1000GP: CNVs

1000GP: CNVs concordance

PanCancer Analysis Of WG

> https://dcc.icgc.org/pcawg

PanCancer Analysis Of WG

1. Novel somatic mutation calling methods

2. Analysis of mutations in regulatory regions

3. Integration of the transcriptome and genome

4. Integration of the epigenome and genome

5. Consequences of somatic mutations on pathway and network activity

6. Patterns of structural variations, signatures, genomic correlations, retrotransposons and mobile elements

7. Mutation signatures and processes

8. Germline cancer genome

9. Inferring driver mutations and identifying cancer genes and pathways

10. Translating cancer genomes to the clinic

11. Evolution and heterogeneity

12. Portals, visualization and software infrastructure

13. Molecular subtypes and classification

14. Analysis of mutations in non-coding RNA

15. Mitochondrial

16. Pathogens

PCAWG, WG8: Validation

> High-coverage validation

> 3 main callers: HaplotypeCaller (Broad Institute), Annai-RTG (private company), Freebayes (EMBL-DKFZ)

> 50 samples, 5,000 sites per sample sequenced at ~1000x depth

> ~2,300 SNVs, ~2,700 indels

> SNP recall/PPV/concordance ~0.995

> Indels: 0.94 recall, 0.91 PPV, concordance 0.88

PCAWG WG8, CNVs

> CNVs

PCAWG WG8: Results

> Sensitivity: deletions only ~60%, duplications ~40%!

Further Information

> The flagship paper is not informative :/

> 16 papers have been released on bioRxiv

GTEx

> The Genotype-Tissue Expression project aims to provide to the scientific community a resource with which to study human gene expression and regulation and its relationship to genetic variation

> Variations in gene expression that are highly correlated with genetic variation can be identified as expression quantitative trait loci, or eQTLs
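The idea of an eQTL can be made concrete with a toy association test: regress a gene's expression on the genotype dosage (0, 1 or 2 copies of the alternate allele) at one variant. This is a simplified sketch with made-up numbers, not the GTEx pipeline; real eQTL mapping also normalizes expression, adjusts for covariates and corrects for multiple testing.

```python
import math
from typing import Sequence, Tuple


def eqtl_association(dosages: Sequence[float],
                     expression: Sequence[float]) -> Tuple[float, float]:
    """Return (slope, Pearson r) of expression regressed on genotype dosage."""
    if len(dosages) != len(expression) or len(dosages) < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in genotype or expression")
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


# Made-up data: six individuals, expression rises with each alternate allele.
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.1, 1.9, 3.0, 3.2]
slope, r = eqtl_association(dosages, expression)
```

A slope clearly different from zero with a high correlation, as in this toy data, is what marks the variant as a candidate eQTL for that gene in that tissue.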

GTEx

> Many genetic changes associated with common human diseases, such as heart disease, cancer, diabetes, asthma, and stroke, lie outside of the protein-coding regions of genes

> The comprehensive identification of human eQTLs will greatly help to identify genes whose expression is affected by genetic variation

GTEx Data Overview

GTEx Scheme

GTEx: Causes of Death

ENCODE: Overview

> https://www.encodeproject.org

> Encyclopedia of DNA Elements

> The goal of ENCODE is to build a comprehensive parts list of functional elements in the human (mouse/fly/worm) genome

ENCODE Timeline

ENCODE as of 2012

ENCODE: Types of Data

> https://www.encodeproject.org

ENCODE: Data Matrix

ENCODE: Audit Category

Each sample can have multiple QC issues and still be available for download!
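Because flagged data sets stay downloadable, it is worth filtering on audit flags before analysis. The sketch below works on plain dictionaries and assumes that each record carries an `audit` mapping from a severity level to a list of issues, with level names such as `ERROR`, `NOT_COMPLIANT` and `WARNING`. That layout, the accessions and the issue categories are assumptions for illustration and should be checked against the metadata actually returned by the portal.

```python
from typing import Dict, Iterable, List

# Assumed severity levels that should exclude a data set from analysis.
BLOCKING_LEVELS = ("ERROR", "NOT_COMPLIANT")


def passes_audit(record: Dict, blocking_levels: Iterable[str] = BLOCKING_LEVELS) -> bool:
    """True if the record has no issues at any of the blocking levels."""
    audit = record.get("audit") or {}
    return not any(audit.get(level) for level in blocking_levels)


def select_clean(records: List[Dict],
                 blocking_levels: Iterable[str] = BLOCKING_LEVELS) -> List[str]:
    """Accessions of the records that pass the audit filter."""
    levels = tuple(blocking_levels)
    return [r["accession"] for r in records if passes_audit(r, levels)]


# Made-up metadata records (accessions and categories are placeholders).
records = [
    {"accession": "EXP_A", "audit": {}},
    {"accession": "EXP_B", "audit": {"WARNING": [{"category": "placeholder warning"}]}},
    {"accession": "EXP_C", "audit": {"ERROR": [{"category": "placeholder error"}]}},
]
usable = select_clean(records)
```

Passing a stricter tuple, for example one that also contains `"WARNING"`, narrows the selection further.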

ENCODE: Result of Analysis

ENCODE: Ground Level

ENCODE: Mid-level

ENCODE: Top-Level

ENCODE publications

> Of course, one of the products is publications!

[Figure: Cumulative ENCODE Publications Over Time. Y axis: Number of Publications (0 to 600). Series: Papers from Non-ENCODE Authors; Papers from ENCODE 2 Production Groups]

ENCODE standards

> Data Standards

BluePrint

> “BLUEPRINT is a large-scale research project receiving close to 30 million euro funding from the EU.”

> 42 leading European scientific centers

> The aim is to further the understanding of how genes are activated or repressed in both healthy and diseased human cells

> Focus on distinct types of haematopoietic cells from healthy individuals and on their malignant leukaemic counterparts

BluePrint

> http://www.blueprint-epigenome.eu

> Publications (Cell Papers) & Data Portal

BluePrint

> http://dcc.blueprint-epigenome.eu/#/home

BluePrint

BluePrint

Roadmap Epigenomics

> The NIH Roadmap Epigenomics research program aims to transform our understanding of how epigenetics contributes to disease

> The Consortium leverages experimental pipelines built around next-generation sequencing technologies to map DNA methylation, histone modifications, chromatin accessibility and small RNA transcripts in stem cells and primary ex vivo tissues selected to represent the normal counterparts of tissues and organ systems frequently involved in human disease

Roadmap Epigenomics

Roadmap Epigenomics

Roadmap Epigenomics

It looks like we can get the protocols by clicking on the link; however, there are not a lot of them there. The protocols are super outdated! (e.g. REMC STANDARDS AND GUIDELINES FOR CHIP-SEQ, DEC. 2, 2011, V1.0)

Roadmap Epigenomics

> If you want to work with these data, read the paper “Integrative analysis of 111 reference human epigenomes” (+16 ENCODE 2012; do not print the paper!)

> Go through the “Publications” list

Roadmap Epigenomics

The most useful section is Methods:

> RNA-seq uniform processing and quantification for consolidated epigenomes

> ChIP-seq and DNase-seq uniform reprocessing for consolidated epigenomes

> Methylation data cross-assay standardization and uniform processing for consolidated epigenomes

> Chromatin state learning

> Etc.

Roadmap Epigenomics

> Publications

Roadmap Epigenomics

> Histone mark combinations show distinct levels of DNA methylation and accessibility, and predict differences in RNA expression levels that are not reflected in either accessibility or methylation.

> Megabase-scale regions with distinct epigenomic signatures show strong differences in activity, gene density and nuclear lamina associations, suggesting distinct chromosomal domains.

> Approximately 5% of each reference epigenome shows enhancer and promoter signatures, which are twofold enriched for evolutionarily conserved non-exonic elements on average.

> Epigenomic data sets can be imputed at high resolution from existing data, completing missing marks in additional cell types, and providing a more robust signal even for observed data sets.

> Dynamics of epigenomic marks in their relevant chromatin states allow a data-driven approach to learn biologically meaningful relationships between cell types, tissues and lineages.

Working in Consortia

Working with Data

•  Getting raw data

•  Working with data from different consortia simultaneously: different QCs, different data analysis pipelines

•  Tool versions are missing, or the tools are outdated/unsupported: failure of replication!
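One inexpensive guard against the replication problem in the last bullet is to store a small version manifest next to every result. A minimal sketch, assuming the tool versions are collected by hand or by the pipeline itself; the tool names and version strings below are placeholders, not recommendations.

```python
import json
import platform
from typing import Dict


def build_manifest(tool_versions: Dict[str, str]) -> str:
    """Serialize interpreter, OS and tool versions as a JSON string."""
    manifest = {
        "python": platform.python_version(),
        "system": platform.system(),
        "tools": dict(sorted(tool_versions.items())),
    }
    return json.dumps(manifest, indent=2, sort_keys=True)


# Placeholder names and versions; fill in what the analysis really used.
manifest_json = build_manifest({"aligner": "x.y.z", "variant_caller": "a.b.c"})
```

Writing this string to a file alongside the output lets a collaborator see exactly which versions produced a given call set.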

Working in Consortia I

•  When your server goes down or all your data are accidentally removed

•  Deadlines: add 3-6 months to the expected date!

•  Communication: teleconferences

•  Password renewal, access permissions

•  Efficient data sharing: speed, reliability, confidentiality

Working in Consortia II

•  Different naming of the same samples in different working groups/labs

•  Wrong/missing identifiers (e.g. wrong cancer type or population); in one case, normal and somatic were actually swapped

•  The same, but from clinicians

•  Different labs, different library preparation (e.g. coverage depths after PCR-free and PCR-based WGS)

•  Several tools can be used for the analysis: either establish the best tool or generate a joint call set

•  Multiple blacklists or outlier lists (every lab/group has its own, and they do not completely overlap)
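Three of the problems above (inconsistent sample names, non-overlapping blacklists, several callers) have simple programmatic first steps. The sketch below is illustrative only: the naming convention in `normalize_sample_id` and the two-caller support threshold are assumptions, and every consortium has to agree on its own rules.

```python
from collections import Counter
from typing import Iterable, Set, Tuple

VariantKey = Tuple[str, int, str, str]  # (chrom, pos, ref, alt)


def normalize_sample_id(raw: str) -> str:
    """Hypothetical convention: upper case, with '_', '.' and ' ' all mapped to '-'."""
    cleaned = raw.strip().upper()
    for sep in ("_", ".", " "):
        cleaned = cleaned.replace(sep, "-")
    return cleaned


def merge_blacklists(*blacklists: Iterable[str]) -> Set[str]:
    """Union of all labs' blacklists after harmonizing the sample names."""
    merged: Set[str] = set()
    for blacklist in blacklists:
        merged.update(normalize_sample_id(s) for s in blacklist)
    return merged


def consensus_calls(callsets: Iterable[Iterable[VariantKey]],
                    min_support: int = 2) -> Set[VariantKey]:
    """Joint call set: variants reported by at least min_support callers."""
    counts = Counter(v for calls in callsets for v in set(calls))
    return {v for v, n in counts.items() if n >= min_support}
```

For example, `merge_blacklists(["s_1", "S-2"], ["S.1"])` yields two samples rather than three, because `s_1` and `S.1` normalize to the same identifier. Taking the union is the conservative choice; an intersection would keep more samples at the cost of trusting each lab's omissions.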

Working in Consortia III

•  Unbalanced population structure

•  Mix of different effects (e.g. cancer vs. population)

•  Is your germline really germline?

Slide from AgENCODE, Ewan Birney

Thank you for your attention!