church_ncbivariation2013

Deanna M. Church Staff Scientist, NCBI @deannachurch Variation Resources at NCBI

Upload: deanna-church

Post on 24-Jun-2015


DESCRIPTION

NCBI Variation resources for CSHL Genome Access Course.

TRANSCRIPT

Page 1: Church_NCBIvariation2013

Deanna M. Church Staff Scientist, NCBI

@deannachurch

Variation Resources at NCBI 

Page 2: Church_NCBIvariation2013

Variation Resources Team at NCBI

Ming Ward, Lon Phan, Brad Holmes, Anna Glodek, Michael Kholodov, Rama Maiti, Juliana Sampson, David Shao, Eugene Shekhtman, Qiang Wang, Hua Zhang

Donna Maglott, Melissa Landrum, Jennifer Lee, George Riley, Ray Tully, Craig Wallin, Shanmuga Chitipiralla, Douglas Hoffman, Wonhee Jang, Ken Katz, Michael Ovetsky, Ricardo Villamarin

Tim Hefferon, John Lopez, John Garner, Chao Chen

Key Collaborators

Heidi Rehm, Harvard Partners
Christa Lese Martin, Geisinger
Sherri Bale, GeneDx
Lisa Kalman, CDC
Birgit Funke, Harvard Partners
Madhuri Hegde, Emory

Page 3: Church_NCBIvariation2013

Figure credit: http://itknowledgeexchange.techtarget.com/

Page 4: Church_NCBIvariation2013

dbSNP
dbVar

ClinVar
GTR

Quality Control
Ref variants
References

Annotations

Visualization Tools

Data from external sources

Page 5: Church_NCBIvariation2013

Variant Definitions: Location, Evidence, Methodology (dbSNP, dbVar)

Variant Annotations: Phenotypes, Consequences, Tests, Other Biology (ClinVar, GTR, dbSNP)

Page 6: Church_NCBIvariation2013
Page 7: Church_NCBIvariation2013

GenBank vs RefSeq

GenBank                     RefSeq
Submitter Owned             RefSeq Owned
Redundancy                  Non-Redundant
Updated rarely              Curated
INSDC                       Not INSDC

BRCA1
83 genomic records          3 genomic records
31 mRNA records             5 mRNA records
27 protein records          1 RNA record
                            5 protein records

Page 8: Church_NCBIvariation2013

Genome Res. 1999. 9: 677-679
http://www.ncbi.nlm.nih.gov/snp

Page 9: Church_NCBIvariation2013
Page 10: Church_NCBIvariation2013

>gnl|dbSNP|ss76078129|allelePos=17|len=33|alleles='A/G'
GTGGCAGAGA CTGAAT R AAGGGTTGAC CCAGGG

SNPs defined by flanking sequence

>gnl|dbSNP|ss3354770|allelePos=499|len=661|alleles='T/C'
actattcaca atagcaaaga cttggaacca acccaaatgt ccaacaatga tagactggat taagaaaatg tggcacatat acaccatgga atactaggca TTCCATTCTA CTGTGCACGA GTCACTGCAA ACTCAAGCAT TTCCAGAGTT CTGAAAGCTC AACTAAGAAC CAAGCCTACT CATTCAACAT CAACACACAC AGCACCCTGA GCGTCCAAAA CCACGGGGGT TATGTTCTAG ACCACAGGAC TGGCTACCTG GCCCTGCTCA AGGCGGCAGG ATCAATGGGC AAGAATGTGC AAGAATTTAC CACAACTCAG CCTTGCTGTG TCAACCACAG AGGCCAAGTA CCCCTAACAC CCAGATAGAG TAATTGTGCC TTACTTCTTT GTTCATTCCC ACCATTACAT TTTGTAAATT GGAACTTCTA GGAGGTTAGA AGGATATGCT GATCAAAAAA AGGGGACATA TTCAAGGAGT GTCCCTGGGT CAACCCTT Y ATTCAGTCTC TGCCACATGT CTAGTAACTG TGAGTGATGG GTGCATCAGT ATAATCCTGA GCCTCCCAAG GTACAGCCTT TCACTACTAT TCATCATATT GGCTAAGGTA TTCATCATAT TGGCTAAGGT ATTCACCAAC AGGGCTCATT TTCTATCAGA CC

Page 11: Church_NCBIvariation2013

ss76078129 (aligns to plus strand), alleles 'A/G'

ss3354770 (aligns to minus strand), alleles 'T/C'

ss76078129 (33 bp)

ss3354770 (661 bp)
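The two submissions describe the same site on opposite strands, which can be checked by reverse-complementing one flanking sequence and comparing it with the other. The following is a minimal sketch, not dbSNP's clustering code: the function names are made up for illustration, the complement table covers only the IUPAC codes that appear in these two records, and the ss3354770 string is just the 16 bases on either side of the `Y` copied from the slide.

```python
# Complement table limited to the codes used in the two slide records.
# R = A/G and Y = C/T are complementary IUPAC ambiguity codes.
COMPLEMENT = {
    "A": "T", "C": "G", "G": "C", "T": "A",
    "R": "Y", "Y": "R",
    "N": "N",
}

def normalize(seq: str) -> str:
    """Drop the spacing used on the slide and upper-case the bases."""
    return seq.replace(" ", "").upper()

def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence that may contain R, Y, or N."""
    return "".join(COMPLEMENT[base] for base in reversed(normalize(seq)))

def same_site(flank_a: str, flank_b: str) -> bool:
    """True if the two flanking sequences match on either strand."""
    a, b = normalize(flank_a), normalize(flank_b)
    return a == b or reverse_complement(a) == b

# Full ss76078129 record (33 bp) as shown on the slide.
SS76078129 = "GTGGCAGAGA CTGAAT R AAGGGTTGAC CCAGGG"

# 33 bp window around the variant in ss3354770, copied from the slide.
SS3354770_WINDOW = "CCCTGGGT CAACCCTT Y ATTCAGTCTC TGCCAC"

if __name__ == "__main__":
    print(reverse_complement(SS76078129))
    print(same_site(SS76078129, SS3354770_WINDOW))
```

The same strand flip explains the allele strings: 'A/G' on one strand reads as 'T/C' on the other.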

Page 12: Church_NCBIvariation2013
Page 13: Church_NCBIvariation2013

rs397515413

Page 14: Church_NCBIvariation2013

rs397515413

NC_000016.9 (chr16)

NW_003871055.3 (chr1 fix patch)

Hydin

Hydin2

Page 15: Church_NCBIvariation2013

Defines variant by location rather than flanking sequence

VCF (Variant Call Format)
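To make the contrast with flanking-sequence definitions concrete, here is a minimal sketch of reading the location fields from one VCF data line. The record is hypothetical (invented chromosome, position, and alleles), and `VcfRecord` and `parse_vcf_line` are illustrative names, not part of any NCBI tool. Only the first five fixed VCF columns (CHROM, POS, ID, REF, ALT) are parsed.

```python
from typing import List, NamedTuple

class VcfRecord(NamedTuple):
    chrom: str       # sequence the variant is placed on
    pos: int         # 1-based position of the first REF base
    id: str          # identifier such as an rs number, or "." if none
    ref: str         # reference allele
    alts: List[str]  # one or more alternate alleles

def parse_vcf_line(line: str) -> VcfRecord:
    """Parse the first five tab-separated columns of a VCF data line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError("a VCF data line needs at least CHROM, POS, ID, REF, ALT")
    chrom, pos, vid, ref, alt = fields[:5]
    return VcfRecord(chrom, int(pos), vid, ref, alt.split(","))

# Hypothetical record for illustration only; not taken from dbSNP.
EXAMPLE_LINE = "16\t1000\t.\tA\tG\t.\tPASS\t."

if __name__ == "__main__":
    print(parse_vcf_line(EXAMPLE_LINE))
```

Because the variant is identified by sequence and position, the record is only meaningful together with the assembly it was called against.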

Page 16: Church_NCBIvariation2013
Page 17: Church_NCBIvariation2013

Clustering microsatellites

Page 18: Church_NCBIvariation2013

rs62645748

To be replaced by a Variation Viewer

To be replaced by a link to ClinVar

Page 19: Church_NCBIvariation2013

rs62645748 (NCBI Homo sapiens annotation run 104)

Page 20: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/dbvar

Page 21: Church_NCBIvariation2013

Submitter Information: Contact and author information

Study Information: Study meta-data (description, PMID, ProjectID, etc.)

Sample/Sampleset data: Sample IDs (if samples are consented); Sampleset ID for pooled samples (case v control sets)

Experiment data: Assay method (sequencing, array); Platform and analysis information

Variants: Variant definitions

Page 22: Church_NCBIvariation2013
Page 23: Church_NCBIvariation2013

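Variant Call Ambiguity

[Figure: array probes along a region, with start and stop marked. Probes with decreased signal intensity mark the inner start and inner stop; probes with expected signal intensity mark the outer start and outer stop. Each breakpoint lies between an outer and an inner coordinate.]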
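One way to carry this ambiguity in a record is to store all four coordinates instead of a single start and stop. The sketch below is an assumption-laden illustration, not dbVar's schema: the class name, the 1-based inclusive coordinates, and the example values are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AmbiguousRegion:
    """A variant call whose breakpoints are only known to lie in a range.

    Assumes 1-based inclusive coordinates on a single sequence.
    """
    outer_start: int  # leftmost position the variant could begin
    inner_start: int  # first position known to be inside the variant
    inner_stop: int   # last position known to be inside the variant
    outer_stop: int   # rightmost position the variant could end

    def __post_init__(self) -> None:
        ordered = (self.outer_start <= self.inner_start
                   <= self.inner_stop <= self.outer_stop)
        if not ordered:
            raise ValueError("expected outer_start <= inner_start <= inner_stop <= outer_stop")

    def min_length(self) -> int:
        """Length if the breakpoints sit at the inner coordinates."""
        return self.inner_stop - self.inner_start + 1

    def max_length(self) -> int:
        """Length if the breakpoints sit at the outer coordinates."""
        return self.outer_stop - self.outer_start + 1

if __name__ == "__main__":
    # Hypothetical array-based deletion call.
    call = AmbiguousRegion(outer_start=100, inner_start=150,
                           inner_stop=400, outer_stop=480)
    print(call.min_length(), call.max_length())
```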

Page 24: Church_NCBIvariation2013

Variant Call Ambiguity

[Figure: fosmid clone (40 Kb +/- 1 Kb) with its ends placed on the genome; outer start and outer stop marked.]

20 Kb: Clone has an insertion relative to the genome

60 Kb: Clone has a deletion relative to the genome
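The fosmid logic on this slide can be sketched as a comparison between the distance separating a clone's mapped ends on the genome and the expected insert size. This is a simplified illustration under stated assumptions: the function name is invented, the 20 Kb and 60 Kb figures are read as the genomic distance between the clone ends, and real pipelines use library-specific size distributions rather than a fixed tolerance.

```python
def classify_fosmid(mapped_span_kb: float,
                    insert_kb: float = 40.0,
                    tolerance_kb: float = 1.0) -> str:
    """Classify a clone by the genomic distance between its mapped ends.

    A span shorter than the insert means the clone holds extra sequence
    (insertion relative to the genome); a longer span means the clone is
    missing sequence that the genome has (deletion relative to the genome).
    """
    if mapped_span_kb < insert_kb - tolerance_kb:
        return "insertion"
    if mapped_span_kb > insert_kb + tolerance_kb:
        return "deletion"
    return "concordant"

if __name__ == "__main__":
    for span in (20.0, 40.5, 60.0):
        print(span, classify_fosmid(span))
```

End placements only bound the event, which is why such calls are reported with outer coordinates rather than exact breakpoints.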

Page 25: Church_NCBIvariation2013
Page 26: Church_NCBIvariation2013
Page 27: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/clinvar

Page 28: Church_NCBIvariation2013

ClinVar data model and display

[Diagram: each SCV record combines Variant, Phenotype, and Submitter. Each RCV record combines Variant and Phenotype and groups several SCV records. Other labels: Allele, Variant.]
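The relationship between the two record types can be sketched as a grouping step: submitted records (SCV) that share a variant and a phenotype roll up into one reference record (RCV). The code below is only an illustration of that idea. The dictionary keys, the function name, and every accession and value are hypothetical placeholders, not ClinVar's actual schema or data.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

def group_submissions(submissions: List[dict]) -> Dict[Tuple[str, str], List[str]]:
    """Group SCV-style submissions by their (variant, phenotype) pair."""
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for scv in submissions:
        key = (scv["variant"], scv["phenotype"])
        groups[key].append(scv["accession"])
    return dict(groups)

# Hypothetical submissions for illustration only.
EXAMPLE_SUBMISSIONS = [
    {"accession": "SCV_A", "variant": "variant_1", "phenotype": "condition_X", "submitter": "lab_1"},
    {"accession": "SCV_B", "variant": "variant_1", "phenotype": "condition_X", "submitter": "lab_2"},
    {"accession": "SCV_C", "variant": "variant_1", "phenotype": "condition_Y", "submitter": "lab_1"},
]

if __name__ == "__main__":
    for key, accessions in group_submissions(EXAMPLE_SUBMISSIONS).items():
        print(key, accessions)
```

In this toy input, two labs asserting on the same variant and condition land in one group, while the same variant with a different condition forms its own group.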

Page 29: Church_NCBIvariation2013

ClinVar RCV report - Overview

Allele summary
• Gene
• Variant type
• Genomic location
• HGVS expressions*
• Molecular consequence*
• Links*
• Frequency*

Phenotype summary
• Names
• Links*
• Age of onset*
• Prevalence*

Interpretation
• Significance
• Review status*
• Accession.version*

* May be provided by NCBI

Page 30: Church_NCBIvariation2013

ClinVar RCV report – Summary of assertions

• Each submission is accessioned and versioned
• Terms provided by the submitter are mapped to controlled values
• Method of review is clearly reported so primary data can be distinguished from that reported in the literature

Page 31: Church_NCBIvariation2013

ClinVar RCV report - Evidence

Under active review

Page 32: Church_NCBIvariation2013

Allele report – available December

Page 33: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/refseq/rsg
http://www.lrg-sequence.org/

Page 34: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/refseq/rsg

RefSeqGene

LRG

Page 35: Church_NCBIvariation2013
Page 36: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/genome/tools/remap

From Assembly 1 <-> Assembly 2
Assembly <-> RefSeqGene/LRG
Primary Assembly <-> Alternate loci

Page 37: Church_NCBIvariation2013

1:215844373

Page 38: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/variations/tools/reporter

This new look coming next month

Page 39: Church_NCBIvariation2013
Page 40: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/variation/view

Page 41: Church_NCBIvariation2013
Page 42: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/variation/tools/get-rm

Calls

Tests

cSRA

Concordant
Discordant
NA

Target audience: Clinical testing labs
Submissions from: Clinical and Research labs

Page 43: Church_NCBIvariation2013

Twelve submitting labs to date

Twelve custom scripts to regularize data

Defined formats here: http://www.ncbi.nlm.nih.gov/projects/variation/get-rm

Page 44: Church_NCBIvariation2013

[Bar chart: NA12878 Tests by Platform. Platforms: HiSeq 2000, HiSeq 2500, MiSeq, Ion Torrent, Sanger, 454. Vertical axis runs from 0 to 30.]

Page 45: Church_NCBIvariation2013

Lab Provided Validation

Variants validated in this sample using another platform
Variants validated in another sample using another platform
Variants seen in other samples from submitting lab using this platform
Variants seen in public data set
Variants that are novel
Variants that were not assessed

Page 46: Church_NCBIvariation2013

Based on May 2013 Data release

Page 47: Church_NCBIvariation2013

Based on May 2013 Data release

Page 48: Church_NCBIvariation2013

http://www.ncbi.nlm.nih.gov/variation/tools/get-rm

Page 49: Church_NCBIvariation2013
Page 50: Church_NCBIvariation2013

Gene level concordance

$$\text{concordance} = \frac{\sum \max_i(x_i)}{\sum T}$$

i = genotype call
x = count per call for each variant
T = total genotype calls per variant

Sums are taken over all variants in a gene.
Tested regions taken into account
Phasing ignored
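A worked example may help with the definition above. The sketch below reads the formula as the sum, over all variants in a gene, of the count for the most common genotype call, divided by the sum of all genotype calls for those variants. That reading, the function name, and the example counts are assumptions for illustration; the adjustment for tested regions mentioned on the slide is not modeled here.

```python
from typing import Dict, List

def gene_concordance(variants: List[Dict[str, int]]) -> float:
    """Gene-level concordance from per-variant genotype call counts.

    Each item maps a genotype call (for example "A/G") to the number of
    tests that reported it for one variant in the gene.
    """
    called = [counts for counts in variants if counts]  # skip variants with no calls
    numerator = sum(max(counts.values()) for counts in called)
    denominator = sum(sum(counts.values()) for counts in called)
    if denominator == 0:
        raise ValueError("no genotype calls to score")
    return numerator / denominator

if __name__ == "__main__":
    # Hypothetical counts: 12 tests called each of two variants in one gene.
    example = [
        {"A/G": 10, "A/A": 2},  # 10 of 12 tests agree
        {"C/T": 12},            # all 12 tests agree
    ]
    print(round(gene_concordance(example), 4))
```

For the example counts the numerator is 10 + 12 = 22 and the denominator is 12 + 12 = 24, giving a concordance of about 0.917.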