church gia13

Post on 24-Jun-2015

Converting from Analog to Digital: Integrating the historical archive of human variation in an NGS world

Deanna M. Church, Staff Scientist, NCBI

@deannachurch Genome Informatics Alliance 2013

Acknowledgements

GeT-RM: Lisa Kalman (CDC), Birgit Funke (Harvard), Madhuri Hegde (Emory), Maryam Halavi, Chao Chen, Jon Trow, Douglas Slotta, Peter Meric, Daniel Frishberg, Victor Ananiev

ClinVar: Alex Astashyn, Shanmuga Chitipiralla, Douglas Hoffman, Wonhee Jang, Brandi Kattman, Melissa Landrum, Jennifer Lee, Adriana Malheiro, Wendy Rubinstein, George Riley, Amanjeev Sethi, Ricardo Villamarin

ISCA: Christa Lese Martin (Geisinger), Erin Riggs (Geisinger), Jose Mena, Mike Feolo, Tim Hefferon, John Garner, John Lopez

GRC: Valerie Schneider (NCBI), The Genome Institute at Washington University, The Wellcome Trust Sanger Institute, The European Bioinformatics Institute

Variation

Diagram labels: Clinical Labs; Phenotypes; Variant Call (dbVar submission); Array data files; QC Analysis; Curation; Data regularization; dbGaP; Controlled Access; Web access; FTP Access; Assembly Remapping; dbVar; ISCA; UCSC; DGV; DGVa; NCBI; Approved Users; BioProject ID; ClinVar

dbGaP projects need a sponsoring NIH institute to run the DAC (NICHD)

ASD: Atrial Septum Defect? Autism Spectrum Disorder?

No HPO: 1,814

HPO: 6,770

Riggs et al., 2012

~2 HPO terms/case (max of 16)

The Human Phenotype Ontology

http://www.ncbi.nlm.nih.gov/medgen

Variation

Figure: size (gigabytes), axis ticks from 1 to 100,000 in powers of ten, by component: sequences, alignments, genotype likelihoods, individual variants

File formats shown: FASTQ, BAM, VCF, VCF

1092 genomes (low coverage + exome)

38.2M SNPs, 3.9M Short Indels and 14K Deletions

Steve Sherry, NCBI

http://www.bioplanet.com/gcat

http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes

http://genomereference.org

GRCh37

Dennis et al., 2012

1q32 1q21 1p21

1p21 patch alignment to chromosome 1

Hydin: chr16 (16q22.2)

Hydin2: chr1 (1q21.1)

Missing in NCBI35; unlocalized in NCBI36/GRCh37; finished in GRCh38

Alignment to Hydin2 Genomic, 300 Kb, 99.4% ID

Alignment to Hydin1 CHM1_1.0, >99.9% ID

Doggett et al., 2006

Kidd et al, 2007 APOBEC cluster

Part of chr22 assembly

Alternate locus for chr22

White: Insertion; Black: Deletion

http://www.ncbi.nlm.nih.gov/variation/tools/1000genomes

Human Resolved for GRCh38

http://genomereference.org

GRCh38 is coming (September 2013)

http://www.ncbi.nlm.nih.gov/variation/tools/get-rm

Calls

Tests

cSRA

Concordant; Discordant; NA

Target audience: Clinical testing labs

Submissions from: Clinical and Research labs

Reporting Standards: Not standard

Twelve submitting labs to date

Twelve custom scripts to regularize data

Despite defined formats here: http://www.ncbi.nlm.nih.gov/projects/variation/get-rm

What are the issues?

Better Example: QUAL*

*Required sixth column in VCF file

10.01-18357.112.6-21.20-21.220-3070

Allele string

34.79-44624.03

None

20-46006
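The spread of QUAL values above suggests each submission needs its own check before data can be compared. A minimal sketch of such a check, assuming plain tab-delimited VCF text: `qual_range` is a hypothetical helper (not part of any GeT-RM tooling) and the records are invented for illustration.

```python
def qual_range(vcf_lines):
    """Return (min, max, n_non_numeric) over the QUAL column of VCF data lines."""
    values, non_numeric = [], 0
    for line in vcf_lines:
        # Skip blank lines, '##' meta lines and the '#CHROM' header line.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        qual = fields[5]  # QUAL is the sixth tab-separated column
        try:
            values.append(float(qual))
        except ValueError:
            # "." (missing) or free text such as an allele string
            non_numeric += 1
    if not values:
        return None, None, non_numeric
    return min(values), max(values), non_numeric


# Toy records, invented for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t10.01\tPASS\t.",
    "1\t200\t.\tC\tT\t.\tPASS\t.",
    "1\t300\t.\tG\tA\t70\tPASS\t.",
]
print(qual_range(example))
```

Reporting the count of non-numeric entries separately keeps a lab that writes text into QUAL from silently disappearing from the summary.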

Lab reporting a single nucleotide change (C->T) het change as:

c.1956+15C>CT

HGVS standards say this should be reported as:

c.1956+15C>T[=]

Lab reporting a single nucleotide change (A->G) hom change as:

c.670+9A>G

HGVS standards say this should be reported as:

c.[670+9A>G];[670+9A>G]
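The two cases above could be caught by a small regularization step. This is a sketch under stated assumptions: it handles only simple substitutions, the function names `lab_style_het` and `homozygous` are hypothetical, and it is not a full HGVS parser.

```python
import re

# Simple substitution such as "c.670+9A>G", or the lab-style "c.1956+15C>CT".
SUBST = re.compile(
    r"^(?P<prefix>[cg])\.(?P<pos>[0-9*+\-]+)(?P<ref>[ACGT])>(?P<alt>[ACGT]+)$"
)


def lab_style_het(expr):
    """True if both alleles are listed after '>' (e.g. 'C>CT'), which is not HGVS."""
    m = SUBST.match(expr)
    return bool(m) and len(m["alt"]) == 2 and m["ref"] in m["alt"]


def homozygous(expr):
    """Rewrite 'c.670+9A>G' in the allele-bracket form 'c.[670+9A>G];[670+9A>G]'."""
    m = SUBST.match(expr)
    if m is None or len(m["alt"]) != 1:
        raise ValueError(f"not a simple substitution: {expr!r}")
    change = f"{m['pos']}{m['ref']}>{m['alt']}"
    return f"{m['prefix']}.[{change}];[{change}]"


print(lab_style_het("c.1956+15C>CT"))
print(homozygous("c.670+9A>G"))
```

Zygosity has to come from the lab's genotype field; the expression `c.670+9A>G` alone does not say whether the change is het or hom.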

Defining a reference sequence: Data validation

Reported as:

NM_007171.3:c.942T>C

Base in transcript is a 'C', not a 'T'
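A check of this kind can be sketched as follows. Everything here is illustrative: `check_reference_base` is a hypothetical helper, the accession `NM_000000.0` and the six-base coding sequence are made up, and a real check would fetch the CDS for the exact transcript version named in the submission.

```python
import re

# Coding-DNA substitution at an exonic position, e.g. "NM_007171.3:c.942T>C".
HGVS_C = re.compile(
    r"^(?P<acc>[A-Z]{2}_\d+\.\d+):c\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def check_reference_base(expr, cds):
    """Return (matches, actual_base) for the stated reference base.

    `cds` is the coding sequence of the named transcript; position 1 is the
    A of the ATG start codon.
    """
    m = HGVS_C.match(expr)
    if m is None:
        raise ValueError(f"unsupported expression: {expr!r}")
    pos = int(m["pos"])
    if not 1 <= pos <= len(cds):
        raise ValueError(f"position {pos} is outside the coding sequence")
    actual = cds[pos - 1].upper()
    return actual == m["ref"], actual


# Toy data: a made-up accession and sequence, not a real transcript.
toy_cds = "ATGGCC"
print(check_reference_base("NM_000000.0:c.5T>C", toy_cds))
```

In the toy call the submitter states a T at position 5 while the sequence has a C there, the same kind of mismatch as the one shown on the slide.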

http://www.ncbi.nlm.nih.gov/clinvar

Standardize data: what is the variation?

607008.0001

985A>G
985A>G (K304E)
A985G
ACADM, LYS304GLU
K304E
K304E (985 A->G)
K304E (K329E)
K304E only
K329E
K329E(985A>G)
LYS304GLU
Mutation c.985A>G (p.K304E)
c.985A>G
c.985A>G (p.K304E)
c.985A>G (p.Lys304Glu
includes: K304E (985A>G)
p.K304E
p.Lys329Glu
previously known as p.Lys329Glu
Analysis of ACADM 985A>G mutation

NC_000001.10:g.76226846A>G
NG_007045.1:g.41804A>G
NM_000016.4:c.985A>G
NP_000007.1:p.Lys329Glu
rs77931234
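One way to picture the standardization step is a lookup from submitted names to a single reference-anchored expression. This is only a sketch: the synonym set is a subset of the names listed above, `standardize` is a hypothetical helper, and the mapping is valid only when the submission is already known to concern ACADM, because names such as K304E carry no gene or reference sequence.

```python
# Canonical expression taken from the list above.
CANONICAL = "NM_000016.4:c.985A>G"

# A subset of the submitted names listed above (assumes ACADM context).
SYNONYMS = {
    "985A>G", "A985G", "K304E", "K329E", "LYS304GLU",
    "c.985A>G", "p.K304E", "p.Lys329Glu",
}


def normalize_key(name):
    """Drop all whitespace and ignore case so trivial variants compare equal."""
    return "".join(name.split()).upper()


LOOKUP = {normalize_key(s): CANONICAL for s in SYNONYMS}


def standardize(name):
    """Return the canonical expression, or None if the name is not recognized."""
    return LOOKUP.get(normalize_key(name))


print(standardize("k304e"))
print(standardize("R123X"))
```

Returning `None` for unknown names leaves them visible for manual curation rather than guessing a mapping.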

Miki et al, 1994
