database of Genotype and Phenotype

Page 1: database of Genotype and Phenotype

National Center for Biotechnology Information

database of Genotype and Phenotype

Kim Pruitt (for Matt Mailman)

NCBI

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap

Page 2: database of Genotype and Phenotype

Overview

• Phenotype
• Genotype
• Genotype X Phenotype Association

Page 3: database of Genotype and Phenotype

Overview

• Phenotype
  – Data tables
    • Columns are phenotypes
    • Rows are individuals
  – Documents (e.g., protocols, data collection forms)
    • Parts of documents linked to variables
  – Data dictionary
• Genotype
• Genotype X Phenotype Association
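The table layout on this slide can be sketched in a few lines of Python. This is only an illustration: the column names (SUBJECT_ID, AGE, AFFECTED), the dictionary fields, and the decode_row helper are hypothetical and do not reflect the actual dbGaP file formats.

```python
# Hypothetical data dictionary: one entry per phenotype variable (column).
DATA_DICTIONARY = {
    "AGE": {"description": "Age at enrollment", "type": "integer", "unit": "years"},
    "AFFECTED": {
        "description": "Case/control status",
        "type": "encoded",
        "values": {"0": "control", "1": "case"},
    },
}

# Hypothetical data table: each row is one individual, each column a phenotype.
DATA_TABLE = [
    {"SUBJECT_ID": "S001", "AGE": "71", "AFFECTED": "1"},
    {"SUBJECT_ID": "S002", "AGE": "64", "AFFECTED": "0"},
]


def decode_row(row, dictionary):
    """Interpret the raw string values of one row using the data dictionary."""
    decoded = {}
    for column, raw in row.items():
        entry = dictionary.get(column)
        if entry is None:
            decoded[column] = raw  # no dictionary entry (here: the subject ID)
        elif entry["type"] == "integer":
            decoded[column] = int(raw)
        elif entry["type"] == "encoded":
            decoded[column] = entry["values"][raw]
        else:
            decoded[column] = raw
    return decoded


decoded_table = [decode_row(row, DATA_DICTIONARY) for row in DATA_TABLE]
```

Keeping the dictionary separate from the table means the meaning of each column travels with the data rather than living only in the study documents.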

Page 4: database of Genotype and Phenotype

Overview

• Phenotype
• Genotype
  – Genotype files directly from vendor
  – Intensity files (e.g., .CEL)
• Genotype X Phenotype Association

Page 5: database of Genotype and Phenotype

Overview

• Phenotype
• Genotype
• Genotype X Phenotype Association
  – Various statistical models and methods
  – P-value or LOD score for each marker
  – Filters by P-value, HWE, minor allele frequency
  – Map phenotypes onto genomic sequence
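The three filters named above can be illustrated with a minimal Python sketch. It assumes biallelic markers with known genotype counts and uses a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium (HWE). The Marker class, the marker names, and the default thresholds are hypothetical; the slides do not say which HWE test or cutoffs dbGaP applies.

```python
import math
from dataclasses import dataclass


@dataclass
class Marker:
    name: str
    n_aa: int        # individuals homozygous for allele A
    n_ab: int        # heterozygous individuals
    n_bb: int        # individuals homozygous for allele B
    p_value: float   # association P-value reported for this marker


def minor_allele_frequency(m):
    n = m.n_aa + m.n_ab + m.n_bb
    freq_a = (2 * m.n_aa + m.n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_p_value(m):
    """Chi-square (1 df) test of observed vs. HWE-expected genotype counts."""
    n = m.n_aa + m.n_ab + m.n_bb
    p = (2 * m.n_aa + m.n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (m.n_aa, m.n_ab, m.n_bb)
    if any(e == 0 for e in expected):
        return 1.0  # monomorphic marker: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def filter_markers(markers, max_p=1e-4, min_hwe_p=1e-3, min_maf=0.05):
    """Keep markers that are significant, consistent with HWE, and not rare."""
    return [
        m for m in markers
        if m.p_value <= max_p
        and hwe_p_value(m) >= min_hwe_p
        and minor_allele_frequency(m) >= min_maf
    ]


markers = [
    Marker("marker_a", 25, 50, 25, 1e-6),  # passes all three filters
    Marker("marker_b", 50, 0, 50, 1e-8),   # strong departure from HWE
    Marker("marker_c", 98, 2, 0, 1e-7),    # minor allele too rare
    Marker("marker_d", 25, 50, 25, 0.2),   # not significant
]
kept = filter_markers(markers)
```

Filtering on HWE and minor allele frequency alongside the P-value helps screen out associations driven by genotyping artifacts or very rare alleles.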

Page 6: database of Genotype and Phenotype

Overview

• Phenotype
• Genotype
• Genotype X Phenotype Association

• Obvious expansion potential:
  – More species; different types of association data (QTL)

• Critically important to archive all data:
  – Submit primary data to appropriate public archive!
  – Probe DB: primers, resequencing amplicons
  – dbSTS: STS markers
  – Maps: UniSTS; Map Viewer
  – GenBank: ESTs

Page 7: database of Genotype and Phenotype

dbGaP Web Site

Two levels of access: open and controlled

• Open access to non-sensitive data
  • Study summaries and documents
  • Measured variables and data elements
  • Analysis reports
  • Genome browser

• Controlled access provides oversight and accountability for use of sensitive datasets involving personal information
  • De-identified phenotypes and genotypes for individual subjects
  • Pedigrees

Page 8: database of Genotype and Phenotype

Browse Studies

Link to study report

List of variables in study

Instructions

Description of dbGaP

List of documents in study

Link back to dbGaP homepage

Automated query to PubMed for genome-wide association study articles

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap

Page 9: database of Genotype and Phenotype

Browse Studies by Disease

Expand/collapse

Link to terms from MeSH vocabulary

Link to study report

Page 10: database of Genotype and Phenotype

Advanced Search

Fields to be searched

Add any number of search criteria

Page 11: database of Genotype and Phenotype

Study Report

Links back to submitter website

History

Publications

Attribution

Access Rules

Link to variable report

Search this study

Genotype x phenotype association or linkage analyses

Citeable unique stable identifier

Criteria for inclusion/exclusion

Page 12: database of Genotype and Phenotype

Variable Report

Citeable unique stable identifier

Documents containing a section that has been linked to this variable

P-value is red if cases differ from controls

Statistical summary of values for this variable

Page 13: database of Genotype and Phenotype

Variable Report (continued)

Document name

Section of document that has been linked to this variable

Link to document

Page 14: database of Genotype and Phenotype

Analysis Report

Link back to report for measured or derived variable that was analyzed

Genome browser of analysis results

Page 15: database of Genotype and Phenotype

Genome Browser of Analysis Results

Slider filters out results less significant than the threshold

2 Mb bins colored to represent the most significantly associated marker

Click on a bin of interest to zoom in and see the association in context with other objects mapped to the same genomic region
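The binning and slider behavior described on this slide can be approximated as follows. This is a sketch under stated assumptions, not the browser's implementation: the (chromosome, position, P-value) tuple layout, the function names, and the sample values are all hypothetical, and positions are assumed to be base-pair coordinates.

```python
BIN_SIZE = 2_000_000  # 2 Mb, the bin width named on the slide


def bin_most_significant(markers, bin_size=BIN_SIZE):
    """Map each (chromosome, bin index) to the smallest P-value seen in that bin.

    `markers` is an iterable of (chromosome, position_bp, p_value) tuples.
    """
    best = {}
    for chrom, position, p_value in markers:
        key = (chrom, position // bin_size)
        if key not in best or p_value < best[key]:
            best[key] = p_value
    return best


def apply_slider(bins, threshold):
    """Drop bins whose best marker is less significant than the threshold."""
    return {key: p for key, p in bins.items() if p <= threshold}


# Hypothetical association results.
results = [
    ("1", 500_000, 0.01),
    ("1", 1_900_000, 1e-6),   # same bin as the marker above, more significant
    ("1", 2_100_000, 0.3),
    ("2", 100, 0.04),
]
bins = bin_most_significant(results)
visible = apply_slider(bins, threshold=0.05)
```

Each surviving bin could then be colored by a score such as the negative base-10 logarithm of its P-value, so that more significant bins stand out.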

Page 16: database of Genotype and Phenotype

Genome Browser – Higher Resolution

P-value of genotyped marker

Scroll via boxes above

Collapse table

CFH gene has been associated with AMD in several studies

Add maps

Page 17: database of Genotype and Phenotype

Coming Soon…

• Studies
  – Early 2007
    • Michael J. Fox Foundation Parkinson’s Disease Study (LEAPS)
    • NINDS Stroke and ALS
  – Spring 2007
    • GAIN (Genetic Association Information Network)
    • Framingham SHARe – first two generations
    • NIDDK GoKinD and EDIC
  – Summer 2007
    • Framingham SHARe – third generation
  – Late 2007 to early 2008
    • GEI (Genes and Environment Initiative)

• Features
  – Search analysis results by:
    • Gene
    • SNP or microsatellite marker
    • Genomic region
  – Filter analysis results by:
    • P-value
    • HWE
    • Minor allele frequency
    • Call rate?
  – Download
    • Public summaries
    • Authorized access for individual-level data

Page 18: database of Genotype and Phenotype

Acknowledgements

• Phenotype
  – Rinat Bagoutdinov
  – Luning Hao
  – Mas Kimura
  – Jimmy Jin
  – Natasha Popova
  – Stephanie Pretels
  – Karl Sirotkin
  – Jack Wang
  – Matt Mailman

• Genotype
  – Mike Feolo
  – Lon Phan
  – David Shao
  – Ming Ward
  – Steve Sherry

• XML
  – Kim Tryka
  – Laura Kelly
  – Jeff Beck

• Authorized Access
  – Steve Sherry
  – Eugene Yaschenko
  – Valdimir Soussov
  – Misha Kimmelman
  – Don Preuss
  – Al Graeff
  – Jim Ostell

http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gap