![Page 1: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/1.jpg)
April 2006 March 2007
Xosé Mª Fernández
European Bioinformatics Institute
Browsing Genomes with Ensembl
![Page 2: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/2.jpg)
Outline of talk
• Overview of Ensembl
• Making genomes useful
• Beyond Ensembl
![Page 3: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/3.jpg)
Outline of talk
• Overview of Ensembl
  – Ensembl - Project
  – Exploring genomes
  – Gene annotation
• Making genomes useful
• Beyond Ensembl
![Page 4: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/4.jpg)
Ensembl - Project
• Joint project
  – EMBL - European Bioinformatics Institute (EBI)
  – Wellcome Trust Sanger Institute
• Produce accurate, automatic genome annotation
• Focused on selected eukaryotic genomes
• Integrate external (distributed) biological data
• Presentation of the analysis to all via the Web at http://www.ensembl.org
• Open distribution of the analysis to the community
• Development of open, collaborative software (databases and APIs)
![Page 6: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/6.jpg)
Beyond classical ab initio gene prediction
• Ensembl automatic gene prediction relies on homology 'supporting evidence' to avoid overprediction.
• Classical ab initio gene prediction (e.g. GENSCAN) relies partly on global statistics of protein-coding potential, which the cell itself does not use.
• Genes are just a series of short signals
  – Transcription start site
  – Translation start site
  – 5' and 3' intron splicing signals
  – Termination signals
• Short signal sequences are difficult to recognise over the background noise in large genomes
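To make the point about background noise concrete, here is a minimal sketch (not the method used by Ensembl or GENSCAN) that counts how often a few short motifs turn up purely by chance in a random sequence. The motifs are deliberate simplifications: ATG for a translation start, GT and AG for the canonical intron boundary dinucleotides, and TAA/TAG/TGA for stop codons. The names `count_motif` and `random_dna` are hypothetical helpers for this example.

```python
import random

# Simplified, illustrative motifs; real signals are longer and context dependent.
MOTIFS = {
    "translation start (ATG)": ["ATG"],
    "5' splice donor (GT)": ["GT"],
    "3' splice acceptor (AG)": ["AG"],
    "stop codon (TAA/TAG/TGA)": ["TAA", "TAG", "TGA"],
}


def count_motif(sequence: str, motif: str) -> int:
    """Count occurrences of motif in sequence, including overlapping ones."""
    return sum(
        1
        for i in range(len(sequence) - len(motif) + 1)
        if sequence[i : i + len(motif)] == motif
    )


def random_dna(length: int, seed: int = 0) -> str:
    """Generate a uniform random DNA sequence (a crude stand-in for background)."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    background = random_dna(100_000)
    for name, motifs in MOTIFS.items():
        hits = sum(count_motif(background, m) for m in motifs)
        print(f"{name}: {hits} chance matches in {len(background)} bp")
```

In a uniform random sequence a given dinucleotide is expected about once every 16 positions and a given trinucleotide about once every 64, so motifs this short produce far too many candidate sites on their own.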
![Page 8: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/8.jpg)
Ensembl v43
![Page 10: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/10.jpg)
DAS Registry
http://www.dasregistry.org
![Page 11: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/11.jpg)
DAS
![Page 13: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/13.jpg)
Pre! and Archive! sites
http://pre.ensembl.org
http://www.ensembl.org
http://archive.ensembl.org
![Page 15: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/15.jpg)
Open source, open standards
• Object model
  – standard interface makes it easy for others to build custom applications on top of Ensembl data
• Open discussion of design ([email protected])
• Most major pharma and many academics are represented on the mailing list, and code is being actively developed externally
• Ensembl locally
  – Both industry & academia
![Page 16: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/16.jpg)
Ensembl – Open source
![Page 18: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/18.jpg)
APIs
• Used to retrieve data from and to store data in Ensembl databases.
• Ensembl Perl API:
  – Written in object-oriented Perl
  – Foundation for the Ensembl Pipeline and the Ensembl Web interface
![Page 20: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/20.jpg)
Making genomes useful
• Interpretation
  – Where are the interesting parts of the genome?
  – What do they do?
  – How are they related to elements in other genomes?
• Access
  – for bench biologists
  – for non-programming mid-scale groups
  – for good programming groups
![Page 21: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/21.jpg)
Access… bench biologists
• Mainly via the web
• Web site designed for non-programming biologists who are not that genome-aware
  – Simple things to find are simple to find
  – Graphical displays and overviews
  – Consistency of layout, colour and text
![Page 22: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/22.jpg)
Ensembl
[Diagram labels: Analysis DB, CPU, Final DB, Supporting Databases, SNP, Manual Annotation]
![Page 23: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/23.jpg)
Genome browsing: why present the whole genome?
• Explore what is in a chromosome region
• See features in and around a specific gene
• Search & retrieve across the whole genome
• Investigate genome organization
• Compare to other genomes
![Page 24: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/24.jpg)
Introduction to the Ensembl web site
Ensembl …
  takes genomic sequence assemblies: human build 36, mouse, rat, mosquito…
  adds annotation and links: automated process
  presents all the data on a web site
![Page 25: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/25.jpg)
Basic Genome Annotation
• Genes
  – Genomic location
  – Gene model structures
    • Exons
    • Introns
    • UTRs
  – Transcript(s)
    • Pseudogenes
    • Non-coding RNA
  – Protein(s)
  – Links to other sources of information
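As a rough illustration of the gene model structure listed above, the sketch below represents a transcript as a list of exons and derives its introns and UTR lengths from a coding start and end. The class and field names are hypothetical and are not the Ensembl API; coordinates are assumed to be 1-based, inclusive, and on the forward strand.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Exon:
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


@dataclass
class Transcript:
    exons: List[Exon]   # forward strand only (assumption of this sketch)
    coding_start: int   # genomic position of the first coding base
    coding_end: int     # genomic position of the last coding base

    def sorted_exons(self) -> List[Exon]:
        return sorted(self.exons, key=lambda e: e.start)

    def introns(self) -> List[Tuple[int, int]]:
        """Introns are the gaps between consecutive exons."""
        ordered = self.sorted_exons()
        return [(a.end + 1, b.start - 1) for a, b in zip(ordered, ordered[1:])]

    def utr_lengths(self) -> Tuple[int, int]:
        """Return (5' UTR length, 3' UTR length) counted in exonic bases."""
        five = three = 0
        for exon in self.sorted_exons():
            # Exonic bases before the coding start belong to the 5' UTR.
            if exon.start < self.coding_start:
                five += min(exon.end, self.coding_start - 1) - exon.start + 1
            # Exonic bases after the coding end belong to the 3' UTR.
            if exon.end > self.coding_end:
                three += exon.end - max(exon.start, self.coding_end + 1) + 1
        return five, three


transcript = Transcript(
    exons=[Exon(100, 200), Exon(300, 400), Exon(500, 650)],
    coding_start=150,
    coding_end=600,
)
print(transcript.introns())      # [(201, 299), (401, 499)]
print(transcript.utr_lengths())  # (50, 50)
```

Storing only exons and the coding boundaries keeps the model small: introns and UTRs are derived rather than stored, so they cannot drift out of sync with the exon coordinates.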
![Page 26: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/26.jpg)
Advanced Genome Annotation
• Cytogenetic bands
• Polymorphic markers
  – Sequence Tagged Sites (STS)
• Genetic variation
  – Single Nucleotide Polymorphisms (SNPs)
  – Deletion-Insertion Polymorphisms (DIPs)
  – Short Tandem Repeats (STRs)
• Repetitive sequences
• Expressed Sequence Tags (ESTs)
• cDNAs or mRNAs from related species
• Regions of sequence homology
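The classes of genetic variation listed above can be told apart, in a simplified way, by comparing a reference allele with an alternative allele. The following is a minimal sketch under that assumption; the function names and the use of "-" for an absent allele are hypothetical choices for this example, not an Ensembl format.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference and alternative alleles.

    '-' denotes an absent allele (an assumption of this sketch).
    """
    ref = "" if ref == "-" else ref.upper()
    alt = "" if alt == "-" else alt.upper()
    if ref == alt:
        return "no change"
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"   # a single base is substituted
    if len(ref) != len(alt):
        return "DIP"   # bases are deleted or inserted
    return "other"     # e.g. a multi-base substitution


def tandem_repeat_count(sequence: str, unit: str) -> int:
    """Longest run of consecutive copies of unit in sequence (relevant to STRs)."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Extend the run while the next copy of the unit follows directly.
        while sequence.startswith(unit, pos):
            count += 1
            pos += len(unit)
        best = max(best, count)
    return best


print(classify_variant("A", "G"))                 # SNP
print(classify_variant("-", "AT"))                # DIP
print(tandem_repeat_count("GGCACACACATT", "CA"))  # 4
```

Note that a change in the number of copies of a short tandem repeat also alters allele length, so this simple length test would label it a DIP; telling the two apart needs the repeat context as well.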
![Page 27: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/27.jpg)
How to get started …
• Species homepage
• Map View
• Text search
• BLAST
• SSAHA
![Page 28: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/28.jpg)
Homepage
![Page 29: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/29.jpg)
MapView
![Page 30: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/30.jpg)
BLAST and SSAHA
See BLAST hit on genome
![Page 31: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/31.jpg)
Regions, maps and markers
MarkerView
SNPView
GeneSNPView
ContigView
CytoView
SyntenyView
MultiContigView
![Page 32: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/32.jpg)
Ensembl ContigView
![Page 33: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/33.jpg)
ContigView close-up
Transcripts: red & black (Ensembl predictions); blue (Vega) & gold (HAVANA, only in human)
Pop-up menu
![Page 34: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/34.jpg)
ContigView - Navigation
Click and drag mouse to select region
![Page 35: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/35.jpg)
CytoView
![Page 36: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/36.jpg)
GeneSNPView
![Page 37: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/37.jpg)
SNPView
![Page 38: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/38.jpg)
MarkerView
![Page 39: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/39.jpg)
MultiContigView
![Page 40: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/40.jpg)
Genes & gene products
GeneView
TransView
ExonView
ProteinView
FamilyView
GOView
![Page 41: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/41.jpg)
Ensembl GeneView
![Page 42: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/42.jpg)
ExonView
TransView
![Page 43: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/43.jpg)
ProteinView
![Page 44: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/44.jpg)
FamilyView
![Page 45: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/45.jpg)
GOView
![Page 46: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/46.jpg)
Data retrieval
BioMart
Data sets on ftp site
MySQL queries of databases
Perl API access to databases
Export View
![Page 47: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/47.jpg)
ExportView
![Page 48: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/48.jpg)
Help!
• context sensitive help pages - click
• access other documentation via generic home page
• email the helpdesk
![Page 49: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/49.jpg)
Ensembl Team
July 2006
![Page 50: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/50.jpg)
Ensembl Team
March 2007
Leaders: Ewan Birney (EBI), Tim Hubbard (Sanger Institute)
Database Schema and Core API: Glenn Proctor, Andreas Kähäri, Ian Longden, Patrick Meidl
BioMart: Arek Kasprzyk, Damian Smedley, Richard Holland, Syed Haider
Distributed Annotation System (DAS): Eugene Kulesha
Outreach: Xosé M Fernández, Bert Overduin, Giulietta Spudich, Michael Schuster
Web Team: James Smith, Bethan Pritchard, Fiona Cunningham, Anne Parker, Stephen Rice, Steve Trevanion (VEGA), Matt Wood
Comparative Genomics: Abel Ureta-Vidal, Kathryn Beal, Benoît Ballester, Stephen Fitzgerald, Javier Herrero Sánchez, Albert Vilella
Analysis and Annotation Pipeline: Val Curwen, Steve Searle, Bronwen Aken, Julio Banet, Laura Clarke, Sarah Dyer, Jan-Hinnerck Vogel, Kevin Howe, Felix Kokocinski, Stephen Rice, Simon White
Functional Genomics: Paul Flicek, Yuan Chen, Stefan Gräf, Nathan Johnson, Daniel Rios
Zebrafish Annotation: Kerstin Howe, Mario Caccamo, Tina Eyre, Ian Sealy
VectorBase Annotation: Martin Hammond, Dan Lawson, Karyn Megy
Systems & Support: Guy Coates, Tim Cutts, Shelley Goddard
Research: Damian Keefe, Guy Slater, Michael Hoffman, Alison Meynert, Benedict Paten, Daniel Zerbino, Dace Ruklisa
![Page 51: April 2006 March 2007 Xosé Mª Fernández European Bioinformatics Institute Browsing Genomes with Ensembl](https://reader036.vdocuments.us/reader036/viewer/2022062407/56649d265503460f949fc94f/html5/thumbnails/51.jpg)
Training... Somewhere near you