Genome Browsers
Carsten O. Daub
Omics Science Center
RIKEN, Japan
May 2008
Outline
• Give some impression of the intuitive handling of the browsers
• Highlight some of the specific functions
• Point out some of the strengths of the different browsers
General comments
• We will focus on two genome browsers in this lecture
• UCSC genome browser
• ENSEMBL genome browser
• Some comments about the RIKEN FANTOM3 genome browser
General comments cont’d.
• The genome browsers are parts of bigger genome information resources
• Their main purpose is to graphically display complex information
• And to put this information into the genomic context
• We will not discuss all of the rich functionalities of UCSC and ENSEMBL here
General comments cont’d.
• Genome browsers are tools to
– display various types of information
– in the context of the genome
The context of the genome
• Graphical representation of the genome
• The chromosomes are displayed as straight ’strings’
• With coordinates for the positions
• Various features are aligned to the chromosomes and displayed as tracks
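The idea of features aligned to chromosome coordinates and grouped into tracks can be sketched in a few lines of Python. This is only an illustration, not how UCSC or ENSEMBL store their data: the `Feature` class, the `view` function, the track names, and all coordinates are hypothetical, and the 0-based, end-exclusive coordinate convention is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One annotated element placed on a chromosome (hypothetical model)."""
    chrom: str
    start: int  # 0-based, inclusive (assumed convention for this sketch)
    end: int    # exclusive
    name: str


def features_in_region(track, chrom, start, end):
    """Return the features of one track that overlap chrom:start-end."""
    return [
        f for f in track
        if f.chrom == chrom and f.start < end and f.end > start
    ]


def view(tracks, chrom, start, end):
    """Mimic a browser view: for each track, list the features in the region."""
    return {
        track_name: [f.name for f in features_in_region(feats, chrom, start, end)]
        for track_name, feats in tracks.items()
    }


# Made-up example data: two tracks aligned to the same coordinate system.
tracks = {
    "gene_models": [
        Feature("chr7", 1000, 5000, "geneA"),
        Feature("chr7", 8000, 9000, "geneB"),
    ],
    "snps": [
        Feature("chr7", 1200, 1201, "snp1"),
        Feature("chr2", 1200, 1201, "snp2"),
    ],
}

print(view(tracks, "chr7", 900, 2000))
```

Zooming in or out corresponds to calling `view` with a narrower or wider coordinate range; every track is queried with the same region, which is what puts the different kinds of information into a common genomic context.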
What information is displayed?
• Which features can be displayed in such a way?
• Which features can NOT be displayed this way?
UCSC genome browser
chr7
Mouseover effect
Zoom in
Tracks
Comments on Gene models
Some comments on gene models:
• Different databases have different ways to define gene models
Common examples are
• NCBI: RefSeq, ’Known genes’
• ENSEMBL: Known genes, Novel genes, Predicted genes
• What are the differences between the models?
• Which one is the better model?
stat3
Tracks in the Genome Browser
• Various types of information are aligned to the genome in groups, so-called tracks
• Each track contains a logical unit of information
– Different gene models
– Experimental evidence: cDNA, mRNA, EST
– Expression and regulation
– Repeats, SNPs, miRNA, ...
– Comparative genomics
Customizing tracks
• Tracks can be displayed at different levels of detail
dense
FULL
Upload your own track
• You can upload your own track to the genome browser
– As a file
– As a URL pointing to a file
– Data must be formatted in BED, GFF, GTF, WIG or PSL format
• Example:
– You want to display miRNA target predictions in the genome browser
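A minimal sketch of how such predictions could be turned into a BED custom track is shown below. The function name `bed_custom_track`, the track name, and the coordinates are hypothetical and do not come from a real prediction set. The sketch assumes the common six-column BED layout (chrom, start, end, name, score, strand) with 0-based start and end-exclusive coordinates, preceded by a `track` line whose `visibility` attribute selects the initial display mode (for example `dense` or `full`).

```python
def bed_custom_track(predictions, track_name, description, visibility="full"):
    """Format predictions as BED text with a leading track line.

    predictions: iterable of (chrom, start, end, name, score, strand) tuples,
    with 0-based start and exclusive end (BED convention).
    """
    lines = [
        f'track name="{track_name}" description="{description}" '
        f"visibility={visibility}"
    ]
    for chrom, start, end, name, score, strand in predictions:
        if not 0 <= start < end:
            raise ValueError(f"invalid interval for {name}: {start}-{end}")
        if strand not in ("+", "-"):
            raise ValueError(f"invalid strand for {name}: {strand}")
        fields = (chrom, start, end, name, score, strand)
        lines.append("\t".join(str(field) for field in fields))
    return "\n".join(lines) + "\n"


# Made-up miRNA target predictions, for illustration only.
example_predictions = [
    ("chr7", 100, 122, "miR-target-1", 500, "+"),
    ("chr7", 480, 502, "miR-target-2", 800, "-"),
]

bed_text = bed_custom_track(
    example_predictions,
    track_name="miRNA_targets",
    description="Hypothetical miRNA target predictions",
)
print(bed_text)
```

The resulting text can be saved to a file and uploaded, or placed on a web server and referenced by URL, matching the two upload routes listed above.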
Export as high quality graphic
• It can be important, for example for a publication, to obtain high-quality versions of the graphics displayed in the genome browser
Customizing the display
• Many details about the display can be customized
UCSC genome browser as data repository
• The genome browser is the front-end of a data repository
• The backend is a database that contains all the details about the displayed information
• The information in the databases can be retrieved separately from the download section
ENSEMBL
• About the Ensembl Project
• Ensembl is a joint project between EMBL - European Bioinformatics Institute (EBI) and the Wellcome Trust Sanger Institute (WTSI) to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes. Ensembl is primarily funded by the Wellcome Trust.
• Goals of Ensembl
• The Ensembl project aims to provide:
– Accurate, automatic analysis of genome data.
– Analysis and annotation maintained on the current data.
– Presentation of the analysis to all via the web.
– Distribution of the analysis to other bioinformatics laboratories.
stat3
Summary
• The UCSC browser and the ENSEMBL browser have very similar functions
• It takes some time to get accustomed to either of them
• Recommendation:
– Choose one of them
– Get used to it
– Stick to it
Summary cont’d
• They are highly flexible
• They make it easy to get an impression of a genomic region
• They are extremely powerful tools
– For beginners as well as for experts
– For biologists as well as bioinformaticians
RIKEN Genomic Elements Viewer
• RIKEN provides the genomic elements viewer
• It was specifically developed to display the RIKEN CAGE data
• CAGE data provides information about the start of a transcript
• And is very valuable for e.g.
– Promoter analysis
– Alternative regulation (isoforms)