title: genewiz browser: an interactive tool for visualizing sequenced chromosomes by peter f....

Title: GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes By Peter F. Hallin, Hans-Henrik Stærfeldt, Eva Rotenberg, Tim T. Binnewies, Craig J. Benham, and David W. Ussery Published on Standards in Genomic Sciences (2009) 1: 204-215 Citation count: 35

Upload: brendan-nelson

Post on 11-Jan-2016


Page 1:

Title: GeneWiz browser: An Interactive Tool for Visualizing Sequenced Chromosomes

By Peter F. Hallin, Hans-Henrik Stærfeldt, Eva Rotenberg, Tim T. Binnewies, Craig J. Benham, and David W. Ussery

Published in Standards in Genomic Sciences (2009) 1: 204-215

Citation count: 35

Page 2: Background

• After more than 15 years of genome sequencing, the public genome databases hold more than a thousand sequenced genomes.

• Analyzing multiple genomes across different species is useful to biologists for a broad range of purposes, especially:
  o identifying phylogenetic relationships and the genomic regions that cause pathogenicity in humans and animals
  o finding new target genes of industrial and economic value

• Tools for this kind of analysis are scarce and often require both analytical and programming knowledge, so multi-genome analysis is not always easy across the broad range of biological research.

Page 3: Function of the GeneWiz browser

• GeneWiz browser visualizes genomic data of prokaryotic chromosomes.

• The tool provides several functions:
  o visualizing whole-genome homology of genes and proteins in a reference strain compared with other strains or species
  o visualizing DNA physical properties, such as curvature, along the chromosome
  o identifying repeat sequences along the chromosome
  o plotting custom numerical data, such as gene expression and regulation data

• The web-interface service provides an interoperable method for whole-genome visualization.

Page 4: Implementation of GeneWiz browser

• The tool converts numerical information into color-encoded lanes, using either a linear scale with a fixed minimum and maximum or a dynamic scale based on standard deviations. Lanes can show:
  o DNA properties, computed with various established methods, that indicate regions with particular biological functions
  o homologous genes mapped by BLAST (Basic Local Alignment Search Tool)
  o short sequencing reads mapped with weighted coverage
  o custom data, pre-processed and provided by users
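The two scaling modes named above can be illustrated with a minimal Python sketch. It is not the GeneWiz implementation: the function names, the white-to-blue color ramp, and the default of three standard deviations are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def linear_scale(values, vmin, vmax):
    """Map values to [0, 1] using a fixed minimum and maximum; clamp outliers."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    span = vmax - vmin
    return [min(1.0, max(0.0, (v - vmin) / span)) for v in values]


def stddev_scale(values, n_sd=3.0):
    """Map values to [0, 1] over the range mean +/- n_sd standard deviations.

    n_sd=3.0 is an illustrative default, not a value taken from the slides.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant lane has no spread; place every value mid-scale.
        return [0.5 for _ in values]
    return linear_scale(values, mu - n_sd * sd, mu + n_sd * sd)


def to_rgb(t, low=(255, 255, 255), high=(0, 0, 255)):
    """Interpolate between two RGB colors for an intensity t in [0, 1]."""
    return tuple(round(a + t * (b - a)) for a, b in zip(low, high))


def lane_colors(scaled):
    """Turn scaled intensities into one RGB color per position in the lane."""
    return [to_rgb(t) for t in scaled]
```

With a fixed scale, the same value always gets the same color across genomes, which suits comparisons; a standard-deviation scale adapts to each lane, which makes local extremes stand out.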

Page 5: Workflow of GeneWiz browser

• The web interface has two parts:
  1. the client, written as a Java applet, which obtains the data remotely from the server
  2. the server, written in Perl/CGI, with a compiled C program handling access to the binary data files

• All input/output objects are defined in an XSD (XML Schema Definition) file that is kept separate and referenced within the WSDL file.

• MySQL on the server stores the pre-binned data for each zoom level.
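Pre-binning per zoom level can be sketched as follows. This is a hedged illustration only: it keeps the bins in an in-memory dictionary rather than MySQL, averages values within each bin, and doubles the bin size per level; the slides do not state the aggregation function or the bin sizes the server actually uses, and the function names are hypothetical.

```python
def bin_means(values, bin_size):
    """Average consecutive runs of bin_size values; the last bin may be shorter."""
    bins = []
    for start in range(0, len(values), bin_size):
        chunk = values[start:start + bin_size]
        bins.append(sum(chunk) / len(chunk))
    return bins


def prebin_zoom_levels(values, base_bin=1, factor=2, min_bins=1):
    """Return {bin_size: binned values}, coarsening until min_bins or fewer remain.

    base_bin, factor and min_bins are illustrative parameters, not GeneWiz settings.
    """
    if base_bin < 1 or factor < 2:
        raise ValueError("base_bin must be >= 1 and factor must be >= 2")
    levels = {}
    bin_size = base_bin
    while True:
        binned = bin_means(values, bin_size)
        levels[bin_size] = binned
        if len(binned) <= min_bins:
            break
        bin_size *= factor
    return levels
```

Computing every level once, ahead of time, means a client that zooms out only has to fetch a short list of precomputed bins instead of the full per-base data.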

Page 6:

• The maximum uniqueness quality is shown for the actual reads (green-to-blue lane), plotted along the reference genome.

• The figure shows good correspondence between the in-silico and experimental reads, which suggests little bias toward particular chromosomal regions when read coverage is around 40-fold.
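For readers unfamiliar with the term, average read coverage is the total number of sequenced bases divided by the genome length. The sketch below shows that arithmetic; the function names and the genome and read sizes in the usage note are hypothetical and are not taken from the slides.

```python
import math


def mean_coverage(n_reads, read_length, genome_length):
    """Average coverage = total sequenced bases / genome length."""
    return n_reads * read_length / genome_length


def reads_for_coverage(target_coverage, read_length, genome_length):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_length / read_length)
```

As an illustration with made-up numbers, a 5,000,000 bp genome sequenced with 50 bp reads needs `reads_for_coverage(40, 50, 5_000_000)` reads, that is 4,000,000, to reach 40-fold coverage on average.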

Page 7:

• BLAST comparison of 14 closely related bacterial chromosomes.

• The figure indicates a strong tendency for the pathogenicity islands to be deleted in the few bacteria that do not cause infection in humans.

Page 8:

• A final example illustrates how marks indicating the uniqueness of DNA physical properties can be combined with known regulatory elements and gene annotations to give a more complete picture of gene expression in a particular region.

Page 9: Summary

• Most biologists consider visualization of multidimensional genomic information necessary, but find analysis tools relatively difficult to use.

• GeneWiz browser improves on the many existing tools, all of which are command-line programs that generate publication-quality static images and vector graphics, by offering:
  o easy navigation with the mouse
  o zooming, so that users can interpret genomic information at varying scales
  o an automatic workflow that users can call directly from the client

• The tool can be relevant in many pangenomic (across sequenced species) and metagenomic (across unsequenced species) studies, giving a quick overview of clusters of insertion sites, genomic islands, and overall homology between a reference sequence and a data set.