
Post on 05-Jan-2016


Page 1: GUI GoMiner and High-Throughput GoMiner Analysis of Alternative Splice Variants Barry Zeeberg, Ari Kahn, Michael Ryan, David Kane, Curtis Jamison, Hongfang

GUI GoMiner and High-Throughput GoMiner

Analysis of Alternative Splice Variants

Barry Zeeberg, Ari Kahn, Michael Ryan, David Kane, Curtis Jamison, Hongfang Liu, Alessandro Ferrucci, William Reinhold, and John Weinstein

plus a lot of help from Rich Einstein and Mike Brenner of ExonHit

Page 2:

The World According to a Microarray:

• Genes are not Genes

• Genes are a Mixture of Splice Variants

Page 3:

Patterns of alternative splicing

Page 4:

The Ostrich Effect

• We tend to hide our heads in the sand

• We treat microarray data as if a gene did not have multiple alternative splice forms

• But altered expression of one splice variant can be more important than altered expression of the “gene”

> i.e., lumping together all splice forms into one monolithic measurement loses that information

Page 5:

Motivation: The Problem

• In many disease states, differential expression of individual splice variants may be more relevant than differential expression of genes

• Traditional microarrays are not designed to permit elucidation of individual splice variants

• State-of-the-art microarrays are being developed to permit elucidation of individual splice variants

• A major limitation is that software tools are not available to exploit the potential information content of the state-of-the-art microarrays

Page 6:

Our Solution: Three Components

• Develop a database (EVDB) and web application (SpliceMiner) that maps probe sequences to known splice variants

• Enhance GoMiner with a mechanism to process splice variants

• Connect these two “ends” with the appropriate integration approach

Page 7:

Our Solution: Three Components

• Develop a database (EVDB) and web application (SpliceMiner) that maps probe sequences to known splice variants

• Enhance GoMiner with a mechanism to process splice variants

• Connect these two “ends” with the appropriate integration approach

Page 8:

SpliceMiner Home Page

HGNC symbol, chromosomal coordinates

Remember these: they are used later in the GoMiner “Tilde” mechanism!

Page 9:

“Batch” is key to analysis of microarray results

Page 10:

Our Solution: Three Components

• Develop a database (EVDB) and web application (SpliceMiner) that maps probe sequences to known splice variants

• Enhance GoMiner with a mechanism to process splice variants

• Connect these two “ends” with the appropriate integration approach

Page 11:

GoMiner and High-Throughput GoMiner

• GoMiner organizes lists of 'interesting' genes (for example, under- and overexpressed genes from a microarray experiment) for biological interpretation in the context of the Gene Ontology

• High-Throughput GoMiner is an enhancement of GoMiner that efficiently performs the computationally challenging task of automated batch processing of an arbitrary number of microarray experiments

Page 12:

GoMiner “Tilde” (“~”) Mechanism

• GoMiner traditionally dereplicates input files so that only one instance of a gene name is processed

• When multiple alternatively spliced forms are to be analyzed, however, dereplication would result in a loss of relevant information

• Consequently, we have added a new feature to GoMiner to retain full information about the alternative splice variants by replicating the input of each gene according to the number of alternative exons
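The contrast between the two behaviors can be sketched as follows. This is a minimal illustration, not GoMiner's actual code: the function name `dereplicate`, its `keep_tilde_variants` flag, and the sample gene list are all assumptions made for the example, and identifiers are assumed to take the form SYMBOL~suffix.

```python
def dereplicate(names, keep_tilde_variants=True):
    """Remove duplicate entries, preserving first-seen order.

    Hypothetical helper for illustration only.
    - keep_tilde_variants=True: 'GENE~a' and 'GENE~b' remain distinct entries.
    - keep_tilde_variants=False: the '~suffix' is dropped first, so each
      gene survives only once (the traditional gene-level behavior).
    """
    seen = set()
    result = []
    for name in names:
        key = name if keep_tilde_variants else name.split("~", 1)[0]
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result


# Illustrative input: two splice-variant entries for one gene, plus a
# gene that appears twice without any suffix.
entries = ["BRCA1~1", "BRCA1~2", "TP53", "TP53"]

variant_aware = dereplicate(entries)
gene_level = dereplicate(entries, keep_tilde_variants=False)

print(variant_aware)
print(gene_level)
```

In the variant-aware mode both BRCA1 entries survive, while the exact duplicate TP53 is still collapsed; in the gene-level mode the splice-variant information is lost.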

Page 13:

Example of Tilde Mechanism

• As a specific example, suppose that a microarray platform contained probes that were unique for two different splice variants of BRCA1

• Then the two splice variants would be designated as 'BRCA1~1' and 'BRCA1~2'

• The '~' tells GoMiner to treat these as different entries, rather than to de-replicate them, but to ignore the suffix when querying the GO database

• By this mechanism, all splice variants are counted when computing the Fisher's exact test p-value
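The effect on the statistics can be sketched with a small worked example. The code below is an assumption-laden illustration rather than GoMiner's implementation: it uses a one-sided Fisher's exact test (the hypergeometric upper tail), and the function names, the gene lists, and the category membership are all hypothetical.

```python
from math import comb


def base_symbol(entry):
    """Strip the '~suffix' so the gene symbol can be used for the GO lookup."""
    return entry.split("~", 1)[0]


def fisher_enrichment_p(changed, total, category_genes):
    """One-sided Fisher's exact p-value for enrichment of a category.

    Every entry is counted, so 'GENE~1' and 'GENE~2' contribute two counts,
    but category membership is decided by the base symbol alone.
    """
    N = len(total)                                            # all entries on the array
    K = sum(base_symbol(e) in category_genes for e in total)  # entries in the category
    n = len(changed)                                          # changed entries
    k = sum(base_symbol(e) in category_genes for e in changed)  # changed and in category
    upper = min(K, n)
    # P(X >= k) under the hypergeometric distribution
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


# Hypothetical data: six array entries, two of them BRCA1 splice variants.
total = ["BRCA1~1", "BRCA1~2", "TP53", "MYC", "EGFR", "KRAS"]
changed = ["BRCA1~1", "BRCA1~2"]
category_genes = {"BRCA1"}  # assumed membership of one GO category

p = fisher_enrichment_p(changed, total, category_genes)
print(p)
```

Here both BRCA1 variants count toward the category, giving N = 6, K = 2, n = 2, k = 2, so the p-value is 1/C(6,2) = 1/15.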

Page 14:

A Publication using Tilde Mechanism

• Study of “exon expression” regulated by Nova, a key neuronal splicing factor

• Reference: Ule et al., “Nova regulates brain-specific splicing to shape the synapse,” Nature Genetics 37, 844-852 (2005)

Page 15:

GoMiner Detected Differences in Neurologically-Important GO Categories between Wild Type and Nova Knockouts

Page 16:

Significance of Nova paper

• First description of a regulatory module operating at the level of information content mediated by RNA exon usage

• Levels of Nova-regulated RNAs are unchanged in knockout versus wild-type brains: alternative exon usage as a means of modulating the quality of synaptic protein interactions

• Regulation of quality, not quantity

Page 17:

Our Solution: Three Components

• Develop a database (EVDB) and web application (SpliceMiner) that maps probe sequences to known splice variants

• Enhance GoMiner with a mechanism to process splice variants

• Connect these two “ends” with the appropriate integration approach

Page 18:

Generalization of the Tilde Mechanism

• A previous slide noted that two splice variants could be designated as ‘BRCA1~1’ and ‘BRCA1~2’

• But the suffix can be an arbitrary string that carries biological information, not just used as an ordinal index

• So we can use the output of SpliceMiner (HGNC symbol, GenBank accession, chromosomal coordinates) to construct a string of the correct form, with a suffix that is highly informative

• Using the output from SpliceMiner as the input to GoMiner will connect the two “ends” and permit splice variant-based GO categorization
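One way to build such identifiers is sketched below. The slides do not specify the suffix layout, so the `accession|chrom:start-end` format, the function name `tilde_identifier`, and the placeholder values are all assumptions chosen for illustration; the only constraint taken from the slides is that the text before the '~' is the HGNC symbol.

```python
def tilde_identifier(hgnc_symbol, accession, chrom, start, end):
    """Build 'SYMBOL~suffix' from SpliceMiner-style fields.

    The suffix layout (accession|chrom:start-end) is a hypothetical choice;
    any string works as long as it does not itself contain '~'.
    """
    suffix = f"{accession}|{chrom}:{start}-{end}"
    if "~" in hgnc_symbol or "~" in suffix:
        raise ValueError("fields must not contain '~'")
    return f"{hgnc_symbol}~{suffix}"


# Placeholder values, not real annotation data.
ident = tilde_identifier("BRCA1", "ACC0001", "chr17", 100, 200)
print(ident)

# The part before the first '~' is what would be used for the GO lookup.
symbol_for_go = ident.split("~", 1)[0]
print(symbol_for_go)
```

Keeping the accession and coordinates in the suffix means each row of the GO analysis output can be traced back to a specific transcript and genomic region without a separate lookup table.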

Page 19:

Conclusions

• The new era of microarray research will demand analysis of differential expression of exons and transcripts, rather than genes

• We are developing resources to map probe sequences to exons and transcripts

• GoMiner can integrate this information with GOA to allow the molecular biologist to leverage both knowledgebases for enhanced analysis and interpretation of microarray data

Page 20:

Collaborators

GBG:

Ari Kahn

Michael Ryan

David Kane

Hongfang Liu

William Reinhold

John Weinstein

GMU:

Curtis Jamison

UMBC:

Alessandro Ferrucci

ExonHit:

Rich Einstein

Mike Brenner