GO based data analysis Iowa State Workshop 11 June 2009


Page 1: GO based data analysis Iowa State Workshop 11 June 2009

GO based data analysis

Iowa State Workshop

11 June 2009

Page 2: GO based data analysis Iowa State Workshop 11 June 2009

All tools and materials from this workshop are available online at the AgBase database Educational Resources link.

For continuing support and assistance please contact:

[email protected]

This workshop is supported by USDA CSREES grant number MISV-329140.

Page 3: GO based data analysis Iowa State Workshop 11 June 2009

AgBase protein annotation process

Input: protein identifiers or FASTA format
GORetriever: returns annotated proteins and proteins with no annotations
GOanna: for proteins with no annotations
GOSlimViewer

Page 4: GO based data analysis Iowa State Workshop 11 June 2009

Hypothesis generating

Gene Ontology enrichment analysis

Identifies GO terms that are statistically over- or underrepresented in a set of genes (Fisher's exact test)
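
The enrichment test named on this slide can be sketched in a few lines. This is a minimal illustration, not the code used by any of the workshop tools: the function names and the example counts (a 1000-gene background with 50 genes carrying the term, and a 60-gene study set with 10) are assumptions made up for the example.

```python
from math import comb


def hypergeom_pmf(k, population, successes, draws):
    """Probability of drawing exactly k annotated genes by chance."""
    return (comb(successes, k) * comb(population - successes, draws - k)
            / comb(population, draws))


def fisher_enrichment(study_hits, study_size, pop_hits, pop_size):
    """One-sided Fisher's exact test p-values for a single GO term.

    Returns (p_over, p_under):
      p_over  = P(X >= study_hits), small when the term is overrepresented
      p_under = P(X <= study_hits), small when the term is underrepresented
    """
    lowest = max(0, study_size - (pop_size - pop_hits))
    highest = min(study_size, pop_hits)
    p_over = sum(hypergeom_pmf(k, pop_size, pop_hits, study_size)
                 for k in range(study_hits, highest + 1))
    p_under = sum(hypergeom_pmf(k, pop_size, pop_hits, study_size)
                  for k in range(lowest, study_hits + 1))
    return p_over, p_under


# Hypothetical counts, for illustration only.
p_over, p_under = fisher_enrichment(study_hits=10, study_size=60,
                                    pop_hits=50, pop_size=1000)
print(f"over-representation p = {p_over:.3g}")
print(f"under-representation p = {p_under:.3g}")
```

In practice one p-value is computed per GO term, so a multiple-testing correction is normally applied before terms are reported.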

Annotation Clustering

Groups similar annotations, on the hypothesis that similar annotations should have similar gene members
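
The clustering idea above can be illustrated with a small sketch. It assumes Jaccard overlap of gene members as the similarity measure and simple single-linkage grouping; real tools may use other measures, and the term names, gene identifiers and threshold below are hypothetical.

```python
def jaccard(a, b):
    """Fraction of gene members shared by two annotation terms."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def cluster_terms(term_to_genes, threshold=0.5):
    """Single-linkage grouping: a term joins every cluster that contains
    a term sharing at least `threshold` of its gene members."""
    clusters = []
    for term in sorted(term_to_genes):
        genes = term_to_genes[term]
        linked = [c for c in clusters
                  if any(jaccard(genes, term_to_genes[t]) >= threshold for t in c)]
        merged = {term}.union(*linked)
        clusters = [c for c in clusters if c not in linked] + [merged]
    return sorted(sorted(c) for c in clusters)


# Hypothetical annotations, for illustration only.
term_to_genes = {
    "response to stress": {"g1", "g2", "g3", "g4"},
    "response to oxidative stress": {"g1", "g2", "g3"},
    "DNA repair": {"g7", "g8"},
    "DNA replication": {"g7", "g8", "g9"},
    "transport": {"g5"},
}
clusters = cluster_terms(term_to_genes)
for cluster in clusters:
    print(cluster)
```

Terms that end up in the same group can then be read as one biological theme rather than as separate hits.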

Page 5: GO based data analysis Iowa State Workshop 11 June 2009

Some resources

DAVID: http://david.abcc.ncifcrf.gov/
GOstat: http://gostat.wehi.edu.au/
EasyGO: http://bioinformatics.cau.edu.cn/easygo/
AmiGO: http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment (does not use IEA)
Onto-Express & OE2GO: http://vortex.cs.wayne.edu/projects.htm
GOEAST: http://omicslab.genetics.ac.cn/GOEAST
http://www.geneontology.org/GO.tools.shtml

Comparison of enrichment analysis tools: Nucleic Acids Research, 2009, Vol. 37, No. 1, 1–13 (Tool_Comparison_09.pdf)

DAVID and EasyGO analysis included: DAVID&EasyGo.ppt

Page 6: GO based data analysis Iowa State Workshop 11 June 2009

Database for Annotation, Visualization and Integrated Discovery

Page 9: GO based data analysis Iowa State Workshop 11 June 2009

http://vortex.cs.wayne.edu/ontoexpress

Onto-Express analysis instructions are available in onto-express.ppt

Page 10: GO based data analysis Iowa State Workshop 11 June 2009

Species represented in Onto-Express

Page 11: GO based data analysis Iowa State Workshop 11 June 2009

For uploading your own annotations use OE2GO

Page 12: GO based data analysis Iowa State Workshop 11 June 2009

Comparison

Tools compared: Onto-Express, EasyGO, GOstat and DAVID
Test set: 60 randomly selected chicken genes
Used AgBase GO annotations as baseline annotations

Vandenberg et al (BMC Bioinformatics, in review)

Page 14: GO based data analysis Iowa State Workshop 11 June 2009

Networks & Pathways

Iowa State Workshop

11 June 2009

Page 15: GO based data analysis Iowa State Workshop 11 June 2009

Multiple data analysis platforms

Proteomics

Transcriptomics

ESTs

LIST

Page 16: GO based data analysis Iowa State Workshop 11 June 2009

Our original aim… understand biological phenomena…

Bits and pieces of information
Do not have the full picture
How do we get back to BIOLOGY in this digital information landscape?

Page 17: GO based data analysis Iowa State Workshop 11 June 2009

What do we know about biological systems?

Biological systems are dynamic, not static
How molecules interact is key to understanding complex systems

Francis Crick, 1958

Page 18: GO based data analysis Iowa State Workshop 11 June 2009

Types of interactions

protein (enzyme) – metabolite (ligand): metabolic pathways
protein – protein: cell signaling pathways, protein complexes
protein – gene: genetic networks

Page 19: GO based data analysis Iowa State Workshop 11 June 2009

Sod1 Mus musculus

STRING Database

http://string.embl.de/

Page 21: GO based data analysis Iowa State Workshop 11 June 2009

PLoS Computational Biology March 2007, Volume 3 e42

Database: URL/FTP

DIP: http://dip.doe-mbi.ucla.edu
BIND: http://bind.ca
MPact/MIPS: http://mips.gsf.de/services/ppi
STRING: http://string.embl.de
MINT: http://mint.bio.uniroma2.it/mint
IntAct: http://www.ebi.ac.uk/intact
BioGRID: http://www.thebiogrid.org
HPRD: http://www.hprd.org
ProtCom: http://www.ces.clemson.edu/compbio/ProtCom
3did, Interprets: http://gatealoy.pcb.ub.es/3did/
Pibase, Modbase: http://alto.compbio.ucsf.edu/pibase
CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm
SCOPPI: http://www.scoppi.org/
iPfam: http://www.sanger.ac.uk/Software/Pfam/iPfam
InterDom: http://interdom.lit.org.sg
DIMA: http://mips.gsf.de/genre/proj/dima/index.html
Prolinks: http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/
Predictome: http://predictome.bu.edu/

Page 22: GO based data analysis Iowa State Workshop 11 June 2009

Pathways & Networks

A network is a collection of interactions

Pathways are a subset of networks

A pathway is a network of interacting proteins that carries out biological functions such as metabolism and signal transduction

All pathways are networks of interactions

NOT ALL NETWORKS ARE PATHWAYS

Page 23: GO based data analysis Iowa State Workshop 11 June 2009

Biological Networks

Networks are often represented as graphs
Nodes represent proteins, or genes that code for proteins
Edges represent the functional links between nodes (e.g. regulation)
Small changes in a graph's topology/architecture can result in the emergence of novel properties
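
The graph view described on this slide can be made concrete with a short sketch. It assumes an undirected network stored as an adjacency map; the protein names A to F and the edge list are hypothetical. Removing one well-connected node shows how a small change in topology alters the network as a whole.

```python
from collections import defaultdict


def build_graph(edges):
    """Undirected adjacency map: node -> set of neighbouring nodes."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degrees(graph):
    """Number of interaction partners per node."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def connected_components(graph):
    """Groups of nodes that can reach each other through edges."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(component)
    return components


def remove_node(graph, node):
    """Copy of the graph without `node` and without its edges."""
    return {n: nbrs - {node} for n, nbrs in graph.items() if n != node}


# Hypothetical interactions, for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "F")]
graph = build_graph(edges)
print(degrees(graph))
print(len(connected_components(graph)), "components before removing A")
print(len(connected_components(remove_node(graph, "A"))), "components after")
```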

Page 24: GO based data analysis Iowa State Workshop 11 June 2009

Yeast Protein-Protein Interaction Map

Nature 411, 2001,

H. Jeong, et al

Page 25: GO based data analysis Iowa State Workshop 11 June 2009

Some resources

KEGG: http://www.genome.jp/kegg/pathway.html
BioCyc: http://www.biocyc.org/
Reactome: http://www.reactome.org/
GenMAPP: http://www.genmapp.org/
BioCarta: http://www.biocarta.com/

Pathguide (the pathway resource list): http://www.pathguide.org/

Page 27: GO based data analysis Iowa State Workshop 11 June 2009

Pathguide Statistics

Gallus gallus is missing

Page 28: GO based data analysis Iowa State Workshop 11 June 2009

Reactome

Page 29: GO based data analysis Iowa State Workshop 11 June 2009

What is feasible with my specific dataset?

Page 30: GO based data analysis Iowa State Workshop 11 June 2009

Systems Biology Workflow

Nanduri & McCarthy, CAB Reviews, 2008

Page 31: GO based data analysis Iowa State Workshop 11 June 2009

Systems Biology Workflow

For a given species of interest, what types of data are available?

Page 32: GO based data analysis Iowa State Workshop 11 June 2009

Retrieval of interaction datasets

Evaluate PPI resources such as Predictome and Prolinks for existence of the species of interest
If unavailable, find orthologous proteins in related species that have interactions!
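
The fallback described above, borrowing interactions from a related species through orthologs, can be sketched as follows. This is only an illustration under stated assumptions: the ortholog table and the reference interaction list are hypothetical inputs that would have to come from an orthology resource and a PPI database, and transferred interactions are predictions, not observations.

```python
def transfer_interactions(orthologs, reference_interactions):
    """Predict interactions in the species of interest.

    orthologs: maps a protein in the species of interest to its ortholog
        in the related (reference) species.
    reference_interactions: (protein_a, protein_b) pairs known in the
        reference species.
    """
    # Index the proteins of interest by their reference-species ortholog.
    by_reference = {}
    for target, ref in orthologs.items():
        by_reference.setdefault(ref, set()).add(target)

    predicted = set()
    for a, b in reference_interactions:
        # Keep a pair only when both partners have an ortholog.
        for ta in by_reference.get(a, ()):
            for tb in by_reference.get(b, ()):
                predicted.add(tuple(sorted((ta, tb))))
    return sorted(predicted)


# Hypothetical identifiers, for illustration only.
orthologs = {"chk_P1": "ref_A", "chk_P2": "ref_B", "chk_P3": "ref_C"}
reference_interactions = [("ref_A", "ref_B"), ("ref_B", "ref_D"), ("ref_C", "ref_A")]
predicted = transfer_interactions(orthologs, reference_interactions)
print(predicted)
```

The pair involving ref_D is dropped because no protein in the species of interest maps to it.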

Page 33: GO based data analysis Iowa State Workshop 11 June 2009

I have interactions, what next?

Evaluate the quality of interactions, i.e. the type of method used for identification… what exactly are these methods?

Page 34: GO based data analysis Iowa State Workshop 11 June 2009

I have interactions, what next?

Evaluate the quality of interactions, i.e. the type of method used for identification… what exactly are these methods?

STRING Database

Page 35: GO based data analysis Iowa State Workshop 11 June 2009

PPI Identification

Experimental:
Gene coexpression
TAP assays
Yeast two hybrid (Y2H)
Protein arrays

Computational:
Sequence coevolution
Phylogenetic profile
Gene cluster
Rosetta stone method
Text mining

PLoS Computational Biology March 2007, Volume 3 e42
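
As one example of the computational methods listed on this slide, the phylogenetic profile approach predicts a functional link between proteins that are present and absent in the same genomes. The sketch below is a minimal version under simple assumptions: profiles are 0/1 tuples over the same ordered set of genomes, similarity is the fraction of matching positions, and the profiles and cutoff are hypothetical.

```python
def profile_similarity(profile_a, profile_b):
    """Fraction of genomes where two proteins are both present or both absent."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    matches = sum(a == b for a, b in zip(profile_a, profile_b))
    return matches / len(profile_a)


def candidate_links(profiles, min_similarity=1.0):
    """Protein pairs whose profiles match in at least `min_similarity` of genomes."""
    names = sorted(profiles)
    return [(p, q)
            for i, p in enumerate(names)
            for q in names[i + 1:]
            if profile_similarity(profiles[p], profiles[q]) >= min_similarity]


# Hypothetical presence (1) / absence (0) across five genomes.
profiles = {
    "P1": (1, 0, 1, 1, 0),
    "P2": (1, 0, 1, 1, 0),
    "P3": (0, 1, 1, 0, 1),
    "P4": (1, 0, 1, 0, 0),
}
print(candidate_links(profiles))
print(candidate_links(profiles, min_similarity=0.8))
```

Proteins found in every genome match each other trivially, so profiles with little variation carry little evidence.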

Page 36: GO based data analysis Iowa State Workshop 11 June 2009

PPI database comparisons

Proteins: Structure, Function and Bioinformatics 63:490-500 2006

Page 37: GO based data analysis Iowa State Workshop 11 June 2009

I have interactions, what next?

Evaluate the quality of interactions, i.e. the type of method used for identification… what exactly are these methods?

Visualize these interactions as a network and analyze… what are the available tools?
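
The quality check above, judging each interaction by how it was identified, can be sketched as a simple filter applied before any network is drawn. The record layout, the method labels, and the rule used here (at least two independent methods, one of them experimental) are assumptions chosen for illustration, not a recommended standard.

```python
# Method labels treated as experimental evidence in this sketch.
EXPERIMENTAL = {"yeast two hybrid", "TAP assay", "protein array"}


def supported_interactions(records, min_methods=2, require_experimental=True):
    """Keep interactions backed by enough independent methods.

    records: iterable of (protein_a, protein_b, method) tuples.
    Returns {(protein_a, protein_b): sorted list of supporting methods}.
    """
    evidence = {}
    for a, b, method in records:
        pair = tuple(sorted((a, b)))  # A-B and B-A are the same interaction
        evidence.setdefault(pair, set()).add(method)

    kept = {}
    for pair, methods in evidence.items():
        if len(methods) < min_methods:
            continue
        if require_experimental and not (methods & EXPERIMENTAL):
            continue
        kept[pair] = sorted(methods)
    return kept


# Hypothetical records, for illustration only.
records = [
    ("A", "B", "yeast two hybrid"),
    ("B", "A", "TAP assay"),
    ("A", "C", "text mining"),
    ("A", "C", "gene cluster"),
    ("C", "D", "yeast two hybrid"),
]
kept = supported_interactions(records)
print(kept)
```

The surviving pairs form the edge list that a network visualization tool would take as input.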