GO based data analysis

Post on 20-Jan-2016

Iowa State Workshop

11 June 2009

All tools and materials from this workshop are available online at the AgBase database Educational Resources link.

For continuing support and assistance please contact:

agbase@cse.msstate.edu

This workshop is supported by USDA CSREES grant number MISV-329140.

AgBase protein annotation process

Protein identifiers or FASTA format

GORetriever

Annotated Proteins

GOanna

Proteins with no annotations

GOSlimViewer

Hypothesis generating

Gene Ontology enrichment analysis

GO terms that are statistically over- or underrepresented (Fisher’s exact test) in a set of genes
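As a rough illustration of the test named above, the sketch below computes one-sided Fisher's exact p-values for a single GO term from four counts. The function names and the example counts are hypothetical, and the tools listed in this workshop may differ in their choice of background set and multiple-testing correction.

```python
from math import comb

def hypergeom_pmf(k, N, K, n):
    """Probability of seeing exactly k annotated genes in a list of n genes,
    drawn from a background of N genes of which K carry the GO term."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def enrichment_p_values(k, N, K, n):
    """One-sided Fisher's exact p-values (over, under) for one GO term."""
    lowest = max(0, n - (N - K))   # fewest annotated genes the list can hold
    highest = min(K, n)            # most annotated genes the list can hold
    over = sum(hypergeom_pmf(i, N, K, n) for i in range(k, highest + 1))
    under = sum(hypergeom_pmf(i, N, K, n) for i in range(lowest, k + 1))
    return over, under

# Hypothetical counts: 1000 background genes, 50 with the term,
# 60 genes in the list of interest, 10 of them with the term.
over_p, under_p = enrichment_p_values(k=10, N=1000, K=50, n=60)
print(f"over-representation p = {over_p:.3g}")
print(f"under-representation p = {under_p:.3g}")
```

In practice one such test is run per GO term, so the resulting p-values normally need a multiple-testing correction before interpretation.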

Annotation Clustering

Group similar annotations, based on the hypothesis that they should have similar gene members
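One way to picture annotation clustering is to group terms whose gene sets overlap strongly. The sketch below is only an illustration: Jaccard overlap, the 0.5 threshold, the single-linkage grouping, and the term and gene labels are all assumptions made for brevity, and real tools may use different similarity measures and clustering rules.

```python
def jaccard(a, b):
    """Share of genes common to both sets, relative to their union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_terms(term_to_genes, threshold=0.5):
    """Single-linkage grouping: two terms land in the same cluster when
    their gene sets have Jaccard overlap >= threshold."""
    terms = sorted(term_to_genes)
    parent = {t: t for t in terms}

    def find(t):
        # Follow parent links to the cluster representative.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for i, a in enumerate(terms):
        for b in terms[i + 1:]:
            if jaccard(term_to_genes[a], term_to_genes[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for t in terms:
        clusters.setdefault(find(t), []).append(t)
    return sorted(clusters.values())

# Hypothetical annotation terms and their gene members.
example_terms = {
    "term_A": {"g1", "g2", "g3"},
    "term_B": {"g1", "g2", "g3", "g4"},
    "term_C": {"g7", "g8"},
    "term_D": {"g7", "g8", "g9"},
}
example_clusters = cluster_terms(example_terms)
print(example_clusters)
```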

Some resources

DAVID: http://david.abcc.ncifcrf.gov/

GOStat: http://gostat.wehi.edu.au/

EasyGO: http://bioinformatics.cau.edu.cn/easygo/

AmiGO (does not use IEA): http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment

Onto-Express & OE2GO: http://vortex.cs.wayne.edu/projects.htm

GOEAST: http://omicslab.genetics.ac.cn/GOEAST

http://www.geneontology.org/GO.tools.shtml

Comparison of enrichment analysis tools: Nucleic Acids Research, 2009, Vol. 37, No. 1, 1–13 (Tool_Comparison_09.pdf)

DAVID and EasyGO analysis is included in DAVID&EasyGo.ppt

Database for Annotation, Visualization and Integrated Discovery

http://vortex.cs.wayne.edu/ontoexpress

Onto-Express analysis instructions are available in onto-express.ppt

Species represented in Onto-Express

For uploading your own annotations use OE2GO

Comparison

Onto-Express, EasyGO, GOStat and DAVID

Test set: 60 randomly selected chicken genes

Used AgBase GO annotations as baseline annotations

Vandenberg et al (BMC Bioinformatics, in review)

Networks & Pathways

Multiple data analysis platforms

Proteomics

Transcriptomics

ESTs

LIST

Our original aim: understand biological phenomena

Bits and pieces of information

Do not have the full picture

How do we get back to BIOLOGY in this digital information landscape?

What do we know about biological systems?

Biological systems are dynamic, not static

How molecules interact is key to understanding complex systems

Francis Crick, 1958

Types of interactions

protein (enzyme) – metabolite (ligand): metabolic pathways

protein – protein: cell signaling pathways, protein complexes

protein – gene: genetic networks

Sod1 Mus musculus

STRING Database

http://string.embl.de/

PLoS Computational Biology March 2007, Volume 3 e42

Database: URL/FTP

DIP: http://dip.doe-mbi.ucla.edu

BIND: http://bind.ca

MPact/MIPS: http://mips.gsf.de/services/ppi

STRING: http://string.embl.de

MINT: http://mint.bio.uniroma2.it/mint

IntAct: http://www.ebi.ac.uk/intact

BioGRID: http://www.thebiogrid.org

HPRD: http://www.hprd.org

ProtCom: http://www.ces.clemson.edu/compbio/ProtCom

3did, Interprets: http://gatealoy.pcb.ub.es/3did/

Pibase, Modbase: http://alto.compbio.ucsf.edu/pibase

CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm

SCOPPI: http://www.scoppi.org/

iPfam: http://www.sanger.ac.uk/Software/Pfam/iPfam

InterDom: http://interdom.lit.org.sg

DIMA: http://mips.gsf.de/genre/proj/dima/index.html

Prolinks: http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/

Predictome: http://predictome.bu.edu/

Pathways & Networks

A network is a collection of interactions

Pathways are a subset of networks

A pathway is a network of interacting proteins that carry out biological functions such as metabolism and signal transduction

All pathways are networks of interactions

NOT ALL NETWORKS ARE PATHWAYS

Biological Networks

Networks are often represented as graphs

Nodes represent proteins or genes that code for proteins

Edges represent the functional links between nodes (e.g. regulation)

Small changes in a graph’s topology/architecture can result in the emergence of novel properties
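The graph view described above can be sketched with a plain adjacency map. The protein labels P1 to P6 and the helper names are hypothetical; the example only shows that removing a single edge changes how the network falls into connected groups.

```python
from collections import defaultdict

def build_graph(edges):
    """Undirected graph: each node maps to the set of its neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected_components(graph):
    """Groups of nodes that can reach each other by following edges."""
    seen, components = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        components.append(sorted(component))
    return components

# Hypothetical interaction list: nodes are proteins, edges are functional links.
edges = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P5", "P6")]
graph = build_graph(edges)
degree = {node: len(neighbours) for node, neighbours in graph.items()}
before = connected_components(graph)

# Removing one edge (P2-P3) splits the largest group in two.
after = connected_components(build_graph([e for e in edges if e != ("P2", "P3")]))
print(degree)
print(before)
print(after)
```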

Yeast Protein-Protein Interaction Map

H. Jeong et al., Nature 411, 2001

Some resources

KEGG: http://www.genome.jp/kegg/pathway.html

BioCyc: http://www.biocyc.org/

Reactome: http://www.reactome.org/

GenMAPP: http://www.genmapp.org/

BioCarta: http://www.biocarta.com/

Pathguide (the pathway resource list): http://www.pathguide.org/

Gallus gallus is missing

Pathguide Statistics

Reactome

What is feasible with my specific dataset?

Systems Biology Workflow

Nanduri & McCarthy CAB reviews, 2008

For a given species of interest, what types of data are available?

Retrieval of interaction datasets

Evaluate PPI resources such as Predictome and Prolinks for the presence of the species of interest

If it is unavailable, find orthologous proteins in related species that have interactions
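The ortholog-based fallback described above can be sketched as a simple mapping step. All identifiers below are hypothetical, and the one-to-one ortholog table is a simplifying assumption; the result is a set of predicted interactions that still needs independent support.

```python
def transfer_interactions(known_interactions, orthologs):
    """Map interactions from a related species onto the species of interest.

    known_interactions: protein pairs reported in the related species.
    orthologs: related-species protein -> ortholog in the species of interest
               (assumed one-to-one here).
    A pair is transferred only when both partners have an ortholog.
    """
    predicted = set()
    for a, b in known_interactions:
        if a in orthologs and b in orthologs:
            predicted.add(tuple(sorted((orthologs[a], orthologs[b]))))
    return predicted

# Hypothetical data: three interactions in a related species and an
# ortholog table that covers three of the four proteins.
related_species_ppi = [("relA", "relB"), ("relB", "relC"), ("relC", "relD")]
ortholog_table = {"relA": "tgtA", "relB": "tgtB", "relC": "tgtC"}

predicted_ppi = transfer_interactions(related_species_ppi, ortholog_table)
print(sorted(predicted_ppi))
```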

I have interactions: what next?

Evaluate the quality of the interactions, i.e. the type of method used for identification. What exactly are these methods?
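A simple first pass at judging quality is to count how many distinct identification methods support each interaction. The record layout, the protein labels, and the threshold of two methods are assumptions for illustration only; interaction databases typically provide their own evidence codes or confidence scores.

```python
from collections import defaultdict

def methods_per_pair(records):
    """Collect the set of identification methods reported for each pair.

    records: iterable of (protein_a, protein_b, method) tuples.
    Pairs are unordered, so (A, B) and (B, A) count as the same interaction.
    """
    support = defaultdict(set)
    for a, b, method in records:
        support[frozenset((a, b))].add(method)
    return support

def well_supported(records, min_methods=2):
    """Interactions backed by at least min_methods distinct methods."""
    return {pair for pair, methods in methods_per_pair(records).items()
            if len(methods) >= min_methods}

# Hypothetical records; the method names are illustrative labels.
records = [
    ("P1", "P2", "Yeast two hybrid"),
    ("P2", "P1", "TAP assays"),
    ("P1", "P3", "Text mining"),
]
supported = well_supported(records)
print([sorted(pair) for pair in supported])
```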

STRING Database

PPI Identification

Experimental: TAP assays, Yeast two hybrid

Computational: Gene coexpression, Sequence coevolution, Phylogenetic profile, Gene cluster, Rosetta stone method, Text mining

TAP assays

Yeast two hybrid (Y2H)

Protein arrays

PLoS Computational Biology March 2007, Volume 3 e42

PPI database comparisons

Proteins: Structure, Function and Bioinformatics 63:490-500 2006

I have interactions: what next?

Evaluate the quality of the interactions, i.e. the type of method used for identification. What exactly are these methods?

Visualize these interactions as a network and analyze them. What are the available tools?
