Gene Ontology Enrichment Network Analysis: Tutorial
Step-by-step tutorial for conducting GO enrichment analysis and then creating a network from the results. Material from the UC Davis 2014 Proteomics Workshop. See more at: http://sourceforge.net/projects/teachingdemos/files/2014%20UC%20Davis%20Proteomics%20Workshop/
Dmitry Grapov, PhD
Gene Ontology Network Enrichment Analysis
Download all material for the tutorial
https://sourceforge.net/projects/teachingdemos/files/
Choose "2014 UC Davis Proteomics Workshop" or use the full URL below:
https://sourceforge.net/projects/teachingdemos/files/2014%20UC%20Davis%20Proteomics%20Workshop/Summer%202014%20Proteomics%20Workshop.zip/download
Figure legend: • decrease • increase
Use functional analysis to determine whether the changed variables are enriched (over-represented compared to random chance) in a biological pathway, domain or ontological category.
Enrichment or Overrepresentation analysis
Figure panels: Biochemical Pathway, Biochemical Ontology
Major Tasks
Using the proteins listed in the Excel workbook ‘proteomic data for analysis.xlsx’, worksheet ‘protein IDs’:
1. Conduct Gene Ontology (GO) enrichment analysis using DAVID Bioinformatics Resources http://david.abcc.ncifcrf.gov/home.jsp
2. Investigate enriched terms using QuickGO http://www.ebi.ac.uk/QuickGO/
3. Summarize and visualize the results using REVIGO http://revigo.irb.hr/
4. Create and modify a GO network using Cytoscape http://www.cytoscape.org/
Protein IDs
Common protein identifier: UniProt/SwissProt accession (default in Scaffold) http://www.uniprot.org/
Use BioMart to translate to other database IDs, e.g. gene symbols
http://www.biomart.org/
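Before translating or uploading the IDs, it can help to check that they look like UniProt accessions. A minimal sketch in R, assuming the IDs have been copied into a character vector; `clean_uniprot_ids` is a hypothetical helper (not part of the workshop material), the input values are made up, and the pattern follows the accession format documented by UniProt.

```r
# Hypothetical helper: tidy a vector of protein IDs before upload.
# The pattern follows the accession format documented by UniProt.
uniprot_pattern <- "^([OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2})$"

clean_uniprot_ids <- function(ids) {
  ids <- toupper(trimws(ids))       # remove stray whitespace, normalize case
  ids <- sub("-[0-9]+$", "", ids)   # drop isoform suffixes such as "-2"
  ids <- unique(ids[ids != ""])     # drop blanks and duplicates
  ok <- grepl(uniprot_pattern, ids)
  list(valid = ids[ok], invalid = ids[!ok])
}

# Made-up input for illustration (not from the workshop workbook)
res <- clean_uniprot_ids(c(" P12345", "Q9Y6K9-2", "p12345", "not_an_id", ""))
print(res$valid)
print(res$invalid)
```

Anything reported under `invalid` is worth checking by hand before submitting the list.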
DAVID Bioinformatics Resources
DAVID Bioinformatics Resources
1. Upload list
2. Choose ID type
3. Select list type
4. Submit
DAVID Bioinformatics Resources
Organism
Make sure all IDs were recognized
List of biochemical databases tested for enrichment
DAVID Bioinformatics Resources
List of biochemical databases tested for enrichment
1. Choose GO
DAVID Bioinformatics Resources
http://david.abcc.ncifcrf.gov/helps/functional_annotation.html#E3
DAVID Bioinformatics Resources
List of biochemical databases tested for enrichment
1. Overview BP: Biological process
2. Select
DAVID Bioinformatics Resources
http://david.abcc.ncifcrf.gov/helps/functional_annotation.html#E3
DAVID Bioinformatics Resources
1. Overview most enriched term
QuickGO http://www.ebi.ac.uk/QuickGO/
1. View children (lower hierarchy subsets) of this term
DAVID Bioinformatics Resources / QuickGO
1. Can you identify any enriched children of this term in our DAVID output?
2. Download results
Overview and Format Results in Excel
1. Save results
2. Open in MS Excel
Overview Results
Modified Fisher’s Exact Test p-value
Optionally, check in R:
x <- data.frame(user=c(13,47), genome=c(690,13528))
fisher.test(x) # p-value = 5.41e-06
(13/47) / (690/13528) # fold enrichment
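The check above uses the unmodified Fisher's exact test. A hedged sketch of an EASE-style "modified" score follows, assuming (per the DAVID help pages, worth confirming there) that the modification removes one hit from the user list before a one-sided test. `ease_score` and its argument names are hypothetical, and the sketch assumes that 47 and 13528 are the list and background totals, as the fold-enrichment line suggests.

```r
# Sketch of an EASE-style score: one-sided Fisher's exact test after
# removing one hit from the user list (argument names are hypothetical).
ease_score <- function(list_hits, list_total, pop_hits, pop_total) {
  tab <- matrix(c(list_hits - 1, list_total - list_hits,
                  pop_hits,      pop_total - pop_hits),
                nrow = 2,
                dimnames = list(c("in_term", "not_in_term"),
                                c("user", "genome")))
  fisher.test(tab, alternative = "greater")$p.value
}

# Counts from the slide, read as 13 of 47 user proteins and 690 of 13528 background
p_ease <- ease_score(13, 47, 690, 13528)

# Unmodified one-sided test on the same counts, for comparison
p_plain <- fisher.test(matrix(c(13, 34, 690, 12838), nrow = 2),
                       alternative = "greater")$p.value
print(c(ease = p_ease, plain = p_plain))
```

The EASE-style value should come out larger (more conservative) than the unmodified one.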
Alternative to Fisher's Exact Test: Hypergeometric Test
How to calculate statistics to determine enrichment?
hit.num = 51 # number of significantly changed variables in the pathway
set.num = 1455 # number of variables in the pathway
full = 3358 # all possible variables in the organism
q.size = 72 # number of significantly changed variables
phyper(hit.num-1, set.num, full-set.num, q.size, lower.tail=FALSE)
# enrichment p-value = 1.717553e-06
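The hypergeometric upper tail is the same quantity as a one-sided Fisher's exact test on the matching 2 x 2 table. A minimal sketch using the counts above; `enrich_p` is a hypothetical wrapper around the `phyper()` call, not a function from the workshop material.

```r
# Hypothetical wrapper around the slide's phyper() call
enrich_p <- function(hit.num, set.num, full, q.size) {
  # P(X >= hit.num) when drawing q.size variables without replacement
  phyper(hit.num - 1, set.num, full - set.num, q.size, lower.tail = FALSE)
}

p_hyper <- enrich_p(hit.num = 51, set.num = 1455, full = 3358, q.size = 72)

# Same counts as a 2 x 2 table (columns: changed / not changed,
# rows: in pathway / not in pathway) for a one-sided Fisher's exact test
tab <- matrix(c(51, 72 - 51,
                1455 - 51, (3358 - 1455) - (72 - 51)),
              nrow = 2)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

print(c(hypergeometric = p_hyper, fisher = p_fisher))
```

Wrapping the call in a function makes it easy to apply to every pathway in a results table, e.g. with `mapply()`.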
Visualization Options
Challenges:
• Removal of redundant information
• Visualizing term relationships (term-term, term-protein)
Use REVIGO to filter redundant terms
http://revigo.irb.hr/
Supek F, Bošnjak M, Škunca N, Šmuc T. "REVIGO summarizes and visualizes long lists of Gene Ontology terms". PLoS ONE 2011. doi:10.1371/journal.pone.0021800
Prepare input (term, p-value)
1. Upload to REVIGO
2. Run
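The input preparation can also be scripted instead of done by hand in Excel. A sketch under stated assumptions: the table below is a made-up stand-in for the DAVID download, GO terms are assumed to appear as "GO:XXXXXXX~name" in a `Term` column with p-values in a `PValue` column, and the 0.05 cutoff is an arbitrary choice for illustration.

```r
# Toy stand-in for a DAVID functional annotation chart (values are made up)
david <- data.frame(
  Term   = c("GO:0006412~translation",
             "GO:0006414~translational elongation",
             "GO:0006096~glycolytic process"),
  PValue = c(1.2e-08, 3.4e-05, 0.2),
  stringsAsFactors = FALSE
)

# Keep the GO ID (text before "~") and the p-value, then filter by a cutoff
revigo_input <- data.frame(
  term    = sub("~.*$", "", david$Term),
  p.value = david$PValue,
  stringsAsFactors = FALSE
)
revigo_input <- revigo_input[revigo_input$p.value < 0.05, ]

# Write tab-separated text that can be pasted into the REVIGO input box
out_file <- tempfile(fileext = ".txt")
write.table(revigo_input, out_file, sep = "\t", quote = FALSE,
            row.names = FALSE, col.names = FALSE)
readLines(out_file)
```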
REVIGO: overview scatterplot
Position defined by term similarity (multidimensional scaling, MDS)
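To illustrate how MDS turns similarities into plot positions, here is a small sketch using base R's `cmdscale()`. The similarity values and term names are made up, and this is classical MDS on 1 - similarity, which may differ from what REVIGO does internally.

```r
# Made-up pairwise semantic similarities between four GO terms (0 to 1)
terms <- c("termA", "termB", "termC", "termD")
sim <- matrix(c(1.0, 0.8, 0.2, 0.1,
                0.8, 1.0, 0.3, 0.2,
                0.2, 0.3, 1.0, 0.7,
                0.1, 0.2, 0.7, 1.0),
              nrow = 4, dimnames = list(terms, terms))

# Convert similarity to dissimilarity, then project to two dimensions
coords <- cmdscale(as.dist(1 - sim), k = 2)
colnames(coords) <- c("x", "y")
print(round(coords, 2))

# Similar terms should land closer together than dissimilar ones
d2 <- as.matrix(dist(coords))
d2["termA", "termB"] < d2["termA", "termD"]
```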
REVIGO: overview table
Cluster leaders prioritized based on enrichment p-value
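The idea of picking a cluster leader by p-value can be sketched as below. The table is made up, the cluster assignments are taken as given, and REVIGO itself may weigh additional criteria when choosing representatives.

```r
# Made-up terms already assigned to clusters of similar terms
go <- data.frame(
  term    = c("termA", "termB", "termC", "termD", "termE"),
  cluster = c(1, 1, 2, 2, 2),
  p.value = c(1e-04, 1e-06, 0.03, 0.001, 0.01),
  stringsAsFactors = FALSE
)

# Within each cluster, keep the row with the smallest p-value
go <- go[order(go$cluster, go$p.value), ]
leaders <- go[!duplicated(go$cluster), ]
print(leaders)
```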
REVIGO: network
• Edges: the strongest 3% of pairwise GO term similarities
• Node size: generality of term (small = specific)
• Node color: p-value
Download network
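The edge rule can be reproduced on any similarity matrix. A sketch with randomly generated similarities (not REVIGO output); the column names `source`, `target` and `similarity` are assumptions chosen for illustration.

```r
set.seed(1)
# Made-up symmetric similarity matrix for 30 terms (values in [0, 1])
n <- 30
sim <- matrix(runif(n * n), n, n)
sim <- (sim + t(sim)) / 2
diag(sim) <- 1
rownames(sim) <- colnames(sim) <- paste0("term", seq_len(n))

# List each unordered pair once (upper triangle only)
idx <- which(upper.tri(sim), arr.ind = TRUE)
edges <- data.frame(source = rownames(sim)[idx[, "row"]],
                    target = colnames(sim)[idx[, "col"]],
                    similarity = sim[idx],
                    stringsAsFactors = FALSE)
n_pairs <- nrow(edges)

# Keep the strongest 3% of pairwise similarities as network edges
cutoff <- quantile(edges$similarity, 0.97)
edges <- edges[edges$similarity >= cutoff, ]
nrow(edges)
```

The resulting table has one row per retained edge and can be saved with `write.csv()` for import into a network tool.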
Cytoscape
1. Open Cytoscape
Import REVIGO network into Cytoscape (steps 2 to 4 are marked on the slide screenshot)
Cytoscape: set layout and defaults
1. Set layout
3. Set network defaults
(Steps 2, 4 and 5 are marked only on the slide screenshot)
Cytoscape: map data to network properties
1. Set edge width and color
2. Set node labels, size and color
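Cytoscape does these mappings in its style panel; the sketch below only illustrates the underlying idea of mapping a data column to a visual property, by precomputing size and color columns in R. The node table, the size range (20 to 80) and the white-to-red palette are all made-up choices.

```r
# Made-up node table with enrichment p-values
nodes <- data.frame(id = c("termA", "termB", "termC"),
                    p.value = c(1e-02, 1e-04, 1e-06),
                    stringsAsFactors = FALSE)

# Smaller p-values give larger scores
score <- -log10(nodes$p.value)

# Linear rescaling of x onto the interval `to`
rescale <- function(x, to) {
  (x - min(x)) / (max(x) - min(x)) * diff(to) + to[1]
}

nodes$size <- rescale(score, c(20, 80))       # continuous mapping to node size
pal <- colorRampPalette(c("white", "red"))(100)
nodes$color <- pal[round(rescale(score, c(1, 100)))]  # mapping to node color
print(nodes)
```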
Cytoscape: overview network components
Download edge information
1-2. (marked on the slide screenshot)
3. View in Excel
Download node information
1-2. (marked on the slide screenshot)
3. View in Excel
Bonus: Modify edge and node attributes to show term-to-protein connections
See files ‘test edge.xlsx’ and ‘test node.xlsx’ for examples of upload formats
See detailed instructions at http://www.slideshare.net/dgrapov/demonstration-of-network-mapping
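Term-to-protein edge and node tables can be built from an enrichment table in a few lines. A sketch under stated assumptions: the data are made up, member proteins are assumed to be stored as a comma-separated string in a `Genes` column, and the output column names (`source`, `target`, `id`, `type`) are illustrative and may not match the layout of the example workbooks.

```r
# Toy enrichment table (made up), one row per GO term
enriched <- data.frame(
  Term  = c("GO:0006412", "GO:0006096"),
  Genes = c("P11111, P22222, P33333", "P22222, P44444"),
  stringsAsFactors = FALSE
)

# One edge per (term, protein) pair
members <- strsplit(enriched$Genes, ",\\s*")
edges <- data.frame(source = rep(enriched$Term, lengths(members)),
                    target = unlist(members),
                    stringsAsFactors = FALSE)

# Node table listing every term and protein once, with a type label
nodes <- data.frame(id = c(enriched$Term, unique(edges$target)),
                    stringsAsFactors = FALSE)
nodes$type <- ifelse(nodes$id %in% enriched$Term, "GO term", "protein")

print(edges)
print(nodes)
```

A protein shared by two terms (here `P22222`) appears once in the node table and twice in the edge table, which is what links the terms in the network.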
See more statistical and multivariate analysis examples at http://imdevsoftware.wordpress.com/tutorials/
Questions?
This research was supported in part by NIH 1 U24 DK097154