![Page 1: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/1.jpg)
The Gene Wiki: Crowdsourcing human gene annotation
Andrew Su, Ph.D.
The Scripps Research Institute
ISMB Special Session: Harnessing community intelligence for bioinformatics
#ISMB #SS7
July 17, 2012
![Page 2: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/2.jpg)
The Long Tail is a prolific source of content
Figure: content produced vs. contributors (sorted), split into a Short Head and a Long Tail.

| Domain | Short Head | Long Tail |
| --- | --- | --- |
| News | Newspapers | Blogs |
| Video | TV/Hollywood | YouTube |
| Product reviews | Consumer reports | Amazon reviews |
| Food reviews | Food critics | Yelp |
| Talent judging | Olympics | American Idol |
| Gene annotation | Manual curation | Gene Wiki |
![Page 3: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/3.jpg)
We can harness the Long Tail of scientists to directly participate in
the gene annotation process.
![Page 4: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/4.jpg)
Wikipedia is reasonably accurate
![Page 5: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/5.jpg)
Wikipedia has breadth and depth
Figure: articles and words (millions), Wikipedia vs. Britannica Online.
Source: http://en.wikipedia.org/wiki/Wikipedia:Size_comparisons, July 2008
![Page 6: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/6.jpg)
Filtering, extracting, and summarizing PubMed
Documents
Concepts
![Page 7: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/7.jpg)
Wiki success depends on a positive feedback loop
Gene Wiki page utility
Number of users
Number of contributors
1001
2002
![Page 8: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/8.jpg)
10,000 gene “stubs” within Wikipedia
Protein structure
Symbols and identifiers
Tissue expression pattern
Gene Ontology annotations
Links to structured databases
Gene summary
Protein interactions
Linked references
Huss, PLoS Biol, 2008
Utility
Users
Contributors
![Page 9: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/9.jpg)
Gene Wiki has a critical mass of readers
Total: ~4.3 million views / month
Huss, PLoS Biol, 2008; Good, NAR, 2011
![Page 10: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/10.jpg)
Gene Wiki has a critical mass of editors
Good, NAR, 2011
Figure: cumulative edits, split into productive edits and vandalism.
~10,000 words added / month
4.3 million views / month
1000 edits / month
Total 1.42 million words ≈ 230 full-length articles
![Page 11: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/11.jpg)
A review article for every gene is powerful
References to the literature
Hyperlinks to related concepts
Reelin: 98 editors, 703 edits since July 2002
Heparin: 358 editors, 654 edits since June 2003
AMPK: 109 editors, 203 edits since March 2004
RNAi: 394 editors, 994 edits since October 2002
![Page 12: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/12.jpg)
Making the Gene Wiki more computable
Structured annotations
Free text
![Page 13: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/13.jpg)
Annotator
Filling the gaps in gene annotation
Wikilink
GO exact synonym
Gene Wiki mapping
NCBI Entrez Gene: 3362
GO:0004993
Candidate assertion
Good, BMC Genomics 2011, 12:603
![Page 14: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/14.jpg)
Annotator
Filling the gaps in gene annotation
Wikilink
GO exact match
Gene Wiki mapping
NCBI Entrez Gene: 334
GO:0006897
Candidate assertion
Good, BMC Genomics 2011, 12:603
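The two examples above (Entrez Gene 3362 with GO:0004993, Entrez Gene 334 with GO:0006897) follow the same pattern: a wikilink on a Gene Wiki page is matched against a GO term name or exact synonym, and the page's gene mapping turns the match into a candidate assertion. A minimal sketch of that pattern, assuming simple case-insensitive string matching; the function names, GO labels, and link titles are illustrative and not taken from the talk:

```python
# Hypothetical sketch: mine candidate GO annotations from Gene Wiki wikilinks.
# Gene IDs and GO IDs come from the slides; the GO labels and link titles
# below are illustrative assumptions, not data from the talk.

def normalize(label):
    """Lower-case, turn underscores into spaces, collapse whitespace."""
    return " ".join(label.replace("_", " ").lower().split())

def build_label_index(go_labels):
    """Map every normalized GO name or exact synonym to its GO ID."""
    index = {}
    for go_id, labels in go_labels.items():
        for label in labels:
            index[normalize(label)] = go_id
    return index

def candidate_assertions(entrez_gene_id, wikilinks, index):
    """Return (gene ID, GO ID) pairs for wikilinks that exactly match a GO label."""
    found = []
    for title in wikilinks:
        go_id = index.get(normalize(title))
        if go_id is not None and (entrez_gene_id, go_id) not in found:
            found.append((entrez_gene_id, go_id))
    return found

GO_LABELS = {
    "GO:0004993": ["serotonin receptor activity", "serotonin receptor"],
    "GO:0006897": ["endocytosis"],
}
INDEX = build_label_index(GO_LABELS)

print(candidate_assertions(3362, ["Serotonin_receptor", "Neuron"], INDEX))
print(candidate_assertions(334, ["Endocytosis", "Endocytosis"], INDEX))
```

Exact matching keeps the sketch conservative: a link that only partially overlaps a GO label produces no assertion.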
![Page 15: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/15.jpg)
Novel GO annotations – so what?
11,022 annotations mined from Gene Wiki
4703 (43%) match known annotations
~100,000 annotations from GO consortium
6319 “novel” annotations @ 48-64% specificity
Good, BMC Genomics 2011, 12:603
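The counts on this slide can be checked with a few lines of arithmetic. The sketch below assumes that "48-64% specificity" can be read as the fraction of novel annotations expected to be correct; that reading, and the resulting range, are an interpretation rather than a figure from the talk.

```python
# Counts taken from the slide.
mined = 11_022
known = 4_703

novel = mined - known
known_fraction = known / mined

# Assumption: "48-64% specificity" is read as the share of novel
# annotations that would hold up on manual review.
low, high = 0.48, 0.64
expected_correct = (round(novel * low), round(novel * high))

print(novel)                    # novel annotations
print(f"{known_fraction:.0%}")  # share matching known annotations
print(expected_correct)         # rough range of correct novel annotations
```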
![Page 16: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/16.jpg)
Gene Wiki content improves enrichment analysis
GO term
Gene list
Concept recognition
PubMed abstracts
Enrichment analysis
GO:0007411
axon guidance
(GO:0007411)
264 genes
Linked genes through PubMed
P = 1.55 E-20
811 articles
Yes No
Yes 13 2
No 251 12033
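The slide does not name the statistical test, so the following is a sketch under assumptions: a one-sided hypergeometric (Fisher's exact) test on the 2x2 table, with the first column read as "annotated with GO:0007411" because 13 + 251 = 264 matches the 264 genes listed for the term. The function name `hypergeom_tail` is illustrative.

```python
from math import comb

def hypergeom_tail(k, n_drawn, n_success, n_total):
    """P(X >= k) when n_drawn items are drawn without replacement from
    n_total items, of which n_success are 'successes'."""
    upper = min(n_drawn, n_success)
    numerator = sum(
        comb(n_success, i) * comb(n_total - n_success, n_drawn - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(n_total, n_drawn)

# 2x2 counts from the slide (row/column meaning is an assumption, see above).
both, linked_only, term_only, neither = 13, 2, 251, 12033

n_total = both + linked_only + term_only + neither
in_term = both + term_only      # genes annotated with the GO term
in_list = both + linked_only    # genes linked through PubMed

p = hypergeom_tail(both, in_list, in_term, n_total)
print(f"{p:.2e}")
```

Under these assumptions the result lands at the same order of magnitude as the P = 1.55 E-20 shown on the slide.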
![Page 17: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/17.jpg)
Gene Wiki content improves enrichment analysis
GO term
Gene list
Concept recognition
PubMed abstracts
Gene Wiki
+
Enrichment analysis
GO:0006936 GO:0006936
muscle contraction
(GO:0006936)
87 genes
Linked genes through PubMed: P = 1.0
Linked genes through PubMed + Gene Wiki: P = 1.22 E-09
251 articles
87 articles
![Page 18: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/18.jpg)
Gene Wiki content improves enrichment analysis
Figure: p-value (PubMed only) plotted against p-value (PubMed + GW), with muscle contraction highlighted. One region is labeled "More significant with PubMed + GW", the other "More significant with PubMed only".
![Page 19: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/19.jpg)
Gene Wiki+ for integrative queries
http://genewikiplus.org
mwsync
![Page 20: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/20.jpg)
Dynamic queries across genes, diseases, SNPs
![Page 21: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/21.jpg)
![Page 22: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/22.jpg)
TOP 100 GENES
![Page 23: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/23.jpg)
Gene Wiki+ for integrative queries
http://genewikiplus.org
mwsync
{{#ask: [[Category:Human_proteins]] [[is_associated_with:: <q>[[Category:Breast_cancer]]</q>]] [[HasSNP:: <q>[[is_associated_with:: <q>[[Category:Breast_cancer]]</q>]] </q>]]}}
…
OMIM
PharmGKB
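The `#ask` query on this slide nests one subquery inside another: human proteins associated with breast cancer that also have a SNP associated with breast cancer. A minimal sketch that assembles the same query string from small pieces may make the nesting easier to read; the helper names `subquery`, `category`, and `prop` are hypothetical, and the output differs from the slide only in whitespace.

```python
def subquery(conditions):
    """Wrap conditions in <q>...</q>, the subquery syntax used on the slide."""
    return f"<q>{conditions}</q>"

def category(name):
    """Condition: page belongs to the given category."""
    return f"[[Category:{name}]]"

def prop(name, value):
    """Condition: page has property `name` matching `value`."""
    return f"[[{name}::{value}]]"

# Anything associated with a page in the Breast_cancer category.
associated = prop("is_associated_with", subquery(category("Breast_cancer")))

conditions = [
    category("Human_proteins"),             # the page is a human protein
    associated,                             # associated with breast cancer
    prop("HasSNP", subquery(associated)),   # has a SNP that is, too
]
query = "{{#ask: " + " ".join(conditions) + "}}"
print(query)
```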
![Page 24: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/24.jpg)
OMIM
PharmGKB
Gene Wiki+ for integrative queries
http://genewikiplus.org
mwsync
![Page 25: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/25.jpg)
The Long Tail of scientists is a valuable source of
information on gene function
![Page 26: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/26.jpg)
Crowdsourcing a gene annotation portal
![Page 27: ISMB2012: The Gene Wiki: Crowdsourcing human gene annotation](https://reader036.vdocuments.us/reader036/viewer/2022062307/554e80d8b4c9054a698b5461/html5/thumbnails/27.jpg)
Collaborators
Doug Howe, ZFIN
John Hogenesch, U Penn
Jon Huss, GNF
Luca de Alfaro, UCSC
Angel Pizzaro, U Penn
Faramarz Valafar, SDSU
Pierre Lindenbaum, Fondation Jean Dausset
Michael Martone, Rush
Konrad Koehler, Karo Bio
Warren Kibbe, Simon Lim, Northwestern
Many Wikipedia editors
WP:MCB Project

Group members
Erik Clarke
Ben Good
Salvatore Loguercio
Ian Macleod
Max Nanis
Chunlei Wu

Funding and Support
(BioGPS: GM83924, Gene Wiki: GM089820)
ISMB travel support

Contact
http://sulab.org
[email protected]
@andrewsu
+Andrew Su