MyGene.info talk at ISMB/BOSC 2013

Upload: anewgene

Post on 30-Jun-2015

DESCRIPTION

MyGene.info: Gene Annotation Query as a Service talk at ISMB/BOSC 2013

TRANSCRIPT

1. MyGene.info - Making Elastic Gene API
   Chunlei Wu, Ph.D., The Scripps Research Institute, La Jolla, CA, USA
   BOSC 2013, July 20, 2013

2. Being Elastic
   • Fast
   • Always ON
   • Up-to-date
   • Scalable
   • Extensible

3. Common procedure for gene data retrieval
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); data parsing; internal gene object; update regularly]

4. Common procedure for gene data retrieval
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); data parsing; internal gene object; update regularly]
   As A Service?

5. Working model - 1
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); merging; query engine; user queries]

6. Working model - 1
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); merging; query engine]
   A dummy merging example:
   { entrezgene: 1017 }
   { ensemblgene: ENSG00000123374 }
   { uniprot: P24941 }
   merged into
   { entrezgene: 1017, ensemblgene: ENSG00000123374, uniprot: P24941 }

7. Gene object in NoSQL database (key: document)
   1017: { Symbol: CDK2, Ensembl: ENSG00000123374, RefSeq: [ NM_001798, NM_052827 ], Reporter: { U95A: [ 1792_g_at, 1833_at ], U133A: [ 211804_s_at, 2045252_at, 211803_at ] } }

8-9. Syncing from data-hub to query instances
   [Diagram: public data hub (data sources Entrez, Ensembl, UniProt, ...; merging); public query instance (query engine; user queries)]

10. Public query instance
   http://MyGene.info (currently v2 API, two endpoints)
   • http://MyGene.info/v2/query?q=<any query term(s)> : matching gene hits
   • http://MyGene.info/v2/gene/<gene id(s)> : matching gene objects
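The dummy merging example on slide 6 and the key/document layout on slide 7 can be sketched in a few lines of Python. This is only an illustration, not the data hub's actual code: `merge_records` is a hypothetical helper, and the sketch assumes the per-source records have already been matched to the same gene, which is the hard part of real merging.

```python
def merge_records(records):
    """Merge per-source records (already matched to one gene) into one document.

    Hypothetical helper for illustration only; it does not decide which
    records from different sources describe the same gene.
    """
    merged = {}
    for record in records:
        for field, value in record.items():
            # Refuse to silently overwrite a field with a different value.
            if field in merged and merged[field] != value:
                raise ValueError(f"conflicting values for {field!r}")
            merged[field] = value
    return merged


# The three single-source records from the slide.
sources = [
    {"entrezgene": 1017},
    {"ensemblgene": "ENSG00000123374"},
    {"uniprot": "P24941"},
]

gene = merge_records(sources)

# Key/document storage in the style of slide 7: the Entrez gene ID is the key.
store = {str(gene["entrezgene"]): gene}
print(store["1017"])
```

A plain dictionary stands in for the NoSQL store here; the slides do not name the database used.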
11. Public query instance (http://MyGene.info)
   • Supports ALL species from NCBI (>12K species, >13M genes)
   • >40 annotation fields and expanding
   • Updated weekly
   • Flexible query interface
       Simple queries: ?q=cdk2
       Fielded queries: ?q=symbol:cdk2
       Wildcard queries: ?q=cdk*
       Genomic interval queries: ?q=chr1:1-100,000&species=human
       Species filter: ?q=cdk2&species=mouse,rat
       Returning fields filter: ?q=cdk2&fields=symbol,homologene
   • Supports batch queries, JSONP, CORS
   • Committed to long-term availability

12. Public query instance (http://MyGene.info)
   High-performance host (serving ~500K requests/day)

13. Public query instance: third-party packages
   MyGene.py - Python wrapper
   https://pypi.python.org/pypi/mygene
   pip install mygene

14-15. Public query instance: third-party packages
   MyGene.autocomplete - Gene query autocomplete widget
   https://bitbucket.org/sulab/mygene.autocomplete

16. Working model 2
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); Merging 1, Merging 2, Merging 3, ...; multiple query engines; user queries]

17. Syncing from data-hub to query instances
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); public data hub (merging) and public query instance (query engine; user queries); private data hub (merging; Private 1, Private 2, ...) and private query instance (query engine; user queries)]

18. Private query instance
   • Dedicated host
   • Same powerful query interface
   • Third-party packages still work
   • Public data still gets synced
   • Allows merging private data

19. To reach us?
   Questions about the public query instance, or interested in setting up your own private query instance? Please let us know: [email protected]

20. Code repositories
   • Web front-end: https://bitbucket.org/sulab/mygene.info (Apache 2 licensed)
   • Data hub: https://bitbucket.org/sulab/mygene.hub (GPL v3 licensed)
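The query patterns listed for the public query instance map onto plain URLs for the two v2 endpoints named in the slides. The following is a minimal sketch that assumes only the URL shapes shown there; `build_query_url` and `build_gene_url` are hypothetical helpers, and the code only builds URL strings without contacting the service, whose API version may have changed since the talk.

```python
from urllib.parse import urlencode

# Base URL of the v2 API as written on the slides.
BASE_URL = "http://MyGene.info/v2"


def build_query_url(q, species=None, fields=None):
    """Build a URL for the /query endpoint (hypothetical helper)."""
    params = {"q": q}
    if species:
        params["species"] = ",".join(species)
    if fields:
        params["fields"] = ",".join(fields)
    # Keep ':', ',' and '*' unescaped so the URLs read like the slide examples.
    return f"{BASE_URL}/query?{urlencode(params, safe=':,*')}"


def build_gene_url(gene_id):
    """Build a URL for the /gene endpoint (hypothetical helper)."""
    return f"{BASE_URL}/gene/{gene_id}"


print(build_query_url("cdk2"))                                    # simple query
print(build_query_url("symbol:cdk2"))                             # fielded query
print(build_query_url("cdk*"))                                    # wildcard query
print(build_query_url("cdk2", species=["mouse", "rat"]))          # species filter
print(build_query_url("cdk2", fields=["symbol", "homologene"]))   # fields filter
print(build_gene_url(1017))                                       # gene object by ID
```

For actual requests, the `mygene` package mentioned in the slides (`pip install mygene`) is the wrapper the authors point to; its interface is not shown in the talk, so it is not used here.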
21. Acknowledgement
   Funding and Support: R01GM083924
   Sulab: Andrew Su, Benjamin Good, Max Nannis, Salvatore Loguercio, Katie Fisch, Tobias Meissner

22. Syncing from data-hub to query instances
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); public data hub (merging) and public query instance (query engine; user queries); private data hub (merging; Private 1, Private 2, ...) and private query instance (query engine; user queries)]

23. Private query instance
   • Same powerful query interface
   • Third-party packages still work
   • Public data gets synced
   • Allows merging private data
   [Diagram: data sources (Entrez, Ensembl, UniProt, ...); public data hub (merging); private data hub (merging; Private 1, Private 2, ...); private query instance (query engine)]