Kipper: Sequence database versioning for Galaxy bioinformatics servers
![Page 1: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/1.jpg)
KIPPER: SEQUENCE DATABASE VERSIONING FOR GALAXY BIOINFORMATICS SERVERS
Damion Dooley
Hsiao Lab, BC Public Health Microbiology & Reference Laboratory and UBC Department of Pathology, Vancouver, Canada
https://github.com/Public-Health-Bioinformatics/kipper
https://github.com/Public-Health-Bioinformatics/versioned_data
![Page 2: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/2.jpg)
How to recreate sequencing analysis?
Retrieve or redo sequencing data
Get right software versions
Get databases as they appeared on a certain date
![Page 3: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/3.jpg)
Nice database vs. juggernaut
Periodically published
• Varying ability to download past versions
• RDP RNA v10.1 – 11.4 (5.5 GB)
• Silva RNA v89 – 119 (2.6 GB)
• Uniref (~50 versions, ~35 GB latest)
Pseudo-versioned
• Version stated but no way to get past ones?
• No client software for insert/delete diff
• NCBI nt (58 GB)
• NCBI nr (78 GB)
Ancient juggernaut supporting immortal database and crushing unwary sys admins in its path
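The insert/delete diff mentioned on this slide can be illustrated with a small sketch. This is not Kipper's code: it assumes each database release has already been parsed into a dict mapping a sequence id to its sequence, and the function name `diff_snapshots` is hypothetical.

```python
def diff_snapshots(old, new):
    """Return (inserts, deletes) that turn snapshot `old` into `new`.

    Both arguments are dicts mapping a sequence id to its sequence.
    A changed sequence is recorded as a delete of the old value plus
    an insert of the new one.
    """
    inserts = {}
    deletes = {}
    for key, value in new.items():
        if key not in old:
            inserts[key] = value          # brand-new record
        elif old[key] != value:
            deletes[key] = old[key]       # old value retired
            inserts[key] = value          # new value takes its place
    for key, value in old.items():
        if key not in new:
            deletes[key] = value          # record dropped from the release
    return inserts, deletes


# Two tiny made-up releases, for illustration only.
release_1 = {"seq1": "ACGT", "seq2": "GGCC"}
release_2 = {"seq1": "ACGT", "seq2": "GGCA", "seq3": "TTAA"}
inserts, deletes = diff_snapshots(release_1, release_2)
```

Storing only such inserts and deletes per release, rather than a full copy of each release, is what keeps a versioned archive of a large database manageable in size.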
![Page 4: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/4.jpg)
Kipper – fetch!
What is a poor server admin to do?
![Page 5: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/5.jpg)
Kipper data store
Metadata file
![Page 6: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/6.jpg)
Kipper data store
Volume file(s)
![Page 7: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/7.jpg)
Version listing
• Kipper is a Python script
$ Kipper rdp_rna
• Add new version:
$ Kipper rdp_rna -i download.fasta -o .
• Retrieve a version by id:
$ Kipper rdp_rna -e -n 11
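One way a store can answer "retrieve version 11" without keeping a full copy of every release is to tag each record with the version that introduced it and the version that removed it. The sketch below is an assumption made for illustration, not Kipper's actual file format or code; the class `VersionedStore` and its methods `add_version` and `extract` are hypothetical names.

```python
class VersionedStore:
    """Toy in-memory store: each record remembers when it was added and removed."""

    def __init__(self):
        self.version = 0
        # Each record is [key, value, created_version, deleted_version or None].
        self.records = []

    def add_version(self, snapshot):
        """Import a full snapshot (dict of key -> value) as the next version."""
        self.version += 1
        live = {r[0]: r for r in self.records if r[3] is None}
        for key, record in live.items():
            if snapshot.get(key) != record[1]:
                record[3] = self.version  # removed or changed in this version
        for key, value in snapshot.items():
            if key not in live or live[key][1] != value:
                self.records.append([key, value, self.version, None])
        return self.version

    def extract(self, version):
        """Rebuild the snapshot as it stood at `version`."""
        return {
            key: value
            for key, value, created, deleted in self.records
            if created <= version and (deleted is None or deleted > version)
        }


# Made-up data: three successive releases of a tiny database.
store = VersionedStore()
store.add_version({"a": "AC", "b": "GG"})
store.add_version({"a": "AC", "b": "GT", "c": "TT"})
store.add_version({"c": "TT"})
second_release = store.extract(2)
```

Unchanged records are stored once however many releases they survive, so the cost of each new version is proportional to what changed rather than to the size of the database.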
![Page 8: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/8.jpg)
Galaxy - version retrieval
![Page 9: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/9.jpg)
Version retrieval
![Page 10: Kipper: Sequence database versioning for Galaxy bioinformatics servers](https://reader036.vdocuments.us/reader036/viewer/2022062522/587a56461a28ab520b8b57f1/html5/thumbnails/10.jpg)
Acknowledgements
This work was supported by Genome Canada / Genome BC Grant “A Federated Bioinformatics Platform for Public Health Microbial Genomics” to Fiona Brinkman, Gary Van Domselaar and William Hsiao. More information about the IRIDA project (Integrated Rapid Infectious Disease Analysis) can be found at http://www.irida.ca