Lowering barriers to publishing biological data on the web
DESCRIPTION
A short 10-minute talk encouraging bioinformatics programmers to organize and reuse code aimed at making data easily available on the web. Current open source technologies are combined into a higher-level framework. An example implementation using Google App Engine and existing bioinformatics libraries is presented.
TRANSCRIPT
Lowering barriers to publishing biological data on the web
Brad Chapman
Department of Molecular Biology
Massachusetts General Hospital
Boston, MA [email protected]
http://friendfeed.com/chapmanb
27 June 2009
Motivation
I Web accessible
I Interoperable in standard formats
I Displays for browsing
I Analyses
I Scale
Current state: Reusable libraries
I Parse file formats
I Run programs
I Build analysis pipelines
I Communities
Python examples
I Biopython
I bx-python
I pygr
I PyCogent
Current state: Database schemas
I Represent biological data
I Expand analyses beyond flat files
I Interoperate with standards
Examples: BioSQL, Chado
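To illustrate what "expand analyses beyond flat files" can mean in practice, here is a minimal sketch using Python's built-in sqlite3 module. The two tables (`entry`, `feature`) and the helper names are hypothetical and heavily simplified; they are not the actual BioSQL or Chado schema, only an assumption-laden stand-in for the general idea of relational storage.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a simplified, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entry (
            entry_id    INTEGER PRIMARY KEY,
            accession   TEXT UNIQUE NOT NULL,
            description TEXT
        );
        CREATE TABLE feature (
            feature_id INTEGER PRIMARY KEY,
            entry_id   INTEGER NOT NULL REFERENCES entry(entry_id),
            type       TEXT NOT NULL,
            start_pos  INTEGER NOT NULL,
            end_pos    INTEGER NOT NULL
        );
        """
    )
    return conn


def add_entry(conn, accession, description, features):
    """Store one record and its (type, start, end) features."""
    cur = conn.execute(
        "INSERT INTO entry (accession, description) VALUES (?, ?)",
        (accession, description),
    )
    entry_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO feature (entry_id, type, start_pos, end_pos) "
        "VALUES (?, ?, ?, ?)",
        [(entry_id, ftype, start, end) for ftype, start, end in features],
    )
    return entry_id


def features_overlapping(conn, accession, start, end):
    """Return features on a record that overlap the closed range [start, end]."""
    rows = conn.execute(
        """
        SELECT f.type, f.start_pos, f.end_pos
        FROM feature AS f
        JOIN entry AS e ON e.entry_id = f.entry_id
        WHERE e.accession = ? AND f.start_pos <= ? AND f.end_pos >= ?
        ORDER BY f.start_pos
        """,
        (accession, end, start),
    )
    return rows.fetchall()
```

A range query like `features_overlapping(conn, "DEMO0001", 400, 950)` is the kind of question that is awkward against a flat file and becomes a single indexed lookup once the data sits in a schema.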
Current state: Web applications
Faster and Bigger
Proposal
I Provide
    I Reusable presentation components
    I Quickly deployable frameworks
I Integrate
    I Bioinformatics libraries
    I Database schemas
    I Web development frameworks
Proposal
http://biosqlweb.appspot.com/
Challenges: Design
I Reusable
    I Components: avoid large framework
    I Multi-language: JavaScript front end
I Accessible
    I Automated data retrieval (REST)
    I Standard formats (GFF, RDF)
I Available
    I Creative Commons: http://creativecommons.org/about/licenses
    I Open Data Commons: http://www.opendatacommons.org/licenses/
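As one possible reading of the "Accessible" goal, the sketch below serializes features to GFF3 text using only the Python standard library. The function name `to_gff3` and the dictionary layout of each feature are assumptions made for illustration, not part of the talk or of any existing library. The output follows the nine tab-separated columns of GFF3, with score and phase always left as "." for simplicity (a real exporter would set phase for CDS features).

```python
from urllib.parse import quote


def to_gff3(features):
    """Serialize feature dicts (hypothetical layout) to a GFF3 string.

    Each feature needs "seqid", "type", "start", and "end"; "source",
    "strand", and "attributes" are optional.
    """
    lines = ["##gff-version 3"]
    for feat in features:
        # Percent-encode attribute values so ';', '=', and tabs cannot
        # break the column layout. This encodes more than GFF3 requires.
        attributes = ";".join(
            f"{key}={quote(str(value), safe='')}"
            for key, value in feat.get("attributes", {}).items()
        )
        columns = [
            feat["seqid"],
            feat.get("source", "."),
            feat["type"],
            str(feat["start"]),
            str(feat["end"]),
            ".",  # score: not tracked in this sketch
            feat.get("strand", "."),
            ".",  # phase: not tracked in this sketch
            attributes or ".",
        ]
        lines.append("\t".join(columns))
    return "\n".join(lines) + "\n"
```

A REST-style handler in any web framework could return this string as plain text for a URL that identifies a record, which would let scripts retrieve the same data that the browser displays.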
Challenges: Community questions
How do we...
I provide plug-in components?
I leverage existing code?
I make reuse easier?
I communicate about these issues?