Dawn Field: The Genomic Standards Consortium (GSC)
DESCRIPTION
Dawn Field's opening presidential address to the 13th Genomic Standards Consortium workshop in Shenzhen, 5th March 2012
TRANSCRIPT
Dawn Field
NERC Centre for Ecology and Hydrology
The Genomic Standards Consortium (GSC)
Standards Driving Science http://gensc.org
GSC
The GSC is an open-membership community working towards better descriptions of our collection of genomes, metagenomes and marker gene sets
The GSC is running a range of consensus-driven projects and is now making a call for community compliance/community involvement
GSC 13
The rise of the megasequencing project...
GSC 13
From Genomes to Interactions to Communities to Models
The Genome Revolution
Timeline (figure), 1950 to 2001. Labelled events: Watson & Crick discover the double helix; sequencing invented; PCR invented; Internet develops; similarity searching (BLAST) developed. Year labels shown: 1953, 1980, 1989, 1994, 1995, 1996, 2001.
The (Meta)Genome Revolution
Timeline (figure), 2001 to 2011. Labelled events: Sargasso Sea metagenome; Global Ocean Survey metagenomes; next-generation sequencing; GEBA (56 genomes); 1k human genomes; synthetic life. A further label reads "(62 genomes)".
The Megasequencing Revolution
Timeline (figure), 2012 to 2022.
Genomic Observatories (GOs)
A network of sites working to generate genomic observations that are well-contextualized and compliant with global data standards
Sustained, DNA Centric, Place-Based Research
Genomic Observatories Network
GSC 13
From Genomes to Interactions to Communities to Models
GSC 14 GOs 1
Sept 17-19, 2012, Oxford e-Research Centre, University of Oxford
Launch of the Genomic Observatories Network; focus on highly contextualized site-based research, defining what a genomic observatory is, GSC standards compliance, data integration, RDF, modelling and Biocode Commons
GSC: Building a Communal Table
Communication
Community
Collaboration
Communal Table
GSC’s main areas of interest
Genomics - the study of the genomes (entire DNA sequence) of organisms
Metagenomics – the study of genetic material isolated from environmental samples
Marker studies – the study of the composition of environmental samples based on marker genes (e.g. 16S ribosomal RNA)
To exploit fully the promise of scientific data we need both innovation and community agreement on how to provide appropriate stewardship of these resources for the benefit of all. This requires the evolution of our scientific, technological and sociological thinking...
The Data Bonanza
The Data SuperMarket
Norman Morrison
The Data SuperMarket
How to Package data?
Labels for data
<phenotype> <environmental context>
A standard is a convention that gives uniformity to an area of research or innovation.
Standards unite groups and enable collective change.
Standards provide the language in which innovation is written.
What are standards?
Standards
Not everything should be ‘standardized’
Aggregation of data, information, and knowledge requires standard ways of doing things
Standards provide foundations; Standards should drive innovation (think of electrical plugs or the internet)
Pick the right concepts to standardize – at the right time, with the right people
Requires good ‘group think’ – or ‘systems thinking’
Principles
GSC 10, Argonne, 2010
GSC 11, Hinxton, 2010
GSC 12, Bremen, 2011
GSC 13, BGI, 2012
Community-driven solutions
Taking the ‘Common Path’ towards building consensus:
• Identify the problem
• Define a community to address it
• Define scope of the solution
• Implement solution
• Gain adoption of solution
The GSC Mission
the implementation of new genomic standards
methods of capturing and exchanging metadata
harmonization of metadata collection and analysis efforts across the wider genomics community
The GSC fulfils its mission by
• Organizing meetings
• Forming working groups
• Creating Consensus Products
GSC Standards
Use of MIxS
Please provide this minimum information when you publish
• a genome
• a metagenome
• a gene marker study (e.g. ribosomal genes)
GenBank, EMBL and DDBJ now accept this information and encourage its submission to their public DNA databases
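As a rough illustration of what "minimum information" means in practice, the sketch below checks a metadata record for missing fields before submission. The field names and the per-study-type lists are assumptions chosen for illustration, not the official MIxS checklists; the function name `missing_fields` and the sample record are likewise hypothetical. Consult the GSC checklists for the actual required terms.

```python
# Illustrative sketch only: the field lists below are assumed examples,
# NOT the official MIxS checklists published by the GSC.
COMMON_FIELDS = ["project_name", "collection_date", "geo_loc_name", "lat_lon", "seq_meth"]

REQUIRED_FIELDS = {
    "genome": COMMON_FIELDS,
    "metagenome": COMMON_FIELDS + ["env_package"],
    "marker_gene": COMMON_FIELDS + ["target_gene"],
}


def missing_fields(record, study_type):
    """Return the required fields that are absent or blank in `record`."""
    required = REQUIRED_FIELDS[study_type]
    missing = []
    for field in required:
        value = record.get(field)
        # Treat None, empty strings and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Hypothetical record for a marker gene study; all values are placeholders.
record = {
    "project_name": "example project",
    "collection_date": "2012-03-05",
    "geo_loc_name": "example location",
    "lat_lon": "",  # left blank on purpose
    "seq_meth": "example sequencing method",
}

print(missing_fields(record, "marker_gene"))
```

Running a check like this before submission flags the gaps (here the blank `lat_lon` and the absent `target_gene`) so they can be filled in while the sampling details are still at hand.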
New Labels for Community Data
<MIxS><MIxS>
Working groups and consensus products
The Genomic Contextual Data Markup Language (GCDML)
The Genomic Rosetta Stone (GRS)
Standards in Genomic Sciences (SIGS) journal
The Microbial Earth Project (MEP)
The GSC’s Compliance and Interoperability (Developers) Working Group
M5: Metagenomics, Metadata, Meta-analysis, Models and Meta-infrastructure
The GSC’s Biodiversity Working Group
More information: http://gensc.org/
Conclusions
The era of genomics is just beginning…
Self-organization by the scientific community can pay dividends (e.g. consensus building, large-scale co-ordination)
Standards are keys to unlocking data
Conclusions
The GSC is running a range of consensus-driven projects and is now making a call for community compliance/community involvement
It is now possible to submit compliant metadata for MIGS/MIMS/MIMARKS to DDBJ/EMBL/GenBank, GOLD, MG-RAST, CAMERA, BII, VAMPS and more
Acknowledgements
The GSC efforts are contributed on a volunteer basis by a wide range of participants, including GSC authors, working group members, workshop participants and adopters.
Special Thanks to the GSC Board:
Linda Amaral-Zettler, MBL
Guy Cochrane, EMBL-EBI
Jim Cole, MSU
Neil Davies (Berkeley)
Peter Dawyndt, University of Ghent
Dawn Field, CEH (Chair of GSC)
George Garrity, MSU
Jack Gilbert, Argonne National Lab
Frank Oliver Glöckner, MPI-Bremen
Lynette Hirschman, MITRE
Hans-Peter Klenk, DSMZ
Renzo Kottmann, MPI-Bremen
Rob Knight (University of Colorado)
Nikos Kyrpides, DOE, JGI
Folker Meyer, Argonne National Lab
Norman Morrison (University of Manchester)
Inigo San Gil, LTER
Susanna Sansone, University of Oxford
Lynn Schriml, University of Maryland (Treasurer of GSC)
Peter Sterk, GSC (Secretary of GSC)
Dave Ussery, DTU
Owen White, University of Maryland
John Wooley, UCSD (PI of RCN4GSC)
Institutional Liaisons to the GSC Board:
Ilene Mizrachi (NCBI/GenBank)
Tatiana Tatusova (NCBI/RefSeq)
Acknowledgements
Coordination, workshops, working groups, infrastructure and exchange visits
Additional workshop funds
Local Hosts of GSC workshops
Sponsors of GSC Events
GSC Funding RCN4GSC
Large-Scale Data Journal/Database
Editor-in-Chief: Laurie Goodman, PhD
Editor: Scott Edmunds, PhD
Assistant Editor: Alexandra Basford, PhD
In conjunction with:
Now taking submissions…
www.gigasciencejournal.com
Large-Scale Data Journal/Database
GigaScience aims to revolutionize data dissemination, organization, understanding, and use. An online open-access open-data journal, we publish 'big-data' studies from the entire spectrum of life and biomedical sciences. A novel publication format links standard manuscript publication with an extensive database that hosts all associated data, provides data analysis tools and cloud-computing resources, and gives all datasets a DOI as a citable and trackable data publication mark.
For more information: [email protected]
@gigascience
Contact: [email protected]
GSC13 special series
• Rapid review - rolling publication after launch issue
• High-visibility – published/promoted by BMC/GigaScience
• Article Processing Charge covered by BGI
• Hosting of any test datasets in GigaDB
Seeking submissions highlighting best practice in genomics research:
• Discussion/comment/white papers
• Cloud computing, software for data handling
• Research highlighting best practice
@gigascience