Page 1: Visualisation of chemical data Brian McMahon Research & Development Officer International Union of Crystallography 5 Abbey Square Chester CH1 2HU bm@iucr.org

Visualisation of chemical data

Brian McMahon

Research & Development Officer
International Union of Crystallography
5 Abbey Square
Chester CH1 2HU

bm@iucr.org

DataCite Summer Meeting, Hannover
7-8 June 2010

Use cases for publication of crystal structures

Page 2:

International Union of Crystallography

International Scientific Union

Publishes 8 research journals:

• Acta Crystallographica Section A: Foundations of Crystallography

• Acta Crystallographica Section B: Structural Science

• Acta Crystallographica Section C: Crystal Structure Communications

• Acta Crystallographica Section D: Biological Crystallography

• Acta Crystallographica Section E: Structure Reports Online

• Acta Crystallographica Section F: Structural Biology and Crystallization Communications

• Journal of Applied Crystallography

• Journal of Synchrotron Radiation

Publishes major reference work International Tables for Crystallography (8 volumes)

Promotes standard crystallographic data file format (CIF)

Page 3:

Crystal Structure reports - data-rich scientific articles

• 3-d positional coordinates

• Atomic motions

• Molecular geometry

• Chemical bonding

• Crystal packing

• Chemical behaviour arising from structure

• Two dedicated IUCr journals: Acta Cryst. C, E

• Important part of scientific discussion in many other titles: Acta Cryst. B, D, F

Page 4:

Data that inform the discussion

Raw data

(image plate, diffractometer, film)

Primary data

(structure factors)

Derived data

(six-dimensional structural model)

Page 5:

Structural data sets integral to publication

Every structural paper has an associated data set (CIF)

These are free to access, even for subscription journals

Hence accessible from browsable Table of Contents (as well as from article)

User can download CIF data set directly

Or visualise the structure interactively in three dimensions

... using a standard view supplied by a visualisation applet (Jmol) ...

... or a 'helper' application of the reader's choosing (Mercury)

Page 6:

Enhanced figures

IUCr journals provide an authoring toolkit for creating enhanced figures: data visualisations crafted by the author (but allowing the reader to interact fully with the data and create other visualisations if desired)

The enhanced figure will appear (with caption) as a normal figure in the online journal; the PDF/print editions will have an equivalent static view. Additional views and interactive features can be added by the author if desired.

Page 7:

Data sources

The structural data (CIF) can be uploaded to the journal production office ...

... from the user's hard drive (as part of the article submission process) ...

... from an external database using a known accession code (e.g. dn3141 for Crystallography Journals Online, 3dez for Protein Data Bank) ...

... or from a registered data DOI (e.g. 10.2210/pdb3dez/pdb)
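A registered data DOI such as the one above can be turned into a resolvable link mechanically. The following is a minimal sketch, not the journal's actual upload code: it assumes the public resolver at `https://doi.org/`, and the function names `split_doi` and `resolver_url` are hypothetical.

```python
from typing import Tuple
from urllib.parse import quote

# Assumption: the public DOI resolver; the slides do not name one.
RESOLVER = "https://doi.org/"


def split_doi(doi: str) -> Tuple[str, str]:
    """Split a DOI into its registrant prefix and suffix at the first slash."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    prefix, sep, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and sep and suffix):
        raise ValueError(f"not a DOI: {doi!r}")
    return prefix, suffix


def resolver_url(doi: str) -> str:
    """Build the resolver URL, percent-encoding unusual suffix characters."""
    prefix, suffix = split_doi(doi)
    return f"{RESOLVER}{prefix}/{quote(suffix, safe='/')}"


print(resolver_url("10.2210/pdb3dez/pdb"))
```

Only the first slash separates prefix from suffix, so suffixes that themselves contain slashes (as the PDB example does) are kept whole.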

Page 8:

DOIs for crystallographic structures (1)

Article in IUCr Crystallography Journals Online

• Acta Cryst. (2010). C66, o274-o278  [ doi:10.1107/S0108270110015532 ] Different hydrogen-bonding modes in two closely related oximes, G. Dutkiewicz, H. S. Yathirajan, R. Ramachandran, S. Kabilan and M. Kubicki

• Describes structures of two molecules: 1-chloroacetyl-3-ethyl-2,6-diphenylpiperidin-4-one oxime, C21H23ClN2O2 (at two distinct temperatures) and 1-chloroacetyl-2,6-diphenyl-3-(propan-2-yl)piperidin-4-one oxime, C22H25ClN2O2

• IUCr identifier: dn3141 points to article and associated data sets

• DOI: 10.1107/S0108270110015532/dn3141sup1.cif

– DOI for data set (CrossRef) delivers data file directly

– One data file, three distinct data sets (file internally partitioned, data_I, data_I100K, data_II)

• Other supplementary data sets available (processed experimental data)

– DOIs: 10.1107/S0108270110015532/dn3141Isup2.hkl etc.

– Separate data file for each of three refined structures
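The point that one CIF file can hold several data sets can be illustrated with a short sketch that separates a file into its `data_` blocks. This is a simplified splitter rather than a full CIF parser: it only recognises block headers and semicolon-delimited text fields, the function name `split_data_blocks` is hypothetical, and the cell values in the sample are made up for illustration (only the block names come from the slide).

```python
import re

# A data block starts with a line "data_<name>" (case-insensitive in CIF).
DATA_HEADER = re.compile(r"^data_(\S+)\s*$", re.IGNORECASE)


def split_data_blocks(cif_text):
    """Return {block_name: [lines]} for each data_ block, in file order."""
    blocks = {}
    current = None
    in_text_field = False
    for line in cif_text.splitlines():
        # A ';' in column one opens or closes a multi-line text field,
        # inside which "data_..." must not be treated as a header.
        if line.startswith(";"):
            in_text_field = not in_text_field
        match = None if in_text_field else DATA_HEADER.match(line)
        if match:
            current = match.group(1)
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return blocks


# Block names as on the slide; the numeric values are invented placeholders.
SAMPLE_CIF = (
    "data_I\n_cell_length_a 10.0\n"
    "data_I100K\n_cell_length_a 9.9\n"
    "data_II\n_cell_length_a 11.2\n"
)

print(list(split_data_blocks(SAMPLE_CIF)))
```

A reader who resolves the single data-set DOI therefore still needs the block name to pick out one structure.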

Page 9:

DOIs for crystallographic structures (2)

Macromolecular structure from Protein Data Bank

• Orotate phosphoribosyltransferase from Streptococcus mutans

• PDB code: 3dez points to structure, associated data files, sequence, visualisations etc.

• DOI: 10.2210/pdb3dez/pdb

– DOI for data set (CrossRef) delivers data file directly

– One data file, one distinct data set (protein structure)

• DOI of associated publication: 10.1107/S1744309110009243

– Acta Cryst. (2010). F66, 498-502  [ doi:10.1107/S1744309110009243 ] Structure of orotate phosphoribosyltransferase from the caries pathogen Streptococcus mutans, C.-P. Liu, R. Xu, Z.-Q. Gao, J.-H. Xu, H.-F. Hou, L.-Q. Li, Z. She, L.-F. Li, X.-D. Su, P. Liu and Y.-H. Dong

• PDB also archives structure factors (processed experimental data sets)

– One per refined structure

– No distinct DOI assigned

• PDB does not (yet) archive primary data sets

– More than one per refined structure
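The PDB example shows the accession code embedded in the DOI suffix (`3dez` in `10.2210/pdb3dez/pdb`). A small sketch of converting between the two follows. It assumes the classic four-character PDB code (a digit followed by three alphanumerics) and generalises the pattern from the single example on the slide; the function names are hypothetical.

```python
import re

# Assumption: classic four-character PDB codes, e.g. "3dez".
PDB_CODE = re.compile(r"[0-9][a-z0-9]{3}")
PDB_DOI = re.compile(r"^10\.2210/pdb([0-9][a-z0-9]{3})/pdb$", re.IGNORECASE)


def pdb_code_to_doi(code):
    """Build the data-set DOI for a PDB accession code."""
    code = code.strip().lower()
    if not PDB_CODE.fullmatch(code):
        raise ValueError(f"not a four-character PDB code: {code!r}")
    return f"10.2210/pdb{code}/pdb"


def doi_to_pdb_code(doi):
    """Return the PDB code embedded in a PDB DOI, or None if it is not one."""
    match = PDB_DOI.match(doi.strip())
    return match.group(1).lower() if match else None


print(pdb_code_to_doi("3dez"))
```

No equivalent function exists here for the structure factors, since the slide notes that no distinct DOI is assigned to them.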

Page 10:

DOIs for crystallographic structures (3)

Crystal structure in U. Southampton eCrystals repository

• 2,2-dibutyl-1,3-propanediol

• eCrystals accession code: 643 points to structure, associated data files, visualisations etc.

• DOI: 10.3737/ecrystals.chem.soton.ac.uk/643

– DOI assigned by CrossRef delivers portal page to all associated data files

– Portal links to raw data sets (archived off-site at STFC Atlas Facility)

– Portal links to derived information (chemical structure as CML files)

• DOI of associated publication:

– Publisher-assigned DOI if structure published in peer-reviewed journal

– unpublished

Page 11:

DOIs for crystallographic structures (4)

Unilever Cambridge Centre for Molecular Informatics / Project XYZ

• Proposal for JISC project (partners: IUCr, BioMedCentral, OKF)

• Explore data overlay journal: linking, validation, rights relating to crystal structures:

– Published (variety of commercial/society publishers)

– Unpublished (repositories, laboratory collections)

– Auto-generated annotation

• DOI:

– Strategy not yet elaborated. Probably includes:

– DOI assigned by DataCite delivers portal page to all associated data files

– Portal links to derived information (chemical structure as CML files)

• DOI of associated publication:

– Publisher-assigned DOI if structure published in peer-reviewed journal

– unpublished

Page 12:

DOIs for crystallographic structures: summary

IUCr Crystallography Journals Online

• DOI for article (CrossRef)

• Separate DOI for data set (CrossRef)

Protein Data Bank

• DOI for data set (CrossRef)

eCrystals / University of Southampton

• DOI for data collection (CrossRef)

Unilever Cambridge Centre for Molecular Informatics / Project XYZ

• DOI for data collection? (DataCite)

* Need consistent protocols for retrieving data set from larger data collection
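The need for a retrieval protocol can be made concrete by tabulating what each example DOI from the preceding slides delivers. The sketch below is illustrative only: the `DoiTarget` class and `needs_selector` function are hypothetical, and the "file" / "portal" labels simply restate what the slides say about each source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DoiTarget:
    source: str
    example_doi: str
    delivers: str              # "file" or "portal", as described on the slides
    data_sets: Optional[int]   # data sets in the delivered file, if known


TARGETS = [
    DoiTarget("IUCr Crystallography Journals Online",
              "10.1107/S0108270110015532/dn3141sup1.cif", "file", 3),
    DoiTarget("Protein Data Bank", "10.2210/pdb3dez/pdb", "file", 1),
    DoiTarget("eCrystals", "10.3737/ecrystals.chem.soton.ac.uk/643",
              "portal", None),
]


def needs_selector(target):
    """True when the DOI alone does not identify a single data set."""
    return target.delivers == "portal" or target.data_sets != 1


for target in TARGETS:
    print(target.source, needs_selector(target))
```

Of the three, only the PDB DOI identifies exactly one data set by itself; the other two need an additional selector, such as a data block name or a link chosen from the portal page.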