Marrying ACD/Labs technologies to eScience Projects at the Royal Society of Chemistry
DESCRIPTION
The Royal Society of Chemistry is one of the world's foremost scientific societies, a primary publisher for the chemical sciences and an innovator in the domain of eScience. To deliver on several of our eScience projects we use a number of components of Advanced Chemistry Development (ACD/Labs) software, including nomenclature, physchem prediction, spectroscopy tools and the ACD/Ilab web-based system. This presentation will provide an overview of RSC projects where ACD/Labs software has played an important role in the delivery of the systems, including ChemSpider and the National Chemical Database Service for the United Kingdom. We will also provide an overview of our vision to deliver a repository for various types of experimental chemistry data, and how we foresee using various prediction and validation software approaches to characterize the data, as well as the potential to generate predictive models from the data. This couples directly with our intention to data-enable our publication archive of over 300,000 articles, extracting chemicals, reactions and analytical data from the historical records.
TRANSCRIPT
![Page 1: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/1.jpg)
Marrying ACD/Labs technologies to eScience Projects at the Royal Society of Chemistry
Antony Williams
ACD/Labs User Meeting
June 2013
![Page 2: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/2.jpg)
RSC eScience
• Royal Society of Chemistry is a member society (>47,000), Publisher and Innovator in eScience
• Host of many online databases and services
– ChemSpider, SyntheticPages, SpectraSchool,…
• Participant in multiple grant-based projects
– National Chemical Database Service
– Open PHACTS
– PharmaSea
![Page 3: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/3.jpg)
Multiple ACD/Labs Tools in use…
• Structure “checking” routines for data
• Nomenclature generation and conversion
• Physicochemical prediction algorithms
• Web-based spectral display widget
• “Interactive Lab” web-based prediction tools
• But first an intro to ChemSpider…
![Page 4: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/4.jpg)
ChemSpider
• 28 million chemicals with associated data…
![Page 5: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/5.jpg)
I want to know about “Vincristine”
![Page 6: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/6.jpg)
I want to know about “Vincristine”
![Page 7: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/7.jpg)
Vincristine: Identifiers and Properties
![Page 8: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/8.jpg)
Predicted Properties
![Page 9: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/9.jpg)
Vincristine: Vendors and Sources
Linked by Structure
![Page 10: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/10.jpg)
Vincristine: Patents
![Page 11: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/11.jpg)
Google Patents
![Page 12: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/12.jpg)
Vincristine: Articles
Linked by Name
![Page 13: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/13.jpg)
RSC Databases
![Page 14: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/14.jpg)
RSC Database Linkthrough
![Page 15: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/15.jpg)
Spectra
![Page 16: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/16.jpg)
Spectra
![Page 17: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/17.jpg)
Where do data come from?
• ChemSpider users deposit data
• Some contributions from NIST
• Chemical vendors are starting to provide data.
Synthonix are one of our major contributors (www.synthonix.com)
![Page 18: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/18.jpg)
Crowdsourced “Annotations”
• Users can add
– Compounds
– Descriptions/Syntheses/Commentaries
– Links to articles via DOIs
– Add spectral data
– Add Crystallographic Information Files
– Add photos
– Add MP3 files
– Add Videos
![Page 19: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/19.jpg)
Crowdsourced Curation
• Crowd-sourced curation: identify/tag errors, edit names, synonyms, identify records to deprecate
![Page 20: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/20.jpg)
Spectral Uploading
• Locate the structure of interest and deposit spectrum
![Page 21: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/21.jpg)
Spectral Uploading
• Various types of NMR spectra supported
![Page 22: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/22.jpg)
Regular Updates
![Page 23: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/23.jpg)
Multiple Spectra for One Structure
![Page 24: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/24.jpg)
ChemSpider ID 24528095 H1 NMR
![Page 25: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/25.jpg)
ChemSpider ID 24528095 C13 NMR
![Page 26: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/26.jpg)
ChemSpider ID 24528095 HHCOSY
![Page 27: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/27.jpg)
ChemSpider ID 24528095 HSQC
![Page 28: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/28.jpg)
ChemSpider ID 24528095 HMBC
![Page 29: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/29.jpg)
Available Spectra http://www.chemspider.com/spectra.aspx
![Page 30: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/30.jpg)
Number of Spectra
• IR 5389
• HNMR 1679
• CNMR 1207
• UV-Vis 183
• EI 90
• 2D1H13CD 68
• Raman 51
• NIR 32
• 2D1H1HCOSY 21
• 2D1H13CLR 10
• CI+ve 8
• PNMR 7
• 9746 spectra against 6890 compounds
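As a quick arithmetic check on the counts listed on this slide, the short Python sketch below sums the per-technique figures and compares them with the stated total. The dictionary keys simply reuse the slide's labels. Reading the remainder as spectra of types not itemised on the slide is an assumption, not something the slide states.

```python
# Per-technique spectrum counts exactly as listed on the slide.
spectra_counts = {
    "IR": 5389, "HNMR": 1679, "CNMR": 1207, "UV-Vis": 183, "EI": 90,
    "2D1H13CD": 68, "Raman": 51, "NIR": 32, "2D1H1HCOSY": 21,
    "2D1H13CLR": 10, "CI+ve": 8, "PNMR": 7,
}
stated_total = 9746       # "9746 spectra against 6890 compounds"
stated_compounds = 6890

listed_total = sum(spectra_counts.values())
# Assumption: the difference corresponds to spectrum types not itemised here.
not_itemised = stated_total - listed_total
spectra_per_compound = stated_total / stated_compounds

print(f"listed: {listed_total}, not itemised: {not_itemised}")
print(f"average spectra per compound: {spectra_per_compound:.2f}")
```

The itemised types account for most of the stated total, and the average of roughly 1.4 spectra per compound is consistent with most compounds having a single spectrum.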
![Page 31: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/31.jpg)
Some usage statistics
• ca. 200 visitors at any one time, ~30,000 visits per day
• Mar 4-Apr 3, 2013
– Visits = 731,656
– Unique Visitors = 527,008
• Independent servers to support other projects
• Does not include web service calls
![Page 32: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/32.jpg)
ChemSpider as a Foundation
• ChemSpider is a foundation for projects:
– >400 data sources aggregated and mapped
– Continually curated and updated with new data
– Normalized data around a structure-centric data model
– Providing an API allows integration to support other internal projects
– Providing API access outside RSC extends the reach
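The idea of normalizing many sources around a structure-centric data model can be illustrated with a minimal Python sketch. This is not ChemSpider's actual schema: the `SourceRecord` class, the `merge_by_structure` function, the `KEY-…` structure identifiers, the source names and the property values are all hypothetical placeholders chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SourceRecord:
    source: str          # name of the contributing data source (made up here)
    structure_key: str   # hypothetical canonical structure identifier
    name: str
    properties: dict = field(default_factory=dict)


def merge_by_structure(records):
    """Group records from many sources under one entry per structure key."""
    merged = defaultdict(
        lambda: {"names": set(), "sources": set(), "properties": {}}
    )
    for rec in records:
        entry = merged[rec.structure_key]
        entry["names"].add(rec.name)
        entry["sources"].add(rec.source)
        # Keep every reported value with its provenance instead of overwriting.
        for prop, value in rec.properties.items():
            entry["properties"].setdefault(prop, []).append((rec.source, value))
    return dict(merged)


# Placeholder records: identifiers and property values are invented.
records = [
    SourceRecord("vendor_a", "KEY-001", "Vincristine", {"catalogue_id": "A-0001"}),
    SourceRecord("database_b", "KEY-001", "Leurocristine", {"internal_id": "B-42"}),
    SourceRecord("vendor_a", "KEY-002", "Olympicene"),
]
compounds = merge_by_structure(records)
print(sorted(compounds["KEY-001"]["names"]))
```

Keying on the structure rather than the name means that synonyms supplied by different sources land on the same compound entry, which is the property the slide's "linked by structure" views rely on.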
![Page 33: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/33.jpg)
Micropublishing Syntheses
![Page 34: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/34.jpg)
ChemSpider SyntheticPages
![Page 35: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/35.jpg)
Olympicene
![Page 36: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/36.jpg)
![Page 37: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/37.jpg)
![Page 38: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/38.jpg)
Web Services
Example: Spectral Data
![Page 39: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/39.jpg)
www.SpectralGame.com
http://www.jcheminf.com/content/1/1/9
![Page 40: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/40.jpg)
Spectral Game
![Page 41: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/41.jpg)
Increasing Complexity
![Page 42: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/42.jpg)
SpectralGame in the hand
![Page 43: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/43.jpg)
SpectraSchool http://spectraschool.rsc.org/
![Page 44: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/44.jpg)
SpectraSchool
![Page 45: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/45.jpg)
Recently Added – THANKS ACD/Labs!
• Storage and display of ASSIGNED spectra
![Page 46: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/46.jpg)
Access ChemSpider
• APIs
– Programmatic access used by Mobile Apps, Funded Consortia projects, many Academic groups
• Widgets
– UI components for embedding in other websites
• Data
– Data access, downloads, reuse, licensing
![Page 47: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/47.jpg)
Flexible ChemSpider API
![Page 48: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/48.jpg)
Flexible ChemSpider API
![Page 49: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/49.jpg)
Linking Names to Structures
![Page 50: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/50.jpg)
It is so difficult to navigate…
• What’s the structure?
• Are they in our file?
• What’s similar?
• What’s the target?
• Pharmacology data?
• Known Pathways?
• Working On Now?
• Connections to disease?
• Expressed in right cell type?
• Competitors?
• IP?
![Page 51: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/51.jpg)
• 3-year Innovative Medicines Initiative project
• Integrating chemistry and biology data using semantic web technologies
• Open source code, open data and open standards
• Academics, Pharma companies, Publishers
![Page 52: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/52.jpg)
ChemSpider Contributions
• The host of the chemistry services
– Supplier of “standardized” chemical data files
– Chemistry searching (structure, substructure etc)
– Curator and data quality checking
• Presently rolling out the Open PHACTS chemical registration system
![Page 53: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/53.jpg)
• FP7 Initiative. PharmaSea: increasing value and flow in the marine biodiscovery pipeline (2012-2017)
• Improve the quality, volume and value of active agents discovered in the marine environment and increase the speed at which they can be delivered
![Page 54: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/54.jpg)
PharmaSea
• Dereplication via ChemSpider
• Hosting of natural products datasets
• Integrated storage of analytical data (ACD/Labs)
• Analytical data algorithms & integration
– Mass spec searching – predicted fragmentation
– NMR feature searching – NMR prediction
– Computer-assisted structure elucidation
• Integration to ACD/Structure Elucidator
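Dereplication, in its simplest form, means checking whether a measured compound is already known before spending effort on full elucidation. The Python sketch below shows one common ingredient, matching an observed monoisotopic mass against a reference library within a ppm tolerance. It is an illustrative toy, not the PharmaSea or ChemSpider implementation: the function names, the 5 ppm default and the two-entry library are assumptions, and the observed mass is assumed to be already corrected to the neutral molecule.

```python
def ppm_error(observed, reference):
    """Relative mass difference in parts per million."""
    return abs(observed - reference) / reference * 1e6


def dereplicate(observed_mass, library, tolerance_ppm=5.0):
    """Return (name, ppm error) for library entries within tolerance, best first."""
    hits = []
    for name, reference_mass in library.items():
        error = ppm_error(observed_mass, reference_mass)
        if error <= tolerance_ppm:
            hits.append((name, error))
    return sorted(hits, key=lambda hit: hit[1])


# Toy library of neutral monoisotopic masses (Da); a real one would hold
# natural-product records rather than these two familiar compounds.
library = {
    "caffeine": 194.0804,
    "glucose": 180.0634,
}

hits = dereplicate(194.0810, library)
print([name for name, _ in hits])
```

Mass alone rarely identifies a compound uniquely, which is why the slide pairs mass searching with predicted fragmentation, NMR feature searching and computer-assisted structure elucidation.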
![Page 55: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/55.jpg)
UK Chemical Database Service
![Page 56: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/56.jpg)
Ilab Integration – NMR DB Searching
![Page 57: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/57.jpg)
Ilab Integration – NMR Prediction
![Page 58: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/58.jpg)
National Chemistry Data Repository
• Imagine all chemistry related data from all academic projects in the UK in ONE system
• Security model for the data to be embargoed, private or public (available to the entire world!)
• Provide tools for easy data upload, review, automated validation – chemicals, reactions, spectral data, alphanumeric data
• Use the data for algorithm training…
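The three-level security model described above (embargoed, private or public) could be expressed as a small access check. The Python sketch below is only one possible reading of the slide: the `Visibility` enum, the `Deposition` class, the `can_view` function and the rule that an embargoed deposition becomes visible once its embargo date has passed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class Visibility(Enum):
    PRIVATE = "private"
    EMBARGOED = "embargoed"
    PUBLIC = "public"


@dataclass
class Deposition:
    owner: str
    visibility: Visibility
    embargo_until: Optional[date] = None  # only meaningful when EMBARGOED


def can_view(deposition, user, today):
    """Decide whether `user` may see `deposition` on the given day."""
    if user == deposition.owner:
        return True                      # owners always see their own data
    if deposition.visibility is Visibility.PUBLIC:
        return True
    if (deposition.visibility is Visibility.EMBARGOED
            and deposition.embargo_until is not None):
        # Assumed rule: embargoed data opens up once the date is reached.
        return today >= deposition.embargo_until
    return False                         # private, or embargo without a date


spectrum = Deposition("alice", Visibility.EMBARGOED, date(2014, 1, 1))
print(can_view(spectrum, "bob", date(2013, 6, 1)))   # before the embargo ends
print(can_view(spectrum, "bob", date(2014, 6, 1)))   # after the embargo ends
```

A production system would also need group-level sharing and audit trails, which this sketch leaves out.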
![Page 59: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/59.jpg)
In Discussions At Present
• Develop the world's largest online spectroscopy database of integrated data
• Does ACD/Labs have tools to help?
– Automated depositions – Silent Automation
– Processing and validation – Spectrus
– Databasing – Spectrus DB
– Web-based integration into ChemSpider
![Page 60: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/60.jpg)
Where else can we get RICH data?
![Page 61: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/61.jpg)
DERA : Data Enable the RSC Archive
• How much data is in the archive, in the publications and in the supplementary info?
– How many compounds for ChemSpider?
– How many syntheses for ChemSpider reactions?
– How many characterization measurements?
• Property Data
• Spectral Data
• Graphs and charts to be used for modeling?
![Page 62: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/62.jpg)
What if we could capture it all?
![Page 63: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/63.jpg)
The Future of Data
• In Publications
– Interactive plots, spectra, buy that compound, predict that property
– Validation of data going INTO publications – NMR prediction, CASE validation, PhysProp comparisons
• From the lab
– How much data NEVER gets published and is still useful? Failed Reactions? More Open Data…
![Page 64: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/64.jpg)
Acknowledgements
• RSC eScience Team
• ACD/Labs – Pranas Japertas and Karim Kassam
• GGA – Indigo Toolkit and Bingo Cartridge
• The community of depositors
• The Open Source Community
![Page 65: Marrying ACDLabs technologies to eScience Projects at the Royal Society of Chemistry](https://reader035.vdocuments.us/reader035/viewer/2022062319/554e7e37b4c90545698b5183/html5/thumbnails/65.jpg)
Thank you
Email: [email protected]
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams