"chemspider: the free chemical database", presentation by may
TRANSCRIPT
Overview
• What is ChemSpider?
• How to conduct a search
• What data ChemSpider records can contain
• Contributing data
• Sharing reaction protocols
What is ChemSpider?
Free to use
ChemSpider brings together many different types of data, from a variety of sources
2-(Acetyloxy)benzoic acid; 2-(Acetyloxy)benzolcarbonsäure [German]; 2-Acetoxybenzenecarboxylic acid; 2-Acetoxybenzoesäure [German]; 2-Acetoxybenzoic acid; 2-acetyloxybenzoic acid; 50-78-2 [RN]; Acetoxybenzoic acid; acetyl salicylic acid; Acetysalicylic acid; acide 2-(acétyloxy)benzoïque [French]; Acide 2-acétoxybenzoïque [French]; Aspirin [Wiki]; Aspirin (JP15/USP) (VAN); Benzoic acid, 2-(acetyloxy)-; Kyselina acetylsalicylova [Czech]; o-(Acetyloxy)benzoic Acid
What sort of data?
Headline facts
• >27 million structures = >27 million records
– from ~400 data sources
• Free – no need to register
• Provides access to chemistry information:
– anyone, any time, anywhere
Desktop PC in the lab, laptop at home, phone/tablet on the move
• Data isn’t perfect – but still has value
• Users can get involved!
– Share your data
– Raise your profile
ChemSpider in more detail
• Key concept:
“all data that is added to the database must be associated with a discrete chemical structure”
• This sounds easy
... but is actually hard
• Huge advantages to aggregating information around a chemical structure
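The idea of aggregating information around a structure can be illustrated with a minimal sketch. This is not ChemSpider's actual implementation: it assumes each incoming record carries a structure key (here the standard InChIKey for aspirin), and the `aggregate` function, the field names and the source names are all hypothetical.

```python
from collections import defaultdict

# Standard InChIKey for aspirin, used here as the structure key.
ASPIRIN = "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"

# Hypothetical incoming rows: (structure key, source, field, value).
# Source names and field names are made up for illustration.
records = [
    (ASPIRIN, "source_a", "synonym", "Aspirin"),
    (ASPIRIN, "source_b", "synonym", "2-Acetoxybenzoic acid"),
    (ASPIRIN, "source_b", "registry_number", "50-78-2"),
    (None, "source_c", "synonym", "unknown polymer blend"),
]


def aggregate(rows):
    """Group values by structure key; set aside rows with no structure."""
    by_structure = defaultdict(lambda: defaultdict(set))
    rejected = []
    for key, source, field, value in rows:
        if key is None:
            # No discrete structure, so the row cannot be attached to a record.
            rejected.append((source, field, value))
            continue
        by_structure[key][field].add(value)
    return by_structure, rejected


merged, rejected = aggregate(records)
print(sorted(merged[ASPIRIN]["synonym"]))
print(len(rejected))
```

In this sketch, data from two different sources ends up in a single record because both rows share one structure key, while the row without a structure is set aside.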
What about other chemical species?
• Polymers
• Gels (& other supramolecular species)
• Ordered inorganics:
– Minerals etc.
• Organometallics
• Biological macromolecules: Proteins, antibodies, DNA
Why are structures so important?
• Fundamental language of chemists
– Concise
– Contain huge amount of data
• How else do we share chemical information?
– Common/trivial names
– Systematic names
– Registry numbers
These are abstractions/derivatives
What’s in a name?
Systematic Name:
(1S,3S,5R,9R,10R,11R,13R,14S,15R,17S,18R,19R,23Z,25R,27S,29S,31S,33S,36S,37S,38R,41S,43S,49S)-11-[(4S,5E)-7-chloro-4-hydroxy-2-methylideneocta-5,7-dien-1-yl]-10,14,15,17,27,43-hexahydroxy-31-methoxy-18,36,38,43,49-pentamethyl-39-methylidene-7,35-dioxo-8,12,45,46,47,48,50-heptaoxaheptacyclo[39.3.1.1~1,5~.1~9,13~.1~15,19~.1~25,29~.1~29,33~]pentacont-23-ene-3,37-diyl diacetate
Spongistatin 1
Getting started
• Quick and easy
• No special software
• No need to register/login to get started
• http://www.chemspider.com
• Or http://cs.m.chemspider.com (mobile)
ChemSpider search
• Chemical name
• Tradename or synonym
• Registry number or other identifiers (SMILES, InChI)
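A search box that accepts several kinds of input has to work out what it was given. The following is a minimal sketch of that idea, not a description of how ChemSpider handles queries: the function names are hypothetical, and only two cases are recognised (InChI strings by their "InChI=" prefix, and CAS Registry Numbers by their format and check digit). Everything else, including SMILES, falls through to a generic category because SMILES strings cannot be reliably told apart from names by pattern alone.

```python
import re

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas(text):
    """Return True if text has CAS RN format and a correct check digit."""
    match = CAS_PATTERN.match(text)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... from right to left; the sum
    # modulo 10 must equal the final check digit.
    total = sum(int(d) * w for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def classify_query(text):
    """Hypothetical helper: guess which kind of identifier was typed."""
    text = text.strip()
    if text.startswith("InChI="):
        return "inchi"
    if is_valid_cas(text):
        return "registry_number"
    return "name_or_other"  # names, synonyms, SMILES, anything else


print(classify_query("50-78-2"))            # aspirin's registry number
print(classify_query("InChI=1S/CH4/h1H4"))  # InChI for methane
print(classify_query("Aspirin"))
```

For 50-78-2 the weighted sum is 8×1 + 7×2 + 0×3 + 5×4 = 42, and 42 mod 10 = 2, which matches the final digit.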
ChemSpider results
• The record view
ChemSpider results
• Systematic names, synonyms and other identifiers
ChemSpider results
Validated synonyms can be used to query other services, e.g. Google Scholar
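As a rough illustration of querying another service with a synonym, the sketch below builds a search link from a name. It assumes the public Google Scholar search URL pattern (`/scholar?q=...`), which is not a documented API, and the `scholar_url` helper is hypothetical; how ChemSpider itself constructs these links is not stated here.

```python
from urllib.parse import urlencode


def scholar_url(synonym):
    """Hypothetical helper: build a Google Scholar search link for a name."""
    # Wrap the name in double quotes so a multi-word synonym is
    # searched as a phrase, then URL-encode the whole query.
    query = urlencode({"q": f'"{synonym}"'})
    return "https://scholar.google.com/scholar?" + query


print(scholar_url("2-Acetoxybenzoic acid"))
```

Using a validated synonym matters here: a misspelled or ambiguous name would still produce a valid link, but the results would not reliably refer to the compound in the record.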
ChemSpider results
• Find articles on this compound from RSC journals and books
• Also link to PubMed
ChemSpider results
• Patent citations provided by:
– SureChem
– Google Patents
ChemSpider results
• User or publisher recommended articles
ChemSpider results
• Supplier catalogues and links to other data sources
ChemSpider results
• Predicted property data from ACD/Labs and others
• Where available we also display experimental data
Search by structure
• The record view for codeine
ChemSpider results
• Embedded spectra
Adding data to ChemSpider
• Users can contribute data and use ChemSpider as a shared resource for their research group
ChemSpider Synthetic Pages - CSSP
• An online database of synthetic reactions
Synthetic Pages - Online publishing
• Deposit synthetic procedures on Synthetic Pages
– Share/Highlight your chemistry
• Emphasis on reliable/robust chemistry
– Not so easy to gauge in traditional publications
– Platform allows users to comment/query protocol
– Share good practice (including safety information)
• Cite the syntheses in your CV (DOIs)
• Contribute quality syntheses to a growing database of reactions
ChemSpider Synthetic Pages
ChemSpider as a resource
• Download and reuse structures
• 2D and 3D images
• Spectra, crystal structures
• Synthetic procedures
Summary
• Aggregate, integrate and link data from across the internet
• Over 27 million structures from ~400 data sources
• Linked to vendors, literature, online databases (Open & Private), Open Notebook Science, patents
• Providing access to types of data that traditional databases don’t link to: Podcasts, Blogs, Videos
Summary
• Access to chemistry information any time, anywhere
• Free – no need to register
• Data isn’t perfect (this reflects the data sources that we cover) – but still lots of value
• Machine processing and crowdsourced curation improve quality
• Users can share data, raise their profile, get more value from their data
ChemSpider needs you
• What can I do in ChemSpider?
– Curate and verify names
– Deposit data
– Add links to your publications
– Tell your colleagues
– Use the site and submit feedback
Thank you
Email: [email protected]
Twitter: ChemSpider
http://www.chemspider.com
http://cssp.chemspider.com/
ChemSpider Questions
• Can I search for reactions/synthetic transformations?
– Not currently
• Why would I use this rather than SciFinder®, Reaxys® or similar?
– ChemSpider is not a free replacement for such tools
– Overlap on some areas covered by such tools
– Different users will get different things out of ChemSpider
– Pick the right tool for the job
• Do you have X, Y, Z data?
– All the data we have is in the record
– Look for links to relevant data sources
Common ChemSpider Questions
• There are no spectra on record X. Why?
– We don’t have a lab or spectrometers: we need users to help make the data richer
• How can I trust the data in ChemSpider?
– It is healthy to always assess the data that you use
– Ensure that you examine the provenance
– The data in ChemSpider is constantly open to scrutiny, criticism and revision by anyone (tens of thousands of scientists: experts in their fields)
– We are actively trying to improve the quality of the data all of the time
– We have built-in mechanisms for facilitating quality assurance