Facilitating Scientific Discovery through Crowdsourcing and Distributed Participation, Antony Williams, NETTAB, October 17th 2013

Page 1: Facilitating Scientific Discovery through ... - openphacts.org · • Let’s map together all historical chemistry data and build systems to integrate new data • Heck, let’s

Facilitating Scientific Discovery through Crowdsourcing and Distributed Participation

Antony Williams

NETTAB

October 17th 2013

If it was not just about me…

• Together we might:

• build an encyclopedia

• …and rate restaurants

• …share book reviews

• …and movie reviews

• …and reviews of service providers

• …organize sit-ins and social action

• …and more data might just be Open

Crowdsource the galaxy

Various ways to contribute

Where Am I From?

What can be done with Big Data

Patients Like Me

Let’s Change the World

• Let’s map together all historical chemistry data and build systems to integrate new data

• Heck, let’s integrate chemistry and biology data and add in disease data too

• Let’s model the data and see if we can extract new relationships – quantitative and qualitative

• Let’s make it all available on the web

That’s a BIG Request

What About Something Smaller?

• We’re going to map the world

• We’re going to take photos of as many places as we can and link them together

• We’ll let people annotate and curate the map

• Then let’s make it available free on the web

• We’ll make it available for decision making

• Put it on Mobile Devices, Give it Away

I’m from here…

Wikipedia

Moel-Y-Parc

I care…I want to contribute…

The Power of Contribution

How do you spell Afonwen?

And the Welsh know!

Whoa…

• So the world can be mapped…

• We can enter a 3D environment within the map

• We can add annotations

• We can use the data, we can reference it, we can extract it, we can make decisions with it

• And we can do it on our lap, in our hands

• Let’s crowdsource chemistry

Chemistry Data is Everywhere

In a Galaxy far, far away…

• Build a structure-centric hub

• Aggregate structure-based data and integrate

• Link to additional data sources

• Patents

• Publications

• Vendors

• Models

A LITTLE Chemistry First

Structural Diagrams

Analytical Data

Does Stereochemistry Matter?

Does one stereocenter matter?

Distaval, Talimol, Nibrol, Sedimide, Quietoplex, Contergan, Neurosedyn, Softenon, Thalidomide

Structural Representations

The InChI Standard

InChIKeys

Search the Web by Structure

The Quality of Chemical Data Online

What is the Structure of Vitamin K?

A lipid cofactor that is required for normal blood clotting. Several forms of vitamin K have been identified: VITAMIN K1 (phytomenadione) derived from plants, VITAMIN K2 (menaquinone) from bacteria & synthetic naphthoquinone provitamins, VITAMIN K3 (menadione).

What is the Structure of Vitamin K1?

Chemical Abstracts Service

Wikipedia

Wolfram Alpha

DailyMed

People Use Trusted Resources…

We can all help clean it up…

How will it improve?

Participation and contribution

Building a chemistry hub: issues

ALL Different, ALL “Domoic Acids”

ALL Different, ALL “Domoic Acids” ONE is “correct”

The EXPERTS must get it right?!

Web 2.0 Contribution

• We have been contributing to the web for a long time already – but how much in chemistry?

• A few blogs, an increasing amount of tweeting, but what about data sharing in chemistry?

Challenging a Publication

This story is from only 11 years ago!

Oops…

>2 Years to Resolution

The New Way of Challenging

Challenging Science…

Collaboration towards completion

Detailed constructive dialog

Oxidation by Sodium Hydride?

The Blogosphere Analyzes…

How much is in the archives?

Data Mine the Archives

• Imagine the power of data-mining the archives of the publishers

• Generate nanopublications from the articles and make them available to the reasoning engines

• Now imagine extracting all the “data” and freeing it up to the semantic web.

• MORE LATER ON THAT!

Open Notebook Science Analysis

Oxidation by Sodium Hydride?

THIS IS A COP OUT!!!

The Blogosphere “Discusses”…

Question Everything Online:

www.dhmo.org

Chemistry is Dangerous!

http://tinyurl.com/cl2awnj

Chemistry is Dangerous

Florida DJs May Face Felony for April Fools' Water Joke

“… told their listeners that "dihydrogen monoxide" was coming out of the taps throughout the Fort Myers area.”

How do you recognize good vs bad?

Is this real?

Junk vs Real

• “We then established a collaboration with professor Sum Ting Wong, a fugitive from the North Korean University Hu Yu Hai Ding”

• “…identified as the new protein Wai So Dim”

What is real, what is fake?

Helping to change science

• Participation and contribution

• Immediacy of action

• Platforms for contribution

• Openness…whatever that is

Getting Called Out in Public…

Rules for Licensing Data

Challenged in the Twittersphere

Annotating Articles Today…

Attribution to me…

Back to Chemistry….

• ANYBODY can annotate a record on ChemSpider

• Registered users can deposit new data

  • Compounds

  • Reactions

  • Links

  • Analytical Data

  • Images

  • Movies

  • Data sets

• Registered users can validate existing data

CURATION Search “Vitamin H”

“Curate” Identifiers

Data Validation is Exacting Work

It is so difficult to navigate…

• What’s the structure?

• Are they in our file?

• What’s similar?

• What’s the target?

• Pharmacology data?

• Known Pathways?

• Working On Now?

• Connections to disease?

• Expressed in right cell type?

• Competitors?

• IP?

• Open PHACTS Project

• Develop a set of robust standards

• Integrate Chemistry and Biology data by implementing the standards in a semantic integration hub

• Deliver services to support drug discovery programs in pharma and public domain

• INITIALLY 22 partners, 8 pharmaceutical companies, 3 biotechs

Guiding principle is open access, open usage, open source: the key to standards adoption

ChemSpider serving RDF and the semantic web

• Using RDF permalinks

• http://www.chemspider.com/Chemical-Structure.7787.rdf

• Using a Search Term

• http://www.chemspider.com/rdf.ashx?q=cyclohexane

• http://rdf.chemspider.com/cyclohexane
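
The access patterns on this slide can be sketched as small URL builders. This is a minimal sketch in Python that only assumes the URL shapes shown above; the function names are hypothetical, nothing is fetched, and the 2013-era endpoints may no longer resolve.

```python
from urllib.parse import quote


def rdf_permalink(record_id: int) -> str:
    """RDF permalink for a ChemSpider record ID, following the slide's pattern."""
    return f"http://www.chemspider.com/Chemical-Structure.{record_id}.rdf"


def rdf_search_url(term: str) -> str:
    """RDF lookup by search term via the rdf.ashx handler shown on the slide."""
    return f"http://www.chemspider.com/rdf.ashx?q={quote(term)}"


def rdf_subdomain_url(term: str) -> str:
    """RDF lookup by search term via the rdf.chemspider.com host shown on the slide."""
    return f"http://rdf.chemspider.com/{quote(term)}"


if __name__ == "__main__":
    print(rdf_permalink(7787))
    print(rdf_search_url("cyclohexane"))
    print(rdf_subdomain_url("cyclohexane"))
```

Percent-encoding the search term is a design choice of this sketch so that names containing spaces still form valid URLs; the slide itself only shows a single-word term.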

RDF and the semantic web

The issue of identifiers

Aspirin on ChemSpider

What is Gleevec?

[Figure: Gleevec records in PubChem, DrugBank and ChemSpider; labels: Imatinib, Mesylate]

Dynamic Equality

Strict (Analysing) vs. Relaxed (Browsing)

LinkSet#1 {
  chemspider:gleevec hasParent imatinib ...
  drugbank:gleevec exactMatch imatinib ...
}

chemspider:gleevec drugbank:gleevec
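
One way to read the slide is that the same linkset gives different answers depending on how strictly links are interpreted. The sketch below is an illustration in Python under that assumption: the "strict" lens trusts only exactMatch, the "relaxed" lens also follows hasParent. The lens definitions, function names, and data layout are hypothetical and are not taken from the Open PHACTS implementation.

```python
# Links copied from the slide as (subject, predicate, object) triples.
LINKSET = [
    ("chemspider:gleevec", "hasParent", "imatinib"),
    ("drugbank:gleevec", "exactMatch", "imatinib"),
]

# Assumption: which predicates each lens treats as "the same thing".
LENSES = {
    "strict": {"exactMatch"},
    "relaxed": {"exactMatch", "hasParent"},
}


def equivalence_classes(links, lens):
    """Group identifiers joined by predicates the lens accepts (union-find)."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path compression
            node = parent[node]
        return node

    allowed = LENSES[lens]
    for subject, predicate, obj in links:
        root_a, root_b = find(subject), find(obj)  # registers both identifiers
        if predicate in allowed:
            parent[root_a] = root_b

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


def same_thing(a, b, links, lens):
    """True if a and b fall into one equivalence class under the given lens."""
    return any(a in group and b in group for group in equivalence_classes(links, lens))


if __name__ == "__main__":
    for lens in ("strict", "relaxed"):
        print(lens, same_thing("chemspider:gleevec", "drugbank:gleevec", LINKSET, lens))
```

Under these assumptions the two Gleevec records are distinct for strict analysis but merge when browsing with the relaxed lens.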

CVSP: chemical validation

Free chemistry validation platform that performs:

• Structure validation

  • Atoms

  • Bonds

  • Valence

  • Stereo

  • If aromatic, check that it can be uniquely dearomatized

  • Strongest acid not ionized first in partially ionized system

• Cross-matching of SDF fields

  • Synonyms

  • InChIs

  • SMILES

But what about Data Quality???
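
To make one of the listed checks concrete, here is a toy valence check in Python. It is a sketch only: the valence table is deliberately incomplete, charges and aromatic bonds are ignored, and the data layout and function name are assumptions rather than anything taken from CVSP.

```python
# Typical maximum valences for a few neutral atoms (incomplete, for illustration).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1, "Cl": 1}


def valence_errors(atoms, bonds):
    """Return messages for invalid bonds, unknown elements and exceeded valences.

    atoms: list of element symbols.
    bonds: list of (i, j, order) tuples with 0-based atom indices.
    """
    used = [0] * len(atoms)
    errors = []
    for i, j, order in bonds:
        in_range = 0 <= i < len(atoms) and 0 <= j < len(atoms)
        if not in_range or i == j:
            errors.append(f"bond ({i}, {j}) refers to invalid atoms")
            continue
        used[i] += order
        used[j] += order
    for index, (symbol, total) in enumerate(zip(atoms, used)):
        limit = MAX_VALENCE.get(symbol)
        if limit is None:
            errors.append(f"atom {index}: unknown element {symbol!r}")
        elif total > limit:
            errors.append(f"atom {index} ({symbol}): valence {total} exceeds {limit}")
    return errors


if __name__ == "__main__":
    # Heavy-atom skeleton of methanol: no errors expected.
    print(valence_errors(["C", "O"], [(0, 1, 1)]))
    # A carbon with five hydrogens: flagged.
    atoms = ["C", "H", "H", "H", "H", "H"]
    bonds = [(0, k, 1) for k in range(1, 6)]
    print(valence_errors(atoms, bonds))
```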

Validation

Standardization

• Custom processing lets the user put together a workflow from a list of pre-defined standardization modules
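
The idea of assembling a workflow from a list of pre-defined modules can be sketched as function composition. The Python below is an illustration only: the module names are hypothetical, records are plain dictionaries, and picking the longest dot-separated SMILES fragment is a crude string-level stand-in for real salt stripping, which a platform like CVSP would do on the structure itself.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Module = Callable[[Record], Record]


def strip_whitespace(record: Record) -> Record:
    """Trim stray whitespace from every field."""
    return {key: value.strip() for key, value in record.items()}


def keep_largest_fragment(record: Record) -> Record:
    """Keep the longest dot-separated SMILES fragment (crude proxy for salt stripping)."""
    fragments = record["smiles"].split(".")
    return {**record, "smiles": max(fragments, key=len)}


# Hypothetical registry of pre-defined modules a user can pick from.
MODULES: Dict[str, Module] = {
    "strip_whitespace": strip_whitespace,
    "keep_largest_fragment": keep_largest_fragment,
}


def build_workflow(names: List[str]) -> Module:
    """Compose the named modules, in order, into one callable."""
    steps = [MODULES[name] for name in names]

    def run(record: Record) -> Record:
        for step in steps:
            record = step(record)
        return record

    return run


if __name__ == "__main__":
    workflow = build_workflow(["strip_whitespace", "keep_largest_fragment"])
    print(workflow({"name": " sodium acetate ", "smiles": " CC(=O)[O-].[Na+] "}))
```

Keeping each module a pure function from record to record means the order of the list is the whole workflow definition, which is one plausible way to let users rearrange steps.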

Data Review

DrugBank

DrugBank dataset (6516 records)

~60 records that can’t be dearomatized unambiguously

Examples shown: DB04283, DB04462

Nonsensical bonds

~30 records with bonds that do not make sense

Examples shown: DB04283, DB04009

Mismatches – which is correct?

~40 records where InChIs did not match the structure

DrugBank ID: DB00755

InChI=1S/C20H28O2/c1-15(8-6-9-16(2)14-19(21)22)11-12-18-17(3)10-7-13-20(18,4)5/h6,8-9,11-12,14H,7,10,13H2,1-5H3,(H,21,22)/b9-6+,12-11+,15-8+,16-14+

DrugBank ID: DB00614
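
A very lightweight version of this kind of cross-check compares only the formula layer of the InChI against a formula stored elsewhere in the record. The Python sketch below assumes the formula to compare is already in Hill order, as InChI writes it; it is far weaker than regenerating the InChI from the structure, and the function names are hypothetical.

```python
SLIDE_INCHI = (
    "InChI=1S/C20H28O2/c1-15(8-6-9-16(2)14-19(21)22)11-12-18-17(3)10-7-13-20(18,4)5"
    "/h6,8-9,11-12,14H,7,10,13H2,1-5H3,(H,21,22)/b9-6+,12-11+,15-8+,16-14+"
)


def inchi_formula(inchi: str) -> str:
    """Return the formula layer, i.e. the first layer after the version prefix."""
    prefix, _, rest = inchi.partition("/")
    if not prefix.startswith("InChI=") or not rest:
        raise ValueError("not an InChI string with a formula layer")
    return rest.split("/", 1)[0]


def formula_matches(inchi: str, expected_formula: str) -> bool:
    """Compare the InChI formula layer with a formula taken from another field."""
    return inchi_formula(inchi) == expected_formula


if __name__ == "__main__":
    print(inchi_formula(SLIDE_INCHI))
    print(formula_matches(SLIDE_INCHI, "C20H28O2"))
```

A mismatch here proves the fields disagree, but a match does not prove the structure is right, since isomers share a formula.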

Stereobond issues

7 records with 2 stereo bonds at chiral atoms

Examples shown: DB08128, DB06287

J. Brecher, IUPAC Graphical Representation of Stereochemical Configuration, Section ST-1.1.10

Chemistry Validation and Standardization Platform (CVSP) at cvsp.chemspider.com

• Validation

• Standardization

• Parent generation

RDF Export

Data

Chemistry Data out to OPS

Data is being imported from ChemSpider to Open PHACTS in RDF/Turtle

RDF/VoID

– VoID is an RDF Schema vocabulary for expressing metadata about RDF datasets.

• skos:exactMatch (Simple Knowledge Organization System), e.g. to link compounds in OPS with compounds in ChEBI.

• skos:closeMatch, e.g. to link Stereo Insensitive Parents to their Children within OPS.

• skos:relatedMatch, e.g. to link Parent compounds that contain others as Fragments.

– Recommendations on VoID specified by Manchester Uni. here: http://www.cs.man.ac.uk/~graya/ops/2012/ED-datadesc/
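
The three SKOS mapping properties named above can be written out as Turtle triples. The Python sketch below only formats strings; the SKOS namespace is the standard W3C one, while the example.org URIs and the function name are placeholders invented for illustration, not real OPS or ChEBI identifiers.

```python
SKOS_NAMESPACE = "http://www.w3.org/2004/02/skos/core#"
ALLOWED_PREDICATES = {"exactMatch", "closeMatch", "relatedMatch"}


def linkset_turtle(links):
    """Serialize (subject_uri, predicate, object_uri) links as Turtle text."""
    lines = [f"@prefix skos: <{SKOS_NAMESPACE}> .", ""]
    for subject, predicate, obj in links:
        if predicate not in ALLOWED_PREDICATES:
            raise ValueError(f"unsupported mapping property: {predicate}")
        lines.append(f"<{subject}> skos:{predicate} <{obj}> .")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder URIs; real linksets would use the datasets' own identifiers.
    example_links = [
        ("http://example.org/ops/compound/1", "exactMatch", "http://example.org/chebi/1"),
        ("http://example.org/ops/parent/2", "closeMatch", "http://example.org/ops/compound/3"),
        ("http://example.org/ops/parent/4", "relatedMatch", "http://example.org/ops/parent/5"),
    ]
    print(linkset_turtle(example_links))
```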

RDF Export

Ongoing Updates to Linksets

• Ongoing data checking activities on ChemSpider

• Does Crowdsourcing work?

• But WHY would people validate ChemSpider?

• What can be done to encourage participation?

• How will we deliver on Barend’s Million Minds approach – he has ideas…see later

Crowdsourcing – does it work?

• >200 people have EVER deposited or curated data

• Database hosts make the largest contributions

• ChemSpider staff do the most curation

Contributions

Curation Activities

• 2009 – 8255 curations by 43 people

• 2010 – 10014 curations by 66 people

• 2011 – 16025 curations by 116 people

• 2012 – 13127 curations by 74 people
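
The yearly figures above can be turned into an average number of curations per curator. This is a small arithmetic sketch in Python using only the numbers on the slide; the function name is hypothetical, and a mean hides how unevenly the work is spread across individuals.

```python
# (total curations, number of curators) per year, copied from the slide.
CURATION_STATS = {
    2009: (8255, 43),
    2010: (10014, 66),
    2011: (16025, 116),
    2012: (13127, 74),
}


def curations_per_person(stats):
    """Mean curations per curator for each year."""
    return {year: total / people for year, (total, people) in stats.items()}


if __name__ == "__main__":
    for year, mean in curations_per_person(CURATION_STATS).items():
        print(f"{year}: {mean:.1f} curations per curator")
```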

Depositors

• 2009 – 91 unique depositors

• 2010 – 120 unique depositors

• 2011 – 99 unique depositors

• 2012 – 120 unique depositors

• “The crowd is small – very small”

Rewards and Recognition

• The badgesonomy culture of recognition is growing.

• Badges are commonplace

• FourSquare

• Klout

Rewards and Recognition

• Rewards and Recognition integrated across all ChemSpider related projects

• Including paths to expose such recognition on AltMetrics platforms

• MAY work for young scientists

The Alt-Metrics Manifesto

http://altmetrics.org/manifesto/

ImpactStory

Future Recognition on

Scientists AltMetrics

Usage, Citations, Social Media, etc

Detailed Usage Statistics

What makes a Scientist Notable?

What will it be in the future???

We Are All Being Quantified…

Page 125

How I am Quantified…

Page 126

Page 127

Page 128

Likely Enabled by

• Persistent unique digital identifier

• Integrates into workflows such as manuscript and grant submission

• Supports automated linkages with your professional activities

Page 129

Summary

• A grand vision for semantic data linking is coming to fruition. We have a lot to do.

• Data quality enhancements can come through crowdsourcing (and intelligent robots)

• Participation may be driven by new approaches to rewards and recognition

• Publishers are understanding the value of data, but still guard it, and generally don’t have systems to handle it. It’s changing.

• There is so much value in the historical data

Page 130

Inside our Publication Archive

• How much data is in the RSC archive, in the publications and in the supplementary info?

• How many compounds for ChemSpider?

• How many syntheses for ChemSpider reactions?

• How many characterization measurements?

• Property Data

• Spectral Data

• Graphs and charts to be used for modeling?

Page 131

What if we could capture it all?

Digitally Enhancing the RSC Archive

Page 132

Start with data in publications

Page 133

Turn “Figures” Into Data

Spectral FIGURE

Extracted Spectrum

Page 134

Text Mining

The N-(β-hydroxyethyl)-N-methyl-N'-(2-trifluoromethyl-1,3,4-thiadiazol-5-yl)urea prepared in Example 6 , thionyl chloride ( 5 ml ) and benzene ( 50 ml ) were charged into a glass reaction vessel equipped with a mechanical stirrer , thermometer and reflux condenser .

The reaction mixture was heated at reflux with stirring , for a period of about one-half hour .

After this time the benzene and unreacted thionyl chloride were stripped from the reaction mixture under reduced pressure to yield the desired product N-(β-chloroethyl)-N-methyl-N'-(2-trifluoromethyl-1,3,4-thiadiazol-5-yl)urea as a solid residue
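As a rough illustration of what text mining a procedure like this can involve, the sketch below pulls reagent names and amounts out of the first sentence above. It is a minimal example and not the pipeline behind the slides: the pattern `QUANTITY`, the stop-word list, the unit list and the function `extract_quantities` are all hypothetical, and the regular expression assumes the whitespace-tokenized layout shown here, with reagent names of at most two lowercase words.

```python
import re

# First sentence of the procedure above, in its whitespace-tokenized form.
TEXT = (
    "The N-(β-hydroxyethyl)-N-methyl-N'-(2-trifluoromethyl-1,3,4-thiadiazol-5-yl)urea "
    "prepared in Example 6 , thionyl chloride ( 5 ml ) and benzene ( 50 ml ) were "
    "charged into a glass reaction vessel equipped with a mechanical stirrer , "
    "thermometer and reflux condenser ."
)

# Hypothetical pattern: one or two lowercase words (not starting with a
# stop word), followed by "( <number> <unit> )".
QUANTITY = re.compile(
    r"\b(?!(?:and|or|with|in)\b)"       # the name must not start with a stop word
    r"([a-z]+(?: [a-z]+)?)"             # reagent name, one or two lowercase words
    r" \( (\d+(?:\.\d+)?) "             # numeric amount inside the parentheses
    r"(ml|g|mg|mol|mmol) \)"            # assumed, non-exhaustive list of units
)


def extract_quantities(text):
    """Return (reagent, amount, unit) tuples found in tokenized text."""
    return [
        (name, float(amount), unit)
        for name, amount, unit in QUANTITY.findall(text)
    ]


if __name__ == "__main__":
    for reagent, amount, unit in extract_quantities(TEXT):
        print(f"{reagent}: {amount} {unit}")
```

A pattern this simple will miss systematic names such as the urea starting material, which is one reason real chemical text mining relies on dedicated name recognition rather than regular expressions alone.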

Page 135

ChemSpider Reactions

Page 136

Thank You

Email: [email protected]

Twitter: ChemConnector

Personal Blog: www.chemconnector.com

SLIDES: www.slideshare.net/AntonyWilliams