28 October 2005 Jeremy Frey, University of Southampton 1 “The CombeChem Experience” CICC Workshop 28 October 2005 Bloomington Indiana


“The CombeChem Experience”

CICC Workshop

28 October 2005

Bloomington Indiana

Chemical Data & Chemical Grids

• Chemical data, information & knowledge
– Experiments, Simulation & Computation
• Exponential growth in generation of data
– Need automatic capture of metadata
• Start in the laboratory – pervasive physical grid
• Computational chemistry very significant source
• Software to be used by chemists so must be simple to support & maintain – autonomic

Chemical Semantic Grid

• RDF (Resource Description Framework)
– From the semantic web world
– Best system for the description of chemical data and processes
– Achieves the same as XML + unique identifiers + linking up in a simpler manner

• Large scale triple stores (so far up to 50 Million triples of molecular structures and properties)

• Need for scalable software solutions
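
As a rough illustration of what "triples" means here, the following Python sketch stores a few facts as (subject, predicate, object) tuples and follows a link from a molecule to one of its properties. It is not the CombeChem data model: the `ex:` identifiers, the predicate spellings and the helper functions `objects` and `describe` are all hypothetical, and a real triple store would use full URIs and an indexed back end rather than a Python list.

```python
from collections import defaultdict

# Each fact is one (subject, predicate, object) triple.
# The "ex:" names are hypothetical placeholders, not CombeChem identifiers.
triples = [
    ("ex:molecule1", "ex:has-empirical-formula", "C12H13NO2"),
    ("ex:molecule1", "ex:has-cas", "22049-19-0"),
    ("ex:molecule1", "ex:has-property", "ex:mp1"),
    ("ex:mp1", "ex:type", "ex:MeltingPoint"),
    ("ex:mp1", "ex:has-value", "150"),
]


def objects(store, subject, predicate):
    """Return every object matching the pattern (subject, predicate, ?)."""
    return [o for s, p, o in store if s == subject and p == predicate]


def describe(store, subject):
    """Group all predicates and objects recorded for one subject."""
    out = defaultdict(list)
    for s, p, o in store:
        if s == subject:
            out[p].append(o)
    return dict(out)


# Follow a link: the molecule points at a property node,
# and that node carries the value.
prop = objects(triples, "ex:molecule1", "ex:has-property")[0]
melting_point = objects(triples, prop, "ex:has-value")[0]
print(melting_point)
print(describe(triples, "ex:molecule1"))
```

Adding a new kind of fact is just another tuple with a new predicate, with no schema to change.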

He is charged with expressing contempt for meta-data

Permanent, documented and primary record of laboratory observations

Observations are never collected on note pads, filter paper or other temporary paper for later transfer into a notebook.

If you are caught using the “scrap of paper” technique, your improperly recorded data may be confiscated by your TA.

Digital record at source: don’t try to add metadata after the fact

Record the chemical processes as well as the data in RDF

Physical World

RDF

Old technology does not scale

Problems with relational databases
– information too variable and rapidly changing types
– multimedia, images are the output of current experiments

Create large semantically rich database of structures and properties

URI - InChI

Property in RDF

<c:OrganicMolecule rdf:about="file:///storage/ba8efc2ce0edada69d63b02d1b8630c6.rdf">
  <c:has-inchi>1.12Beta/C12H13NO2/c1-2-15-8-9-5-6-11(14)12-10(9)4-3-7-13-12/h1H3,2H2,3-7H,8H2,14H</c:has-inchi>
  <c:has-cas>22049-19-0</c:has-cas>
  <c:has-empirical-formula>C12H13NO2</c:has-empirical-formula>
  <c:has-stereocentres>0</c:has-stereocentres>
  <c:has-property>
    <c:MeltingPoint>
      <c:has-information>
        <c:Information>
          <c:has-value>150</c:has-value>
          <c:has-uncertainty>
            <c:Range>
              <c:has-value>16</c:has-value>
            </c:Range>
          </c:has-uncertainty>
        </c:Information>
      </c:has-information>
    </c:MeltingPoint>
  </c:has-property>
</c:OrganicMolecule>

Currently testing on 200,000 compounds but about to go up by an order of magnitude

3Store is a scalable solution
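
To show how a record like the one on this slide might be read back by software, here is a minimal Python sketch using the standard library's `xml.etree.ElementTree`. The slide omits the namespace declarations, so the `c:` namespace URI below is a hypothetical stand-in and the enclosing `rdf:RDF` element is added only to make the fragment parseable; only the RDF namespace URI is the standard one. The slide does not state units for the melting point or its range, so none are assumed.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
C_NS = "http://example.org/chem#"  # hypothetical: the slide omits the real URI

# Abridged copy of the slide's record, wrapped in rdf:RDF so that the
# rdf: and c: prefixes are declared.
doc = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:c="{C_NS}">
  <c:OrganicMolecule rdf:about="file:///storage/ba8efc2ce0edada69d63b02d1b8630c6.rdf">
    <c:has-cas>22049-19-0</c:has-cas>
    <c:has-empirical-formula>C12H13NO2</c:has-empirical-formula>
    <c:has-property>
      <c:MeltingPoint>
        <c:has-information>
          <c:Information>
            <c:has-value>150</c:has-value>
            <c:has-uncertainty>
              <c:Range>
                <c:has-value>16</c:has-value>
              </c:Range>
            </c:has-uncertainty>
          </c:Information>
        </c:has-information>
      </c:MeltingPoint>
    </c:has-property>
  </c:OrganicMolecule>
</rdf:RDF>"""

ns = {"rdf": RDF_NS, "c": C_NS}
root = ET.fromstring(doc)
mol = root.find("c:OrganicMolecule", ns)

# Walk down the nested property structure to the Information node.
info = mol.find(
    "c:has-property/c:MeltingPoint/c:has-information/c:Information", ns
)

record = {
    "about": mol.get(f"{{{RDF_NS}}}about"),
    "cas": mol.findtext("c:has-cas", namespaces=ns),
    "formula": mol.findtext("c:has-empirical-formula", namespaces=ns),
    # Direct child of Information: the measured value.
    "melting_point": float(info.findtext("c:has-value", namespaces=ns)),
    # Nested under has-uncertainty/Range: the width of the reported range.
    "uncertainty_range": float(
        info.findtext("c:has-uncertainty/c:Range/c:has-value", namespaces=ns)
    ),
}
print(record)
```

A generic XML parser is enough for a single fixed record like this; querying across many molecules is the job the slide assigns to a triple store such as 3Store.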

You see that dark spooky image on the screen? That’s your credit history coming back to haunt you?

Provenance
Record experiments
Make data available (e-crystals, e-Bank)

Security and trust for experiments and data

Experiments on the Grid
National Crystallography Service

Chemistry data: private or public, open or controlled access

Subversive and furtive exploitation of data

Data

CAS

PubMed

CML

RDF

E-Bank
E-crystals

R4L

Standards? Interoperable? Convertible? Useable?

Linking Chemistry to the Life-Sciences and the Environment

• Need to link up small and large molecule chemistry
– Bio-Informatics
– Medical informatics
• Need to link in place and time
– Environmental Informatics
– Spatial-Temporal issues at a cellular and organism level

• Statistical Modelling

Making sure Chemistry will not suffer from a data crunch

All I’m saying is that now is the time to develop the technology to deflect an asteroid