An introduction to chemistry ontology Colin Batchelor, Royal Society of Chemistry batchelorc@rsc.org 2009-07-21


An introduction to chemistry ontology

Colin Batchelor, Royal Society of Chemistry

batchelorc@rsc.org


About you

Ontology experience
Chemistry experience
What are you doing at the moment?


An ontological toolkit for chemistry

1. Classes vs. instances
2. Regular polysemy
3. Granularity
4. What classes should go into the ontology? Ontological dependence and dispositions
5. Identity
6. BFO’s independent continuants and chemistry
7. An actually-existing chemical ontology: ChEBI
8. The Sequence Ontology
9. Drugs (if we have time)


1. Classes and instances


Instances and classes

Tokens, instances, particulars

This particular cat, this particular portion of water, this particular skin cell, this particular molecule of nitrogen

Types, classes, universals, kinds

Cats in general, portions of water in general, skin cells in general, molecules of nitrogen in general


Classes and instances

This alpha-D-glucose molecule here is an instance_of…

a D-glucopyranose molecule
a D-glucose molecule
a glucose molecule
an aldohexose molecule
an aldose molecule
a monosaccharide molecule
a sugar molecule
a carbohydrate molecule
a natural product molecule
an organic molecular entity
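The list above can be read as one instance_of assertion plus a transitive is_a hierarchy. A minimal Python sketch, assuming (for illustration only) that the classes form a single chain in the order listed; real ChEBI classes can have several parents, and the names IS_A and classes_of are hypothetical:

```python
# Hypothetical single-parent is_a chain, in the order given on the slide.
IS_A = {
    "alpha-D-glucose molecule": "D-glucopyranose molecule",
    "D-glucopyranose molecule": "D-glucose molecule",
    "D-glucose molecule": "glucose molecule",
    "glucose molecule": "aldohexose molecule",
    "aldohexose molecule": "aldose molecule",
    "aldose molecule": "monosaccharide molecule",
    "monosaccharide molecule": "sugar molecule",
    "sugar molecule": "carbohydrate molecule",
    "carbohydrate molecule": "natural product molecule",
    "natural product molecule": "organic molecular entity",
}


def classes_of(asserted_class):
    """Return every class an instance belongs to, given its most specific class."""
    result = [asserted_class]
    while result[-1] in IS_A:  # follow is_a upwards until a root is reached
        result.append(IS_A[result[-1]])
    return result


# "This alpha-D-glucose molecule here" is an instance of every class in the chain.
chain = classes_of("alpha-D-glucose molecule")
```

One instance_of assertion at the most specific class is enough; membership of the more general classes follows from is_a.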


Why this is confusing

An arbitrary pyridine molecule (in a bottle) is an instance_of…

a pyridines molecule
an azaarene molecule
a monocyclic heteroarene molecule
a mancude organic heteromonocyclic parent molecule


No superclasses

N-hydroxy-L-aspartic acid is_a hydroxamic acids

hydroxamic acids is_a organic functional classes

therefore N-hydroxy-L-aspartic acid is_a organic functional classes

(source: ChEBI, May 2008) This has now been fixed.


2. Regular polysemy


Regular polysemy

Let’s say I have a banana. If I feed it to a child, it will get banana all down its front. One might have one table made of oak, and another made of banana. Last week it was frosty so I took in a plant from the back garden to the kitchen. It is a dwarf banana, so it is a banana but not a banana banana. If it ever grows any fruit, the fruit will be a dwarf banana banana. As opposed to a banana banana banana.


Regular polysemy: banana

banana_FRUIT(COUNT)
banana_FRUIT(MASS) (grinding)
banana_WOOD (grinding)
banana_TREE
banana_CONTRASTIVE REDUPLICATION FOCUS

Other regular polysemies are available (e.g. colour, leaf, brand names).


Regular polysemy: banana

“How do you feel now?”
“Banana’d out.”

“The banana milkshake is still waiting for the bill.”


Regular polysemy: Grinding

“tastes like chicken”: 447000 Google hits
“tastes like a chicken”: 2060 (actually “tastes like a chicken tamale” and the like)
“tastes like beef”: 8550
“tastes like cow”: 1310
“tastes like a cow”: 726
“tastes like a beef”: 291 (but like “tastes like a chicken”)


Regular polysemy: EXACT vs. CLASS

pyridine

fully-specified name
closed-world name (anything not mentioned is a hydrogen atom)

a pyridine

underspecified name
open-world name (anything not mentioned could be anything)
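The closed-world and open-world readings can be sketched as two matching rules. This is an illustration only: it assumes a toy representation in which a name or a molecule is a dict from ring position to substituent, and the function names are hypothetical.

```python
RING_POSITIONS = (2, 3, 4, 5, 6)  # carbon positions of the pyridine ring (N is position 1)


def matches_exact(spec, molecule):
    """Closed-world reading: a position the name does not mention must carry H."""
    return all(molecule.get(pos, "H") == spec.get(pos, "H") for pos in RING_POSITIONS)


def matches_class(spec, molecule):
    """Open-world reading: only the positions the name mentions are constrained."""
    return all(molecule.get(pos, "H") == sub for pos, sub in spec.items())


pyridine_name = {}      # the name "pyridine" mentions no substituents
pyridine = {}           # unsubstituted ring
picoline = {2: "CH3"}   # 2-methylpyridine

# "pyridine" (EXACT) accepts only the unsubstituted ring;
# "a pyridine" (CLASS) accepts both molecules.
exact_results = (matches_exact(pyridine_name, pyridine), matches_exact(pyridine_name, picoline))
class_results = (matches_class(pyridine_name, pyridine), matches_class(pyridine_name, picoline))
```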


Regular polysemy: A sixfold classification

(from Corbett, Batchelor and Copestake, LREC 2008)

EXACT: benzene molecule
CLASS: the benzene 13a
PART: attacks the benzene ring
SPECIES: atmospheric carbon
POLYMER
SURFACE: Ru(0001)


3. Granularity


Regular polysemy: GRAIN vs. BULK

“The IR and Raman spectra show that the metal interacts with the oxygen atom of the amide group and allow the vibrations of the complexed CdX2 to be characterized.”

BULK for GRAIN.

“Pure americium has a silver and white luster. At room temperatures it slowly tarnishes in dry air.”

GRAIN for BULK.


[Figure: granularity diagram.
Bulk granularity: portion of wine, portion of water, portion of ethanol.
Molecular granularity: water (CHEBI:15377), ethanol (CHEBI:16236), hydroxide (CHEBI:16234), proton (CHEBI:24636), ethoxide (CHEBI:52092), hydrogen atom, oxygen atom.
Processes: water–hydroxide + proton equilibrium; ethanol–ethoxide ion + proton equilibrium; proton transfer from ethanol to ethoxide; proton transfer from ethoxide to ethanol.
Relations shown: has_part, has_grain, has_participant, icbo, icao.]


Chloroform, ethanol

Bulk chloroform (as sold) is stabilized with small quantities of amylene or methanol or ethanol.

Bulk ethanol (as sold) contains small amounts of benzene.


4. What classes should go into the ontology? Dispositions and ontological dependence.


Natural kinds

“To say that a kind is natural is to say that it corresponds to a grouping or ordering that does not depend on humans. We tend to assume that science is successful in revealing these kinds;”

A. Bird and E. Tobin, “Natural Kinds” in Stanford Encyclopedia of Philosophy (Spring 2009 Edition), ed. E. N. Zalta, http://plato.stanford.edu/archives/spr2009/entries/natural-kinds/


Bad natural kinds

Molecules containing exactly 21 atoms.
Molecules where every atom is from a different element.
Portions of material that boil at 300 K and 3 atm.
Molecules that are completely surrounded by water.
The largest molecule in a given beaker.


Good natural kinds

Dienes
Carboxylic acids
Diatomic molecules
Ring-containing molecules
Aromatic molecules

… why?


What is a disposition?

Says BFO:

“A realizable entity that essentially causes a specific process or transformation in the object in which it inheres, under specific circumstances and in conjunction with the laws of nature. A general formula for dispositions is: X (object) has the disposition D to (transform, initiate a process) R under conditions C.”


Realizable entities

For every realizable entity (dispositions, roles, functions, powers, virtues, tendencies, and so forth) there must be a process.

The process must be alienable (there must be no mutual ontological dependence between the bearer of the entity and the process).


Alienable and inalienable processes

Alienable processes:
Breathing
Swimming
Sleeping

Inalienable processes:
Being somewhere (location)
Pointing somewhere (orientation)
(of pairs of objects) being a given distance apart (relative location)
(of pairs of objects) pointing in different directions (relative orientation)


Ontological dependence

X is ontologically dependent on Y if X cannot exist without Y existing.

Qualities cannot exist without the entities that exhibit them.

Roles cannot exist without their players.
Functions cannot exist without their exercisers.
Dispositions cannot exist without the entities that realize them.


Mutual ontological dependence (1)

George Best cannot exist without George Best’s life existing.

conversely

George Best’s life cannot exist without George Best existing.


Determination, specialization and dispositions (1)

Dispositions are relational entities.

They depend ontologically on both their bearer and the conditions in which they are realized.


Determination, specialization and dispositions (2)

Material X has disposition D to undergo process M (melting) under conditions pressure P and temperature T.

Here the conditions are determinable.


Determination, specialization and dispositions (3)

Material X has disposition D to undergo process M (dissolution at rate k) under conditions pressure P and temperature T.

Here the conditions are determinable and the process is a quantified specialization of the process of dissolution.


Surefire dispositions vs. tendencies

A more thorough account of dispositions:

X has disposition D to undergo process P at rate k with probability p under conditions C.

For surefire dispositions, p = 1.0.
For tendencies, p < 1.0.

The disposition D(C) and its associated rate k_D(C) and probability p_D(C) can be complicated mathematical functions of C.
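One way to picture this account is as a record whose rate and probability are functions of the conditions C. The sketch below is an assumption-laden illustration, not a BFO data model: the class name, the condition keys and both toy examples are hypothetical, and the rate values are placeholders.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict

Conditions = Dict[str, float]  # e.g. {"T": kelvin, "P": atm}; the keys are an assumption


@dataclass
class Disposition:
    bearer: str
    process: str
    rate: Callable[[Conditions], float]         # rate k as a function of C
    probability: Callable[[Conditions], float]  # probability p as a function of C

    def is_surefire(self, conditions: Conditions) -> bool:
        """Surefire dispositions have p = 1.0; tendencies have p < 1.0."""
        return self.probability(conditions) == 1.0


# Toy surefire disposition: ice at 1 atm melts whenever T exceeds 273.15 K.
melting = Disposition(
    bearer="portion of ice",
    process="melting",
    rate=lambda c: 1.0 if c["T"] > 273.15 else 0.0,  # placeholder rate, arbitrary units
    probability=lambda c: 1.0 if c["T"] > 273.15 else 0.0,
)

# Toy tendency: a carbon-14 atom decays within one half-life with p = 0.5,
# whatever the conditions. The half-life of about 5730 years is approximate.
decay = Disposition(
    bearer="carbon-14 atom",
    process="beta decay within one half-life",
    rate=lambda c: math.log(2) / 5730.0,  # decay constant per year
    probability=lambda c: 0.5,
)
```

Keeping rate and probability as functions rather than numbers reflects the point that both may vary with C.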


Antidotes, finks and fuses

Antidotes: Poison has the disposition to kill people unless the antidote is applied.

(Careful: does this mean X has disposition D to undergo process P except where it doesn’t?)

Finks: where the conditions C are exactly those that prevent X from undergoing process P. Example: a fuse.


Mutual ontological dependence (2)

There are many mutually ontologically dependent dispositions in chemistry.

Being an acid is the disposition to donate a proton to (or to receive an electron pair from) a base.

Being a base is the disposition to receive a proton from (or to donate an electron pair to) an acid.
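Mutual dependence can be checked mechanically once dependence is written down as a directed relation. A minimal sketch, assuming a hypothetical depends_on table built from examples in these slides; the one-way entry for qualities is included for contrast:

```python
# Hypothetical depends_on table: X maps to the things X cannot exist without.
DEPENDS_ON = {
    "being an acid": {"being a base"},
    "being a base": {"being an acid"},
    "George Best": {"George Best's life"},
    "George Best's life": {"George Best"},
    "a quality": {"its bearer"},  # one-way dependence only
}


def mutual_pairs(depends_on):
    """Return the unordered pairs {x, y} where x depends on y and y depends on x."""
    pairs = set()
    for x, targets in depends_on.items():
        for y in targets:
            if x in depends_on.get(y, set()):
                pairs.add(frozenset((x, y)))
    return pairs


pairs = mutual_pairs(DEPENDS_ON)
```

The acid/base pair and the George Best pair come out as mutual; the quality and its bearer do not.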


Grounding dispositions

There are no bare dispositions.

All dispositions ought to have a categorical (= quality or independent continuant) basis.

Metals conduct because of their band structure (= quality).

Dienes take part in Diels–Alder reactions because of their electronic structure (= quality).

Objects fall under gravitation because of their mass (= quality).


What grounds many of these natural kinds?

Structure

or, better,

Parthood


Exercise

What dispositions do the qualities inhering in the parts that define these classes ground?

CHEBI:22479 amino cyclitol glycosides
CHEBI:47788 3-oxo steroid
CHEBI:33726 canonical amino acid residue anion
CHEBI:25384 monocarboxylic acid
CHEBI:33791 gold coordination entity
CHEBI:30879 alcohol


5. Identity


Chemical entities and identity criteria

Synchronic (class)

Synchronic identity conditions are for different objects i and j at time t.

Diachronic (instance)

Diachronic identity conditions are for what may be the same object i at times t1 and t2.


Synchronic identity for small molecules

What follows is an account of identity criteria as understood by chemists expressed in terms of qualities.


Synchronic identity for small molecules

Cyclohexane = A molecule consisting of six carbon atoms and twelve hydrogen atoms with the carbons joined up in a ring.

Hexene = A molecule consisting of six carbon atoms and twelve hydrogen atoms with the carbons joined in a chain.


Boats and chairs

Boat cyclohexane = A cyclohexane with the carbon atoms in the boat conformation.

Chair cyclohexane = A cyclohexane with the carbon atoms in the chair conformation.


Qualities vs. realizable entities

Qualities are those entities that have no temporal parts, inhere in objects, and are present (though may change) if the objects they inhere in exist at all.

Realizable entities may never be realized. The only way to realize them is as an alienable process.


Qualities are (indirectly) multiply-realizable

There are many tests, many processes that can be realized (many dispositions) grounded on qualities, such as:

Triangularity
Being 2 m tall
Containing six and only six carbon atoms


Qualities that determine synchronic identity

Constitution (of a molecule)
Being cyclic (of a part)
Being linear (of a part)

Hence cyclohexane and the hexenes are different synchronically, despite each molecule containing the same number of each kind of atom.

Loss or gain of an atom or atoms changes the class of the molecule.
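The cyclohexane/hexene contrast can be checked on a toy carbon skeleton: the atom counts agree, but the "being cyclic" quality differs. A minimal sketch, assuming connected, carbon-only skeletons given as (atom, atom, bond order) triples and hex-1-ene as the representative hexene; all names are hypothetical:

```python
def hydrogen_count(n_carbons, bonds):
    """Hydrogens needed to fill each carbon's valence of four."""
    used = [0] * n_carbons
    for a, b, order in bonds:
        used[a] += order
        used[b] += order
    return sum(4 - u for u in used)


def has_ring(n_carbons, bonds):
    """A connected skeleton contains a ring iff it has at least as many bonds as atoms."""
    return len(bonds) >= n_carbons


# Six carbons joined in a ring by single bonds.
cyclohexane = [(i, (i + 1) % 6, 1) for i in range(6)]
# Six carbons joined in a chain, with one double bond at the end.
hex_1_ene = [(0, 1, 2), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]

same_formula = hydrogen_count(6, cyclohexane) == hydrogen_count(6, hex_1_ene)
different_constitution = has_ring(6, cyclohexane) != has_ring(6, hex_1_ene)
```

Both skeletons come out as C6H12, so the formula alone cannot separate the two classes; the ring test can.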


Qualities that are incidental to synchronic identity

Conformation

Hence boat and chair cyclohexane are the same molecule.

Orientation (including relative orientation)

Hence exo- and endo-puckered rings (relative to some plane) are the same ring.


Synchronic identity and granularity

Molecules with different numbers of atoms belong to different natural kinds.

Portions of material with different numbers of atoms may well belong to the same natural kind.


Criteria for diachronic identity

We could take the same criteria as for synchronic identity.

So an ethanoic acid molecule M in a portion of vinegar becomes an ethanoate ion M′ becomes a different ethanoic acid molecule M″ becomes a different ethanoate ion M′″ and so on in the course of a day.


Thought experiment

Consider a large molecule M somehow connected to a piece of glass. It is undergoing protonation and deprotonation processes at a given rate.

1. Turn your back.
2. Now look again. Is it the same molecule?

If structure determines identity then:
it is the same molecule with probability 1 – p or
a different molecule with probability p.

That can’t be right.


Some ways out

Cyclic processes
Biological processes
Information macromolecules
Synthetic reaction schemes


Cyclic processes and equilibria

If we have a reaction

A + X ⇌ B

for which the energetic barrier in both directions is low, and where X is a proton, or an otherwise stable species, then A and B are the same molecule (diachronically).
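This criterion can be sketched as a grouping procedure: species linked by a low-barrier equilibrium with a stable X fall into one diachronic identity class. The sketch below uses a small union-find; the barrier numbers, the threshold and the list of stable species are illustrative placeholders, not data from the talk.

```python
STABLE_SPECIES = {"proton", "hydrogen"}  # assumed examples of "otherwise stable" X


def identity_classes(species, reactions, max_barrier):
    """Group species A and B whenever A + X <=> B has a low barrier and X is stable."""
    parent = {s: s for s in species}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for a, x, b, barrier in reactions:
        if x in STABLE_SPECIES and barrier <= max_barrier:
            parent[find(a)] = find(b)

    groups = {}
    for s in species:
        groups.setdefault(find(s), set()).add(s)
    return sorted(sorted(g) for g in groups.values())


species = ["ethanoate", "ethanoic acid", "ethene", "ethane"]
reactions = [
    # (A, X, B, barrier in arbitrary units; the numbers are placeholders)
    ("ethanoate", "proton", "ethanoic acid", 1.0),
    ("ethene", "hydrogen", "ethane", 50.0),
]
classes = identity_classes(species, reactions, max_barrier=5.0)
```

Under these assumptions ethanoic acid and ethanoate count as the same molecule over time, while ethene and ethane stay distinct because the barrier condition fails.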


Growth processes, homologation

Other diachronic identity criteria emerge from functional processes:

RNA transcription (the nascent RNA maintains its identity on gaining bases)

Protein translation (the nascent polypeptide maintains its identity on gaining amino acid residues)

Polyketide synthesis (the nascent polyketide maintains its identity on gaining acetyl and propionyl subunits)

But this requires a notion of function.


Information macromolecules

Transfer RNAs preserve their identity under base modification.

DNAs preserve their identity under base loss, base repair, proofreading and so forth.

Polypeptides preserve their identity under methylation, phosphorylation and so forth.


Diachronic identity and RXNO

The Name Reaction Ontology
http://www.rsc.org/ontologies/RXNO
http://rxno.googlecode.com/
classifies reactions according to what they do to the ‘skeleton’ of the molecule.


Protecting groups and identity criteria

Protecting groups replace functional groups in molecules in order to ensure that only the right parts of the molecule react in a particular way.

But a given protecting group can be of similar size to the rest of the molecule.

In some steps of some syntheses, the vast majority of the mass of the relevant molecule may be protecting groups.


The paradox of the heap

Take a gold nanoparticle with 55 atoms.

Remove atoms one-by-one (assume we can do this).

At what stage does it cease to be a nanoparticle?


A further problem: Bonding and temperature

What counts as a bond depends on the temperature of the system.

The helium dimer is held together by a bond weaker than the interactions between different ends of the boats and chairs we saw earlier.


6. BFO’s independent continuants and chemistry
www.ifomis.org/bfo


A hierarchy of bonding

covalent nc–ne bonding beats
nc–me (n ≠ m) bonding beats
ionic bonding beats
hydrogen bonding beats
van der Waals bonding beats
other, rare kinds of bonding


Aside: mechanical connection

Catenanes are “Hydrocarbons having two or more rings connected in the manner of links of a chain, without a covalent bond.” (IUPAC Gold Book)

But in order to break apart a catenane, you need to break one of the covalent bonds in one of the rings.


BFO’s material entities in review

Object: An independent continuant that is spatially extended, maximally self-connected and self-contained (the parts of a substance are not separated from each other by spatial gaps) and possesses an internal unity.

ObjectAggregate: An independent continuant that is a mereological sum of separate object entities and possesses non-connected boundaries.


BFO’s material entities in review

FiatObjectPart: An independent continuant that is part of an object but is not demarcated by any physical discontinuities.

Boundary: An independent continuant that is a lower dimensional part of a spatial entity, normally a closed two-dimensional surface. Boundaries are those privileged parts of object entities that exist at exactly the point where the object is separated off from the rest of the existing entities in the world.


How do these map on to the chemical situation?

Positions are replaced by spatially-dispersed wavefunctions.

Spatial gaps and (absolute) physical discontinuities are replaced by regions where the wavefunctions have values very close to zero.


Folding

When molecules (RNA molecules, polypeptide chains) fold, the new conformation is held together by hydrogen bonding.

This means that there are no spatial discontinuities (except at the outer edges of the folded molecule), only regions where the electron density is lower than in the rest of the molecule (but certainly nowhere near zero).


Folding and boundaries

This makes it hard to talk of bona fide boundaries, though there are certainly fiat boundaries, say between nucleotides.

Best to talk in terms of tangential proper parts of the unfolded molecule.


Looking ahead: RNAO

The RNA Ontology is (amongst other things) an ontology of conformations of RNA molecules, their fiat parts and their tangential proper parts.

http://roc.bgsu.edu/

N. Leontis and co-workers, “The RNA Ontology (RNAO)”, this meeting.


Non-covalently, non-mechanically bound systems

These lie in between objects and object aggregates.

They are held together by forces weaker than those that hold objects together.

Examples: double-stranded DNA, clathrates, pseudorotaxanes.


Perhaps…

Molecular entity
Physically-associated molecular aggregate (e.g. double-stranded DNA)
Non-associated molecular aggregate (e.g. the products of a reaction)
Molecular part
    Tangential proper part of unfolded molecular entity
    Non-tangential proper part of unfolded molecular entity


Parts and orbitals

The wavefunctions of electrons in atoms are called orbitals.

Molecules are built out of a combination of atoms sharing electrons.

Molecular orbitals can be built (mathematically) by taking a linear combination of atomic orbitals (LCAO).
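As a worked equation in standard textbook notation (the symbols are not from the slides): molecular orbital j is a weighted sum of N atomic orbitals, and the simplest case is the pair of molecular orbitals of H2 built from two normalized, real 1s orbitals with overlap integral S.

```latex
% General LCAO expansion: molecular orbital \psi_j from atomic orbitals \phi_i
\psi_j = \sum_{i=1}^{N} c_{ij}\,\phi_i

% H2 example: bonding (+) and antibonding (-) combinations of \phi_A and \phi_B,
% normalized using the overlap integral S = \langle \phi_A | \phi_B \rangle
\psi_{\pm} = \frac{\phi_A \pm \phi_B}{\sqrt{2\,(1 \pm S)}}
```

The coefficients c_ij are what make the molecular orbital extend over more than one atom, which is the point taken up in the methyl slides.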


Parts and orbitals and atoms

In a free carbon atom, all of the orbitals have spherical symmetry about the nucleus.

In a bound carbon atom, some of the orbitals have polyatomic symmetry.

But is the bound–free distinction more like the boat–chair distinction or the banana–banana–banana–… distinction?


Methyl, the radical (exact)

In a methyl radical, the mereological sum of the orbitals is coextensive with the species.


Methyl, the group (part)

In a methyl group, the mereological sum of the orbitals extends into the rest of the molecule.


7. An actually-existing chemical ontology: ChEBI
www.ebi.ac.uk/chebi


Background

ChEBI (an OBO Foundry candidate, and possibly the ontology that most other ontologies depend on most) contains about 23000 classes, which include

subatomic particles
atoms
exotic atoms
molecules (EXACT), with InChIs and SMILES
molecules (CLASS)
molecular parts
roles

The full name (Chemical Entities of Biological Interest) is a pun, and it no longer accurately describes the scope.


Quantification, modality

Parthood in ChEBI meant at least three things:

1. is necessarily chemically part of

carbonyl group part_of carbonyl compounds


Quantification, modality

2a. Is sometimes chemically part of:

Lead(2+) ion part_of lead diacetate

(most lead(2+) ions aren’t)


Basics for a chemical ontology: 2. Parthood relations

2b. Is extremely rarely chemically part of

Electron part_of muonium

3. Is part of a mixture

Kanamycin A part_of kanamycin


How it got fixed

By defining relationships according to the pattern: all instances of X have a relationship with some Y. (Smith et al., “Relations in biomedical ontologies”, 2005)

carbonyl compound has_part carbonyl group
Lead diacetate has_part lead(2+) (?!)
Muonium has_part electron
Kanamycin has_part kanamycin A (?!)
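The all-some pattern explains why the direction of the relation matters. A minimal sketch over invented instance data (every identifier below is hypothetical): the class-level statement holds only if every instance of the first class stands in the relation to some instance of the second.

```python
# Hypothetical instance data; all identifiers are invented for illustration.
CLASS_OF = {
    "salt_1": "lead diacetate",
    "salt_2": "lead diacetate",
    "salt_3": "lead dichloride",
    "ion_1": "lead(2+)",
    "ion_2": "lead(2+)",
    "ion_3": "lead(2+)",
}
HAS_PART = {"salt_1": {"ion_1"}, "salt_2": {"ion_2"}, "salt_3": {"ion_3"}}


def instances(cls):
    return [i for i, c in CLASS_OF.items() if c == cls]


def class_has_part(whole_cls, part_cls):
    """All-some reading: every instance of whole_cls has some part_cls instance as a part."""
    return all(
        any(CLASS_OF[p] == part_cls for p in HAS_PART.get(w, ()))
        for w in instances(whole_cls)
    )


def class_part_of(part_cls, whole_cls):
    """All-some reading: every instance of part_cls is part of some whole_cls instance."""
    return all(
        any(p in HAS_PART.get(w, ()) for w in instances(whole_cls))
        for p in instances(part_cls)
    )


holds_has_part = class_has_part("lead diacetate", "lead(2+)")
holds_part_of = class_part_of("lead(2+)", "lead diacetate")
```

In this toy data the has_part statement holds, while the part_of statement fails because one lead(2+) ion sits in a different salt. Note that both functions return True vacuously for a class with no instances.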


is_a had many meanings!

1. An amount of a compound has a biological role: tris is_a buffer.*

2. An amount of a compound has an application: sodium dodecyl sulfate is_a detergent.*

3. A less-abstract type is an example of a more abstract type: propane is_a alkanes. This is OK.

4. ?!: metals is_a atoms.*

* Not a property of a lone atom or molecule!


How is_a got/is being fixed

In cases 1 and 2, the has_role relation was introduced to connect molecules with realizable entities (and maybe some relational qualities). More work needed to distinguish single-molecule from collective from bulk relational entities.

Case 3 is OK.

In case 4, “metal” will be replaced by “metal atom”.


is_a completeness

(June 2009 release, taken from http://www.berkeleybop.org/ontologies/obo-all/chebi/chebi.stats)

Number of subclass orphans: 1325 (out of 22452) = 5.9%

Average number of is_a parents per class: 1.288
Maximum number of is_a parents per class: 9
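Statistics of this kind are straightforward to recompute from a class-to-parents table. The sketch below assumes that a "subclass orphan" is a class with no is_a parent, and it runs on a five-class toy hierarchy rather than on ChEBI itself; the function name is hypothetical.

```python
def is_a_stats(parents):
    """Summarize an is_a hierarchy given as {class: [is_a parents]}."""
    counts = [len(p) for p in parents.values()]
    orphans = sum(1 for c in counts if c == 0)
    return {
        "orphans": orphans,
        "orphan_fraction": orphans / len(counts),
        "mean_parents": sum(counts) / len(counts),
        "max_parents": max(counts),
    }


# Toy hierarchy for illustration only.
toy = {
    "pyridine": ["azaarene", "monocyclic heteroarene"],
    "azaarene": [],
    "monocyclic heteroarene": [],
    "propane": ["alkanes"],
    "alkanes": [],
}
stats = is_a_stats(toy)

# The slide's own percentage can be checked the same way.
orphan_percent = round(100 * 1325 / 22452, 1)
```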


Terms without definitions, and only children

(June 2009 release)

Without definitions: 20813 (92.7%)
Only children: 1608 (7%)


Two kinds of relation in ChEBI

Between molecules (and ‘roles’) and between connection tables:

is_a
has_part
has_role
icbo (is_conjugate_base_of)
icao (is_conjugate_acid_of)
is_tautomer_of

Only between connection tables:

is_enantiomer_of
is_substituent_group_from
has_fundamental_parent
has_parent_hydride


Connection tables

The cyano group:

Marvin 10260616462D
3 2 0 0 0 0 999 V2000
0.0000 0.0000 0.0000 C 0 0 0 0 0 0 0 0 0 0 0 0
0.8250 0.0000 0.0000 N 0 0 0 0 0 0 0 0 0 0 0 0
-0.8250 0.0000 0.0000 * 0 0 0 0 0 0 0 0 0 0 0 0
1 2 3 0 0 0 0
1 3 1 0 0 0 0
M END

Connection tables describe how the atoms in a molecule are connected (as long as they’re connected by covalent, ionic or some kinds of multicentre bonding).

They are a kind of specification.
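To make "how the atoms are connected" concrete, the following sketch reads the atom and bond blocks out of the cyano-group connection table above. The function name `parse_connection_table` is hypothetical, and splitting on whitespace is a simplification: the V2000 format is fixed-width, so this is only safe for small tables like this one.

```python
# The cyano-group connection table from the slide, laid out as a V2000 molfile.
# Lines 1 and 3 of the header (title and comment) are blank.
MOLFILE = """\

  Marvin  10260616462D

  3  2  0  0  0  0            999 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    0.8250    0.0000    0.0000 N   0  0  0  0  0  0  0  0  0  0  0  0
   -0.8250    0.0000    0.0000 *   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  3  0  0  0  0
  1  3  1  0  0  0  0
M  END
"""


def parse_connection_table(text):
    """Return (atom symbols, bonds) where each bond is (atom, atom, order)."""
    lines = text.splitlines()
    counts = lines[3].split()  # counts line follows the 3 header lines
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atom_lines = lines[4:4 + n_atoms]
    atoms = [line.split()[3] for line in atom_lines]  # x, y, z, symbol, ...

    bond_lines = lines[4 + n_atoms:4 + n_atoms + n_bonds]
    bonds = [tuple(int(f) for f in line.split()[:3]) for line in bond_lines]
    return atoms, bonds


atoms, bonds = parse_connection_table(MOLFILE)
for first, second, order in bonds:
    # Atom numbers in the bond block are 1-indexed.
    print(atoms[first - 1], atoms[second - 1], "bond order", order)
```

The output shows a triple bond between C and N and a single bond from C to the attachment point `*`, which is all the table specifies: it says nothing about any particular molecule that bears the group.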

Generically-dependent continuants

Generically-dependent continuants, unlike specifically-dependent continuants (qualities, dispositions and so forth), can be transferred between bearers.

Molecular connection tables depend ontologically on hard discs, computer memory, punchcards, datagrams, printouts.

Molecules and connection tables (1)

[Diagram: nodes Pyridine, this pyridine molecule here, pyridine.mol, connection table, this laptop; edges instance_of (twice), depends_on, is_about.]

Molecules and connection tables (2)

[Diagram: nodes centaur.mol, connection table, this laptop; edges instance_of, depends_on, is_about.]

Names, identifiers and molecules

Cambridge changes are changes in the description of an object.

Example: Becoming the tallest spy in Finland.

All changes are Cambridge changes but not vice versa.

Names, identifiers (InChIs, InChIKeys, SMILES, SMARTS, CAS RNs, ChemSpider IDs) and connection tables are not properties but only Cambridge properties.

Formal derivation

IUPAC nomenclature is based on formal derivation.

These are actions like: removing all hydrogen atoms, removing all substituents, replacing non-carbon atoms in a ring with carbon atoms, replacing multiple bonds with single bonds.

They cannot in general be carried out on real molecules.

Actual derivation

Contrast the processes which take place in biological systems on natural product molecules:

methylation, cyclization, condensation and many more…

ChEBI as will be: automatic classification

Systematic (IUPAC) names for molecules reflect their constitution and therefore their parts.

Therefore it is possible to classify much of a large ontology like ChEBI automatically given specifications for the classes.
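A toy sketch of the idea, assuming each class is specified by the parts a molecule must have and that the parts of each molecule have already been derived from its name or connection table. All class names, part names and the `classify` function are hypothetical, and plain set inclusion stands in for the description-logic reasoning a real classifier would use.

```python
# Hypothetical class specifications: the parts a member must have.
CLASS_SPECS = {
    "nitrile": {"cyano group"},
    "carboxylic acid": {"carboxy group"},
    "cyanocarboxylic acid": {"cyano group", "carboxy group"},
}

# Hypothetical molecules with their parts, assumed already extracted.
MOLECULE_PARTS = {
    "acetonitrile": {"methyl group", "cyano group"},
    "acetic acid": {"methyl group", "carboxy group"},
    "cyanoacetic acid": {"methylene group", "cyano group", "carboxy group"},
}


def classify(parts, specs):
    """Return every class whose required parts are all present."""
    return sorted(name for name, required in specs.items()
                  if required <= parts)


for molecule, parts in MOLECULE_PARTS.items():
    print(molecule, "is_a", classify(parts, CLASS_SPECS))
```

Because membership follows from the specifications alone, adding a new class specification reclassifies every molecule without any hand curation, which is the attraction for an ontology of ChEBI's size.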

ChEBI as will be: automatic classification

Further reading:

Villanueva-Rosales and Dumontier, OWLED 2007.

Low et al., “OWL-DL Ontology for Classification of Lipids”, this meeting (full paper)

Hastings et al., “Towards Automatic Classification of Entities within the ChEBI Ontology”, this meeting (poster session)

ChEBI as will be: the problem of natural products

However, whereas systematic names such as benzene, thiophene, pyridine and pentacene unambiguously reflect skeleta…

the same is not true of natural product names. Not all chromomycins, aureolic acids or anthracyclines necessarily share the same skeleton.

Some kind of actual derivation relations will be needed here.

8. The Sequence Ontology

www.sequenceontology.org

(thanks to Karen Eilbeck, University of Utah, for graphics)

Background

Contains around 2000 terms describing sequence features such as genes, variations such as chromosomal aberrations, effects of sequence variation and sequence feature attributes.

Used by a large number of model organism databases (FlyBase, WormBase, SGD, TAIR…) for genomic annotation work.

Sequences not molecules

The Sequence Ontology treats sequences such as genes, exons and introns rather than the molecules (DNA, messenger RNAs) that they depend on.

Working assumption: sequences are a kind of generically-dependent continuant.

working_draft.obo

http://song.cvs.sourceforge.net/*checkout*/song/ontology/working_draft.obo

Four kinds of sequence

Biological sequences depend on single molecules used by the replication machinery of the cell. Their sequence information may be stored in databases, memory sticks and so forth.

Biomaterial sequences describe molecules such as plasmids and vectors which are used for genetic engineering.

Experimental features are those information artefacts created by sequencing experiments.

Sequence variations depend on more than one molecule: see next slide.

Mutual ontological dependence (3)

A deletion depends on two molecules.

It has two parts.

The deleted sequence depends on its carrier molecule and the deletion junction on the other.

The deletion junction depends on its carrier molecule and the deleted sequence on the other.

Likewise substitutions, inversions, indels.

SNPs and CNVs depend on many molecules.

Looking ahead

Robert Hoehndorf, Janet Kelso and Heinrich Herre, “A Formal Ontology of Sequences”, this meeting.

Karen Eilbeck and Chris Mungall, “Evolution of the Sequence Ontology Terms and Relationships”, this meeting.

9. On drugs

Huge invisible ontology of drugs (1)

I go into a shop in the High Street, hand over a piece of paper signed by a physician to the woman behind the counter, am told to come back in ten minutes, go off and read the paper in a café, return and am given a white card box containing a piece of paper listing terrible things that may happen to me and a blister package containing small lumps of calcium carbonate.

Huge invisible ontology of drugs (2)

a shop in the High Street: only some organizations are licensed to sell drugs

a physician: only some people are allowed legally to prescribe drugs

the woman behind the counter: only some people are allowed to prepare drugs

a piece of paper listing terrible things that may happen to me: drugs in most jurisdictions must come with a list of side effects

a blister package: drugs must be packaged so as not to degrade

small lumps of calcium carbonate: the active ingredients must be further packaged as pills, linctus and so forth

Huge invisible ontology of drugs (3)

Hansard, 19th February 1997

“When I visited the prescription pricing agency in Newcastle, it had a number of interesting examples of items that GPs had prescribed. You, Mr. Deputy Speaker, might consider it unusual if, when you next visited your GP, he prescribed you a pint of Guinness, but it has been done—and done within the letter of the rules. Another example involved a prescription for a Christmas pudding.” The Minister for Health (Mr. Gerald Malone)

Huge invisible ontology of drugs (4)

Brute objects and processes (X) are those that are mind-independent.

Status functions (Y, BFO: roles) are realized by brute objects X in institutional contexts (C).

Constitutive rule: “X counts as Y in context C”—Searle (1995)

Huge invisible ontology of drugs (5)

X (a lump of calcium carbonate containing various small organic molecules) counts as Y (a drug) in contexts:

C1 registration by the appropriate authority

C2 being listed in a formulary

C3 prescription by a physician or self-medication (for over-the-counter drugs)

C4 preparation according to the standards in a pharmacopoeia
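The constitutive rule "X counts as Y in context C" can be sketched as a simple check. This is only an illustration: the class `BruteObject`, the context labels and the function `counts_as_drug` are hypothetical, and treating all four contexts as jointly required is an assumption that the slide does not state.

```python
from dataclasses import dataclass, field

# Hypothetical labels for contexts C1 to C4.
REQUIRED_CONTEXTS = frozenset({
    "registered_by_authority",        # C1
    "listed_in_formulary",            # C2
    "prescribed_or_self_medicated",   # C3
    "prepared_to_pharmacopoeia",      # C4
})


@dataclass
class BruteObject:
    """X: a mind-independent object plus the institutional contexts it is in."""
    description: str
    contexts: set = field(default_factory=set)


def counts_as_drug(x):
    """Y: assumed to hold only when every one of C1 to C4 is present."""
    return REQUIRED_CONTEXTS <= x.contexts


lump = BruteObject("lump of calcium carbonate containing small organic molecules")
print(counts_as_drug(lump))   # no institutional context yet

lump.contexts |= REQUIRED_CONTEXTS
print(counts_as_drug(lump))   # same brute object, now in all four contexts
```

The brute object itself is unchanged between the two calls; only its institutional contexts differ, which is the point of the status-function analysis.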

Huge invisible ontology of drugs (6)

antitussive disposition = suppresses cough when applied to throat

+cause –intentionality

antitussive agent function = suppresses cough when applied to throat and I want to stop coughing

+cause +intentionality

antitussive drug status function = designed, registered and prescribed to suppress cough when applied to throat and I want to stop coughing

?cause +intentionality +collective intentionality

antitussive placebo function = administered to suppress cough when applied to throat

?cause ?intentionality ?collective intentionality

Huge invisible ontology of drugs (7)

Regulative rules vs. constitutive rules

Regulative rules regulate an already existing activity (Searle: driving)

Constitutive rules define an activity (Searle: chess)

Regulative rules, intended to regulate existing practices of medication, quack doctors, midwives and so forth, depend on constitutive rules.

Huge invisible ontology of drugs (8)

See also:

C. Denney et al., “Creating a Translational Medicine Ontology”, this conference (poster session).

Thanks to…

Alan Ruttenberg, Ann Copestake, Barry Smith, Celia Gitterman, Chris Mungall, Christi Denney, David Barden, Duncan Hull, Elena Beisswanger, Hilary Burch, Jane Lomax, Jane Richardson, Janna Hastings, Jesse Stombaugh, Karen Eilbeck, Kirill Degtyarenko

Luc Schneider, Marcus Ennis, Marijke Keet, Matthew Batchelor (no relation), Michael Ashburner, Michel Dumontier, Neocles Leontis, Nico Adams, Peter Corbett, Robert Hoehndorf, Rob Knight, Stefan Schulz, Susie Stephens, Suzanna Lewis, Tom Bittner

Aileen Day, Jeff White, Richard Kidd