![Page 1: Stephanie Harris Crystal Grid Workshop Southampton, 17 th September 2004 Development of Molecular Geometry Knowledge Bases from the Cambridge Structural](https://reader035.vdocuments.us/reader035/viewer/2022062307/55160465550346a2308b4d67/html5/thumbnails/1.jpg)
Stephanie Harris
Crystal Grid Workshop
Southampton, 17th September 2004
Development of Molecular Geometry Knowledge Bases from
the Cambridge Structural Database
Molecular Geometry Knowledge Bases
• Library of chemically well-defined geometric information
• Limited user input
• Rapid retrieval of statistical data
Cambridge Structural Database
• Stored geometric information for ~300,000 structures
• Search using ConQuest
• Substructure search, user input required
Molecular Geometry Knowledge Base: Mogul
• Bond lengths, valence angles and torsion angles
• Compiled from the CSD
Published bond length tables:
• Organic and metal-containing structures
• Published late 1980s
• Compiled from CSD of ~50,000 structures
• Cannot be accessed by computer programs
Applications
• Model building
• Refinement restraints
• Structure validation
• Comparative values
Mogul 1.0
• Whole molecule input
• Graphical (cif, SHELX, mol2 files) or command-line interface
• Integration with client applications, e.g. Crystals
• Quick, automatic retrieval of statistical data, histogram distributions, CSD structures
Search Algorithm
• All non-metal fragments in the CSD coded
• Set of keys code chemical environments
• Fragments with identical keys are chemically identical
• Use hierarchical search tree
• Generalised searching if insufficient hits
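The idea that fragments with identical keys are chemically identical can be sketched as a dictionary lookup. This is only an illustration, not Mogul's actual implementation: the key fields, the function names `build_index` and `summarise`, and all bond lengths are hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical fragment records: (key tuple, bond length in Å).
# Key fields and numbers are invented for illustration only.
FRAGMENTS = [
    (("C", "N", "amide"), 1.33),
    (("C", "N", "amide"), 1.34),
    (("C", "N", "amide"), 1.35),
    (("C", "N", "amine"), 1.47),
    (("C", "N", "amine"), 1.48),
]

def build_index(fragments):
    """Group measurements so that identical keys share one bucket."""
    index = defaultdict(list)
    for key, value in fragments:
        index[key].append(value)
    return index

def summarise(values):
    """Return (count, mean, sample standard deviation)."""
    sd = stdev(values) if len(values) > 1 else 0.0
    return len(values), mean(values), sd

index = build_index(FRAGMENTS)
count, avg, sd = summarise(index[("C", "N", "amide")])
print(f"{count} hits, mean {avg:.3f} Å, s.d. {sd:.3f} Å")
```

Because the keys are computed once, when the index is built, a query needs no substructure search at run time.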
Mogul Search
[Figure: diagram of the query molecule (atom labels S, N, O, CN, pTol) with the selected fragment labelled .S1.C7 and a Search button]
Metal – Ligand Bond lengths
To be considered:
• Ligand type: Carboxylate
• Metal oxidation state: Co(II)
• Metal coordination number: 6
• Ligand trans: Oxygen ligand
• Spin state?
Co-O bond length?
[Figure: example Co complex; atom labels N, N, Co, O, O, OH2, OH2, C, Me, C(O)Me]
Method
Analysis of M-L bond lengths.
For a range of metal and ligand types, identify factors which influence M-L bond lengths and evaluate their importance.
For a defined Metal-Ligand group, sub-divide the bond length distribution to produce ‘chemically meaningful’ datasets:
• Unimodal distributions.
• ‘Reasonably small’ sample standard deviations.
From hand-crafted examples develop an algorithm to produce a molecular geometry knowledge base for metal complexes.
Data Tree
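[Figure: data tree. Root: Metal-Ligand Group. First level: Bin A1, Bin A2. Second level: Bin B1, Bin B2, Bin B3, Bin B4. Third level: Bin C1, Bin C2.]
Moving down the tree: sharpened distributions, smaller sample standard deviations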
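One step down such a tree can be sketched as follows: a parent bin is split on one criterion, and the children are compared with the parent. The function name `split_bin`, the acceptance rule and every number are assumptions made for illustration, not CSD data or the method used in the talk.

```python
from collections import defaultdict
from statistics import mean, stdev

# Invented (oxidation state, bond length in Å) pairs, for illustration only.
PARENT_BIN = [
    ("II", 2.05), ("II", 2.07), ("II", 2.09),
    ("III", 1.89), ("III", 1.90), ("III", 1.91),
]

def split_bin(records):
    """Split one bin into child bins, one per value of the criterion."""
    children = defaultdict(list)
    for criterion, length in records:
        children[criterion].append(length)
    return dict(children)

parent_sd = stdev([length for _, length in PARENT_BIN])
children = split_bin(PARENT_BIN)
for label, lengths in sorted(children.items()):
    print(label, f"mean={mean(lengths):.3f}", f"sd={stdev(lengths):.3f}")

# Assumed rule: keep the split if every child is sharper than its parent.
sharpened = all(stdev(v) < parent_sd for v in children.values())
print("parent sd:", f"{parent_sd:.3f}", "sharpened:", sharpened)
```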
Criteria Influencing M-L Bond Lengths
1. Ligand, L
2. Coordination mode of ligand
3. Effective Metal Coordination Number
4. Metal Oxidation State
5. Metal clusters and cages
6. Spin state
7. Jahn-Teller effect
8. Metal coordination geometry
9. Ligand trans to L
[Figure: two structure diagrams, each labelled M = 6]
Ligand Template Library
Ligand
• Non-metal atom or fragment bonded to a metal.
• Two ligands are the same if they have the same connectivity (topology) and stereochemistry.
Method
• All ligands in CSD to be classified.
• Classify according to contact atom coordinated to metal.
• Ligands with multiple contact atoms can be present in more than one ligand group, e.g. SCN-
[Figure: schematic of metal M bonded to atom A, which carries three B atoms; ligand diagrams showing O and O- atoms]
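Classification by contact atom can be sketched with a small grouping function. The ligand table and the name `group_by_contact_atom` are hypothetical; the point is only that a ligand with several possible contact atoms, such as SCN-, lands in more than one group.

```python
from collections import defaultdict

# Hypothetical ligand records: name -> possible contact atoms.
# Entries are illustrative and not extracted from the CSD.
LIGANDS = {
    "SCN-": ["S", "N"],   # ambidentate: can bind through S or through N
    "H2O": ["O"],
    "Cl-": ["Cl"],
    "pyridine": ["N"],
}

def group_by_contact_atom(ligands):
    """Map each contact atom to the set of ligands that can bind through it."""
    groups = defaultdict(set)
    for name, atoms in ligands.items():
        for atom in atoms:
            groups[atom].add(name)
    return dict(groups)

groups = group_by_contact_atom(LIGANDS)
for atom in sorted(groups):
    print(atom, sorted(groups[atom]))
```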
Cambridge Structural Database
• Approximately 22,000 formulae
• Approximately 780,000 ligands

| No. of occurrences of unique formulae in CSD | Total number of ligands | Number of formulae |
|---|---|---|
| ≥ 1000 | 550,000 (70%) | 70 |
| 100 – 999 | 109,263 (14%) | 394 |
| 10 – 99 | 76,000 (10%) | 3,000 |
| 1 – 9 | 45,700 (6%) | 18,937 |

Ligand Template Hierarchy
• Exact ligand templates (724)
• R-substituted templates (H’s replaced with ‘innocent’ R groups)
• Generic templates (ALL ligands classified)
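The counts in the occurrence table can be checked with a few lines of arithmetic. The figures are those quoted on the slide; the row label for the first row is a placeholder chosen here.

```python
# (row label, total ligands, number of formulae) as quoted on the slide.
ROWS = [
    ("most common", 550_000, 70),
    ("100-999", 109_263, 394),
    ("10-99", 76_000, 3_000),
    ("1-9", 45_700, 18_937),
]

total_ligands = sum(row[1] for row in ROWS)
total_formulae = sum(row[2] for row in ROWS)
print(f"{total_ligands:,} ligands, {total_formulae:,} formulae")

for label, ligands, formulae in ROWS:
    share = 100 * ligands / total_ligands
    print(f"{label:>11}: {share:4.1f}% of ligands from {formulae:,} formulae")
```

The totals agree with the rounded headline figures, and the first row shows why a small exact-template library goes a long way: 70 formulae account for about 70% of all ligands.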
Cobalt Carboxylate Bond Lengths
[Figure: histogram of Co-O bond lengths for the fragment Co-OC(O)C(sp3); x-axis: Co-O (Å); y-axis: No. of frags.]
Co-O: 1.929(62) Å, 619 fragments
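Values such as 1.929(62) Å are read here under the usual crystallographic convention, in which the bracketed digits give the spread in units of the last quoted decimal place (so 0.062 Å). A minimal parser for that notation, with the hypothetical name `parse_esd`:

```python
import re

def parse_esd(text):
    """Turn '1.929(62)' into (1.929, 0.062).

    Assumes the bracketed digits are in units of the last decimal place.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)\((\d+)\)", text.strip())
    if match is None:
        raise ValueError(f"not in value(uncertainty) form: {text!r}")
    whole, frac, unc = match.groups()
    value = float(f"{whole}.{frac}")
    spread = int(unc) * 10 ** -len(frac)
    return value, spread

value, spread = parse_esd("1.929(62)")
print(f"mean {value:.3f} Å, spread {spread:.3f} Å")
```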
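[Figure: data tree of mean Co-O bond lengths for the fragment Co-OC(O)C(sp3)]
• All fragments: 1.929(62) Å
• Co(II): 2.049(58) Å
• Co(III): 1.904(20) Å
• Co(II)L5-OC(O)C: 2.073(42) Å
• Co(III)L5-OC(O)C: 1.904(20) Å
• Co(II)L4-OC(O)C, trans ligand O: 2.074(32) Å
• Co(III)L4-OC(O)C, trans ligand O: 1.910(15) Å
• Co(III)L4-OC(O)C, trans ligand N: 1.895(17) Å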
• Chlorides: Fe-Cl 2.242(68) Å; Fe(III)ClL3 2.189(24) Å
• Pyridines, e.g. Fe (spin state): Fe(II)L5py, Fe-N 2.166(84) Å; high spin 2.225(29) Å
• Copper complexes (Jahn-Teller effect): Cu(II)-OH2 2.232(225) Å; standardisation of Cu connectivity
• Tertiary phosphines, carbon ligands
Metal-Ligand Knowledge Base
1. CSD data adjustment:
• Standardisation of metal connections
• Assignment of metal as part of a metal cluster
• Assignment of metal oxidation state
2. Classification of ligands by ligand template library
3. Perform algorithm on all possible M-L fragments to produce knowledge base
Metal-Ligand Group
From ligand template library: generic or more specific
e.g. Carboxylates: [Figure: two carboxylate diagrams, C-C(O)O and Et-C(O)O]
Algorithm: [Figure: carboxylate diagram C(sp3)-C(O)O]
Metal-Ligand Group
Division on Oxidation State
‘Metal Clusters’
Division on Metal effective coordination number
Division on spin and Jahn-Teller effect
• Only for particular metals, oxidation states and coordination numbers.
• Not found for all ligand types.
• Not searchable in CSD.
Flag to users; effects evident from: bimodal histogram, high sample standard deviation (SSD), outliers.
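Flagging of this kind could be sketched as a simple check on each dataset. The thresholds `max_sd` and `outlier_cut`, the function name `flag_dataset` and the sample values are all hypothetical, and no bimodality test is attempted here; only the high-SSD and outlier checks are shown.

```python
from statistics import median, stdev

def flag_dataset(lengths, max_sd=0.05, outlier_cut=0.15):
    """Return warning strings for one bond-length dataset (values in Å).

    max_sd and outlier_cut are hypothetical thresholds, also in Å.
    """
    flags = []
    if stdev(lengths) > max_sd:
        flags.append("high SSD")
    centre = median(lengths)
    outliers = [x for x in lengths if abs(x - centre) > outlier_cut]
    if outliers:
        flags.append(f"{len(outliers)} outlier(s)")
    return flags

# Invented datasets, for illustration only.
tight = [1.90, 1.91, 1.89, 1.90, 1.92]
suspect = [1.90, 1.91, 1.89, 1.90, 1.92, 2.30]

print(flag_dataset(tight))
print(flag_dataset(suspect))
```

Measuring outliers from the median rather than the mean keeps a single long bond from shifting the reference point.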
Metal-Ligand Group
‘Metal Clusters’
Division on Oxidation State
Division on Metal effective coordination number
Division on spin and Jahn-Teller effect
Division on Metal coordination geometry
E.g. 4-coordinate geometry: tetrahedral, square planar, disphenoidal
Metal-Ligand Group
‘Metal Clusters’
Division on Oxidation State
Division on Metal effective coordination number
Division on spin and Jahn-Teller effect
Division on Metal coordination geometry
Divide on trans ligand to L
Final ligand division: more specific ligand, e.g. alkyl carboxylate
Generalised Searching
• No hits or insufficient number of hits.
• Allows the retrieval of data on related fragments.
• Hierarchical search tree structure
• Move up to a higher, less specific level of data tree.
• Order of algorithm important. Should order of criteria be changed? Should order depend on M-L group?
E.g. Should oxidation state always be the first main division?
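Moving up the tree when a query returns too few hits can be sketched as repeatedly dropping the most specific criterion. The knowledge base contents, the hit counts, the `min_hits` threshold and the name `generalised_search` are assumptions made for illustration.

```python
# Hypothetical knowledge base: each key is a path of criteria, most general
# first. Hit counts are invented for illustration.
KNOWLEDGE_BASE = {
    ("M-L",): 600,
    ("M-L", "ox. state III"): 400,
    ("M-L", "ox. state III", "CN 6"): 380,
    ("M-L", "ox. state III", "CN 6", "trans N"): 12,
}

def generalised_search(path, min_hits=30):
    """Walk up the tree until a level holds at least min_hits fragments."""
    path = tuple(path)
    while path:
        hits = KNOWLEDGE_BASE.get(path, 0)
        if hits >= min_hits:
            return path, hits
        path = path[:-1]  # drop the most specific criterion
    return (), 0

level, hits = generalised_search(("M-L", "ox. state III", "CN 6", "trans N"))
print(level, hits)
```

In this sketch the last criterion in the path is always the first to be dropped, which is why the order of the criteria matters: putting oxidation state first means it is the last piece of information to be given up.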
Conclusions
• Pre-processing of structural data from the CSD to construct molecular geometry knowledge bases.
• Knowledge bases to contain chemically well-defined datasets.
• Limited user input required.
• Quick, automatic retrieval of statistical data, distributions.
• Efficient analysis of large number of chemical fragments.
• Outliers, high SSD? Further Analysis – Computational Chemistry.
• Further development to include extra chemical information e.g. computational data.
Acknowledgements
Bristol University:
Guy Orpen
Natalie Fey
X-Ray Crystallography Group
Cambridge Crystallographic Data Centre:
Robin Taylor
Frank Allen
Ian Bruno
Greg Shields