Finding Transition States Algorithmically for Automatic Reaction Mechanism Generation

Post on 11-Jun-2015

DESCRIPTION

A presentation given by Prof. Richard West at the 8th International Conference on Chemical Kinetics in Seville, Spain, on 12 July 2013. The slides were designed to accompany the oral presentation and do not quite stand alone, so please email if you have any questions.

TRANSCRIPT

Computational Modeling in Chemical Engineering

.edu/comocheng

Finding Transition States Algorithmically for Automatic Reaction Mechanism Generation

Pierre L. Bhoorasingh
Richard H. West

Can you predict TS geometries from molecular groups alone?

(this would be great)

Length of bond being broken, at TS for hydrogen abstraction

[Table: 15 bond lengths indexed by radical and molecule, in Å with M06-2X/6-31+G(d,p); values not legible in this transcript]

You can predict TS geometries from molecular groups alone!

But...

... you gave me a distance, not a geometry.

... I gave you 15 numbers then asked you for 1.

Automatic Transition State Theory (TST) would be a game-changer.

•Insight and predictions require detailed kinetic models.
•Error-free detailed models require automatic generation.
•Automatic generation requires reasonable estimates of millions of reaction rates.
•Current estimates are often unreasonable due to scarcity of data.

Automatic TS searches remain an important energy research goal

“An accurate description of the often intricate mechanisms of large-molecule reactions requires a characterization of all relevant transition states... Development of automatic means to search for chemically relevant configurations is the computational-kinetics equivalent of improved electronic structure methods.”
- Basic Research Needs for Clean and Efficient Combustion of 21st Century Transportation Fuels, US Dept of Energy (2006)

“...transformation from by-hand calculations of single reactions to automated calculations of millions of reactions would be a game-changer for the field of chemistry, and would be a good ‘Grand Challenge’ target...”
- Combustion Energy Frontier Research Center (2010)

First Annual Conference of the Combustion Energy Frontier Research Center (CEFRC), September 23-24, 2010, Princeton

An introduction to Reaction Mechanism Generator

Automatically builds detailed kinetic models

facebook.com/rmg.mit
rmg.sourceforge.net

Molecules are represented as graphs

[Figure: the radical CH3CH2. redrawn as a graph whose nodes are the atoms (C, C* and five H) and whose edges are the bonds]
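The graph picture above can be sketched in a few lines of Python. This is only an illustration of the idea, not RMG's actual data structures: the names `Atom`, `Molecule`, `add_atom`, `add_bond` and `neighbors` are assumptions made for this example, and the starred carbon is assumed to be the radical site.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    element: str
    radical_electrons: int = 0  # assumption: C* on the slide marks the radical site


@dataclass
class Molecule:
    atoms: list = field(default_factory=list)
    bonds: dict = field(default_factory=dict)  # {(i, j): bond order}, with i < j

    def add_atom(self, element, radical_electrons=0):
        """Append an atom (a graph node) and return its index."""
        self.atoms.append(Atom(element, radical_electrons))
        return len(self.atoms) - 1

    def add_bond(self, i, j, order=1):
        """Add an edge between atoms i and j."""
        self.bonds[(min(i, j), max(i, j))] = order

    def neighbors(self, i):
        """Indices of all atoms bonded to atom i."""
        return [b if a == i else a for (a, b) in self.bonds if i in (a, b)]


def ethyl_radical():
    """Build CH3CH2. as a graph: 2 carbons, 5 hydrogens, 6 single bonds."""
    mol = Molecule()
    c1 = mol.add_atom("C")
    c2 = mol.add_atom("C", radical_electrons=1)
    mol.add_bond(c1, c2)
    for _ in range(3):
        mol.add_bond(c1, mol.add_atom("H"))
    for _ in range(2):
        mol.add_bond(c2, mol.add_atom("H"))
    return mol


ethyl = ethyl_radical()
```

Storing only connectivity, with no coordinates, is what makes a separate 3D step necessary later in the talk.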

Thermochemistry is often estimated by Benson group contributions

C-(C)(H)3

C-(C)2(H)2

Cb-(H)

C-(C)(Cb)(O)(H)
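Group additivity amounts to a weighted sum over the groups found in a molecule. The sketch below assumes two illustrative group values for the enthalpy of formation at 298 K; they are of the magnitude found in Benson-style tables but should be treated as placeholders, and the function name `estimate_enthalpy` is invented for this example.

```python
# Illustrative group values, in kcal/mol (assumptions for this sketch;
# use a curated thermochemistry database for real work).
GROUP_ENTHALPY = {
    "C-(C)(H)3": -10.2,
    "C-(C)2(H)2": -4.93,
}


def estimate_enthalpy(group_counts, group_values=GROUP_ENTHALPY):
    """Sum group contributions: H = sum over groups of (count * group value)."""
    return sum(n * group_values[g] for g, n in group_counts.items())


# Propane, CH3-CH2-CH3: two C-(C)(H)3 groups and one C-(C)2(H)2 group.
propane = {"C-(C)(H)3": 2, "C-(C)2(H)2": 1}
h_propane = estimate_enthalpy(propane)
```

The same pattern (look up a value per group, then add) is what the talk later applies to transition state distances.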

Reaction families propose all possible reactions with given species

[Figure labels: bond breaking and hydrogen abstraction; intramolecular H-abstraction]

•Template for recognizing reactive sites
•Recipe for changing the bonding at the site
•Rules for estimating the rate
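To make the template and recipe concrete, here is a minimal sketch for intermolecular hydrogen abstraction on a bare-bones graph. It is not RMG's implementation: the data layout (parallel lists of elements and radical counts, plus a list of bonded index pairs) and both function names are assumptions, and the rate rule is left out.

```python
def find_h_abstraction_sites(atoms, radicals, bonds):
    """Template: return (x, h, y) triples where X-H can donate H to radical Y."""
    sites = []
    for y, n_rad in enumerate(radicals):
        if n_rad == 0:
            continue  # Y must carry an unpaired electron
        for a, b in bonds:
            for x, h in ((a, b), (b, a)):
                if atoms[h] == "H" and atoms[x] != "H" and x != y:
                    sites.append((x, h, y))
    return sites


def apply_h_abstraction(radicals, bonds, site):
    """Recipe: break X-H, form H-Y, move the radical from Y to X."""
    x, h, y = site
    new_bonds = [b for b in bonds if set(b) != {x, h}] + [(h, y)]
    new_radicals = list(radicals)
    new_radicals[x] += 1
    new_radicals[y] -= 1
    return new_radicals, new_bonds


# CH4 + .OH, written as one combined atom list (indices 0-4: CH4, 5-6: OH).
atoms = ["C", "H", "H", "H", "H", "O", "H"]
radicals = [0, 0, 0, 0, 0, 1, 0]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (5, 6)]

sites = find_h_abstraction_sites(atoms, radicals, bonds)
new_radicals, new_bonds = apply_h_abstraction(radicals, bonds, sites[0])
```

The template finds four equivalent sites (one per methane hydrogen); applying the recipe to the first gives .CH3 + H2O.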

Octane autoxidation has many pathways

•Some pathways go further than others.

Faster pathways are explored further

[Figure: reaction network with species labelled A to H]
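One way to read "faster pathways are explored further" is as a flux-ranked expansion: the candidate species that forms fastest is promoted into the model, and its reactions are explored next. The sketch below is a simplified assumption of that idea, not RMG's actual criterion; the function name, the species split and the flux values are all hypothetical.

```python
def enlarge_core(core, edge_flux, tolerance):
    """Move the fastest-forming edge species into the core, if fast enough.

    core: set of species already in the model
    edge_flux: {edge species: formation rate from core species}
    tolerance: minimum flux for a species to be worth exploring
    Returns the promoted species, or None when nothing exceeds the tolerance.
    """
    if not edge_flux:
        return None
    best = max(edge_flux, key=edge_flux.get)
    if edge_flux[best] < tolerance:
        return None
    core.add(best)
    del edge_flux[best]
    return best


core = {"A", "B", "C", "D", "E", "F"}
edge_flux = {"G": 3.0e-4, "H": 2.0e-9}  # hypothetical fluxes, arbitrary units
promoted = enlarge_core(core, edge_flux, tolerance=1.0e-6)
```

Every edge species needs a rate estimate before its flux can be ranked, which is why the number of required estimates grows so quickly.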

Edge requires many reaction rates

100 species, 1,000 reactions

15,000 species, 180,000 reactions

Rate estimates are based on the local structure of the reacting sites.

•Hydrogen abstraction: XH + Y. → X. + YH
•Rate depends on X and Y.

Rate estimation rules are organized in a tree

Part of the tree for X

Part of the tree for Y

Ideal tree: lots of data

Typical tree: sparse data
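A sparse tree can still give an estimate if a lookup falls back to the nearest ancestor that has data. The sketch below assumes that fallback strategy; the class, the group labels and the rate values are hypothetical placeholders rather than entries from any real rate database.

```python
class Node:
    """One group in the rate-rule tree; rate is None when there is no data."""

    def __init__(self, label, parent=None, rate=None):
        self.label = label
        self.parent = parent
        self.rate = rate


def estimate_rate(node):
    """Walk up the tree until a node with data is found."""
    while node is not None:
        if node.rate is not None:
            return node.rate, node.label
        node = node.parent
    raise LookupError("no rate data anywhere on the path to the root")


# Hypothetical fragment of the tree for X in XH + Y. -> X. + YH.
root = Node("X_H", rate=1.0e6)            # placeholder value, arbitrary units
c_h = Node("C_H", parent=root)            # no data
primary = Node("C_pri_H", parent=c_h, rate=5.0e5)
secondary = Node("C_sec_H", parent=c_h)   # no data

rate_primary = estimate_rate(primary)
rate_secondary = estimate_rate(secondary)
```

In the sparse case the secondary site inherits the very general root value, which is the kind of unreasonable estimate that motivates computing rates from transition states.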

So that was RMG...

...but what about TS geometries?

Single method not feasible for all reaction types

Intra-H migration

Intra-OH migration

Birad recombination

Intra R addition exocyclic

Intra R addition endocyclic

1,2 birad to alkene

Beta scission

Diels-Alder

Radical recombination

Radical addition

Peroxyradical HO2 elimination

1+2/2+2 cycloaddition

Cyclic ether formation

1,2 insertion

1,3 insertion CO2/ROR

Radical addition COO radical recombination

H abstraction

Disproportionation

But a single method can apply to multiple reaction types

A ⇌ B: Intra-H migration; Intra-OH migration; Birad recombination; Intra R addition exocyclic; Intra R addition endocyclic; 1,2 birad to alkene

A ⇌ B + C: Beta scission; Diels-Alder; Radical recombination; Radical addition; Peroxyradical HO2 elimination; 1+2/2+2 cycloaddition; Cyclic ether formation; 1,2 insertion; 1,3 insertion CO2/ROR; Radical addition COO radical recombination

A + B ⇌ C + D: H abstraction; Disproportionation

Want robust and user-friendly 3D representation

•Internal coordinates: alter distances and angles
•Cartesian coordinates: translate, rotate atoms
•Distance geometry: alter only distances

Atom  X   Y   Z
1     x1  y1  z1
2     x2  y2  z2
3     x3  y3  z3
4     x4  y4  z4

Use RDKit’s geometry editing tools for atom positioning

[Flowchart: RMG Molecule (connectivity) → Generate bounds matrix → Edit bounds matrix → Embed in 3D → 3D structure. The bounds matrix is indexed by the atoms list on both axes and holds upper limits and lower limits.]

Edit multiple distances to precisely position atoms involved in reactions

      C     H     H     H     H     O     O     H
C     0     1.12  1.12  1.12  1.12  1000  1000  1000
H     1.1   0     1.86  1.86  1.86  1000  1000  1000
H     1.1   1.78  0     1.86  1.86  1000  1000  1000
H     1.1   1.78  1.78  0     1.86  1000  1000  1000
H     1.1   1.78  1.78  1.78  0     1000  1000  1000
O     3.65  2.9   2.9   2.9   2.9   0     1.33  1.04
O     3.65  2.9   2.9   2.9   2.9   1.31  0     1.97
H     3.15  2.4   2.4   2.4   2.4   1.02  1.89  0

[Edited entries highlighted on the slide: 2.0 and 2.1, then 2.5 and 2.6]
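Editing the bounds matrix amounts to overwriting a pair of entries. The sketch below uses plain Python lists and assumes the convention visible in the matrix above: upper limits above the diagonal, lower limits below it. RDKit returns a matrix with this layout from `rdDistGeom.GetMoleculeBoundsMatrix`, but the helpers `set_bounds` and `get_bounds` are invented here, and the choice of which atom pairs receive 2.0/2.1 and 2.5/2.6 is a guess, since the transcript does not say.

```python
def set_bounds(bm, i, j, lower, upper):
    """Constrain the i-j distance: upper limit above the diagonal, lower below."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    a, b = min(i, j), max(i, j)
    bm[a][b] = upper
    bm[b][a] = lower


def get_bounds(bm, i, j):
    """Return (lower, upper) for the i-j distance."""
    a, b = min(i, j), max(i, j)
    return bm[b][a], bm[a][b]


# Bounds matrix from the slide, in Å; 1000 means "no upper limit yet".
bm = [
    [0,    1.12, 1.12, 1.12, 1.12, 1000, 1000, 1000],  # C
    [1.1,  0,    1.86, 1.86, 1.86, 1000, 1000, 1000],  # H
    [1.1,  1.78, 0,    1.86, 1.86, 1000, 1000, 1000],  # H
    [1.1,  1.78, 1.78, 0,    1.86, 1000, 1000, 1000],  # H
    [1.1,  1.78, 1.78, 1.78, 0,    1000, 1000, 1000],  # H
    [3.65, 2.9,  2.9,  2.9,  2.9,  0,    1.33, 1.04],  # O
    [3.65, 2.9,  2.9,  2.9,  2.9,  1.31, 0,    1.97],  # O
    [3.15, 2.4,  2.4,  2.4,  2.4,  1.02, 1.89, 0],     # H
]

# Hypothetical edits: pin one methane H (atom 1) and the C (atom 0)
# at chosen distances from the terminal O (atom 6).
set_bounds(bm, 1, 6, 2.0, 2.1)
set_bounds(bm, 0, 6, 2.5, 2.6)
```

Narrow windows such as 2.0 to 2.1 Å leave the embedding step little freedom for the reacting atoms while the rest of the molecule stays loosely constrained.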

Double-ended algorithms find transition state estimates

[Figure labels: Reactants (R), Products (P)]

Position molecules for double-ended searches

Best guess: just either side of TS

Method tested with semi-empirical calculations

•Two double-ended algorithms tested
•QST2 at PM6 in Gaussian09
•SADDLE at PM7 in MOPAC2012
•Reaction path analysis validated the saddle points

[Flowchart: Reaction from RMG → for reactants and for products: Generate bounds matrix → Edit bounds matrix close to TS → Embed matrix in 3D → Double-ended search → Optimize TS geometry → IRC calculation]

Path analysis algorithms descend to find the reactants and products

[Figure labels: R, P]
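The descent idea can be illustrated on a toy two-dimensional surface. This is plain steepest descent on an invented potential, V(x, y) = (x² − 1)² + y², not a mass-weighted IRC calculation on a real potential energy surface; the step size, tolerance and function names are assumptions for the sketch.

```python
def gradient(x, y):
    """Gradient of the toy surface V(x, y) = (x**2 - 1)**2 + y**2."""
    return 4.0 * x * (x * x - 1.0), 2.0 * y


def descend(x, y, step=0.05, tol=1e-8, max_iter=10000):
    """Follow the negative gradient until it (nearly) vanishes."""
    for _ in range(max_iter):
        gx, gy = gradient(x, y)
        if gx * gx + gy * gy < tol * tol:
            break
        x, y = x - step * gx, y - step * gy
    return x, y


saddle = (0.0, 0.0)  # first-order saddle point of the toy surface
nudge = 0.01         # small displacement along the reaction coordinate (x)

end_a = descend(saddle[0] - nudge, saddle[1])  # one side of the saddle
end_b = descend(saddle[0] + nudge, saddle[1])  # the other side
```

Starting from either side of the saddle, the two descents end in the two different minima at x = −1 and x = +1. In the real workflow those end points are then compared with the intended reactants and products.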

A closer look at the automatic TS search process for H abstraction

338 Reactions from the NIST Database

[Flowchart: Get bounds matrix → Embed geometry either side of TS → TS search and refinement → Reaction path analysis → Compare to desired reactants & products → Succeed. Three steps can exit to Fail; the failure causes labelled on the slide are VdW collisions, no TS at this ES level, and bond perception. Outcomes are broken down by abstracting radical: H., .OH, other radical.]

Species matching returned false negatives due to incorrect bond order perception.

[Figure: Connect the dots → Perceive bond order → Check valencies, illustrated with CH4; observed versus expected R and P]
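The first of those steps, "connect the dots", is commonly done by comparing interatomic distances with sums of covalent radii. The sketch below assumes that approach, with a 1.2 tolerance factor chosen arbitrarily; it covers only connectivity, and the bond order and valency checks named on the slide are not implemented.

```python
import math

# Single-bond covalent radii in Å.
COVALENT_RADIUS = {"H": 0.31, "C": 0.76, "O": 0.66}


def connect_the_dots(symbols, coords, tolerance=1.2):
    """Bond two atoms when they are closer than tolerance * (r_i + r_j)."""
    bonds = []
    for i in range(len(symbols)):
        for j in range(i + 1, len(symbols)):
            cutoff = tolerance * (COVALENT_RADIUS[symbols[i]]
                                  + COVALENT_RADIUS[symbols[j]])
            if math.dist(coords[i], coords[j]) < cutoff:
                bonds.append((i, j))
    return bonds


# Methane with tetrahedral C-H bonds of about 1.09 Å.
a = 0.629
symbols = ["C", "H", "H", "H", "H"]
coords = [
    (0.0, 0.0, 0.0),
    (a, a, a),
    (-a, -a, a),
    (-a, a, -a),
    (a, -a, -a),
]

methane_bonds = connect_the_dots(symbols, coords)
```

A fixed cutoff like this is one place where perception can go wrong, because geometries at the ends of a reaction path may contain distances close to the threshold.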

Most failures involve reactions with small molecules

Small radicals need to be closer to the molecule they are abstracting from

•All abstractions by H. failed
•Many with other small radicals (e.g. .OH) also failed

Learn from the successful saddle points to improve automatic searches

Semi-empirical estimates used for DFT calculations

•Check semi-empirical geometry validity
•Use geometry as input to DFT calculations
•Check DFT geometry validity

[Flowchart: the semi-empirical workflow (Reaction from RMG → Generate, edit and embed bounds matrices for reactants and products → Double-ended search → Optimize TS geometry → IRC calculation) is followed by Optimize TS geometry at DFT → IRC calculation at DFT]

Trends observed in DFT saddle point geometries

Structure method: M06-2X
Basis set: 6-31+G(d,p)

[Figure: saddle point geometry with atoms labelled X, H and Y•]

Estimate geometry directly via group additive distance estimates

[Flowchart: alongside the double-ended route (two embedded geometries → Double-ended search), the direct route is Generate bounds matrix → Edit bounds matrix for TS → Embed matrix in 3D, feeding Optimize TS geometry → IRC calculation]

•Database arranged in tree structure as for kinetics
•Trained on successfully optimized transition states
•Direct guess much faster than double-ended search
•Success depends on training data
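The opening puzzle of the talk (15 known distances, predict the 16th) is a small instance of an additive model. The sketch below assumes each distance is the sum of a molecule-group effect and a radical-group effect, and recovers a hidden entry from the known ones. The data are synthetic and exactly additive, not values from the talk, and `predict_missing` is an invented helper rather than the authors' training procedure.

```python
def predict_missing(table, row, col):
    """Estimate table[row][col] assuming distance = row effect + column effect.

    Each known 'rectangle' gives one estimate,
    d[row][c] + d[r][col] - d[r][c]; the estimates are averaged.
    """
    estimates = []
    for r, other_row in enumerate(table):
        for c, corner in enumerate(other_row):
            if r == row or c == col:
                continue
            known = (table[row][c], table[r][col], corner)
            if None not in known:
                estimates.append(known[0] + known[1] - known[2])
    if not estimates:
        raise ValueError("not enough known entries")
    return sum(estimates) / len(estimates)


# Synthetic, exactly additive data in Å (hypothetical group contributions).
row_effect = [1.20, 1.25, 1.30, 1.35]  # one per molecule group
col_effect = [0.00, 0.02, 0.05, 0.09]  # one per radical group
table = [[r + c for c in col_effect] for r in row_effect]

true_value = table[2][3]
table[2][3] = None                     # hide one of the 16 entries
estimate = predict_missing(table, 2, 3)
```

With real data the additivity is only approximate, so the quality of the estimate depends on how much training data the groups have, as the last bullet notes.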

Comparison of the developed methods

                         Double-Ended Searches          Direct Estimates
Input requirements       2 rough estimates              1 good estimate
Distance specifications  One rule for all               Group based estimates
Optimization methods     QST2, SADDLE, Surface Walking  Surface Walking
Computational speed      Slower                         Faster
Small radical reactions  Problematic                    Better
Multiple conformers      Problematic                    Possible

Contributions

•Explained Reaction Mechanism Generator (RMG).
•Created framework to find TS geometries using RMG and RDKit for distance geometry.
•Categorized reaction families, and chose H-abstraction as first target.
•Implemented double-ended TS searches that work with no training data.
•Identified trends in functional group contributions to TS geometries.
•Implemented direct guesses based on group additive estimates, and started to train group values.

Department of Chemical Engineering
