structure revision of asperjinone using computer assisted structure elucidation methods
DESCRIPTION
The elucidated structure of asperjinone, a natural product isolated from thermophilic Aspergillus terreus, was revised using the expert system Structure Elucidator. The reliability of the revised structure was confirmed using 180 structures containing the (3,3-dimethyloxiran-2-yl)methyl fragment as a basis for comparison and whose chemical shifts contradict the suggested structure.
Structure Revision of Asperjinone using Computer-Assisted Structure Elucidation (CASE)
Methods.
Mikhail Elyashberg, Kirill Blinov, Sergey Molodtsov‡ and Antony J. Williams.§*
Advanced Chemistry Development, Moscow Department, 6 Akademik Bakulev Street, Moscow
117513, Russian Federation,
‡Novosibirsk Institute of Organic Chemistry, Siberian Division, Russian Academy of Sciences, 9
Akademik Lavrent'ev Av., Novosibirsk, 630090 Russian Federation
§Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, NC-27587, USA
Corresponding author:
Antony J. Williams
904 Tamaras Circle, Wake Forest, NC-27587, USA
Phone: +1 (919) 201-1516
Fax:
Email: [email protected]
ABSTRACT
The elucidated structure of asperjinone (1), a natural product isolated from thermophilic
Aspergillus terreus, was revised using the expert system Structure Elucidator. The reliability of
the revised structure (2) was confirmed by comparison with 180 structures containing the
(3,3-dimethyloxiran-2-yl)methyl fragment (3), whose chemical shifts contradict the originally
suggested structure (1).
Computer Assisted Structure Elucidation (CASE)1, 2 methods are widely used to identify
the structures of newly isolated natural products as well as new products of organic synthesis. In
the past decade it has been shown based on multiple comparisons2, 3 that the most advanced
CASE expert system is ACD/Structure Elucidator.3, 4 The system was developed with the
intention of elucidating the chemical structures of organic molecules from their MS, 1D and 2D
NMR spectra, generally employed in combination. In the literature there are many examples
documenting the successful application of Structure Elucidator, not only for the elucidation of
complex natural products but also for the purpose of structure revision.5, 6 Recently the
successful computer-assisted structure elucidation of an organic synthesis product whose
structure seemed undecipherable by traditional 2D NMR methods was described,7 while Codina
et al.8 utilized the system for the analysis of a complex organic mixture.
Further development of the system is driven primarily by continuously challenging the
program with new structural problems described in the literature and, because of the general
complexity of the compounds described, especially new compounds reported in the Journal of
Natural Products. During the course of this work we utilized the spectroscopic data reported by Liao
et al.9 for deducing the structure of a new natural product named asperjinone (1), presented
in Figure 1. This compound was isolated, along with 12 other known compounds, from
Aspergillus terreus. As a result of our analysis using Structure Elucidator, the structure of 1 was
revised, and we suggest that structure 2 is the correct one (see Figure 1).
Figure 1. The previously proposed structure of asperjinone (1) and the revised structure, 2.
Even though an expert system in general mimics human thinking during the molecular
structure elucidation process from spectroscopic data, the associated mathematical algorithms act
in other ways. The program automatically forms a set of “axioms” and hypotheses on the basis of
the available spectroscopic data and then deduces all (without any exception) structures which
are logical corollaries of the initial set of “axioms”. The molecular formula C22H20O6 and the
NMR data presented in Table 1 obtained from the reported work9 were used as input into the
Structure Elucidator software.
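As a quick sanity check on the input, the number of rings plus pi bonds implied by the molecular formula can be computed directly. The sketch below is only an illustration and is not part of Structure Elucidator; the function names are hypothetical, and the parser assumes a simple formula without brackets, charges or isotopes.

```python
import re


def parse_formula(formula: str) -> dict[str, int]:
    """Parse a simple formula such as 'C22H20O6' into element counts."""
    counts: dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def double_bond_equivalents(formula: str) -> float:
    """Rings plus pi bonds for a C/H/N/O/halogen formula: (2C + 2 + N - H - X) / 2."""
    c = parse_formula(formula)
    halogens = sum(c.get(x, 0) for x in ("F", "Cl", "Br", "I"))
    return (2 * c.get("C", 0) + 2 + c.get("N", 0) - c.get("H", 0) - halogens) / 2


dbe = double_bond_equivalents("C22H20O6")
print(dbe)  # 13.0
```

For C22H20O6 this gives 13 rings plus pi bonds, a count that any candidate structure generated from these data has to account for.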
Table 1. 1D and 2D spectroscopic data used for the structure elucidation of asperjinone9 (600
MHz, acetone-d6).

Position  δC     Type  δH (J in Hz)                                    HMBCa
1         165.7  C
2         140.7  C
3         137.5  C
4         166.8  C
5         29.2   CH2   3.97, d (11.2); 3.98, d (11.2)                  C-2, 3, 4, 1", 2"
1'        119.0  C
2',6'     131.5  CH    7.63, d (8.1)                                   C-2, 1', 2', 4'
3',5'     115.8  CH    7.01, d (8.1)                                   C-1', 4'
4'        160.3  C
1"        127.5  C
2"        129.6  CH    6.99, m                                         C-4", 6", 7"
3"        120.9  C
4"        152.2  C
5"        117.0  CH    6.66, d (8.6)                                   C-3", 4"
6"        127.3  CH    6.99, m                                         C-5
7"        31.2   CH2   2.67, dd (16.9, 8.0); 2.94, dd (16.9, 5.0)      C-2", 3", 4", 8", 9"
8"        68.8   CH    3.76, m
9"        77.0   C
10"       19.7   CH3   1.22, s                                         C-8", 9", 11"
11"       25.3   CH3   1.33, s                                         C-8", 9", 10"

aHMBC correlations, optimized for 6 Hz, are from the proton(s) stated to the indicated carbon.
The Molecular Connectivity Diagram (MCD) automatically created by the program is presented
in Figure 2. The MCD shows atoms with their chemical shifts and associated properties,
including hybridization states, whether a heteroatom neighbor is possible, and the HMBC
connectivities between atoms. Carbons that are sp3-hybridized are colored blue, sp2-hybridized
carbons violet, and atoms with ambiguous hybridization (sp3 or sp2) light blue. The symbol
“ob” indicates that a given atom has a heteroatom as a neighbor; the symbol “fb” shows that
such a heteroatom neighbor is forbidden. Two atoms in the MCD, C(119.0) and C(120.9), both
colored light blue, were classified as having ambiguous hybridization because these chemical
shifts are characteristic both of carbons in C=C double bonds (sp2) and of a C(sp3) atom
included in an O-C-O fragment. Carbons with chemical shifts in the interval 152-167 ppm are
likely connected to at least one oxygen atom. The information presented in the MCD was used
by the program for structure generation.1 As a result, all structures in agreement with the
HMBC correlations and atom properties were produced. No expert considerations regarding
HMBC correlations, common in a traditional approach, were introduced, and no structural
inputs regarding the presence of aromatic rings or other conceivable rings in the structure
were made.
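The shift-based atom property mentioned above can be illustrated with a few lines of code. This is a minimal sketch, not the rule set used by the program: it only applies the 152-167 ppm interval quoted in the text to the 13C shifts transcribed from Table 1, and the names `C13_SHIFTS`, `OXYGEN_WINDOW` and `likely_oxygen_bonded` are hypothetical.

```python
# 13C shifts (ppm) keyed by position, transcribed from Table 1.
C13_SHIFTS = {
    "1": 165.7, "2": 140.7, "3": 137.5, "4": 166.8, "5": 29.2,
    "1'": 119.0, "2',6'": 131.5, "3',5'": 115.8, "4'": 160.3,
    '1"': 127.5, '2"': 129.6, '3"': 120.9, '4"': 152.2, '5"': 117.0,
    '6"': 127.3, '7"': 31.2, '8"': 68.8, '9"': 77.0, '10"': 19.7, '11"': 25.3,
}

# Interval quoted in the text for carbons likely bonded to oxygen.
OXYGEN_WINDOW = (152.0, 167.0)


def likely_oxygen_bonded(shift: float, window: tuple[float, float] = OXYGEN_WINDOW) -> bool:
    """True if the shift falls inside the (inclusive) oxygen-bonded window."""
    low, high = window
    return low <= shift <= high


flagged = {pos for pos, shift in C13_SHIFTS.items() if likely_oxygen_bonded(shift)}
print(flagged)
```

With the Table 1 data, the flagged positions are C-1, C-4, C-4' and C-4", that is, the carbons at 165.7, 166.8, 160.3 and 152.2 ppm.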
Figure 2. The Molecular Connectivity Diagram (MCD) extracted from the spectroscopic data.
Atoms are arranged artificially so that their positions approximately correspond to the atom
positions in revised structure 2.
The following results from the structure generation process were obtained:
k = 3658 → 2641 → 1939, tg = 1 m 50 s. This indicates that 3658 isomeric structures were generated
in 1 m 50 s, and 2641 structures were stored on disc after spectral and structural filtering.4
13C NMR chemical shifts were then calculated for the stored structures using an incremental
approach10 (this procedure took 8 s), and duplicate structures were removed to give 1939
structures. During the latter procedure, the isomer with the minimal deviation between the
experimental and calculated chemical shifts was selected as the “best” representative of each set of
identical structures. The output structural file was ranked in ascending order of the chemical shift
deviation. 13C chemical shifts were predicted for all 1939 structures using a neural network based
program (14 s calculation time) and then for the first 15 structures of the ranked file using
a HOSE code based program1 (1 min calculation time). The first 9 structures of the ranked file
are displayed in Figure 3. Atoms for which Δδ = |δC,calc − δC,exp|, the difference between
calculated and experimental chemical shifts, is less than 3 ppm are marked by green circles;
yellow circles correspond to Δδ = 3-15 ppm and red circles to Δδ > 15 ppm. The figure shows that
the first ranked structure (fully green) is characterized by the smallest deviations calculated by the
HOSE code and neural network based methods, while the structure proposed by Liao and co-workers9
was placed in third position by the ranking procedure. Its deviation is almost twice the size of
that given for the structure ranked in first position.
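The ranking and color-coding logic described above can be sketched as follows. This is an illustration under stated assumptions, not the program's implementation: the deviation statistic is taken here to be a mean absolute difference (the text does not specify the exact statistic), the function names are hypothetical, and the predicted shifts for candidates "A" and "B" are invented numbers used only to exercise the code.

```python
def mean_abs_deviation(calculated: list[float], experimental: list[float]) -> float:
    """Mean |calc - exp| over matched atoms (assumed statistic, see lead-in)."""
    if not experimental or len(calculated) != len(experimental):
        raise ValueError("shift lists must be non-empty and of equal length")
    return sum(abs(c - e) for c, e in zip(calculated, experimental)) / len(experimental)


def color_for_deviation(delta: float) -> str:
    """Color bins from the text: green < 3 ppm, yellow 3-15 ppm, red > 15 ppm."""
    if delta < 3.0:
        return "green"
    if delta <= 15.0:
        return "yellow"
    return "red"


def rank_candidates(candidates: dict[str, list[float]], experimental: list[float]):
    """Return (name, deviation) pairs in ascending order of deviation."""
    scored = [(name, mean_abs_deviation(calc, experimental)) for name, calc in candidates.items()]
    return sorted(scored, key=lambda item: item[1])


# Invented two-atom example; these are NOT predictions from the paper.
experimental = [68.8, 77.0]
candidates = {"B": [63.0, 58.0], "A": [69.5, 76.0]}

ranked = rank_candidates(candidates, experimental)
colors_b = [color_for_deviation(abs(c - e)) for c, e in zip(candidates["B"], experimental)]
print(ranked[0][0], colors_b)
```

In this toy case candidate "A" ranks first, and the two atoms of candidate "B" fall in the yellow and red bins.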
To confirm the revised structure 2, we searched the ACD/NMR Database, which contains 425,000
structures with assigned 13C and 1H chemical shifts, for the (3,3-dimethyloxiran-2-yl)methyl
fragment (3) present in structure 1.
[Structure drawing: the (3,3-dimethyloxiran-2-yl)methyl fragment attached to R, with atoms numbered 7", 8", 9", 10" and 11".]
[Figure 3 image: structure drawings of the nine top-ranked candidates, each annotated with its deviations.]

Rank  dN(13C)  dA(13C)  Label
1     1.372    1.384    Revised
2     2.273    2.814
3     2.434    2.859    Proposed
4     2.574    2.494
5     2.696    2.438
6     2.752    2.558
7     2.833    2.630
8     2.890    2.507
9     2.915    2.541
Figure 3. The first 9 structures of the output file ranked by deviations calculated using neural
network and HOSE code based 13C NMR prediction programs. Colored circles on the atoms
display chemical shift differences: green denotes a difference of less than 3 ppm, yellow a
difference between 3 and 15 ppm, and red a difference of more than 15 ppm. Designation of
deviations: dA, HOSE code based algorithm; dN, neural network based algorithm.
The program selected almost 180 structures, from which ca. 150 structures were chosen that
most closely match the environment of the oxirane fragment. For these
structures, a scatter plot was created (see Figure 4). Here the 13C chemical shifts of the atoms
corresponding to C-8" and C-9" of structure 1 are presented for all selected structures. The
chemical shift values (69 and 77 ppm) assigned to atoms C-8" and C-9" in the original structure 1
are also shown by their labels on the right side of the graph.
Figure 4. A scatter plot of the 13C chemical shift values related to atoms 8" and 9" of the original
structure 1. Series 1 (blue circles) corresponds to atom 9" (δC 77 ppm in structure 1), series 2
(violet triangles) corresponds to atom 8" (δC 69 ppm in structure 1).
Inspection of the scatter plot convincingly confirms that the original structure is incorrect:
in the reference structures, the chemical shifts of the carbon corresponding to C-8" (68.8 ppm in
structure 1) are observed in the range 60-65 ppm, while for C-9" (77.0 ppm in structure 1) the
corresponding range is 57-59 ppm.
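The comparison made with the scatter plot amounts to a simple range check, sketched below. The ranges and assigned shifts are the values quoted in the text; the helper name `offset_from_range` is hypothetical and the code is only an illustration of the arithmetic.

```python
# Shift ranges (ppm) observed for the oxirane carbons in the reference structures.
REFERENCE_RANGES = {'C-8"': (60.0, 65.0), 'C-9"': (57.0, 59.0)}

# Shifts (ppm) assigned to the same atoms in the original structure 1.
ASSIGNED = {'C-8"': 68.8, 'C-9"': 77.0}


def offset_from_range(shift: float, low: float, high: float) -> float:
    """Return 0.0 inside [low, high], else the signed distance to the nearest bound."""
    if shift < low:
        return shift - low
    if shift > high:
        return shift - high
    return 0.0


offsets = {
    atom: round(offset_from_range(ASSIGNED[atom], *REFERENCE_RANGES[atom]), 1)
    for atom in ASSIGNED
}
print(offsets)  # {'C-8"': 3.8, 'C-9"': 18.0}
```

Both assigned shifts lie above the reference ranges, by about 3.8 ppm for C-8" and 18 ppm for C-9".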
On the other hand, corroboration of the revised structure 2 was found in the Supporting
Information of the original work.9 One of the compounds isolated by the authors9 along with
asperjinone, designated butyrolactone V, was characterized, and its 13C and 1H NMR chemical
shifts were assigned. This compound contains the revised
structural component of structure 2. Both structures, with their assigned 13C chemical
shifts (for butyrolactone V only a partial assignment is shown), are presented in Figure 5.
Figure 5. Comparison of the chemical shifts in the revised part of structure 2 with those in
butyrolactone V.
The structure comparison leaves no doubt regarding the correctness of structure 2. Moreover,
oxirane 1JCH couplings are typically ~180 Hz, far larger than those of other oxygen-bearing
aliphatic carbons, and the existence of an oxirane ring in the asperjinone structure proved to be
erroneous. We believe that the true structure of asperjinone is as shown in 2, that is:
3-[(3-hydroxy-2,2-dimethyl-3,4-dihydro-2H-chromen-6-yl)methyl]-4-(4-hydroxyphenyl)furan-2,5-dione.
The application of a CASE system to the structure elucidation of this natural product would have
allowed the authors to avoid arriving at this incorrect structure. It should be
noted that, as far as we know, this is the first example in which a reliable structure revision was
performed solely with the aid of a CASE system, without additional experiments or quantum
chemical NMR shift calculations. Our research shows how important it is to verify the structure
of a new compound, at the least by NMR chemical shift prediction using fast and fully automatic
empirical methods.1
EXPERIMENTAL SECTION. All calculations were performed using the expert system
ACD/Structure Elucidator v.12 installed on a PC (2.8 GHz, 3 GB RAM).
REFERENCES AND NOTES.
1. Elyashberg, M. E.; Williams, A. J.; Blinov, K. A. Contemporary Computer-Assisted
Approaches to Molecular Structure Elucidation. RSC Publishing: Cambridge, 2012.
2. Elyashberg, M. E.; Williams, A. J.; Martin, G. E. Prog. NMR Spectr. 2008, 53, 1-104.
3. Steinbeck, C. Nat. Prod. Rep. 2004, 21, 512-518.
4. Elyashberg, M. E.; Blinov, K. A.; Molodtsov, S. G.; Williams, A. J.; Martin, G. E. J.
Chem. Inf. Comput. Sci. 2004, 44, 771-792.
5. Williams, A. J.; Elyashberg, M. E.; Blinov, K. A.; Lankin, D. C.; Martin, G. E.;
Reynolds, W. F.; Porco, J. A., Jr.; Singleton, C. A.; Su, S. J. Nat. Prod. 2008, 71, 581-588.
6. Elyashberg, M.; Williams, A.; Blinov, K. Nat. Prod. Rep. 2010, 27, 1296-1328.
7. Elyashberg, M. E.; Blinov, K. A.; Molodtsov, S. G.; Williams, A. J. Magn. Reson. Chem.
2012, 50, 22-27.
8. Codina, A.; Ryan, R. W.; Joyce, R.; Richards, D. S. Anal. Chem. 2010, 82, 9127-9133.
9. Liao, W.-Y.; Shen, C.-N.; Lin, L.-H.; Yang, Y.-L.; Han, H.-Y.; Chen, J.-W.; Kuo, S.-C.;
Wu, S.-H.; Liaw, C.-C. J. Nat. Prod. 2012, 75, 630-635.
10. Smurnyy, Y. D.; Blinov, K. A.; Churanova, T. S.; Elyashberg, M. E.; Williams, A. J. J.
Chem. Inf. Model. 2008, 48, 128-134.