1 advanced structure search. 2 structure search in beilstein
2
Structure search in BEILSTEIN
<253 non-H atoms
Search type controls substitution:
EXA, FAM
CSS
SSS
4
G-groups
• Elements
• Shortcuts
• Variable groups (Ak, Cy, X, etc.)
• Structural fragments (that you draw)
• Other G-groups
7
Run an SSS sample search
=> l1
SAMPLE SEARCH INITIATED 18:20:45
SAMPLE SCREEN SEARCH COMPLETED - 1648 TO ITERATE
60.7% PROCESSED 1000 ITERATIONS 14 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 30525 TO 35395
PROJECTED ANSWERS: 173 TO 749
CSS Search
9
Run a CSS full search
=> l1 css full
FULL SEARCH INITIATED 18:20:22
FULL SCREEN SEARCH COMPLETED - 33296 TO ITERATE
100.0% PROCESSED 33296 ITERATIONS 4 ANSWERS
SEARCH TIME: 00.00.01
L3 4 SEA CSS FUL L1
=> d scan
CSS Search
11
CSS Search
Be careful:
A CSS search is billed like an SSS search, yet it may return no more than a Family search would.
Look at the following example:
12
CSS Search
=> Uploading C:\Program Files\stnexp\Queries\css.str
chain nodes : 7 8 9
ring nodes : 1 2 3 4 5 6
chain bonds : 4-7 5-9 6-8
ring bonds : 1-2 1-6 2-3 3-4 4-5 5-6
exact bonds : 1-2 1-6 2-3 3-4 4-5 4-7 5-6 5-9 6-8
L1 STRUCTURE UPLOADED
13
CSS Search
=> s l1 full css
L3 4 SEA CSS FUL L1
=> d cost
COST IN U.S. DOLLARS     SINCE FILE ENTRY   TOTAL SESSION
CONNECT CHARGES                0.37             0.52
NETWORK CHARGES                0.06             0.12
SEARCH CHARGES               160.90           160.90
                             -------          -------
FULL ESTIMATED COST          161.33           161.54
14
CSS Search
=> s l1 full fam
L5 4 SEA FAM FUL L1
=> d cost
COST IN U.S. DOLLARS     SINCE FILE ENTRY   TOTAL SESSION
CONNECT CHARGES                0.74             0.89
NETWORK CHARGES                0.12             0.18
SEARCH CHARGES                64.00            64.00
                             -------          -------
FULL ESTIMATED COST           64.86            65.07
=> l3 or l5
L6 4 L3 OR L5
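The cost difference between the two searches can be made explicit with a little arithmetic. The sketch below is only an illustration: the figures are copied from the two `d cost` displays above, and the names `charges`, `answers`, `cost_per_answer` and `extra_cost` are hypothetical helpers, not part of STN.

```python
# Search charges (U.S. dollars) and answer counts copied from the session above.
charges = {"CSS": 160.90, "FAM": 64.00}
answers = {"CSS": 4, "FAM": 4}


def cost_per_answer(search_type: str) -> float:
    """Search charge divided by the number of answers retrieved."""
    return charges[search_type] / answers[search_type]


def extra_cost(expensive: str, cheap: str) -> float:
    """How much more the first search type cost than the second."""
    return round(charges[expensive] - charges[cheap], 2)


if __name__ == "__main__":
    for kind in charges:
        print(f"{kind}: {charges[kind]:.2f} USD, "
              f"{cost_per_answer(kind):.2f} USD per answer")
    print(f"Extra cost of CSS over FAM: {extra_cost('CSS', 'FAM'):.2f} USD")
```

Because `L3 OR L5` still contains 4 answers, both searches retrieved the same substances in this example, so the difference is pure overhead.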
17
Search Question: Locate analogs of the following substances to be used as possible synthetic intermediates
[Structure diagram: N-CH3 ring system bearing substituent R']
R' = C(=O)R'' or CN
R'' = NHCH3, OCH2Ph, Ak
G-groups
18
Challenges
R’ represents several classes of compounds
The carboxy derivatives are further defined
Solution
Create a G-group
Embed a G-group within a G-group
G-groups
19
• Draw the fragment(s)
• Label them as fragments: assign the @ to the points of attachment for each fragment
• Save the G-group
G-groups
31
=> Uploading "advstr1.str" in the current file
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: N-Me ring system bearing G2; fragments C(=O)-G1, NH-Me, O-CH2-Ph and Ak, with attachment points @1 to @4]
G1 [@1], [@2], [@3]
G2 CN, [@4]
G-groups
32
=> S L1 SS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L2 50 SEA SSS SAM L1
=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN Ethanone, 1-(5-chloro-1-methyl-1H-indol-3-yl)-
MF C11 H10 Cl N O
[Structure diagram of the named substance]
G2 = an acyl analog
G-groups
33
L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 1H-Indole-2-carboxylic acid, 3-cyano-1-methyl-, ethyl ester (9CI)
MF C13 H12 N2 O2
[Structure diagram of the named substance]
G2 = a cyano analog
G-groups
34
=> S L1 SSS FULL
FULL SEARCH INITIATED 13:43:28 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 12107 TO ITERATE
100.0% PROCESSED 12107 ITERATIONS 797 ANSWERS
SEARCH TIME: 00.00.15
L3 1166 SEA SSS FUL L1
=> FILE CAPLUS
=> S L3/RCT
342 L3
2185962 RCT/RL
L4 171 L3/RCT (L3 (L) RCT/RL)
=> D IBIB ABS HITSTR 38
Role indexing limits retrieval: RCT = reactant
Which substance from L3 is indexed to this record?
HITSTR provides the hit CAS RN, index name and structure
G-groups
35
• Drawing specific structures
– Fragments with 2 points of attachment
– Variable points of attachment on multiple rings
G-groups
37
G-groups
1. Draw all the fragments.
2. Label the points of attachment with @.
3. Create separate G-groups.
4. “Orient” fragments with 2 points of attachment during the SAVE operation.
38
G-groups
The quinoline ring is drawn twice to account for 2 points of attachment.
• All fragments in G1 (Y) are given 2 points of attachment.
• The amide fragment in Y is drawn twice, to account for both orientations.
• Carbons are left open for substitution.
39
G-groups
1. In the Define New G-Group dialog box, click Fragments.
2. Use Next Fragment to navigate through the fragments in the structure, including the desired fragments.
40
G-groups
The Ak variable was chosen from the Variables menu.
Fragments with 2 points of attachment show up as [*1-*2].
41
G-groups
Orientation of a fragment takes place during the SAVE operation. It cannot be bypassed!
1. Two nodes are highlighted.
2. Show Fragment is used to see a fragment and select the node that attaches to G2.
42
G-groups
3. Select the appropriate node for each fragment.
• For the amide (an unsymmetrical fragment), choose the node carefully so that the selected node is oriented toward the highlighted node in the structure window (G2).
43
G-groups
Each end of each fragment is highlighted during query verification. This shows the orientation of the fragment relative to the rest of the structure.
44
Search Question: Locate benzyl-substituted N-containing ring systems described by compound AA
[Structure diagram AA: N-containing ring with variable ring member R, bearing a CH2-Ph group]
R = N, N-C, NULL
G-groups
45
Challenges
R describes 3 ring systems
How to describe “NULL”
Solution
Create a G-group, with fragments, embedded in the ring
• Start with a five-membered ring
• Use a [0-1] repeating group on G in a six-membered ring
G-groups
46
• Draw the desired 5-membered ring
• Draw fragments to account for other ring sizes
• Label the fragments with two points of attachment (@)
• Define the G-group; put it in the ring
• Verify the orientation of the fragments
G-groups
47
Challenges (cont.)
The N-C fragments must be oriented to retrieve 1,3 systems
Solution
Use two points of attachment and assign the orientation during the SAVE process
G-groups
51
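=> Uploading "advstr2.str" in the current file
L6 STRUCTURE UPLOADED
=> D
L6 HAS NO ANSWERS
L6 STR
[Structure diagram: N-containing ring with G1 as a ring member and a CH2-Ph substituent; two fragments, each with two attachment points (@1-@2 and @3-@4)]
G1 C, [@1-@2], [@3-@4]
G-groups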
52
=> S L6 SS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L7 50 SEA SSS SAM L1
=> D SCAN
L7 50 ANSWERS REGISTRY COPYRIGHT 1999 ACS
IN 2H-1,3-Diazepin-2-one, o o o
MF C36 H36 N4 O4 S
A seven-membered diazepine ring
[Structure diagram of the answer]
G-groups
53
L7 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 2-Pyrrolidinone, 1-(benzoyloxy)-4-methyl-5,5-bis(phenylmethyl)- (9CI)
MF C26 H25 N O3
A five-membered pyrrolidine ring
[Structure diagram of the named substance]
G-groups
54
• Draw the desired 6-membered ring
• Draw fragments to account for other ring sizes
• Label the fragments with two points of attachment (@)
• Define the G-group; put it in the ring
• Use a [0-1] repeating group on G
• Verify the orientation of the fragments
G-groups
56
G-groups
=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\w8.str
L1 STRUCTURE UPLOADED
=> l1
SAMPLE SEARCH INITIATED 13:07:26
SAMPLE SCREEN SEARCH COMPLETED - 12012 TO ITERATE
L2 20 SEA SSS SAM L1
62
G-groups
=> l1 sss full
L3 12 SEA SSS FUL L1
[Structure diagram of a retrieved answer]
G1 = H, OH, Me ???
How can you get the desired compounds?
63
G-groups
Ring isolated; all bonds exact
=> l1 css full
L3 7 SEA CSS FUL L1
[Structure diagram: query with two Cl and two G1 positions]
G1 H,Me,OH
64
G-groups
=> l1 css full
L3 7 SEA CSS FUL L1
[Structure diagrams of retrieved answers: chlorinated structures, several with OH groups and R/S stereodescriptors]
65
G-groups
Ring isolated; all bonds exact
=> l1 sss full
L3 739 SEA SSS FUL L1
[Structure diagram: query with two Cl and two G1 positions]
G1 H,Me,OH
66
G-groups
[Structure diagrams of retrieved answers with additional substituents, including Cl, Br, Me, OH, P- and S-containing groups]
68
G-groups
=> l1
SAMPLE SEARCH INITIATED 13:05:49
SAMPLE SCREEN SEARCH COMPLETED - 1642 TO ITERATE
60.9% PROCESSED 1000 ITERATIONS 4 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 30410 TO 35270
PROJECTED ANSWERS: 4 TO 284
L2 4 SEA SSS SAM L1
70
G-groups
[Structure diagram: N-containing ring]
1) Only H or Ak can be attached to the nitrogen
2) The ring is isolated
3) Any substitution is allowed at the open sites
72
G-groups
=> fil reg
=> c6/ea and nrs=1 and nc=1
L1 1536728 C6/EA AND NRS=1 AND NC=1
=> Uploading C:\Program Files\stnexp\Queries\zanzola.str
L2 STRUCTURE UPLOADED
73
G-groups
=> l2 sample subset=l1
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
L3 50 SEA SUB=L1 SSS SAM L2
=> d scan
[Structure diagram of a retrieved answer]
76
G-groups
=> Uploading C:\Program Files\stnexp\Queries\zanzolabis.str
L4 STRUCTURE UPLOADED
=> l4 sample subset=l1
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
L5 50 SEA SUB=L1 SSS SAM L4
=> d scan
79
To specify a repeating group:
• Highlight all atoms in the repeating group
• Select [ ] m-n from the Draw menu
• Enter the repeat values
• Use an unspecified bond between atoms in the repeating group to allow any bonding in the repetition
Repeating Groups
80
A repeating group may range from 0-20 only. However:
• a single atom or a group of atoms may be repeated
• a query may contain multiple repeating groups
Repeating Groups
81
Repeating Groups
Search Question: Locate studies on carboxylic acids in milk that have the following structure:
CH3-R'-C(=O)OH
where R' = an unsubstituted carbon chain of 10-40 atoms with any type of bonding between the atoms
82
Challenges
R' = 10-40 atom carbon chain with any type of bonding
Solution
Use [ ] m-n (repeating group)
Repeating Groups
83
A repeating group may range from 0-20 only. However:
• a single atom or a group of atoms may be repeated
--(C)0-1(C--C)5-20--
Repeating Groups
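Since a single repeat is capped at 20, a 10-40 atom chain has to be assembled from more than one repeating group. The following sketch enumerates which chain lengths the combination `(C)0-1(C--C)5-20` can reach. It assumes that the atom counts of consecutive repeating groups simply add up; `reachable_lengths` is a hypothetical helper written for this illustration, not an STN function.

```python
from itertools import product


def reachable_lengths(groups):
    """Return the sorted chain lengths reachable by concatenating repeating groups.

    groups: list of (atoms_per_unit, min_repeat, max_repeat) tuples.
    """
    sizes = [size for size, _, _ in groups]
    repeat_ranges = [range(low, high + 1) for _, low, high in groups]
    totals = {
        sum(size * count for size, count in zip(sizes, combo))
        for combo in product(*repeat_ranges)
    }
    return sorted(totals)


# (C)0-1 followed by (C--C)5-20, as drawn on the slide
lengths = reachable_lengths([(1, 0, 1), (2, 5, 20)])

if __name__ == "__main__":
    print(f"shortest chain: {lengths[0]} atoms, longest chain: {lengths[-1]} atoms")
    print(f"every length in between reachable: "
          f"{lengths == list(range(lengths[0], lengths[-1] + 1))}")
```

Under this counting, the optional single carbon supplies the odd lengths and the two-carbon unit supplies the even ones. The combination also reaches 41 atoms (1 + 2 x 20), one more than the 10-40 range in the search question, so the longest hits may need to be screened out afterwards.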
85
=> FILE REGISTRY
=> Uploading acid.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: Me-(C)0-1-(C--C)5-20-C(=O)OH]
Repeating Groups
86
D SCAN
L2 46 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN Chitosan, octadecanoate (salt) (9CI)
MF C18 H36 O2 . x Unspecified
CM 1
*** STRUCTURE DIAGRAM IS NOT AVAILABLE
CM 2
HO2C-(CH2)16-Me
Repeating Groups
87
L2 34 ANSWERS REGISTRY COPYRIGHT 1999 ACS
IN 9-Dodecenoic acid (7CI, 8CI, 9CI)
MF C12 H22 O2
HO2C-(CH2)7-CH=CH-Et
Repeating Groups
89
• Specify substitution with G-groups
• Use H
• Use Non-H Attachments (Connectivity)
• Exclude atoms
• CSS - closed substructure search
Blocking Substitution
93
Search Question: Locate substances with the following general structure characteristics
[Structure diagram with N N, N and R' labels]
• Two Br's
• R' = anything except an additional bromine
• The N substituent may be in a ring or a chain
• No further substitution on this ring; substitution is allowed at all other open positions
Blocking Substitution
94
Challenges
R’ is any atom except Br
The N may be in a ring or chain
Solution
Use exclude Br
Assign Ring/chain Node Characteristics
Blocking Substitution
95
Challenges (cont.)
Two Br, with no additional substitution on one ring. Substitution is allowed at other open sites.
Solution
1: Use G1=H/Br on 4 positions and an SSS search
2: Use a Variable Point of Attachment and a CSS search with VPAs and Non-hydrogen Attachments (Connectivity)
Blocking Substitution
96
=> Uploading dye1.str
L1 STRUCTURE UPLOADED
=> D
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: N N, N and Br nodes with four G1 positions]
G1 H,Br
Approach 1: G1=H/Br
Blocking Substitution
98
=> L1 SSS SAM
SAMPLE SEARCH INITIATED
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
L2 50 SEA SSS SAM L1
=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 1998 ACS
IN 3-Pyrrolidinol, 1-[4-[(4-nitrophenyl)azo]phenyl]-, (R)- (9CI)
MF C16 H16 N4 O3
[Structure diagram of the named substance]
All G1 values are H
Blocking Substitution
99
A CSS blocks substitution at Open Sites, but allows for:
• Variables (Ak, Cb, G-groups)
• VPAs
• Repeating groups
• Variable bonds
• Excluded atoms
• Substitutions set at Non-Hydrogen Atoms (Connectivity)
Blocking Substitution
100
Feature            SSS                        CSS
G-groups           yes                        yes
Variable groups    yes                        yes
VPA                yes                        yes
REP groups         yes                        yes
Exclusions         yes                        yes
Variable bonds     yes                        yes
Open positions     ANY substitution allowed   NO substitution (unless set by connectivity)
Blocking Substitution
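The one row in which SSS and CSS differ can be written down as a small decision rule. This is a toy model of the comparison above, not STN code; `open_site_substitution_allowed` and its parameter names are hypothetical.

```python
def open_site_substitution_allowed(search_type: str,
                                   min_non_h_attachments_set: bool = False) -> bool:
    """Toy model: may an open position of the query carry a substituent?

    search_type: "SSS" or "CSS" (case-insensitive).
    min_non_h_attachments_set: True if a minimum number of Non-H Attachments
    (connectivity) was set on that node in the query.
    """
    kind = search_type.upper()
    if kind == "SSS":
        # A substructure search allows any substitution at open positions.
        return True
    if kind == "CSS":
        # A closed substructure search blocks open sites unless connectivity opens them.
        return min_non_h_attachments_set
    raise ValueError(f"unsupported search type: {search_type!r}")


if __name__ == "__main__":
    print(open_site_substitution_allowed("SSS"))
    print(open_site_substitution_allowed("CSS"))
    print(open_site_substitution_allowed("CSS", min_non_h_attachments_set=True))
```

All other query features in the table (G-groups, variable groups, VPA, repeating groups, exclusions, variable bonds) behave the same in both search types, which is why the model only needs these two inputs.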
101
Exclude an atom
• From the Draw menu, choose Atom or Variable
• Select the desired Atom or Variable
• Click EXCLUDE
Blocking Substitution
102
Allow substitution in a CSS search
• Highlight the atom of interest (or right-click it)
• Click QueryDef, then Non-H Attachments
• Set to a minimum of 1
Blocking Substitution
103
Specify variable points of attachment to a ring system by adding a VPA:
• Build the ring system and the variably attached fragment
• Highlight the attachment atom
• With the Shift key, highlight the attachment points
• Select VPA from the Draw menu
Blocking Substitution
104
=> FILE REGISTRY
=> Uploading dye2.str
L3 STRUCTURE UPLOADED
=> D L3
L3 HAS NO ANSWERS
L3 STR
Approach 2: CSS, Non-hydrogen attachments and VPA
[Structure diagram: N N, N and Br nodes, annotated with Conn. Min. 2, Conn. Min. 1, R/C and Conn. Min. 2]
Blocking Substitution
105
=> S L3 CSS FULL
FULL SEARCH INITIATED 16:21:29
FULL SCREEN SEARCH COMPLETED - 1303 TO ITERATE
100.0% PROCESSED 1303 ITERATIONS 338 ANSWERS
L4 62 SEA CSS FUL L3
Blocking Substitution
[Structure diagram of a retrieved answer]
107
Blocking Substitution
[Structure diagram: N, N, N, Br, Br, Br, H and G1 nodes, with fragments numbered 1 and 2]
G1 [@1],[@2]
Consider this structure. Same conditions as the previous one.
108
Blocking Substitution
=> Uploading C:\Program Files\stnexp\Queries\gbrexluded.str
L5 STRUCTURE UPLOADED
=> l5 full css
L6 84 SEA CSS FUL L5
=> l6 not l4
L7 22 L6 NOT L4
[Structure diagram of a retrieved answer]
109
Exclude an atom
• From the Draw menu, choose Atom or Variable
• Select the desired Atom or Variable
• Click EXCLUDE
Blocking Substitution
BUT if you exclude an atom or variable, you also exclude hydrogen
110
O
O
C
C
C
Conn. = E1 Max2
CN
C ClC H3
C
O
Ph
CF3
No Yes
No YesYes Yes
No No
No No
Skills Practice
Query
In Registry
111
[Structure diagram: six C nodes and Ak]
If you want Ak to be unsubstituted, could you use connectivity? If yes, which value should it have?
If you set the connectivity to E=1, do you also get branched chains? If yes, how can you isolate them?
114
Alkyl, Cycloalkyl: not substituted
Other substitutions allowed
R1 = H, Alkyl
R2 = H, Alkyl, Cycloalkyl
A = N, Alkyl
Skills Practice
116
Structure Search Limits
Scope of Search                   Iterations   Answers
Sample (online, subset, range)    2000         50
Full (online, subset, range)      1,000,000    1,000,000
Batch (online, subset)            1,500,000    1,500,000
System Limits
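One way to use the table is to compare the upper bounds of a sample search's projections with the limits of each scope. The sketch below assumes the limits exactly as listed above and keeps them in a dictionary so that they can be replaced; the session captures elsewhere in this deck appear to have been recorded under lower limits (for example a full search stopping at 400000 iterations). `SEARCH_LIMITS`, `fits` and `first_scope_that_fits` are hypothetical names.

```python
# Limits as listed in the table above: scope -> (max iterations, max answers).
SEARCH_LIMITS = {
    "sample": (2_000, 50),
    "full": (1_000_000, 1_000_000),
    "batch": (1_500_000, 1_500_000),
}


def fits(scope, projected_iterations, projected_answers, limits=SEARCH_LIMITS):
    """True if both projections stay within the limits of the given scope."""
    max_iterations, max_answers = limits[scope]
    return projected_iterations <= max_iterations and projected_answers <= max_answers


def first_scope_that_fits(projected_iterations, projected_answers, limits=SEARCH_LIMITS):
    """Return "full" or "batch", whichever fits first, or None if neither does."""
    for scope in ("full", "batch"):
        if fits(scope, projected_iterations, projected_answers, limits):
            return scope
    return None


if __name__ == "__main__":
    # Hypothetical projections, chosen only to exercise each branch.
    print(first_scope_that_fits(500_000, 20_000))
    print(first_scope_that_fits(1_200_000, 20_000))
    print(first_scope_that_fits(2_000_000, 10))
```

When the function returns None, the query itself has to change, or the search has to be restricted by range or subset.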
117
No. of Substances Crossed to Search
                           CAS Files                              Non-CAS Files
REGISTRY file crossover    300,000 (CAplus), 40,000 (CASREACT)    10,000
=> HELP SLIMIT
=> HELP CROSSOVER
System Limits
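The crossover limits can be checked before changing files. In this sketch the numbers come from the table above; treating every file other than CAplus and CASREACT as a non-CAS file is a simplifying assumption, and `can_cross_over` is a hypothetical helper.

```python
# Crossover limits from the table above.
CAS_FILE_LIMITS = {"CAPLUS": 300_000, "CASREACT": 40_000}
NON_CAS_FILE_LIMIT = 10_000  # assumption: applied to any file not listed above


def can_cross_over(n_substances: int, target_file: str) -> bool:
    """True if an answer set of n_substances may be crossed into target_file."""
    limit = CAS_FILE_LIMITS.get(target_file.upper(), NON_CAS_FILE_LIMIT)
    return n_substances <= limit


if __name__ == "__main__":
    print(can_cross_over(58_645, "CAplus"))
    print(can_cross_over(58_645, "CASREACT"))
```

The authoritative values are the ones reported by `HELP CROSSOVER` in the current STN session.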
118
Strategies when search limits are reached:
Query modification
• Structure drawing techniques
• Screens
STN system-related options
• Batch searching
• Range searching
• Subset searching
System Limits
119
=> S L1 SSS FULL
FULL SEARCH INITIATED 15:30:15 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - >1,000,000 TO ITERATE
15.3% PROCESSED 153323 ITERATIONS 1 ANSWERS
40.0% PROCESSED 400000 ITERATIONS 31 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.01.07
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 419
L2 31 SEA L1 SSS FULL
System Limits
120
Strategies when search limits are reached:
– Query modification
• Structure drawing techniques
System Limits
121
Strategies when search limits are reached:
• Ring isolation
• Changing bond values
• Additional substitution
System Limits
122
Locate references discussing the use of steroidal substances with the following structure as therapeutic agents. Structures must have O-substitution at the specified position. No C-O bond order is specified.
[Structure diagram: steroid ring system with an O substituent at the specified position]
System Limits
123
=> FILE REGISTRY
=> Uploading steroid1.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: steroid ring system with an O substituent]
• Unspecified C-O bond
• Unspecified bonds for keto/enol
• Default isolated/embedded ring
• C-O bond is set to ring/chain
System Limits
124
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 07:35:47 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 40288 TO ITERATE
2.5% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 793876 TO 817644
PROJECTED ANSWERS: 101993 TO 110727
L2 50 SEA SSS SAM L1
System Limits
125
Ring Isolation
• No additional fused, bridged or spiro-fused attachments
• Improves efficiency of the first stage of the search (screening)
• Reduces the number of substances to iterate, and usually reduces the number of answers
System Limits
126
=> Uploading stlisol.str
L9 STRUCTURE UPLOADED
=> D L9
L9 HAS NO ANSWERS
L9 STR
[Structure diagram: steroid ring system with an O substituent]
• Unspecified C-O bond
• Unspecified bonds for keto/enol
• C-O bond is set to ring/chain
• Ring is isolated
System Limits
127
=> S L9 SSS SAM
SAMPLE SEARCH INITIATED 07:51:14 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 17972 TO ITERATE
5.6% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 351444 TO 367436
PROJECTED ANSWERS: 54995 TO 61463
L10 50 SEA SSS SAM L9
System Limits
128
=> S L9 SSS FULL
FULL SEARCH INITIATED 14:12:28 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 363218 TO ITERATE
100.0% PROCESSED 363218 ITERATIONS 58645 ANSWERS
SEARCH TIME: 00.00.11
L11 58645 SEA SSS FUL L9
=> FILE CAPLUS
=> S L11/THU
59530 L2
368975 THU/RL
L12 3082 L11/THU (L11 (L) THU/RL)
System Limits
129
=> D SCAN
L12 3082 ANSWERS CAPLUS COPYRIGHT 2001 ACS
IC ICM A61K-031/575
CC 63-6 (Pharmaceuticals)
TI Eye drop for the treatment of gray cataract
ST eye drop gray cataract bile ext
IT Bile
   Cataract
o o o
IT 57-88-5, Cholesterol, biological studies 81-25-4, Cholic acid 149-91-7, Gallic acid, biological studies 635-65-4, Bilirubin, biological studies 25312-65-6, Cholanic acid
RL: BAC (Biological activity or effector, except adverse); THU (Therapeutic use); BIOL (Biological study); USES (Uses)
System Limits
130
Changing Bond Values
• Change from unspecified to a specific value
• Improves efficiency of the first stage of the search (screening), reducing the number of substances to be iterated
• Changing ring/chain to chain OR ring also reduces the number of substances to be iterated
System Limits
131
=> Uploading stlnode.str
L11 STRUCTURE UPLOADED
=> D L11
L11 HAS NO ANSWERS
L11 STR
[Structure diagram: steroid ring system with an O substituent]
• Single C-O bond
• Single ring bonds
• C-O bond is set to chain
• Ring is isolated
System Limits
132
=> S L11 SSS SAM
SAMPLE SEARCH INITIATED 07:52:33 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 15539 TO ITERATE
6.4% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 303339 TO 318221
PROJECTED ANSWERS: 44325 TO 50151
L12 50 SEA SSS SAM L11
System Limits
133
Additional Substitution
• Adding NON-HYDROGEN substituents decreases the number of substances to be iterated
• Provides a wider selection of screens
System Limits
134
=> Uploading str1subs.str
L13 STRUCTURE UPLOADED
=> D L13
L13 HAS NO ANSWERS
L13 STR
[Structure diagram: steroid ring system with two O substituents]
• Single C-O bond
• Single ring bonds
• C-O bond is set to chain
• Ring is isolated
• Another C-O bond
System Limits
135
=> S L13 SSS SAM
SAMPLE SEARCH INITIATED 13:03:35 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 10290 TO ITERATE
9.7% PROCESSED 1000 ITERATIONS 11 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 199734 TO 211866
PROJECTED ANSWERS: 1625 TO 2901
L14 11 SEA SSS SAM L13
System Limits
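The effect of each drawing technique shows up in the "TO ITERATE" figure of the four sample searches above. The sketch below only tabulates those figures; the counts are copied from the session output, while the step labels and `percent_reduction` are wording and naming chosen for this illustration.

```python
# "TO ITERATE" counts from the four SAMPLE searches of the steroid query.
steps = [
    ("steroid1.str: default ring, unspecified bonds", 40288),
    ("stlisol.str: ring isolated", 17972),
    ("stlnode.str: single bonds, C-O set to chain", 15539),
    ("str1subs.str: second C-O bond added", 10290),
]


def percent_reduction(before: int, after: int) -> float:
    """Reduction of `after` relative to `before`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    baseline = steps[0][1]
    for label, count in steps:
        print(f"{label:50s} {count:6d}  "
              f"({percent_reduction(baseline, count):5.1f}% fewer than the first query)")
```

In this example ring isolation accounts for most of the saving; the later refinements reduce the screened candidates further but by smaller amounts.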
136
• Techniques that DO NOT decrease the number of substances iterated:
– Blocking substitution with hydrogen
– Adding stereochemistry
– Adding atom attributes
• These techniques have no effect on the number of substances to be iterated, but they typically reduce the number of answers
System Limits
137
Strategies when search limits are reached:
– Query modification
• Screens (structure filters)
System Limits
138
• Add “structure filters” (screens) to the query
• These are screens beyond those automatically generated by STN
• They narrow the potential answer set
• They are added during the structure SAVE process in STN Express
System Limits
139
Locate all structures containing this phenolic structure fragment.
The benzylic C-O bond can be single or double. If single, the O can be part of a ring. The phenyl ring can be part of a larger ring system.
[Structure diagram: phenol (OH) fragment with a benzylic C-O group]
System Limits
140
• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Add relevant structure filters
• Upload revised query
• Run SAMPLE search, evaluate answers
• Run a FULL file structure search
System Limits
141
=> FILE REGISTRY
=> Uploading phenol1.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: phenol (OH) fragment with a benzylic C-O group]
• Benzylic bond is unspecified
• O node is set to ring/chain
• Ring is isolated/embedded
System Limits
142
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 16:11:32 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 32922 TO ITERATE
3.0% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 647671 TO 669209
PROJECTED ANSWERS: 140410 TO 150620
L2 50 SEA SSS SAM L1
System Limits
143
Step  Action
1     Return to Structure Drawing
2     From the FILE menu, click OPEN
3     From the FILE menu, click SAVE
4     In the SAVING dialog box, select the “Refine Using Structure Filters” checkbox. Click SAVE
System Limits
145
Step  Action
5     From the “Refine Using Structure Filters” dialog box, select the desired options:
      • Structure characteristics for atoms with hydrogens attached
      • Number of occurrences
      • Number of rings
      • Isotopes
      • Polymers
System Limits
146
STN Express may suggest structure filters
You may select others from this list
Or from this list
System Limits
147
=> ....Testing the current file.... screen
ENTER SCREEN EXPRESSION OR (END):end
=> SCREEN 1700 AND 1943 AND 2005 AND 1838
L3 SCREEN CREATED
=> SCREEN 2043
L4 SCREEN CREATED
=> Uploading C:Filesfilter.str
L5 STRUCTURE UPLOADED
=> QUE L5 AND L3 NOT L4
L6 QUE L5 AND L3 NOT L4
Filters are converted to their “Screen” numbers
The Query command combines the structure and screen terms
System Limits
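The upload above follows a fixed pattern: one SCREEN command for the required screens, one for the excluded screen, then a QUE command that combines the L-numbers. A minimal sketch of assembling those command strings is shown below. `screen_command` and `que_command` are hypothetical helpers that only format text in the shape seen in this session; they do not talk to STN, and only the single-required, single-excluded form shown on the slide is modelled.

```python
def screen_command(screen_numbers):
    """Format a SCREEN command that ANDs the given screen numbers."""
    if not screen_numbers:
        raise ValueError("at least one screen number is required")
    return "SCREEN " + " AND ".join(str(number) for number in screen_numbers)


def que_command(structure, required_screens, excluded_screens):
    """Format 'QUE <structure> AND <required> NOT <excluded>' from L-numbers."""
    return f"QUE {structure} AND {required_screens} NOT {excluded_screens}"


if __name__ == "__main__":
    print("=> " + screen_command([1700, 1943, 2005, 1838]))
    print("=> " + screen_command([2043]))
    print("=> " + que_command("L5", "L3", "L4"))
```

In practice STN Express generates these commands during the upload, so the sketch is mainly useful for reading a session log and seeing which L-number plays which role.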
148
=> D L6
L6 HAS NO ANSWERS
L3 SCR 1700 AND 1943 AND 2005 AND 1838
L4 SCR 2043
L5 STR
L6 QUE ABB=ON PLU=ON L5 AND L3 NOT L4
[Structure diagram: phenol (OH) fragment with a benzylic C-O group]
System Limits
149
=> S L6 SSS SAM
SAMPLE SEARCH INITIATED 16:12:05 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 18039 TO ITERATE
5.5% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **COMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 352769 TO 368791
PROJECTED ANSWERS: 131431 TO 141317
L7 50 SEA SSS SAM L5 AND L3 NOT L4
System Limits
150
=> S L6 SSS FULL
FULL SEARCH INITIATED 16:12:18 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 362253 TO ITERATE
100.0% PROCESSED 362253 ITERATIONS 134146 ANSWERS
SEARCH TIME: 00.00.06
L8 134146 SEA SSS FUL L5 AND L3 NOT L4
=> D SCAN
L8 134146 ANSWERS REGISTRY COPYRIGHT 2001 ACS
IN Benzoic acid, 5-chloro-2-hydroxy-, [3-[[4-(dimethylamino)phenyl]amino]-1-methyl-3-oxopropylidene]hydrazide (9CI)
MF C19 H21 Cl N4 O3
[Structure diagram of the named substance]
System Limits
151
• Strategies when search limits are reached:
– STN system-related options
• Batch searching
System Limits
153
Locate patents discussing the use of organometallic substances containing the following structural fragment as polymerization catalysts.
R = Fe, Ti, Zr, Cr, Mg, Ni, Pd, As, Cu, Mo
[Structure diagram: structural fragment containing R]
System Limits
154
• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Run a FULL file structure search in BATCH mode
• Refine results
System Limits
155
=> FILE REGISTRY
=> Uploading metallo.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: structural fragment containing G1]
G1 Fe,Ti,Zr,Cr,Mg,Ni,Pd,As,Cu,Mo
• A G-group is used to define the metals
• Up to 20 options may be defined for a G-group
System Limits
156
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 16:23:37 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 21936 TO ITERATE
4.6% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.02
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 429897 TO 447543
PROJECTED ANSWERS: 127189 TO 136919
L2 50 SEA SSS SAM L1
System Limits
157
=> D SCAN
L2 50 ANSWERS REGISTRY COPYRIGHT 2001 ACS
IN 1-Butanaminium, N-[(1'-bromoferrocenyl)methyl]-N,N-dimethyl-, bromide (9CI)
MF C17 H25 Br Fe N . Br
CI CCS
[Structure diagram of the named substance]
System Limits
158
• Batch search is run overnight
• Results are saved in an answer set
• ACTIVATE the saved answer set for display or additional searching
• No additional cost for a batch search
System Limits
159
=> BATCH
ENTER QUERY L# FOR BATCH REQUEST OR (END):L1
ENTER BATCH REQUEST NAME OR (END):?
Enter the name you wish to use for the BATCH request. The name must:
o o o
ENTER BATCH REQUEST NAME OR (END):METALLO/B
ENTER TYPE OF SEARCH (SSS), CSS, FAMILY, OR EXACT:SSS
ENTER SCOPE OF SEARCH (FULL) OR RANGE:FULL
QUERY L2 HAS BEEN SAVED AS BATCH REQUEST 'METALLO/B'
System Limits
160
=> D SAVED/A
NAME        CREATED      NOTES/TITLE
----------  ----------   --------------------------
METALLO/A   17 APR 2001  118512 ANSWERS IN FILE REG
=> FILE REGISTRY
=> ACTIVATE METALLO/A
L1 STR
L2 118512 SEA FILE=REGISTRY SSS FUL L1
=> D SCAN
System Limits
161
• Up to 300K REG answers may be crossed over to a CAS database
• The results of a structure search may be refined with additional substance criteria, e.g.
– ELS - Element Symbol
System Limits
162
=> S L2 AND (ZR OR TI)/ELS
109090 ZR/ELS
199331 TI/ELS
L3 28115 L2 AND (ZR OR TI)/ELS
This answer set may now be crossed into the CAplus file
=> FILE CAPLUS
=> S L3(L)(CAT/RL OR CATAL?) AND POLY? AND PATENT/DT
L4 3510 L3(L)(CAT/RL OR CATAL?) AND POLY? AND PATENT/DT
System Limits
163
=> D 1 L5 BIB ABS HITSTR
L5 ANSWER 1 OF 3510 CAPLUS
AN 2001:235582 CAPLUS
TI Metallocene polymerization catalysts for polyolefin preparation
IN Yamamoto, Kazuhiro; Maruyama, Yasuo; Kanno, Toshihiko
PA Nippon Polychemicals Co., Ltd., Japan
SO Jpn. Kokai Tokkyo Koho, 13 pp. CODEN: JKXXAF
DT Patent
LA Japanese
FAN.CNT 1
   PATENT NO.     KIND  DATE      APPLICATION NO.  DATE
   ------------   ----  --------  ---------------  --------
PI JP2001089512   A2    20010403  1999JP-0266058   19990920
HITSTR shows the “hit” CAS RNs for each answer.
System Limits
164
IT INDEXING IN PROGRESS
IT 37206-41-0, Bis(cyclopentadienyl)zirconium dibenzyl
RL: CAT (Catalyst use); USES (Uses)
(catalyst support; metallocene polymn. catalysts for polyolefin prepn.)
RN 37206-41-0 CAPLUS
CN Zirconium, bis(.eta.5-2,4-cyclopentadien-1-. . . . .
[Structure diagram of the hit substance]
System Limits
165
• Strategies when search limits are reached:
– STN system-related options
• Range searching
System Limits
166
• May be used to get the search to run to completion within ONLINE limits
• Use if the full-file projections are not too large AND the projected answers are within limits
• Conduct an immediate search using two structure searches covering different segments of the file
System Limits
167
What has been reported on the use of substances containing the following structural fragment as part of a catalyst or initiator in polymerization processes?
R1 = O, S, C
Additional substitution may be present. No fusion is allowed on the N-containing ring.
[Structure diagram: ring containing two N atoms and R1]
System Limits
168
• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Run a FULL file structure search until it reaches system limits and stops
• Identify the oldest CAS RN in the partial answer set
• Run a range search over the remaining part of the database
• Combine answer sets
• Refine results
System Limits
169
=> FILE REGISTRY
=> Uploading C:\Program Files\Stnexp\Queries\n2ring.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Structure diagram: ring containing two N atoms and G1]
G1 O,S,C
• Use a G-group to define the options for R1
• Isolate the ring to prevent fusion
System Limits
170
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 21:38:35 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 22582 TO ITERATE
4.4% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 442690 TO 460590
PROJECTED ANSWERS: 43190 TO 48944
L2 50 SEA SSS SAM L1
=> D SCAN
System Limits
171
=> S L1 SSS FULL
FULL SEARCH INITIATED 21:12:15 FILE 'REGISTRY'
FULL SCREEN SEARCH COMPLETED - 454927 TO ITERATE
87.9% PROCESSED 400000 ITERATIONS 36444 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.06
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **COMPLETE**
PROJECTED ITERATIONS: 454927 TO 454927
PROJECTED ANSWERS: 40838 TO 42058
L3 36444 SEA SSS FUL L1
Newest compounds are in L3
System Limits
172
=> D 36444 RN
L3 ANSWER 36444 OF 36444 REGISTRY COPYRIGHT 2001 ACS
RN 54833-00-0 REGISTRY
System Limits
173
Type this               To search the following
RAN=,xxx-xx-x           From the oldest CAS RN up through xxx-xx-x
RAN=xxx-xx-x,           From xxx-xx-x through the newest CAS RN
RAN=yyy-yy-y,xxx-xx-x   From yyy-yy-y through xxx-xx-x
System Limits
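The three forms in the table differ only in which side of the comma is filled in. The sketch below formats the `RAN=` clause and checks each CAS RN before it is used. The check-digit rule (weighted sum of the digits, counted from the right, modulo 10) is the general CAS Registry Number convention rather than something stated on the slide, and `is_valid_cas_rn` and `ran_clause` are hypothetical helper names.

```python
import re
from typing import Optional

_CAS_RN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def is_valid_cas_rn(rn: str) -> bool:
    """Check the format and the check digit of a CAS Registry Number."""
    match = _CAS_RN.match(rn)
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def ran_clause(start: Optional[str] = None, end: Optional[str] = None) -> str:
    """Format RAN=start,end; an omitted side means 'oldest' or 'newest'."""
    if start is None and end is None:
        raise ValueError("give at least one CAS RN")
    for rn in (start, end):
        if rn is not None and not is_valid_cas_rn(rn):
            raise ValueError(f"not a valid CAS RN: {rn}")
    return f"RAN={start or ''},{end or ''}"


if __name__ == "__main__":
    print("S L1 SSS " + ran_clause(end="54833-00-0"))
```

Validating the RN first avoids paying for a range search that was started from a mistyped registry number.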
174
=> S L1 SSS RAN=,54833-00-0
RANGE MORE THAN 100,000. WILL BE BILLED AS A FULL FILE SEARCH.
INITIATE SEARCH? Y/(N):Y
RANGE SEARCH INITIATED 21:14:02 FILE 'REGISTRY'
RANGE SCREEN SEARCH COMPLETED - 54955 TO ITERATE
100.0% PROCESSED 54955 ITERATIONS 10234 ANSWERS
SEARCH TIME: 00.00.03
L4 10234 SEA RAN=(,54833-00-0) SSS L1
System Limits
175
=> S L3 OR L4
L5 46677 L3 OR L4
=> FILE CAPLUS
=> S L5/CAT OR (L5 (L) ?POLYM? and (?CATAL? OR ?INITIAT?))
L6 310 L5/CAT OR (L5 (L) ?POLYM? AND (?CATAL? OR ?INITIAT?))
=> D SCAN
System Limits
176
• Use RANGE= in SEARCH to specify a portion of the database
• Use CAS RNs in REGISTRY to define the range
• If the range covers more than 100,000 substances, a full-file search fee is charged; if it covers 100,000 or fewer, a lower fee is charged
To check RN versus time, use: => help rnyear, => help rnweek
System Limits
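The billing rule for range searches is a simple threshold. As a small sketch, with the threshold taken from the bullet above and `fee_class` as a hypothetical name:

```python
FULL_FEE_THRESHOLD = 100_000  # substances, as stated in the bullet above


def fee_class(range_size: int) -> str:
    """Return which fee applies to a range search over range_size substances."""
    if range_size < 0:
        raise ValueError("range size cannot be negative")
    return "full-file fee" if range_size > FULL_FEE_THRESHOLD else "lower fee"


if __name__ == "__main__":
    for size in (50_000, 100_000, 100_001):
        print(f"{size:>7d} substances: {fee_class(size)}")
```

The actual fee amounts are not given in this deck, so the function only reports the fee class.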
177
• Strategies when search limits are reached:
– STN system-related options
• Subset searching
System Limits
178
Structure searches may be run against the entire REGISTRY file, or:
• Against a subset of the database
• The subset may be defined using:
– Substance terms
– Subject terms
System Limits
179
System Limits
[Chart: iterations (limit 1000000) versus compounds in Registry (20, 40 and 80 million) for Structure 1 and Structure 2]
Structure 1: slope (angular coefficient) of a more general query
Structure 2: slope (angular coefficient) of a more specific query
Subset
180
Term                          Field                 Example
Presence/absence of an atom   /ELS                  P/ELS (P is present)
Atom counts                   n/elementsymbol       3/Cl
                              m-n/elementsymbol     3-7/F
                              elementsymbol>n       F>10
Ring system elements          /REL                  N/REL
System Limits
181
Term                   Field                      Example
Ring element counts    n elementsymbol/REL        2/REL
                       n-m elementsymbol/REL      2-3 N/REL
                       >n elementsymbol/REL       >2 N/REL
Ring atom count        /RATC                      6/RATC
System Limits
182
Term                      Field    Example
Ring elemental formula    /RELF    C Fe/RELF
Ring identifier           /RID     46.383/RID
System Limits
183
Skills Practice
Search for saturated, unsubstituted alkyl alcohols containing 10-20 carbon atoms and a chiral center, with a reported boiling point (BP) at 760 Torr.
184
=> Uploading C:\Program Files\stnexp\Queries\Alcohol.str
chain nodes :1 2
chain bonds :1-2
exact/norm bonds :1-2
Connectivity :1:1 E exact RC ring/chain 2:1 E exact RC ring/chain
Match level :1:CLASS 2:CLASS
Generic attributes :1:
Saturation : Saturated
Number of Carbon Atoms : 7 or more
L1 STRUCTURE UPLOADED
185
=> l1
SAMPLE SEARCH INITIATED 05:28:26
SAMPLE SCREEN SEARCH COMPLETED - 931004 TO ITERATE
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 0
L2 0 SEA SSS SAM L1
=> c h o/elf(p)10-20/c(p)1/o and nc=1 and no rsd/fa
L3 22012 C H O/ELF(P)10-20/C(P)1/O AND . . . .
=> l1 full subset=l3
L4 2003 SEA SUB=L3 SSS FUL L1
186
=> l4 and 760 torr/bp.p
12243468 760 TORR/BP.P
L5 1694 L4 AND 760 TORR/BP.P
=> l5 and stereosearch/fs
5211009 STEREOSEARCH/FS
L6 529 L5 AND STEREOSEARCH/FS
=> d qrd
L6 ANSWER 1 OF 529 REGISTRY COPYRIGHT 2003 ACS on STN
. . . . . . . . . .
[Structure diagram; labels: (CH2)8, Et, Me, OH, stereo descriptor R]
Calculated Properties
Boiling Point (BP)
CODE | VALUE          | CONDITION         | NOTE
=====+================+===================+========
BP   | 518.65+/-8.0 K | Press: 760.0 Torr | (1) ACD
187
Locate references discussing the photographic applications of compounds containing the following structural fragment.
The heterocyclic ring contains at least 2 N atoms, and at least 6 atoms in total. Bonds in the fragment are all chain bonds.
[Query fragment diagram; node labels: N, Hy, O]
System Limits
188
• Enter structure-searchable file, upload query
• Run SAMPLE search, evaluate answers
• Define a subset
• Create the subset
• Run a SAMPLE structure search and evaluate results
• Run a FULL file structure search
• Refine results
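The decision point in this workflow (run FULL, or define a subset first) can be sketched as follows. The function name `next_step` is hypothetical, and the 1,000,000-iteration online limit is an assumption taken from the "EXCEEDS 1000000" messages in the sample-search transcripts, not a documented constant.

```python
from typing import Optional

# Assumption: online limit inferred from "PROJECTED ITERATIONS: EXCEEDS 1000000".
ONLINE_ITERATION_LIMIT = 1_000_000


def next_step(projected_max: Optional[int], projections_complete: bool) -> str:
    """Suggest the next action after a SAMPLE search (hypothetical helper).

    projected_max is the upper bound of PROJECTED ITERATIONS, or None when
    the system only reports that the limit is exceeded.
    """
    if (projections_complete
            and projected_max is not None
            and projected_max <= ONLINE_ITERATION_LIMIT):
        return "run FULL search"
    return "define a subset, then re-run SAMPLE"


print(next_step(None, False))    # projections incomplete, so a subset is needed
print(next_step(348750, True))   # projections complete and within the limit
```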
System Limits
189
=> FILE REGISTRY
=> Uploading subset.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Query structure diagram; node labels: N, Hy, O]
• Hy with an element count of nitrogen, minimum 2, is used
System Limits
190
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 21:49:03 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 123620 TO ITERATE
0.8% PROCESSED 1000 ITERATIONS 9 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 20251
L2 9 SEA SSS SAM L1
System Limits
191
=> ....Testing the current file.... screen
ENTER SCREEN EXPRESSION OR (END):end
=> SCREEN 1994 AND 2004 AND 1838
L3 SCREEN CREATED
=> SCREEN 2043
L4 SCREEN CREATED
• 3 or more N
• 1 or more O
• 1 or more rings
• Polymers (2043) are excluded
System Limits
192
=> Uploading C:Files.str
L5 STRUCTURE UPLOADED
=> QUE L5 AND L3 NOT L4
L6 QUE L5 AND L3 NOT L4
=> S L6 SSS SAM
SAMPLE SEARCH INITIATED 21:50:01 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 73719 TO ITERATE
1.4% PROCESSED 1000 ITERATIONS 23 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
L7 23 SEA SSS SAM L5 AND L3 NOT L4
System Limits
193
=> S >=2 N/REL (S) RATC>=6
15045119 REL.CNT >= 2
7536633 N/REL
3983611 >=2 N/REL (REL.CNT >= 2 (T) N/REL)
14208100 RATC>=6
L8 3014063 >=2 N/REL (S) RATC>=6
• 2 or more nitrogen ring elements
• 6 or more ring atoms
System Limits
194
=> S L6 SUB=L8 SSS SAM
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 333170 TO 348750
PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 8872 TO 11584
L9 30 SEA SUB=L8 SSS SAM L5 AND L3 NOT L4
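When sample-search output is captured to a file, the projection lines can be pulled out with a small parser. The sketch below is an assumption-laden illustration: `parse_projection` is a hypothetical helper, and the message layout is taken only from the transcripts in these slides.

```python
import re
from typing import Optional, Tuple


def parse_projection(text: str, label: str) -> Optional[Tuple[int, int]]:
    """Return (low, high) from a 'PROJECTED <label> ...: low TO high' line.

    Returns None when no numeric range is given, for example when the
    system reports only 'EXCEEDS 1000000'.
    """
    match = re.search(rf"PROJECTED {label}[^:]*:\s*(\d+)\s+TO\s+(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 333170 TO 348750\n"
    "PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 8872 TO 11584"
)
print(parse_projection(sample, "ITERATIONS"))
print(parse_projection(sample, "ANSWERS"))
```

A `None` result is the signal that the search needs to be narrowed before a FULL search is attempted.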
System Limits
195
=> S L6 SUB=L8 SSS FULL
L10 18836 SEA SUB=L8 SSS FUL L5 AND L3 NOT L4
[Structure diagram of a sample answer; fragment labels include Et2N, F, Cl, NH2, N, CH2, C, NH, O, S]
System Limits
196
=> FILE CAPLUS
=> S L10 AND PHOTOG?
L11 118 L10 AND PHOTOG?
=> D SCAN
L11 118 ANSWERS CAPLUS COPYRIGHT 2001 ACS
IC ICM C09B-023/00
ICS C07D-261/12; C07D-277/30; C07D-403/14; C07D-413/14; C07D-417/14; G03C-001/14
CC 41-6 (Dyes, Organic Pigments, Fluorescent Brighteners, and Photographic Sensitizers)
Section cross-reference(s): 74
TI Methine compounds for spectral sensitizers and silver halide photographic materials using the same
o o o
System Limits
197
Locate substances containing indole that have been patented for use as fabric dyes.
System Limits
198
=> FILE REGISTRY
=> Uploading C:\Program Files\Stnexp\Queries\indole.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Query structure diagram: indole; node label: N]
• Unlikely that this will run within system limits
System Limits
199
=> S L1 SSS SAM
SAMPLE SEARCH INITIATED 13:50:22 FILE 'REGISTRY'
SAMPLE SCREEN SEARCH COMPLETED - 74242 TO ITERATE
1.3% PROCESSED 1000 ITERATIONS 50 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.02
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: EXCEEDS 1000000
PROJECTED ANSWERS: EXCEEDS 407163
L2 50 SEA SSS SAM L1
System Limits
200
=> FILE CAPLUS
=> S DYE# (L) FABRIC#
239878 DYE#
96285 FABRIC#
L3 17681 DYE# (L) FABRIC#
=> S L3 AND PATENT/DT
3189643 PATENT/DT
L4 11925 L3 AND PATENT/DT
System Limits
201
=> FILE REGISTRY
=> TRANSFER
ENTER L# (L4) OR ?:L4
ENTER ANSWER NUMBERS, RANGES (1-), OR ?:1-
ENTER DISPLAY FIELDS (TI) OR ?:RN
L5 TRANSFER L4 1- RN : 30879 TERMS
L6 30670 L5
System Limits
202
=> S L1 SUB=L6 SSS SAM
SAMPLE SUBSET SEARCH INITIATED 14:00:21 FILE 'REGISTRY'
SAMPLE SUBSET SCREEN SEARCH COMPLETED - 54 TO ITERATE
100.0% PROCESSED 54 ITERATIONS 8 ANSWERS
SEARCH TIME: 00.00.01
PROJECTIONS (WITHIN SPECIFIED SUBSET): ONLINE **COMPLETE**
PROJECTED ITERATIONS (WITHIN SPECIFIED SUBSET): 640 TO 1520
PROJECTED ANSWERS (WITHIN SPECIFIED SUBSET): 8 TO 329
L7 8 SEA SUB=L6 SSS SAM L1
=> D SCAN
o o o
System Limits
203
=> S L1 SUB=L6 SSS FULL
FULL SUBSET SEARCH INITIATED 14:00:40 FILE 'REGISTRY'
FULL SUBSET SCREEN SEARCH COMPLETED - 998 TO ITERATE
100.0% PROCESSED 998 ITERATIONS 110 ANSWERS
SEARCH TIME: 00.00.01
L8 110 SEA SUB=L6 SSS FUL L1
System Limits
204
=> FILE CAPLUS
=> S L8 (L) DYE# AND L4
32219 L8
239878 DYE#
301 L8 (L) DYE#
L9 34 L8 (L) DYE# AND L4
=> D BIB HITSTR
34 patent records discuss indole-containing substances used as fabric dyes
System Limits
205
o o o
IT 117584-16-4
RL: PRP (Properties); TEM (Technical or engineered material use); USES (Uses)
(dye; ink-jet printing inks containing disperse dyes for printing fabrics with high colorfastness and high black color yield)
RN 117584-16-4 CAPLUS
CN 1H-Indole, 3-[(2,6-dichloro-4-nitrophenyl)azo]-1-methyl-2-phenyl- (9CI) (CA INDEX NAME)
[Structure diagram; labels: Me, Ph, NO2, Cl, Cl, N, N, N]
System Limits
207
=> fil reg
L1 STRUCTURE UPLOADED
=> d
L1 HAS NO ANSWERS
L1 STR
Structure attributes must be viewed using STN Express query preparation.
[Query structure diagram; labels: Me, CO2H, Me, 1-2]
208
=> l1
SAMPLE SEARCH INITIATED 06:58:14
SAMPLE SCREEN SEARCH COMPLETED - 34814 TO ITERATE
2.9% PROCESSED 1000 ITERATIONS 0 ANSWERS
INCOMPLETE SEARCH (SYSTEM LIMIT EXCEEDED)
SEARCH TIME: 00.00.01
FULL FILE PROJECTIONS: ONLINE **INCOMPLETE** BATCH **INCOMPLETE**
PROJECTED ITERATIONS: 685172 TO 707388
PROJECTED ANSWERS: 0 TO 0
L2 0 SEA SSS SAM L1
209
“Filters” to Reduce Iterations
Filters may be added during the save process
To add filters:
• Check “Refine Using Structure Filters”
• Click SAVE
210
STN Express often suggests filters
• May add others using AND or NOT logic
“Filters” to Reduce Iterations
213
=> nc=1 and no rsd/fa and 5-8/c and c h o/elf
40901379 NC=1
27196220 NO RSD/FA
1761030 5-8/C
2839176 C H O/ELF
L3 42352 NC=1 AND NO RSD/FA AND 5-8/C AND C H O/ELF
=> l1 full subset=l3
FULL SUBSET SEARCH INITIATED 07:01:14
FULL SUBSET SCREEN SEARCH COMPLETED - 4319 TO ITERATE
100.0% PROCESSED 4319 ITERATIONS 70 ANSWERS
SEARCH TIME: 00.00.01
L4 70 SEA SUB=L3 SSS FUL L1
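A dictionary subset expression like the one above is just a set of field terms joined with AND, which can be assembled programmatically. The helpers below (`atom_count`, `subset_query`) are hypothetical names for illustration; the term syntax itself is copied from the field-code tables and the transcript.

```python
from typing import Optional


def atom_count(element: str, low: int, high: Optional[int] = None) -> str:
    """Format an atom-count term such as 3/Cl or 5-8/C (hypothetical helper)."""
    if high is None:
        return f"{low}/{element}"
    return f"{low}-{high}/{element}"


def subset_query(*terms: str) -> str:
    """Join dictionary terms with AND to form a subset expression (hypothetical helper)."""
    return " AND ".join(terms)


query = subset_query("NC=1", "NO RSD/FA", atom_count("C", 5, 8), "C H O/ELF")
print(query)
```

The resulting L-number would then be passed to the structure search with `SUBSET=`, as in `l1 full subset=l3`.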
Subset Searching using Dictionary
216
Structure Search Matches
• An answer to a structure search must contain all of the following from the query:
– Atoms
– Bonds
– Connections
• Relevant answers that may not be retrieved include:
– Incompletely defined compounds
– Compounds with unanticipated bonding patterns
217
• Structure searches return precise results
• Try MF (molecular formula) searches when structure searches turn up no hits, to look for:
– Incompletely defined compounds
– Different bonding patterns
Troubleshooting Tips
218
Incompletely Defined Compounds
RN 26249-12-7 REGISTRY
CN Benzene, dibromo- (8CI, 9CI) (CA INDEX NAME)
OTHER NAMES:
CN Dibromobenzene
MF C6 H4 Br2
CI IDS, COM
[Structure diagram; labels: Br, D1, ( )2]
219
• Unknown point of attachment for one or more known substituents
• Unknown site of saturation/unsaturation for one or more bonds
• Unknown branching in specific carbon-chain substituents
• Unknown site of esterification/etherification in polyacids or polyols
Incompletely Defined Compounds
220
Queries:
IDS not retrieved:
[Structure diagram; labels: Br, D1, ( )2]
Incompletely Defined Compounds
222
=> FILE REGISTRY
=> Uploading ids.str
L1 STRUCTURE UPLOADED
=> D L1
L1 HAS NO ANSWERS
L1 STR
[Query structure diagram; fragment labels include Cl, NH2, NH, N, O, S, CH2CHMePh, CH2CH2OSO3H, SO3H]
Incompletely Defined Compounds
223
=> S L1 FAM SAM
L2 0...
=> S L1 FAM FULL
L3 0...
Incompletely Defined Compounds
224
To locate a CAS RN for an IDS (incompletely defined substance):
Conduct an MF search and refine by name fragments if needed
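An MF search needs the formula typed in the order the file expects. The formulas in these slides (for example C32H25Cl3N8O14S4 and C4H5IN2) follow Hill order: carbon first, hydrogen second, then the remaining elements alphabetically. A minimal sketch for producing that string from element counts is shown below; `hill_formula` is a hypothetical helper, and it is an assumption that Hill order is what the target file requires in every case.

```python
from typing import Dict


def hill_formula(counts: Dict[str, int]) -> str:
    """Return a molecular formula string in Hill order (hypothetical helper)."""
    if "C" in counts:
        # Carbon first, hydrogen second, everything else alphabetically.
        rest = sorted(e for e in counts if e not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # Without carbon, all elements (including H) are alphabetical.
        order = sorted(counts)
    return "".join(e + (str(counts[e]) if counts[e] != 1 else "") for e in order)


print(hill_formula({"N": 8, "S": 4, "Cl": 3, "O": 14, "H": 25, "C": 32}))
print(hill_formula({"I": 1, "C": 4, "N": 2, "H": 5}))
```

The output can be used directly with EXPAND, as in `E C32H25Cl3N8O14S4`.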
Incompletely Defined Compounds
225
=> FILE REGISTRY
=> E C32H25Cl3N8O14S4
E1 1 C32H25CL3N8O/BI
E2 1 C32H25CL3N8O13S4/BI
E3 3 --> C32H25CL3N8O14S4/BI
ooo
=> S E3
L1 3...
Incompletely Defined Compounds
226
[Structure diagram of an incompletely defined substance; fragment labels include OSO3H, SO3H, Me, Ph, Cl, H2N, NH, N, O, S, CH2, CH and an unattached D1 group]
=> D SCAN
=> S L1 AND 2(W)2
L2 1
=> D L2 1 RN STR REF
RN 163499-22-7 REGISTRY
227
• Alternating double and single bonds
– Tautomers
– Aromatic rings
• Many changed to “normalized” in REGISTRY
• STN Express changes query structures to allow for exact value or normalized
Bonding Patterns
228
• Exceptions to “Normalized” assignment in REGISTRY
Enol-keto tautomers
Pyrazole and pyrazolium rings
1,2,4-Dithiazolium rings
1,2- and 1,3-dithiolium rings
Tropolone derivatives
Porphines
Phorbines
Phthalocyanines
Cyanine dyes
Bonding Patterns
229
CA Index Name: 1H-Pyrazole, 3-chloro-5-methyl-
Author structure:
REGISTRY structure:
Bonding Patterns
230
Locate references discussing the preparation of this pyrazole derivative
Bonding Patterns
231
=> Uploading pyrazole.str
L1 STRUCTURE UPLOADED
=> D L1
=> S L1 FAM SAM
L2 0…
=> S L1 FAM FULL
L3 0...
Bonding Patterns
232
Technique 1: Conduct an MF search and refine with name fragments
Technique 2: Build a structure replacing single/double bonds with unspecified bonds and run an EXA or FAM search
Bonding Patterns
233
=> FILE REGISTRY
=> E C4H5IN2
E1 2 C4H5IMG/BI
E2 1 C4H5IMGNO4/BI
E3 15 --> C4H5IN2/BI
E4 8 C4H5IN2O/BI
E5 4 C4H5IN2O2/BI
ooo
=> S E3
L1 15…
Bonding Patterns
234
=> S L1 AND PYRAZ?
L2 7…
=> D SCAN
IN 1H-Pyrazole, 3-iodo-5-methyl- (9CI)
MF C4 H5 I N2
[Structure diagram; labels: Me, I, NH, N]
Compare bonding to query structure
Bonding Patterns
235
=> S 1H-PYRAZOLE, 3-IODO-5-METHYL-/CN
L3 1 ...
=> FILE CAPLUS
=> S L3/PREP
L4 1…
Bonding Patterns
236
=> Uploading pyrazole.str
L1 STRUCTURE UPLOADED
=> D L1
=> S L1 FAM FULL
L3 1...
[Structure diagram; labels: N, Me, N, I]
Bonding Patterns
239
Structure queries with unattached fragments:
Same Structure Query (L1)
Different Structure Queries (L1 AND L2)
Many Structures Searching
240
Same Structure Query (L1)
Fragments in the same structure component
No overlap of fragment atoms
Many Structures Searching
241
Different Structure Queries (L1 AND L2)
Fragments in the same structure component with no overlap
Fragments in the same structure component with overlap
Fragments in different components
Many Structures Searching
246
In the synthesis lab, a substance produced by bacteria has been isolated. The analytical department has supplied the following information:
FW = 1180-2000
The following fragments are inside the compound, in ring or chain:
[Fragment diagrams; atom labels: Me, O, N, H]
Skills Practice
247
chain nodes :1 4 5 6 7
ring/chain nodes :2 3
chain bonds :1-2 2-5 3-4 3-6 4-7
ring/chain bonds :2-3
exact/norm bonds :2-3 3-4
exact bonds :1-2 2-5 3-6 4-7
248
chain nodes :3 4
ring/chain nodes :1 2
chain bonds :1-3 2-4
ring/chain bonds :1-2
exact/norm bonds :1-2
exact bonds :1-3 2-4
249
chain nodes :1 6 7
ring/chain nodes :2 3 4 5
chain bonds :1-2 3-6 5-7
ring/chain bonds :2-3 3-4 4-5
exact/norm bonds :2-3 3-4 3-6 4-5
exact bonds :1-2 5-7
250
chain nodes :3 4 5 6 7 8 9
ring/chain nodes :1 2
chain bonds :1-4 1-5 2-3 2-6 2-7 3-8 3-9
ring/chain bonds :1-2
exact/norm bonds :1-2 2-3
exact bonds :1-4 1-5 2-6 2-7 3-8 3-9
251
=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\fragments.str
L1 STRUCTURE UPLOADED
=> d
L1 HAS NO ANSWERS
L1 STR
* STRUCTURE DIAGRAM TOO LARGE FOR DISPLAY - AVAILABLE VIA OFFLINE PRINT *
Structure attributes must be viewed using STN Express query preparation.
252
=> 1180-2000/fw
L2 457267 1180-2000/FW
=> l1 full subset=l2
FULL SUBSET SEARCH INITIATED 06:14:40
FULL SUBSET SCREEN SEARCH COMPLETED - 100272 TO ITERATE
100.0% PROCESSED 100272 ITERATIONS 7 ANSWERS
SEARCH TIME: 00.00.02
L3 7 SEA SUB=L2 SSS FUL L1
253
=> d 1-7
L3 ANSWER 2 OF 7 REGISTRY COPYRIGHT 2002 ACS
RN 184490-65-1 REGISTRY
CN .....
OTHER NAMES:
CN Desertomycin I
FS STEREOSEARCH
MF C61 H109 N O21
SR CA
LC STN Files: CA, CAPLUS
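As a sanity check on the 1180-2000/FW subset, the formula weight of this answer can be recomputed from its MF field. The sketch below is an illustration only: `formula_weight` is a hypothetical helper, the atomic weights are rounded standard values, and only the four elements needed for this record are included.

```python
import re

# Rounded standard atomic weights; extend the table for other elements.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def formula_weight(mf: str) -> float:
    """Compute a formula weight from a REGISTRY-style MF such as 'C61 H109 N O21'."""
    total = 0.0
    for symbol, count in re.findall(r"([A-Z][a-z]?)(\d*)", mf):
        # A missing count means one atom of that element.
        total += ATOMIC_WEIGHT[symbol] * (int(count) if count else 1)
    return total


fw = formula_weight("C61 H109 N O21")
print(round(fw, 2), 1180 <= fw <= 2000)
```

The computed value of roughly 1192.5 falls inside the 1180-2000 range used to define the subset.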
254
[Structure diagram displayed over pages 1-A, 1-B, 1-C and 2-A; labels include Me, OH, HO, O, (CH2)3, NH2 and stereo descriptors R and S]
256
=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\substitution.str
chain nodes :1 2 3 4 5 6 7
chain bonds :1-2 2-3 2-4 3-6 3-7 4-5
exact/norm bonds :2-3 2-4 3-7
exact bonds :1-2 3-6 4-5
L1 STRUCTURE UPLOADED
257
=> c h n/elf and no rsd/fa and nc=1 and c<=10
L2 15770 C H N/ELF AND NO RSD/FA AND NC=1 AND C<=10
=> l1 full subset=l2
L3 61 SEA SUB=L2 SSS FUL L1
=> d scan
L3 61 ANSWERS REGISTRY COPYRIGHT 2003 ACS
IN Ethanimidic acid, N-butyl-, 2,2-dimethylhydrazide (9CI)
MF C8 H19 N3
[Structure diagram; labels: n-Bu, Me, Me2N, N, C, NH]
258
=> fil reg
=> Uploading C:\Program Files\stnexp\Queries\substitution.str
chain nodes :1 2 3 4 5 6 7
chain bonds :1-2 2-3 2-4 3-6 3-7 4-5
exact/norm bonds :2-3 2-4 3-7
exact bonds :1-2 3-6 4-5
Connectivity :4:1 E exact RC ring/chain
L1 STRUCTURE UPLOADED
259
=> c h n/elf and no rsd/fa and nc=1 and c<=10
L2 15770 C H N/ELF AND NO RSD/FA AND NC=1 AND C<=10
=> l1 full subset=l2
L3 25 SEA SUB=L2 SSS FUL L1
=> d scan
L3 25 ANSWERS REGISTRY COPYRIGHT 2003 ACS
IN Ethanimidamide, N-cyano- (9CI)
MF C3 H5 N3
CI COM
[Structure diagram; labels: Me, NC, NH, C, NH]
261
SMARTracker
=> FILE REGISTRY
=> STR 33069-62-4
:END
L1 STRUCTURE CREATED
=> S L1 FUL
FULL SEARCH INITIATED 15:42:29
FULL SCREEN SEARCH COMPLETED - 2420 TO ITERATE
100.0% PROCESSED 2420 ITERATIONS 877 ANSWERS
SEARCH TIME: 00.00.06
L2 877 SEA SSS FUL L1
262
SMARTracker
=> S L2/THU AND P/DT
2047 L2
146546 THU/RL
670 L2/THU (L2 (L) THU/RL)
2188789 P/DT
L3 115 L2/THU AND P/DT
263
SMARTracker
=> SMART
SMARTracker INITIATED
ENTER QUERY L# FOR SDI REQUEST OR (END):L3
ENTER UPDATE FIELD CODE (UP) OR ?:.
ENTER SDI REQUEST NAME, (AA013/S), OR END:TAXOLS/S
ENTER COST CENTER (NONE) OR NONE:.
ENTER TYPE OF SEARCH (SSS), CSS, FAMILY, OR EXACT:.
ENTER TITLE (NONE):.
ENTER METHOD OF DELIVERY (OFFLINE), ONLINE, EMAIL, OR FAX:EMAIL
ENTER EMAIL ID (1190C):[email protected]
RECEIVE DELIVERY NOTIFICATION? (Y)/N:N
ELIMINATE PREVIOUSLY SEEN ANSWERS WITH EACH SDI RUN? Y/(N):Y
ENTER PRINT FORMAT (BIB) OR ?:CBIB ABS HITSTR
HIGHLIGHT HIT TERMS? (Y)/N:.
ENTER MAXIMUM NUMBER OF HITS TO BE PRINTED PER RUN (100):.
SORT SDI ANSWER SET (N)/Y?:N
SEND SDI WITH NO ANSWERS? (Y)/N:N
ENTER SDI RUN FREQUENCY: (WEEKLY), BIWEEKLY, OR ?:.
ENTER SDI EXPIRATION DATE 'YYYYMMDD' OR (NONE):.
QUERY 'L3' HAS BEEN SAVED AS SDI REQUEST 'TAXOLS/S'
Or => SDI XFILE