High Throughput Experimentation: Computational Requirements
John M. Newsam
Molecular Simulations Inc.
(A Pharmacopeia subsidiary)
“Workshop on Combinatorial Methods for Materials Discovery”
ATP Fall National Meeting, Atlanta, GA
Wednesday November 18th 1998
Potential Hindrances?
• Patent profusion – vigilance
• Unmet expectations – set reasonably
• Infrastructure cost – hindrance for academics
• Lack of standards – premature for hardware
• Inertia – resistance to change, short-term delivery focus
High-throughput Experimentation
• Library Design: QSAR#
• Synthesis: pooled, parallel or discrete
• Processing: physical, mechanical etc. processing
• Analytical: characterization of composition, purity, phases, structure
• Primary Testing: performance in specific application
• Lead compounds for resynthesis and secondary testing
• Testing requirements drive synthesis format
# Quantitative Structure-Activity Relationships
Infrastructure Needs
• Vertical and horizontal integration
• Adaptable
• Modular
• Geared for huge throughput
• Broadly deployable
New 1536 well HTS Format
• 1536 wells, 2 µl well volume
• Corning Science Products joint design
• Automated 96 → 1536 reformatter
• µl-level fluids dispensing
• Oxidative and evaporative loss reduced
Engineering Solution
Process and Data Management
Data Base Engines
Oracle
Materials Algorithms
Display
Statistics
User Input & Workstation Interfaces
Chemistry & Materials Input
Workstation & Oracle Forms
Materials Specific Tables
Analysis, Display and Data Access
Server-based Processing
Molecular Simulation
Luminescence data for a library of mixed metal oxides under 254 nm UV irradiation
Data from E. Danielson et al., Science 279 (1998) 831
Some Specific Technology Needs
• Hits vs misses; improvement criteria
• Descriptors
• Experiment decision support
• Abstracted feature models (AFMs)
• Process optimization
• Simulation for scale-up
• Sensor data (unravelling response of arrays)
Making it practical: computation
• Which experiments should be done?
– 100 R1, 100 R2, 100 R3, 100 R4 → 10^8 compounds
– 50,000 compounds/week → 40 years
• How do we manage the process?
• What knowledge do the experiments yield?
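The 40-year figure follows from simple arithmetic on the slide's own numbers. The sketch below reproduces it; the 52-week year and the variable names are assumptions made for illustration.

```python
from math import prod

# Figures taken from the slide: four substituent positions (R1..R4) with
# 100 candidates each, screened at 50,000 compounds per week.
substituents_per_site = [100, 100, 100, 100]
compounds_per_week = 50_000
weeks_per_year = 52  # assumption: continuous operation, no downtime

library_size = prod(substituents_per_site)        # full enumeration of the library
weeks_needed = library_size / compounds_per_week  # weeks to make and test everything
years_needed = weeks_needed / weeks_per_year      # roughly four decades

print(f"{library_size:,} compounds, {weeks_needed:,.0f} weeks, {years_needed:.1f} years")
```

Exhaustive synthesis is therefore impractical even at high throughput, which is why the choice of experiments has to be guided computationally.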
Computation Solution
‘Hard materials’: M1, M2, X, + Temp
‘Soft materials’: Scaffold with R1, R2, R3, R4
Compound library design
• Library Specification
– Molecular: Product or Reaction-based
– Polymers, Heterogeneous catalysts?
• Library Design
– Diversity and similarity metrics
– Similarity Selection
– Array and mixture design
• Library Comparison
• Library Focussing
– Active site model (atomic or abstracted)
– QSAR Model
C2.Diversity, C2.LibCompare, C2.LibSelect
World Drug Index of 35,873 compounds in a space of principal components
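One common way to combine a similarity metric with diversity-based selection is greedy MaxMin picking on Tanimoto distances. The sketch below is a generic illustration on toy feature sets; it is not the algorithm used by C2.Diversity or C2.LibSelect, and the function names and data are hypothetical.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def maxmin_select(fingerprints, n_pick, seed_index=0):
    """Greedy MaxMin: repeatedly add the candidate whose closest
    already-selected compound is farthest away (distance = 1 - Tanimoto)."""
    n_pick = min(n_pick, len(fingerprints))
    selected = [seed_index]
    # min_dist[i]: distance from compound i to its nearest selected compound
    min_dist = [1.0 - tanimoto(fp, fingerprints[seed_index]) for fp in fingerprints]
    while len(selected) < n_pick:
        candidates = [i for i in range(len(fingerprints)) if i not in selected]
        next_index = max(candidates, key=lambda i: min_dist[i])
        selected.append(next_index)
        for i, fp in enumerate(fingerprints):
            d = 1.0 - tanimoto(fp, fingerprints[next_index])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Toy library: each compound is a set of (hypothetical) feature identifiers.
library = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
    frozenset({5, 6}),
]
picked = maxmin_select(library, n_pick=3)
print(picked)
```

Starting from compound 0, the procedure skips its near neighbours (compounds 1 and 3) and picks the two compounds that share no features with it.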
Abstracted Feature Models
R.C.Willson
• Abstraction of key features
• Based on activity data
• Interesting ‘active’ definition
Descriptors
Descriptors - calculable molecular attributes that govern particular macroscopic properties
Descriptor Families
• Topological
• Fragments
• Receptor surface
• Structural
• Information-content
• Spatial
• Electronic
• Thermodynamic
• Conformational
• Quantum mechanical
Plus Molecular and Quantum Methods
Products: C2.Descriptor+, C2.MFA, C2.QSAR+, C2.Synthia
Available, occupiable volume & framework density descriptors
(104 zeolite and zeolite-related framework types)
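Framework density is conventionally expressed as the number of tetrahedral (T) atoms per 1000 Å³. A minimal sketch of that descriptor follows; the cell in the example is hypothetical and does not correspond to any of the 104 framework types.

```python
def framework_density(n_t_atoms: int, cell_volume_A3: float) -> float:
    """Framework density: tetrahedral (T) atoms per 1000 cubic angstroms."""
    return 1000.0 * n_t_atoms / cell_volume_A3


# Hypothetical cubic unit cell: edge 24.0 angstroms, 192 T atoms (illustrative only).
cell_edge = 24.0
fd = framework_density(n_t_atoms=192, cell_volume_A3=cell_edge ** 3)
print(f"FD = {fd:.1f} T atoms per 1000 cubic angstroms")
```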
“Correlative methods in catalyst design: Expert systems, neural networks and structure-activity relationships”, in “Advances in Catalyst Design”, Catalyst Advance Program (CAP) Report, The Catalyst Group, PA; in press (1998)
Structure-Activity Relationships
Descriptors → Correlative Methods → Properties
Statistical Models
• Linear regression
• Stepwise & multiple linear regression
• Principal components analysis
• Partial least squares
• Genetic algorithm
• Genetic function approximation
Products: C2.QSAR+, C2.GA
E.g. K.F. Moschner and A. Cece, “Development of a General QSAR for Predicting Octanol-Water Partition Coefficients and its Application to Surfactants,” ASTM STP 1218 (1995); MSI C2 QSAR manual April 1997.
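The simplest of the correlative methods listed, linear regression of a property on a single descriptor, can be sketched in a few lines. The data below are invented for illustration and are not taken from the cited QSAR work; the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept.
    Returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    r_squared = (s_xy ** 2) / (s_xx * s_yy)  # squared correlation coefficient
    return slope, intercept, r_squared


# Invented example: one descriptor value and one measured property per compound.
descriptor = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
activity = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]

slope, intercept, r_squared = fit_line(descriptor, activity)
print(f"activity = {slope:.3f} * descriptor + {intercept:.3f} (R^2 = {r_squared:.3f})")
```

Multiple linear regression, partial least squares and genetic function approximation extend the same idea to many descriptors and to automated descriptor selection.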
[Plot: GFA Prediction vs. Activity, both axes 0 to 12]
Oil Field Corrosion Inhibitors
Organics
H. Gräfen et al., Werkstoffe und Korrosion, Vol. 36, 407 (1985)
• Benzimidazolines function at cathodic sites
• Library studied by Kuron et al. (1985)
• Key descriptors
– Terminal N charge
– 3-substituted N charge
– Octanol-water logP
– Moment of inertia
M. Doyle
Conclusion
• Computational infrastructure needs
• Specific technology needs
• Role of computation
– process management system
– experiment decision support
– data visualization and analysis
– knowledge from the experimental data
• Integration