Sung Kyu (Andrew) Maeng
Contents
- QSAR Introduction
- QSBR Introduction
- Results and discussion
- Current QSAR project at UNESCO-IHE
Introduction to the (Q)SAR concept
- Chemicals with similar molecular structures have similar effects in physical and biological systems → qualitative model (SAR)
- The extent of an effect varies in a systematic way with variations in molecular structure → quantitative model (QSAR)
- Activity depends on chemical structure
Biodegradation index = 4.066 - 0.007 MW - 0.314 H/C
(r = 0.866, r2 = 0.750, Sig. < 0.005, n = 156)
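As a worked illustration, the regression above can be evaluated directly. This is a minimal sketch that assumes MW is the molecular weight and H/C is the hydrogen-to-carbon ratio of the compound; the function name and the example compound are hypothetical and do not come from the slides.

```python
def biodegradation_index(mw: float, h_to_c: float) -> float:
    """Evaluate the regression from the slide: 4.066 - 0.007*MW - 0.314*(H/C)."""
    return 4.066 - 0.007 * mw - 0.314 * h_to_c


# Hypothetical compound (assumed values): MW = 200, H/C = 1.5
index = biodegradation_index(200.0, 1.5)
print(round(index, 3))

# Consistency check on the reported statistics: r = 0.866 gives r2 of about 0.750
print(round(0.866 ** 2, 3))
```

The second print only confirms that the reported r and r2 agree with each other; it says nothing about how well the model predicts compounds outside the 156 used to fit it.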
SAR vs. QSAR
- SAR is based on the "similarity" principle (similarity of structures, similarity of descriptors). The principle is assumed, but in reality it does not always hold.
- The validity of a model depends on the type of relationship between the descriptors (numerical representation of chemicals) and activity.
- The type of relationship should be known (or derived).
SAR vs. QSAR: how can we say there is a difference?
Three things are common to both methods:
- Both use a numerical representation of chemical compounds.
- Both need to decide which representation to use.
- Both need to derive the relationship between the numerical representation (descriptors, etc.) and activity.
QSAR in water treatment processes
Results obtained from valid qualitative or quantitative structure-activity relationship models can be used to estimate the removal of PhACs in drinking water treatment and to support process selection for target compounds. QSAR results may be used instead of testing if they are derived from a model whose scientific validity has been established.
In principle, QSARs can be used to:
- provide information for use in setting treatment priorities for target compounds
- guide the experimental design of a test or testing strategy
- improve the evaluation of existing test data
- provide mechanistic information (e.g. to support the grouping of chemicals into categories)
- fill a data gap needed for classification
OECD Principles for QSAR Validation
A QSAR should be associated with the following information:
- a defined endpoint
- an unambiguous algorithm
- appropriate measures of goodness-of-fit, robustness and predictivity
- a mechanistic interpretation, if possible
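The goodness-of-fit and robustness measures named above can be sketched as follows. This is an illustration only, assuming an ordinary least-squares model: R2 is computed on the fitted data and Q2 by leave-one-out cross-validation. The data are synthetic and the function names are hypothetical. Predictivity would additionally need an external test set, which is not shown.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients for y ~ intercept + X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def r_squared(y, y_hat):
    """1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_leave_one_out(X, y):
    """Cross-validated Q2: each compound is predicted by a model fitted without it."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_ols(X[keep], y[keep])
        preds[i] = predict(coef, X[i:i + 1])[0]
    return r_squared(y, preds)


# Synthetic data: 40 hypothetical compounds, 2 descriptors (not from the slides)
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = 3.0 - 0.5 * X[:, 0] + 0.2 * X[:, 1] + rng.normal(scale=0.1, size=40)

r2 = r_squared(y, predict(fit_ols(X, y), X))  # goodness-of-fit
q2 = q2_leave_one_out(X, y)                   # robustness
print(f"R2 = {r2:.3f}, Q2 (leave-one-out) = {q2:.3f}")
```

A Q2 much lower than R2 would suggest that the model depends heavily on individual compounds.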
Development of Quantitative Structure-Biodegradation Relationships (QSBRs)
- QSBRs have been developed to predict the biodegradability of chemicals released to natural systems using their structure-activity relationships (SAR)
- The development of QSBRs has been relatively slow compared with the proliferation of QSARs because of the nature of the biodegradability endpoint
- QSBR modeling is very complex because biodegradation depends on:
1. Chemical structure
2. Environmental conditions
3. Bioavailability of the chemical
QSBR
- Limitations often encountered in developing QSBRs:
1. Models are valid only within a congeneric series of chemicals
2. The absence of standardised and uniform biodegradation databases
- In recent years, new and better qualitative and quantitative biodegradability models have been developed intensively
- How many QSBRs have been developed? A search of the published literature on QSBRs found more than 84 models
- However, only a few models provided an acceptable level of agreement between estimated and experimental data
QSBR
- All QSBR models published until 1994 have been reviewed by several researchers for their applicability
1. Group contribution methods (OECD, PLS, BIOWIN, MultiCASE)
2. Chemometric methods (CART)
3. Expert systems (BESS, CATABOL, TOPKAT)
- According to previous studies, the group contribution method seems to be the most widely applied and successful way of modeling biodegradation
QSBR
Group Contribution Method
- OECD hierarchical model approach
- Multivariate Partial Least Squares (PLS) model
- BIOWIN
- MultiCASE anaerobic program
BIOWIN models:
- Provide estimates of biodegradability useful in chemical screening under aerobic conditions (BIOWIN1, 2, 5, 6)
- Provide the approximate time required to biodegrade in a stream (BIOWIN3, 4)
- BIOWIN was recently updated and can now estimate anaerobic biodegradation potential (BIOWIN7)
BIOWIN has 7 models (U.S. EPA, 2007)
- BIOWIN1 (linear) and BIOWIN2 (non-linear): based on regressions of experimental biodegradation data for 295 compounds (BIODEG) against 36 preselected chemical substructures plus molecular weight
- BIOWIN3 (ultimate) and BIOWIN4 (primary): based on regressions of biodegradability estimates from a survey of experts for a suite of 200 organic chemicals against the same chemical substructures plus molecular weight
- BIOWIN5 (linear) and BIOWIN6 (non-linear): based on regressions of data from the Japanese MITI database against a modified set of chemical substructures plus molecular weight
- BIOWIN7: based on the BIOWIN fragment contribution approach
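To make the fragment contribution idea concrete, the sketch below assembles an estimate from substructure counts, per-fragment coefficients and a molecular weight term. The sum is either used directly (linear form) or passed through a logistic function (a common non-linear form, assumed here). All fragment names, coefficients and the intercept are hypothetical placeholders, NOT the actual BIOWIN values. In practice the linear and non-linear models would each have their own fitted coefficients.

```python
import math

# Hypothetical coefficients for illustration only (NOT the real BIOWIN values)
INTERCEPT = 0.75
MW_COEF = -0.0005
FRAGMENT_COEFS = {
    "ester": 0.17,
    "aromatic_chloride": -0.18,
    "tertiary_amine": -0.20,
}


def contribution_sum(fragment_counts, mw):
    """Intercept + MW term + sum of (fragment coefficient x number of occurrences)."""
    total = INTERCEPT + MW_COEF * mw
    for name, count in fragment_counts.items():
        total += FRAGMENT_COEFS[name] * count
    return total


def linear_estimate(fragment_counts, mw):
    """Linear form: the contribution sum is used directly."""
    return contribution_sum(fragment_counts, mw)


def nonlinear_estimate(fragment_counts, mw):
    """Non-linear form (assumed logistic): maps the sum into the range 0 to 1."""
    return 1.0 / (1.0 + math.exp(-contribution_sum(fragment_counts, mw)))


# Hypothetical compound: one ester group, two aromatic chlorides, MW = 250
counts = {"ester": 1, "aromatic_chloride": 2}
lin = linear_estimate(counts, 250.0)
nonlin = nonlinear_estimate(counts, 250.0)
print(round(lin, 3), round(nonlin, 3))
```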
What Does the BIOWIN Model Do?
Materials and method
Finding molecular descriptors
- Software: Delft Chemtech, Dragon, Chem3D, etc.
Selection of molecular descriptors
1. PCA (SPSS)
2. Genetic Algorithm-Variable Subset Selection (Mobydigs)
Principal Component Analysis
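Structure Matrix (Components 1 and 2)
Variables, in row order: MW, eqwidth, width, MV, REJ, depth, length, dipole, HL_surf, log_Kow, po_surf, Biowin3
Loadings, in extracted order: .952; .905; .900; .879, -.395; .780; .723; .714, .898; .379, -.855; .444, .720; -.358, .713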
Extraction Method: Principal Component Analysis. Rotation Method: Oblimin with Kaiser Normalization.
Variables: MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area
Assessment of the suitability of the data for PCA: KMO > 0.6 (KMO = 0.6), Bartlett's Test of Sphericity < 0.05 (< 0.005)
Determination of the number of factors by the Kaiser criterion, scree plot and Monte Carlo parallel analysis
Principal Component Analysis (PCA)
The two-component solution explained a total of 67% of the variance, with Component 1 contributing 46% and Component 2 contributing 21%. Component 1 represents size; Component 2 represents hydrophobicity/hydrophilicity.
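The component-selection step can be illustrated with a small numerical sketch. It runs PCA on the correlation matrix of a synthetic descriptor table and applies the Kaiser criterion (keep components with eigenvalue > 1). The data, the two latent factors and all variable names are assumptions made for illustration. The analysis on the slides was done in SPSS with Oblimin rotation, which this unrotated sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic descriptor table: 50 hypothetical compounds, 6 descriptors.
# Two latent factors ("size" and "polarity") each drive three descriptors.
size = rng.normal(size=50)
polarity = rng.normal(size=50)
noise = rng.normal(scale=0.3, size=(50, 6))
X = np.column_stack([size, size, size, polarity, polarity, polarity]) + noise

# PCA on the correlation matrix (equivalent to PCA on standardized descriptors)
corr = np.corrcoef(X, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]  # largest eigenvalue first
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

explained = eigenvalues / eigenvalues.sum()  # fraction of variance per component
n_keep = int(np.sum(eigenvalues > 1.0))      # Kaiser criterion

# Unrotated loadings: eigenvectors scaled by the square root of their eigenvalues
loadings = eigenvectors[:, :n_keep] * np.sqrt(eigenvalues[:n_keep])

print("components kept:", n_keep)
print("variance explained by kept components:", round(float(explained[:n_keep].sum()), 2))
```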
Classification of PhACs by PCA (figure). Groups: HP-neutral, HP-ionic, HL-neutral, HL-ionic
Dependent variable: BIOWIN3
Independent variables (indices, chemical descriptors): MW, MV, log Kow, dipole, length, width, depth, equiv width, % HL surface, polar surface area

Group         R2    Std. error  Sig. (p)  Rej. range (%)    BIOWIN3 range
HL            0.76  0.21        < 0.05    6.70-98.5 (75)    1.86-3.60 (2.8)
HP (1)        -     -           -         37.5-99.1 (86)    1.52-2.96 (2.5)
HL-ionic      0.55  0.25        < 0.05    74.8-96.9 (91)    1.86-3.03 (2.6)
HP-ionic (1)  -     -           < 0.05    74.8-99.1 (95)    2.16-2.96 (2.7)
HL-neutral    0.84  0.19        < 0.05    6.7-98.5 (60)     2.28-3.59 (2.9)
HP-neutral    0.35  0.23        < 0.05    37.5-98.1 (79.7)  1.52-2.68 (2.3)

Equation to predict biodegradation
HL: 2.842 - 0.168 logKow - 0.008 MV + 1.039 length; (-59 + 170.06 width)
HP (1): -; 198 + 7.53 log_Kow - 42.75 length - 94.09 eqwidth
HL-ionic: 3.536 - 0.009 MW + 0.934 length; (138.81 - 5.04 logKow - 13.84 length - 94.09 HL_surf)
HP-ionic (1): -; (198.38 + 7.53 logKow - 42.57 length - 94.09 eqwidth)
HL-neutral: 3.323 - 1.88 logKow - 0.004 MV; (-119.89 + 4.53 logKow + 27704 eqwidth)
HP-neutral: 3.493 - 4.30 logKow; (122.38 - 32.16 logKow + 109.73 eqwidth - 0.78 HL_surf)
1. No equation could be derived for HP and HP-ionic compounds because of collinearity among the variables (a violation of MLR assumptions)
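The BIOWIN3 equations in the table above can be applied as in the following sketch. The coefficients are transcribed as printed on the slide. The function name, the dictionary layout and the example descriptor values are hypothetical, and the units of each descriptor are assumed to match those used when the equations were fitted.

```python
# Coefficients transcribed from the table; HP and HP-ionic have no BIOWIN3 equation
BIOWIN3_EQUATIONS = {
    "HL": {"intercept": 2.842, "logKow": -0.168, "MV": -0.008, "length": 1.039},
    "HL-ionic": {"intercept": 3.536, "MW": -0.009, "length": 0.934},
    "HL-neutral": {"intercept": 3.323, "logKow": -1.88, "MV": -0.004},
    "HP-neutral": {"intercept": 3.493, "logKow": -4.30},
}


def predict_biowin3(group, descriptors):
    """Evaluate the linear equation for a group: intercept + sum(coefficient x descriptor)."""
    equation = BIOWIN3_EQUATIONS[group]
    value = equation["intercept"]
    for name, coef in equation.items():
        if name != "intercept":
            value += coef * descriptors[name]
    return value


# Hypothetical HL-neutral compound (assumed descriptor values, not from the slides)
example = {"logKow": 0.5, "MV": 100.0}
estimate = predict_biowin3("HL-neutral", example)
print(round(estimate, 3))
```

Estimates are only meaningful for compounds whose descriptors fall within the range of the data used to fit each equation.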
Biodegradation (Aerobic)
Figure: Innovative system for removal of micropollutants: RBF and NF membrane
Labels: RBF, Membrane; days, days - weeks, weeks, weeks - months, months, longer
QSAR Models Decision Support Framework
Figure: Process selection and comparative performance assessment
Labels: Organic micropollutants; QSAR; Biological treatment; Physical/Chemical Treatment; Membrane; GAC; AOP; NF; RO; Cl2; O3; ARR; RBF/DUNE; BIOWIN; Kow; K O3; MW
Current QSAR project
Figure: project flowchart (2008, 2009, 2010), GIST and UNESCO-IHE
- Analysis of PhACs: LC-MS / AUTO SPE
- Selection of target compounds
- Physical-chemical characteristics vs. water treatments
- QSAR tools
- Selection of water treatments
- Selected water treatments
- Classification, database, model development
- PhACs removal using selected water treatments by GIST
- PhACs removal using selected water treatments by UNESCO-IHE
- A decision support tool for PhACs removal for water utility