CHAPTER 7
ASSESSMENT OF GROUNDWATER QUALITY USING
MULTIVARIATE STATISTICAL ANALYSIS
7.1 GENERAL
Many different sources and processes contribute to the deterioration in
quality and the contamination of water, both surface water and groundwater.
A thorough understanding of the nature and extent of contamination in an
area therefore requires detailed hydrochemical data (Helena et al 1999).
Unfortunately, very few studies have so far combined the effects of multiple
water quality variables to evaluate water quality and the extent and nature
of contamination (Shuxia et al 2003). Conventional techniques, including
Stiff and Piper plots, consider only major and minor ions when assessing the
chemical quality of surface water or groundwater. Given the limitations of
these traditional methods, the recent advances in analytical capabilities and
the availability of larger numbers of chemical parameters, wide-ranging
statistical techniques are now needed to assess water quality and the nature
and extent of contamination. In this regard, factor analysis is useful for
interpreting groundwater quality data and relating those data to specific
hydrogeologic and anthropogenic processes (Bakac 2000).
Multivariate data are data in which each observational unit is
characterized by several variables. The chemical quality of water is an
example suited to multivariate analysis, because it depends on factors such
as the composition of the host rock, the slope of the ground and the
movement of water. The chemical characteristics of water play a vital role in
its suitability for potable, agricultural and industrial purposes. Cluster
analysis is a statistical tool that groups similar pairs of correlations in a
large symmetric matrix. It reduces even a large data set to groups with
similar characteristics and provides a logical, pair-by-pair comparison
between the various chemical constituents. The results of cluster analysis
can be presented in a two-dimensional hierarchical diagram, in which the
natural breaks between the groups become obvious. An observer can pick out
groups at any desired level of similarity or dissimilarity (Parks 1966;
Till 1974; Rao 2003; Bhabesh et al 2007). Statistical associations do not
necessarily establish cause-and-effect relationships, but they present the
information in a compact format as the first step in the complete analysis of
the data, which can assist in generating hypotheses for the interpretation of
hydrochemical processes.
Statistical techniques such as cluster analysis provide a powerful
tool for analyzing water-chemistry data. With these methods, samples can be
grouped into distinct populations (hydrochemical groups) that are significant
in the geologic context as well as from a statistical point of view. Cluster
analysis has been used successfully (Alther 1979; Williams 1982; Farnham et al
2000) and applied to classify water-chemistry data (Cüneyt Güler et al 2002).
Mapping of groundwater contamination is often complicated by the
infrequent and uneven distribution of monitoring locations, analytical errors
in sample analyses, and large spatial variation in observed contaminants
over short distances due to complex hydrogeologic conditions. While
numerical simulation modeling is commonly used to delineate
groundwater contamination plumes, this approach may be limited by
insufficient knowledge of local hydrostratigraphic conditions. Also,
managing and mapping extensive water quality datasets can be difficult
because of the multiple locations, times and analyses involved.
An alternative to numerical simulation modeling uses statistical
analysis of groundwater quality data to infer zones of potential
contamination. Many studies have been conducted using Principal
Component Analysis (PCA) in the interpretation of water quality
parameters. PCA is a multivariate statistical procedure designed to
classify variables based on their correlations with each other. The goal of
PCA and other factor analysis procedures is to consolidate a large number
of observed variables into a smaller number of factors that can be more
readily interpreted. In the case of groundwater, concentrations of different
constituents may be correlated based on underlying physical and
chemical processes such as dissociation, ionic substitution or carbonate
equilibrium reactions. PCA helps to classify correlated variables into
groups more easily interpreted as these underlying processes. The number
of factors retained for a particular dataset is based on the amount of
non-random variation that explains the underlying processes. The more factors
extracted, the greater the cumulative amount of variation in the original
data that is explained.
Environmental monitoring systems have carried out many water
quality monitoring programs in recent years, and many of those programs
produce complicated data sets. These include physical properties, aggregate
organic constituents, nutrients, inorganic constituents, and biological and
microbiological characteristics. Such data sets are difficult to analyze and
interpret because of the latent interrelationships among parameters and
monitoring sites. It is therefore necessary to extract meaningful
information from large and complicated data sets without losing useful
information. It is also essential to optimize the monitoring network by
recognizing the representative parameters, delineating monitoring sites and
identifying latent pollution sources (Pekey et al 2004). The application of
multivariate statistical methods offers a better understanding of water
quality when interpreting complicated data sets. Traditional multivariate
statistical methods such as factor analysis (FA) and the correlation matrix
have been widely accepted in water quality assessment.
The objective of the study is to extract information about:
(i) the similarities or dissimilarities between the monitoring
periods and monitoring sites;
(ii) the significant parameters responsible for temporal and spatial
variations in water quality;
(iii) the hidden factors accounting for the structure of the data; and
(iv) the influence of the possible sources on the water quality
parameters.
The final results may be helpful for effective water quality management
as well as for rapid solutions to pollution problems (Morales et al 1999).
7.2 FACTOR ANALYSIS
Factor analysis attempts to explain the correlations between the
observations in terms of underlying factors, which are not directly
observable (Yu et al 2003). There are three stages in factor analysis
(Gupta et al 2005):
(i) A correlation matrix is generated for all the variables.
(ii) Factors are extracted from the correlation matrix based on
the correlation coefficients of the variables.
(iii) The factors are rotated to maximize the relationship between
some of the factors and the variables.
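The first two stages listed above can be sketched in a few lines of Python. This is an illustration only, not the procedure run in SPSS: the four-variable data set is hypothetical (it is not the Tirupur data), the function names are invented for this example, and power iteration with deflation is assumed here as a simple stand-in for the extraction routine. The sketch stops at the unrotated loadings.

```python
import math

def zscores(column):
    """Standardize one variable to zero mean and unit sample standard deviation."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (n - 1))
    return [(x - mean) / sd for x in column]

def correlation_matrix(columns):
    """Stage 1: Pearson correlation matrix (one list of values per variable)."""
    z = [zscores(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]

def mat_vec(m, v):
    """Matrix-vector product for plain nested lists."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def extract_factors(r, k, iterations=2000):
    """Stage 2: the k largest eigenpairs of r by power iteration with deflation."""
    p = len(r)
    work = [row[:] for row in r]               # copy, so r itself is not modified
    pairs = []
    for _ in range(k):
        v = [float(i + 1) for i in range(p)]   # arbitrary non-zero start vector
        for _step in range(iterations):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(work, v)))
        pairs.append((eigenvalue, v))
        for i in range(p):                     # deflate: remove the factor just found
            for j in range(p):
                work[i][j] -= eigenvalue * v[i] * v[j]
    return pairs

# Hypothetical values at eight wells (illustration only, not the study data).
names = ["TDS", "Cl", "Na", "pH"]
data = [
    [400, 650, 900, 1200, 1500, 1900, 2400, 3100],   # TDS
    [40, 110, 180, 300, 350, 520, 640, 900],         # Cl
    [30, 60, 95, 140, 150, 230, 260, 380],           # Na
    [7.6, 7.9, 7.4, 8.1, 7.5, 7.8, 7.3, 8.0],        # pH
]

r = correlation_matrix(data)
pairs = extract_factors(r, k=2)

# Unrotated loading of variable i on a factor = eigenvector element * sqrt(eigenvalue).
loadings = [[vec[i] * math.sqrt(val) for val, vec in pairs] for i in range(len(names))]
percent_variance = [100.0 * val / len(names) for val, _ in pairs]

for name, row in zip(names, loadings):
    print(name, [round(x, 3) for x in row])
print("% of variance:", [round(x, 2) for x in percent_variance])
```

Because the three ion-related variables in this toy data set rise together, they load on the first factor, while pH, which varies independently, dominates the second.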
The first step is the determination of the parameter correlation
matrix (the first stage above), which accounts for the degree of mutually
shared variability between individual pairs of water quality variables.
Then, the eigenvalues and factor loadings of the correlation matrix are
determined. Each eigenvalue corresponds to an eigenvector (factor), which
identifies a group of variables that are highly correlated among themselves.
Factors with low eigenvalues may contribute little to the explanatory
ability of the data, so only the first few components are needed to account
for much of the parameter variability. Once the correlation matrix and
eigenvalues are obtained, component loadings are used to measure the
correlation between the variables and the components. Component rotation is
used to facilitate interpretation by providing a simpler factor structure
(Zeng and Rasmussen 2005).
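As an illustration of how rotation yields a simpler factor structure, the sketch below rotates a two-factor loading matrix so that the variance of the squared loadings is as large as possible (the raw varimax criterion). Several points are assumptions: the text does not state which rotation was applied in SPSS, the loading values are hypothetical, and a brute-force search over the rotation angle replaces the iterative algorithm a statistics package would use.

```python
import math

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings (raw varimax)."""
    p = len(loadings)
    total = 0.0
    for f in range(len(loadings[0])):
        squared = [row[f] ** 2 for row in loadings]
        mean = sum(squared) / p
        total += sum((s - mean) ** 2 for s in squared) / p
    return total

def rotate(loadings, angle):
    """Orthogonal rotation of a two-factor loading matrix by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[a * c + b * s, -a * s + b * c] for a, b in loadings]

def varimax_two_factors(loadings, steps=9000):
    """Search angles in [0, pi/2) for the rotation that maximizes the criterion."""
    angles = (i * (math.pi / 2) / steps for i in range(steps))
    best_angle = max(angles, key=lambda t: varimax_criterion(rotate(loadings, t)))
    return rotate(loadings, best_angle), best_angle

# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [
    [0.70, 0.60],
    [0.65, 0.62],
    [0.60, -0.55],
    [0.62, -0.60],
]
rotated, best_angle = varimax_two_factors(unrotated)

for before, after in zip(unrotated, rotated):
    print(before, "->", [round(x, 3) for x in after])
```

An orthogonal rotation leaves each variable's communality (the sum of its squared loadings) unchanged; only the way that variance is shared among the factors changes.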
This study evaluated the possibility that a smaller group of water
quality parameters or locations might provide sufficient information for
water quality assessment. Principal component analysis was applied to a
groundwater quality data set collected from the study area of Tirupur Region,
Tirupur District, Tamil Nadu, India, using the Statistical Package for the
Social Sciences (SPSS 14.0 for Windows). Water quality monitoring was
conducted at 62 sample locations within the study area during three seasons
(June–July 2006, November–December 2006 and June–July 2011). The parameters
selected for the estimation of groundwater quality characteristics are:
Turbidity, pH, total hardness (TH), total dissolved solids (TDS), calcium
(Ca²⁺), magnesium (Mg²⁺), sodium (Na⁺), potassium (K⁺), bicarbonate (HCO₃⁻),
sulphate (SO₄²⁻), chloride (Cl⁻), nitrate (NO₃⁻), fluoride (F) and iron (Fe).
7.2.1 Spatial variation of groundwater quality using factor analysis
Factor analysis was carried out for the whole study area for the
pre-monsoon (2006), post-monsoon (2006) and pre-monsoon (2011) seasons.
Factor analysis is a multivariate statistical technique used to condense and
simplify a large set of variables into a smaller number of variables called
factors. This technique helps to identify the underlying factors that
determine the relationships among the observed variables. It provides an
empirical classification scheme in which the variables are clustered into
groups called factors.
7.2.1.1 Spatial variation of groundwater quality for the pre-monsoon
(2006)
The factor analysis (FA) generated three significant factors for the
pre-monsoon period, which together explain 75.922 % of the variance in the
data set. Table 7.1 gives the rotated factor loadings, communalities,
eigenvalues and the percentage of variance explained by these factors. The
factors are rotated in order to enhance their interpretability; rotation
usually improves the quality of interpretation of the factors. There are
several methods of rotating the initial factor matrix to attain a simple
structure of the data. In this study the factors were extracted by Principal
Components Analysis (PCA). After rotation, each original variable tends to be
associated with one (or a small number) of the factors and each factor
represents only a small number of variables. Table 7.2 shows the summary
statistics of the water quality parameters for the pre-monsoon (2006). The
parameters are grouped based on the factor loadings and the following factors
are identified:
Factor 1 (F1): TDS, TH, Ca, Cl, F, SO4, Na, K, HCO3 and NO3
Factor 2 (F2): Turbidity, Mg and Fe
Factor 3 (F3): pH
F1, F2 and F3 explain 55.609 %, 13.011 % and 7.301 % of the variance
respectively. F1 has high positive loadings on TDS, Na, Cl, K, TH, Ca, SO4,
HCO3, NO3 and F, which are 0.987, 0.930, 0.911, 0.894, 0.880, 0.844, 0.817,
0.767, 0.686 and 0.408 respectively. A high positive loading indicates a
strong linear correlation between the factor and the parameter. The
relationships of the factor loadings with the groundwater variables are shown
in Figure 7.1 for the pre-monsoon (2006).
Table 7.1 Rotated factor loadings of groundwater samples for the
pre-monsoon (2006)
Sl.No Parameters Factor 1 Factor 2 Factor 3 Communalities
1 Turbidity 0.425 0.792 -0.0998 0.818
2 TDS 0.987 -0.0536 0.025 0.978
3 pH 0.146 0.174 0.911 0.882
4 TH 0.880 -0.0598 -0.147 0.800
5 Ca 0.844 -0.0935 -0.127 0.736
6 Mg 0.798 0.0954 -0.176 0.668
7 Cl 0.911 0.0440 -0.065 0.837
8 F 0.408 -0.361 -0.154 0.32
9 SO4 0.817 -0.178 0.183 0.732
10 Na 0.930 -0.124 0.122 0.895
11 K 0.894 0.16 0.079 0.831
12 HCO3 0.767 0.108 -0.059 0.604
13 Fe 0.315 0.851 -0.042 0.826
14 NO3 0.686 -0.456 0.157 0.702
15 Eigen value 7.785 1.822 1.022 10.629
16 % of Variance 55.609 13.011 7.301 75.922
17 Cumulative % 55.609 68.62 75.922 -
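The last rows and the last column of Table 7.1 follow from two simple relations, which the short check below reproduces from the tabulated values. For an analysis based on the correlation matrix, each of the 14 standardized variables contributes one unit of variance, so the percentage of variance of a factor is its eigenvalue divided by 14; the communality of a variable is the sum of its squared loadings on the retained factors. The variable names in the code are chosen for this example only, and small differences in the third decimal place are rounding effects.

```python
# Values copied from Table 7.1 (pre-monsoon 2006).
N_VARIABLES = 14
eigenvalues = [7.785, 1.822, 1.022]
reported_percent = [55.609, 13.011, 7.301]

# Percentage of variance explained by each factor and the running total.
percent = [100.0 * ev / N_VARIABLES for ev in eigenvalues]
cumulative = [sum(percent[: i + 1]) for i in range(len(percent))]

# Three rows of the table: loadings on factors 1 to 3 and the reported communality.
rows = {
    "pH": ([0.146, 0.174, 0.911], 0.882),
    "Na": ([0.930, -0.124, 0.122], 0.895),
    "Fe": ([0.315, 0.851, -0.042], 0.826),
}
communality = {name: sum(x * x for x in loads) for name, (loads, _) in rows.items()}

for ev, pct, rep in zip(eigenvalues, percent, reported_percent):
    print(f"eigenvalue {ev}: computed {pct:.3f} %, reported {rep} %")
for name, (_, reported) in rows.items():
    print(f"{name}: computed communality {communality[name]:.3f}, reported {reported}")
```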
Table 7.2 Summary statistics of groundwater quality parameters for the
pre-monsoon (2006)
Sl.No Parameters Minimum Maximum Mean Variance Std. Deviation
1 Turbidity 2 18 6.65 11.15 3.339
2 TDS 399 3672 1291.97 500709.18 707.608
3 pH 7.30 8.25 7.70 1.00 1.00002
4 TH 192 956 460 37055.06 192.497
5 Ca 35 288 105.94 2373.14 48.715
6 Mg 13 107 50.76 472.32 21.733
7 Cl 31 1092 333.71 61172.14 247.33
8 F 0 2 0.90 0.22 0.4649
9 SO4 4 382 85.94 4898.88 69.992
10 Na 24 720 180.5 19124.29 138.291
11 K 7 224 66.6 2175.10 46.638
12 HCO3 129 733 346.9 16565.27 128.706
13 Fe 0 1.20 0.124 0.05 0.2193
14 NO3 6 520 79.47 8873.11 94.197
Figure 7.1 Distribution of variables among factors given by factor
analysis for the pre-monsoon (2006)
7.2.1.2 Spatial variation of groundwater quality for the post-monsoon
(2006)
The FA generated three significant factors for the post-monsoon
period, which together explain 74.458 % of the variance in the data set.
Table 7.3 gives the rotated factor loadings, communalities, eigenvalues and
the percentage of variance explained by these factors. The factors were
extracted by PCA and rotated to attain a simple structure of the data, which
improves the quality of interpretation of the factors. Table 7.4 shows the
summary statistics of the water quality parameters for the post-monsoon
(2006). The parameters are grouped based on the factor loadings and the
following factors are identified:
Factor 1 (F1): TDS, Cl, TH, Ca, Fe, Mg, SO4, F, NO3 and HCO3.
Factor 2 (F2): pH and K
Factor 3 (F3): Na and Turbidity
F1, F2 and F3 explain 51.946 %, 13.825 % and 8.687 % of the variance
respectively. F1 has high positive loadings on TDS, Cl, TH, Ca, Fe, Mg, SO4,
F and NO3, which are 0.966, 0.966, 0.947, 0.864, 0.842, 0.798, 0.777, 0.701
and 0.568 respectively. A high positive loading indicates a strong linear
correlation between the factor and the parameter. The relationships of the
factor loadings with the groundwater variables are shown in Figure 7.2 for
the post-monsoon (2006).
Table 7.3 Rotated factor loadings for the post-monsoon (2006)
Parameters Factor 1 Factor 2 Factor 3 Communalities
Turbidity 0.517 0.246 0.549 0.629
TDS 0.966 -0.031 -0.120 0.949
pH -0.412 0.566 -0.212 0.536
TH 0.947 0.151 -0.127 0.935
Ca 0.864 -0.037 -0.434 0.937
Mg 0.798 -0.098 0.119 0.661
Cl 0.966 0.104 -0.104 0.955
F 0.701 0.018 0.069 0.497
SO4 0.777 0.236 -0.249 0.722
Na 0.605 -0.028 0.615 0.745
K 0.400 0.528 0.357 0.567
HCO3 0.186 -0.861 0.171 0.804
Fe 0.842 0.161 -0.123 0.749
NO3 0.568 -0.637 -0.0935 0.737
Eigen value 7.272 51.946 51.946 111.164
% of Variance 51.946 13.825 8.687 74.458
Cumulative % 51.946 65.771 74.458 -
Table 7.4 Summary statistics of water quality parameters for the post-
monsoon (2006)
Parameters Minimum Maximum Mean Std. Deviation Variance
Turbidity 0 38 7.58 5.925 35.107
TDS 198 5,119 1,164.68 831.18 690859.402
pH 7.07 8.85 7.68 0.34695 0.12038
TH 114 2,558 696 470.821 221672.451
Ca 15 1,023 149.27 164.461 27047.35
Mg 0 319 74.56 61.638 3799.299
Cl 18 2,249 359.89 403.596 162890.069
F 0 1 0.40 0.330 0.1090
SO4 0 427 79.47 85.331 7281.335
Na 8 220 88.63 50.552 2555.483
K 1 91 22.82 19.732 389.361
HCO3 53 650 186 134.099 17982.609
Fe 0 1.20 0.166 0.2055 0.0422
NO3 0 125 34.05 25.408 645.555
Figure 7.2 Distribution of variables among factors given by factor
analysis for the post-monsoon (2006)
7.2.1.3 Spatial variation of groundwater quality for the pre-monsoon
(2011)
The FA generated three significant factors for the pre-monsoon
(2011), which together explain 72.879 % of the variance in the data set.
Table 7.5 gives the rotated factor loadings, communalities, eigenvalues and
the percentage of variance explained by these factors. The factors were
extracted by PCA and rotated to attain a simple structure of the data.
Table 7.6 gives the summary statistics of the water quality parameters for
the pre-monsoon (2011).
Table 7.5 Rotated factor loadings of groundwater samples for the pre-
monsoon (2011)
Sl.No Parameters Factor 1 Factor 2 Factor 3 Communalities
1 Turbidity 0.139 0.073 0.781 0.635
2 TDS 0.813 0.544 0.098 0.967
3 pH -0.266 0.111 0.788 0.704
4 TH 0.963 0.139 0.073 0.934
5 Ca 0.838 0.813 0.544 0.715
6 Mg 0.841 -0.266 0.111 0.709
7 Cl 0.910 0.963 0.067 0.960
8 F -0.084 0.838 0.047 0.223
9 SO4 0.738 0.841 0.014 0.852
10 Na 0.518 0.910 0.359 0.928
11 K 0.291 -0.084 -0.013 0.788
12 HCO3 -0.122 0.738 0.334 0.525
13 Fe 0.771 0.518 0.793 0.603
14 NO3 -0.003 0.291 0.838 0.661
15 Eigen value 5.433 2.855 1.915 10.203
16 % of Variance 38.808 20.395 13.676 72.879
17 Cumulative % 38.808 59.203 72.879 -
The parameters are grouped based on the factor loadings and the
following factors are identified:
Factor 1 (F1): TDS, TH, Ca, Mg and K
Factor 2 (F2): Cl, F, SO4, Na and HCO3
Factor 3 (F3): Turbidity, pH, Fe and NO3
F1, F2 and F3 explain 38.808 %, 20.395 % and 13.676 % of the variance
respectively. F1 has high positive loadings on TH, Mg, Ca and TDS, which are
0.963, 0.841, 0.838 and 0.813 respectively, whereas the loading on K is only
0.291. A high positive loading indicates a strong linear correlation between
the factor and the parameter. The relationships of the factor loadings with
the groundwater variables are shown in Figure 7.3 for the pre-monsoon
(2011).
Table 7.6 Summary statistics of groundwater quality parameters for the
pre-monsoon (2011)
Sl.No Parameters Minimum Maximum Mean Variance Std. Deviation
1 Turbidity 0 18 6.40 12.704 3.564
2 TDS 543 5990 1763.71 1034541.291 1017.124
3 pH 6.60 8.00 7.56 0.088 0.2969
4 TH 212 3600 776.68 277367.107 526.657
5 Ca 28 913 166.02 19395.951 139.269
6 Mg 0 480 92.15 5554.766 74.530
7 Cl 34 3190 541.24 265453.231 515.222
8 F 0 2.10 0.70 1018.741 31.9177
9 SO4 0 1210 158.8905 34280.769 185.15066
10 Na 24 1120 224.45 40603.498 201.503
11 K 7 269 67.40 3725.359 61.036
12 HCO3 138 787 411.02 23009.524 151.689
13 Fe 0 1.10 0.191 0.031 0.1760
14 NO3 0 569 76.118 7113.857 84.3437
Figure 7.3 Distribution of variables among factors given by
factor analysis for the pre-monsoon (2011)
7.3 CORRELATION OF PHYSICOCHEMICAL PARAMETERS
OF GROUNDWATER
The correlation coefficient is a commonly used measure of the
relationship between two variables. It simply indicates how well one
variable predicts the other (Krumbein and Graybill 1965) and accounts for the
degree of mutually shared variability between individual pairs of water
quality variables. Its application has been broadened to study the
relationship between two or more hydrologic variables, and also to
investigate the dependence between successive values of a series of
hydrologic data. The analytical data of the 62 groundwater samples spread
over the study area are correlated for each season. The groundwater quality
parameters considered for correlation are Turbidity, TDS, pH, TH, Ca, Mg,
Cl, F, SO4, Na, K, HCO3, Fe and NO3. In general, highly polluted
groundwater samples have low oxidation-reduction potential because of the
reducing atmosphere (Sunil Kumar Srivastava and Ramanathan 2007). The
results are summarized in Tables 7.7, 7.8 and 7.9 for the three seasons.
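For reference, the correlation coefficient described above can be computed as in the following sketch. The two series are hypothetical values for eight wells, not the study data, and the function names are chosen for this example. The square of the coefficient is reported as the shared variability of the pair, and the same function applied to a series and its one-step shift gives the dependence between successive values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lag1_correlation(series):
    """Correlation between successive values of a single series."""
    return pearson(series[:-1], series[1:])

# Hypothetical TDS and Na concentrations (mg/l) at eight wells.
tds = [400, 650, 900, 1200, 1500, 1900, 2400, 3100]
na = [30, 60, 95, 140, 150, 230, 260, 380]

r_tds_na = pearson(tds, na)
shared_variability = r_tds_na ** 2   # fraction of variance the pair has in common

print(f"r(TDS, Na) = {r_tds_na:.4f}")
print(f"shared variability = {100 * shared_variability:.1f} %")
```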
7.3.1 Correlation of physicochemical parameters of groundwater for
the pre-monsoon (2006)
During the pre-monsoon (2006), TDS showed good positive correlation
with Na and K. The pairs TDS-TH, TDS-Ca, TDS-SO4, TH-Ca, TH-Mg, Cl-Na,
Cl-K, Na-SO4 and Na-K also have more significant correlations. TDS-Mg,
TDS-HCO3, TDS-NO3, TH-Cl, TH-Na, TH-HCO3, Ca-Cl, Ca-Na, Na-NO3 and
Turbidity-Fe have good positive correlations. Further, the pairs TH-SO4,
TH-K, Ca-Mg, Ca-SO4, Ca-K, Ca-HCO3, Mg-Cl, Mg-Na, Mg-K, Mg-HCO3, Cl-SO4,
Cl-HCO3, Na-HCO3, K-HCO3, TH-NO3, Mg-SO4 and Mg-NO3 exhibit positive
correlations. The details are given in Table 7.7.
7.3.2 Correlation of physicochemical parameters of groundwater
for the post-monsoon (2006)
During the post-monsoon (2006), TDS showed good positive correlation
with Ca and Cl, and TH with Cl. The pairs TH-Ca, TH-Fe, Ca-Cl, Mg-Cl and
Cl-Fe also showed more significant correlations. TDS-Mg, TDS-SO4, TDS-Na,
TDS-Fe, TH-Mg, TH-SO4, Ca-SO4, Ca-Fe and Cl-SO4 indicated good positive
correlations. Also, the pairs TDS-F, TDS-NO3, TH-F, Ca-Mg, Ca-F, Ca-NO3,
Mg-F, Mg-SO4, Mg-Fe, Cl-F and Cl-Na exhibited positive correlations. The
details are given in Table 7.8.
7.3.3 Correlation of physicochemical parameters of groundwater
for the pre-monsoon (2011)
During the pre-monsoon (2011), TDS showed good positive correlation
with Cl, and TH with Cl. The pairs TDS-TH, TDS-Na, TH-Ca, TH-Mg, Mg-Cl and
Na-K showed more significant correlations. Also, TDS-Ca, TDS-Mg, TDS-SO4,
TH-SO4, Ca-Cl, Cl-SO4, Cl-Na and Na-SO4 indicated good positive
correlations. Further, TDS-K, TH-Fe and Cl-Fe exhibited positive
correlations. The details are given in Table 7.9.
Table 7.7 Correlation of physicochemical parameters of groundwater during the pre-monsoon (2006)
Parameters Turbidity TDS pH TH Ca Mg Cl F SO4 Na K HCO3 Fe NO3
Turbidity 1.0000
TDS 0.3671 1.0000
pH 0.1111 0.1331 1.0000
TH 0.3373 0.8379 0.0628 1.0000
Ca 0.3036 0.8272 0.0662 0.8201 1.0000
Mg 0.3559 0.7533 0.0376 0.8031 0.6011 1.0000
Cl 0.3831 0.3287 0.0887 0.7724 0.7865 0.6775 1.0000
F -0.0294 0.3749 0.0061 0.3574 0.4173 0.3093 0.3860 1.0000
SO4 0.2188 0.8106 0.1574 0.6705 0.6371 0.5768 0.6539 0.2893 1.0000
Na 0.2868 0.9688 0.1506 0.7396 0.7251 0.6244 0.8885 0.3427 0.8108 1.0000
K 0.4459 0.9049 0.1619 0.6814 0.6271 0.6211 0.8817 0.2904 0.7187 0.8848 1.0000
HCO3 0.3351 0.7215 0.1143 0.7048 0.6831 0.6755 0.6208 0.1423 0.5213 0.6203 0.6491 1.0000
Fe 0.7451 0.2560 0.1297 0.1801 0.1606 0.2275 0.3013 -0.0338 0.1372 0.1880 0.4337 0.2752 1.0000
NO3 -0.0032 0.7016 0.0711 0.5396 0.5110 0.5273 0.4796 0.3352 0.7037 0.7417 0.5340 0.4448 -0.10344 1.0000
Table 7.8 Correlation of physicochemical parameters of groundwater during the post-monsoon (2006)
Parameters Turbidity TDS pH TH Ca Mg Cl F SO4 Na K HCO3 Fe NO3
Turbidity 1.0000
TDS 0.3707 1.0000
pH -0.1936 -0.3611 1.0000
TH 0.4946 0.9079 -0.3348 1.0000
Ca 0.2267 0.9066 -0.2854 0.8601 1.0000
Mg 0.3872 0.7995 -0.3380 0.7003 0.5477 1.0000
Cl 0.4209 0.9772 -0.3165 0.9331 0.8813 0.8045 1.0000
F 0.4296 0.6283 -0.1685 0.6847 0.5260 0.6296 0.6350 1.0000
SO4 0.3718 0.7410 -0.1627 0.7560 0.7786 0.5205 0.7666 0.4072 1.0000
Na 0.5315 0.7410 -0.2966 0.4535 0.2900 0.4614 0.5360 0.3647 0.4021 1.0000
K 0.2825 0.3785 0.0215 0.3769 0.1973 0.2787 0.4132 0.2225 0.2705 0.3939 1.0000
HCO3 -0.1226 0.2196 -0.4344 -0.0066 0.1179 0.3164 0.0723 0.1121 -0.0833 0.2526 -0.1984 1.0000
Fe 0.3763 0.7800 -0.2796 0.8587 0.7407 0.573 0.8070 0.4917 0.6718 0.4364 0.413 0.0093 1.0000
NO3 0.1834 0.5479 -0.4688 0.4392 0.5888 0.3573 0.4536 0.3892 0.2958 0.3175 -0.1023 0.0093 0.3976 1.0000
Table 7.9 Correlation of physicochemical parameters of groundwater during the pre-monsoon (2011)
Parameters Turbidity TDS pH TH Ca Mg Cl F SO4 Na K HCO3 Fe NO3
Turbidity 1.000
TDS 0.218 1.000
pH 0.410 -0.075 1.000
TH 0.108 0.817 -0.274 1.000
Ca 0.172 0.767 -0.150 0.836 1.000
Mg -0.066 0.708 -0.171 0.801 0.591 1.000
Cl 0.104 0.950 -0.235 0.910 0.789 0.803 1.000
F -0.227 -0.070 -0.183 -0.022 -0.019 -0.027 -0.039 1.000
SO4 -0.209 0.709 -0.487 0.726 0.585 0.582 0.769 0.004 1.000
Na 0.056 0.810 -0.232 0.545 0.367 0.427 0.752 -0.048 0.754 1.000
K 0.153 0.669 -0.001 0.309 0.201 0.170 0.571 -0.068 0.495 0.841 1.000
HCO3 0.212 0.272 0.360 -0.093 0.020 0.026 0.089 -0.153 -0.124 0.294 0.323 1.000
Fe 0.196 0.575 -0.070 0.679 0.475 0.596 0.634 -0.061 0.546 0.456 0.347 -0.116 1.000
NO3 0.166 0.464 0.279 0.087 0.137 0.026 0.261 -0.020 0.172 0.538 0.515 0.345 -0.032 1.000
7.4 CLUSTER ANALYSIS
The assumptions of cluster analysis techniques include
homoscedasticity (equal variance) and normal distribution of the variables
(Alther 1979). However, an equal weighting of all the variables requires
log-transformation and standardization (z-scores) of the data. Comparisons
based on multiple parameters from different samples are made and the samples
are grouped according to their ‘similarity’ to each other. The
classification of samples according to their parameters is termed Q-mode
classification. This approach is commonly applied to water-chemistry
investigations in order to define groups of samples that have similar
chemical and physical characteristics, because a single parameter is rarely
sufficient to distinguish between different water types. Individual samples
are compared using the specified similarity/dissimilarity measure and are
then grouped into clusters by the linkage method. The linkage rule used here
is Ward’s method (Ward 1963). Linkage rules iteratively link nearby points
(samples) by using the similarity matrix. The initial cluster is formed by
linkage of the two samples with the greatest similarity. Ward’s method is
distinct from the other methods because it uses an analysis of variance
(ANOVA) approach to evaluate the distances between clusters. Ward’s method
calculates the error sum of squares, which is the sum of the squared
distances from each individual to the center of its parent group (Judd
1980). This forms smaller, more distinct clusters than those formed by other
methods (StatSoft Inc. 1995).
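The Q-mode procedure described above (log-transformation, z-scores, then Ward linkage on Euclidean distances) can be illustrated with a small sketch. The six samples and the function names are hypothetical, and the implementation is a direct, unoptimized version of Ward's rule: at each step it merges the two clusters whose union gives the smallest increase in the error sum of squares.

```python
import math

def standardize(samples):
    """Log-transform every variable, then convert each one to z-scores."""
    logged = [[math.log10(v) for v in row] for row in samples]
    columns = list(zip(*logged))
    means = [sum(c) / len(c) for c in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1))
           for c, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in logged]

def centroid(points):
    """Mean position of a group of points."""
    return [sum(c) / len(points) for c in zip(*points)]

def ward_cost(a, b):
    """Increase in the error sum of squares if groups a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    squared_distance = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * squared_distance

def ward_clusters(points, n_clusters):
    """Agglomerate until n_clusters remain; returns lists of sample indices."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = ward_cost([points[k] for k in clusters[i]],
                                 [points[k] for k in clusters[j]])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]
    return clusters

# Hypothetical samples: (TDS, Cl) in mg/l for three fresh and three saline wells.
samples = [
    (420, 45), (460, 50), (500, 60),
    (2800, 900), (3100, 1000), (3300, 1100),
]
groups = ward_clusters(standardize(samples), n_clusters=2)
print(groups)
```

The log-transformation requires strictly positive concentrations, so values reported as zero would need separate treatment before a step of this kind.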
Cluster analysis has been carried out to support the geological
interpretation of the hydrogeochemical data. Cluster analysis has been
useful in studying similar pairs or groups of chemical constituents of
water. The similarity/dissimilarity measurements and linkage methods used
for clustering greatly affect the outcome of the Hierarchical Cluster
Analysis (HCA). After a careful examination of the available combinations of
similarity/dissimilarity measurements, it was found that using Euclidean
distance (the straight-line distance between two points in the c-dimensional
space defined by c variables) as the similarity measurement, together with
Ward’s method for linkage, produced the most distinctive groups. In these
groups, each member is more similar to its fellow members than to any member
outside the group. The HCA technique does not provide a statistical test of
group differences; however, there are tests that can be applied externally
for this purpose (Cüneyt Güler et al 2002). It is also possible in HCA
results that a single sample that does not belong to any of the groups is
placed in a group by itself. This unusual sample is considered as a residue.
The values of the chemical constituents were subjected to hierarchical
cluster analysis. Based on the correlation coefficients, the most similar
pairs or groups of chemical constituents were linked first, then the next
most similar pairs or groups, and so on, until all the chemical constituents
had been clustered in a dendrogram by an averaging method (Davis 1973; 1986).
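The linking of constituents by correlation can be traced by hand for a small subset. The sketch below uses five parameters and their coefficients from Table 7.7 (pre-monsoon 2006) and repeatedly joins the two groups with the highest average correlation. Unweighted averaging is an assumption made here, since the text specifies only "an averaging method", and the function names are chosen for this example, so the merge order is illustrative and is not meant to reproduce Figure 7.4.

```python
from itertools import combinations

# Correlation coefficients copied from Table 7.7 for five of the 14 parameters.
r = {
    frozenset(("TDS", "Turbidity")): 0.3671,
    frozenset(("Na", "Turbidity")): 0.2868,
    frozenset(("K", "Turbidity")): 0.4459,
    frozenset(("Fe", "Turbidity")): 0.7451,
    frozenset(("TDS", "Na")): 0.9688,
    frozenset(("TDS", "K")): 0.9049,
    frozenset(("TDS", "Fe")): 0.2560,
    frozenset(("Na", "K")): 0.8848,
    frozenset(("Na", "Fe")): 0.1880,
    frozenset(("K", "Fe")): 0.4337,
}

def similarity(a, b):
    """Average correlation between every member of group a and of group b."""
    values = [r[frozenset((x, y))] for x in a for y in b]
    return sum(values) / len(values)

def average_linkage(variables):
    """Return the sequence of merges as (group_a, group_b, similarity)."""
    clusters = [(v,) for v in variables]
    merges = []
    while len(clusters) > 1:
        a, b = max(combinations(clusters, 2), key=lambda pair: similarity(*pair))
        merges.append((a, b, similarity(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges

merges = average_linkage(["Turbidity", "TDS", "Na", "K", "Fe"])
for a, b, s in merges:
    print(f"{' + '.join(a)}  <->  {' + '.join(b)}  (similarity {s:.4f})")
```

For this subset, TDS and Na are linked first, K then joins them, Turbidity and Fe form a second group, and the two groups are joined last at a much lower similarity.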
7.4.1 Cluster analysis of groundwater samples
A 14 × 14 matrix of correlation coefficients was computed for each
season to perform the cluster analysis (Tables 7.7, 7.8 and 7.9).
Correlation matrices for the various stages of clustering were then
obtained. Hierarchical dendrograms of the determined physical and chemical
parameters for all the study sites were plotted for the pre-monsoon (2006),
post-monsoon (2006) and pre-monsoon (2011) (Figures 7.4, 7.5 and 7.6). The
dendrogram in CA provides a useful graphical tool for determining the number
of clusters that describe the underlying process leading to spatial
variation (Papaioannai et al 2010). The CA results established that the
parameters were principally separated into two big clusters:
Cluster 1 (10 parameters): F, Fe, Turbidity, pH, Mg, K, Ca, SO4, NO3 and Na
Cluster 2 (3 parameters): Cl, HCO3 and TH
A careful consideration of the content of the clusters reveals that
during the pre-monsoon the first cluster included the dominant chemical
parameters (F, Fe, Mg, K, Ca, SO4, NO3 and Na) and two physical parameters
(Turbidity and pH). The second cluster consisted of two chemical parameters
(Cl and HCO3) and one physical parameter (TH). During the post-monsoon, the
first cluster included the dominant chemical parameters (F, Fe, K, NO3, Mg,
Na, SO4, Ca and HCO3) and one physical parameter (Turbidity). The second
cluster included one chemical parameter (Cl) and one physical parameter
(TH). In all the seasons, the physical parameter TDS clustered
independently.
Figure 7.4 Dendrogram for cluster analysis of groundwater for the pre-
monsoon (2006)
Figure 7.5 Dendrogram for cluster analysis of groundwater for the
post-monsoon (2006)
Figure 7.6 Dendrogram for cluster analysis of groundwater for the pre-
monsoon (2011)
The data analysis showed that the individual physicochemical
parameters should be compared and related with all the other physicochemical
values simultaneously, not individually. For instance, within a group of
parameters (Figure 7.4) such as (Cl, HCO3 and TH), there is a stronger
relation between the chemical parameters (Cl and HCO3) and the physical
parameter (TH), and likewise between parameters such as (SO4, NO3 and Na)
and the chemical parameters (F, Fe, Mg, K and Ca) and the physical
parameters (Turbidity and pH). The study revealed that the clustering of
parameters was more or less the same in all the seasons.