Indian Journal of Chemistry Vol. 41A, March 2002, pp. 457-461
Exponent-dependent properties of the connectivity index
Ivan Gutman*, Mirko Lepovic & Dusica Vidovic Faculty of Science, University of Kragujevac, P. O. Box 60, YU-34000 Kragujevac, Yugoslavia
and
Lane H Clark Southern Illinois University at Carbondale, Carbondale, IL 62901-4408, USA
Received 20 November 2001
The connectivity index is defined as e(A) =~(ouOvl' , where Ov is the degree of the vertex v of the respective molecular graph, and where the summation embraces all pairs of adjacent vertices. The exponent A is usually chosen to be equal to -0.5 but other options have been considered as well. especially C( -I). We show that whereas C( -0.5) correctly reflects the extent of branching of the carbon-atom skeleton of organic molecules, and is thus a suitable topological index for modelling physico-chemical properties of the respective compounds, this is not the case when the exponent A assumes larger negative values, in particular when 1.=-1. The value of A is established beyond which C(A) fail s to be a measure of molecular branching.
Introduction The connectivity index (introduced in 1975 and
originally called "branching index"] is defined as:
.. . (1) /l, V
where 8. denotes the degree (= number of first neighbours) of the vertex v of the molecular graph G, and where the summation embraces all pairs of adjacent vertices of G. This structure-descriptor eventually became one of the most popular topological indices. On its applications for predicting physico-chemical and pharmacologic properties of organic compounds two books2.3 and scores of papers were written; details and further bibliography can be found in the recent monographs4
-6 and the review7
.
Formula (1) is a special case of a more general "connectivity index" C, defined as:
... (2) II , V
Evidently, X,= C( -0.5). Several recent mathematical studies8
.9 are devoted
to C(A), especially C( -I). We mention in passing that C(A) for A=+ I is the so-called "2nd Zagreb group index"; for details see the handbookS.
In the earlier paper 1 no convincing argument was given why in formula (2) one should choose A= -0.5.
The original aim was to provide a measure of the branching of the carbon-atom skeleton of alkanes. Based on the analysis of the data obtai~,~d for butane, pentane and hexane isomers one arrived at the conclusion I that both choices A= -0.5 and A = -I were equally plausible. Nevertheless, preference was given to the former choice because C( -0.5) had greater isomer-discriminating power than C(-l).
The possibility that the exponent A assumes values other than -0.5 was examined by several authors 10-14.
Indeed, if one would view the exponent A as a variable that is adjusted so as to optimize the correlation between C(A) and some physico-chemical property, then A so determined would differ from -0.5. The disadvantage of such an approach is that the A-value so determined would significantly depend both on the physico-chemical property considered and on the sample of molecules employed.
The option A = -I was recently examined in some detail by Moon and one of the present authors8
. In what follows we denote C( -1)= C( -I ;G) by J..l = J..l(G) .
In this paper we are concerned with certain properties of the connectivity index C(A) of trees. Recall that a tree is a connected acyclic graph. A chemical tree is a tree with property 8v ~ 4 for all vertices v. Chemical trees provide a graph representation of alkanes .
Connectivity indices vs. branching Much work has been devoted to provide a precise
458 INDIAN J. CHEM., SEC. A, MARCH 2002
characterization of "branching" and to order (isomeric) molecules according to their extent of branching; see 15
-19 for details and further references. In spite
of all these efforts there is no general agreement about what "branching" is and how it could/should be measured.
With regard to the extent of branching of trees, there is a generally accepted (and intuitively plausible) assertion: Among n-vertex trees the path Pn and the star Sn (see Fig.l) are the least and the most branched species, respectively. In view of this, a necessary condition for any topological index to be an acceptable measure of branching is that its values be extremal for Pn and Sn. The connectivity index satisfi es this requirement: if T is any n-vertex tree, but T:f:. P,,, Sn, then:
. .. (3)
The proofs (which are far from being easy) of the general validity of the left- and right-hand sides of (3) were offered only a few years ago by Caporossi et al. 20 and Bollobas and Erdos2l
, respectively. The path Pn is, of course, a chemical tree (representing the respective normal alkane) . The star Sn is a chemical tree on ly up to 11=5. For II ~6 also the chemical trees with minimal connectivity index were determined22
•
In the earlier paper I it was anticipated that analogous inequalities hold when A=-], namely:
• • • • • * p n
Sn
•
•
Xn n = even
Fig. I-The n-vertex trees with maximal and minimal Randic index X= C (-0.5) are the path Pn (least branched, with the property that Oy ~ 2 fo r all vertices) and the star Sn (most branched, possessing a vertex v wilh Oy =n-I), respectively; if A. assumes large negative values then the n-vertex trees with maximal and minimal C(A.) are Xn and Snn respectively. The tree Xn can be described as the n-veltex tree possessing a vertex v with Oy = [1 n/2] and [(n-I )/2] vertices of degree 2, all adjacent to 1'.
... (4)
As a kind of surprise, it was recently established8 that for n being sufficiently large the left-·hand side of (4) is violated, i.e., Pn is not the tree with maximal )1-
value. Consequently, )1 cannot be used as a measure of branching and its applicability in QSPR and QSAR studies is doubtful.
In Fig. 2 we illustrate this conclusion by comparing the correlation of X= C(-0.5) and )1= C(-l) with the boiling points of a selected set of alkanes. In the diagram a) (that, deliberately, is identical to what earlier was reported23
, the correlation coeffic ient is 0.998
150
100
50
bp 0
-50
-100
-150
~=-1 ' 0 ,.>'~ :.,r. • :
/ '" . .' .
/ _./ . -200 I~(~b)~, ----l_L--'---L~----l---:~...l.-~--'-_;:~
-0.5 0·0 0·5 10 1-5 2·0 Q·5 /oJ
Fig. 2-a): Correlation between the normal boi ling points of a selected set of 40 alkanes (same as employed in ref. 23) and the Randic index X. For the third-degree polynomial approximation b.p. '" A+BX+CX2+DX3 (as recommended in ref. 23), one gets A=-165.7±4.3, B=81.1±7.7, C=6.6±4.2, D=-2.17±0.66, correlation coefficient R=0.9977, standard deviation SD=4.37°C. b): Correlation between the same boiling points and the ,u-index. The data obtained by least-squares fitting of b.p. '" A+B,u +C,u2+ D,u3 are: A=-165.1±17.4, B=36.5±53.6, C= I44.3±52.2, D=-46.62 ±14.6, R=0.9626, SD=17.45°C, confirming that the QSPR applicabi lity of ,u is significantly inferior to that of X.
GUTMAN et al.: EXPONENT -DEPENDENT PROPERTIES OF THE CONNECTIVITY INDEX 459
whereas in the diagram b) the correlation coefficient is only 0.963; more data on these correlations is found in the caption of Fig. 2.
In order to learn more on the breakdown of the inequalities (4) we determined the n-vertex tree(s) with maximal and minimal J..l-values for n up to 20. For all examined values of n the star has minimal J..l-value, in agreement with (4). For n ~ 9 the path has maximal J..lvalue, in agreement with (4). However, for n ~1O the trees with maximal J..l differ significantly from Po; these trees are depicted in Fig. 3.
From Fig. 3, it is evident that the trees with maximal J..l-index are highly branched. Otherwise, their general structure is not easy to characterize. For n=16 and n=19 (and most probably for other values of n > 20) the tree with maximal J..l is not unique.
Most of the species depicted in Fig. 3 are chemical trees; exceptions are only the 19-vertex tree IV and the 20-vertex tree. From the analysis outlined in the subsequent section it is evident that for n exceeding 20 more and more non-chemical trees will be encountered.
_.lL _JJ __ -fL n= 10 n=l! n= 12
~.- .-it-- -1L. n=13 n=14 n=15
-L{J- __ WL .II I III
n=16
n= 17 n= 18 n=20
·-It.: n= 19
Fig. 3--The n-vertex trees with maximal connectivity indices Jl =C(-I). For n=I6 and n=19 there are four distinct trees with equal maximal Jl-value. For n=JO,II, . . . ,20 the respective maximal Jlvalues are: 2.77778, 3.02778, 3.29167, 3.55556, 3.S1250, 4.08333,4.60417, 4.S75OO, 5.12500 and 5.40000.
Intending to shed more light on the phenomenon described above we compared the orderings of trees according to decreasing X and J..l, respectively. Two characteristic results are shown in Figs 4 and 5. As seen from these figures, the two orderings are significantly different, especially for larger values of n (as in Fig. 5).
The data presented in Figs 4 and 5 clearly show that the connectivity index for A = -0.5 does, and the connectivity index for A=-l does not provide a plausible measure of molecular branching. In other words, when the exponent A is decreased from -0.5 to -1 .0 a drastic change occurs in the structure-dependence of the connectivity index C(A), making it unsuitable for QSPR and QSAR purposes (at least, as far as branching-dependent molecular properties are concerned). In the case of n-vertex chemical trees this happens for all n ~10.
This change begins (or, to be more precise: becomes fully obvious) at a "critical" value of A for which the equality:
. .. (5)
is satisfied by the first n-vertex tree Terit. different from Po. Usually (but not always) Terit is just one of the trees depicted in Fig. 3.
A=- 0.5
--1_._ ._L_ al a2 a3
i ___ ._ 1_ ._ _ _ L-_ .JL-as a6 a7 a8
A=-lO
_J_._ .. _l._ bl b2 b3 b4
._L_ . +_ ._ 1 L b5 b6 b7
I
II
b8
Fig. 4--The first few lO-vertex trees ordered according to decreasing connectivity indices: the species aI-aS pertain to X whereas b,-b8 to Jl : X(al) = 4.91421, X(a2) = x(a3)= X(a4)=
4.S4606, X(a5)=X(a6)=X(a7)=4.S0S06, X (as)=4.79475, Jl (bl)= 2.7777S, Jl (b2)= Jl (b3) == Jl (b6) = 2.75000, Jl' (b7)= Jl (bS)= 2.69444. Note that al=b2. a2=b3, a3=b4 and a4=b5 ; hence in the case n=JO there is still some agreement between the two orderings.
460 INDIAN 1. CHEM .• SEC. A. MARCH 2002
;\=-0·5
-'.-'-01 ~. cI c2
I .. -_ . __ 1---__ _
c4 c5
I __ -L __
c7 c8
-+L _JlL_ : I dI
! I 1 -d6
d3
d7 d8
----.1.. __
I c3
c6
c9
d4 d5
d9
Fig. 5--Same data as in Fig. 4. for 11=14: x(cl)=6.91421. X(c2)= X(c3)= = X(c9)=6.84606. Jl(dl)=3.81250. Jl(d2)= Jl(d3)= 3.80556. Jl(d4)=3.80000. Jl(d5)= Jl(d6)= ....... =Jl(d9)=3.79167. No tree cl-c9 coincides with any of the trees dl-d9 indicating a complete disagreement between the two orderings.
In Table 1 are given the critical values for the exponent A, calculated by means of Eq. (5), as well as the trees Terit .
From the data given in Table 1 we see that connectivity index loses its "branching index" character around )...=-0.9, which is relatively far from the adopted value A= -0.5 and relatively near to A =-1. The original choice I of A= -0.5 for the exponent in the definition of the connectivity index (instead of A=-1) is now seen to be a rather fortunate decision. This certainly is one of the reasons of the great and quarter-of-century-long success of the connectivity index. Our analysis sheds some new light on the true meaning of this choice.
Beyond A =-1 The cause of the above described "breakdown" of
the connectivity index is best seen if we consider the behaviour of C(A) for large negative values of A. For this, denote by mij the number of edges connecting a vertex of degree i with a vertex of degree j and rewrite Eq. (2) as:
C(A) = I nli/ij)" .. . (6) i$j
Table I-Values of the exponent A below which the path Pn is no more the tree with maximal connectivity index. Eq. (2). and the
tree Teril which overtakes the lead; for details see Eq. (5).
11 Acril Teril
10 -0.90821 Tree in Fig. 3 11 -0.90821 Tree in Fig. 3 12 -0.91833 Tree in Fig. 3 13 -0.87976 Tree in Fig. 3 14 -0.87976 3.4.5-Triethylheptane 15 -0.87642 Tree in Fig. 3 16 -0.86594 Trees I & IV in Fig . 3 17 -0.88191 Tree in Fig. 3 18 -0.85096 Tree in Fig. 3 19 -0.85096 Trees II & III in Fig. 3 20 -0.86108 Tree in Fig. 3
Table 2-Alkanes with 11 carbon atoms having maximal connectivity indices when the exponent A is sufficiently large (and nega
tive). For 11=12. 14.17.19 these become maximal already for A:5 -1 (cf. Fig. 3); for details see text
11
5 II-Pentane 6 3-Methylpentane 7 3-Ethylpentane 8 3-Ethylhexane 9 3.3-Diethylpentane 10 3.3-Diethylhexane 11 3.3-Diethyl.4-methylhexane 12 3.3.4-Triethylhexane =Tree in Fig. 3 13 3.3.4-Triethylheptane 14 3.3.4.4-Tetraethylhexane =Tree in Fig. 3 15 3.3,4.4-Tetraethylheptane 16 3.3.4.4-Tetraethyl.5-methylheptane 17 3.3.4.4,5-Pentaethylheptane =Tree in Fig. 3 18 3.3.4.4.5-Pentaethyloctane 19 3.3.4.4.5.5-Hexaethylheptane =Tree I in Fig. 3 20 3.3.4.4.5.5-Hexaethyloctane
Recall that the condition mll=O is satisfied by all connected graphs except by the molecular graph of ethane. Therefore, in what follows we assume that the term mIl (l.1)A = mIl is not contained among the summands on the right-hand side of (6).
Now, for large negative values of the exponent A all summands on the right-hand side of (6) become small and will vanish as A~-oo. The term that decreases most slowly will thus become dominant. This term, evidently, is m122\ implying that for large negative values of A:
C(A) ""m122A
This means that for large negative values of A, C(A) will be maximal for the graph possessing the greatest
GUTMAN et al.: EXPONENT-DEPENDENT PROPERTIES OF THE CONNECfIVITY INDEX 461
possible number of edges that connect a vertex of degree one with a vertex of degree two. The trees with such a property are Xn depicted in Fig. 1.
By means of Eq.(6) it is easy to show that the tree with minimal C(A)-value, no matter how close A is to -00, is the star Sn. Thus, for large negative values of A instead of the relation (3) we have:
C(A; Xn) > C(A; 7) > C(A; Sn)
As seen in Fig. I, both trees Xn and Sn are highly branched. Consequently, when A is sufficiently large (and negative) then the ordering of trees (and chemical trees) by means of the connectivity index C(A) is not determined by the extent of branching, but by some quite different (above specified) structural features. Consequently, with A becoming more and more negative its "branching index" character must necessarily disappear. As shown in the preceding section, for n ~ 10 this happens before A becomes equal to -1 .
The alkanes (chemical trees) that for large negative A have the greatest connectivity index are also easily characterized. These are specified in Table 2. The main structural detail determining these extremal systems is the presence of as many as possible ethylgroups (edges connecting vertices of degree one and two). The pattern by which they are constructed is readily envisaged: it has a periodicity of length five so that for all n=5, 6, ... , the construction step n ~ n+ 1 is fully analogous to the step n+5 ~ n+6.
References 1 Randic M, JAm chem Soc, 97 (1975) 6609. 2 Kier L B & Hall L H, Molecular connectivity in chemistry
and drug research (Academic Press, New York), 1976. 3 Kier L B & Hall L H, Molecular connectivity in structure
activity analysis (Wiley, New York), 1986. 4 Topological indices and related descriptors in QSAR and
QSPR. edited by J Devillers & A T Balaban (Gordon & Breach, Amsterdam), 1999.
5 Todeschini R & Consonni V, Handbook oj molecular descriptors (Wiley-VCH, Weinheim), 2000.
6 QSPRlQSAR studies by molecular descriptors, edited by M V Diudea (Nova, Huntington, NY), 2001.
7 Pogliani L, Chem Rev, 100 (2000) 3827. 8 Clark L H & Moon J W, Ars Combinatoria, 54 (1999) 223. 9 Bollobas B, Erdos P & Sarkar A, Discrete Math, 200 (1999)
5. 10 Randic M, Hansen P J & Jurs P C. J chem InJ Comput Sci, 28
(1988) 60. 11 Estrada E, J chem InJ Comput Sci, 35 (1995) 1022. 12 Arnie D, Beslo D, Lucie B, Nikolic S & Trinajstic N, J chem
InJComputSci, 38 (1998) 819. 13 Estrada E, Chem Phys Letters, 312 (1999) 556. 14 Gutman I, Araujo 0 & Rada J, ACH Models Chem, 137
(2000) 653. 15 Gutman I, Ruscic B, Trinajstic N & Wilcox C F, J chem
Phys, 62 (1975) 3399. 16 Ruch E & Gutman I. J Combin In/System Sci, 4 (1979) 285. 17 Bertz S H, Discrete appl Math , 19 (1988) 65. 18 Randic M, Acta chim Slov, 44 (1997) 57. 19 Balaban A T, Mills D & Basak S C, J chem InJ Comput Sci,
in press. 20 Caporossi G, Gutman I & Hansen P, Comput Chem, 23
(1999) 469. 21 Bollobas B & Erdos p. Ars Combinatoria, 50 (1998) 225. 22 Gutman I, Miljkovic 0; Caporossi G & Hansen P, Chem
Phys Letters, 306 (1999) 366. 23 Mihalic Z & Trinajstic N, J chem Educ, 69 (1992) 701.