exponent-dependent properties of the connectivity...

5
Indian Journal of Chemistry Vol. 41A, March 2002, pp. 457-461 Exponent-dependent properties of the connectivity index Ivan Gutman*, Mirko Lepovic & Dusica Vidovic Faculty of Science, University of Kragujevac, P. O. Box 60, YU-34000 Kragujevac, Yugoslavia and Lane H Clark Southern Illinois University at Carbondale, Carbondale, IL 62901-4408, USA Received 20 November 2001 The connectivity index is defined as e(A) where Ov is the degree of the vertex v of the re spective molecular graph, and where the summation embraces all pairs of adjacent vertices. The exponent A is usually chosen to be equal to -0.5 but other options have been considered as well. especially C( -I) . We show that whereas C( -0.5) correctly reflects the extent of branching of the carbon-atom skeleton of organic molecules, and is thus a suitable topological index for modelling physico-chemical properties of the respective compounds, this is not the case when the exponent A assumes larger negative va lues, in particular when 1.=-1. The value of A is established beyond which C(A) fail s to be a measure of molecular branching. Introduction The connectivity index (introduced in 1975 and originally called "branching index"] is defined as: .. . (1) /l, V where 8. denotes the degree (= number of first neigh- bours) of the vertex v of the molecular graph G, and where the summation embraces all pairs of adjacent vertices of G. This structure-descriptor eventually became one of the most popular topological indices. On its applications for predicting physico-chemical and pharmacologic properties of organic compounds two books2 .3 and scores of papers were written; details and further bibliography can be found in the recent monographs 4 - 6 and the review 7 . Formula (1) is a special case of a more general "connectivity index" C, defined as: ... (2) II ,V Evidently, X,= C( -0.5). Several recent mathematical studies 8 . 9 are devoted to C(A), especially C( -I). We mention in passing that C(A) for A=+ I is the so-called "2nd Zagreb group in- dex"; for details see the handbook S. In the earlier paper 1 no convincing argument was given why in formula (2) one should choose A= -0 .5. The original aim was to provide a measure of the branching of the carbon-atom skeleton of alkanes. Based on the analysis of the data for butane, pentane and hexane isomers one arrived at the conclu- sion I that both choices A= -0 .5 and A = -I were equally plausible. Nevertheless, preference was given to the former choice because C( -0.5) had greater iso- mer-discriminating power than C(-l). The possibility that the exponent A assumes values other than -0 .5 was examined by several authors 10-14. Indeed, if one would view the exponent A as a vari- able that is adjusted so as to optimize the cor relation between C(A) and some physico-chemical property, then A so determined would differ from -0.5. The dis- advantage of such an approach is that the A-value so determined would significantly depend both on the physico-chemical property considered and on the sample of molecules employed. The option A = -I was recently examined in some detail by Moon and one of the present authors 8 . In what follows we denote C( -1)= C( -I ;G) by J..l = J..l(G) . In this paper we are concerned with certain proper- ties of the connectivity index C(A) of trees. Recall that a tree is a connected acyclic graph. A chemical tree is a tree with property 8 v 4 for all vertices v. Chemical trees provide a graph representation of alkanes . Connectivity indices vs. branching Much work has been devoted to provide a precise

Upload: others

Post on 28-Feb-2021

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Exponent-dependent properties of the connectivity indexnopr.niscair.res.in/bitstream/123456789/18231/1/IJCA 41A... · 2016. 7. 20. · 458 INDIAN J. CHEM., SEC. A, MARCH 2002 characterization

Indian Journal of Chemistry Vol. 41A, March 2002, pp. 457-461

Exponent-dependent properties of the connectivity index

Ivan Gutman*, Mirko Lepovic & Dusica Vidovic Faculty of Science, University of Kragujevac, P. O. Box 60, YU-34000 Kragujevac, Yugoslavia

and

Lane H Clark Southern Illinois University at Carbondale, Carbondale, IL 62901-4408, USA

Received 20 November 2001

The connectivity index is defined as e(A) =~(ouOvl' , where Ov is the degree of the vertex v of the respective molecular graph, and where the summation embraces all pairs of adjacent vertices. The exponent A is usually chosen to be equal to -0.5 but other options have been considered as well. especially C( -I). We show that whereas C( -0.5) correctly reflects the extent of branching of the carbon-atom skeleton of organic molecules, and is thus a suitable topological index for modelling physico-chemical properties of the respective compounds, this is not the case when the exponent A assumes larger negative values, in particular when 1.=-1. The value of A is established beyond which C(A) fail s to be a measure of molecular branching.

Introduction The connectivity index (introduced in 1975 and

originally called "branching index"] is defined as:

.. . (1) /l, V

where 8. denotes the degree (= number of first neigh­bours) of the vertex v of the molecular graph G, and where the summation embraces all pairs of adjacent vertices of G. This structure-descriptor eventually became one of the most popular topological indices. On its applications for predicting physico-chemical and pharmacologic properties of organic compounds two books2.3 and scores of papers were written; details and further bibliography can be found in the recent monographs4

-6 and the review7

.

Formula (1) is a special case of a more general "connectivity index" C, defined as:

... (2) II , V

Evidently, X,= C( -0.5). Several recent mathematical studies8

.9 are devoted

to C(A), especially C( -I). We mention in passing that C(A) for A=+ I is the so-called "2nd Zagreb group in­dex"; for details see the handbookS.

In the earlier paper 1 no convincing argument was given why in formula (2) one should choose A= -0.5.

The original aim was to provide a measure of the branching of the carbon-atom skeleton of alkanes. Based on the analysis of the data obtai~,~d for butane, pentane and hexane isomers one arrived at the conclu­sion I that both choices A= -0.5 and A = -I were equally plausible. Nevertheless, preference was given to the former choice because C( -0.5) had greater iso­mer-discriminating power than C(-l).

The possibility that the exponent A assumes values other than -0.5 was examined by several authors 10-14.

Indeed, if one would view the exponent A as a vari­able that is adjusted so as to optimize the correlation between C(A) and some physico-chemical property, then A so determined would differ from -0.5. The dis­advantage of such an approach is that the A-value so determined would significantly depend both on the physico-chemical property considered and on the sample of molecules employed.

The option A = -I was recently examined in some detail by Moon and one of the present authors8

. In what follows we denote C( -1)= C( -I ;G) by J..l = J..l(G) .

In this paper we are concerned with certain proper­ties of the connectivity index C(A) of trees. Recall that a tree is a connected acyclic graph. A chemical tree is a tree with property 8v ~ 4 for all vertices v. Chemical trees provide a graph representation of alkanes .

Connectivity indices vs. branching Much work has been devoted to provide a precise

Page 2: Exponent-dependent properties of the connectivity indexnopr.niscair.res.in/bitstream/123456789/18231/1/IJCA 41A... · 2016. 7. 20. · 458 INDIAN J. CHEM., SEC. A, MARCH 2002 characterization

458 INDIAN J. CHEM., SEC. A, MARCH 2002

characterization of "branching" and to order (iso­meric) molecules according to their extent of branch­ing; see 15

-19 for details and further references. In spite

of all these efforts there is no general agreement about what "branching" is and how it could/should be measured.

With regard to the extent of branching of trees, there is a generally accepted (and intuitively plausi­ble) assertion: Among n-vertex trees the path Pn and the star Sn (see Fig.l) are the least and the most branched species, respectively. In view of this, a nec­essary condition for any topological index to be an acceptable measure of branching is that its values be extremal for Pn and Sn. The connectivity index satis­fi es this requirement: if T is any n-vertex tree, but T:f:. P,,, Sn, then:

. .. (3)

The proofs (which are far from being easy) of the general validity of the left- and right-hand sides of (3) were offered only a few years ago by Caporossi et al. 20 and Bollobas and Erdos2l

, respectively. The path Pn is, of course, a chemical tree (representing the respective normal alkane) . The star Sn is a chemical tree on ly up to 11=5. For II ~6 also the chemical trees with minimal connectivity index were determined22

In the earlier paper I it was anticipated that analo­gous inequalities hold when A=-], namely:

• • • • • * p n

Sn

Xn n = even

Fig. I-The n-vertex trees with maximal and minimal Randic index X= C (-0.5) are the path Pn (least branched, with the prop­erty that Oy ~ 2 fo r all vertices) and the star Sn (most branched, possessing a vertex v wilh Oy =n-I), respectively; if A. assumes large negative values then the n-vertex trees with maximal and minimal C(A.) are Xn and Snn respectively. The tree Xn can be de­scribed as the n-veltex tree possessing a vertex v with Oy = [1 n/2] and [(n-I )/2] vertices of degree 2, all adjacent to 1'.

... (4)

As a kind of surprise, it was recently established8 that for n being sufficiently large the left-·hand side of (4) is violated, i.e., Pn is not the tree with maximal )1-

value. Consequently, )1 cannot be used as a measure of branching and its applicability in QSPR and QSAR studies is doubtful.

In Fig. 2 we illustrate this conclusion by comparing the correlation of X= C(-0.5) and )1= C(-l) with the boiling points of a selected set of alkanes. In the dia­gram a) (that, deliberately, is identical to what earlier was reported23

, the correlation coeffic ient is 0.998

150

100

50

bp 0

-50

-100

-150

~=-1 ' 0 ,.>'~ :.,r. • :

/ '" . .' .

/ _./ . -200 I~(~b)~, ----l_L--'---L~----l---:~...l.-~--'-_;:~

-0.5 0·0 0·5 10 1-5 2·0 Q·5 /oJ

Fig. 2-a): Correlation between the normal boi ling points of a selected set of 40 alkanes (same as employed in ref. 23) and the Randic index X. For the third-degree polynomial approximation b.p. '" A+BX+CX2+DX3 (as recommended in ref. 23), one gets A=-165.7±4.3, B=81.1±7.7, C=6.6±4.2, D=-2.17±0.66, correla­tion coefficient R=0.9977, standard deviation SD=4.37°C. b): Correlation between the same boiling points and the ,u-index. The data obtained by least-squares fitting of b.p. '" A+B,u +C,u2+ D,u3 are: A=-165.1±17.4, B=36.5±53.6, C= I44.3±52.2, D=-46.62 ±14.6, R=0.9626, SD=17.45°C, confirming that the QSPR applicabi lity of ,u is significantly inferior to that of X.

Page 3: Exponent-dependent properties of the connectivity indexnopr.niscair.res.in/bitstream/123456789/18231/1/IJCA 41A... · 2016. 7. 20. · 458 INDIAN J. CHEM., SEC. A, MARCH 2002 characterization

GUTMAN et al.: EXPONENT -DEPENDENT PROPERTIES OF THE CONNECTIVITY INDEX 459

whereas in the diagram b) the correlation coefficient is only 0.963; more data on these correlations is found in the caption of Fig. 2.

In order to learn more on the breakdown of the in­equalities (4) we determined the n-vertex tree(s) with maximal and minimal J..l-values for n up to 20. For all examined values of n the star has minimal J..l-value, in agreement with (4). For n ~ 9 the path has maximal J..l­value, in agreement with (4). However, for n ~1O the trees with maximal J..l differ significantly from Po; these trees are depicted in Fig. 3.

From Fig. 3, it is evident that the trees with maxi­mal J..l-index are highly branched. Otherwise, their general structure is not easy to characterize. For n=16 and n=19 (and most probably for other values of n > 20) the tree with maximal J..l is not unique.

Most of the species depicted in Fig. 3 are chemical trees; exceptions are only the 19-vertex tree IV and the 20-vertex tree. From the analysis outlined in the subsequent section it is evident that for n exceeding 20 more and more non-chemical trees will be en­countered.

_.lL _JJ __ -fL n= 10 n=l! n= 12

~.- .-it-- -1L. n=13 n=14 n=15

-L{J- __ WL .II I III

n=16

n= 17 n= 18 n=20

·-It.: n= 19

Fig. 3--The n-vertex trees with maximal connectivity indices Jl =C(-I). For n=I6 and n=19 there are four distinct trees with equal maximal Jl-value. For n=JO,II, . . . ,20 the respective maximal Jl­values are: 2.77778, 3.02778, 3.29167, 3.55556, 3.S1250, 4.08333,4.60417, 4.S75OO, 5.12500 and 5.40000.

Intending to shed more light on the phenomenon described above we compared the orderings of trees according to decreasing X and J..l, respectively. Two characteristic results are shown in Figs 4 and 5. As seen from these figures, the two orderings are signifi­cantly different, especially for larger values of n (as in Fig. 5).

The data presented in Figs 4 and 5 clearly show that the connectivity index for A = -0.5 does, and the connectivity index for A=-l does not provide a plau­sible measure of molecular branching. In other words, when the exponent A is decreased from -0.5 to -1 .0 a drastic change occurs in the structure-dependence of the connectivity index C(A), making it unsuitable for QSPR and QSAR purposes (at least, as far as branching-dependent molecular properties are con­cerned). In the case of n-vertex chemical trees this happens for all n ~10.

This change begins (or, to be more precise: be­comes fully obvious) at a "critical" value of A for which the equality:

. .. (5)

is satisfied by the first n-vertex tree Terit. different from Po. Usually (but not always) Terit is just one of the trees depicted in Fig. 3.

A=- 0.5

--1_._ ._L_ al a2 a3

i ___ ._ 1_ ._ _ _ L-_ .JL-as a6 a7 a8

A=-lO

_J_._ .. _l._ bl b2 b3 b4

._L_ . +_ ._ 1 L b5 b6 b7

I

II

b8

Fig. 4--The first few lO-vertex trees ordered according to decreasing connectivity indices: the species aI-aS pertain to X whereas b,-b8 to Jl : X(al) = 4.91421, X(a2) = x(a3)= X(a4)=

4.S4606, X(a5)=X(a6)=X(a7)=4.S0S06, X (as)=4.79475, Jl (bl)= 2.7777S, Jl (b2)= Jl (b3) == Jl (b6) = 2.75000, Jl' (b7)= Jl (bS)= 2.69444. Note that al=b2. a2=b3, a3=b4 and a4=b5 ; hence in the case n=JO there is still some agreement between the two orderings.

Page 4: Exponent-dependent properties of the connectivity indexnopr.niscair.res.in/bitstream/123456789/18231/1/IJCA 41A... · 2016. 7. 20. · 458 INDIAN J. CHEM., SEC. A, MARCH 2002 characterization

460 INDIAN 1. CHEM .• SEC. A. MARCH 2002

;\=-0·5

-'.-'-01 ~. cI c2

I .. -_ . __ 1---__ _

c4 c5

I __ -L __

c7 c8

-+L _JlL_ : I dI

! I 1 -d6

d3

d7 d8

----.1.. __

I c3

c6

c9

d4 d5

d9

Fig. 5--Same data as in Fig. 4. for 11=14: x(cl)=6.91421. X(c2)= X(c3)= = X(c9)=6.84606. Jl(dl)=3.81250. Jl(d2)= Jl(d3)= 3.80556. Jl(d4)=3.80000. Jl(d5)= Jl(d6)= ....... =Jl(d9)=3.79167. No tree cl-c9 coincides with any of the trees dl-d9 indicating a complete disagreement between the two orderings.

In Table 1 are given the critical values for the ex­ponent A, calculated by means of Eq. (5), as well as the trees Terit .

From the data given in Table 1 we see that connec­tivity index loses its "branching index" character around )...=-0.9, which is relatively far from the adopted value A= -0.5 and relatively near to A =-1. The original choice I of A= -0.5 for the exponent in the definition of the connectivity index (instead of A=-1) is now seen to be a rather fortunate decision. This certainly is one of the reasons of the great and quarter-of-century-long success of the connectivity index. Our analysis sheds some new light on the true meaning of this choice.

Beyond A =-1 The cause of the above described "breakdown" of

the connectivity index is best seen if we consider the behaviour of C(A) for large negative values of A. For this, denote by mij the number of edges connecting a vertex of degree i with a vertex of degree j and rewrite Eq. (2) as:

C(A) = I nli/ij)" .. . (6) i$j

Table I-Values of the exponent A below which the path Pn is no more the tree with maximal connectivity index. Eq. (2). and the

tree Teril which overtakes the lead; for details see Eq. (5).

11 Acril Teril

10 -0.90821 Tree in Fig. 3 11 -0.90821 Tree in Fig. 3 12 -0.91833 Tree in Fig. 3 13 -0.87976 Tree in Fig. 3 14 -0.87976 3.4.5-Triethylheptane 15 -0.87642 Tree in Fig. 3 16 -0.86594 Trees I & IV in Fig . 3 17 -0.88191 Tree in Fig. 3 18 -0.85096 Tree in Fig. 3 19 -0.85096 Trees II & III in Fig. 3 20 -0.86108 Tree in Fig. 3

Table 2-Alkanes with 11 carbon atoms having maximal connec­tivity indices when the exponent A is sufficiently large (and nega­

tive). For 11=12. 14.17.19 these become maximal already for A:5 -1 (cf. Fig. 3); for details see text

11

5 II-Pentane 6 3-Methylpentane 7 3-Ethylpentane 8 3-Ethylhexane 9 3.3-Diethylpentane 10 3.3-Diethylhexane 11 3.3-Diethyl.4-methylhexane 12 3.3.4-Triethylhexane =Tree in Fig. 3 13 3.3.4-Triethylheptane 14 3.3.4.4-Tetraethylhexane =Tree in Fig. 3 15 3.3,4.4-Tetraethylheptane 16 3.3.4.4-Tetraethyl.5-methylheptane 17 3.3.4.4,5-Pentaethylheptane =Tree in Fig. 3 18 3.3.4.4.5-Pentaethyloctane 19 3.3.4.4.5.5-Hexaethylheptane =Tree I in Fig. 3 20 3.3.4.4.5.5-Hexaethyloctane

Recall that the condition mll=O is satisfied by all con­nected graphs except by the molecular graph of eth­ane. Therefore, in what follows we assume that the term mIl (l.1)A = mIl is not contained among the summands on the right-hand side of (6).

Now, for large negative values of the exponent A all summands on the right-hand side of (6) become small and will vanish as A~-oo. The term that decreases most slowly will thus become dominant. This term, evidently, is m122\ implying that for large negative values of A:

C(A) ""m122A

This means that for large negative values of A, C(A) will be maximal for the graph possessing the greatest

Page 5: Exponent-dependent properties of the connectivity indexnopr.niscair.res.in/bitstream/123456789/18231/1/IJCA 41A... · 2016. 7. 20. · 458 INDIAN J. CHEM., SEC. A, MARCH 2002 characterization

GUTMAN et al.: EXPONENT-DEPENDENT PROPERTIES OF THE CONNECfIVITY INDEX 461

possible number of edges that connect a vertex of de­gree one with a vertex of degree two. The trees with such a property are Xn depicted in Fig. 1.

By means of Eq.(6) it is easy to show that the tree with minimal C(A)-value, no matter how close A is to -00, is the star Sn. Thus, for large negative values of A instead of the relation (3) we have:

C(A; Xn) > C(A; 7) > C(A; Sn)

As seen in Fig. I, both trees Xn and Sn are highly branched. Consequently, when A is sufficiently large (and negative) then the ordering of trees (and chemi­cal trees) by means of the connectivity index C(A) is not determined by the extent of branching, but by some quite different (above specified) structural fea­tures. Consequently, with A becoming more and more negative its "branching index" character must neces­sarily disappear. As shown in the preceding section, for n ~ 10 this happens before A becomes equal to -1 .

The alkanes (chemical trees) that for large negative A have the greatest connectivity index are also easily characterized. These are specified in Table 2. The main structural detail determining these extremal systems is the presence of as many as possible ethyl­groups (edges connecting vertices of degree one and two). The pattern by which they are constructed is readily envisaged: it has a periodicity of length five so that for all n=5, 6, ... , the construction step n ~ n+ 1 is fully analogous to the step n+5 ~ n+6.

References 1 Randic M, JAm chem Soc, 97 (1975) 6609. 2 Kier L B & Hall L H, Molecular connectivity in chemistry

and drug research (Academic Press, New York), 1976. 3 Kier L B & Hall L H, Molecular connectivity in structure­

activity analysis (Wiley, New York), 1986. 4 Topological indices and related descriptors in QSAR and

QSPR. edited by J Devillers & A T Balaban (Gordon & Breach, Amsterdam), 1999.

5 Todeschini R & Consonni V, Handbook oj molecular de­scriptors (Wiley-VCH, Weinheim), 2000.

6 QSPRlQSAR studies by molecular descriptors, edited by M V Diudea (Nova, Huntington, NY), 2001.

7 Pogliani L, Chem Rev, 100 (2000) 3827. 8 Clark L H & Moon J W, Ars Combinatoria, 54 (1999) 223. 9 Bollobas B, Erdos P & Sarkar A, Discrete Math, 200 (1999)

5. 10 Randic M, Hansen P J & Jurs P C. J chem InJ Comput Sci, 28

(1988) 60. 11 Estrada E, J chem InJ Comput Sci, 35 (1995) 1022. 12 Arnie D, Beslo D, Lucie B, Nikolic S & Trinajstic N, J chem

InJComputSci, 38 (1998) 819. 13 Estrada E, Chem Phys Letters, 312 (1999) 556. 14 Gutman I, Araujo 0 & Rada J, ACH Models Chem, 137

(2000) 653. 15 Gutman I, Ruscic B, Trinajstic N & Wilcox C F, J chem

Phys, 62 (1975) 3399. 16 Ruch E & Gutman I. J Combin In/System Sci, 4 (1979) 285. 17 Bertz S H, Discrete appl Math , 19 (1988) 65. 18 Randic M, Acta chim Slov, 44 (1997) 57. 19 Balaban A T, Mills D & Basak S C, J chem InJ Comput Sci,

in press. 20 Caporossi G, Gutman I & Hansen P, Comput Chem, 23

(1999) 469. 21 Bollobas B & Erdos p. Ars Combinatoria, 50 (1998) 225. 22 Gutman I, Miljkovic 0; Caporossi G & Hansen P, Chem

Phys Letters, 306 (1999) 366. 23 Mihalic Z & Trinajstic N, J chem Educ, 69 (1992) 701.