

On hierarchical software metrics

by Prof. Ronald E. Prather

The notion of a hierarchical software metric is introduced, encompassing and extending the earlier work of Fenton and Whitty and this author. It is shown that a stronger interpretation of this notion is insufficient for treating the important 'number-of-paths' metric, whereas this metric definitely falls within the range of the axiomatic system introduced herein. Finally, the technique of argument by induction over an arbitrary flowgraph decomposition is clarified and demonstrated in the new hierarchical context.

1 Introduction

The idea of a software metric being defined recursively or inductively over the flowgraph of an arbitrary program, through the analysis of its structural decomposition (Refs. 1 and 2) into a sequence of 'irreducible' sub-flowgraphs, has attracted a good deal of attention in the software engineering literature.

The germ of this idea can be found in the author's earlier work (Ref. 3), but a more complete study may be found in a recent paper by Fenton and Whitty (Ref. 4). We will find, however, that one possible interpretation of this framework is insufficient for treating the full range of software measures of interest, particularly that of the 'number-of-paths' metric, an important estimating measure of the cost of testing a piece of software. We show how this limitation may be overcome by studying a somewhat broader class of 'hierarchical' metrics than might otherwise have been intended. Moreover, we will find that this new axiomatisation admits an effective technique of inductive reasoning, sufficient to study the important questions that are of interest in the theory.

By and large, we will use the notation and terminology of Refs. 3 and 4. In particular, a flowgraph $F$ will consist of a directed graph with distinguished start vertex and stop vertex, such that every vertex or node lies on a path from 'start' to 'stop'. Although our results have a more general range of application, we assume for the sake of simplicity that every node other than 'stop' has outdegree 1 or 2 (and, correspondingly, we will refer to procedure nodes and predicate nodes in these two instances).

Note that the 'stop' vertex has outdegree 0. Provided that a formal mapping has been given for the language syntax, a program written in any procedural language gives rise to a unique flowgraph (Ref. 5) and, because this is well known, we will not call unnecessary attention to this association here, except to note that the procedure nodes are identified with the simple programming language statements (of assignment, input and output), whereas the predicate nodes are associated with the decision points that occur in the program.

The most important preliminary result, one underlying the whole theory, is that of the recursive 'sequence nesting' decomposition of any flowgraph.

Theorem 1:
Every flowgraph $F$ has a unique sequential decomposition

$$F = F_1 \circ F_2 \circ \cdots \circ F_n$$

into (sequentially) irreducible sub-flowgraphs. Each of these admits a further nested decomposition

$$F_i = F_i(X_1, X_2, \ldots, X_{n_i})$$

into maximal sub-flowgraphs. The proof is given in Refs. 4 and 6.

The recursive application of this result yields the structure of an arbitrary flowgraph $F$ as a hierarchical tree exhibiting the top-down decomposition of $F$ into irreducible sub-flowgraphs. Fenton and Whitty, together with Kaposi, have shown (Refs. 5 and 7) that in large measure the study of the 'irreducibles' can be reduced to that of the CGK-irreducibles alone, wherein the procedure nodes are 'collapsed' (so that we obtain flowgraphs without procedure nodes). In this regard, the following result will be useful in the sequel.

Theorem 2:
Every CGK-irreducible with $n + 1$ predicate nodes can be obtained by 'grafting a new predicate node' to an edge in a CGK-irreducible having $n$ predicate nodes. The proof is given in Refs. 7 and 8.

In particular, we note that the 'grafting' process at a particular edge results in a new predicate node, one of whose outgoing edges continues in the direction of the original edge, the other to a different but otherwise arbitrary node of the flowgraph.

Finally, we note that the irreducibles appearing in a decomposition may be classified according to the number of predicate nodes occurring therein. More generally, the complete family of irreducibles can be described as an infinite union:

$$S = \bigcup_{n=0}^{\infty} S_n$$

where $S_n$ is the sub-class of irreducibles involving $n$ predicate nodes. If the decomposition of a flowgraph $F$ yields irreducibles of the class $S_1$ at most, then $F$ is said to be a structured flowgraph. In general, we will say that $F$ is $n$-structured if its decomposition involves irreducibles only from the classes $S_1, S_2, \ldots, S_n$. Note that if $F$ involves no irreducibles whatsoever, as occurs only in cases where $F$ is a sequence of simple statements, then we may say that $F$ is 0-structured, just to complete the picture. In this way, it is seen that the notion of 'structured programming' fits into a broader scheme, as argued in Ref. 5. We draw on this observation in several facets of the theory to be developed here.

Software Engineering Journal March 1987

2 The hierarchical axiom scheme

In general, a (flowgraph) measure of software complexity $m$ is a function

$$m : \text{flowgraphs} \to \text{numbers}$$

(or, from programs to numbers, recalling the association of flowgraphs with programs). For brevity, we simply refer to such functions as 'measures' or 'metrics'. It is hoped that one can discover, identify and study those metrics that are useful as predictors or indicators of one or more aspects of program complexity, for example programming effort, testing cost etc.

Here, we are particularly interested in those metrics that can be given a recursive or inductive definition over the flowgraph decomposition structure. To this end, we will say that $m$ is a hierarchical metric if:

• Axiom 1:

$$m(S) = 1 \qquad (S = \text{simple process})$$

• Axiom 2:

$$m(F_1 \circ F_2 \circ \cdots \circ F_n) = g_n(mF_1, mF_2, \ldots, mF_n)$$

• Axiom 3:

$$m(F(X_1, X_2, \ldots, X_n)) = h_F(mX_1, mX_2, \ldots, mX_n)$$

for computable functions $g_n$ ($n = 1, 2, \ldots$) and $h_F$ ($F \in S$). Particular attention must be given, however, to the intended interpretation of axiom 3. The axiom is to be understood as if we had written

$$h_F(mX_1, mX_2, \ldots, mX_n) = h(F, mX_1, mX_2, \ldots, mX_n)$$

so that the structure of the irreducible $F$ appears as an argument to the 'global' computable function $h$.
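As an illustration of this reading of axiom 3, the following is a minimal Python sketch of how a hierarchical metric might be evaluated over a decomposition tree. The class names `Simple`, `Seq` and `Nest`, the function `evaluate`, and the table `PREDICATES` are illustrative assumptions and do not appear in the paper. The sample functions `g_mccabe` and `h_mccabe` assume the usual reading of McCabe's measure as the number of predicate nodes plus one. The point of the sketch is only that `h` receives the irreducible itself as its first argument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Simple:
    """A simple process S (axiom 1)."""


@dataclass(frozen=True)
class Seq:
    """A sequence F_1 o F_2 o ... o F_n (axiom 2)."""
    parts: tuple


@dataclass(frozen=True)
class Nest:
    """An irreducible F with nested sub-flowgraphs X_1, ..., X_n (axiom 3)."""
    irreducible: str  # hypothetical label standing in for the structure of F
    args: tuple


def evaluate(tree, g, h):
    """Evaluate a metric bottom-up over the decomposition tree."""
    if isinstance(tree, Simple):
        return 1
    if isinstance(tree, Seq):
        return g([evaluate(part, g, h) for part in tree.parts])
    # h is 'global': it is handed the irreducible itself, not a number for it.
    return h(tree.irreducible, [evaluate(x, g, h) for x in tree.args])


# Hypothetical table: number of predicate nodes in each named irreducible.
PREDICATES = {"if-then-else": 1, "while": 1}


def g_mccabe(values):
    # Predicate counts add up along a sequence.
    return 1 + sum(v - 1 for v in values)


def h_mccabe(irreducible, values):
    # Predicates of F itself plus those of the nested parts, plus one.
    return PREDICATES[irreducible] + 1 + sum(v - 1 for v in values)


# A while loop around an if-then-else, followed by a simple statement.
program = Seq((
    Nest("while", (Nest("if-then-else", (Simple(), Simple())),)),
    Simple(),
))
print(evaluate(program, g_mccabe, h_mccabe))  # two predicates, so 3
```

Passing the label of the irreducible (rather than a precomputed number for it) is what separates this scheme from the stronger interpretation discussed next.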

The Fenton and Whitty axioms (Ref. 4) are similar to the above. But a stronger interpretation of axiom 3 is possible, as if we were to understand

$$h_F(mX_1, mX_2, \ldots, mX_n) = h(mF, mX_1, mX_2, \ldots, mX_n)$$

wherein the global function $h$ 'sees' $F$ only through a measure of its own complexity as an irreducible. And, of course, this requires that the measures $mF$ ($F \in S$) be prescribed in advance. With such a strengthened interpretation of axiom 3, we will refer to $m$ as a recursive metric. In the weaker interpretation we have provided for the hierarchical metrics, the measures $mF$ are not defined, other than in the form

$$mF(N, N, \ldots, N) = h_F(0, 0, \ldots, 0)$$

for 'null' processes $N$, for which we assume $mN = 0$. Of course, our two interpretations yield the immediate result of the following theorem.

Theorem 3:
Every recursive metric is hierarchical.

But the question then remains: are there (interesting) hierarchical metrics that are not recursive? We answer this question in the affirmative in the following section. In so doing, we will of course limit our attention to measures $m$ that are computable in the broadest sense, in that an algorithm exists for computing $m(F)$ for any given flowgraph $F$. This is an obvious first requirement for consideration.

3 The number-of-paths metric

In general, the number of paths from 'start' to 'stop' in a flowgraph will be infinite, owing to the possible presence of loops. However, we obtain a finite measure

$$\eta : \text{flowgraphs} \to \text{natural numbers}$$

if we restrict our attention to the simple paths, i.e. those for which no edge is traversed more than once. There is then little possibility for confusion in referring to this restricted measure $\eta$ as the number-of-paths metric (see Ref. 6 for a closely related metric). It is well known (Ref. 6) that such measures provide an extremely useful estimate of the testing effort, since one often wishes to run sufficiently many tests as to exercise all such program paths, or at least some fraction thereof.

Moreover, $\eta$ is obviously a computable metric. We have only to assign a left-right orientation to the outgoing edges of the predicate nodes, and then to make use of the binary tree-generating algorithm:

    at depth := 0 to 2n - 1
        for each leaf
            extend to left if possible
            extend to right if possible

finally counting the resulting (simple) paths from 'start' to 'stop'. Here $n$ is the number of vertices in the underlying flowgraph, and we note that the upper limit on the outer computational loop is then determined by a simple application of the 'pigeonhole principle'. As a result, the algorithm has a computational complexity:

$$T(n) = \sum_{k=0}^{2n-1} 2k \, 2^k = 2 \sum_{k=0}^{2n-1} k \, 2^k = O(n \, 2^{2n})$$

representing an exponential order, albeit one of utmost simplicity. As a matter of fact the algorithm can easily be extended to treat the case of a general digraph, having arbitrary outdegrees. For the digraph of Fig. 1a we obtain the binary labelled tree of Fig. 2, with the resulting conclusion that there are precisely four simple paths from 'start' to 'stop'.
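The counting step can be sketched in Python as follows. This is a depth-first formulation rather than the level-by-level tree expansion described above (a swapped-in technique that explores the same set of paths), and it accepts arbitrary outdegrees. The function name `count_simple_paths`, the edge-dictionary representation and the example flowgraph are assumptions made for illustration; the example is not the digraph of Fig. 1a.

```python
from collections import defaultdict


def count_simple_paths(edges, start="start", stop="stop"):
    """Count start-to-stop paths that traverse no edge more than once.

    `edges` maps an edge name to a (source, target) pair.
    """
    outgoing = defaultdict(list)
    for name, (source, target) in edges.items():
        outgoing[source].append((name, target))

    def extend(node, used):
        # 'stop' has outdegree 0, so a path ends as soon as it arrives there.
        if node == stop:
            return 1
        total = 0
        for name, target in outgoing[node]:
            if name not in used:
                total += extend(target, used | {name})
        return total

    return extend(start, frozenset())


# Hypothetical flowgraph: a two-way branch at 'start' that rejoins at the
# predicate node 'w', which either exits or runs a loop body once more.
example_edges = {
    "a": ("start", "x"),
    "b": ("start", "y"),
    "c": ("x", "w"),
    "d": ("y", "w"),
    "e": ("w", "body"),
    "f": ("w", "stop"),
    "g": ("body", "w"),
}
print(count_simple_paths(example_edges))  # 2 branches x 2 ways through the loop = 4
```

Tracking used edges (not visited nodes) matches the definition of a simple path given above, so a loop body may be entered once before the path leaves through the exit edge.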

Now there is no question that $\eta$ is a metric of great interest and importance in the software engineering environment. In the result that follows, we show that this number-of-paths metric definitely falls within the hierarchical axiom scheme introduced here, and yet it falls outside the stronger (recursive) interpretation of the Fenton and Whitty theory.

Theorem 4:
The number-of-paths metric $\eta$ is hierarchical but not recursive.

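The proof is as follows. For the first assertion, we have only to show that $\eta$ may be given an inductive definition (Definitions 1, 2 and 3) over the decompositional structure of any given flowgraph $F$. In fact, we may write:

• Definition 1:

$$\eta(S) = 1$$

• Definition 2:

$$\eta(F_1 \circ F_2 \circ \cdots \circ F_n) = \prod_{i=1}^{n} \eta F_i$$

• Definition 3:

$$\eta(F(X_1, X_2, \ldots, X_n)) = \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}}$$

where $\{p_1, p_2, \ldots, p_{\eta F}\}$ is the set of paths of $F$ and we have

$$b_{ij} = \begin{cases} 1 & \text{if } X_i \text{ is on path } p_j \\ 0 & \text{otherwise} \end{cases}$$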

Note that this latter understanding requires that the structure of $F$ be known, showing the specific interconnection of the maximal sub-flowgraphs $X_i$. Moreover, one needs to have computed the simple paths of the given irreducible $F$, for example using the algorithm just introduced. And yet all this is well within the range of capability assumed for the computable function $h$ appearing in the original interpretation of axiom 3, so that, indeed, $\eta$ is a hierarchical metric.

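On the other hand, a knowledge of $\eta F$ alone (but together with $\eta X_1, \eta X_2, \ldots, \eta X_n$) is not sufficient to determine $\eta(F(X_1, X_2, \ldots, X_n))$. To see this, consider the two irreducibles $F$ and $F'$ shown in Figs. 1b and c. We have

$$\eta F = \eta F' \quad (= 4)$$

as given by the simple paths:

$$\begin{array}{ll}
e_2 e_1 e_4 e_6 & \qquad e_5 \\
e_2 e_5 & \qquad e_1 e_2 e_5 \\
e_4 e_3 e_2 e_5 & \qquad e_1 e_4 e_6 \\
e_4 e_6 & \qquad e_1 e_4 e_3 e_2 e_5
\end{array}$$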

in $F$ and $F'$, respectively (one may use the tree algorithm as before here). Now if we suppose that $X_1, X_2, \ldots, X_6$ are chosen so that

$$\eta X_1 = \eta X_1' \; (= 2), \quad \eta X_2 = \eta X_2' \; (= 2), \quad \ldots, \quad \eta X_6 = \eta X_6' \; (= 2)$$

we will find that, nevertheless,

$$\eta F(X_1, X_2, \ldots, X_6) \neq \eta F'(X_1', X_2', \ldots, X_6')$$

and in fact

$$\eta F(X_1, X_2, \ldots, X_6) = 16 + 4 + 16 + 4 = 40$$

$$\eta F'(X_1', X_2', \ldots, X_6') = 2 + 8 + 8 + 32 = 50$$

showing that $\eta$ is not recursive.

Fig. 1 Digraph F, flowgraph F and flowgraph F'
a Digraph F; b Flowgraph F; c Flowgraph F'

The result shows that the stronger interpretation of the Fenton and Whitty axioms is not sufficiently broad for the intended purpose, even though there are a number of well known hierarchical metrics (for example McCabe's measure (Ref. 9), Prather's measure $\mu$ (Ref. 3), as extended in Ref. 4 etc.) that are in fact 'recursive'.
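The counterexample can be rechecked with a short Python sketch of the nesting rule for the number-of-paths metric: sum, over the simple paths of the irreducible, the product of the path counts of the sub-flowgraphs lying on that path. The function name `eta_nested` is hypothetical, the path lists are transcribed from the listing of simple paths for F and F' as read here, and it is an assumption of this sketch that the sub-flowgraph X_i sits on edge e_i.

```python
from math import prod


def eta_nested(paths, eta_x):
    """Sum over simple paths of the product of eta(X_i) for the X_i on the path."""
    return sum(prod(eta_x[edge] for edge in path) for path in paths)


# Simple paths of the two irreducibles, as sequences of edge names.
PATHS_F = [
    ("e2", "e1", "e4", "e6"),
    ("e2", "e5"),
    ("e4", "e3", "e2", "e5"),
    ("e4", "e6"),
]
PATHS_F_PRIME = [
    ("e5",),
    ("e1", "e2", "e5"),
    ("e1", "e4", "e6"),
    ("e1", "e4", "e3", "e2", "e5"),
]

# Assumption: X_i is nested on edge e_i, and every X_i has two simple paths.
eta_x = {f"e{i}": 2 for i in range(1, 7)}

print(eta_nested(PATHS_F, eta_x))        # 16 + 4 + 16 + 4 = 40
print(eta_nested(PATHS_F_PRIME, eta_x))  # 2 + 8 + 8 + 32 = 50
```

Both irreducibles have four paths, and all nested parts have the same path count, yet the totals differ because the path lengths differ (4, 2, 4, 2 against 1, 3, 3, 5). That is exactly the information a function of the number of paths of F alone cannot see.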

4 Inductive arguments

Having provided an inductive definition of the class of flowgraphs, using their structural decomposition, and having introduced an inductive axiom scheme over such structures to define the notion of a hierarchical software metric, it is entirely natural to expect that one would be able to provide inductive proofs for important properties of these metrics. Indeed, this is the case, but there are certain pitfalls awaiting the unwary, and we feel that it is perhaps necessary (and certainly helpful) to outline the general strategy by way of example.

As our illustration we choose a pair of estimates, lower and upper bounds, for the number-of-paths metric $\eta$, in terms of the McCabe measure $\rho$. These estimates have long been a part of the 'folklore' of software engineering. However, outside of the easy proof available in restricting the attention to structured programs, we know of no thoroughly convincing argument to handle the general case in an elegant way. We would hope to correct this deficiency here.

We recall that the McCabe measure

$$\rho : \text{flowgraphs} \to \text{natural numbers}$$

is generally introduced (Ref. 8) as the cyclomatic number of the digraph $F$ (in the sense of graph theory); i.e. we have $\rho F$ equals the number of 'independent' cycles in $F$.

Note that usually we first augment $F$ by adjoining an edge from 'stop' to 'start' so as to ensure that our axiom 1 is satisfied. McCabe has shown that, in fact,

$$\rho F = n + 1$$

where $n$ is the number of predicate nodes in $F$, and we will make occasional use of this fact in the arguments below.

Moreover, we make implicit use of the obvious identities

$$\eta(F) = \eta(C(F)), \qquad \rho(F) = \rho(C(F))$$

where $C(F)$ is the CGK graph of $F$ as defined in Ref. 5. And with this understanding, we may state the following result, whose proof is given in two parts (we note that neither of the inequalities can be relaxed, since both bounds can in fact be met).

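Theorem 5:
For every flowgraph $F$ we have

$$\rho F \le \eta F \le 2^{\rho F - 1}$$

As proof of this, for every simple process $S$ we have the pair of equalities, since $\rho S = 1 = \eta S$. For the first part of the inductive argument we consider the left-hand inequality. Assuming that $\rho F_i \le \eta F_i$ for $i = 1, 2, \ldots, n$, we obtain

$$\begin{aligned}
\rho(F_1 \circ F_2 \circ \cdots \circ F_n) &= 1 + \sum_{i=1}^{n} (\rho F_i - 1) \\
&\le 1 + \sum_{i=1}^{n} (\eta F_i - 1) \\
&\le 1 + \Bigl( \prod_{i=1}^{n} \eta F_i - 1 \Bigr) \\
&= \prod_{i=1}^{n} \eta F_i = \eta(F_1 \circ F_2 \circ \cdots \circ F_n)
\end{aligned}$$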

We then use theorem 2 to conclude that $\rho F \le \eta F$ for every irreducible $F$. Thus we have equality ($\rho F = 1 = \eta F$) in case $F \in S_0$. And in assuming the inequality for all irreducibles in $S_{n-1}$ we obtain

$$\rho(F) = 1 + \rho(F - D) \le 1 + \eta(F - D) \le \eta(F)$$

for all $F \in S_n$, where $D$ is the 'grafted' predicate node. Note that in our last inequality there is at least one new (simple) path in $F$. Finally, if we assume that $\rho X_i \le \eta X_i$, for $i = 1, 2, \ldots, n$, we obtain

$$\begin{aligned}
\rho F(X_1, \ldots, X_n) &= \rho F + \sum_{i=1}^{n} (\rho X_i - 1) \\
&\le \eta F + \sum_{i=1}^{n} (\eta X_i - 1) \\
&\le \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}} \\
&= \eta F(X_1, \ldots, X_n)
\end{aligned}$$

as required, using the inequality on irreducibles as obtained directly above. Note that the last inequality in our string follows from the fact that every $X_i$ appears on some path.

For the second part of the argument (the right-hand inequality) we first suppose that $\eta F_i \le 2^{\rho F_i - 1}$ for $i = 1, 2, \ldots, n$. We then obtain

$$\eta(F_1 \circ F_2 \circ \cdots \circ F_n) = \prod_{i=1}^{n} \eta F_i \le \prod_{i=1}^{n} 2^{\rho F_i - 1} = 2^{\rho(F_1 \circ F_2 \circ \cdots \circ F_n) - 1}$$

Again we use theorem 2 to help in concluding that $\eta F \le 2^{\rho F - 1}$ for every irreducible $F$. When $F \in S_0$ we have the equality ($\eta F = 1 = 2^{\rho F - 1}$). If we assume the desired inequality for all irreducibles in $S_{n-1}$, then for $F \in S_n$ we obtain

$$\begin{aligned}
\eta F &= \eta(F - D) + \text{number of new paths} \\
&\le 2^{\rho(F - D) - 1} + \text{number of new paths} \\
&= 2^{\rho F - 2} + \text{number of new paths} \\
&\le 2^{\rho F - 2} + 2^{\rho F - 2} = 2(2^{\rho F - 2}) = 2^{\rho F - 1}
\end{aligned}$$

since the new paths are again achieved as if in a sub-flowgraph with $\rho F - 2$ decision nodes. Finally, if we assume that $\eta X_i \le 2^{\rho X_i - 1}$, for $i = 1, 2, \ldots, n$, we obtain

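$$\begin{aligned}
\eta F(X_1, \ldots, X_n) &= \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}} \\
&\le \sum_{j=1}^{\eta F} 2^{\sigma} = \eta F \cdot 2^{\sigma} \\
&\le 2^{\rho F - 1} \cdot 2^{\sigma} = 2^{\rho F(X_1, \ldots, X_n) - 1}
\end{aligned}$$

where

$$\sigma = \sum_{i=1}^{n} (\rho X_i - 1)$$

Again, we have used the inequality on irreducibles in the last inequality of our string.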

Note that without the 'structure argument', the tree-generating algorithm only shows that

so that, indeed, we learn a great deal from this new perspective.
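The remark that both bounds of theorem 5 can be met is easy to try out on two small families of structured flowgraphs. The Python sketch below tracks a flowgraph only through its pair (McCabe measure, number of paths). The helper names are hypothetical, and the if-then-else rule is an assumption of the sketch: one predicate node, with each branch lying on exactly one of the two paths, so that the path counts of the branches add.

```python
SIMPLE = (1, 1)  # (rho, eta) of a simple process


def sequence(parts):
    """Sequencing: predicate counts add, path counts multiply."""
    rho = 1 + sum(r - 1 for r, _ in parts)
    eta = 1
    for _, e in parts:
        eta *= e
    return rho, eta


def if_then_else(then_part, else_part):
    """Nesting in an if-then-else: one extra predicate, path counts add."""
    (r1, e1), (r2, e2) = then_part, else_part
    return 2 + (r1 - 1) + (r2 - 1), e1 + e2


def within_bounds(pair):
    rho, eta = pair
    return rho <= eta <= 2 ** (rho - 1)


def nested_chain(k):
    """k if-then-else constructs nested in one another's 'then' branch."""
    flowgraph = SIMPLE
    for _ in range(k):
        flowgraph = if_then_else(flowgraph, SIMPLE)
    return flowgraph


def flat_sequence(k):
    """k if-then-else constructs placed one after another."""
    return sequence([if_then_else(SIMPLE, SIMPLE)] * k)


print(nested_chain(5))   # (6, 6): the lower bound is met
print(flat_sequence(5))  # (6, 32): the upper bound 2**(6 - 1) is met
```

With the same number of predicates, deep nesting keeps the path count linear while flat sequencing makes it exponential, which is why neither inequality can be tightened in general.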

5 Further observations

So far, we have introduced two classes of software metrics that can be given an inductive definition over the decompositional structure of the flowgraph of a program: the 'recursive' and the 'hierarchical' metrics. Moreover, we have shown that the converse of the implication

$$\text{recursive} \Rightarrow \text{hierarchical}$$

does not hold, since the important 'number-of-paths' metric is hierarchical but not recursive.

More generally, we wish to establish a broader sequence of implications

$$\text{recursive} \Rightarrow \text{hierarchical} \Rightarrow \text{inductive} \Rightarrow \text{computable}$$

involving yet a third sub-classification of the 'computable hierarchy': the so-called 'inductive' metrics. Given any $p \ge 0$, we will say that a measure

$$m : \text{flowgraphs} \to \text{numbers}$$

is inductive (mod $p$) if our axiom 3

$$mF(X_1, \ldots, X_n) = h_F(mX_1, \ldots, mX_n)$$

holds for every irreducible $F \in S_p$.

Fig. 2 Binary labelled tree

Then it is clear that every hierarchical metric is inductive (mod $p$) for all $p \ge 0$ (and conversely). Moreover, if $p < q$ we have the implication

$$\text{inductive (mod } q) \Rightarrow \text{inductive (mod } p)$$

but not conversely. To see the latter, we have only to consider the measure

$$\delta(F) = \begin{cases} 2 & \text{if there is more than one level of nested 'G's in the hierarchy of } F \\ 1 & \text{otherwise} \end{cases}$$

where $G \in S_q - S_p$. Surely $\delta$ is inductive (mod $p$) since

$$\delta F(X_1, \ldots, X_n) = 2 \text{ if, and only if, } \delta(X_i) = 2 \text{ for some } i$$

for all $F \in S_p$. On the other hand, the knowledge that $\delta(X_i) = 1$ for all $i$ is not sufficient to determine whether $G(X_1, \ldots, X_n)$ has more than one level of nested 'G's, so that $\delta$ is not inductive (mod $q$).

Next, we exhibit a metric

$$\omega : \text{flowgraphs} \to \{1, 2\}$$

that is computable in the broadest sense, but not inductive (mod $p$) for any $p \ge 0$. For any two distinguished flowgraphs $F_1$, $F_2$ we take

$$\omega(F) = \begin{cases} 1 & \text{if the number of '}F_1\text{'s is greater than or equal to the number of '}F_2\text{'s in the hierarchy of } F \\ 2 & \text{otherwise} \end{cases}$$

Clearly the hierarchical tree for the decomposition of $F$ yields an algorithmic procedure for computing $\omega(F) = 1$ or $2$ for any flowgraph $F$. But even $\omega(F_1 \circ F_2 \circ \cdots \circ F_n)$ cannot be determined by the knowledge of $\omega(F_1), \ldots, \omega(F_n)$ alone, showing that $\omega$ is not inductive.
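A small Python sketch makes the last point concrete. A hierarchy is modelled here as a nested `(label, children)` pair, and a sequence is modelled as a node labelled `"seq"`; this representation, the function names and the labels `"F1"` and `"F2"` for the two distinguished flowgraphs are all assumptions made for illustration.

```python
from collections import Counter


def label_counts(tree):
    """Count how often each label occurs in a (label, children) hierarchy."""
    label, children = tree
    counts = Counter([label])
    for child in children:
        counts += label_counts(child)
    return counts


def omega(tree, first="F1", second="F2"):
    """1 if 'first' occurs at least as often as 'second' in the hierarchy, else 2."""
    counts = label_counts(tree)
    return 1 if counts[first] >= counts[second] else 2


def seq(*parts):
    """Model a sequence as a node labelled 'seq' over its components."""
    return ("seq", list(parts))


P = ("F1", [("F1", [])])                    # two occurrences of F1
Q = ("F2", [])                              # one occurrence of F2
Q_BIG = ("F2", [("F2", []), ("F2", [])])    # three occurrences of F2

print(omega(P), omega(Q), omega(Q_BIG))        # 1 2 2
print(omega(seq(P, Q)), omega(seq(P, Q_BIG)))  # 1 2
```

The components of the two sequences carry identical values (1 followed by 2), yet the sequences themselves receive different values, so no function of the component values alone can reproduce the metric.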

Finally, we should make the observation that within each level of the spectrum (for example inductive, hierarchical, recursive) there is a further gradation according to the computational complexity of the (best) algorithm for calculating the metric. Thus $\eta$ is at least exponentially complex because $T(n) = O(n \, 2^{2n})$ for the execution of the tree-generating algorithm alone. This is representative of an issue that has not yet attracted a great deal of attention in the software engineering literature, but one that is destined to be of increasing importance in the applications.

6 Acknowledgment

The author would like to acknowledge the most helpful comments of an anonymous referee, resulting in a much improved treatment of the notion of an inductive measure than would have otherwise been the case.

7 References

1 PRATHER, R. E., and GIULIERI, S. J.: 'Decomposition of flowchart schemata', Computer Journal, 1981, 24, pp. 258-262

2 WHITTY, R. W., FENTON, N. E., and KAPOSI, A. A.: 'Structured programming. A tutorial guide', Software & Microsystems, 1984, 3, pp. 54-65

3 PRATHER, R. E.: 'An axiomatic theory of software complexity measure', Computer Journal, 1984, 27, pp. 340-347

4 FENTON, N. E., and WHITTY, R. W.: 'Axiomatic approach to software metrication through program decomposition', Computer Journal, 1987, 30 (to appear)

5 FENTON, N. E., WHITTY, R. W., and KAPOSI, A. A.: 'A generalized mathematical theory of structured programming', Theoretical Computer Science, 1985, 36, pp. 145-171

6 LIPAEV, V. V., POZIN, B. A., and STROGANOVA, I. N.: 'Complexity of program module testing', Programming & Computer Software, 1983, 9, pp. 332-337

7 WHITTY, R. W.: 'Generation of an important class of program flowgraphs'. Internal Report, Department of Electrical Engineering, Polytechnic of the South Bank, London, England, 1983

8 FUCIK, J., and KRAL, J.: 'The hierarchy of program control structures', Computer Journal, 1986, 29, pp. 24-32

9 McCABE, T.: 'A complexity measure', IEEE Transactions on Software Engineering, 1976, SE-2, pp. 308-320

Prof. R. E. Prather is Caruth Distinguished Professor in the Department of Computer Science, Trinity University, San Antonio, TX 78284, USA.
