On hierarchical software metrics
by Prof. Ronald E. Prather
The notion of a hierarchical software metric is introduced, encompassing and extending the earlier work of Fenton and Whitty and this author. It is shown that a stronger interpretation of this notion is insufficient for treating the important 'number-of-paths' metric, whereas this metric definitely falls within the range of the axiomatic system introduced herein. Finally, the technique of argument by induction over an arbitrary flowgraph decomposition is clarified and demonstrated in the new hierarchical context.
1 Introduction
The idea of a software metric being defined recursively or inductively over the flowgraph of an arbitrary program, through the analysis of its structural decomposition (Refs. 1 and 2) into a sequence of 'irreducible' sub-flowgraphs, has attracted a good deal of attention in the software engineering literature.
The germ of this idea can be found in the author's earlier work (Ref. 3), but a more complete study may be found in a recent paper by Fenton and Whitty (Ref. 4). We will find, however, that one possible interpretation of this framework is insufficient for treating the full range of software measures of interest, particularly that of the 'number-of-paths' metric, an important estimating measure of the cost of testing a piece of software. We show how this limitation may be overcome by studying a somewhat broader class of 'hierarchical' metrics than might otherwise have been intended. Moreover, we will find that this new axiomatisation admits an effective technique of inductive reasoning, sufficient to study the important questions that are of interest in the theory.
By and large, we will use the notation and terminology of Refs. 3 and 4. In particular, a flowgraph $F$ will consist of a directed graph with distinguished start vertex and stop vertex, such that every vertex or node lies on a path from 'start' to 'stop'. Although our results have a more general range of application, we assume for the sake of simplicity that every node other than 'stop' has outdegree 1 or 2 (and, correspondingly, we will refer to procedure nodes and predicate nodes in these two instances).
Note that the 'stop' vertex has outdegree 0. Provided that a formal mapping has been given for the language syntax, a program written in any procedural language gives rise to a unique flowgraph (Ref. 5) and, because this is well known, we will not call unnecessary attention to this association here, except to note that the procedure nodes are identified with the simple programming language statements (of assignment, input and output), whereas the predicate nodes are associated with the decision points that occur in the program.
The most important preliminary result, one underlying the whole theory, is that of the recursive 'sequence nesting' decomposition of any flowgraph.
Theorem 1: Every flowgraph $F$ has a unique sequential decomposition
$$F = F_1 \circ F_2 \circ \cdots \circ F_n$$
into (sequentially) irreducible sub-flowgraphs. Each of these admits a further nested decomposition
$$F_i = F_i(X_1, X_2, \ldots, X_{n_i})$$
into maximal sub-flowgraphs. The proof is given in Refs. 4 and 6.
The recursive application of this result yields the structure of an arbitrary flowgraph $F$ as a hierarchical tree exhibiting the top-down decomposition of $F$ into irreducible sub-flowgraphs. Fenton and Whitty, together with Kaposi, have shown (Refs. 5 and 7) that in large measure the study of the 'irreducibles' can be reduced to that of the CGK-irreducibles alone, wherein the procedure nodes are 'collapsed' (so that we obtain flowgraphs without procedure nodes). In this regard, the following result will be useful in the sequel.
Theorem 2: Every CGK-irreducible with $n + 1$ predicate nodes can be obtained by 'grafting a new predicate node' to an edge in a CGK-irreducible having $n$ predicate nodes. The proof is given in Refs. 7 and 8.
In particular, we note that the 'grafting' process at a particular edge results in a new predicate node, one of whose outgoing edges continues in the direction of the original edge, the other to a different but otherwise arbitrary node of the flowgraph.
Finally, we note that the irreducibles appearing in a decomposition may be classified according to the number of predicate nodes occurring therein. More generally, the complete family of irreducibles can be described as an infinite union:
$$S = \bigcup_{n} S_n$$
where $S_n$ is the sub-class of irreducibles involving $n$ predicate nodes. If the decomposition of a flowgraph $F$ yields irreducibles of the class $S_1$ at most, then $F$ is said to be a structured flowgraph. In general, we will say that $F$ is $n$-structured if its decomposition involves irreducibles only from the classes $S_1, S_2, \ldots, S_n$. Note that if $F$ involves no irreducibles whatsoever, as occurs only in cases where $F$ is a sequence of simple statements, then we may say that $F$ is 0-structured, just to complete the picture. In this way, it is seen that the notion of 'structured programming' fits into a broader scheme, as argued in Ref. 5. We draw on this observation in several facets of the theory to be developed here.
Software Engineering Journal March 1987
2 The hierarchical axiom scheme
In general, a (flowgraph) measure of software complexity $m$ is a function
$$m : \text{flowgraphs} \to \text{numbers}$$
(or, from programs to numbers, recalling the association of flowgraphs with programs). For brevity, we simply refer to such functions as 'measures' or 'metrics'. It is hoped that one can discover, identify and study those metrics that are useful as predictors or indicators of one or more aspects of program complexity, for example programming effort, testing cost etc.
Here, we are particularly interested in those metrics that can be given a recursive or inductive definition over the flowgraph decomposition structure. To this end, we will say that $m$ is a hierarchical metric if:
• Axiom 1: $m(S) = 1$ ($S$ = simple process)
• Axiom 2: $m(F_1 \circ F_2 \circ \cdots \circ F_n) = g_n(mF_1, mF_2, \ldots, mF_n)$
• Axiom 3: $m(F(X_1, X_2, \ldots, X_n)) = h_F(mX_1, mX_2, \ldots, mX_n)$
for computable functions $g_n$ ($n = 1, 2, \ldots$) and $h_F$ ($F \in S$). Particular attention must be given, however, to the intended interpretation of axiom 3. The axiom is to be understood as if we had written
$$h_F(mX_1, mX_2, \ldots, mX_n) = h(F, mX_1, mX_2, \ldots, mX_n)$$
so that the structure of the irreducible $F$ appears as an argument to the 'global' computable function $h$.
The Fenton and Whitty axioms (Ref. 4) are similar to the above. But a stronger interpretation of axiom 3 is possible, as if we were to understand
$$h_F(mX_1, mX_2, \ldots, mX_n) = h(mF, mX_1, mX_2, \ldots, mX_n)$$
wherein the global function $h$ 'sees' $F$ only through a measure of its own complexity as an irreducible. And, of course, this requires that the measures $mF$ ($F \in S$) be prescribed in advance. With such a strengthened interpretation of axiom 3, we will refer to $m$ as a recursive metric. In the weaker interpretation we have provided for the hierarchical metrics, the measures $mF$ are not defined, other than in the form
$$mF(N, N, \ldots, N) = h_F(0, 0, \ldots, 0)$$
for 'null' processes $N$, for which we assume $mN = 0$. Of course, our two interpretations yield the immediate result of the following theorem.
Theorem 3: Every recursive metric is hierarchical.
But the question then remains: are there (interesting) hierarchical metrics that are not recursive? We answer this question in the affirmative in the following section. In so doing, we will of course limit our attention to measures $m$ that are computable in the broadest sense, in that an algorithm exists for computing $m(F)$ for any given flowgraph $F$. This is an obvious first requirement for consideration.
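The relationship between the two readings of axiom 3 can be illustrated with a small Python sketch. Everything named here is an assumption made for illustration only: the dictionary encoding of an irreducible, the helper `as_hierarchical`, and the sample combining rule (a cyclomatic-style rule, chosen because it is simple). The sketch suggests why Theorem 3 is immediate: a 'recursive' rule of the form h(mF, mX_1, ..., mX_n) becomes a 'hierarchical' rule of the form h(F, mX_1, ..., mX_n) once mF is computed from the structure of F.

```python
def as_hierarchical(h_recursive, measure_of_irreducible):
    """Wrap a recursive-style rule h(mF, values) as a hierarchical-style rule h(F, values)."""
    def h(F, sub_values):
        # The hierarchical rule receives the whole irreducible F, so it can
        # always compute mF first and then defer to the recursive rule.
        return h_recursive(measure_of_irreducible(F), sub_values)
    return h


def sample_rule(m_F, sub_values):
    """Hypothetical recursive rule: mF plus the excess of each nested value over 1."""
    return m_F + sum(v - 1 for v in sub_values)


def sample_measure(F):
    """Hypothetical measure prescribed on irreducibles: predicate count plus one."""
    return F["predicates"] + 1


h = as_hierarchical(sample_rule, sample_measure)
if_then_else = {"predicates": 1}      # illustrative irreducible with one predicate node
print(h(if_then_else, [2, 1]))        # prints 3
```

The converse wrapping is not available in general, since a rule that inspects the structure of F may use information that no single number mF carries.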
3 The number-of-paths metric
In general, the number of paths from 'start' to 'stop' in a flowgraph will be infinite, owing to the possible presence of loops. However, we obtain a finite measure
$$\eta : \text{flowgraphs} \to \text{natural numbers}$$
if we restrict our attention to the simple paths, i.e. those for which no edge is traversed more than once. There is then little possibility for confusion in referring to this restricted measure $\eta$ as the number-of-paths metric (see Ref. 6 for a closely related metric). It is well known (Ref. 6) that such measures provide an extremely useful estimate of the testing effort, since one often wishes to run sufficiently many tests as to exercise all such program paths, or at least some fraction thereof.
Moreover, $\eta$ is obviously a computable metric. We have only to assign a left-right orientation to the outgoing edges of the predicate nodes, and then to make use of the binary tree-generating algorithm:
    at depth := 0 to 2n - 1
        for each leaf
            extend to left if possible
            extend to right if possible
finally counting the resulting (simple) paths from 'start' to 'stop'. Here $n$ is the number of vertices in the underlying flowgraph, and we note that the upper limit on the outer computational loop is then determined by a simple application of the 'pigeonhole principle'. As a result, the algorithm has a computational complexity:
$$T(n) = \sum_{k=0}^{2n-1} 2^k \cdot 2k = 2 \sum_{k=0}^{2n-1} k\,2^k = O(n\,2^{2n})$$
representing an exponential order, albeit one of utmost simplicity. As a matter of fact the algorithm can easily be extended to treat the case of a general digraph, having arbitrary outdegrees. For the digraph of Fig. 1a we obtain the binary labelled tree of Fig. 2, with the resulting conclusion that there are precisely four simple paths from 'start' to 'stop'.
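The tree-generating procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the function name `count_simple_paths`, the edge-list representation and the small example flowgraph are all hypothetical (the example is not the digraph of Fig. 1a). Instead of a fixed depth bound, the sketch simply stops when no leaf can be extended; since each level consumes one unused edge, the tree has finite depth.

```python
from collections import defaultdict


def count_simple_paths(edges, start, stop):
    """Count start-to-stop paths that traverse no edge more than once.

    edges is a list of (tail, head) pairs; the list index serves as the
    edge label, so parallel edges are allowed.
    """
    out = defaultdict(list)                  # node -> list of (edge_id, head)
    for edge_id, (tail, head) in enumerate(edges):
        out[tail].append((edge_id, head))

    finished = 0
    # Each leaf of the tree is a partial path: (current node, set of used edges).
    leaves = [(start, frozenset())]
    while leaves:
        next_leaves = []
        for node, used in leaves:
            if node == stop:
                finished += 1                # a leaf that has reached 'stop' is complete
                continue
            for edge_id, head in out[node]:
                if edge_id not in used:      # extend only along edges not yet traversed
                    next_leaves.append((head, used | {edge_id}))
        leaves = next_leaves
    return finished


# Hypothetical flowgraph with two predicate nodes, a and b.
example_edges = [("a", "b"), ("a", "stop"), ("b", "a"), ("b", "stop")]
print(count_simple_paths(example_edges, "a", "stop"))   # prints 3
```

The three paths in the example are a-stop, a-b-stop and a-b-a-stop; the last one revisits node a but repeats no edge, which is what 'simple' means in this section.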
Now there is no question that $\eta$ is a metric of great interest and importance in the software engineering environment. In the result that follows, we show that this number-of-paths metric definitely falls within the hierarchical axiom scheme introduced here, and yet it falls outside the stronger (recursive) interpretation of the Fenton and Whitty theory.
Theorem 4: The number-of-paths metric $\eta$ is hierarchical but not recursive.
The proof is as follows. For the first assertion, we have only to show that $\eta$ may be given an inductive definition 1, 2, 3 over the decompositional structure of any given flowgraph $F$. In fact, we may write:
• Definition 1: $\eta(S) = 1$
• Definition 2: $\eta(F_1 \circ F_2 \circ \cdots \circ F_n) = \prod_{i=1}^{n} \eta F_i$
• Definition 3: $\eta(F(X_1, X_2, \ldots, X_n)) = \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}}$
where $\{p_1, p_2, \ldots, p_{\eta F}\}$ is the set of paths of $F$ and we have
$$b_{ij} = \begin{cases} 1 & \text{if } X_i \text{ is on path } p_j \\ 0 & \text{otherwise} \end{cases}$$
Note that this latter understanding requires that the structure of $F$ be known, showing the specific interconnection of the maximal sub-flowgraphs $X_i$. Moreover, one needs to have computed the simple paths of the given irreducible $F$, for example using the algorithm just introduced. And yet all this is well within the range of capability assumed for the computable function $h$ appearing in the original interpretation of axiom 3, so that, indeed, $\eta$ is a hierarchical metric.
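The inductive definition can be turned into a short recursive evaluator. The following Python sketch is illustrative only: the tuple encoding of the decomposition tree and the helper names `seq`, `nest` and `if_then_else` are assumptions introduced here, and each irreducible is assumed to be supplied together with its list of simple paths, each path recorded as the indices of the sub-flowgraphs lying on it.

```python
from math import prod

SIMPLE = ("simple",)


def seq(*parts):
    """Sequential composition of the given parts."""
    return ("seq", list(parts))


def nest(paths, subs):
    """Nesting F(X_0, ..., X_{n-1}); paths[j] lists the indices i with X_i on path p_j."""
    return ("nest", paths, list(subs))


def eta(flowgraph):
    kind = flowgraph[0]
    if kind == "simple":                      # Definition 1
        return 1
    if kind == "seq":                         # Definition 2: product over the sequence
        return prod(eta(part) for part in flowgraph[1])
    if kind == "nest":                        # Definition 3: sum over paths of products
        _, paths, subs = flowgraph
        values = [eta(x) for x in subs]
        return sum(prod(values[i] for i in path) for path in paths)
    raise ValueError(f"unknown flowgraph kind: {kind!r}")


def if_then_else(then_part, else_part):
    """Illustrative irreducible with two paths, one through each branch."""
    return nest([[0], [1]], [then_part, else_part])


single = if_then_else(SIMPLE, SIMPLE)
print(eta(single))                            # prints 2
print(eta(seq(single, single)))               # prints 4
print(eta(if_then_else(single, SIMPLE)))      # prints 3
```

Note that the `nest` case needs the path list of F itself, not merely a number attached to F, which is the feature that places this metric in the hierarchical class.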
On the other hand, a knowledge of $\eta F$ alone (but together with $\eta X_1, \eta X_2, \ldots, \eta X_n$) is not sufficient to determine $\eta(F(X_1, X_2, \ldots, X_n))$. To see this, consider the two irreducibles $F$ and $F'$ shown in Figs. 1b and 1c. We have
$$\eta F = \eta F' \ (= 4)$$
as given by the simple paths:
$$e_2e_1e_4e_6, \quad e_2e_5, \quad e_4e_3e_2e_5, \quad e_4e_6$$
$$e_5, \quad e_1e_2e_5, \quad e_1e_4e_6, \quad e_1e_4e_3e_2e_5$$
in $F$ and $F'$, respectively (one may use the tree algorithm as before here). Now if we suppose that $X_1, X_2, \ldots, X_6$ are chosen so that
$$\eta X_1 = \eta X_1' \ (= 2)$$
$$\eta X_2 = \eta X_2' \ (= 2)$$
$$\vdots$$
$$\eta X_6 = \eta X_6' \ (= 2)$$
we will find that, nevertheless,
$$\eta F(X_1, X_2, \ldots, X_6) \ne \eta F'(X_1', X_2', \ldots, X_6')$$
and in fact
$$\eta F(X_1, X_2, \ldots, X_6) = 16 + 4 + 16 + 4 = 40$$
$$\eta F'(X_1', X_2', \ldots, X_6') = 2 + 8 + 8 + 32 = 50$$
showing that $\eta$ is not recursive.
Fig. 1 Digraph F, flowgraph F and flowgraph F': (a) digraph F; (b) flowgraph F; (c) flowgraph F'
The result shows that the stronger interpretation of the Fenton and Whitty axioms is not sufficiently broad for the intended purpose, even though there are a number of well known hierarchical metrics, for example McCabe's measure (Ref. 9), Prather's measure $\mu$ (Ref. 3) as extended in Ref. 4, etc., that are in fact 'recursive'.
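The arithmetic of this counterexample can be checked mechanically. The Python sketch below is illustrative: the names `PATHS_F`, `PATHS_F_PRIME` and `eta_nested` are introduced here, the path lists are the four simple paths of F and of F' as read from the text, and it is assumed (as the sums 16 + 4 + 16 + 4 and 2 + 8 + 8 + 32 suggest) that sub-flowgraph X_i sits on edge e_i.

```python
from math import prod

# Simple paths of F and F', each written as the indices of the edges e_i it uses.
PATHS_F = [(2, 1, 4, 6), (2, 5), (4, 3, 2, 5), (4, 6)]
PATHS_F_PRIME = [(5,), (1, 2, 5), (1, 4, 6), (1, 4, 3, 2, 5)]


def eta_nested(paths, eta_x):
    """Sum over the paths of the product of eta(X_i) for the X_i on each path."""
    return sum(prod(eta_x[i] for i in path) for path in paths)


eta_x = {i: 2 for i in range(1, 7)}          # eta(X_i) = 2 for i = 1, ..., 6

print(len(PATHS_F), len(PATHS_F_PRIME))      # prints 4 4
print(eta_nested(PATHS_F, eta_x))            # prints 40
print(eta_nested(PATHS_F_PRIME, eta_x))      # prints 50
```

Both irreducibles have four paths and receive identical nested values, yet the results differ because the path lengths differ (4, 2, 4, 2 against 1, 3, 3, 5).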
4 Inductive arguments
Having provided an inductive definition of the class of flowgraphs, using their structural decomposition, and having introduced an inductive axiom scheme over such structures to define the notion of a hierarchical software metric, it is entirely natural to expect that one would be able to provide inductive proofs for important properties of these metrics. Indeed, this is the case, but there are certain pitfalls awaiting the unwary, and we feel that it is perhaps necessary (and certainly helpful) to outline the general strategy by way of example.
As our illustration we choose a pair of estimates (lower and upper bounds) for the number-of-paths metric $\eta$, in terms of the McCabe measure $\rho$. These estimates have long been a part of the 'folklore' of software engineering. However, outside of the easy proof available in restricting the attention to structured programs, we know of no thoroughly convincing argument to handle the general case in an elegant way. We would hope to correct this deficiency here.
We recall that the McCabe measure
$$\rho : \text{flowgraphs} \to \text{natural numbers}$$
is generally introduced (Ref. 8) as the cyclomatic number of the digraph $F$ (in the sense of graph theory); i.e. $\rho F$ equals the number of 'independent' cycles in $F$.
Note that usually we first augment $F$ by adjoining an edge from 'stop' to 'start' so as to ensure that our axiom 1 is satisfied. McCabe has shown that, in fact,
$$\rho F = n + 1$$
where $n$ is the number of predicate nodes in $F$, and we will make occasional use of this fact in the arguments below.
Moreover, we make implicit use of the obvious identities
$$\eta(F) = \eta(C(F)), \qquad \rho(F) = \rho(C(F))$$
where $C(F)$ is the CGK graph of $F$ as defined in Ref. 5. And with this understanding, we may state the following result, whose proof is given in two parts (we note that neither of the inequalities can be relaxed, since both bounds can in fact be met).
Theorem 5: For every flowgraph $F$ we have
$$\rho F \le \eta F \le 2^{\rho F - 1}$$
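As proof of this, for every simple process $S$ we have the pair of equalities, since $\rho S = 1 = \eta S$. For the first part of the inductive argument we consider the left-hand inequality. Assuming that $\rho F_i \le \eta F_i$, for $i = 1, 2, \ldots, n$, we obtain
$$\begin{aligned}
\rho(F_1 \circ F_2 \circ \cdots \circ F_n) &= 1 + \sum_{i=1}^{n} (\rho F_i - 1) \\
&\le 1 + \sum_{i=1}^{n} (\eta F_i - 1) \\
&\le 1 + \Bigl(\prod_{i=1}^{n} \eta F_i - 1\Bigr) \\
&= \prod_{i=1}^{n} \eta F_i = \eta(F_1 \circ F_2 \circ \cdots \circ F_n)
\end{aligned}$$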
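The passage from the sum to the product in this chain rests on an elementary inequality that may be worth spelling out. As a short check (the symbols $a$, $b$ and $a_i$ are introduced here only for illustration, standing for integers such as $\eta F_i \ge 1$), the two-factor case is
$$ab - 1 - \bigl[(a - 1) + (b - 1)\bigr] = (a - 1)(b - 1) \ge 0 \qquad (a, b \ge 1)$$
and induction on $n$, taking $a = a_1 a_2 \cdots a_{n-1}$ and $b = a_n$, then gives
$$\sum_{i=1}^{n} (a_i - 1) \le \prod_{i=1}^{n} a_i - 1 \qquad (a_i \ge 1)$$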
We then use theorem 2 to conclude that $\rho F \le \eta F$ for every irreducible $F$. Thus we have equality ($\rho F = 1 = \eta F$) in case $F \in S_0$. And in assuming the inequality for all irreducibles in $S_{n-1}$ we obtain
$$\rho(F) = 1 + \rho(F - D) \le 1 + \eta(F - D) \le \eta(F)$$
for all $F \in S_n$, where $D$ is the 'grafted' predicate node. Note that in our last inequality there is at least one new (simple) path in $F$. Finally, if we assume that $\rho X_i \le \eta X_i$, for $i = 1, 2, \ldots, n$, we obtain
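$$\begin{aligned}
\rho F(X_1, \ldots, X_n) &= \rho F + \sum_{i=1}^{n} (\rho X_i - 1) \\
&\le \eta F + \sum_{i=1}^{n} (\eta X_i - 1) \\
&\le \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}} \\
&= \eta F(X_1, \ldots, X_n)
\end{aligned}$$
as required, using the inequality on irreducibles as obtained directly above. Note that the last inequality in our string follows from the fact that every $X_i$ appears on some path.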
For the second part of the argument (the right-hand inequality) we first suppose that $\eta F_i \le 2^{\rho F_i - 1}$ for $i = 1, 2, \ldots, n$. We then obtain
$$\eta(F_1 \circ F_2 \circ \cdots \circ F_n) = \prod_{i=1}^{n} \eta F_i \le \prod_{i=1}^{n} 2^{\rho F_i - 1} = 2^{\sum_{i=1}^{n} (\rho F_i - 1)} = 2^{\rho(F_1 \circ F_2 \circ \cdots \circ F_n) - 1}$$
Again we use theorem 2 to help in concluding that $\eta F \le 2^{\rho F - 1}$ for every irreducible $F$. When $F \in S_0$ we have the equality ($\eta F = 1 = 2^{\rho F - 1}$). If we assume the desired inequality for all irreducibles in $S_{n-1}$, then for $F \in S_n$ we obtain
$$\begin{aligned}
\eta F &= \eta(F - D) + \text{number of new paths} \\
&\le 2^{\rho(F - D) - 1} + \text{number of new paths} \\
&= 2^{\rho F - 2} + \text{number of new paths} \\
&\le 2^{\rho F - 2} + 2^{\rho F - 2} = 2(2^{\rho F - 2}) = 2^{\rho F - 1}
\end{aligned}$$
since the new paths are again achieved as if in a sub-flowgraph with $\rho F - 2$ decision nodes. Finally, if we assume that $\eta X_i \le 2^{\rho X_i - 1}$, for $i = 1, 2, \ldots, n$, we obtain
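$$\begin{aligned}
\eta F(X_1, \ldots, X_n) &= \sum_{j=1}^{\eta F} \prod_{i=1}^{n} (\eta X_i)^{b_{ij}} \\
&\le \sum_{j=1}^{\eta F} 2^{\sigma} = \eta F \cdot 2^{\sigma} \\
&\le 2^{\rho F - 1} \cdot 2^{\sigma} = 2^{\rho F(X_1, \ldots, X_n) - 1}
\end{aligned}$$
where
$$\sigma = \sum_{i=1}^{n} (\rho X_i - 1)$$
Again, we have used the inequality on irreducibles in the last inequality of our string.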
Note that without the 'structure argument', the tree-generating algorithm only shows that
so that, indeed, we learn a great deal fromthis new perspective.
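The two bounds argued in this section, $\rho F \le \eta F \le 2^{\rho F - 1}$, can be spot-checked on small structured examples, including two families on which the bounds are attained. The Python sketch below is illustrative only: the tuple encoding and the helpers `if_then_else`, `nested_chain` and `sequence_chain` are assumptions introduced here, $\rho$ is computed as the number of predicate nodes plus one, and $\eta$ is computed as the product over a sequence and the sum over paths for a nesting.

```python
from math import prod

SIMPLE = ("simple",)


def if_then_else(then_part, else_part):
    # (kind, predicate nodes of F itself, paths of F, nested sub-flowgraphs)
    return ("nest", 1, [[0], [1]], [then_part, else_part])


def seq(parts):
    return ("seq", list(parts))


def predicates(g):
    """Total number of predicate nodes in the hierarchy of g."""
    if g[0] == "simple":
        return 0
    if g[0] == "seq":
        return sum(predicates(p) for p in g[1])
    return g[1] + sum(predicates(x) for x in g[3])


def rho(g):
    return predicates(g) + 1


def eta(g):
    if g[0] == "simple":
        return 1
    if g[0] == "seq":
        return prod(eta(p) for p in g[1])
    values = [eta(x) for x in g[3]]
    return sum(prod(values[i] for i in path) for path in g[2])


def nested_chain(k):
    """k if-then-else constructs, each nested in the 'then' branch of the next."""
    g = SIMPLE
    for _ in range(k):
        g = if_then_else(g, SIMPLE)
    return g


def sequence_chain(k):
    """k if-then-else constructs in sequence."""
    return seq([if_then_else(SIMPLE, SIMPLE)] * k)


for k in range(1, 6):
    for g in (nested_chain(k), sequence_chain(k)):
        assert rho(g) <= eta(g) <= 2 ** (rho(g) - 1)
    assert eta(nested_chain(k)) == rho(nested_chain(k))                   # lower bound met
    assert eta(sequence_chain(k)) == 2 ** (rho(sequence_chain(k)) - 1)    # upper bound met
```

A check of this kind is no substitute for the inductive argument, but it shows concretely why neither inequality can be relaxed.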
5 Further observations
So far, we have introduced two classes of software metrics that can be given an inductive definition over the decompositional structure of the flowgraph of a program: the 'recursive' and the 'hierarchical' metrics. Moreover, we have shown that the converse of the implication
$$\text{recursive} \Rightarrow \text{hierarchical}$$
does not hold, since the important 'number-of-paths' metric is hierarchical but not recursive.
More generally, we wish to establish a broader sequence of implications
$$\text{recursive} \Rightarrow \text{hierarchical} \Rightarrow \text{inductive} \Rightarrow \text{computable}$$
involving yet a third sub-classification of the 'computable hierarchy': the so-called 'inductive' metrics. Given any $p \ge 0$, we will say that a measure
$$m : \text{flowgraphs} \to \text{numbers}$$
is inductive (mod $p$) if our axiom 3
$$mF(X_1, \ldots, X_n) = h_F(mX_1, \ldots, mX_n)$$
holds for every irreducible $F \in S_p$.
Fig. 2 Binary labelled tree
Then it is clear that every hierarchical metric is inductive (mod $p$) for all $p \ge 0$ (and conversely). Moreover, if $p < q$ we have the implication
$$\text{inductive (mod } q) \Rightarrow \text{inductive (mod } p)$$
but not conversely. To see the latter, we have only to consider the measure
$$\delta(F) = \begin{cases} 2 & \text{if there is more than one level of nested '}G\text{'s in the hierarchy of } F \\ 1 & \text{otherwise} \end{cases}$$
where $G \in S_q - S_p$. Surely $\delta$ is inductive (mod $p$) since
$$\delta F(X_1, \ldots, X_n) = 2 \ \text{ if, and only if, } \ \delta(X_i) = 2 \text{ for some } i$$
for all $F \in S_p$. On the other hand, the knowledge that $\delta(X_i) = 1$ for all $i$ is not sufficient to determine whether $G(X_1, \ldots, X_n)$ has more than one level of nested 'G's, so that $\delta$ is not inductive (mod $q$).
Next, we exhibit a metric
$$\omega : \text{flowgraphs} \to \{1, 2\}$$
that is computable in the broadest sense, but not inductive (mod $p$) for any $p \ge 0$. For any two distinguished flowgraphs $F_1$, $F_2$ we take
$$\omega(F) = \begin{cases} 1 & \text{if the number of '}F_1\text{'s is greater than or equal to the number of '}F_2\text{'s in the hierarchy of } F \\ 2 & \text{otherwise} \end{cases}$$
Clearly the hierarchical tree for the decomposition of $F$ yields an algorithmic procedure for computing $\omega(F) = 1$ or 2 for any flowgraph $F$. But even $\omega(F_1 \circ F_2 \circ \cdots \circ F_n)$ cannot be determined by the knowledge of $\omega(F_1), \ldots, \omega(F_n)$ alone, showing that $\omega$ is not inductive.
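A concrete pair of sequences makes the failure visible. In the Python sketch below, the (label, children) encoding of the hierarchy and the names `count_label`, `omega`, `seq_a` and `seq_b` are hypothetical, introduced only for illustration; "F1" and "F2" label occurrences of the two distinguished flowgraphs.

```python
def count_label(tree, label):
    """Number of nodes carrying the given label in the hierarchy."""
    name, children = tree
    return (name == label) + sum(count_label(c, label) for c in children)


def omega(tree):
    """1 if 'F1' occurs at least as often as 'F2' in the hierarchy, else 2."""
    return 1 if count_label(tree, "F1") >= count_label(tree, "F2") else 2


F1 = ("F1", [])
F2 = ("F2", [])
F2_twice = ("F2", [F2])                  # an F2 with a second F2 nested inside it

seq_a = ("seq", [F1, F2])
seq_b = ("seq", [F1, F2_twice])

# The components of both sequences have the same omega-values, (1, 2) ...
print([omega(part) for part in seq_a[1]])    # prints [1, 2]
print([omega(part) for part in seq_b[1]])    # prints [1, 2]
# ... yet the sequences themselves are measured differently.
print(omega(seq_a), omega(seq_b))            # prints 1 2
```

No function of the component values alone can return 1 for the first sequence and 2 for the second, since it would be given the same arguments in both cases.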
Finally, we should make the observation that within each level of the spectrum (for example inductive, hierarchical, recursive) there is a further gradation according to the computational complexity of the (best) algorithm for calculating the metric. Thus $\eta$ is at least exponentially complex because $T(n) = O(n\,2^{2n})$ for the execution of the tree-generating algorithm alone. This is representative of an issue that has not yet attracted a great deal of attention in the software engineering literature, but one that is destined to be of increasing importance in the applications.
6 Acknowledgment
The author would like to acknowledge the most helpful comments of an anonymous referee, resulting in a much improved treatment of the notion of an inductive measure than would have otherwise been the case.
7 References
1 PRATHER, R. E., and GIULIERI, S. J.: 'Decomposition of flowchart schemata', Computer Journal, 1981, 24, pp. 258-262
2 WHITTY, R. W., FENTON, N. E., and KAPOSI, A. A.: 'Structured programming. A tutorial guide', Software & Microsystems, 1984, 3, pp. 54-65
3 PRATHER, R. E.: 'An axiomatic theory of software complexity measure', Computer Journal, 1984, 27, pp. 340-347
4 FENTON, N. E., and WHITTY, R. W.: 'Axiomatic approach to software metrication through program decomposition', Computer Journal, 1987, 30 (to appear)
5 FENTON, N. E., WHITTY, R. W., and KAPOSI, A. A.: 'A generalized mathematical theory of structured programming', Theoretical Computer Science, 1985, 36, pp. 145-171
6 LIPAEV, V. V., POZIN, B. A., and STROGANOVA, I. N.: 'Complexity of program module testing', Programming & Computer Software, 1983, 9, pp. 332-337
7 WHITTY, R. W.: 'Generation of an important class of program flowgraphs'. Internal Report, Department of Electrical Engineering, Polytechnic of the South Bank, London, England, 1983
8 FGCIK, J., and KRAL, J.: 'The hierarchy of program control structures', Computer Journal, 1986, 29, pp. 24-32
9 McCABE, T.: 'A complexity measure', IEEE Transactions on Software Engineering, 1976, SE-2, pp. 308-320
Prof. R. E. Prather is Caruth Distinguished Professor in the Department of Computer Science, Trinity University, San Antonio, TX 78284, USA.