Bell Laboratories, Murray Hill, New Jersey 07974
Systems and Software Research Center Technical Report
An Empirical Study of Software Interface Faults — An Update
Dewayne E. Perry and W. Michael Evangelist*
Abstract
In [Perry 85] we reported work in progress on the analysis of software interface faults in a large software system. In this update of that work, we complete the analysis of the interface faults for this large system and extend our results. In our original analysis, we determined that there were 15 categories of interface faults in the prima facie interface fault set of error reports. We add one new category as a result of our completed analysis: hardware/software interface problems. We summarize our original findings for the set of errors determined by our operational definition of interface faults, present our new findings for the single-file interface faults, and then present the characterization of interface faults with respect to the total error population.
published in
Proceedings of the Twentieth Annual Hawaii International Conference on System Sciences.
January 1987. Volume II, pages 113-126.
* MCC, Austin, TX 78759 (now at Florida International University)
1. Introduction.
Relatively few studies of errors encountered during the development of large systems
have appeared in the computing literature. [Endres 75] is one of the first papers to analyse
errors in a large program. While interface errors were not specifically cited as a category,
several aspects of interface problems are identifiable within various subcategories.
[Schneidewind 79] and [Glass 81] report on different aspects of the development life-
cycle, but neither discusses interface errors as a distinct category. [Ostrand 84] discusses
inadequate specifications, but it is unclear whether the concern is with
requirements specifications, design specifications, or interface specifications; interface
errors, as such, are not considered. [Thayer 78] provides an extensive categorization of
errors but with a definition of interface errors that is substantially narrower than ours. In
two of the projects reported on, the interface categories provided a significant number of
errors: 17% and 22.5% for projects 3 and 4 respectively. [Bowen 80] uses an error
scheme based on [Thayer 78] but reports a much smaller percentage of interface errors
(4.5%). Finally, [Basili 84] provides a definition of interface errors that is acceptably
close to ours (that is, interface errors are "those that are associated with structures
existing outside the module’s local environment but which the module used") and reports
39% of the errors as interface errors in the development of a medium-scale system.
Basili et al. conclude that interface errors appear to be a major problem.
In the succeeding sections we recapitulate the background for this study, summarize our
initial findings, present our analysis of the single-file interface faults, discuss the
combined interface fault sets, and summarize our findings.
2. Background for the study.
The software studied here and in [Perry 85] formed the third release of a real-time
system. The life-cycle represented in the development of this system included a
specification of requirements, high-level design, detailed design, code implementation,
unit test, integration test, and system test, in that order. We examined a selection of
faults reported by testers (for the integration and system test phases) on approximately
350,000 non-commentary source lines from files written in the C programming language.
(The number of non-commentary source lines in a file is the number of newlines
remaining after comments and blank lines are deleted.) We also included fault reports
written against a selection of global header files. (Of course, all faults reported during
testing were removed before system release.)
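The parenthetical definition of non-commentary source lines above amounts to a small counting procedure. The following is a minimal sketch (our illustration, not the measurement tool actually used on the system) of counting lines in exactly this sense: C comments and blank lines are removed and the remaining newlines are counted. For brevity it does not handle comment delimiters that appear inside string literals.

    #include <stdio.h>
    #include <ctype.h>

    /* ncsl: count non-commentary source lines read from standard input. */
    int main(void)
    {
        enum { CODE, SLASH, COMMENT, STAR } state = CODE;
        int c;
        int has_code = 0;    /* current line has something besides blanks and comments */
        long ncsl = 0;

        while ((c = getchar()) != EOF) {
            switch (state) {
            case CODE:
                if (c == '/')
                    state = SLASH;                 /* possible start of a comment */
                else if (!isspace(c))
                    has_code = 1;
                break;
            case SLASH:
                if (c == '*') {
                    state = COMMENT;               /* the '/' opened a comment    */
                } else {
                    has_code = 1;                  /* the '/' was ordinary code   */
                    state = (c == '/') ? SLASH : CODE;
                    if (state == CODE && !isspace(c))
                        has_code = 1;
                }
                break;
            case COMMENT:
                if (c == '*')
                    state = STAR;
                break;
            case STAR:
                if (c == '/')
                    state = CODE;                  /* comment closed              */
                else if (c != '*')
                    state = COMMENT;
                break;
            }
            if (c == '\n') {
                if (has_code)
                    ncsl++;                        /* a non-commentary source line */
                has_code = 0;
            }
        }
        printf("%ld non-commentary source lines\n", ncsl);
        return 0;
    }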
Faults discovered during testing are reported and tracked by a modification request
tracking system. Access to submitted source files is possible only through the tracking
system, thereby guaranteeing that all fault repair activity is tracked. (One problem,
however, is that occasionally more than one fault is repaired in a file that has been
accessed by specifying a single fault report.) Modification requests (MRs) include both
software faults, our primary interest here, and feature requests (enhancements for the new
release of the system). We use the term MR in the subsequent discussion to refer
exclusively to software faults for which either an implementation file (one with a ".c"
suffix) or a global header file (one with a ".h" suffix) had to be modified.
For the purpose of determining a prima facie set of interface MRs, we used the following
operational definition: an MR is a prima facie interface MR if it
i. requires a change in two or more C files, or
ii. requires a change in a global header file.
MRs falling under criterion i. are interface MRs in the sense that information had to be
altered in more than one independent software unit. MRs falling under criterion ii. are
interface problems for the reason that global header files in C are the principal
mechanism for specifying interfaces among independent software entities. The reader
should understand that files are the smallest entity tracked by the MR tracking system.
Interface faults, however, could occur among separate functions within a file, and some
interface problems could have emerged and been repaired before the software was tested
as an integral unit. Furthermore, many interface faults affecting two or more files require
modification of only one of the files.
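To make the operational definition concrete, the following sketch (our illustration only; the record layout and names are hypothetical, not those of the MR tracking system) expresses the two criteria as a predicate over the list of files changed by an MR. For simplicity, any ".h" file stands in for a global header file.

    #include <string.h>

    /* A hypothetical record of the files changed by one modification request. */
    struct mr {
        int          nfiles;     /* number of files changed by the MR */
        const char **files;      /* names of the changed files        */
    };

    static int ends_with(const char *s, const char *suffix)
    {
        size_t ls = strlen(s), lx = strlen(suffix);
        return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
    }

    /* Returns 1 if the MR is a prima facie interface MR, 0 otherwise. */
    int prima_facie_interface_mr(const struct mr *m)
    {
        int c_files = 0, i;

        for (i = 0; i < m->nfiles; i++) {
            if (ends_with(m->files[i], ".h"))
                return 1;               /* criterion ii: a global header file changed */
            if (ends_with(m->files[i], ".c"))
                c_files++;
        }
        return c_files >= 2;            /* criterion i: two or more C files changed   */
    }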
Using this definition to determine an initial set of interface fault MRs, we demonstrated
the importance of studying the problem. A close examination of a significant sample of
these MRs confirmed the adequacy of the operational definition — virtually all of the
MRs were arguably interface problems and approximately 36% of the MRs in the
database satisfied one or both of the criteria given.
An important question remained: did any of the remaining single-file MRs contain
interface faults? A preliminary analysis (confined to dividing the MRs into interface and
non-interface problems) of those MRs that had an impact on one file indicated that a
substantial number of those MRs were indeed interface problems — approximately 51%
of the single-file errors were interface faults (approximately 33% of all the errors). The
combined percentage for both the latter group and the group satisfying the operational
definition is 68.6% of the total error population. It is not hard to conclude that the
interface problem is significant and justifies increased attention.
3. Data Classification.
On the basis of the prima facie set we classified the interface faults represented by the
randomly selected data by studying each MR description in detail to determine the
underlying interface problem. As each MR was analyzed, it was placed in either a
previously constructed category or in a new category, the existence of which had to be
justified. Our goal was to construct the smallest number of categories consistent with a
transparent presentation of the results of the analysis. This analysis resulted in categories
1 through 15. During the analysis of the single-file MRs, it was apparent that we needed
one new category (category 16) to represent the problems encountered in writing
software to deal with hardware interfaces. While this category could be construed as
merely an interface misuse problem, it reflects a problem that occurs with sufficient
regularity to warrant its own category.
The categories dictated by the data are as follows:
1. Construction. These are interface faults endemic to languages that physically
separate the interface specification from the implementation code. For example, in
the C-based system under study, problems associated with using the #include
directive caused many of the MRs in this category (a small illustrative example follows this list).
2. Inadequate functionality. These are faults caused by the fact that some part of the
system assumed, perhaps implicitly, a certain level of functionality not provided by
another part of the system.
3. Disagreements on functionality. These are faults caused, most likely, by a dispute
over the proper location of some functional capability in the software.
4. Changes in functionality. A change in the functional capability of some unit was
made in response to a changing need.
5. Added functionality. A completely new functional capability was recognized and
requested as a system modification.
6. Misuse of interface. These are faults arising from a misunderstanding of the
required interface among separate units.
7. Data structure alteration. Either the size of a data structure was inadequate or it
failed to contain a sufficient number of information fields.
8. Inadequate error processing. Errors were either not detected or not handled
properly.
9. Additions to error processing. Either changes in other units dictated changes in the
handling of errors or error reporting was inadequate.
10. Inadequate postprocessing. These faults reflect the failure to satisfy obligations
(see [Perry 86]) entailed by the execution of procedures and functions. One example is the
failure to free the computational workspace of information no longer required.
11. Inadequate interface support. The actual functionality supplied was inadequate to
support the specified capabilities of the interface.
12. Initialization/value errors. A failure to initialize or assign an appropriate value to a
data structure caused these faults.
13. Violation of data constraints. A specified relationship among data items was not
supported by the implementation.
14. Timing/performance problems. These faults were caused by inadequate
synchronization among communicating processes and failure to meet either
explicit or implicit performance criteria.
15. Coordination of changes. Someone failed to communicate modifications to one
software unit to those responsible for other units that depend on the first. There are
two kinds of coordination problems: explicit — for example, where a structure
field is deleted — and implicit — for example, where the coordination required
concerns the implications of a change.
16. Hardware/Software Interfaces. These faults arise from inadequate software
handling of hardware devices.
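As an illustration of category 1 (construction), consider the following sketch; the file and identifier names are ours, not drawn from the system studied. The interface is specified in a global header, but one implementation file re-declares it locally instead of including the header, so a later change to the header never reaches that file.

    /* defs.h -- global header: the interface specification */
    #define MAXUSERS 64
    extern int lookup(int id);

    /* table.c -- implements the interface and includes the header */
    #include "defs.h"
    int table[MAXUSERS];
    int lookup(int id) { return table[id % MAXUSERS]; }

    /* client.c -- construction fault: defs.h is not included; the local
     * re-declaration below compiles cleanly today, but if lookup() later
     * changes in defs.h (say, it gains a second argument), this file
     * still compiles without complaint, and in the pre-ANSI C of the
     * system studied the mismatch surfaces only at integration testing. */
    extern int lookup();

    int probe(int id)
    {
        return lookup(id);
    }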
4. Summary of Initial Results
To obtain a sample of tractable size for detailed examination, we used a random process
to select 94 MRs. Of these, 85 MR descriptions provided sufficient information to
support an analysis of their characteristics. Each of these 85 MR descriptions contained a
statement of the fault, as perceived by the tester who initiated the MR, as well as a
statement of the repair activity undertaken by the developer responsible for the affected
files. A primary purpose for analyzing this sample is to describe some ways in which the
interface problem contributes materially to the rapidly escalating costs of software
development. Another purpose is to derive preliminary suggestions for environmental
changes to minimize the occurrence of interface errors. Finally, the results of the analysis
demonstrate that the problem should be studied in greater depth.
Figure 1: Comparison of the Operational Definition Errors
12.9 Construction
12.9 Inadequate Functionality
2.4 Disagreements on Functionality
1.2 Changes in Functionality
2.4 Added Functionality
3.5 Misuse of Interface
9.4 Data Structure Alteration
15.3 Inadequate Error Processing
3.5 Additions to Error Processing
10.6 Inadequate Postprocessing
7.1 Inadequate Interface Support
3.5 Initialization/value Errors
5.9 Violation of Data Constraints
4.7 Timing/Performance Problems
4.7 Coordination of Changes
The most significant categories in the data are inadequate error processing, construction,
inadequate functionality, inadequate postprocessing, and data structure alteration.
Changes in, additions to, and disagreements about functionality are the least significant
categories.
The reader is referred to [Perry 85] for a discussion of the possible causes and potential
solutions for each of these categories. It should be clear from the data that there is no
single category whose solution would provide a significant reduction in the number of
interface errors. Only by attacking an appropriate number of these problems will there be
a significant reduction in the number of interface errors.
We then partitioned the categories into conceptually related subsets. In Table 1a, we
order the conceptual categories from most significant to least significant. Without regard
to the amount of effort required to address the individual categories, the problems of
incorrect handling of data and errors, and the problems with functionality offer the
greatest promise for concentrated work. (Please note that the Ref. Nos column in the table
below refers to the category numbers presented in section 3 above.)

Table 1a. Conceptual Categories - Operational Definition Errors

    Category                      Ref. Nos.*                           Tot. %
    Data                          7, 12, 13                              18.8
    Functionality                 2, 3, 4, 5                             18.8
    Error Handling                8, 9                                   18.8
    Construction                  1                                      12.9
    Inadequate Postprocessing     10                                     10.6
    Interface Misuse/Support      6, 11, 16                              10.6
    Timing/Performance            14                                      4.7
    Coordination                  15                                      4.7
A further conceptual simplification is possible (and one that was not presented in the
initial paper): reducing the categories roughly into design and implementation problem
categories. The problems of functionality and additions to error processing are, in
general, design problems rather than implementation problems. Data structure alteration
problems, however, may encompass both design and implementation problems. For
example, insufficient space for arrays and missing fields in structures tend to be
design problems, although missing fields may be considered an implementation problem
in the presence of cryptic design specifications. In this case, the data supports our claim
that these categories are design problems and not implementation problems. The
remaining categories are basically implementation problems. We present this
classification for the previously analysed error set in Table 1b. Slightly less than one-
third of these prima facie interface faults have their roots in design flaws while slightly
more than two-thirds are implementation flaws.

Table 1b. Design/Implementation - Operational Definition Errors

    Category          Ref. Nos.*            Tot. %
    Design            2 - 5, 7, 9             31.8
    Implementation    1, 6, 8, 10 - 16        68.2
Another way of viewing these various interface problems is to emphasize those activities
most likely to have an impact on these problems. Table 1c groups the categories
according to similarity of recommended corrective technique (again, see [Perry 85] for
our discussion of possible causes and solutions). This grouping is not a partition, of
course, since more than one technique might be applied to a particular category.

Table 1c. Categories of Correction - Operational Definition Errors

    Category                        Ref. Nos.*                            Tot. %
    General Methodology Problems    2, 3, 6, 7, 8, 9, 10, 12, 13, 14        71.8
    Methodology Tools               8, 10, 12, 13, 14                       40.0
    Methodology Enforcement         1, 2, 11                                32.9
    Traceability Enforcement        4, 8                                    16.5
    Training                        2                                       12.9
    Management                      15                                       4.7
    Customer Interaction            4                                        1.2
5. Analysis of the Single File Errors
Single-file interface faults tend to be problems in using rather than in defining interfaces.
Of course, where the definition of an interface changes, all those uses of that interface
must change as a result in order to maintain consistency between the definition and all
uses. Where interfaces are not well specified, clusterings of problems occur in their use.
It was those kinds of problems that we sought to catch in our operational definition of
interface errors. However, those are not the only instantiations of interface faults.
According to the definition we have accepted (cf. [Basili 84]), any problems
associated with using structures defined outside the user are interface problems. In this
single file set of interface errors, we have what are essentially isolated cases of interface
use problems, including bits and pieces of previous changes that were not adequately
implied or caught in the set of multiple-file changes.
For the single-file MR group, we used the same random process as we used for the
previous set to select 96 MRs. Of these 96 MRs, 49 were MRs which contained interface
faults — that is, 51.04% of the single-file problems were interface faults. As the single-
file MR set accounted for 64.13% of the total error population, 32.73% (that is, 51.04%
of 64.13%) of the total error population are single-file interface faults.
Given that these are single-file interface faults, it is not surprising that the significant fault
categories are inadequate error processing, coordination of changes, misuse of interfaces,
inadequate postprocessing, and hardware/software interfaces. Construction problems,
disagreements on functionality, additions to error processing, and violations of data
constraints are the least significant; added functionality, data structure alteration, and
timing/performance problems are not represented at all.
Figure 2: Comparison of the Single File Errors
2.0 Construction
6.1 Inadequate Functionality
2.0 Disagreements on Functionality
6.1 Changes in Functionality
0.0 Added Functionality
10.2 Misuse of Interface
0.0 Data Structure Alteration
20.4 Inadequate Error Processing
2.0 Additions to Error Processing
10.2 Inadequate Postprocessing
4.1 Inadequate Interface Support
8.2 Initialization/value Errors
2.0 Violation of Data Constraints
0.0 Timing/Performance Problems
16.3 Coordination of Changes
10.2 Hardware/Software Interface
The ordering of the conceptual categories changes radically as indicated in Table 2a.
Construction and data categories drop from the top half to the bottom half; functionality
drops as well, but remains in the top half of the ranking. Interface misuse/support
problems are much more important in this set (partially because of the hardware/software
interface problems); error handling problems remain significant; and coordination, not
surprisingly, assumes much more importance (a reasonable explanation for this latter
phenomenon is that these problems are bits and pieces of previous changes that were not
adequately implied or caught in the set of multiple-file changes). The problem of
inadequate postprocessing remains at the same level as in the previous set.
Table 2a. Conceptual Categories - Single File Errors

    Category                     Ref. Nos.*        Tot. %
    Interface Misuse/Support     6, 11, 16           24.5
    Error Handling               8, 9                22.5
    Coordination                 15                  16.3
    Functionality                2, 3, 4, 5          14.3
    Inadequate Postprocessing    10                  10.2
    Data                         7, 12, 13           10.2
    Construction                 1                    2.0
    Timing/Performance           14                   0.0
The ratio of design to implementation problems is also quite different in the single-file
MR set: design faults are approximately half of what they were in the prima facie set.

Table 2b. Design/Implementation - Single File Errors

    Category          Ref. Nos.*            Tot. %
    Design            2 - 5, 7, 9             16.3
    Implementation    1, 6, 8, 10 - 16        83.7
In comparing corrective techniques for single-file faults with those of multiple-file faults,
we note that general methodology problems are still the most significant problem, but
less so here than in the previous set. Methodology enforcement drops significantly in this
fault set, while traceability enforcement and management corrective techniques rise
significantly.
Table 2c. Categories of Correction - Single File Errors

    Category                        Ref. Nos.*                               Tot. %
    General Methodology Problems    2, 3, 6, 7, 8, 9, 10, 12, 13, 14, 16       61.4
    Methodology Tools               8, 10, 12, 13, 14                          40.8
    Traceability Enforcement        4, 8                                       26.5
    Management                      15                                         16.3
    Methodology Enforcement         1, 2, 11                                   12.2
    Training                        2                                           6.1
    Customer Interaction            4                                           6.1
6. Analysis of All Interface Errors
Given that the two sets of results provide such a different view of interface problems, we
present an integration of the results in order to gain a better perspective on the interface
problem as a whole. One way of integrating the results would be to add the numbers of
faults in each category and divide by the total number of interface faults (x/134) in order
to gain the integrated whole percentages. However, as the number of prima facie MRs is
larger than the number of single-file MRs, this would improperly bias the results.
Another way would be to add the percentages presented in Figures 1 and 2 and divide
by 2. This approach would give a fairly good approximation for the relationships
between the different categories, but would weight the findings slightly towards the
single-file set of errors. In order to obtain a representative combination of results, we
have weighted the two sets according to their proportion of the interface fault set
(52.289% for the operational definition set and 47.711% for the single-file set; the reader
will find the individual proportions in Appendix A).
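As a sketch of this weighting (our worked example, using the inadequate error processing category from Figures 1 and 2), each category percentage is scaled by its set's share of the interface fault population and the two contributions are summed:

    #include <stdio.h>

    int main(void)
    {
        /* Inadequate error processing, from Figures 1 and 2. */
        double prima_facie = 15.3;    /* % within the operational definition set */
        double single_file = 20.4;    /* % within the single-file set            */

        /* Weight each set by its share of the combined interface fault set. */
        double combined = prima_facie * 0.52289 + single_file * 0.47711;

        printf("combined share: %.1f%%\n", combined);   /* about 17.7%, as in Figure 3 */
        return 0;
    }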
Inadequate error processing, inadequate postprocessing, coordination of changes, and
inadequate functionality are the most significant categories. It is interesting to note that
inadequate postprocessing assumes a more significant position in the integrated results
than it had in either of the individual results because of its consistency across both fault
sets. Among the less significant categories are additions to error processing,
timing/performance problems, disagreements about and additions to functionality.
Figure 3: Comparison of All the Interface Errors
7.7 Construction
9.7 Inadequate Functionality
2.2 Disagreements on Functionality
3.5 Changes in Functionality
1.2 Added Functionality
6.7 Misuse of Interface
4.9 Data Structure Alteration
17.7 Inadequate Error Processing
2.8 Additions to Error Processing
10.4 Inadequate Postprocessing
5.6 Inadequate Interface Support
5.7 Initialization/value Errors
4.1 Violation of Data Constraints
2.5 Timing/Performance Problems
10.3 Coordination of Changes
4.9 Hardware/Software Interface
As would be expected in the integration of two disparate sets of results, the resulting
conceptualization differs from the two previous sets, though the results are closer to those
of the single-file set than to the prima facie set.
Table 3a. Conceptual Categories - Combined Errors

    Category                     Ref. Nos.*        Tot. %
    Error Handling               8, 9                20.5
    Interface Misuse/Support     6, 11, 16           17.2
    Functionality                2, 3, 4, 5          16.7
    Data                         7, 12, 13           14.7
    Inadequate Postprocessing    10                  10.4
    Coordination                 15                  10.3
    Construction                 1                    7.7
    Timing/Performance           14                   2.5
It is interesting to note that three-fourths of all the interface errors are implementation
errors and only one-fourth are design errors. Independent of the amount of effort required
for the different fault categories, it appears that concentration on implementation
problems would yield the most significant results.

Table 3b. Design/Implementation - Interface Errors

    Category          Ref. Nos.*            Tot. %
    Design            2 - 5, 7, 9             24.4
    Implementation    1, 6, 8, 10 - 16        75.6
General methodology and methodology tools are the significant categories of corrective
techniques; methodology and traceability enforcement are moderately significant; and
management, training, and customer interaction are the least significant.
Table 3c. Categories of Correction - Combined Errors

    Category                        Ref. Nos.*                               Tot. %
    General Methodology Problems    2, 3, 6, 7, 8, 9, 10, 12, 13, 14, 16       71.4
    Methodology Tools               8, 10, 12, 13, 14                          40.4
    Methodology Enforcement         1, 2, 11                                   23.1
    Traceability Enforcement        4, 8                                       21.3
    Management                      15                                         10.3
    Training                        2                                           9.7
    Customer Interaction            4                                           3.5
7. Interface Errors and the Total Error Population
Another way of characterizing the relationships between the different interface fault
categories is to determine their proportion to the entire error population. We present this
comparison to provide an indication of the individual interface fault categories in
relationship to the entire set of errors.
Figure 4: Percentage of the Interface Errors in the Total Population
5.3 Construction
6.6 Inadequate Functionality
1.5 Disagreements on Functionality
2.4 Changes in Functionality
.8 Added Functionality
4.6 Misuse of Interface
3.4 Data Structure Alteration
12.2 Inadequate Error Processing
1.9 Additions to Error Processing
7.1 Inadequate Postprocessing
3.9 Inadequate Interface Support
3.9 Initialization/value Errors
2.8 Violation of Data Constraints
1.7 Timing/Performance Problems
7.0 Coordination of Changes
3.3 Hardware/Software Interface
Table 4a. Conceptual Categories - Interface Errors, % of Total Errors

    Category                     Ref. Nos.*        Tot. %
    Error Handling               8, 9                14.1
    Interface Misuse/Support     6, 11, 16           11.8
    Functionality                2, 3, 4, 5          11.4
    Data                         7, 12, 13           10.1
    Inadequate Postprocessing    10                   7.1
    Coordination                 15                   7.0
    Construction                 1                    5.3
    Timing/Performance           14                   1.7
In the proportional design/implementation categorization, it is worth noting that
approximately 52% of the errors are interface implementation errors. This percentage is
then a lower bound on the number of implementation errors in the entire error set. The
percentage of interface design faults also forms a lower bound on the number of design
errors in the entire error set.
In order to gain a better perspective on this way of characterizing the faults that occur, we
need to look at the remaining single-file, non-interface MRs to see what the ratio of
design to implementation errors is and then integrate those findings with this ratio.

Table 4b. Design/Implementation - Interface Errors, % of Total Errors

    Category          Ref. Nos.*            Tot. %
    Design            2 - 5, 7, 9             16.7
    Implementation    1, 6, 8, 10 - 16        51.9
Similarly, for a truer perspective of the techniques for attacking these problems, we need
to look at the remaining single-file errors. Lacking that analysis, we can only view these
findings on corrective techniques as lower bounds on the problem.
Table 4c. Categories of Correction - % of Total Errors

    Category                        Ref. Nos.*                               Tot. %
    General Methodology Problems    2, 3, 6, 7, 8, 9, 10, 12, 13, 14, 16       49.0
    Methodology Tools               8, 10, 12, 13, 14                          27.7
    Methodology Enforcement         1, 2, 11                                   15.8
    Traceability Enforcement        4, 8                                       14.6
    Management                      15                                          7.0
    Training                        2                                           6.6
    Customer Interaction            4                                           2.4
8. Summary
In our initial paper [Perry 85] we showed that interface faults were significant and
presented a categorization of the problems encountered in the detailed analysis of a prima
facie set of interface faults. This fault set accounted for 35.87% of the entire error
population (and over half of the 68.6% of the error population that are interface faults).
In this paper, we analysed the single-file interface faults and found our original
categorization to be adequate for describing the faults encountered, except for the need
for one new category: hardware/software interface problems (which in a sense is a special
case of the interface misuse category, but one that occurs with regularity in systems that
must deal with devices). This fault set accounted for 32.73% of the entire population.
We then integrated the results of the analyses in order to gain a balanced picture of
interface problems and presented them in two forms: the first showing the relationship
between the different interface fault categories and the second showing their relationship
in proportion to the entire error set.
The addition of the single-file faults changed the complexion of our initial results
significantly. Error handling problems, while significant in the initial set, assumed even
more significance in the combined set. Interface misuse/support problems assumed much
more significance in the combined results. In general, implementation problems were
much more significant in the single-file set than in the initial set (though they were
significant there as well). However, there are several things to consider before putting
too much emphasis on this weight towards implementation problems:
• a sampling of single-file errors should be examined to determine the ratio of design
to implementation in the non-interface problems; and
• a sampling of feature MRs should be examined to determine the extent to which they
reflect truly new features versus the extent to which they reflect design flaws, or
perhaps requirement specification problems.
Once this has been done (we are in the process of doing this analysis), we will be able to
present a balanced picture of the dichotomy between design and implementation
problems for this large, real-time system.
Given the reinforcement of several of our initial results, we reiterate three important
issues.
• We have provided a categorization of interface faults found in this system. However,
there is no data on the amount of effort associated with each category for
determination and repair. If we can weight the categories, we can then provide a
more accurate picture of interface problems and methods of reducing them.
• As in the initial paper, the extended results confirm the fact that no single technique
geared toward one specific category will yield significant results. Only by providing
mechanisms that attempt to solve a reasonable number of the problems will we see a
significant reduction in interface faults.*
• The categorization of corrective techniques very clearly demonstrates that difficulties
with methodology underlie the great majority of interface problems. Resources spent
on a carefully tracked, commonly understood, and generally enforced methodology
that addresses the issues raised in our data analysis would have a high payoff.*
Acknowledgements.
We wish to thank G. G. Lecompte for his invaluable help in analysing the single-file
interface faults.
* The Inscape Environment [Perry 86a] is one such attempt. The environment focuses on the use of formal module interface specifications (which by themselves provide means of reducing a number of interface problems). Inscape incorporates knowledge of the specifications, the programming language, and the semantics of program construction to provide an environment which participates in the process of constructing and evolving software systems. See [Perry 86b] for a discussion of how the Inscape Environment has the potential of preventing a little over 73% of the interface faults, and thus, over 50% of the total error population.
References
[Basili 84]       Victor R. Basili and Barry T. Perricone, Software Errors and Complexity: an Empirical Investigation, Communications of the ACM, 27:1 (January 1984), 42-52.
[Bowen 80]        John B. Bowen, Standard Error Classification to Support Software Reliability Assessment, AFIPS Conference Proceedings, 1980 National Computer Conference, 1980, 697-705.
[Endres 75]       Albert Endres, An Analysis of Errors and Their Causes in System Programs, IEEE Transactions on Software Engineering, Vol. SE-1, No. 2, (June 1975), 140-149.
[Glass 81]        Robert L. Glass, Persistent Software Errors, IEEE Transactions on Software Engineering, Vol. SE-7, No. 2, (March 1981), 162-168.
[Ostrand 84]      Thomas J. Ostrand and Elaine J. Weyuker, Collecting and Categorizing Software Error Data in an Industrial Environment, The Journal of Systems and Software, 4 (1984), 289-300.
[Perry 85]        D. E. Perry and W. M. Evangelist, An Empirical Study of Software Interface Faults, Proceedings of the New Directions In Computing Conference, Trondheim, Norway, (August 1985), 32-37.
[Perry 86a]       Dewayne E. Perry, The Inscape Program Construction and Evolution Environment, submitted to the Practical Software Development Environments Conference to be held in Palo Alto CA, (December 1986).
[Perry 86b]       Dewayne E. Perry, Programmer Productivity in the Inscape Environment, Proceedings of GLOBECOM 86, Houston TX, (December 1986).
[Schneidewind 79] N. F. Schneidewind and Heinz-Michael Hoffmann, An Experiment in Software Error Data Collection and Analysis, IEEE Transactions on Software Engineering, Vol. SE-5, No. 3, (May 1979), 276-286.
[Thayer 78]       Thomas A. Thayer, Myron Lipow, and Eldred C. Nelson, Software Reliability - A Study of Large Project Reality, TRW Series of Software Technology, Volume 2. North-Holland, 1978.
Appendix A
Figure A is obtained from Figure 1 by taking those percentages and multiplying by the
proportion of the total interface error set that is represented by the prima facie fault set,
52.289%.
Figure A: Weighted Percentage of the Operational Definition Errors
6.767 Construction
6.767 Inadequate Functionality
1.230 Disagreements on Functionality
.615 Changes in Functionality
1.230 Added Functionality
1.845 Misuse of Interface
4.922 Data Structure Alteration
7.997 Inadequate Error Processing
1.845 Additions to Error Processing
5.536 Inadequate Postprocessing
3.691 Inadequate Interface Support
1.845 Initialization/value Errors
3.076 Violation of Data Constraints
2.461 Timing/Performance Problems
2.461 Coordination of Changes
.000 Hardware/Software Interface
Figure B is obtained from Figure 2 by taking those percentages and multiplying by the
proportion of the total interface error set that is represented by the single-file fault set,
47.711%.
Figure B: Weighted Percentage of the Single File Errors
.974 Construction
2.921 Inadequate Functionality
.974 Disagreements on Functionality
2.921 Changes in Functionality
.000 Added Functionality
4.868 Misuse of Interface
.000 Data Structure Alteration
9.737 Inadequate Error Processing
.974 Additions to Error Processing
4.868 Inadequate Postprocessing
1.948 Inadequate Interface Support
3.895 Initialization/value Errors
.974 Violation of Data Constraints
.000 Timing/Performance Problems
7.790 Coordination of Changes
4.868 Hardware/Software Interface