classification as an approachto requirements analysis

PROCEEDINGS OF THE 1ST ASIS SIG/CR CLASSIFICAnON RESEARCH WORKSHOP 131

Classification as an Approach To Requirements Analysis

James D. Palmer, Yiqing li.ang, and Lillian Wang

Center for Software Systems EngineeringSchool oflnfonnationTechnology and Engineering

George Mason University, Fairfax, VA 22030-4444, USA

A CLASSIFICATION APPROACH TO PROBLEM SOLVING

Oassification schemes have proven immensely useful in computerized systems for problem

solving in the physical and biological sciences. A typical example is found in the medical

domain where various taxonomies have been created for diseases, symptOms, laboratory tests,drugs, and so forth. These taxonomies may be used for characterizing specific clinical situations

in expert systems that assist physicians and other health professionals in diagnosis and treatment.

Oassification can be divided into several phases. The activities in the first phase are to find acategory structure which can fit observations. This phase is called "cluster analysis", or"typology", "learning", "clumping", "regionalization", etc., or "classification (construction)" in

our paper, depending on the field to which it is applied. There are many approaches to this

cluster analysis. These include approaches such as numerical taxonomy and conceptual

clustering. Once this category structure has been established, the next phase is to classify new

observations, that is, recognize them as members of one category or another. There are twodifferent situations for the activities in this phase: 1) when the category structure is completelyknown, this kind of activity is called "classification", "indexing", or "classifying" in this paper,and 2) if category structure is partly known or only part of the information of the observation is

known, this kind of activity is called "discriminant analysis" [AND73] [GOR81].

Oancey [CLA84] has characterized classification problem solving as making a selection from

a set of pre-enumerated solutions (in contrast to constructing new solutions). If the problem

solver has a priori knowledge of existing solutions and is able to relate these to the problem

description by data abstraction and refinement, then the problem can be solved using

classification. Other artificial intelligence researchers, especially those investigating machinelearning, have developed new techniques such as conceptual clustering [MIC86] (in contrast to

numerical/statistical clustering), which might be used for developing classification schemes for

problem solving.

CLASSIFICATION AS AN APPROACH TO REQUIREMENTS ANALYSIS

Software requirement<; analysis (or requirements analysis briefly) is the first important stage

in the whole software development life cycle. During this stage, requirements analysts, working

closely with users and customers, define a complete description of the external behavior of the

software system to be built. This description is usually called software requirements, software

definition, or software specification. For example, one of the requirements of a software system

Howitzer Improvement Program (HIP) for upgrading the Howitzer capability is defined as:

Requirement No.1) Default CP initialization parameters shall be stored in nonvolatilememory to maximize user friendliness.

TORONTO, NOV. 4,1990 J. PALMER, Y. LIANG & L. WANG

Palmer, J. D., Liang, Y., & Wang, L. (1990). Classification as an approach to requirements analysis. Proceedings of the 1st ASIS SIG/CR Classification Research Workshop, 131-138. doi: 10.7152/acro.v1i1.12472

ISSN: 2324-9773

132 PROCEEDINGS OF TIlE 1ST ASIS SIG/CR CLASSIFICAnON RESEARCH WORKSHOP

Many people have proposed different requirements taxonomies [SOM85] [ROM85] [FAI85]

for use during this phase. Basically requirements can be classified into functional and

nonfunctional. Each of these two categories can be sutxlivided into subclasses and so on.

Samson [SAM89] gave a rather comprehensive taxonomy of requirements. Following is a

segment of the taxonomy:

FunctionalSoftware Utility

Software Perfonnance

External Interface Requirements

Data Requirements

Logical Data Sttueture

Data Management

NonfunctionalDesign Constraints

Quality GoalsLife-cycle Constraints

SecurityOperational

Operational Constraints

Even so, requirements analysis has generally been a neglected phase for software

development. The consequences of this neglect are so serious that no one involved in software

engineering can afford to ignore them. Based on this, we suggest an early fix of such errors as

ambiguities, inconsistencies. incompleteness and incorrect facts is required (PAL88] in the

requirements analysis phase of the software engineering life-eycle.

The requirements analysis problem seems to be a good application for a knowledge-based

system for detecting conftiC$ among requirements. A proper taxonomy would provide factual

knowledge concerning this. Procedural knowledge would consist of rules for identifying

potential conflicts.. A taxonomy would facilitate evaluation of rules applied to large-scale

systems where there are thousands of requirements, and where rules that apply to top-level

conceptual nodes in the taxonomy would probably apply to descendants as well. Thus, a

taxonomy lends itself to systems that feature inheritance. Samson (SAM89] first ,Suggested a set

of rules indicating how conftiets among requirements in a taxonomy may be resolved. If such a

taxonomy were used for classifying (indexing) the requirements for a specific software system,

these rules would use this classifying (indexing) method as access points to the requirements, and

identify potential conflicts among requirements for that system. We will return to the taxonomy

problem in the next section.

The application just described, for use in requirements conflict resolution, uses a one-step

inference and a single classification structure, namely, the requirements taxonomy. Requirement

No. I can be classified asfunctional / output I data storage. Another HIP requirement is:

Requirement No.2) Automated fire control system for the MI09 self-propelled howitzer

shall require real-time processing.

This is defined as a real-time system, thus it is classified as non-functional I process

constraints I real-time. In this case, conflict is identified directly between these two

requirements. They are in conflict because Requirement No. I calls for a static system while No.

2 calls for a real-time system. This is reflected in the following conflict identification rule:

CLASSIFICAnON AS AN APPROACH TO REQUIREMENTS ANALYSIS TORONTO, NOV. 4, 1990


ISSN: 2324-9773

PROCEEDINGS OF TIIE 1ST ASIS SIG/CR CLASSIFICATION RESEARCH WORKSHOP 133

IF there is a requirement that is classified as functional I output I data storageAND IF there is a requirement that isclassified as non-functional! process constraint I real-time

THEN there is a conftict between these two requirements

In addition to conflict resolution, we believe classification is a powerful tool that may beapplied to the more complex task of software validation and verification. with the ultimateobjective of arriving at a software test plan. TIlls application would use multiple taxonomies andperform multi-stage inference. The remainder of this section describes the problem-solving steps

for this expanded application. including identifying the different taxonomies that would beneeded.

Conflict resolution as part of the activity of requirements validation is carried out in therequirements analysis stage. The implications of software validation and verification in softwarerequirements prompt another more complicated task, that of generating software test plan fromso~are requirements in requirements analysis stage.

Some ground work has been done for the implementation of this task. Much research hasbeen carried out on software metrics and testing. Many software metries have been defined andthere are several hundred commercial test tools available. However, software metries and testtools are generally applied to software projects that have been completed. Our plan is to utilizethis technology, but rather than apply it to software, we would move the operations ofmeasurement and testing to the requirements analysis stage, prior to the labor-intensive task ofwriting software. The basic idea is to apply these operations, in effect, to the softwarerequirements rather than the software itself at a much later time in the software developmentcycle. Armed with these ideas. a test plan for the software system is generated in therequirements analysis stage. resulting in the parallel development of software testing withsoftware development. Problems in identifying the testability of requirements volatility early inthe initial stage of development will encourage refinement of those requirements before muchdevelopment has been done.

At this stage, software requirements are not written in a formal language of any kind.Therefore. the problem is how to apply current technology that depends on existence of software.Our approach is to use three different taxonomies representing. respectively, softwarerequirements, software metrics, and software test tools. The following diagram shows the flow ofinference between these taxonomic components, in order to achieve software validation andverification resulting in a software test plan. Instead of one heuristic association or "great leap"(CLA84] of an abstract problem statement into features that characterize a solution, we useseveral "small leaps". As shown in the follOWing diagram, one such association occurs betweenrequirements taxonomy and software metries taxonomy, and another, between software metriestaxonomy and software test tools taxonomy.

TORONTO, NOV. 4, 1990 J. PALMER, Y. LIANG & L. WANG


ISSN: 2324-9773

134 PROCEEDINGS OF THE 1ST ASIS SIG/CR CLASSIFICATION RESEARCH WORKSHOP

virtual match

software requirements .----.> software test planI ~

I II Iv I

requirements taxonomy I refinement

I II Iv I

software metrics taxonomy ---> software test tools taxonomy. heuristic match

heuristic match

data abstraction(reqt indexing)

The doned arrow labeled vinual match indicates the final match we see as the result of a series ofheuristic matches and other steps and does not exist in itself.

An example using Requirement No. I can be traced through the whole process.. Initially thisrequirement is classified, as described earlier., as junctional/output / data storage. A heuristicmatch from this classification against the software memcs taxonomy would result in selection ofa metrics of "completeness". This. in tum, is mapped by a heuristics match to a test planconsisting of test tools such as inspection, snapshot, test case generator. Validation of thesoftware would occur when test attributes are satisfied.

Thus, completion of the steps of data abstraction, multi-stage heuristic match betweendifferent taxonomies, and refinement would provide a satisfactory solution in the form ofa testplan for the software requirements.

BRIDGING THE GAP BETWEEN REQUIREMENTS AND TAXONOMY

Oassification problem solving for requirements analysis is based on the supposition thatindividual requirements for a software system are indexed according to a taxonomy; Obviously,

classifying (indexing) requirements is the bottle-neck to the whole approach. It is especiallyprohibitive for large scale systems with thousands of requirements as the object. Neither a classicnumerical taxonomy nor conceptual clustering may apply, as observing the attributes and thevariable values of each requirement is impractical. Manual processing is certainly neithermanageable nor dependable. Natural language processing to understand requirements so as toclassify is impossible at this stage.

Our approach to classifying (indexing) a large set of software requirements specifications is to

use semi-automatic narurallanguage processing, based on syntactic analysis, coupled with humaninput using a human-computer interface. From comprehensive reading of several existingrequirements specifications, we find that individual non-functional requirements are oftencharacterized by nouns that indicate their quality goals. Thus, identification of these nounsrepresents a good first approach to classifying non-functional requirements. A requirement suchas: "The system should have the maximum availability" clearly indicates it could be classified asa non-functional quality goal with the noun "availability" as indicator. Functional requirements

CLASSIFICATION AS AN APPROACH TO REQUIREMENTS ANALYSIS TORONTO, NOV. 4, 1990


ISSN: 2324-9773

PROCEEDINGS OF TIlE 1ST ASIS SIG/CR CLASSIFICATION RESEARCH WORKSHOP 135

should, in most cases, imply an action or perfonnance attribute required for the system. Thus,verbs in functional requirements statements should reflect the requirements classification..

Reading of requirements specifications proves this assumption: verbs in the statements or thenouns or gerunds following cenain verb patterns can be used as a means to classify the

requirement

We have investigated using verbs which have been identified in previoUS research. Studies

indicate that verbs can be classified according to their functions (CH081]. But we are interested

more in software function related verbs and their taxonomy. For example, the verb CLUSTER is

most often related to software requirements that can be classified as: functional / datarequirement I data management I processing. PSUPSA (Problem Statement Language I Problem

Statement Analyzer), an automated requirements specification tool developed at the University ofMichigan [TEI77], recognizes 58 relationship types. PSL relationships may be likened to "verb"

which, together with PSL objects, serves to generate "sentence". For the Requirement No.1·example, identifying "memory" and "parameters" as PSL objects, we can relate them to one

another with the PSL relationship type "CONTAINED" to fonn the PSL sentenee "parametersCONTAINED-BY memory". However. this relationship does not indicate the functions for the

entire requirement, as we do by classifying it as functional I output I data strorage. Wood andSommerville [W0088] and Maarek and Kaiser [MAA87] have identified some basic functionsfor software components, such as control, communicate, search, etc., but do not systematically

relate them to as many verb classifications as possible.

We believe a Unix* system to be a more comprehensive system that can be used to describevarious kinds of software functions. Verb selections used in describing Unix commands provide

a list of more than 100 verbs related to and covering most software functions. Efforts were madeto generate a verb taxonomy according to their semantic functions. As discovered by [CH08t],

this taxonomy is rather bushy, not deep. For example, the verb CLUSTER has such subconceptsas categorize, classify. collect, gather, group, etc. These may be called synonyms to the primary

verb.

From this we manually map this verb taxonomy onto a software functional requirements

taxonomy and discover a set of rules. This set of rules is empirical and based on experience. It

corresponds most closely to the "rule of thumb" approach from AI. Experience to date shows thatbased on a set of 8t software requirements for HIP, this set of heuristic rules is able to correctly

classify more than 50% of the requirements. Eleven of them had apparent problems that could be

recognized; these were identified by the system. The system could resolve the problems with

these problematic requirements and classify seven of them correctly.

The approach we adopt is as follows: Lhe set of heuristic rules is first used to automaticallyclassify (index) as many requirements as possible.. For requirements that the system is not sure

how to classify, the system prompts the user with several options. And it identifies to users those

requirements it cannot classify at all and requests a rephrasing of the requirement from the user

according design templates.

• Unix is a Trade Mark of AT&T

TORONTO, NOV. 4,1990 J. PALMER, Y. LIANG & L. WANG


ISSN: 2324-9773

136 PROCEEDINGS OF TIlE 1ST ASIS SIG/CR CLASSIFICATION RESEARCH WORKSHOP

FUTURE WORK

This research is supported by the Virginia Center for Innovative Technology and the USArmy. George Mason University is working on the part of automatically classifying

requirements and metrics taxonomy, while Sonex Enterprises Inc. is working on the tools

taxonomy part. A prototype has been constructed and has been successfully demonstrated. A

system to implement the system is now under development

The three taxonomies of software requirements, software metrics, and software test tools,

especially the latter two, need much more refinement The verb taxonomy needs enhancement as

well. 1be same is troe with the roles that implement the mappings between these taxonomies.

A more schematic approach should be sought for automatically classifying requirements.

Automatic indexing teehniques in information retrieval, with a knowledge-based system to assistmay provide a good guide for future direction.

Multi-inheritance instances should be taken into consideration. Ranking techniques should be

included for the user's decision process iri selecting from matched results. This is particularlytrue when there is more than one hit resulting from a match. In our situation, where there are

several steps that do matching, with probably more than one hit from each match, this is a

necessity.

Qassification schemes other than the enumerative ones could be adopted for comparison.

The facet scheme [PRI87] should be one of those considered.

REFERENCES

[AND73] Anderberg, Michael R., Clustering Analysis for Applications, Academic Press, NY,

1973.

[CH081] Chodorow, Martin S., "Growing taxonomic word trees from dictionaries," wor1cshoppresentation, IBM Research Center, Yorktown Heights, NY, 1981.

[CLA84] Qancey, W. J., "Qassification Problem Solving," Proceedings of NationalConference on Artificial Intelligence, August 6-10, 1984, University of Texas atAustin, AAAI, Los Altos, CA, pp. 49-55.

[FAI85] Fairley, Richard E., Software Engineering Concepts, McGraw-Hill, 1985.

[GOR81] A. D. Gordon, Classification, Chapman and Hall, London, 1981.

[MAA87] Maarek, Yoelle and Gail E. Kaiser, "Using conceptual clustering for classifyingreusable Ada code," ACM SIGAda International Conference, Boston, MA, 1987,pp.208-215.

[MIC86] Michalski, Ryszard S. and Robert E. Stepp, "Learning from Observatin: Conceptual

Qustering," in Machine Learning - an Artificial Intelligence Approach, Ed. by

Ryzard S. Michalski, Jaime G. Carbonell and Tom M. Mitchell, Morgan Kaufmann,Los Altos, CA, 1986, pp.331-363.

CLASSIFICATION AS AN APPROACH TO REQUIREMENTS ANALYSIS TORONTO, NOV. 4, 1990


ISSN: 2324-9773

PROCEEDINGS OF THE 1ST ASlS SIG/CR CLASSIFICAnON RESEARCH WORKSHOP 137

[PAL88] Palmer, James, "Impact of Requirements Uncertainty on Software Productivity", inProductivity: Progress, Prospects, and Payoff, 27th Annual Technical Symposium

Sponsored by Washington, D. C. Chapter of the ACM and NBS. Gaithersburg, MD.June 1988, pp.85-90.

[PRI87] Prieto-Diaz, R. and P. Freeman, "A Software Oassification Scheme forReusability", IEEE Software, Vol. 4, No.1, January 1987, pp.6-16.

[ROM85] Roman, Gruia-Catalin, "A Taxonomy of Current Issues in RequirementsEngineering," IEEE Computer, Vol. 18, No.4, April 1985, pp. 14-22.

[SAM89] Samson, Donaldine E., Automated Assistance for Software Requirements Definition,Ph.D. Dissertation, George Mason University, Fairfax, VA, 1989.

[SOM85] Sommerville, I., Software Engineering, Addison-Wesley, 1982, 1985.

[TEI77] Teichroew, D. and E. Hershey, "PSLlPSA: A Computer Aided Technique forStructured Documentation and Analysis of Infonnation Processing System,", IEEETransactions on Software Engineering, Vol. 3, No.1, 1977, pp. 41-48.

[W0088] Wood, M. and I. Sommerville, "An Information Retrieval System for SoftwareComponents", Software Engineering Journal, Vol. 3, no. 5, September 1988, lEE,London, pp.198-207.

TORONTO, NOV. 4. 1990 J. PALMER, Y. LIANG & L. WANG


ISSN: 2324-9773

138 PROCEEDINGS OF TIIE 1ST ASIS SIG/CR CLASSIFICAnON RESEARCH WORKSHOP

TORONTO, NOV. 4, 1990


ISSN: 2324-9773

classification as an approachto requirements analysis

Documents