Software Quality: Testing and Verification I
© Lethbridge/Laganière 2001
Chapter 9: Architecting and designing software
Software Flaws are identified at three levels
1. A failure is an unacceptable behaviour exhibited by a system.
— The frequency of failures measures software reliability: a low failure rate means high reliability.
— Failures result from violation of a requirement.
2. A defect is a flaw that contributes to a failure.
— It might take several defects to cause one failure.
3. An error is a mistaken decision, made during software development, that leads to a defect.
Eliminating Failures: Testing vs Verification
Testing = running the program with a set of inputs to gain confidence that the software has few defects.
— Goal: reduce the frequency of failures.
— When done: after the programming is complete.
— Methodology: develop test cases; run the program with each test case.
Verification = formally proving that the software has no defects.
— Goal: eliminate failures.
— When done: before and after the programming is complete.
— Methodology: write separate specifications for the code; prove that the code and the specifications are mathematically equivalent.
Effective and Efficient Testing
Effective testing uncovers as many defects as possible.
Efficient testing finds defects using the fewest possible tests.
•Good testing is like detective work:
—The tester must try to understand how programmers and designers think, so as to better find defects.
—The tester must cover all the use case scenarios and options.
—The tester must be suspicious of everything.
—The tester must not take a lot of time.
•The tester is not the programmer.
Testing Methods
1. Black box: Testers run the software with a collection of inputs and observe the outputs.
— None of the source code or design documentation is available.
2. Glass box (aka ‘white-box’ or ‘structural’): Testers watch all the steps taken by the software during a run.
— Testers have access to the source code and documentation.
— Individual programmers often use glass-box testing to verify their own code.
Equivalence classes
•It is impossible to test a software product by brute force, using every possible input value.
•So a tester divides all the inputs into groups that will be treated similarly by the software.
—These groups are called equivalence classes.
—A representative from each group is called a test case.
—The assumption is that if the software has no defects for the test case, then it will have no defects for the entire equivalence class.
•This approach is practical, but it is also flawed: it will not find all defects.
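As a minimal sketch of this idea, assuming a hypothetical validator for a month number in the range 1-12, one representative test case is chosen per equivalence class:

```java
// Hedged sketch: a hypothetical validator illustrating one representative
// test case per equivalence class ([-inf..0], [1..12], [13..inf]).
public class MonthValidator {
    public static boolean isValidMonth(int month) {
        return month >= 1 && month <= 12;
    }

    public static void main(String[] args) {
        // Representatives: -1 and 45 should be rejected, 5 accepted.
        System.out.println(isValidMonth(-1)); // false
        System.out.println(isValidMonth(5));  // true
        System.out.println(isValidMonth(45)); // false
    }
}
```

If the validator handles each representative correctly, the assumption is that it handles the rest of that class correctly too; as noted above, that assumption can fail.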
Examples of equivalence classes
1. Valid input is a month number (1-12). Equivalence classes could be: [-∞..0], [1..12], [13..∞].
— E.g., the three test cases could be -1, 5, and 45.
2. Valid input is a course id, with a department name (e.g., CSCI), a 3-digit number (e.g., 260) in the range 001-499, and an optional section (e.g., A, B, C, D, or E). Equivalence classes (test cases) could be:
— A valid course id from each one of the 25 departments, each having a 3-digit number in the range 001-499.
— A valid course id with a section.
— A course id with an invalid department name.
— A course id with an invalid number.
— A course id with an invalid section.
Fighting combinatorial explosion
•Combinatorial explosion means that you cannot realistically use a test case from every combination of equivalence classes across the system.
—E.g., with just 5 inputs and 10 possible values each, the system has 10^5 = 100,000 combinations of equivalence classes.
•So…
—Make sure that at least one test case represents each equivalence class of every input.
—Include test cases just inside the boundaries of the input values.
—Include test cases just outside the boundaries.
—Include a few random test cases.
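The boundary-focused strategy above can be sketched for a hypothetical range check on [1..12]:

```java
// Hedged sketch: boundary-value cases for a hypothetical range check [1..12].
public class BoundaryCases {
    static boolean inRange(int n) {
        return n >= 1 && n <= 12;
    }

    public static void main(String[] args) {
        // Just inside the boundaries:
        System.out.println(inRange(1) && inRange(12)); // true
        // Just outside the boundaries:
        System.out.println(inRange(0) || inRange(13)); // false
        // A few random interior and exterior cases:
        System.out.println(inRange(7));   // true
        System.out.println(inRange(-40)); // false
    }
}
```

Boundary cases are emphasized because off-by-one defects (e.g., writing `>` instead of `>=`) only reveal themselves at the edges of a range.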
Common Programming Errors
1. Incorrect logical conditions on loops and conditionals.
E.g., the landing gear must be deployed whenever the plane is within 2 minutes from landing or takeoff, or within 2000 feet from the ground. If visibility is less than 1000 feet, then the landing gear must be deployed whenever the plane is within 3 minutes from landing or lower than 2500 feet.

if (!landingGearDeployed &&
    (min(now - takeoffTime, estLandTime - now) < (visibility < 1000 ? 180 : 120)
     || relativeAltitude < (visibility < 1000 ? 2500 : 2000))) {
  throw new LandingGearException();
}
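One way to make such a condition reviewable is to name its parts. The sketch below assumes the slide's variables (now, takeoffTime, estLandTime, visibility, relativeAltitude) and substitutes IllegalStateException for the slide's LandingGearException:

```java
public class LandingGearCheck {
    // Sketch only: same rule as the one-line condition, decomposed into
    // named booleans so each clause can be checked against the requirement.
    static void check(boolean landingGearDeployed,
                      long now, long takeoffTime, long estLandTime,
                      int visibility, int relativeAltitude) {
        boolean lowVisibility = visibility < 1000;
        long timeThresholdSec = lowVisibility ? 180 : 120;
        int altitudeThresholdFt = lowVisibility ? 2500 : 2000;
        long secsFromTakeoffOrLanding =
                Math.min(now - takeoffTime, estLandTime - now);
        boolean gearRequired =
                secsFromTakeoffOrLanding < timeThresholdSec
                || relativeAltitude < altitudeThresholdFt;
        if (!landingGearDeployed && gearRequired) {
            // Stand-in for the slide's LandingGearException.
            throw new IllegalStateException("landing gear must be deployed");
        }
    }
}
```

Each named boolean now corresponds to one clause of the requirement, so a reviewer (or an equivalence-class test) can check them independently.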
2. Performing a calculation in the wrong part of a control construct.
E.g.:
while (j < maximum) {
  k = someOperation(j);
  j++;
}
if (k == -1) signalAnError();
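The check after the loop only sees the last iteration's result (and fails outright if the loop never runs). A hedged corrected sketch, with a hypothetical someOperation that returns -1 to signal an error, moves the check inside the loop:

```java
public class LoopCheck {
    // Hypothetical operation: returns -1 to signal an error (here, for j == 3).
    static int someOperation(int j) {
        return j == 3 ? -1 : j * 2;
    }

    // The check belongs inside the loop, next to the calculation, so an
    // error from any iteration is caught, not just the last one.
    static boolean processAll(int maximum) {
        for (int j = 0; j < maximum; j++) {
            int k = someOperation(j);
            if (k == -1) {
                return false; // signal the error as soon as it occurs
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(processAll(3)); // true: j = 0..2 never fails
        System.out.println(processAll(5)); // false: j = 3 is caught
    }
}
```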
3. Not terminating a loop or recursive method properly.
E.g.:
while (i < courses.size())
  if (id.equals(courses.getElement(i))) … ;
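A corrected sketch of that search loop (using java.util.List in place of the slide's courses collection); the increment guarantees termination even when no element matches:

```java
import java.util.List;

public class CourseSearch {
    static int indexOf(List<String> courses, String id) {
        int i = 0;
        while (i < courses.size()) {
            if (id.equals(courses.get(i))) {
                return i;
            }
            i++; // forgetting this increment is the defect on the slide
        }
        return -1; // loop terminates even when id is absent
    }
}
```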
4. Not enforcing the preconditions (correctly) in a use case.
E.g., failure to check that a courseOffering is not full before adding a student to its class list.
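A hedged sketch of enforcing that precondition (class and method names are illustrative, not the book's):

```java
import java.util.ArrayList;
import java.util.List;

public class CourseOffering {
    private final int capacity;
    private final List<String> classList = new ArrayList<>();

    public CourseOffering(int capacity) {
        this.capacity = capacity;
    }

    // Precondition check: refuse to overfill the offering.
    public boolean addStudent(String studentId) {
        if (classList.size() >= capacity) {
            return false; // offering is full; reject instead of overfilling
        }
        classList.add(studentId);
        return true;
    }

    public int enrolled() {
        return classList.size();
    }
}
```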
5. Not handling null conditions (null references) properly.
E.g., a Student with no schedule.
6. Not handling singleton conditions (one or zero of something that is normally more than one).
E.g., a schedule with 0 courses in it.
7. Off-by-one errors.
E.g.: for (i = 1; i < arrayname.length; i++) { /* do something */ }
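In that loop, starting the index at 1 silently skips element 0, since Java arrays are zero-indexed. A corrected sketch:

```java
public class OffByOne {
    static int sum(int[] values) {
        int total = 0;
        // Java arrays are zero-indexed: start at 0, not 1.
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[]{5, 1, 2})); // 8 (starting at 1 would give 3)
    }
}
```

This is also why the earlier advice about boundary test cases matters: only an input that exercises the first element exposes this defect.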
8. Operator precedence errors.
E.g., x*y+z instead of x*(y+z).
9. Use of inappropriate standard algorithms.
E.g., a non-stable sort where a stable one is needed.
Defects in Numerical Algorithms
1. Not enough bits or digits (magnitude/overflow).
2. Not enough decimal places (precision).
3. Ordering operations poorly, allowing errors to propagate.
4. Assuming exact equality between two floating point values.
E.g., use abs(v1 - v2) < epsilon instead of v1 == v2.
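Point 4 can be demonstrated directly in Java, where 0.1 + 0.2 does not equal 0.3 exactly:

```java
public class FloatCompare {
    static final double EPSILON = 1e-9;

    // Compare floating-point values by tolerance, never by ==.
    static boolean approxEqual(double v1, double v2) {
        return Math.abs(v1 - v2) < EPSILON;
    }

    public static void main(String[] args) {
        double sum = 0.1 + 0.2; // actually 0.30000000000000004
        System.out.println(sum == 0.3);            // false: exact equality fails
        System.out.println(approxEqual(sum, 0.3)); // true: tolerance comparison works
    }
}
```

The right epsilon depends on the magnitudes involved; a fixed absolute tolerance like this one is a simplification.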
Defects in Timing and Co-ordination
Critical race
— One thread fails because another thread interferes with the ‘normal’ sequence of events.
— Critical races can be prevented by locking data so they cannot be accessed by another thread simultaneously. In Java, a synchronized method locks an object until the method terminates.
E.g., consider two students wanting to add the same courseOffering to their schedules at the same time. These two threads must be synchronized in order to prevent a critical race.
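A hedged sketch of that example (class and field names assumed): without synchronized, two student threads could both pass the "not full" check before either registers; with it, the check-then-increment sequence is atomic.

```java
public class SafeOffering {
    private final int capacity;
    private int enrolled = 0;

    public SafeOffering(int capacity) {
        this.capacity = capacity;
    }

    // synchronized locks this object for the whole method, so no other
    // thread can run addStudent (or enrolled) on it at the same time.
    public synchronized boolean addStudent(String studentId) {
        if (enrolled >= capacity) {
            return false;
        }
        enrolled++;
        return true;
    }

    public synchronized int enrolled() {
        return enrolled;
    }
}
```

With a capacity-1 offering, exactly one of two concurrent addStudent calls can succeed.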
Deadlock and livelock
— Deadlock is a situation where two or more threads are stopped, each waiting for the other to do something. The system hangs and the threads cannot do anything.
— Livelock is similar, except that the threads can still do some computation, even though the system as a whole makes no progress.
E.g., consider a student wanting to access a course that another student is adding to her schedule, and the other student suspends this action and goes to lunch. How can this kind of deadlock be prevented in StressFree?
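One answer is to acquire locks with a timeout, so a suspended action cannot block everyone else indefinitely. A sketch using java.util.concurrent's ReentrantLock.tryLock (the schedule-update names are assumed, not StressFree's actual design):

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class TimedScheduleAccess {
    private final ReentrantLock lock = new ReentrantLock();

    // Returns false instead of waiting forever if another user holds the lock.
    public boolean updateSchedule(Runnable action) {
        boolean acquired;
        try {
            // Give up after 2 seconds instead of blocking indefinitely.
            acquired = lock.tryLock(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        if (!acquired) {
            return false; // report failure; the caller can retry later
        }
        try {
            action.run();
            return true;
        } finally {
            lock.unlock(); // always release, even if the action fails
        }
    }
}
```

Other standard preventions include always acquiring multiple locks in a fixed global order, or releasing locks before any long-running or user-suspendible step.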
Defects in Handling Other Unusual Situations
1. Insufficient throughput or response time.
2. Incompatibility with specific hardware/software configurations.
3. Inability to handle peak loads or missing resources.
4. Inappropriate management of resources.
5. Inability to recover from a crash.
6. Ineffective documentation (user manual, reference manual or on-line help).
Strategies for Testing Large Systems
Big bang vs integration testing
• In big bang testing, you test the entire system as a unit.
• A better strategy is incremental testing:
— First test each individual subsystem alone (unit testing).
— Then add more and more subsystems and test them together (integration testing).
— This can be done horizontally or vertically, depending on the architecture (e.g., a client-server architecture allows horizontal testing: server side first and client side second).
Top-down vs Bottom-up testing
Top-down
1. Start by testing the user interface (GUI).
— Simulate the underlying functionality using stubs (code with the same interface but no functionality).
2. Then work downward, integrating lower and lower layers one at a time.
Bottom-up
1. Start by testing the very lowest levels of the software.
— Use drivers to test these modules (drivers are simple programs that call the modules at the lower layers).
2. Then work upward, replacing the drivers with the actual modules that call the lower-level modules.
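The two techniques can be sketched together (interface and class names are illustrative): a stub implements the real interface with canned behaviour for top-down testing, while a driver is a small program that exercises a lower-layer module directly for bottom-up testing.

```java
// Assumed interface standing in for a lower layer of the system.
interface CourseCatalog {
    boolean courseExists(String id);
}

// Stub (top-down): same interface, canned behaviour, no real functionality.
class CourseCatalogStub implements CourseCatalog {
    public boolean courseExists(String id) {
        return true; // pretend every course exists
    }
}

// Driver (bottom-up): a simple program that calls a lower-layer module
// and checks its answers.
public class CatalogDriver {
    static boolean exercise(CourseCatalog catalog) {
        return catalog.courseExists("CSCI260");
    }

    public static void main(String[] args) {
        System.out.println(exercise(new CourseCatalogStub())); // true
    }
}
```

In top-down testing the stub is later replaced by the real catalog; in bottom-up testing the driver is replaced by the real upper-layer caller.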
Strategies for incremental testing
The test-fix-test cycle
When testing exposes a failure:
1. A failure report is entered into a failure tracking system.
2. The failure is screened and assigned a priority.
3. Low-priority failures might be put on a known bugs list included with the software’s release notes.
4. Some failure reports might be merged if they seem to expose the same defects.
5. The failure is investigated.
6. The defect causing the failure is tracked down and fixed.
7. A new version of the software is created, and this cycle is repeated.
The ripple effect
Efforts to remove one defect will likely add new ones:
•The maintainer tries to fix problems without fully understanding the ramifications.
•The maintainer makes ordinary human errors.
•The system can regress into a more and more failure-prone state.
Regression testing re-runs a subset of the previously successful test cases at each iteration (i.e., it focuses on the trouble spots):
•It’s expensive to re-run every test case every time the software is updated.
•Regression test cases are carefully selected to cover as much of the system as possible.
So when do we stop testing?
Stop testing when:
1. All the level 1 test cases have been successfully executed.
2. A certain predefined percentage of level 2 and level 3 test cases have been executed successfully.
3. The targets have been achieved and are maintained for at least two build cycles, where a build involves compiling and integrating all the system’s components.
Failure rates fluctuate between builds because:
— Different sets of regression tests are used, and
— New defects are introduced as old ones are fixed.
Who is involved in testing?
1. Original developers conduct the first pass of unit and integration testing.
2. A separate group of developers conducts independent testing.
— They have no vested interest, and
— They have specific expertise in test case design and test tool utilization.
3. Users and clients:
— Alpha testing: performed under the supervision of the software development team.
— Beta testing: performed in a normal work environment. (An open beta release is the release of low-quality software to the general population.)
— Acceptance testing: performed by customers on their own initiative.
Inspections
An activity in which one or more people critically examine source code or documentation, looking for defects.
•Inspections are normally team activities, with roles:
— The author
— The moderator
— The secretary
— The paraphrasers, who try to explain the code
•A peer review process.
•Inspect only completed documents.
•Complementary to testing: inspections are better at finding maintainability or efficiency defects.
•Inspect before testing.
Quality Assurance: When things go wrong…
Perform root cause analysis.
Determine whether problems are caused by:
— Lack of training
— Schedules that are too tight
— Poor designs or choices of reusable components
Measure:
— the number of failures encountered by users
— the number of failures found when testing
— the number of failures found when inspecting
— the percentage of code that is reused
— the number of questions asked by users at the help desk (as a measure of usability and the quality of documentation)
Strive for continual improvement
Software Process standards
The personal software process (PSP):
• A disciplined approach that a developer can use to improve the quality and efficiency of his or her personal work.
• One of its key tenets is personally inspecting your own work.
The team software process (TSP):
• Describes how teams of software engineers can work together effectively.
The software capability maturity model (CMM):
• Contains five levels. Organizations start at level 1, and as their processes become better they can move up towards level 5.
ISO 9000-2:
• An international standard that describes how an organization can improve its overall software process.
Difficulties and Risks in Quality Assurance
It’s easy to forget to test some aspects of a software system:
— ‘Running the code a few times’ is not enough.
— Forgetting certain types of tests impacts quality.
There’s a natural conflict between quality and meeting deadlines. So…
— Create a separate department to oversee QA.
— Publish statistics about quality.
— Plan adequate time for all activities.
People have different skills, knowledge, and preferences when it comes to quality. So…
— Assign tasks that fit their strengths.
— Train people in testing and inspecting techniques.
— Provide feedback about performance vis-à-vis software quality.
— Require developers and maintainers to work alternately on a testing team.