How good are your testers? An assessment of testing ability Liang Huang, Chris Thomson and Mike Holcombe Department of Computer Science, University of Sheffield

How good are your testers? An assessment of testing ability

Liang Huang, Chris Thomson and Mike Holcombe

Department of Computer Science, University of Sheffield

• Background

• The Experiment

• Preliminary results

• Conclusion and Future research

Background

• Test First programming

[Figure: How Test Last and Test First work respectively.
Test Last: Story -> Implementation -> Write Tests -> Run test cases -> All pass? (No: Rework; Yes: Next Story)
Test First: Story -> Write Tests -> Implementation -> Run test cases -> All pass? (No: Rework; Yes: Next Story)]
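The Test First flow can be illustrated with a minimal Python sketch. The story ("decide whether a year is a leap year"), the function names, and the test cases are all hypothetical and are not taken from the study; the sketch only shows the ordering of steps: tests first, then implementation, then rerun.

```python
# Hypothetical story: "leap_year(year) tells whether a year is a leap year".
# Step 1 (Write Tests): the test cases exist before any real implementation.
TEST_CASES = [(2000, True), (1900, False), (2024, True), (2023, False)]


def run_tests(impl):
    """Run test cases: return True only if every case passes for impl."""
    return all(impl(year) == expected for year, expected in TEST_CASES)


def leap_year_stub(year):
    """Placeholder written before the implementation; it should fail the tests."""
    return False


def leap_year(year):
    """Step 2 (Implementation): code written to satisfy the existing tests."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Step 3 (Run test cases / All pass?): "No" leads to rework, "Yes" to the next story.
first_run = run_tests(leap_year_stub)   # fails, so rework is needed
second_run = run_tests(leap_year)       # passes, so move to the next story
```

In a Test Last flow the same two functions would be written in the opposite order, with `leap_year` first and `TEST_CASES` afterwards.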

Background

• Previous Studies (TF vs TL)

TF programmers obtained higher productivity:
1) Kaufmann et al. [Kaufmann 2003]
2) Janzen et al. [Janzen 2005]

TF programmers failed to obtain higher productivity:
1) Müller et al. [Müller 2002]
2) Williams et al. [Williams 2003] and Maximilien et al. [Maximilien 2003]
3) George et al. [George 2003, 2004]
4) Macias et al. [Macias 2004]
5) Erdogmus [Erdogmus 2005]

Background

• Previous Studies (TF vs. TL)

TF programmers obtained higher external quality:
Williams et al. [Williams 2003] and Maximilien et al. [Maximilien 2003]
George et al. [George 2003, 2004]
Edwards [Edwards 2003]

TF programmers failed to obtain higher external quality:
Müller et al. [Müller 2001]
Pancur et al. [Pancur 2003]
Macias et al. [Macias 2004]
Erdogmus [Erdogmus 2005]

Background

Our initial study

Results (pertaining to effectiveness):
1) TF teams spent a higher percentage of their time on testing.
2) TF teams obtained higher productivity, but the difference was not statistically significant.
3) The minimum external quality achieved improved as the percentage of time spent on testing increased.
4) There was a linear correlation between the effort spent on testing and the effort spent on coding.

Background

• Motivation

The differences in effectiveness between TF and TL programmers are possibly due to covariates other than the treatments (testing/programming strategies).

1) TF is not easy to learn [Crispin 2006].

2) Subjects are not skilled at programming with TF.

3) Testing has an impact on code quality and productivity [Basili 1986, Stephens 2003].

It is therefore imperative to analyze the tests written by subjects and to assess the subjects’ ability to test, in order to distinguish good testers from bad ones.

The Experiment

Context: Sheffield Software Engineering Observatory

Semi-industrial setting,

Medium-sized projects,

Longer development time,

Real external clients

2 groups of subjects:

2nd and 3rd year computer science undergraduates.

4th year MEng and MSc students.

The Experiment

• Questionnaire A

Subjects were given
1) A short piece of Java code, and
2) 29 potential tests

and asked to select tests for
1) Category partition testing (22 out of 29 were necessary for the partition), and
2) Giving branch coverage (the coverage and redundant choices were calculated for each of the responses).

The testing ability was measured by
1) For category partitioning: (the number of correct choices made) - (the number of redundant choices)
2) For branch coverage: the branch coverage obtained and the number of redundant choices made
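The two measures can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' scoring procedure: a choice is taken to be "correct" if it belongs to the set of necessary tests and "redundant" otherwise, and for branch coverage a selected test is counted as redundant when it exercises no branch that an earlier selected test has not already covered. The test names, branch names, and data are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def category_partition_score(selected: List[str], necessary: Set[str]) -> int:
    """(number of correct choices) - (number of redundant choices).

    Assumption: "correct" means the test is in the necessary set,
    "redundant" means it was selected but is not in that set.
    """
    chosen = set(selected)
    correct = len(chosen & necessary)
    redundant = len(chosen - necessary)
    return correct - redundant


def branch_coverage(
    selected: List[str],
    branches_hit: Dict[str, Set[str]],
    all_branches: Set[str],
) -> Tuple[float, int]:
    """Return (fraction of branches covered, number of redundant choices).

    Assumption: a test is redundant if it adds no branch beyond those
    already covered by the tests selected before it.
    """
    covered: Set[str] = set()
    redundant = 0
    for test in selected:
        new_branches = branches_hit[test] - covered
        if new_branches:
            covered |= new_branches
        else:
            redundant += 1
    return len(covered) / len(all_branches), redundant


# Hypothetical response: three tests chosen from a small pool.
necessary = {"t1", "t2", "t3"}
response = ["t1", "t2", "t5"]
branches_hit = {"t1": {"b1", "b2"}, "t2": {"b2"}, "t5": {"b3"}}
all_branches = {"b1", "b2", "b3", "b4"}

score = category_partition_score(response, necessary)
coverage, redundant_choices = branch_coverage(response, branches_hit, all_branches)
```

With this made-up response, two of the three choices are necessary and one is not, so the category partition score is 1; three of the four branches are covered, with one redundant choice.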

The Experiment

• Procedure

1) Team and group allocation

2) Intensive training in TF

3) Software development, including group meetings, management meetings, and client meetings

4) Questionnaire distribution (before Easter vacation)

Preliminary results

• Undergraduates achieved lower marks than postgraduates in category partitioning and made more redundant choices when giving branch coverage; however, the differences were NOT statistically significant.

• Postgraduates did no better than undergraduates when giving the branch coverage.

Preliminary results

• Postgraduates were more likely to be rated “Excellent” (38% versus 21% for undergraduates) and much less likely to be rated “Poor” (13% versus 43% for undergraduates), with responses categorized as “Excellent” (70% and above), “Fair” (50%-70%) and “Poor” (50% and below).
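The categorization can be written as a small Python function. This is a sketch under assumptions: the input is taken to be a response's mark expressed as a percentage, and the boundaries follow the slide's wording literally, so exactly 70% counts as “Excellent” and exactly 50% counts as “Poor”. The function name is hypothetical.

```python
def categorize_response(mark_percent: float) -> str:
    """Map a percentage mark to the slide's three categories.

    Assumption: "70% and above" is Excellent, "50% and below" is Poor,
    and anything strictly between the two is Fair.
    """
    if mark_percent >= 70:
        return "Excellent"
    if mark_percent > 50:
        return "Fair"
    return "Poor"


# Boundary values and one value from the middle band.
labels = [categorize_response(m) for m in (50, 60, 70)]
```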

Conclusion and Future research

• Limitation

1) Student subjects,

2) Small sample size,

3) Low response rate

4) Only the ability to select tests was assessed, not the ability to write tests

5) Code-based questionnaire only

Conclusion and Future research

• Conclusion

The category partition method requires some analysis of the specification, and TF requires programmers to write tests before code.

Programmers with a higher level of expertise did better at category partitioning, but failed to do better at giving branch coverage,

which suggests that TF requires a higher level of expertise.

Conclusion and Future research

• Future Research

Questionnaires
1) which are NOT code based, and/or
2) which focus on testing at different levels
are to be distributed to a larger group of subjects with different backgrounds.

Questionnaire B (proposed)

Subjects would be given
1) A short text specification, and
2) A number of potential tests

The testing ability would be measured by
1) The number of correct choices made
2) The number of redundant choices

Thanks for listening