
7/25/2019 Reasons to Repeat Tests


Reasons to Repeat Tests

by James Bach

(with help from colleagues Doug Hoffman, Michael Bolton, Ken Pugh, Cem Kaner, Bret Pettichord, Jim Batterson, Geoff Sutton, plus numerous students who have participated in the "Minefield Debate" as part of my testing class. The minefield analogy as I talk about it was inspired by Brian Marick's talk Classic Testing Mistakes.)

Testing to find bugs is like searching a minefield for mines. If you just travel the same path through the field again and again, you won't find a lot of mines. Actually, that's a great way to avoid mines. The space represented by a modern software product is hugely more complex than a minefield, so it's even more of a problem to assume that some small number of "paths", say, a hundred, thousand, or million, when endlessly repeated, will find every important bug. As many tests as a team of testers can physically perform in a few weeks or months is still not that many tests compared to all the things that can happen to a product in the field.

The minefield analogy is really just another way of saying that testing is a sampling process, and we probably want a larger sample, rather than a tiny sample repeated over and over again. Hence the minefield heuristic is: do different tests instead of repeating the same tests.
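The sampling argument can be made concrete with a toy simulation. This is only a sketch under invented assumptions: the field size, number of mines, path length, and number of rounds are all hypothetical, and a real product's state space is nothing like a flat list of cells. It compares walking one fixed path over and over with walking that same path once and then trying fresh paths.

```python
import random


def simulate(cells=10_000, mines=100, path_len=50, rounds=20, seed=0):
    """Compare a repeated path with varied paths over a toy minefield.

    All sizes are hypothetical; they only illustrate the sampling idea.
    """
    rng = random.Random(seed)
    mine_cells = set(rng.sample(range(cells), mines))
    first_path = rng.sample(range(cells), path_len)

    # Strategy 1: walk the very same path in every round.
    repeated_visited = set()
    for _ in range(rounds):
        repeated_visited.update(first_path)

    # Strategy 2: walk that path once, then a fresh random path each round.
    varied_visited = set(first_path)
    for _ in range(rounds - 1):
        varied_visited.update(rng.sample(range(cells), path_len))

    return {
        "repeated_cells": len(repeated_visited),
        "varied_cells": len(varied_visited),
        "repeated_mines": len(mine_cells & repeated_visited),
        "varied_mines": len(mine_cells & varied_visited),
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(f"{key}: {value}")
```

However many rounds are run, the repeated strategy never visits more than the cells on its one path, so it can never find a mine the first walk missed. The varied strategy visits at least those cells and usually many more.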

But what do I mean by repeat the same test? It's easy to see that no test can be repeated exactly, any more than you can exactly retrace your footsteps. You can get close, but you will always be a tiny bit off. Does repeating a test mean that the second time you run the test you have to make sure that sunlight is shining at the same angle onto your mousepad? Maybe. Don't laugh. I did experience a bug, once, that was triggered by sunlight hitting an optical sensor inside a mouse. You just can't say for sure what factors are going to affect a test. However, when you test you have a certain goal and a certain theory of the system. You may very well be able to repeat a test with respect to that goal and theory in every respect that A) you know about and B) you care about and C) isn't too expensive to repeat. Nothing is necessarily intractable about that.

Therefore, by a repeated test, I mean a test that includes elements already known to be covered in other tests. To repeat a test is to repeat some aspect of a previous test. The minefield heuristic is saying that it's better to try to do something you haven't yet done than to do something you already have done.
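One way to picture this definition is to treat each test as a set of covered elements and ask which of them earlier tests have already touched. The sketch below assumes that coverage can be written down as simple labels, which is a strong simplification. The function name and the labels are hypothetical.

```python
def classify_test(test_elements, already_covered):
    """Split a test's elements into repeated ones and novel ones.

    test_elements: labels for what this test exercises (hypothetical).
    already_covered: labels known to be covered by earlier tests.
    """
    test_elements = set(test_elements)
    already_covered = set(already_covered)
    repeated = test_elements & already_covered  # aspects we have done before
    novel = test_elements - already_covered     # aspects we have not yet done
    return repeated, novel


if __name__ == "__main__":
    repeated, novel = classify_test(
        {"login", "search", "export"},
        {"login", "search"},
    )
    print("repeated:", sorted(repeated))
    print("novel:", sorted(novel))
```

In these terms, the minefield heuristic prefers tests whose novel set is not empty.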

If you disagree with this idea, or if you agree with it, please read further. Because...

...this analysis is too simplistic! In fact, even though diversity in testing is important and powerful, and even though the argument against repetition is generally valid, I do know of ten exceptions. There are ten specific reasons why, in some particular situation, it is not unreasonable to repeat tests. It may even be important to repeat some tests.

For technical reasons you might rationally repeat tests...

1. Recharge: if there is a substantial probability of a new problem or a recurring old problem that would be caught by a particular existing test, or if an old test is applied to a new code base. This includes re-running a test to verify a fix, or repeating a test on successively earlier builds as you try to discover when a particular problem or behavior was introduced. This also includes running an old test on the same software that is running on a new O/S. In other words, a tired old test can be "recharged" by changes to the technology under test. Note that the recharge effect doesn't necessarily mean you should run the same old tests, only that it isn't necessarily irrational to do so.

2. Intermittence: if you suspect that the discovery of a bug is not guaranteed by one correct run of a test, perhaps due to important variables involved that you can't control in your tests. Performing a test that is, to you, exactly the same as a test you've performed before, may result in discovery of a bug that was always there but not revealed until the uncontrolled variables line up in a certain way. This is the same reason that a gambler at a slot machine plays again after losing the first time.

3. Retry: if you aren't sure that the test was run correctly the other time(s) it was performed. A variant of this is having several testers follow the same instructions and check to see that they all get the same result.

4. Mutation: if you are changing an important part of the test while keeping another part constant. Even though you are repeating some elements of the test, the test as a whole is new, and may reveal new behavior. I mutate a test because although I have covered something before, I haven't yet covered it well enough. A common form of mutation is to operate the product the same way while using different data. The key difference between mutating a test and intermittence or retry is that with mutation the change is directly under your control. Mutation is intentional, intermittence results from incidental factors, and you retry a test because of accidental factors.

5. Benchmark: if the repeated tests comprise a performance standard that gets its value by comparison with previous executions of the same exact tests. When historical test data is used as an oracle, then you must take care that the tests you perform are comparable to the historical data. Holding tests constant may not be the only way to make results comparable, but it might be the best choice available.
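As an illustration of the benchmark reason, here is a minimal sketch of historical timings used as an oracle. The test names, the timings, and the 10% tolerance are all invented for the example. Only tests that appear in both the historical run and the current run are compared, because a test that was not repeated has nothing to be compared against.

```python
def find_regressions(baseline, current, tolerance=0.10):
    """Return tests that ran slower than the baseline allows.

    baseline, current: dicts mapping a test name to seconds (hypothetical data).
    tolerance: allowed relative slowdown; 0.10 means 10% (an assumed threshold).
    """
    regressions = {}
    for name, old_seconds in baseline.items():
        if name not in current:
            continue  # not repeated this time, so there is nothing to compare
        new_seconds = current[name]
        if new_seconds > old_seconds * (1 + tolerance):
            regressions[name] = (old_seconds, new_seconds)
    return regressions


if __name__ == "__main__":
    baseline = {"login": 1.0, "search": 2.0, "export": 5.0}
    current = {"login": 1.05, "search": 2.5, "report": 3.0}
    print(find_regressions(baseline, current))
```

In this run the new "report" test produces no verdict at all: without a previous execution of the same test, the historical data cannot act as an oracle for it.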

For business reasons you might rationally repeat tests...

6. Inexpensive: if they have some value and are su



9. Mandated: if, due to contract, management edict, or regulation, you are forced to run the same exact tests. However, even in these situations, it is often not necessary that the mandated tests be the only tests you perform. You may be able to run new tests without violating the mandate.

10. Indifference/Avoidance: if the "tests" are being run for some reason other than finding bugs, such as for training purposes, demo purposes (such as an acceptance test that you desperately hope will pass when the customer is watching), or to put the system into a certain state. If one of your goals in running a test is to avoid bugs, then the principal argument for variation disappears.

I have collected these reasons in the course of probably a hundred hours of debate with testing students and colleagues. Many of my colleagues prefer different words or a different breakdown of reasons. There's nothing particularly sacred about my way of doing it (except that some breakdowns would lead to long lists of very similar items). The important thing is that when I hear a reason that seems not to fit within the ones I already have, I add that reason to this list. I started with two reasons, in 1997. I added the tenth one in late 2004.

Applying the Minefield: An Example

Ward Cunningham wrote "I believe the automation required of TDD [Test Driven Design] (and Fit) is exempt from the analogy because the searching we are doing is for the best expression of a program in the presence of tests, not the best tests."

Here's how I think it applies:

Your unit tests might pass or they might fail. You write them so that they will fail in the event that some interesting expectation is violated. So, you call them tests and they seem to be tests.

We introduce the minefield criticism the first time you run any given test in your unit test suite. The first time you run it, it fails, right? Of course, since it wouldn't be TDD, otherwise. The questions below are inspired by the Minefield heuristic "vary your tests instead of repeating them."

Question: Why run it again?

Answer: Exception #1, "recharge." You run it again because you have added code to make the test pass, therefore running the test again is not merely redundant, the value of the test has been recharged by the code changing around it.

Question: During the course of development, but after the first time the test passes, why not delete it? Why bother to run it again?

Answer: Several reasons. Recharge still applies a little bit, since you may accidentally break the product during development, but it could be argued that most of those unit tests most of the time don't fail, and some of them are extremely unlikely to fail even if you change the code quite a bit. But here you have the second reason: exception #6, "inexpensive." It's so cheap to create these tests and to run them and to keep them running, while at the same time they do have some value, even if not a lot. And you have a third reason for some of the tests: exception #7, "importance." For a good many of the unit tests, failure would indicate a very serious problem, were it to occur. If you are testing something that is particularly complex, or involves many interacting sub-systems, you may also want to repeat because of exception #2, "intermittence". Perhaps something will fail after the forty-third run because of probabilistic factors in the test.



Finally, there's #3, the "retry" exception, which reminds us that we might not have run the test correctly, before. As you once said, Ward, something might give off a bad smell only after you've seen the test run a hundred times or so. In other words, as a result of running a test many times, you might come to an insight about the product that reveals a failure that was there all along, but never noticed.
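The intermittence point in this answer can be given rough numbers. The sketch below assumes that each run of a test independently exposes a latent bug with some fixed probability p. That is an idealization, since real uncontrolled variables are rarely independent from run to run, and the 2% figure in the usage lines is invented. Under that assumption the chance of seeing at least one failure in n runs is 1 - (1 - p)^n.

```python
def detection_probability(p, runs):
    """Chance of at least one failure in `runs` independent runs,
    each exposing the bug with probability p (an idealized model)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if runs < 0:
        raise ValueError("runs must not be negative")
    return 1.0 - (1.0 - p) ** runs


def runs_needed(p, target):
    """Smallest number of runs whose detection probability reaches target."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be greater than 0 and at most 1")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be at least 0 and below 1")
    runs = 1
    while detection_probability(p, runs) < target:
        runs += 1
    return runs


if __name__ == "__main__":
    # Hypothetical figure: the bug shows up in 2% of runs.
    print(detection_probability(0.02, 1))
    print(detection_probability(0.02, 43))
    print(runs_needed(0.02, 0.95))
```

With a small per-run probability, a single passing run says very little, which is the gambler's reasoning behind pulling the lever again.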

Question: Let's say that I'm a really good developer and though I write good tests, they just don't fail because I just don't put bugs into my code. I have a whole lot of tests and they don't fail. What was the sense in investing in such tests?

Answer: Two potential reasons. Exception #10, "avoidance/indifference." You may create the tests as a form of documentation for future developers and you like them to be exactly the same in order to minimize the chance that they will fail (and thus be less useful as documentation). Or maybe you want to impress a customer with your great software, and they won't be as impressed if the tests don't pass. A second reason is exception #9, "mandated." You may work this way because your peer group or your manager requires you to. This is a little like avoidance except that with a mandate you do, in fact, want to find bugs. You are searching for them, you just are required to use a certain technique to do so.

We therefore see that the fairly simple, often repeated unit tests of TDD may indeed be exempt from the minefield-based argument in favor of varying tests, inasmuch as the reasons I cited apply. But TDD is not exempt from this kind of heuristic analysis. It is always reasonable to question the value of repeated tests, and that's what the minefield invites us to do.
