Blog: Testing vs. Checking

This posting is an expansion of a lightning talk that I gave at Agile 2009. Many thanks to Arlo Belshee and James Shore for providing the platform. Many thanks also to the programmers and testers at the session for the volume of support that the talk and the idea received. Special thanks to Joe (J.B.) Rainsberger. Spread the meme!

Postscript: Over the years, some people have misinterpreted this post as a rejection of checking, or of regression testing, or of testing that is assisted by automation. So in addition to reading this post, it is important that you also read this one.

Post-postscript: This is now officially an old document. In the years since it was posted, I’ve further refined my take on the subject of testing and checking, mostly in collaboration with my colleague James Bach. Our current thinking on the topic appears on his blog, and I provide some follow-up here. We’ve also benefitted from comments and questions from other colleagues, so we encourage you to read the comments, and to comment yourself. (April 2013)

There is confusion in the software development business over a distinction between testing and checking. I will now attempt to make the distinction clearer.

Checking Is Confirmation

Checking is something that we do with the motivation of confirming existing beliefs. Checking is a process of confirmation, verification, and validation. When we already believe something to be true, we verify our belief by checking. We check when we’ve made a change to the code and we want to make sure that everything that worked before still works. When we have an assumption that’s important, we check to make sure the assumption holds. Excellent programmers do a lot of checking as they write and modify their code, creating automated routines that they run frequently to check to make sure that the code hasn’t broken. Checking is focused on making sure that the program doesn’t fail.
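
To make that concrete, here is a minimal sketch of an automated check, using Python’s built-in unittest module. The add() function and the expected value are invented for illustration; the point is that the routine encodes an existing belief, and can be re-run mechanically after every change to confirm that the belief still holds.

    import unittest

    def add(a, b):
        # The code under check: we already believe this works.
        return a + b

    class AddChecks(unittest.TestCase):
        def test_small_integers(self):
            # Confirms an existing expectation; a pass tells us that
            # nothing we rely on here has broken since the last run.
            self.assertEqual(add(2, 3), 5)

    if __name__ == "__main__":
        unittest.main()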

Testing Is Exploration and Learning

Testing is something that we do with the motivation of finding new information. Testing is a process of exploration, discovery, investigation, and learning. When we configure, operate, and observe a product with the intention of evaluating it, or with the intention of recognizing a problem that we hadn’t anticipated, we’re testing. We’re testing when we’re trying to find out about the extents and limitations of the product and its design, and when we’re largely driven by questions that haven’t been answered or even asked before. As James Bach and I say in our Rapid Software Testing classes, testing is focused on “learning sufficiently everything that matters about how the program works and about how it might not work.”

Checks Are Machine-Decidable; Tests Require Sapience

A check provides a binary result—true or false, yes or no. Checking is all about asking and answering the question “Does this assertion pass or fail?” Such simple assertions tend to be machine-decidable and are, in and of themselves, value-neutral.

A test has an open-ended result. Testing is about asking and answering the question “Is there a problem here?” That kind of decision requires the application of many human observations combined with many value judgements.

When a check passes, we don’t know whether the program works; we only know that it’s still working within the scope of our expectations. The program might have serious problems, even though the check passes. To paraphrase Dijkstra, “checking can prove the presence of bugs, but not their absence.” Machines can recognize inconsistencies and problems that they have been programmed to recognize, but not new ones. Testing doesn’t tell us whether the program works either—certainty on such questions isn’t available—but testing may provide the basis of a strong inference addressing the question “problem or no problem?”
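
A hypothetical illustration of that limitation: the check below passes, yet the function still harbours a problem, because the check exercises only the inputs we thought of when we wrote it.

    def mean(values):
        # Broken for an empty list (division by zero), but the check
        # below never asks about that input.
        return sum(values) / len(values)

    # This check passes -- "green" -- yet the program can still fail.
    assert mean([2, 4, 6]) == 4

    # A question nobody has asked yet would reveal the problem:
    # mean([]) raises ZeroDivisionError.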

Testing is, in part, the process of finding out whether our checks have been good enough. When we find a problem through testing, one reasonable response is to write one or more checks to make sure that that particular problem doesn’t crop up again.

Whether we automate the process or not, if we could express our question such that a machine could ask and answer it via an assertion, it’s almost certainly checking. If it requires a human, it’s a sapient process, and is far more likely to be testing. In James Bach’s seminal blog entry on sapient processes, he says, “My business is software testing. I have heard many people say they are in my business, too. Sometimes, when these people talk about automating tests, I think they probably aren’t in my business, after all. They couldn’t be, because what I think I’m doing is very hard to automate in any meaningful way. So I wonder… what the heck are they automating?” I have an answer: they’re automating checks.

When we talk about “tests” at any level at which we delegate the pass-or-fail decision to the machine, we’re talking about automated checks. I propose, therefore, that those things that we usually call “unit tests” be called “unit checks”. By the same token, I propose that automated acceptance “tests” (of the kind Ron Jeffries refers to in his blog post on automating story “tests”) become known as automated acceptance checks. These proposals appeared to galvanize a group of skilled programmers and testers in a workshop at Agile 2009, something about which I’ll have more to say in a later blog post.

Testing Is Not Quality Assurance, But Checking Might Be

You can assure the quality of something over which you have control; that is, you can provide some level of assurance that it fulfills some requirement, and you can accept responsibility if it does not fulfill that requirement. If you don’t have authority to change something, you can’t assure its quality, although you can evaluate it and report on what you’ve found. (See pages 6 and 7 of this paper, in which Cem Kaner explains the distinction between testing and quality assurance and cites Johanna Rothman’s excellent set of questions that help to make the distinction.) Testing is not quality assurance, but acts in service to it; we supply information to programmers and managers who have the authority to make decisions about the project.

Checking, when done by a programmer, is mostly a quality assurance practice. When a programmer writes code, he checks his work. He might do this by running it directly and observing the results, or by observing the behaviour of the code under the debugger, but often he writes a set of routines that exercise the code and perform some assertions on it. We call these unit “tests”, but they’re really checks, since the idea is to confirm existing knowledge. In this context, finding new information would be considered a surprise, and typically an unpleasant one. A failing check prompts the programmer to change the code to make it work the way he expects. That’s the quality assurance angle: a programmer helps to assure the quality of his work by checking it.

Testing, the search for new information, is not a quality assurance practice per se. Instead, testing informs quality assurance. Testing, to paraphrase Jerry Weinberg, is gathering information with the intention of informing a decision, or as James Bach says, “questioning a product in order to evaluate it.” Evaluation of a product doesn’t assure its quality, but it can inform decisions that will have an impact on quality. Testing might involve a good deal of checking; I’ll discuss that at more length below.

Checkers Require Specifications; Testers Do Not

A tester, as Jerry Weinberg said, is “someone who knows that things can be different”. As testers, it’s our job to discover information; often that information is in terms of inconsistencies between what people think and what’s true in reality. (Cem Kaner’s definition of testing covers this nicely: “testing is an empirical, technical investigation of a product, done on behalf of stakeholders, with the intention of revealing quality-related information of the kind that they seek.”)

We often hear old-school “testing” proponents claim that good testing requires specifications that are clear, complete, up-to-date, and unambiguous. (I like to ask these people, “What do you mean by ‘unambiguous’?” They rarely take well to the joke. But I digress.) A tester does not require the certainty of a perfect specification to make useful observations and inferences about the product. Indeed, the tester’s task might be to gather information that exposes weakness or ambiguity in a specification, with the intention of providing information to the people who can clear it up. Part of the tester’s role might be to reveal problems when the plans for the product and the implementation have diverged at some point, even if part of the plan has never been written down. A tester’s task might be to reveal problems that occur when our excellent code calls buggy code in someone else’s library, for which we don’t have a specification. Capable testers can deal easily with such situations.

A person who needs a clear, complete, up-to-date, unambiguous specification to proceed is a checker, not a tester. A person who needs a test script to proceed is a checker, not a tester. A person who does nothing but compare a program against some reference is a checker, not a tester.

Testing vs. Checking Is A Leaky Abstraction

Joel Spolsky has named a law worthy of the general systems movement, the Law of Leaky Abstractions (“All non-trivial abstractions, to some degree, are leaky.”). In the process of developing a product, we might alternate very rapidly between checking and testing. The distinction between the two lies primarily in our motivations. Let’s look at some examples.

  • A programmer who is writing some new code might be exploring the problem space. In her mind, she has a question about how she should proceed. She writes an assertion—a check. Then she writes some code to make the assertion pass. The assertion doesn’t pass, so she changes the code. The assertion still doesn’t pass. She recognizes that her initial conception of the problem was incomplete, so she changes the assertion, and writes some more code. This time the check passes, indicating that the assertion and the code are in agreement. She has an idea to write another bit of code, and repeats the process of writing a check first, then writing some code to make it pass. She also makes sure that the original check passes. Next, she sees the possibility that the code could fail given a different input. She believes it will succeed, but writes a new check to make sure. It passes. She tries different input. It fails, so she has to investigate the problem. She realizes her mistake, and uses her learning to inform a new check; then she writes functional code to fix the problem and pass the check.

    So far, her process has been largely exploratory. Even though she’s been using checks to support the process, her focus has been on learning, exploring the problem space, discovering problems in the code, and investigating those problems. In that sense, she’s testing as she’s programming. At the end of this burst of development, she now has some functional code that will go into the product. As a happy side effect, she has another body of code that will help her to check automatically for problems if and when the functional code gets modified. (A minimal sketch of such a cycle appears after this list.)

    Mark Simpson, a programmer that I spoke to at Agile 2009, said that this cyclic process is like bushwhacking, hacking a new trail through the problem space. There are lots of ways that you could go, and you clear the bush of uncertainty around you in an attempt to get to where you’re going. Historically, this process has been called “test-driven development”, which is a little unfortunate in that TDD-style “tests” are actually checks. Yet it would be hard, and even a little unfair, to argue that the overall process is not exploratory to a significant degree. Programmers engaged in TDD have a goal, but the path to the goal is not necessarily clear. If you don’t know exactly where you’re going to end up and exactly how you’re going to get there, you have to do some amount of exploration. The moniker “behavior-driven development” (BDD) helps to clear up the confusion to some degree, but it’s not yet in widespread adoption. BDD uses checks in the form “(The program) should…”, but the development process requires a lot of testing of the ideas as they’re being shaped.
  • Now our programmer looks over her code, and realizes that one of the variables is named in an unclear way, that one line of code would be more readable and maintainable expressed as two, and that a group of three lines could more elegantly and clearly expressed as a for loop. She decides to refactor. She addresses the problems one at a time, running her checks after each change. Her intention in running these checks is not to explore; it’s confirm that nothing’s been messed up. She doesn’t develop new checks; she’s pretty sure the old ones will do. At this point, she’s not really testing the product; she’s checking her work.
  • Much of the traditional “testing” literature suggests that “testing” is a process of validation and verification, as though we already know how the code should work. Although testing does involve some checking, a program that is only checked is likely to be poorly tested. Much of the testing literature focuses on correctness—which can be checked—and ignores the sapience that is necessary to inform deeper questions about value, which must be tested. For example, that which is called “boundary testing” is usually boundary checking.

    The canonical example is that of the program that adds two two-digit integers, where values in the range from -99 to 99 are accepted, and everything else is rejected. The classic advice on how to “test” such a program focuses on boundary conditions, given in a form something like this: “Try -99 and 99 to verify that valid values are accepted, and try -100 and 100 to verify that invalid values are rejected.” I would argue that these “tests” are so weak as to be called checks; they’re frightfully obvious, they’re focused on confirmation, they focus on output rather than outcome, and they could be easily mechanized. (A sketch of just how easily appears after this list.)

    If you wanted to test a program like that, you’d configure, operate, and observe the product with eyes open to many more risks, including ones that aren’t at the forefront of your consciousness until a problem manifests itself. You’d be prepared to consider anything that might threaten the value of the product—problems related to performance, installability, usability, testability, and many other quality criteria. You’d tend to vary your tests, rather than repeating them. You’d engage curiosity, and perform a smattering of tests unrelated to your current models of risks and threats, with the goal of recognizing unanticipated risks. You might use automation to assist your exploration; perhaps you would use automation to generate data, to track coverage, to parse log files, to probe the registry or the file system for unanticipated effects. Even if you used automation to punch the keys for you, you’d use the automation in an exploratory way; you’d be prepared to change your line of investigation and your tactics when a test reveals surprising information.
  • The exploratory mindset is focused on questions like “What if…?” “I wonder…?” “How does this thing…?” “What happens when I…?” Even though we might be testing a program with a strongly exploratory approach, we will engage a number of confirmatory kinds of ideas. “If I press on that Cancel button, that dialog should go away.” “That field is asking for a U.S. ZIP code; the field should accept at least five digits.” “I’ll double-click on ‘foo.doc’, and that file should open in Microsoft Word on this system.” Excellent testers hold these and dozens of other assumptions and assertions as a matter of course. We may not even be conscious of them being checks, but we’re checking subconsciously as we explore and learn about the program. Should one of these checks fail, we might be prompted to seek new information; or, if the behaviour seems reasonable, we might instead change our model of how the program is supposed to work. That’s a heuristic process (a fallible means of solving a problem or making a decision, conducive to learning; we presume that a heuristic usually works but that it might fail).
  • Also at Agile 2009, Chris McMahon gave a presentation called “History of a Large Test Automation Project using Selenium”. He described an approach of using several thousand automated checks (he called them “tests”) to find problems in the application. How to describe the difference? Again, the difference is one of motivation. If you’re running thousands of automated checks with the intention of demonstrating that you’re okay today, just like you were yesterday, you’re checking. You could use those automated checks in a different way, though. If you are trying to answer new questions, such as “what would happen if we ran our checks on thirty machines at once to really pound the server?” (where we’re focused on stress testing), or “what would happen if we ran our automated checks on this new platform?” (where we’re focused on compatibility testing), or “what would happen if we were to run our automated checks 300 times in a row?” (where we’re focused on flow testing), the category of checks would be leaking into testing (which would be a fine thing in these cases). A sketch of that kind of repurposing appears after this list.
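
As promised in the first item above, here is a minimal, hypothetical sketch of that kind of check-first cycle. The word_count() function and its cases are invented; the comments narrate the exploratory steps that might have produced them.

    def word_count(text):
        # The functional code, arrived at after several check-driven revisions.
        return len(text.split())

    # Written before the code, to ask "how should this behave?"
    assert word_count("hello world") == 2

    # Added when a suspected edge case came to mind. An earlier
    # implementation failed here, prompting investigation and a revised
    # understanding: split() with no arguments collapses runs of whitespace.
    assert word_count("  hello   world  ") == 2

    # The programmer believed the empty string would work, and wrote a
    # check to make sure. It passes, and now guards against regression.
    assert word_count("") == 0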
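
And here are the boundary “checks” from the third item, mechanized. The accepts() function is a hypothetical stand-in for the canonical two-digit adder’s input validation; notice how little sapience the four classic boundary cases require once they are written down.

    def accepts(value):
        # Hypothetical validation rule for the canonical example:
        # integers from -99 to 99 are accepted, everything else rejected.
        return -99 <= value <= 99

    # The classic boundary "tests", expressed as machine-decidable checks.
    assert accepts(-99) and accepts(99)            # valid boundary values
    assert not accepts(-100) and not accepts(100)  # invalid boundary values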
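
Finally, a sketch of the repurposing described in the last item: the same checks pointed at a new question. run_all_checks() is a hypothetical stand-in for a whole suite; the interesting change is the motivation, from confirming yesterday’s behaviour to investigating what happens under concurrent load.

    import concurrent.futures

    def run_all_checks(client_id):
        # Hypothetical stand-in for a suite of automated checks that
        # exercises a server; returns True if every check passed.
        return True

    # Thirty suites at once: the question is no longer "do the checks
    # pass?" but "what does the server do when we really pound it?"
    with concurrent.futures.ThreadPoolExecutor(max_workers=30) as pool:
        results = list(pool.map(run_all_checks, range(30)))

    print(f"{sum(results)} of {len(results)} concurrent runs passed")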

There will be much, much more to say about testing vs. confirmation in the days ahead. I can guarantee that people won’t adopt this distinction across the board, nor will they do it overnight. But I encourage you to consider the distinction, and to make it explicit when you can.

See more on testing vs. checking.

Related: James Bach on Sapience and Blowing People’s Minds
