Blog: Testing vs. Checking

This posting is an expansion of a lightning talk that I gave at Agile 2009. Many thanks to Arlo Belshee and James Shore for providing the platform. Many thanks also to the programmers and testers at the session for the volume of support that the talk and the idea received. Special thanks to Joe (J.B.) Rainsberger. Spread the meme!

Postscript: Over the years, some people have misinterpreted this post as a rejection of checking, or of regression testing, or of testing that is assisted by automation. So in addition to reading this post, it is important that you also read this one.

Post-postscript: This is now officially an old document. In the years since it’s been posted, I’ve further refined my take on the subject of testing and checking, mostly in collaboration with my colleague James Bach. Our current thinking on the topic appears on his blog, and I provide some followup here. We’ve also benefitted from comments and questions from other colleagues, so we encourage you to read the comments, and to comment yourself.(April, 2013)

There is confusion in the software development business over a distinction between testing and checking. I will now attempt to make the distinction clearer.

Checking Is Confirmation

Checking is something that we do with the motivation of confirming existing beliefs. Checking is a process of confirmation, verification, and validation. When we already believe something to be true, we verify our belief by checking. We check when we’ve made a change to the code and we want to make sure that everything that worked before still works. When we have an assumption that’s important, we check to make sure the assumption holds. Excellent programmers do a lot of checking as they write and modify their code, creating automated routines that they run frequently to check to make sure that the code hasn’t broken. Checking is focused on making sure that the program doesn’t fail.

Testing Is Exploration and Learning

Testing is something that we do with the motivation of finding new information. Testing is a process of exploration, discovery, investigation, and learning. When we configure, operate, and observe a product with the intention of evaluating it, or with the intention of recognizing a problem that we hadn’t anticipated, we’re testing. We’re testing when we’re trying to find out about the extents and limitations of the product and its design, and when we’re largely driven by questions that haven’t been answered or even asked before. As James Bach and I say in our Rapid Software Testing classes, testing is focused on “learning sufficiently everything that matters about how the program works and about how it might not work.”

Checks Are Machine-Decidable; Tests Require Sapience

A check provides a binary result—true or false, yes or no. Checking is all about asking and answering the question “Does this assertion pass or fail?” Such simple assertions tend to be machine-decidable and are, in and of themselves, value-neutral.

A test has an open-ended result. Testing is about asking and answering the question “Is there a problem here?” That kind of decision requires the application of many human observations combined with many value judgements.

When a check passes, we don’t know whether the program works; we only know that it’s still working within the scope of our expectations. The program might have serious problems, even though the check passes. To paraphrase Dkijstra, “checking can prove the presence of bugs, but not their absence.” Machines can recognize inconsistencies and problems that they have been programmed to recognize, but not new ones. Testing doesn’t tell us whether the program works either—certainty on such questions isn’t available—but testing may provide the basis of a strong inference addressing the question “problem or no problem?”

Testing is, in part, the process of finding out whether our checks have been good enough. When we find a problem through testing, one reasonable response is to write one or more checks to make sure that that particular problem doesn’t crop up again.

Whether we automate the process or not, if we could express our question such that a machine could ask and answer it via an assertion, it’s almost certainly checking. If it requires a human, it’s a sapient process, and is far more likely to be testing. In James Bach‘s seminal blog entry on sapient processes, he says, “My business is software testing. I have heard many people say they are in my business, too. Sometimes, when these people talk about automating tests, I think they probably aren’t in my business, after all. They couldn’t be, because what I think I’m doing is very hard to automate in any meaningful way. So I wonder… what the heck are they automating?” I have an answer: they’re automating checks.

When we talk about “tests” at any level in which we delegate the pass or fail decision to the machine, we’re talking about automated checks. I propose, therefore, that those things that we usually call “unit tests” be called “unit checks“. By the same token, I propose that automated acceptance “tests” (of the kind Ron Jeffries refers to in his blog post on automating story “tests”) become known as automated acceptance checks. These proposals appeared to galvanize a group of skilled programmers and testers in a workshop at Agile 2009, something about which I’ll have more to say in a later blog post.)

Testing Is Not Quality Assurance, But Checking Might Be

You can assure the quality of something over which you have control; that is, you can provide some level of assurance to some degree that it fulfills some requirement, and you can accept responsiblity if it does not fulfill that requirement. If you don’t have authority to change something, you can’t assure its quality, although you can evaluate it and report on what you’ve found. (See pages 6 and 7 of this paper, in which Cem Kaner explains the distinction between testing and quality assurance and cites Johanna Rothman‘s excellent set of questions that help to make the distinction.) Testing is not quality assurance, but acts in service to it; we supply information to programmers and managers who have the authority to make decisions about the project.

Checking, when done by a programmer, is mostly a quality assurance practice. When an programmer writes code, he checks his work. He might do this by running it directly and observing the results, or observing the behaviour of the code under the debugger, but often he writes a set of routines that exercise the code and perform some assertions on it. We call these unit “tests”, but they’re really checks, since the idea is to confirm existing knowledge. In this context, finding new information would be considered a surprise, and typically an unpleasant one. A failing check prompts the programmer to change the code to make it work the way he expects. That’s the quality assurance angle: a programmer helps to assure the quality of his work by checking it.

Testing, the search for new information, is not a quality assurance practice per se. Instead, testing informs quality assurance. Testing, to paraphrase Jerry Weinberg, is gathering information with the intention of informing a decision, or as James Bach says, “questioning a product in order to evaluate it.” Evaluation of a product doesn’t assure its quality, but it can inform decisions that will have an impact on quality. Testing might involve a good deal of checking; I’ll discuss that at more length below.

Checkers Require Specifications; Testers Do Not

A tester, as Jerry Weinberg said, is “someone who knows that things can be different”. As testers, it’s our job to discover information; often that information is in terms of inconsistencies between what people think and what’s true in reality. (Cem Kaner‘s definition of testing covers this nicely: “testing is an empirical, technical investigation of a product, done on behalf of stakeholders, with the intention of revealing quality-related information of the kind that they seek.”)

We often hear old-school “testing” proponents claim that good testing requires specifications that are clear, complete, up-to-date, and unambiguous. (I like to ask these people, “What do you mean by ‘unambiguous’?” They rarely take well to the joke. But I digress.) A tester does not require the certainty of a perfect specification to make useful observations and inferences about the product. Indeed, the tester’s task might be to gather information that exposes weakness or ambiguity in a specification, with the intention of providing information to the people who can clear it up. Part of the tester’s role might be to reveal problems when the plans for the product and the implementation have diverged at some point, even if part of the plan has never been written down. A tester’s task might be to reveal problems that occur when our excellent code calls buggy code in someone else’s library, for which we don’t have a specification. Capable testers can deal easily with such situations.

A person who needs a clear, complete, up-to-date, unambiguous specification to proceed is a checker, not a tester. A person who needs a test script to proceed is a checker, not a tester. A person who does nothing but to compare a program against some reference is a checker, not a tester.

Testing vs. Checking Is A Leaky Abstraction

Joel Spolsky has named a law worthy of the general systems movement, the Law of Leaky Abstractions (“All non-trivial abstractions, to some degree, are leaky.”). In the process of developing a product, we might alternate very rapidly between checking and testing. The distinction between the two lies primarily in our motivations. Let’s look at some examples.

  • A programmer who is writing some new code might be exploring the problem space. In her mind, she has a question about how she should proceed. She writes an assertion—a check. Then she writes some code to make the assertion pass. The assertion doesn’t pass, so she changes the code. The assertion still doesn’t pass. She recognizes that her initial conception of the problem was incomplete, so she changes the assertion, and writes some more code. This time the check passes, indicating that the assertion and the code are in agreement. She has an idea to write another bit of code, and repeats the process of writing a check first, then writing some code to make it pass. She also makes sure that the original check passes. Next, she sees the possibility that the code could fail given a different input. She believes it will succeed, but writes a new check to make sure. It passes. She tries different input. It fails, so she has to investigate the problem. She realizes her mistake, and uses her learning to inform a new check; then she writes functional code to fix the problem and pass the check.So far, her process has been largely exploratory. Even though she’s been using checks to support the process, her focus has been on learning, exploring the problem space, discovering problems in the code, and investigating those problems. In that sense, she’s testing as she’s programming. At the end of this burst of development, she now has some functional code that will go into the product. As a happy side effect, she has another body of code that will help her to check automatically for problems if and when the functional code gets modified.Mark Simpson, a programmer that I spoke to at Agile 2009, said that this cyclic process is like bushwhacking, hacking a new trail through the problem space. There are lots of ways that you could go, and you clear the bush of uncertainty around you in an attempt to get to where you’re going Historically, this process has been called “test-driven development”, which is a little unfortunate in that TDD-style “tests” are actually checks. Yet it would be hard, and even a little unfair, to argue that the overall process is not exploratory to a significant degree. Programmers engaged in TDD have a goal, but the path to the goal is not necessarily clear. If you don’t know exactly where you’re going to end up and exactly how you’re going to get there, you have to do some amount of exploration. The moniker “behavior-driven development” (BDD) helps to clear up the confusion to some degree, but it’s not yet in widespread adoption. BDD uses checks in the form “(The program) should…”, but the development process requires a lot of testing of the ideas as they’re being shaped.
  • Now our programmer looks over her code, and realizes that one of the variables is named in an unclear way, that one line of code would be more readable and maintainable expressed as two, and that a group of three lines could more elegantly and clearly expressed as a for loop. She decides to refactor. She addresses the problems one at a time, running her checks after each change. Her intention in running these checks is not to explore; it’s confirm that nothing’s been messed up. She doesn’t develop new checks; she’s pretty sure the old ones will do. At this point, she’s not really testing the product; she’s checking her work.
  • Much of the traditional “testing” literature suggests that “testing” is a process of validation and verification, as though we already know how the code should work. Although testing does involve some checking, a program that is only checked is likely to be poorly tested. Much of the testing literature focused on correctness—which can be checked—and ignores the sapience that is necessary to inform deeper questions about value, which must be tested. For example, that which is called “boundary testing” is usually boundary checking.The canonical example is that of the program that adds two two-digit integers, where values in the range from -99 to 99 are accepted, and everything else is rejected. The classic advice on how to “test” such a program focuses on boundary conditions, given in a form something like this: “Try -99 and 99 to verify that valid values are accepted, and try -100 and 100 to verify that invalid values are rejected.” I would argue that these “tests” are so weak as to be called checks; they’re frightfully obvious, they’re focused on confirmation, they focus on output rather than outcome, and they could be easily mechanized.If you wanted to test a program like that, you’d configure, operate, observe the product with eyes open to many more risks, including ones that aren’t at the forefront of your consciousness until a problem manifests itself. You’d be prepared to consider anything that might threaten the value of the product—problems related to performance, installability, usability, testability, and many other quality criteria. You’d tend to vary your tests, rather than repeating them. You’d engage curiosity, and perform a smattering of tests unrelated to your current models of risks and threats, with the goal of recognizing unanticipated risks. You might use automation to assist your exploration; perhaps you would use automation to generate data, to track coverage, to parse log files, to probe the registry or the file system for unanticipted effects. Even if you used automation to punch the keys for you, you’d use the automation in an exploratory way; you’d be prepared to change your line of investigation and your tactics when a test reveals surprising information.
  • The exploratory mindset is focused on questions like “What if…?” “I wonder…?” “How does this thing…?” “What happens when I…?” Even though we might be testing a program with a strongly exploratory approach, we will engage a number of confirmatory kinds of ideas. “If I press on that Cancel button, that dialog should go away.” “That field is asking for U.S. ZIP code; the field should accept at least five digits.” “I’ll double-click on ‘foo.doc’, and that file should open in Microsoft Word on this system.” Excellent testers hold these and dozens of other assumptions and assertions as a matter of course. We may not even be conscious of them being checks, but we’re checking sub-consciously as we explore and learn about the program. Should one of these checks fail, we might be prompted to seek new information, or if the behaviour seems reasonable, we might instead change our model of how the program is supposed to work. That’s a heuristic process (a fallible means of solving a problem or making a decision, conducive to learning; we presume that a heuristic usually works but that it might fail).
  • Also at Agile 2009, Chris McMahon gave a presentation called “History of a Large Test Automation Project using Selenium”. He described an approach of using several thousand automated checks (he called them “tests”) to find problems in the application with testing. How to describe the difference? Again, the difference is one of motivation. If you’re running thousands of automated checks with the intention of demonstrating that you’re okay today, just like you were yesterday, you’re checking. You could use those automated checks in a different way, though. If you are trying to answer new questions, “what would happen if we ran our checks on thirty machines at once to really pound the server?” (where we’re focused on stress testing), or “what would happen if we ran our automated checks on this new platform?” (where we’re focused on compatibility testing), or “what would happen if we were to run our automated checks 300 times in a row?” (where we’re focused on flow testing), the category of checks would be leaking into testing (which would be a fine thing in these cases).

There will be much, much more to say about testing vs. confirmation in the days ahead. I can guarantee that people won’t adopt this distinction across the board, nor will they do it overnight. But I encourage you to consider the distinction, and to make it explicit when you can.

Postscript: Over the years, some people have misinterpreted this post as a rejection of checking, or of regression testing, or of testing that is assisted by automation. So in addition to reading this post, it is important that you also read this one.

See more on testing vs. checking.

Related: James Bach on Sapience and Blowing People’s Minds

116 Responses to “Testing vs. Checking”

  1. [...] project lead and project manager that we should consider changing our “Testing” from Checking to Testing.  In order to convey the difference between checking and testing, I used the term Sapient Testing, [...]

  2. [...] Michael Bolton’s post on Testing vs Checking and the follow-up comments, he splits testing into exploratory testing and confirmatory testing [...]

  3. Rahul Verma says:

    Hi Michael,

    This is interesting stuff and led me to read all your posts on this subject and related ones by James.

    I initially thought to publish my comment here, but I’ve got a lot to comment on and discuss,so I decided to post my views as a set of posts on my website – Testing Perspective.

    Here’s the link to the first one:

    All Testing is Confirmatory:
    http://www.testingperspective.com/?p=428

    Following part of your comment#4 led me to write this:

    “To answer your last paragraph, note the last section of my original post: the distinction between tests and checks is leaky. In genearl, though, a test is exploratory if you’re seeking new information; it’s confirmatory (and therefore a check, as I’d prefer to call it) if you’re simply making sure of something that you believe you already know.”

  4. [...] – which, for the present discussion, is meant to include everything from test “checks” to testing “scenarios“, or even test [...]

  5. [...] But wait… that can’t all there is, right? What about Checking vs. Testing? [...]

  6. Quora says:

    Will it ever be possible to practically do largely automated UI testing?…

    From a theoretical standpoint, completely automated testing of any sort is only able to address problems that are amenable to detection by deterministic checking rules.  If these are the sorts of problems you are looking for in a UI, then it would be p…

    Michael replies: I guess Quora’s posting got cut off. Was it automated?

  7. [...] such an approach by pointing out to the poor level of confidence that a limited set of static checks can [...]

  8. [...] time John used the word “testing”, he meant it as (re)defined in Michael Bolton’s Testing vs. Checking. It’s a powerful redefinition, and simpler vocabulary than what I’ve always meant by [...]

  9. [...] And how testers should use their knowledge? Should they be involved in code reviews and writing automated checks to ensure the functionalities are working or should they use their programming skills to test the [...]

  10. [...] with testing skills do checking in 95% of the sprint, they do actual testing in 5%, which eventually leads to finding bugs at the [...]

  11. [...] as testers, and still feel I have to elaborate my comment a little more with the concept of testing vs checking and how these differ. While Janet already cleared out in a more recent comment that we are talking [...]

  12. [...] In short, I would answer – minimal, down to zero, unless all of “testing” is merely a checking. [...]

  13. [...] still works. The automated part is particularly important – if you're doing these tests (or, arguably more accurately, checks) manually, then their cost will keep growing as your application grows. What is it about a database [...]

  14. [...] little steps. I try to describe what to do and spent more time describing what to check (I wrote check, not test !) to get it useful to others. I think our test scripts are part of the product documentation. This [...]

  15. [...] And yet I often felt uncomfortable seeing all the limitations in testing brought by the very nature of static scripts. In order to create comprehensive automation I explore applications a lot. While asking and investigating endless “what if..” questions I often find important problems (i.e. – bugs). Same happens when data sets are fed into the automation for the first time. But afterwards, when the scripts were developed and ‘frozen’, the automation suite dramatically loses in value. Yes, the same thousands of tests can be run unattended again and again, but testing is not happening anymore. Tests were degraded to “checks“. [...]

  16. This is interesting and informative. For any successful application or product QA & Testing remains one of the crucial aspects that can’t be overlooked. Thanks for highlighting this topic.

    Michael replies: Thanks. I’ve looked at your Web site; perhaps you’d be interested in this, too: http://www.developsense.com/blog/2010/05/testers-get-out-of-the-quality-assurance-business/.

  17. [...] only take you so far if you wish to be taken seriously as a tester (to me, this type of person is a checker or at best a check analyst, not a [...]

  18. [...] replace an artist with a computerized robot. This to me is analogous to the discussion around testing vs. checking. Testing is a challenging sapient intellectual art which you cannot teach a computer to do anymore [...]

  19. [...] entendeu? tem um post excelente do michal bolton sobre Testar vs Checar (em inglês) , post o qual motivou este post que você esta lendo [...]

  20. [...] green because the build for that evening was successful. You can also see the number of tests (or checks) that were run. Those are the tests that run as part of the build and they run FAST. All 20 [...]

  21. [...] a talk at Agile 2009 from which he later on wrote about a distinction between testing and checking [1]. He has since then elaborated more in the area and digged deeper into it [2], and continue to do [...]

  22. [...] stumble upon this blog post Testing vs. Checking by Michael Bolton which I found very interesting and a fundamental for software testers.  I tried [...]

  23. [...] Michael Bolton http://www.developsense.com/blog/2009/08/testing-vs-checking/ Share this:TwitterFacebookLike this:LikeBe the first to like this [...]

  24. [...] like this, you apply your knowledge and start exploration and learning (see Michael Bolton testing vs checking and Ajay Balamurugadas on Scouting and [...]

  25. Badri says:

    Hi Michael ,

    You made a excellant point!.You actually nailed the difference between checking and testing.

  26. [...] by definition, right? The same goes for ideas like test framing, considering the difference between checks and tests, or the practice of working hard to master your [...]

  27. [...] What is an automated test? And how does it differ from a manually executed test? The following article by Michael Bolton gives a very good answer to these questions that can be condensed to the t…: Checks Are Machine-Decidable; Tests Require [...]

  28. [...] the software before you are expected to test it. Micheal Bolton wrote a series of wonderful posts about testing vs. checking. I think this is an important distinction to make and it has served me very well rhetorically when [...]

  29. [...] Bolton’s distinction between testing and checking seems to have had some unintended consequences. Don’t get me wrong: I have found immense value in [...]

  30. [...] lot of bits have traveled under the Internet bridge since Michael Bolton’s Testing vs Checking in [...]

  31. Jim Sears says:

    The difference in “testing” and “checking” is that “testing” is the first time the test was run and “checking” is every test execution after that.

    Michael replies: I don’t see it that way. As you’ll see in subsequent posts, a check consists of three things:

    • an observation that is linked to
    • a decision rule (one that produces a bit—yes/no, 1/0, pass/fail, true/false) such that
    • the observation and the decision rule can be applied non-sapiently.

    So it can be a check even the first time you run it. A check is not really a test; the testing stuff brackets the check.

    The benefit of the subsequent test executions that comes from ‘checking’ are did the developer didn’t break existing code, are the business requirements still satisfied. The most important reason is it gives the manual tester more time to perform exploratory testing.

    Have you noticed the exploratory testing that happens as the check is being developed? The exploratory testing that’s required to investigate a surprising outcome from a check?

    Note that checking does not tell you whether the business requirements are still satisfied. The best explication of that point that I’m aware of is here:

    Follow up: The best use of automation is to automate new functionality tests so that multiple code drops in the same release can be re-tested faster and additional manual testing can be performed. Problem is that most IT shops don’t invest in that level of automation development but those that do have much more test AND check coverage.

    I don’t think it’s a good idea to say “automation” when you mean “automated checks”. I presume you mean the latter. There are lots of other uses for automation. I don’t know how one could reasonably categorize any of those uses as “best”.

  32. [...] has gotten overloaded so much that it has lost much of its meaning.   I am reminded of another article.  In this one, Michael Bolton makes a distinction between Checking and Testing.  I love this.   [...]

  33. [...] stirred up the testing community with a suggestion to redefine testing and introduce checking http://www.developsense.com/blog/2009/08/testing-vs-checking/. After reading through different responses I decided to document my thoughts on the [...]

  34. [...] still works. The automated part is particularly important – if you’re doing these tests (or, arguably more accurately, checks) manually, then their cost will keep growing as your application grows. What is it about a database [...]

  35. [...] all of the steps to execute;   otherwise, we end up blindly following a script and we’re checking not testing. This highlights an issue of the commenter viewing a test case as a tangible item when in [...]

  36. [...] Bolton writes in his blog post “Testing vs. Checking” that: Checking is something that we do with the motivation of confirming existing beliefs. Checking [...]

  37. [...] can do it (Michael Bolton wrote a great blog about “testing vs checking” – read it here: http://www.developsense.com/blog/2009/08/testing-vs-checking/). Many managers also have other misconceptions of  testing techniques themselves, a prime example [...]

  38. [...] you’ll quickly see how passionate testers can be with their craft. Here’s a nice quote from Michael Bolton that pretty much sums it [...]

  39. [...] quickly see how passionate testers can be with their craft. Here’s a nice quote from Michael Bolton that pretty much sums it [...]

  40. [...] quickly see how passionate testers can be with their craft. Here’s a nice quote from Michael Bolton that pretty much sums it [...]

  41. [...] you’ll quickly see how passionate testers can be with their craft. Here’s a nice quote from Michael Bolton that pretty much sums it [...]

  42. [...] I separate Automated Testing from Quality Assurance as I see them as separate functions. Usually they are assigned to the QA team members, but I find it helps to think of them as separate functions. Automated testing includes Unit testing and test scripts, Quality Assurance is manually looking at the system not just for bugs, but for things like consistent look and feel, performance, User interface and design issues. (see: Testing vs Checking) [...]

  43. [...] Testing as a word became very misused and was used for simple exploring and demonstrations and for checking. The word was used for various activities which I did not consider SW testing and the meaning of the [...]

  44. [...] So we’re now supposed to do more checking rather than testing? [...]

  45. [...] 2009, Michael Bolton posted on his website an article in which he differentiated between “testing” and [...]

  46. Gabriel Favu says:

    Thank you for the article.
    After reading it, the difference between the two became obvious.

  47. [...] Having read their blog post and gone through the above slide deck, one can notice similarities. This is in contrast with the original work published on Michael Bolton’s blog. [...]

Leave a Reply