This posting is an expansion of a lightning talk that I gave at Agile 2009. Many thanks to Arlo Belshee and James Shore for providing the platform. Many thanks also to the programmers and testers at the session for the volume of support that the talk and the idea received. Special thanks to Joe (J.B.) Rainsberger. Spread the meme!
Postscript: Over the years, some people have misinterpreted this post as a rejection of checking, or of regression testing, or of testing that is assisted by automation. So in addition to reading this post, it is important that you also read this one.
There is confusion in the software development business over a distinction between testing and checking. I will now attempt to make the distinction clearer.
Checking Is Confirmation
Checking is something that we do with the motivation of confirming existing beliefs. Checking is a process of confirmation, verification, and validation. When we already believe something to be true, we verify our belief by checking. We check when we’ve made a change to the code and we want to make sure that everything that worked before still works. When we have an assumption that’s important, we check to make sure the assumption holds. Excellent programmers do a lot of checking as they write and modify their code, creating automated routines that they run frequently to check to make sure that the code hasn’t broken. Checking is focused on making sure that the program doesn’t fail.
Testing Is Exploration and Learning
Testing is something that we do with the motivation of finding new information. Testing is a process of exploration, discovery, investigation, and learning. When we configure, operate, and observe a product with the intention of evaluating it, or with the intention of recognizing a problem that we hadn’t anticipated, we’re testing. We’re testing when we’re trying to find out about the extents and limitations of the product and its design, and when we’re largely driven by questions that haven’t been answered or even asked before. As James Bach and I say in our Rapid Software Testing classes, testing is focused on “learning sufficiently everything that matters about how the program works and about how it might not work.”
Checks Are Machine-Decidable; Tests Require Sapience
A check provides a binary result—true or false, yes or no. Checking is all about asking and answering the question “Does this assertion pass or fail?” Such simple assertions tend to be machine-decidable and are, in and of themselves, value-neutral.
A test has an open-ended result. Testing is about asking and answering the question “Is there a problem here?” That kind of decision requires the application of many human observations combined with many value judgements.
When a check passes, we don’t know whether the program works; we only know that it’s still working within the scope of our expectations. The program might have serious problems, even though the check passes. To paraphrase Dkijstra, “checking can prove the presence of bugs, but not their absence.” Machines can recognize inconsistencies and problems that they have been programmed to recognize, but not new ones. Testing doesn’t tell us whether the program works either—certainty on such questions isn’t available—but testing may provide the basis of a strong inference addressing the question “problem or no problem?”
Testing is, in part, the process of finding out whether our checks have been good enough. When we find a problem through testing, one reasonable response is to write one or more checks to make sure that that particular problem doesn’t crop up again.
Whether we automate the process or not, if we could express our question such that a machine could ask and answer it via an assertion, it’s almost certainly checking. If it requires a human, it’s a sapient process, and is far more likely to be testing. In James Bach‘s seminal blog entry on sapient processes, he says, “My business is software testing. I have heard many people say they are in my business, too. Sometimes, when these people talk about automating tests, I think they probably aren’t in my business, after all. They couldn’t be, because what I think I’m doing is very hard to automate in any meaningful way. So I wonder… what the heck are they automating?” I have an answer: they’re automating checks.
When we talk about “tests” at any level in which we delegate the pass or fail decision to the machine, we’re talking about automated checks. I propose, therefore, that those things that we usually call “unit tests” be called “unit checks“. By the same token, I propose that automated acceptance “tests” (of the kind Ron Jeffries refers to in his blog post on automating story “tests”) become known as automated acceptance checks. These proposals appeared to galvanize a group of skilled programmers and testers in a workshop at Agile 2009, something about which I’ll have more to say in a later blog post.)
Testing Is Not Quality Assurance, But Checking Might Be
You can assure the quality of something over which you have control; that is, you can provide some level of assurance to some degree that it fulfills some requirement, and you can accept responsiblity if it does not fulfill that requirement. If you don’t have authority to change something, you can’t assure its quality, although you can evaluate it and report on what you’ve found. (See pages 6 and 7 of this paper, in which Cem Kaner explains the distinction between testing and quality assurance and cites Johanna Rothman‘s excellent set of questions that help to make the distinction.) Testing is not quality assurance, but acts in service to it; we supply information to programmers and managers who have the authority to make decisions about the project.
Checking, when done by a programmer, is mostly a quality assurance practice. When an programmer writes code, he checks his work. He might do this by running it directly and observing the results, or observing the behaviour of the code under the debugger, but often he writes a set of routines that exercise the code and perform some assertions on it. We call these unit “tests”, but they’re really checks, since the idea is to confirm existing knowledge. In this context, finding new information would be considered a surprise, and typically an unpleasant one. A failing check prompts the programmer to change the code to make it work the way he expects. That’s the quality assurance angle: a programmer helps to assure the quality of his work by checking it.
Testing, the search for new information, is not a quality assurance practice per se. Instead, testing informs quality assurance. Testing, to paraphrase Jerry Weinberg, is gathering information with the intention of informing a decision, or as James Bach says, “questioning a product in order to evaluate it.” Evaluation of a product doesn’t assure its quality, but it can inform decisions that will have an impact on quality. Testing might involve a good deal of checking; I’ll discuss that at more length below.
Checkers Require Specifications; Testers Do Not
A tester, as Jerry Weinberg said, is “someone who knows that things can be different”. As testers, it’s our job to discover information; often that information is in terms of inconsistencies between what people think and what’s true in reality. (Cem Kaner‘s definition of testing covers this nicely: “testing is an empirical, technical investigation of a product, done on behalf of stakeholders, with the intention of revealing quality-related information of the kind that they seek.”)
We often hear old-school “testing” proponents claim that good testing requires specifications that are clear, complete, up-to-date, and unambiguous. (I like to ask these people, “What do you mean by ‘unambiguous’?” They rarely take well to the joke. But I digress.) A tester does not require the certainty of a perfect specification to make useful observations and inferences about the product. Indeed, the tester’s task might be to gather information that exposes weakness or ambiguity in a specification, with the intention of providing information to the people who can clear it up. Part of the tester’s role might be to reveal problems when the plans for the product and the implementation have diverged at some point, even if part of the plan has never been written down. A tester’s task might be to reveal problems that occur when our excellent code calls buggy code in someone else’s library, for which we don’t have a specification. Capable testers can deal easily with such situations.
A person who needs a clear, complete, up-to-date, unambiguous specification to proceed is a checker, not a tester. A person who needs a test script to proceed is a checker, not a tester. A person who does nothing but to compare a program against some reference is a checker, not a tester.
Testing vs. Checking Is A Leaky Abstraction
Joel Spolsky has named a law worthy of the general systems movement, the Law of Leaky Abstractions (“All non-trivial abstractions, to some degree, are leaky.”). In the process of developing a product, we might alternate very rapidly between checking and testing. The distinction between the two lies primarily in our motivations. Let’s look at some examples.
- A programmer who is writing some new code might be exploring the problem space. In her mind, she has a question about how she should proceed. She writes an assertion—a check. Then she writes some code to make the assertion pass. The assertion doesn’t pass, so she changes the code. The assertion still doesn’t pass. She recognizes that her initial conception of the problem was incomplete, so she changes the assertion, and writes some more code. This time the check passes, indicating that the assertion and the code are in agreement. She has an idea to write another bit of code, and repeats the process of writing a check first, then writing some code to make it pass. She also makes sure that the original check passes. Next, she sees the possibility that the code could fail given a different input. She believes it will succeed, but writes a new check to make sure. It passes. She tries different input. It fails, so she has to investigate the problem. She realizes her mistake, and uses her learning to inform a new check; then she writes functional code to fix the problem and pass the check.So far, her process has been largely exploratory. Even though she’s been using checks to support the process, her focus has been on learning, exploring the problem space, discovering problems in the code, and investigating those problems. In that sense, she’s testing as she’s programming. At the end of this burst of development, she now has some functional code that will go into the product. As a happy side effect, she has another body of code that will help her to check automatically for problems if and when the functional code gets modified.Mark Simpson, a programmer that I spoke to at Agile 2009, said that this cyclic process is like bushwhacking, hacking a new trail through the problem space. There are lots of ways that you could go, and you clear the bush of uncertainty around you in an attempt to get to where you’re going Historically, this process has been called “test-driven development”, which is a little unfortunate in that TDD-style “tests” are actually checks. Yet it would be hard, and even a little unfair, to argue that the overall process is not exploratory to a significant degree. Programmers engaged in TDD have a goal, but the path to the goal is not necessarily clear. If you don’t know exactly where you’re going to end up and exactly how you’re going to get there, you have to do some amount of exploration. The moniker “behavior-driven development” (BDD) helps to clear up the confusion to some degree, but it’s not yet in widespread adoption. BDD uses checks in the form “(The program) should…”, but the development process requires a lot of testing of the ideas as they’re being shaped.
- Now our programmer looks over her code, and realizes that one of the variables is named in an unclear way, that one line of code would be more readable and maintainable expressed as two, and that a group of three lines could more elegantly and clearly expressed as a for loop. She decides to refactor. She addresses the problems one at a time, running her checks after each change. Her intention in running these checks is not to explore; it’s confirm that nothing’s been messed up. She doesn’t develop new checks; she’s pretty sure the old ones will do. At this point, she’s not really testing the product; she’s checking her work.
- Much of the traditional “testing” literature suggests that “testing” is a process of validation and verification, as though we already know how the code should work. Although testing does involve some checking, a program that is only checked is likely to be poorly tested. Much of the testing literature focused on correctness—which can be checked—and ignores the sapience that is necessary to inform deeper questions about value, which must be tested. For example, that which is called “boundary testing” is usually boundary checking.The canonical example is that of the program that adds two two-digit integers, where values in the range from -99 to 99 are accepted, and everything else is rejected. The classic advice on how to “test” such a program focuses on boundary conditions, given in a form something like this: “Try -99 and 99 to verify that valid values are accepted, and try -100 and 100 to verify that invalid values are rejected.” I would argue that these “tests” are so weak as to be called checks; they’re frightfully obvious, they’re focused on confirmation, they focus on output rather than outcome, and they could be easily mechanized.If you wanted to test a program like that, you’d configure, operate, observe the product with eyes open to many more risks, including ones that aren’t at the forefront of your consciousness until a problem manifests itself. You’d be prepared to consider anything that might threaten the value of the product—problems related to performance, installability, usability, testability, and many other quality criteria. You’d tend to vary your tests, rather than repeating them. You’d engage curiosity, and perform a smattering of tests unrelated to your current models of risks and threats, with the goal of recognizing unanticipated risks. You might use automation to assist your exploration; perhaps you would use automation to generate data, to track coverage, to parse log files, to probe the registry or the file system for unanticipted effects. Even if you used automation to punch the keys for you, you’d use the automation in an exploratory way; you’d be prepared to change your line of investigation and your tactics when a test reveals surprising information.
- The exploratory mindset is focused on questions like “What if…?” “I wonder…?” “How does this thing…?” “What happens when I…?” Even though we might be testing a program with a strongly exploratory approach, we will engage a number of confirmatory kinds of ideas. “If I press on that Cancel button, that dialog should go away.” “That field is asking for U.S. ZIP code; the field should accept at least five digits.” “I’ll double-click on ‘foo.doc’, and that file should open in Microsoft Word on this system.” Excellent testers hold these and dozens of other assumptions and assertions as a matter of course. We may not even be conscious of them being checks, but we’re checking sub-consciously as we explore and learn about the program. Should one of these checks fail, we might be prompted to seek new information, or if the behaviour seems reasonable, we might instead change our model of how the program is supposed to work. That’s a heuristic process (a fallible means of solving a problem or making a decision, conducive to learning; we presume that a heuristic usually works but that it might fail).
- Also at Agile 2009, Chris McMahon gave a presentation called “History of a Large Test Automation Project using Selenium”. He described an approach of using several thousand automated checks (he called them “tests”) to find problems in the application with testing. How to describe the difference? Again, the difference is one of motivation. If you’re running thousands of automated checks with the intention of demonstrating that you’re okay today, just like you were yesterday, you’re checking. You could use those automated checks in a different way, though. If you are trying to answer new questions, “what would happen if we ran our checks on thirty machines at once to really pound the server?” (where we’re focused on stress testing), or “what would happen if we ran our automated checks on this new platform?” (where we’re focused on compatibility testing), or “what would happen if we were to run our automated checks 300 times in a row?” (where we’re focused on flow testing), the category of checks would be leaking into testing (which would be a fine thing in these cases).
There will be much, much more to say about testing vs. confirmation in the days ahead. I can guarantee that people won’t adopt this distinction across the board, nor will they do it overnight. But I encourage you to consider the distinction, and to make it explicit when you can.
Postscript: Over the years, some people have misinterpreted this post as a rejection of checking, or of regression testing, or of testing that is assisted by automation. So in addition to reading this post, it is important that you also read this one.
See more on testing vs. checking.
Related: James Bach on Sapience and Blowing People’s Minds
Michael – at Agile 2008, I participated in your session in which each group created a McLuhan tetrad. Our group's topic was "automated testing" and I had a mini-epiphany because I realized that when pushed to its limits, automated testing becomes a tool used for confirmation instead of questioning. (e.g. Checking vs. Testing).
Thank you for your insights and persistence in continuing to question.
Thank you for your comment, Marisa; warmly appreciated.
—Michael B.
Michael,
You puzzle me. Almost left me with an identity crisis (I call myself test automator or test engineer).
I'm wondering if this is the definition of testing or your definition of testing. Have I been using a wrong definition of testing in all these years?
You and James have made it clear that there is no best definition of testing. Now, I know that this is not really a definition, but certainly a clear explanation of what you believe is the difference between checking and testing. But is this difference linguistically correct? And since English is not my native tongue, does that distinction also exist in other languages? E.g. 'Test' is also a Dutch word, but check is merely borrowed from English. In fact if you try to translate check to Dutch, you end up with testen (testing), onderzoeken (investigate), uitproberen (try out). Am I using the Dutch meaning of Testing too literal when I communicate in English?
Let's stick to English, for now. Are you saying a pregnancy test should be called a pregnancy check? That SAT's should be called SAC's? How about drug tests in sports? On the other hand psychological tests are more like exploratory tests, I give you that.
Isn't it that Exploratory Tests is a type of tests, but not all tests are Exploratory Tests? I'd say a test consisting of only checks is another type of tests, but still a test.
Hi, Arjan…
Good questions.
First, there is no "the" defintion of testing. It is always "someone's" defintion of testing—just as there is no property of quality that exists in a product without reference to some person and his or her notion of value.
Next, there are three definitions of "testing" that I use:
1) "Questioning a product in order to evalutate it." That comes from James Bach.
2) "An empirical, technical investigation of a product, done on behalf of stakeholders, with the intention of revealing quality-related information of the kind that they seek." That's Cem Kaner.
3) "Gathering information with the intention of informing a decision." That's a mild paraphrase of Jerry Weinberg's description of testing in Perfect Software.
Note that these all add up to pretty much the same thing. There are other definitions of testing out there. None of them is right or wrong in any absolute sense, but I (and many of my colleagues) would disagree.
Here's an example: Wikipedia's disambiguation page for "test" says that Software Testing is "the process of verifying that a software program works as expected". I disagree with that definition. It's far too limited for my purposes. If we tested according to that definition, we would focus on confirming our expectations, rather than challenging our expectations. This would leave us vulnerable to Black Swans. (In Europe, it was considered certain that all swans were white until European explorers travelled to Australia.)
Is the distinction linguistically correct? It is for me, and it seemed to be for a gang of programmers and testers in a room at Agile 2009. It might not be for you. But either way, "correctness" isn't at issue, since there's no authority who can enforce the distinction. The question is whether the distinction works for you, if it's helpful, if it triggers a different way of thinking. In addition, there's no particular harm in using "test" and "check" interchangably in common parlance.
Your example of a "pregnancy check" is excellent. Whether you call it a "check" or a "test", it's the same activity. However, consider two scenarios.
In the first, woman has been planning to have a baby, and wants to stop drinking alcohol the moment she's aware she's pregnant. Her menstrual period is a couple of days late, so she obtains a pregnancy testing kit from the pharmacy, and she performs the test. She's uncertain. She's not confirming an existing belief; she's seeking information. At that point, I'd say she's testing.
In the second scenario, it's three months later. She did the home test two months ago. After that, she went to her doctor. Her doctor confirmed that she was pregnant. Her periods stopped. She has morning sickness. Her belly is larger. Where she to get the home pregnancy kit out at this point, she'd be checking. That is, the pregnancy test wouldn't be oriented towards revealing new information; it would be strictly confirmatory. And, I think you'll agree, it would be a pretty silly thing to do.
Now there's something else that she might do at that stage, though: she might have a concern about having a baby with Down's syndrome or some other condition. She and her doctor agree that amniocentesis would be warranted. Since this is a search for new information, it's a test.
To answer your last paragraph, note the last section of my original post: the distinction between tests and checks is leaky. In genearl, though, a test is exploratory if you're seeking new information; it's confirmatory (and therefore a check, as I'd prefer to call it) if you're simply making sure of something that you believe you already know.
I hope that helps. Thanks for writing.
—Michael B.
I understand it is indeed a leaky abstraction.
We actually did the pregnancy test the day before Christmas. Indeed for alcohol reasons. The end of that project is nearing. Can't wait to test for toes and fingers
First of all, I like this article a lot – it makes it easy to understand on what some people and companies are focused on and what is being missed.
If I may poke the theory a bit – you use “check” on a high level setting it on par with “test”. Pregnancy test and pregnancy check for example.
But you also use it on a miniature level to explain what a test really is – confirmation of what the tester thinks should happen “When I press cancel this window should close”. In this example the test of the cancel button is actually a check.
I would call testing a mindset which lives in the grey area of seeking new information and confirming what we believe to know already. It includes the unexpected “I tried to confirm that the cancel button closes the window which happens, but also another window opened”. In that case two things occured. We a) completed our check and b) new information was presented to us that was outside our intention to check for.
So it comes down to a mindset – do I sit down and intent to run checks in order for the unexpected to happen? Or do I sit down with the intent to confirm what I think I know.
More often than not we focus on the latter rather than letting the unexpected happen, helped by the experience and conscious decisions of the tester.
Thomas
Thanks for a great post!
When I read the section "The exploratory mindset is focused on questions…" it made me think that
it looks like a paradox.
The more testing you do will result in less testing and more checking.
I.e., the more you test, the more you know, the more heuristics you will develop; and the more you become a checker for things that you assume or already know.
Interesting!
Great post Michael — it helps to reconcile lots of misconceptions about testing world today … "testing is bug finding", "test cases are like atoms of testing", "output/outcome based pricing", "testing by number of bugs" and so on. Now, I can explain many of these "strange" behaviors by simply saying "Well… that is checking not testing". I wonder if someone will say "So what ….? we are fine with checking" …
To extend your views on testing/checking further … let me say "Checking is confirmation through comparison (mostly)" whereas "Testing is problem solving through investigation and sapience"
I will use "checking" more often now ….
Shrini
When I was looking for a definition to Software Testing I saw the following definition:
"Software Testing is an empirical investigation conducted to provide stakeholders with information about the quality of the product or service under test[1], with respect to the context in which it is intended to operate. Software Testing also provides an objective, independent view of the software to allow the business to appreciate and understand the risks at implementation of the software." (http://en.wikipedia.org/wiki/Software_testing).
I think we are facing a context mistake.
Checking is a goal, an action to examine so as to determine accuracy, quality, or condition. In other words, make an examination or investigation, be compatible, similar or consistent, coincide in their characteristics.
Testing is an examination of the characteristics of something, a way, a technique, an instrument by which we can explore, discover, investigate, and consequently learn something.
We are able to to examine or investigate something with a purpose beyond checking, and also to explore, discover, investigate and learning something by another technique other than testing.
Indeed, we can check using testing (or not, such as theorem proving), and we can test to check something (or not, just to investigate discover limits, for example).
Checking doesn´t demand testing in the same way that testing is done not just to check.
I hope I got my concerns clear.
Hi Michael,
This is an excellent post!
When I think of scripted tests chock full of hard-coded steps and data I think of rudimentary 'checks.' Checks are often useful in our business for verifying consistency (regression, build verification, etc.).
But, of course, well-designed tests do require an intelligent and creative person which is perhaps why Beizer stated that "all testing is exploratory in nature."
- Bj -
In regards functional specs and testing. I view a functional spec as a second "product or deliverable" to be tested. While I'm "checking" the software I'm "testing" the functional spec. I’ve found a functional spec can be as or more riddled with flaws than the product itself but provides an excellent opportunity for testing. Most “old-school” proponents see this artifact/document as a way to guide and simplify testing (really checking), but I would argue that it can complicate it because now there are two products/deliverables that need to be tested (I allot time and effort toward functional spec testing).
–Benjamin Yaroch
Wonderful posting! And excellent comments too.
When *I* read the posting it struck me, that Testing consists of Exploring and Checking, while Checking is done by .. erhm.. checking.
So Testing belongs to the process domain (how we do things), while checking can be both process and execution-domain (what we do when we do things) – to be a bit academic.
This makes perfect sense – to me, at least. We may choose to be exploratory in our testing process, in which we ask new questions and do some checking to answer them, or confirmative in our testing process, where we just do the checking we've established as being sufficient.
So – no testing without checking, but we can check without being testing. In short. Perhaps too short.
Thank you for this wonderful post.
@Arjan…
We actually did the pregnancy test the day before Christmas. Indeed for alcohol reasons. The end of that project is nearing. Can't wait to test for toes and fingers
Wow… congratulations! As a daddy with a daughter, I can assure you it's a fabulous trip!
—Michael B.
@Thomas…
I like your comments. They amplify nicely on my point. Thanks!
—Michael B.
@Fernando…
I agree with the definition of testing that you quote from Wikipedia. (It comes from Cem Kaner.)
I hope I got my concerns clear.
I'm sorry, but I don't understand the distinctions that you're making. You're welcome, of course, to try again.
—Michael B.
@bj
Thanks for the kind words.
—Michael B.
@DynamoBen
Yes, you suggest an important point: if you're checking the product against the functional spec, it seems to me that you're really testing neither one. It seems to me that review, like any other testing activity, can be done in a confirmatory way or in an investigative way—going through the motions, or as a serious probe.
—Michael B.
@Carsten…
Thanks for the comments, especially
…no testing without checking, but we can check without it being testing.
—Michael B.
There is something amiss here. The word "experiment" appears nowhere in the blog post. You say that testing "is a process of exploration, discovery, investigation, and learning." OK, a very important process to be sure. But in science this process is known as experimentation. I do not recommend using the word "testing" differently than how it is used in other technical (scientific/engineering/financial) domains. It will do nothing but add to any confusion that already exists. Make up a new word or add an adjective if required. I suggest "experimental testing."
When a science experiment is performed the required equipment must be designed, fabricated, and assembled and the necessary instrumentation chosen and installed. This introduces the problem of experimenter's regress. Often a great deal of of analysis and testing must be performed on the system and components before the actual experiment can even be attempted. This preliminary analysis and testing is analogous to the type of testing you point out that requires "sapience".
So I agree that testers must do more than just check output against specifications (unless exhaustive testing, which is rarely possible). Good testers know that all computer languages and algorithms have typical failure modes independent of the whatever the functional specifications may be. The code should be examined (via "exploration and learning") for these weaknesses and checked for actual flaws. IMHO, I do not think the meme presented is the best way to bring this idea out.
Hi Michael,
First of all, thank you for the post, you have no idea of how liberating this whole "new" concept turned out to be… to make it short, I used to find the fact that I hated testing quite contradicting to my role as a Tester and now, Test Architect… but now I realize that I hate checking… better yet, that I hate having to check without having the time to test (due to allocation/deadlines, mainly)… since there’s confusion, we sometimes get stuck at it…
About the main topic here… I believe it might be a bit difficult in the beginning because there is a conflict between the "old" and "new" meaning of the word testing and people still tend to use them interchangeably… which, in my discussion groups, have made us go around in circles (pretty much like the "Bush-Hu" joke http://tinyurl.com/nl3v58)…
But then, I read the @gmcrews comment, which put the word "experiment" in the loop… which, at least for now, may make things a bit clearer… let me take @Carsten's sentence as an example: "…no testing without checking, but we can check without it being testing." My understanding of this, using the word "experimenting" was: …no testing without checking, but we can check without it being experimenting . So, since correctness isn’t at issue, and the distinction already triggered different way of thinking, in an attempt to try to help people here to understand while the new meaning sinks in, we maintain the old meaning of “testing” and subdivided it into two categories: checking and experimenting. I believed it helped.
For the record, yes, I feel more comfortable in keeping these two categories labeled as “checking” and “testing” for what I mentioned in my first paragraph and will later work on disseminating this new meaning.
Thanks again for the post and to all who commented, and sorry for the long comment.
I think this is an important and useful distinction. But I think it is too late to try to purify the usage of the word "test(ing)" to match your definition. Too many people who you say are just "checking" are called testers" and too many people consider "testing" to be what you call "checking." It would be a lot easier for you to find a new word for what you call "testing" and leave "testing" as the overarching abstraction that includes both. That way, people who are "checking" something and call it testing are being vague rather than wrong. And when someone uses your new word, we'll know that they are being precise and not talking about "checking".
Hi, all…
@gmcrews: Thank you for bringing up the experiment angle. In a later post, I hope to address the idea that we can do testing without experimentation. While people are waiting, I'd refer them to Jerry (Gerald M.) Weinberg's book Perfect Software and Other Illusions About Testing for some ideas on testing without experimentation. For example, reviews are a form of testing that aren't exactly experiments in the scientific sense. Good, novel, and useful experiments tend to be tests, but to perform them well, the experimenter must do a lot of checking, just as in software testing. In addition, some "experiments" (say, the ones that are presented in many high school classes) look more like checks than tests.
@Newton: Thanks for writing. I was inspired to give the lightning talk at Agile 2009 after someone said, loudly, "Manual testing sucks!" Nah; maybe manual checking sucks, but manual testing is great (and manual testing assisted by judicious automation is even greater). Yours is just the reaction (the thinking part, and the enjoyment of fulfilling testing) that I'm hoping to stimulate.
@Gerard: I'm not really interested in "purifying" the usage, although it was fun to hear so many people picking up on the idea at Agile 2009. There are ways of testing that reveal more or less information of different kinds in different contexts, and there are ways of testing that are more or less expensive. But for a context-driven person there's no right or wrong way to test (or check) in any absolute sense; not even context-driven testing is always the right way to go. Nor is there a right or wrong way to use the words. If people adopted testing vs. checking universally, that would be cool, I guess, but it's a pretty unrealistic to believe that they would. I'm glad you agree the distinction is useful. I'd refer you to my own comment above: "…there's no particular harm in using 'test' and 'check' interchangably in common parlance." Count me in with McLuhan; "I don't want them to agree with me; I just want them to think."
Sorry for my confusing comment. I thought the discussion was around checking in a opposite meaning of testing in the same domain. I was trying to call attention to the point that we may check by testing, and not *exclusivelly* by testing in the same manner we may test with the purpose of checking, and not *exclusivelly* to check.
After reading all point of view posted in this loop – and I am grateful to @gmcrews – I realized your intention in not to impose a new concept, better than that, point out to everybody the challenge of testing. Your post has brought me a great reflexion about testing activities in a such way that I hope to see exploratory testing being largely included – and strongly recommended – to the most of software projects from now on (at least here with my coworkers).
I believe @Newton made a comment which summarize my understanding by mentioning that testing comes in a two flavours: checking (sour) and experimenting (candy).
This is a brilliant post!
I think the distinction can be very useful to explain and market your work (as a tester), and possibly also good when discussing test strategies with management.
But I don't like that "validation" has been put in the checking group.
I agree with Wikipedia that 'validation is ensuring "you built the right product" and verification is ensuring "you built the product as intended." Validation is confirming that it satisfies stakeholder's or user's needs.'
To do this, you need to know things about the product's intended use, sapience is necessary, but not requirements.
And if you use validation on smaller part against certain quality attributes, e.g. "is this feature easy to use", I think it should be part of the "true" testing.
Michael,
Thanks for raising this perspective and I do see the merit of 'action/verb' distinction between testing and checking. But I do see a problem with the use of "check" as a noun in an Agile context.
For example, I may write an acceptance example/test/check prior to working on a new feature. The example is now used as a collaboration vehicle between the product owner and the team. As work progresses the example will be refined and may be used to faciliate both automated and exploratory evaluation of the new feature. The example may drive the design of the feature via TDD/ATDD. When the feature is complete the example is used to check that the agreed upon behaviour still works.
My point is that this single artifact has been used for many purposes. So to call this example or automated story test a "check" would not be accurate as it does so much more!
So, I agree with the verb distinction that you mention but don't see a case to change our wording around the nouns like 'test'.
Declan
@Declan
First, as both the original text and the comments outline, I'm not really interested in changing people's language. Got something that is being run entirely by automation and that receives no human evaluation? Want to call it a test? Go ahead; I wouldn't want to stop you. As long as you've considered the distinction, that's fine with me. (This topic is going to have to get its own independent blog post.)
Second (and this is the important bit) as long as the check is subject to ongoing design, analysis, evaluation, and attention from human people. It's when the engagement of humans diminishes or disappears—when the sapience gets left out—it becomes a check. If it does more than checking, it's not a check, by definition.
Cheers,
—Michael B.
OK Michael now I'm confused. You say you don't want to change the language people use yet you said in your post "I propose that automated acceptance 'tests' (of the kind Ron Jeffries refers to in his blog post on automating story 'tests') become known as automated acceptance checks".
Seems like you do want people to use the word 'check' instead of 'test'.
@declan
"Propose" has, as its first entry in Oxford (http://www.askoxford.com/concise_oed/propose?view=uk),
"put forward (an idea or plan) for consideration by others". That's exactly what I'm doing. If I had said, "I declare that everyone must use the word 'checks', instead of 'tests', for 'checks'" that would be a different statment, and I would have phrased it thus.
Again, this deserves its own blog post. I'm working on that.
—Michael B.
So by your logic, a pregnancy check is a non-pregnancy test?
I think it's fine to emphasize the value of exploration vs asserting, However I find your attempt at creating a new dictionary to solve the worlds problems incredibly naive and manipulative.
Words mean what people on average choose them to mean; they don't even obtain their meaning from mainstream dictionaries let alone ones concocted on a blog.
What was the bloody problem with just using the phrase 'exploratory testing' ? I think what you are trying to do is create a 'newspeak' in which naughty thoughts are harder to express.
@anonymous (too shy to sign)
That's actually pretty funny. Thanks.
The answer is that it might be. I think subsequent blog posts might shed further light on the subject. Some are already published:
http://www.developsense.com/2009/09/transpection-and-three-elements-of.html
http://www.developsense.com/2009/09/pass-vs-fail-vs-is-there-problem-here.html
http://www.developsense.com/2009/09/elements-of-testing-and-checking.html
Some are yet to come. Thank you for whatever patience you can lend me.
—Michael B. (brave enough to sign)
Author's note: I received another Anonymous comment, but I don't know if it's from the same Anonymous or a different Anonymous. Somehow in the process of moderating it, it got lost, so here it is.
I think it's fine to emphasize the value of exploration vs asserting, However I find your attempt at creating a new dictionary to solve the worlds problems incredibly naive and manipulative.
Words mean what people on average choose them to mean; they don't even obtain their meaning from mainstream dictionaries let alone ones concocted on a blog.
What was the bloody problem with just using the phrase 'exploratory testing' ? I think what you are trying to do is create a 'newspeak' in which naughty thoughts are harder to express.
I'll have an answer for you shortly, Anon.
Michael,
This is thought provoking stuff, so much so it nearly got me run over whilst walking the dog and mulling it over…
I wonder though, it this distinction necessary? We already have a similar distinction between two subsets of testing: verification and validation.
Now, I note that you have included validation within your definition of checking. One other reader has expressed discomfort with this, and I share that discomfort.
Let me define verification and validation as I use them;
Verification is the confirmation that a product fulfills its explicitly specified requirements, which can be evaluated as true or false. This is synonymous with what I think you mean when you say checking.
Validation is the confirmation that a product will fulfill its intended use. This transcends explicit requirements and requires subjective value judgments.
For example, I have yet to see a requirements specification which give verifiable details on expected user experience – this is a function of validation. Nor have I ever seen a requirement specification that stated “there will be no memory leaks”, yet I routinely test for memory issues because such problems limit the capacity of a product to fulfill its intended use. Again, this is validation.
Requirements, design documents, use cases, activity diagram etc are all abstractions intended convey some part of how a given product should operate. Of course, these are all “leaky”, there will be omissions, ambiguities and contradictions. These leaks are where verification ends and validation starts.
Please could you explain what you mean by validation that makes it a subset of checking?
Cheers,
Iain
@Iain…
This is thought provoking stuff, so much so it nearly got me run over whilst walking the dog and mulling it over…
I'm gratified that I've provoked so much thought. I'd be less happy about people dying, so I urge: don't think about this too much. "Too much" in this context means that sloppy thinking about testing should be vulnerable, but the well-being of people and dogs should not.
Verification is the confirmation that a product fulfills its explicitly specified requirements, which can be evaluated as true or false. This is synonymous with what I think you mean when you say checking. Validation is the confirmation that a product will fulfill its intended use. This transcends explicit requirements and requires subjective value judgments.
Personally, I've never found the distinction betwen verification and validation to be terribly helpful. It goes back to the old saw that "verification shows that we've built the product right; validation shows that we've built the right product". Or is that the other way around? Just this week I was asked to mediate a discussion between two testers who were arguing which was which entirely outside of a particular context, and without reference to a problem that it might help them solve. Who cares?
The reason that neither one particularly works for me is that both are focused on confirmation; both are focused on right. That's something that's relatively easy to demonstrate. It's relatively straightforward to show that a program satisfies a condition, or a set of conditions, that we already have in mind. There's a wonderful passage in Testing Computer Software: "Give us a list of your test cases. We can write a program that will pass all your tests but still fail spectacularly on an input you missed. If we can do this deliberately, our contention is that we or other programmers can do it accidentally."
Now, it appears to me that you have a differnt spin from mine on what "validation" means to you:
For example, I have yet to see a requirements specification which give verifiable details on expected user experience – this is a function of validation. Nor have I ever seen a requirement specification that stated “there will be no memory leaks”, yet I routinely test for memory issues because such problems limit the capacity of a product to fulfill its intended use. Again, this is validation.
First, I have seen a requirements spec that states “there will be no memory leaks”; I've written more than one. In any case, there are things that you can do to check for memory leaks—using a static analysis tool, for example. I think you're suggesting that that might not be enough, and I might be inclined to agree with you. But (for example) in a Windows environment, where getting a clear handle on memory management is relatively difficult and a static analysis tool is relatively easier, I might well choose to check for that, rather than test. The impulse to ask and answer the question is testing; the mechanics of answering it, in this case, could be delegated to checking.
Requirements, design documents, use cases, activity diagram etc are all abstractions intended convey some part of how a given product should operate. Of course, these are all “leaky”, there will be omissions, ambiguities and contradictions. These leaks are where verification ends and validation starts.
I think you probably see validation as a more exploratory activity than I do. That's okay; were we working together on a real project, we'd sort that out. You have a more expansive notion of "confirmation" than I, maybe. However, I think we agree that your activities that go outside the spec amount to testing, the way you do them. Words remain slippery.
Please could you explain what you mean by validation that makes it a subset of checking?
Maybe the other way around; checking is an activity that focuses on validation.
—Michael B.
Hi Michael,
Great topic, and great responses! Much has been said already, much I agree with and some I don't. So here is my 2 cents:-)
Everybody in the thread agrees that "there are 2 different kind of things that should not be confused". A lot of the discussion that goes on is about what that difference actually is and if those things should be called testing, checking, validation, verification or whatever. Also a lot of debate was about the issue whether or not testing must be exploratory.
I'll try not to repeat too much of what has been said already, but add another dimension to the discussion: We're in the business of "providing answers to certain questions"
For convenience I'll stick with the words "testing" and "checking".
- Checking is used to provide an answer to a SIMPLE question, such as "Does this function, when activting this option, produce that result?"
- Testing is used to provide an answer to a COMPLICATED question, such as "Does this function work as desired?" or "Would it be wise to ship this version of the product?".
Usually the answer to a simple question can be a simple answer, such as Yes or No, much as you pointed out before.
). It deserves a complicated answer, that presents various views on the problem and various shades between the black and white (or Yes and No).
A complicated question usually cannot be answered with a simple answer (Exception: the question about Life, the Universe and Everything. See Douglas Adams
With "checking", the question is (or even stronger: must be) absolutely clear, and answering it should be fairly straightforward. Of course we want such a question to be "testable" (Hmmm.. this might become confusing now
With "testing", answering the question is not straightforward at all because the question is usually not yet clear at all and triggers lots of other questions.
Testing is indeed investigating and it provides an answer that takes a while to formulate and explain to the people that asked the question.
I think that that investigation can be done in different ways:
- the exploratory approach: We don't know yet where we want to go and what to do exactly, but we'll find out along the way.
- the planned and scripted approach: First we find out where we want to go and what we want to achieve and then do what we planned. (People using this aproach often forget that 'unplanned events' are inevitable and that their plan should include how to cope with that.)
Both approaches may or may not include the activity of answering simple questions (running a test case could serve this purpose). So here I agree with people who suggested that checking can be a part of testing but not the other way round.
Which approach do I believe in?
I like to first find out as much as possible about where we (or rather: the customer) want to go and then plan how to get there. Then I find out things and learn along the way, and change the plan (including 'where we want to go') accordingly.
Is that 'planned and scripted'? Or is it 'exploratory'? I don't like to have to choose between the two. I don't believe in black and white… unless it is about chocolates
Kind regards, Bart
@Bart…
Thank you for the feedback.
Checking is used to provide an answer to a SIMPLE question, such as "Does this function, when activting this option, produce that result?"
- Testing is used to provide an answer to a COMPLICATED question, such as "Does this function work as desired?" or "Would it be wise to ship this version of the product?".
Yes. However, simplicity and complexity are in the mind of the beholder. A check might involve an extremely complicated algorithm that comes to a decidable result, where a test might involve some relatively simple observations. Yet I think you're identifying something important. I'd suggest that the simplicity is not in the questions, but in the answers. Checking questions are closed questions that have a narrow domain of answers (true/false, yes/no, pass/fail), and that testing questions are open questions (good enough/not good enough, too slow/fast enough, problem/no problem). There's an absolute nature to a check, and a much more fuzzy nature to a test.
Cheers,
—Michael B.
Thanks for a brilliant post. I especially liked the description of programming using TDD, about how tests become checks.
The post has put some ideas into concrete which were floating round my head.
This just makes me more certain that there is a need for automated checking *and* real testing using human beings
I second Arjan comments you left me with identity crisis. I am also involved in automating of tests (rather automating checks
).I do know the limitations of automation checking, it can’t find unexpected failure and does check only against expected results. As I am not native English speaker, I was using automation testing, but this is really an eye opener.
I do have difficulty in convincing my managers that automation can be done only for a set of checks like Build verification checks and to some extent in regression checking. Automation never finds an unexpected failure. It checks against an expected result. But now I can always refer this article to argue with my managers
Thanks for making it so simple,now I can easily use checking and testing at appropriate places
–Dhanasekar S
Well done, Michael! With this post, you took my perfectly good understanding of testing, shook it up by contaminating it with "checking", then put it into order again, but with two distinct parts that immediately make sense to me. I now agree… there is testing and there is checking. I think this post is one of the most (in)famous and important posts to come along in our industry in a long time.
[...] http://www.developsense.com/blog/2009/08/testing-vs-checking/ http://www.developsense.com/blog/2009/09/transpection-and-three-elements-of/ http://www.developsense.com/blog/2009/09/pass-vs-fail-vs-is-there-problem-here/ [...]
[...] vs Checking was covered in the interview. Check out Michael’s blog entry Testing vs Checking for full context on this excellent discussion [...]
At last. Common sense comes to testing!
Thanks for writing, Christo.
Most interesting article Michael. I often hear managers say things like “Lets automate and we can save on testers”. My response, after beating them in the head
is: “Machines don’t find new defects, at least not on purpose. Only humans do”. Do you agree with this statement?
Not with the first part! Machines don’t find any defects, not even on purpose. Only humans do. That is, I agree with you, but more than you do. [grin /]
It’s important to note that machines extend our senses, but they don’t do any reasoning on their own. The machine can’t tell if there’s a problem in the product, in the test, in the instrumentation. The machine doesn’t know, can’t know, if what it’s detecting is a defect or not. Only a human can do that, as you rightly point out.
Great post! nicely differentiated between Checking and Testing.
my only concern is, more distinctions can lead more confusions. Still many testers are confused between verification & validation, QA & QC…and so on.
- Selim
Thank you again for commenting.
I encourage you to read the whole series, especially Tests vs. Checks: The Motive for Distinguishing.
When you suggest that more distinctions cause more confusions, there’s a corresponding idea that people would be less confused if we had fewer distinctions. If that were true, wouldn’t we clear up all confusion by referring to everything as a thing, and referring to every activity as “doing”?
I think more distinctions cause more confusion when the distinctions aren’t terribly meaningful or useful or relevant to people. I assert that testers get confused over quality assurance and quality control because testers do neither of the two, and thus the distinction isn’t grounded in relevance to their own work. “It says ‘quality assurance’ in my job description, but what I do is test.” That’s why they get confused.
“I encourage you to read the whole series, especially Tests vs. Checks: The Motive for Distinguishing.”
Sure, i will do. Also yesterday Pradeep Soundararajan posted experience report of testing vs checking on his blog based on your series of posts about these. Might be i was wrong to comment before reading the whole series.
Thanks for your response and great work!
- Selim
[...] Bolton expresses his views on confirmation in his blog post here. In general, his view is that confirmation is the simple act of checking the correctness of known [...]
[...] testing once you execute the already created test cases. Read Michael Bolton’s Testing vs Checking [...]
[...] answer for the nature of automation and why it is not “testing” so much as “checking.” As usual he makes a very compelling case in approximately a thousand words or less, so [...]
[...] between testing and checking. If you’re not familiar with this then go and check out this article now. Sure I want you to read my article, but Michael’s argument is important and you really [...]
Based on what you’re writing here, a test case would be something that strives to reveal new information without forcing the tester to follow set rules ie. it would need to be high-level and urging the tester to explore independently. So, this got me thinking – shouldn’t we actually dump the term “test case”, or at least complement it with the term “check case”?
What people – generally – seem to mean with the term “test case” is actually pretty close to your description of checking ie. following a clear (and strict, to a varying degree) set of instructions to ensure the application works as expected or, in other words, not necessarily revealing anything new but just getting the confidence “it’s working okay”.
As a tester, I may be doing exploratory work while *creating* a test case for a new functionality or feature. However, once the exploratory work is done (presuming that I will complete writing the test case in a single run) and we move to an execution phase the work may no longer be exploratory and, based on your blog post, the test case ceases being a test case any longer and becomes a check case instead.
While this may or may not be a moot point to consider I like consistency as it helps narrow down communication gaps.