Blog Posts from February, 2015

Very Short Blog Posts (26): You Don’t Need Acceptance Criteria to Test

Tuesday, February 24th, 2015

You do not need acceptance criteria to test.

Reporters do not need acceptance criteria to investigate and report stories; scientists do not need acceptance criteria to study and learn about things; and you do not need acceptance criteria to explore something, to experiment with it, to learn about it, or to provide a description of it.

You could use explicit acceptance criteria as a focusing heuristic, to help direct your attention toward specific things that matter to your clients; that’s fine. You might choose to use explicit acceptance criteria as claims, oracles that help you to recognize a problem that happens as you test; that’s fine too. But there are many other ways to identify problems; quality criteria may be tacit, not explicit; and you may discover many problems that explicit acceptance criteria don’t cover.

You don’t need acceptance criteria to decide whether something is acceptable or unacceptable. As a tester you don’t have decision-making authority over acceptability anyway. You might use acceptance criteria to inform your testing, and to identify threats to the value of the product. But you don’t need acceptance criteria to test.

Very Short Blog Posts (25): Testers Don’t Break the Software

Tuesday, February 17th, 2015

Plenty of testers claim that they break the software. They don’t really do that, of course. Software doesn’t break; it simply does what it has been designed and coded to do, for better or for worse. Testers investigate systems, looking at what the system does; discovering and reporting on where and how the software is already broken; identifying the circumstances in which the system will fail under load or stress.

It might be a good idea to consider the psychological and public relations problems associated with claiming that you break the software. Programmers and managers might subconsciously harbour the idea that the software was fine until the testers broke it. The product would have shipped on time, except the testers broke it. Normal customers wouldn’t have problems with the software; it’s just that the testers broke it. There are no systemic problems in the project that lead to problems in the product; nuh-uh, the testers broke it.

As an alternative, you could simply say that you investigate the software and report on what it actually does—instead of what people hope or wish that it does. Or as my colleague James Bach puts it, “We don’t break the software. We break illusions about the software.”

Give Us Back Our Testing

Saturday, February 14th, 2015

“Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

“The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.”

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

Once upon a time, computers were used solely for computation. Humans did most of the work that preceded or followed the computation, so the scope of a computer program was limited. In the earliest days, testing a program mostly involved checking to see if the computations were being performed correctly, and that the hardware was working properly before and after the computation.

Over time, designers and programmers became more ambitious and computers became more powerful, enabling more complex and less purely numerical tasks to be encoded and delegated to the machinery. Enormous memory and blinding speed largely replaced the physical work associated with storing, retrieving, revising, and transmitting records. Computers got smaller and more protean, used not only by mathematicians but also by scientists, business people, specialists, consumers, and kids. Software is now used for everything from productivity to communications, control systems, games, audio playback, video displays, thermostats… Yet many of the software development community’s ideas about testing haven’t kept up. In fact, in many ways, they’ve gone backwards.

Ask people in the software business to describe what testing means to them, and many will begin to talk about test cases, and about comparing a program’s output to some predicted or expected result. Yet outside of software development, “testing” has retained its more expansive meanings. A teenager tests his parents’ patience. When confronted with a mysterious ailment, doctors perform diagnostic tests with open expectations and results that must be interpreted. Writers in Cook’s Illustrated magazine test techniques for roasting a turkey, and report on the different outcomes they obtain, and the factors involved—flavours, colours, moisture, textures. The Mythbusters, says Wikipedia, “use elements of the scientific method to test the validity of rumors, myths, movie scenes, adages, Internet videos, and news stories.”

Notice that all of these things called “testing” are focused on exploration, investigation, discovery, and learning. Yet over the last several decades, Howden’s notions of testing as checking for correctness, and of an oracle as a mechanism (or an artifact), became accepted by many people in the development and testing communities at large. Whether or not people were explicitly aware of those notions, they certainly seemed to have subscribed tacitly to the idea that testing should be focused on analysis of the output, displacing those deeper meanings of testing. That idea might have been more reasonable when computers did nothing but compute. Today, computers and their software are richly intertwined with daily social life and things that we value. Yet for many in software development, “testing” has this narrow, impoverished meaning, limited to what James Bach and I call checking. Checking is a tactic of testing: the part of testing that can be encoded as algorithms, and that can therefore be performed entirely by machinery. It is analogous to compiling, the part of programming that can be performed algorithmically.
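To make that concrete, here is a minimal sketch of a check, in Python. The pricing function and the hand-calculated value are invented for illustration; the point is that a check applies an algorithmic decision rule to a specific observation of the product, with a hand-calculated value (one of Howden’s oracle forms) standing in as the oracle.

def total_price_cents(unit_price_cents, quantity, tax_rate):
    # The "product" under observation: a hypothetical pricing routine.
    return round(unit_price_cents * quantity * (1 + tax_rate))

def check_total_price():
    # Oracle: a hand-calculated value, one of Howden's oracle forms.
    # 3 items at 199 cents each, 13% tax: 199 * 3 * 1.13 = 674.61; rounds to 675.
    observed = total_price_cents(199, 3, 0.13)
    return observed == 675  # the algorithmic decision rule; True means "pass"

print("check passed" if check_total_price() else "check failed")

The machinery can tell us whether the observed and expected values match. It cannot tell us whether rounding to the nearest cent was a wise design choice, whether the hand calculation was itself mistaken, or whether anything else about the product threatens its value. That evaluation is testing.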

Oddly, since we started distinguishing between testing and checking, some people have claimed that we’re “redefining” testing. We disagree. We believe that we are recovering testing’s meaning, restoring it to its original, rich, investigative sense. Testing’s meaning was stolen; we’re stealing it back.

Very Short Blog Posts (24): You Are Not a Bureaucrat

Saturday, February 7th, 2015

Here’s a pattern I see fairly often at the end of bug reports:

Expected: “Total” field should update and display correct result.
Actual: “Total” field updates and displays incorrect result.

Come on. When you write a report like that, can you blame people for thinking you’re a little slow? Or that you’re a bureaucrat, and that testing work is mindless paperwork and form-filling? Or perhaps that you’re being condescending?

It is absolutely important that you describe a problem in your bug report, and how to observe that problem. In the end, a bug is an inconsistency between a desired state and an observed state; between what we want and what we’ve got. Identifying the nature of that inconsistency matters, and oracles are our means of recognizing and describing problems. But in the relationship between your observation and the desired state, the expectation is a middleman: your expectation is grounded in a principle based on some desirable consistency. When you need to make that principle explicit, leave out the boilerplate expectation and go directly for a good oracle instead.
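For instance, a report grounded in an oracle might look like this (the field name and the figures here are invented for illustration):

Problem: After a third item is added to the cart, the “Total” field shows $19.98, but the three line items sum to $29.97. The displayed total is inconsistent with the product’s own line-item amounts.

No “expected” and “actual” boilerplate; the report names the consistency principle that the observation violates.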

The Rapid Software Testing Namespace

Monday, February 2nd, 2015

Just as no one has the right to tell you what language to speak at home, nobody outside of your project has the authority to tell you how to speak inside your project. Every project develops its own namespace, so to speak, and its own formal or informal criteria for naming things inside it. Rapid Software Testing is, among other things, a project in that sense. For years, James Bach and I have been developing labels for ideas and activities that we talk about in our work and in our classes. While we’re happy to adopt useful ideas and terms from other places, we have the sole authority (for now) to set the vocabulary formally within Rapid Software Testing (RST). We don’t have the right to impose our vocabulary on anyone else. So what do we do when other people use a word to mean something different from what we mean by the same word?

We invoke “the RST namespace” when we talk about testing and checking, for example, so that we can speak clearly and efficiently about ideas that we bring up in our classes and in the practice of Rapid Software Testing. From time to time, we also try to make it clear why we use words in a specific way. For example, we make a big deal about testing and checking. We define checking as “the process of making evaluations by applying algorithmic decision rules to specific observations of a product” (and a check is an instance of checking). We define testing as “the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.” (and a test is an instance of testing).

This is in contrast with the ISTQB, which in its Glossary defines “test” as “a set of test cases”—along with “test case” as “a set of input values, execution preconditions, expected results and execution postconditions, developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement.” Interesting, isn’t it: the ISTQB’s definition of test looks a lot like our definition of check. In Rapid Software Testing, we prefer to put learning and experimentation (rather than satisfying requirements and demonstrating fitness for purpose) at the centre of testing. We prefer to think of a test as something that people do as an act of investigation; as a performance, not as an artifact.
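To make the resemblance concrete, here is a rough sketch, ours and not from either glossary, of an ISTQB-style test case rendered as a Python data structure; the names are hypothetical.

from dataclasses import dataclass
from typing import Any, Callable, Dict, List

@dataclass
class ISTQBStyleTestCase:
    inputs: Dict[str, Any]      # "a set of input values"
    preconditions: List[str]    # "execution preconditions"
    expected: Any               # "expected results"
    postconditions: List[str]   # "execution postconditions"

def run(case: ISTQBStyleTestCase, product: Callable[..., Any]) -> bool:
    # Everything here is mechanical: apply the inputs, compare the output.
    observed = product(**case.inputs)
    return observed == case.expected

case = ISTQBStyleTestCase(inputs={"a": 2, "b": 3}, preconditions=[],
                          expected=5, postconditions=[])
print(run(case, lambda a, b: a + b))  # True

Notice what the structure leaves no room for: questioning, study, modeling, observation beyond the specified comparison, inference. Those live in the performance of testing, not in the artifact.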

Because words convey meaning, we discuss (and occasionally argue about, sometimes passionately) the value we see in the words we choose and the ways we think of them. Our aim is to describe things that people haven’t noticed, or to make certain distinctions clear, reducing the risk that someone will misunderstand—or miss—something important. Nonetheless, we freely acknowledge that we have no authority outside of Rapid Software Testing. There’s nothing to stop people from using the words we use in a different way; there are no language police in software development. So we’re also willing to use other people’s labels for things, once we’ve had the conversation about what those labels mean and have come to agreement.

People who tout a “common language” often mean “my common language”, or “my namespace”. They may also offer to certify you for being able to pass a vocabulary test, if anyone thinks that’s important. We don’t. We think that it’s important for people to notice when words are being used in different ways. We think it’s important for people to become polyglots—and that often means working out which namespace we might be using from one moment to the next. In our future writing, conversation, classes, and other work, you might wonder what we’re talking about when we refer to “the RST namespace”. This post provides your answer.