Blog Posts for the ‘Rapid Software Testing’ Category

(At Least) Four Things for Testers To Do in Planning Meetings

Wednesday, October 18th, 2017

There’s much talk these days of DevOps, and Agile development, and “shift left”. Apparently, in these process models, it’s a revelation that testers can do more than test a built product, and that testers can and should be involved at every step of development.

In Rapid Software Testing, that’s not exactly news. From the beginning, we’ve rejected the idea that the product has to be complete, or has to pass some kind of “quality gate” or meet “acceptance criteria” before we start testing. We welcome the opportunity to test anything that anyone is willing to give us. We’ll happily do testing at any time from the moment someone has an idea for a product until long after the product has been released.

When testers are invited to planning meetings, there’s clearly no product to test. So what are we there for?

We’re there to learn. Testing is evaluating a product by learning about it through exploration and experimentation. At the meeting, there is a product to test. Running code is not the only kind of product we can test—not by a long shot. Ideas, designs, documents, drawings, and prototypes are products too. We can explore them, and perform thought experiments on them—and we can learn about them and evaluate them.

At the meeting, we’re there to learn about the product; to learn about the technology; to learn about the contexts in which the product will be used; to learn about plans for building the product. Our role is to become aware of all of the sources of information that might aid in our testing, and in development of the product generally. We’re there to find out about risks that threaten the value of the product in the short and long term, and about problems that might threaten the on-time, successful completion of the product.

We’re there to advocate for testability. Testability might happen by accident, without our help. It’s the role of a responsible tester to make sure that testability happens intentionally, by design. Note that testability is not just about stuff that’s intrinsic to the product. There are factors in the project, in our notions of value, and in our understanding of the risk gap that influence testability. Testability is also subjective with respect to us, our knowledge and skills, and our relationship to the team. So part of our jobs during preparation for development is to ask for the help we’ll need to make ourselves more powerful testers.

We’re there to challenge. Other people are in roles oriented towards building the product. They are focused on synthesis, and envisioning success. If they’re designers, they might be focused on helping the user to accomplish a task, on efficiency, or effectiveness, or on esthetics. If they’re business people, they might be focused on accomplishing some business goal, or meeting a deadline. Developers are often focused more on the details than on the big pictures. All of those people may be anxious to declare and meet a definition of “done”.

The testing role is to think critically about the product and the project; to ask how we might be fooling ourselves. We’re tilted towards asking good questions instead of getting “the right answer”; towards analysis more than synthesis; towards skepticism and suspicion more than optimism; towards anticipating problems more than seeking solutions. We can do those other things, but when we do, we pop for that moment out of a testing role and into a building role.

As testers, we’re trying to notice problems in what people are talking about in the meetings. We’re trying to identify obstacles that might hinder the user’s task; ways in which the product might be ineffective, inefficient, or unappealing. We’re trying to recognize how the business goal might not be met, or how the deadline could be blown. We’re alternating between small details and the big picture. We’re trying to figure out how our definition of done might be inadequate; how we might be fooling ourselves into believing we’re done when we’re not. We’re here to challenge the idea that something is okay when it might not be okay.

We’re there to establish our roles as testers. A role is a heuristic that helps in managing time, focus, and responsibility. The testing role is a commitment to perform valuable and necessary services: to focus on discovering problems, ideally early when they’re small, so that they can be prevented from turning into bigger problems later; to build a product and a project that affords rapid, inexpensive discovery and learning; and to challenge the ideas and artifacts that represent what we think we know about the product and its design. These tasks are socially, psychologically, emotionally, and politically difficult. Unless we handle them gracefully, our questioning, problem-focused role will be seen as merely disruptive, rather than an essential part of the process of building something excellent.

In Rapid Software Testing, we don’t claim that someone must be in the testing role, or must have the job title “tester”. We do believe that having someone responsible for the testing role helps to put focus on the task of providing helpful feedback. This should be a service to the project, not an obstacle. It requires us to maintain close social distance while maintaining a good deal of critical distance.

Of course, the four things that I’ve mentioned here can be done in any development model. They can be done not only in planning meetings, but at any time when we are engaging with others, at any time in the product’s development, at any level of granularity or formality. DevOps and Agile and “shift left” are context. Testing is always testing.

Some related posts:

What Exploratory Testing Is Not (Part 2): After-Everything-Else Testing

Exploratory Testing and Review

Exploratory Testing is All Around You

Testers Don’t Prevent Problems

What Is A Tester?

Testing is…

The Honest Manual Writer Heuristic

Monday, May 30th, 2016

Want a quick idea for a a burst of activity that will reveal both bugs and opportunities for further exploration? Play “Honest Manual Writer”.

Here’s how it works: imagine you’re the world’s most organized, most thorough, and—above all—most honest documentation writer. Your client has assigned you to write a user manual, including both reference and tutorial material, that describes the product or a particular feature of it. The catch is that, unlike other documentation writers, you won’t base your manual on what the product should do, but on what it does do.

You’re also highly skeptical. If other people have helpfully provided you with requirements documents, specifications, process diagrams or the like, you’re grateful for them, but you treat them as rumours to be mistrusted and challenged. Maybe someone has told you some things about the product. You treat those as rumours too. You know that even with the best of intentions, there’s a risk that even the most skillful people will make mistakes from time to time, so the product may not perform exactly as they have intended or declared. If you’ve got use cases in hand, you recognize that they were written by optimists. You know that in real life, there’s a risk that people will inadvertently blunder or actively misuse the product in ways that its designers and builders never imagined. You’ll definitely keep that possibility in mind as you do the research for the manual.

You’re skeptical about your own understanding of the product, too. You realize that when the product appears to be doing something appropriately, it might be fooling you, or it might be doing something inappropriate at the same time. To reduce the risk of being fooled, you model the product and look at it from lots of perspectives (for example, consider its structure, functions, data, interfaces, platform, operations, and its relationship to time; and business risk, and technical risk). You’re also humble enough to realize that you can be fooled in another way: even when you think you see a problem, the product might be working just fine.

Your diligence and your ethics require you to envision multiple kinds of users and to consider their needs and desires for the product (capability, reliability, usability, charisma, security, scalability, performance, installability, supportability…). Your tutorial will be based on plausible stories about how people would use the product in ways that bring value to them.

You aspire to provide a full accounting of how the product works, how it doesn’t work, and how it might not work—warts and all. To do that well, you’ll have to study the product carefully, exploring it and experimenting with it so that your description of it is as complete and as accurate as it can be.

There’s a risk that problems could happen, and if they do, you certainly don’t want either your client or the reader of your manual to be surprised. So you’ll develop a diversified set of ways to recognize problems that might cause loss, harm, annoyance, or diminished value. Armed with those, you’ll try out the product’s functions, using a wide variety of data. You’ll try to stress out the product, doing one thing after another, just like people do in real life. You’ll involve other people and apply lots of tools to assist you as you go.

For the next 90 minutes, your job is to prepare to write this manual (not to write it, but to do the research you would need to write it well) by interacting with the product or feature. To reduce the risk that you’ll lose track of something important, you’ll probably find it a good idea to map out the product, take notes, make sketches, and so forth. At the end of 90 minutes, check in with your client. Present your findings so far and discuss them. If you have reason to believe that there’s still work to be done, identify what it is, and describe it to your client. If you didn’t do as thorough a job as you could have done, report that forthrightly (remember, you’re super-honest). If anything that got in the way of your research or made it more difficult, highlight that; tell your client what you need or recommend. Then have a discussion with your client to agree on what you’ll do next.

Did you notice that I’ve just described testing without using the word “testing”?

On Scripting

Saturday, July 4th, 2015

A script, in the general sense, is something that constrains our actions in some way.

In common talk about testing, there’s one fairly specific and narrow sense of the word “script”—a formal sequence of steps that are intended to specify behaviour on the part of some agent—the tester, a program, or a tool. Let’s call that “formal scripting”. In Rapid Software Testing, we also talk about scripts as something more general, in the same kind of way that some psychologists might talk about “behavioural scripts”: things that direct, constrain, or program our behaviour in some way. Scripts of that nature might be formal or informal, explicit or tacit, and we might follow them consciously or unconsciously. Scripts shape the ways in which people behave, influencing what we might expect people to do in a scenario as the action plays out.

As James Bach says in the comments to our blog post Exploratory Testing 3.0, “By ‘script’ we are speaking of any control system or factor that influences your testing and lies outside of your realm of choice (even temporarily). This does not refer only to specific instructions you are given and that you must follow. Your biases script you. Your ignorance scripts you. Your organization’s culture scripts you. The choices you make and never revisit script you.” (my emphasis, there)

When I’m driving to a party out in the country, the list of directions that I got from the host scripts me. Many other things script me too. The starting time of the party—combined with cultural norms that establish whether I should be very prompt or fashionably late—prompts me to leave home at a certain time. The traffic laws and the local driving culture condition my behaviour and my interactions with other people on the road. The marked detour along the route scripts me, as do the weather and the driving conditions. My temperament and my current emotional state script me too. In this more general sense of “scripting”, any activity can become heavily scripted, even if it isn’t written down in a formal way.

Scripts are not universally bad things, of course. They often provide compelling advantages. Scripts can save cognitive effort; the more my behaviour is scripted, the less I have to think, do research, make choices, or get confused. In my driving example, a certain degree of scripting helps me to get where I’m going, to get along with other drivers, and to avoid certain kinds of trouble. Still, if I want to get to the party without harm to myself or other people, I must bring my own agency to the task and stay vigilant, present, and attentive, making conscious and intentional choices. Scripts might influence my choices, and may even help me make better choices, but they should not control me; I must remain in control. Following a script means giving up engagement and responsibility for that part of the action.

From time to time, testing might include formal testing—testing that must be done in a specific way, or to check specific facts. On those occasions, formal scripting—especially the kind of formal script followed by a machine—might be a reasonable approach enabling certain kinds of tasks and managing them successfully. A highly scripted approach could be helpful for rote activities like operating the product following explicitly declared steps and then checking for specific outputs. A highly scripted approach might also enable or extend certain kinds of variation—randomizing data, for example. But there are many other activities in testing: learning about the product, designing a test strategy, interviewing a domain expert, recognizing a new risk, investigating a bug—and dealing with problems in formally scripted activities. In those cases, variability and adaptation are essential, and an overly formal approach is likely to be damaging, time-consuming, or outright impossible. Here’s something else that is almost never formally scripted: the behaviour of normal people using software.

Notice on the one hand that formal testing is, by its nature, highly scripted; most of the time, scripting constrains or even prevents exploration by constraining variation. On the other hand, if you want to make really good decisions about what to test formally, how to test formally, why to test formally, it helps enormously to learn about the product in unscripted and informal ways: conversation, experimentation, investigation… So excellent scripted testing and excellent checking are rooted in exploratory work. They begin with exploratory work and depend on exploratory work. To use language as Harry Collins might, scripted testing is parasitic on exploration.

We say that any testing worthy of the name is fundamentally exploratory. We say that to test a product means to evaluate it by learning about it through experimentation and exploration. To explore a product means to investigate it, to examine it, to create and travel over maps and models of it. Testing includes studying the product, modeling it, questioning it, making inferences about it, operating it, observing it. Testing includes reporting, which itself includes choosing what to report and how to contextualize it. We believe these activities cannot be encoded in explicit procedural scripting in the narrow sense that I mentioned earlier, even though they are all scripted to some degree in the more general sense. Excellent testing—excellent learning—requires us to think and to make choices, which includes thinking about what might be scripting us, and deciding whether to control those scripts or to be controlled by them. We must remain aware of the factors that are scripting us so that we can manage them, taking advantage of them when they help and resisting them when they interfere with our mission.

Exploratory Testing 3.0

Tuesday, March 17th, 2015

This blog post was co-authored by James Bach and me. In the unlikely event that you don’t already read James’ blog, I recommend you go there now.

The summary is that we are beginning the process of deprecating the term “exploratory testing”, and replacing it with, simply, “testing”. We’re happy to receive replies either here or on James’ site.

Oracles Are About Problems, Not Correctness

Thursday, March 12th, 2015

As James Bach and I have have been refining our ideas of testing, we’ve been refining our ideas about oracles. In a recent post, I referred to this passage:

Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

While we have a great deal of respect for the work of testing pioneers like Prof. Howden, there are some problems with this description of testing and its focus on correctness.

  • Correct output from a computer program is not an absolute; an outcome is only correct or incorrect relative to some model, theory, or principle. Trivial example: Even the mathematical rule “one divided by two equals one-half” is a heuristic for dividing things. In most domains, it’s true, but as in George Carlin’s joke, when you cut a crumb in two, you don’t have two half-crumbs; you have two crumbs.
  • A product can produce a result that is functionally correct, and yet still be deeply unsatisfactory to its user. Trivial example: a calculator returns the value “4” from the function “2 + 2″—and displays the result in white on a white background.
  • Conversely, a product can produce an incorrect result and still be quite acceptable. Trivial example: a computer desktop clock’s internal state and second hand drift a few tenths of a second each second, but the program resets itself to be consistent with an atomic clock at the top of every minute. The desktop clock almost never shows the right time precisely, but the human observer doesn’t notice and doesn’t really care. Another trivial example: a product might return a calculation inconsistent with its oracle in the tenth decimal place, when only the first two or three decimal places really matter.
  • The correct outcome of a program or function is not always known in advance. Some development and testing work, like some science, is done in an attempt to discover something new; to establish what a correct answer might look like; to explore a mathematical model; to learn about the limitations of a novel system. In such cases, our ideas of correctness or acceptability are not clear from the outset, and must be developed. (See Collins and Pinch’s The Golem books, which discuss the messiness and confusion of controversial science.) Trivial example: in benchmarking, correctness is not at issue. Comparison between one system and another (or versions of the same system at different times) is the mission of testing here.
  • As we’re developing and testing a product, we may observe things that are unexpected, under-described or completely undescribed. In order to program a machine to make an observation, we must anticipate that observation and encode it. The machine doesn’t imagine, invent, or learn, and a machine cannot produce an unanticipated oracle in response to an observation. By contrast, human observers continually learn and refine their ideas on what to observe. Sometimes we observe a problem without having anticipated it. Sometimes we become aware that we’re making a new observation—one that may or may not represent a problem. Distinct from checking, testing continually affords new things to observe. Testing prompts us to decide when new observations represent problems, and testing informs decisions about what to do about them.
  • An oracle may be in error, or irrelevant. Trivial examples: a program that checks the output of another program may have its own bugs. A reference document may be outdated. A subject matter expert who is usually a reliable source of information may have forgotten something.
  • Oracles might be inconsistent with each other. Even though we have some powerful models for it, temperature measurement in climatology is inherently uncertain. What is the “correct” temperature outdoors? In the sunlight? In the shade? When the thermometer is near a building or farther away? Over grass, or over pavement? Some of the issues are described in this remarkable article (read the comments, too).
  • Although we can demonstrate incorrectness in a program, we cannot prove a program to be correct. As Djikstra put it, testing can only show the presence of errors, not their absence; and to go even deeper, Popper pointed out that theories can only be falsified, and not proven. Trivial example: No matter how many tests we run on that calculator, we can never know that it will always return 4 given the inputs 2 + 2; we can only infer that it will do so through induction, and induction can be deeply problemmatic. In a Nassim Taleb’s example (cribbed from Bertrand Russell and David Hume), every day the turkey uses induction to reinforce his belief in the farmer’s devotion to the desires and interests of turkeys—until a few days before Thanksgiving, when the turkey receives a very sudden, unpleasant, and (alas for the turkey) momentary flash of insight.
  • Sometimes we don’t need to know the correct result to know that the observed result is wrong. Trivial example: the domain of the cosine function ranges from -1 to 1. I don’t need to know the correct value for cos(72) to know that an output of 4.2 is wrong. (Elaine Weyuker discusses this in a paper called “On Testing Nontestable Programs” (Weyuker, Elaine, “On Testing Nontestable Programs”, Department of Computer Science, Courant Institute of Mathematical Sciences, New York University). “Frequently the tester is able to state with assurance that a result is incorrect without actually knowing the correct answer.”)

Checking for correctness—especially when the test output is observed and evaluated mechanically or indirectly—is a risky business. All oracles are fallible. A “passing” test, based on comparison with a fallible oracle cannot prove correctness, and no number of “passing” tests can do that. In this, a test is like a scientific experiment: an experiment’s outcome can falsify one theory while supporting another, but an experiment cannot prove a theory to be true. A million observations of white swans says nothing about the possibility that there might be black swans; a million passing tests, a million observations of correct behaviour cannot eliminate the possibility that there might be swarms of bugs. At best, a passing test is essentially the observation of one more white swan. We urge those who rely on passing acceptance tests to remember this.

A check can suggest the presence of a problem, or can at best provide support for the idea that the program can work. But no matter what oracle we might use, a test cannot prove that a program is working correctly, or that the program will work . So what can oracles actually do for us?

If we invert the focus on correctness, we can produce a more robust heuristic. We can’t logically use an oracle to prove that a system is behaving correctly or that it will behave correctly, but we can use an oracle to help falsify the theory that it is behaving correctly. This is why, in Rapid Software Testing, we say that an oracle is a means by which we recognize a problem when it happens during testing.

Give Us Back Our Testing

Saturday, February 14th, 2015

“Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

“The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.”

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

Once upon a time, computers were used solely for computation. Humans did most of the work that preceded or followed the computation, so the scope of a computer program was limited. In the earliest days, testing a program mostly involved checking to see if the computations were being performed correctly, and that the hardware was working properly before and after the computation.

Over time, designers and programmers became more ambitious and computers became more powerful, enabling more complex and less purely numerical tasks to be encoded and delegated to the machinery. Enormous memory and blinding speed largely replaced the physical work associated with storing, retrieving, revising, and transmitting records. Computers got smaller and became more powerful and protean, used not only by mathematicians but also by scientists, business people, specialists, consumers, and kids. Software is now used for everything from productivity to communications, control systems, games, audio playback, video displays, thermostats… Yet many of the software development community’s ideas about testing haven’t kept up. In fact, in many ways, they’ve gone backwards.

Ask people in the software business to describe what testing means to them, and many will begin to talk about test cases, and about comparing a program’s output to some predicted or expected result. Yet outside of software development, “testing” has retained its many more expansive meanings. A teenager tests his parents’ patience. When confronted with a mysterious ailment, doctors perform diagnostic tests with open expectations and results that must be interpreted. Writers in Cook’s Illustrated magazine test techniques for roasting a turkey, and report on the different outcomes involving the different factors—flavours, colours, moisture, textures—that they obtain. The Mythbusters, says Wikipedia, “use elements of the scientific method to test the validity of rumors, myths, movie scenes, adages, Internet videos, and news stories.”

Notice that all of these things called “testing” are focused on exploration, investigation, discovery, and learning. Yet over the last several decades, Howden’s notions of testing as checking for correctness, and of an oracle as a mechanism (or an artifact) became accepted by many people in the development and testing communities at large. Whether people were explicitly aware of those notions, they certainly seemed tacitly to have subscribed to the idea that testing should be focused on analysis of the output, displacing those deeper meanings of testing. That idea might have been more reasonable when computers did nothing but compute. Today, computers and their software are richly intertwined with daily social life and things that we value. Yet for many in software development, “testing” has this narrow, impoverished meaning, limited to what James Bach and I call checking. Checking is a tactic of testing; the part of testing that can be encoded as algorithms and that therefore can be performed entirely by machinery. It is analogous to compiling, the part of programming that can be performed algorithmically.

Oddly, since we started distinguishing between testing and checking, some people have claimed that we’re “redefining” testing. We disagree. We believe that we are recovering testing’s meaning, restoring it to its original, rich, investigative sense. Testing’s meaning was stolen; we’re stealing it back.

The Rapid Software Testing Namespace

Monday, February 2nd, 2015

Just as no one has the right to tell you what language to speak at home, nobody outside of your project has the authority to tell you how to speak inside your project. Every project develops its own namespace, so to speak, and its own formal or informal criteria for naming things inside it. Rapid Software Testing is, among other things, a project in that sense. For years, James Bach and I have been developing labels for ideas and activities that we talk about in our work and in our classes. While we’re happy to adopt useful ideas and terms from other places, we have the sole authority (for now) to set the vocabulary formally within Rapid Software Testing (RST). We don’t have the right to impose our vocabulary on anyone else. So what do we do when other people use a word to mean something different from what we mean by the same word?

We invoke “the RST namespace” when we talk about testing and checking, for example, so that we can speak clearly and efficiently about ideas that we bring up in our classes and in the practice of Rapid Software Testing. From time to time, we also try to make it clear why we use words in a specific way. For example, we make a big deal about testing and checking. We define checking as “the process of making evaluations by applying algorithmic decision rules to specific observations of a product” (and a check is an instance of checking). We define testing as “the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.” (and a test is an instance of testing).

This is in contrast with the ISTQB, which in its Glossary defines “test” as “a set of test cases”—along with “test case” as “a set of input values, execution preconditions, expected results and execution postconditions, developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement.” Interesting, isn’t it: the ISTQB’s definition of test looks a lot like our definition of check. In Rapid Software Testing, we prefer to put learning and experimentation (rather than satisfying requirements and demonstrating fitness for purpose) at the centre of testing. We prefer to think of a test as something that people do as an act of investigation; as a performance, not as an artifact.

Because words convey meaning, we converse (and occasionally argue, and sometimes passionately) the value we see in the words we choose and the ways we think of them. Our goal is to describe things that people haven’t noticed, or to make certain distinctions clear, with the goal of reducing the risk that someone will misunderstand—or miss—something important. Nonetheless, we freely acknowledge that we have no authority outside of Rapid Software Testing. There’s nothing to stop people from using the words we use in a different way; there are no language police in software development. So we’re also willing to agree to use other people’s labels for things when we’ve had the conversation about what those labels mean, and have come to agreement.

People who tout a “common language” often mean “my common language”, or “my namespace”. They also have the option to certify you as being able to pass a vocabulary test, if anyone thinks that’s important. We don’t. We think that it’s important for people to notice when words are being used in different ways. We think it’s important for people to become polyglots—and that often means working out which namespace we might be using from one moment to the next. In our future writing, conversation, classes, and other work, you might wonder what we’re talking about when we refer to “the RST namespace”. This post provides your answer.

Testing is…

Tuesday, October 28th, 2014

Every now and again, someone makes some statement about testing that I find highly questionable or indefensible, whereupon I might ask them what testing means to them. All too often, they’re at a loss to reply because they haven’t really thought deeply about the matter; or because they haven’t internalized what they’ve thought about; or because they’re unwilling to commit to any statement about testing. And then they say something vague or non-commital like “it depends” or “different things to different people” or “that’s a matter of context”, without suggesting relevant dependencies, people, or context factors. So, for those people, I offer a set of answers from which they can choose one; or they can adopt the entire list wholesale; or they use one or more items as a point of departure for something of their own invention. You don’t have to agree with any of these things; in that case, invent your own ideas about testing from whole cloth But please: if you claim to be a tester, or if you are making some claim about testing, please prepare yourself and have some answer ready when someone asks you “what is testing?”. Please. Here are some possible replies; I believe everything is Tweetable, or pretty close.

Testing is—among other things—reviewing the product and ideas and descriptions of it, looking for significant and relevant inconsistencies.
Testing is—among other things—experimenting with the product to find out how it may be having problems—which is not “breaking the product”.
Testing is—among other things—something that informs quality assurance, but is not in and of itself quality assurance.
Testing is—among other things—helping our clients to make empirically informed decisions about the product, project, or business.
Testing is—among other things—a process by which we systematically examine any aspect of the product with the goal of preventing surprises.
Testing is—among other things—a process of interacting with the product and its systems in many ways that challenge unwarranted optimism.
Testing is—among other things—observing and evaluating the product, to see where all those defect prevention ideas might have failed.
Testing is—among other things—a special part of the development process focused on discovering what could go badly (or what is going badly).
Testing is—among other things—exploring, discovering, investigating, learning, and reporting about the product to reveal new information.
Testing is—among other things—gathering information about the product, its users, and conditions of its use, to help defend value.
Testing is—among other things—raising questions to help teams to develop products that more quickly and easily reveal their own problems.
Testing is—among other things—helping programmers and the team to learn about unanticipated aspects of the product we’re developing.
Testing is—among other things—helping our clients to understand the product they’ve got so they can decide if it’s the product they want.
Testing is—among other things—using both tools and direct interaction with the product to question and evaluate its behaviours and states.
Testing is—among other things—exploring products deeply, imaginatively, and suspiciously, to help find problems that threaten value.
Testing is—among other things—performing actual and thought experiments on products and ideas to identify problems and risks.
Testing is—among other things—thinking critically and skeptically about products and ideas around them, with the goal of not being fooled.
Testing is—among other things—evaluating a product by learning about it through exploration, experimentation, observation and inference.

You’re welcome.

How Models Change

Saturday, July 19th, 2014

Like software products, models change as we test them, gain experience with them, find bugs in them, realize that features are missing. We see opportunities for improving them, and revise them.

A product coverage outline, in Rapid Testing parlance, is an artifact (a map, or list, or table…) that identifies the dimensions or elements of a product. It’s a kind of inventory of aspects of the product that could be tested. Many years ago, my colleague and co-author James Bach wrote an article on product elements, identifying Structure, Function, Data, Platform, and Operations (SFDPO; think “San Francisco DePOt”, he suggested) as a set of heuristic guidewords for creating or structuring or reviewing the highest levels of a coverage outline.

A few years later, I was working as a tester. While I was on that assignment, I missed a few test ideas and almost missed a few bugs that I might have noticed earlier had I thought of “Time” as another guideword for modeling the product. After some discussion, I persuaded James that Time was a worthy addition to the Product Elements list. I wrote my own article on that, Time for New Test Ideas).

Over the years, it seemed that people were excited by the idea of using SFDPOT as the starting point for a general coverage outline. Many people reported getting a lot of value out of it, so in my classes, I’ve placed more and more emphasis on using and practicing the application of that part of the Heuristic Test Strategy Model. One of the exercises involves creating a mind map for a real software product. I typically offer that one way to get started on creating a coverage outline is to walk through the user interface and enumerate each element of the UI in the mind map.

(Sometimes people ask, “Why bother? Don’t the specifications or the documentation or the Help file provide maps of the UI? What’s the point of making another one?” One answer is that the journey, rather than the map, is the point. We learn one set of things by reading about a product; we learn different things—and we typically learn more deeply—by touring the product, interacting with it, gaining experience with it, and organizing descriptions of what we’ve found. Moreover, at each moment, we may notice, infer, or wonder about things that the documentation doesn’t address. When we recognize something new, we can add it to our coverage model, our risk list, or our test ideas—plus we might recognize and note some bugs or issues along the way. Another answer is that we should treat anything that any documentation says about a product as a rumour until we’ve engaged with the product.)

One issue kept coming up in class: on the product coverage outline, where should the map of the user interface go? Under Functions (what the product does)? Or Operations (how people use the product)? Or Structure (the bits and pieces of the product)? My answer was that it doesn’t matter much where you put things on your coverage outline, as long as it fits for you and the people with whom you might be sharing the map. The idea is to identify things that could be tested, and not to miss important stuff.

After one class, I was on the phone with James, and I happened to mention that day’s discussion. “I prefer to put the UI under Structure,” I noted.

What? That’s crazy talk! The UI goes under Functions!”

“What?” I replied. “That’s crazy talk. The UI isn’t Functions. Sure, it triggers functions. But it doesn’t perform those functions.”

“So what?” asked James. “If it’s how the user gets at functions, it fits under Functions just fine. What makes you think the UI goes under Structure?”

“Well, the UI has a structure. It’s… structural.”

Everything has a structure,” said James. “The UI goes under Functions.”

And so we argued on. Then one of us—and I honestly don’t remember who—suggested that maybe the UI was important enough to be its own top-level product element. I do remember James pointing out that if when we think of interfaces, plural, there might be several of them—not just the graphical user interface, but maybe a command-line interface. An application programming interface.

“Hmmm…,” I said. This reminded me of the four-user model mentioned in How to Break Software (human user, API user, operating system user, file system user). “Interfaces,” I said. “Operating system interface, file system interface, network interface, printer interface, debugging interface, other devices…”

“Right,” said James. “Plus there are those other interface-y things—importing and exporting stuff, for instance.”

“Aren’t those covered under ‘Functions’?”

“Sure. Or they might be, depending on how you think about it. But the point of this kind of model isn’t to be a template, or a form you fill out. It’s to help us reduce the chances that we might miss something important. Our models are leaky abstractions; overlaps are okay,” said James. Which, of course, was exactly the same argument I had used on him several years earlier when we had added Time to the model. Then he paused. “Ah! But we don’t want to break the mnemonic, do we? San Francisco DePOT.”

“We can deal with that. Just misspell ‘depot’ San Francisco DIPOT. SFDIPOT.”

And so we updated the model.

I wonder what it will look like five years from now.

Very Short Blog Posts (11): Passing Test Cases

Wednesday, January 29th, 2014

Testing is not about making sure that test cases pass. It’s about using any means to find problems that harm or annoy people. Testing involves far more than checking to see that the program returns a functionally correct result from a calculation. Testing means putting something to the test, investigating and learning about it through experimentation, interaction, and challenge. Yes, tools may help in important ways, but the point is to discover how the product serves human purposes, and how it might miss the mark. So a skilled tester does not ask simply “Does this check pass or fail?” Instead, the skilled tester probes the product and asks a much more rich and fundamental question: Is there a problem here?