Archive for the ‘Exploratory Testing’ Category

What Exploratory Testing Is Not (Part 5): Undocumented Testing

Wednesday, December 21st, 2011

This week I had the great misfortune of reading yet another article which makes the false and ridiculous claim that exploratory testing is “undocumented”. After years and years of plenty of people talking about and writing about and practicing excellent documentation as part of an exploratory testing approach, it’s depressing to see that there are still people shovelling fresh manure onto a pile that should have been carted off years ago.

Like the other approaches to test activities that have been discussed in this series (“touring“, “after-everything-else“, “tool-free“, and “quick testing“), “documented vs. undocumented” is in a category orthogonal to “exploratory vs. scripted”. True: usually scripted activities are performed by some agency following a set of instructions that has been written down somewhere. But we could choose to think of “scripted” in a slightly different and more expansive way, as “prescriptive”, or “mimeomorphic“. A scripted activity, in this sense, is one for which the actions to be performed have been established in advance, and the choices of the actions are not determined by the agency performing them. In that sense, a cook at McDonalds doesn’t read a script as he prepares your burger, but the preparation of a McDonald’s burger is a highly scripted activity.

Thus any kind of testing can be heavily documented or completely undocumented. A thoroughly documented test might be highly exploratory in nature, or it might be highly scripted.

In the Rapid Software Testing class, James Bach and I point out that when someone says “that should be documented”, what they’re really saying is “that should be documented if and how and when it serves our purposes.” So, let’s start by looking at the “when”.

When we question anything in order to evaluate it, there are moments in the process in which we might choose to record ideas or actions. I’ve broken these down into three basic categories that I hope you find helpful:

  • Before

  • During

  • After

There are “before”, “during”, and “after” moments with respect to any test activity, whether it’s a part of test design, test execution, result interpretation, or learning. Again, a hallmark of exploratory testing is the tester’s freedom and responsibility to optimize the value of the work as it’s happening. That means that when it’s important to record something, the tester is not only welcome but encouraged to

  • pick up a pen
  • take a screen shot
  • launch a session of Rapid Reporter
  • create or update a mind map
  • fire up a screen recorder
  • initiate logging (if it doesn’t start by default on the product you’re testing—and if logging isn’t available, you might consider identifying that as a testability problem and a related product and project risk)
  • sketch out a flowchart diagram
  • type notes into a private or shared repository
  • add to a table of data in Excel
  • fire off a note to a programmer or a product owner

and that’s an incomplete list. But they’re all forms of documentation.

Freedom to document at will should also mean that the tester is free to refrain from documenting something when the documentation doesn’t add value. At the same time, the tester is responsible and accountable for that decision. In Rapid Testing, we recommend writing down (or saving, or illustrating) only the things that are necessary or valuable to the project, and only when the value of doing so exceeds the cost. This doesn’t mean no documentation; it means the most informative yet fastest and least expensive documentation that completely fulfils the testing mission. Integrating that with testing work leads, we hold, to excellent testing—but it takes practice and skill.

For most test activities, it’s possible to relay information to other people orally, or even sometimes by allowing people to observe our behaviour. (At the beginning of the Rapid Testing class, I sometimes silently hold aloft a 5″ x 8″ index card in landscape orientation. I fold it in half along the horizontal axis, and write my first name on one side using a coloured marker. Everyone in the class mimics my actions. Without a single word of instruction being given or questions being asked, either verbally or in writing, the mission has been accomplished: each person now has a tent card in front of him.)

There’s a potential risk associated with an exploratory approach: that the tester might fail to document something important. In that case, we do what skilled people do with risk: we manage it. James Bach talks at length about managing exploratory testing sessions here. Producing appropriate documentation is partly a technical process, but the technical considerations are dominated by business imperatives: cost, value, and risk. There are also social considerations, too. The tester, the test lead, the test manager, the programmers, other managers, and the product owner determine collaboratively what’s important to document and what’s not so important with respect to the current testing mission. In an exploratory approach, we’re more likely to be emphasizing the discovery of new information. So we’re less likely to spend time on documenting what we will do, more likely to document what we are doing and what we have done. We could do a good deal of preparatory reading and writing, even in an exploratory approach—but we realize that there’s an ever-increasing risk that new discoveries will undermine the worth of what we write ahead of time.

That leads directly to “our purposes”, the task that we want to accomplish when documenting something. Just as testing itself has many possible missions, so too does test documentation. Here’s a decidedly non-exhaustive list, prepared over a couple of minutes:

  • to express testing strategy and tactics for an entire project, or for projects in general
  • to keep a set of personal notes to help structure a debriefing conversation
  • to outline testing activities for a test cycle
  • to report on activities during testing execution
  • to outline attributes of a particular quality criterion
  • to catalogue ideas about risk
  • to describe test coverage
  • to account for the work that we’ve done
  • to program a machine to perform a given set of actions
  • to alert people to potential problems in the product
  • to guide a tester’s actions over a test session
  • to identify structures in the application or service
  • to provide a description of how to use a particular test tool that we’ve crafted
  • to describe the tester’s role, skills, and qualifications
  • to explain business rules to someone else on the team
  • to outline scenarios in which the product might be used or tested
  • to identify, for a tester, a specific, explicit sequence of actions to perform, input to provide, and observations to make

That last item is the classic form of highly scripted testing, and that kind of documentation is usually absent from exploratory testing. Even so, a tester can take an exploratory approach using a script as a point of departure or as a reference, just as you might use a trail map to help guide an off-trail hike (among other things, you might want to discover shortcuts or avoid the usual pathways). So when someone says that “exploratory testing is undocumented”, I hear them saying something else. I hear them saying, “I only understand one form of test documentation, and I’ve successfully ignored every other approach to it or purpose for it.”

If you look in the appendices for the Rapid Software Testing class (you can find a .PDF at http://www.satisfice.com/rst-appendices.pdf), you’ll see a large number of examples of documentation that are entirely consistent with an exploratory approach. That’s just one source. For each item in my partial list above, here’s a partial list of approaches, examples, and tools.

Testing strategy and tactics for an entire project, or for projects in general.
Look at the Satisfice Heuristic Test Strategy Model and the Context Model for Heuristic Test Planning (these also appear in the RST Appendices).

An outline of testing activities for a test cycle.
Look at the General Functionality and Stability Test Procedure for Certifed for Microsoft Windows Logo. See also the OWL Quality Plan (and the Risk and Task Correlation) in the RST Appendices.

Keeping a set of personal notes to help structure a debriefing or other conversation.
See the “Beans ‘R Us Test Report” in the RST Appendices; or see my notes on testing an in-flight entertainment system which I did for fun on a flight from India to Amsterdam.

Recording activities and ideas during test execution
A video camera or a screen recording tool can capture the specific actions of a tester for later playback and review. Well-designed log files may also provide a kind of retrospective record about what was testing. Still neither of these provide insight into the tester’s mind. Recorded narration or conversation can do that; tools like BB Test Assistant, Camtasia, or Morae can help. The classic approach, of course, is to take notes. Have a look at my presentation, “An Exploratory Tester’s Notebook“, which has examples of freestyle notes taken during an impromptu testing session, and detailed, annotated examples of Session-Based Test Management sessions. Shmuel Gerson’s Rapid Reporter and Jonathan Kohl’s Session Tester are tools oriented towards taking notes (and, in the former case, including screen captures) of testing sessions on the fly.

Outlining many attributes of a particular quality criterion
See “Heuristics of Software Testability” in the RST Appendices for one example.

Cataloguing ideas about risk
Several examples of this in the RST Appendices, most extensively in the “Deployment Planning and Risk Analysis” example. You’ll also find an “Install Risk Catalog”; “The Risk of Incompatibility”; the Risk vs. Tasks section in the “OWL Quality Plan”; the “Y2K Compliance Report”; “Round Results Risk A”, which shows a mapping of Risk Areas vs. Test Strategy and Tasks.

Describing or outlining test coverage
A mapping establishes or illustrates relationships between things. We can use any of these to help us think about test coverage. In testing, a map might look like a road map, but it might also look like a list, a chart, a table, or a pile of stories. These can be constructed before, after, or during a given test activity, with the goal of covering the map with tests, or using testing to extend the map. I catalogued several ways of thinking about coverage and reporting on it, in three articles Got You Covered, Cover or Discover, and A Map By Any Other Name. Several examples of lightweight coverage outlines can be found in the RST Appendices (“Putt Putt Saves the Zoo”, “Table Formatting Test Notes”, There are also coverage ideas incorporated into the Apollo mission notes that we’ve titled “Guideword Heuristics for Astronauts”).

Accounting for testing work that we’ve done.
See Session-Based Test Management, and see “An Exploratory Tester’s Notebook“. Darren McMillan provides excellent examples of annotated mind maps; scroll down to the section headed “Session Reports”, and continue through “Simplifying feedback to management” and “Simplifying feedback to groups”. A forthcoming article, written by me, shows how a senior test manager tracks testing sessions at a half-day granularity level.

Programming a machine to help you to explore
See all manner of books on programming, both references and cookbooks, but for testers in particular, have a look at Brian Marick’s Everyday Scripting with Ruby. Check out Pete Houghton’s splendid examples of exploratory test automation that begin here. Cem Kaner (often in collaboration with Doug Hoffman) write extensively about automation-assisted exploratory testing; an example is here.

Alerting people to potential problems in the product
In general, bug reporting systems provide one way to handle the task of recording and reporting problems in the product. James Bach provides an example of a report that he provided to a client (along with a more informal account of the session).

Guiding a tester’s actions over a test session
Guiding a tester involves skills like chartering and checklisting. Start with the documentation on Session Based Test Management (http://www.satisfice.com/sbtm). Selena Delesie has produced an excellent blog post on chartering exploratory testing sessions. The title of Cem Kaner’s presentation at CAST 2008, The Value of Checklists and the Danger of Scripts: What legal training suggests for testers describes the content perfectly. Michael Hunter’s You Are Not Done Yet lists can be used and adapted to your context as a set of checklists.

To identify structures in the application or service
The “Product Elements” section in the Heuristic Test Strategy Model provides a kind of framework for documenting product structures. In the RST Appendices, the test notes for “Putt Putt Saves the Zoo” and “Diskmapper”, and the “OWL Quality Plan” provide examples of identifying several different structures in the programs under test. Mind mapping provides a means of describing and illustrating structures, too; see Darren McMillan’s examples here and here. Ruud Cox and Ru Cindrea used a mind map of product elements to help win the Best Bug Report award in the Test Lab at EuroSTAR 2011. I’ve created a list of structures that support exploratory testing, and many of these are related to structures in the product.

Providing a description of how to use a particular test tool that we’ve crafted
While working at a bank, I developed (in Excel and VBA) a tool that could be used as an oracle and as a way of recording test results. (Thanks to non-disclosure agreements, I can describe these, but cannot provide examples.) When I left the project, I was obliged to document my work. I didn’t work on the assumption that anyone off the street would be reading the document. Instead, I presumed that anyone assigned to that testing job and to using that tool, would have the rapid learning skill to explore the tool, the product, and the business domain in a mutually supportive way. So I crafted documentation that was intended to tell testers just enough to get them exploring.

Explaining business rules to someone else on the team
I did include documentation for novices of one kind: within the documentation for that testing tool, I included a general description of how foreign exchange transactions worked from the bank’s perspective, and how appropriate accounts got credited and debited. I had learned this by reverse-engineering use cases and consulting with the local business analyst. I summarized it with a two-page document written in simple, direct language, referring disrectly to the simpler use cases and explaining the more confusing bits in more detail. For those whose learning style was oriented toward code, I also described the tables and array formulas that applied the business rules.

Outlining scenarios in which the product might be used or tested
I discuss some issues about scenarios here—why they’re important, and why it’s important to keep them open-ended and open to interpretation. It’s more important to record than to prescribe, since in a good scenario, you’ll observe and discover much more than you’ve articulated in advance. Cem Kaner gives ideas on how to produce scenarios; Hans Buwalda presents examples of soap opera testing.

Identifying required tester skill
People with skill don’t need prescriptive documentation for every little thing. Responsible managers identify the skills needed to test, and who commit to employing people who either have those skills or can develop them quickly. James Bach eliminated 50 pages of otiose documentation with two paragraphs. (Otiose is a marvelous word; it’s fun to look it up in a thesaurus.)

Identifying, for a tester, a particular explicit sequence of actions to perform, input to provide, and observations to make.
Again, a document that attempts to specify exactly what a tester should do is the hallmark of scripted testing. James Bach articulates a paradox that has not yet been noted clearly in our craft: in order to perform a scripted test well, you need signficant amounts of skill and tacit knowledge (and you also need to ignore the script on occasion, and you need to know when those occasions are). There’s another interesting issue here: preparing such documents usually depends on exploratory activity. There’s no script to tell you how to write a script. (You might argue there’s one exception. You can follow this script to write a test script: take each line of a requirements document, and add the words “Verify that” to the beginning of each line.)

Now, just as you can perform testing badly using any approach, you can perform exploratory testing and document it inappropriately, either by under-documenting it OR over-documenting it using any of the kinds of documentation above. But, as this document shows, the notion that exploratory testing is by its nature undocumented is not only ignorant, but aggressively ignorant about both testing and documentation. Whenever you see someone claim that exploratory testing is undocumented, I’d ask you to help by setting the record straight. Feel free to refer to this blog post, if you find it helpful; also, please point me to other exemplars of excellent documentation that are consistent with exploratory approaches. If we all work together, we can bury this myth, while providing excellent records and reports for our clients.

What Exploratory Testing Is Not (Part 4): Quick Tests

Sunday, December 18th, 2011

Quick testing is another approach to testing that can be done in a scripted way or an exploratory way. A tester using a highly exploratory approach is likely to perform many quick tests, and quick tests are often key elements in an exploratory approach. Nonetheless, quick testing and exploratory testing aren’t the same.

Quick tests are inexpensive tests that require little time or effort to prepare or perform. They may not even require a great deal of knowledge about the application being tested or its business domain, but they can help to develop new knowledge rapidly. Rather than emphasizing comprehensiveness or integrity, quick tests are designed to reveal information in a hurry at a minimal cost.

A quick test can be a great way to learn about the product, or to identify areas of risk, fragility, or confusion. A tester can almost always sneak a quick test or two into other testing activity. A burst of quick tests can help as the first activities during a smoke or sanity test. Cycles of relatively unplanned, informal quick tests may help to you discover or refine a more comprehensive or formal plan.

James Bach and I provide examples of many kinds of quick tests in the Rapid Software Testing class. You’ll notice that some of them are called tours. Note that not all tours are quick, and not all quick tests are tours. Here’s a catalog.

Happy Path
Perform a task, from start to finish, that an end-user might be expected to do. Use the product in the most simple, expected, straightforward way, just as the most optimistic programmer or designer might imagine users to behave. Look for anything that might confuse, delay, or irritate a reasonable person. Cem Kaner sometimes calls this “sympathetic testing”. Lean towards learning about the product, rather than finding bugs. If you do see obvious problems, it may be bad news for the product.

Variable Tour
Tour a product looking for anything that is variable and vary it. Vary it as far as possible, in every dimension possible. If you’re using quick tests for learning, seek and catalog the important variables. Look for potential relationships between them. Identifying and exploring variations is part of the basic structure of our testing when we first encounter a product.

Sample Data Tour
Employ any sample data you can, and all that you can. For one kind of quick tests, prefer simple values whose effects are easy to see or calculate. For a different kind of quick test, choose complex or extreme data sets. Observe the units or formats in which data can be entered, and try changing them. Challenge the assumption that the programmers have thought to reject or massage inappropriate data. Once you’ve got a handle on your ideas about reasonable or mildly challenging data, you might choose to try…

Input Constraint Attack
Discover sources of input and attempt to violate constraints on that input. Try some samples of pathological data: use zeroes where large numbers are expected; use negative numbers where positive numbers are expected; use huge numbers where modestly-sized ones are expected; use letters in every place that’s supposed to handle only numbers, and vice versa. Use a geometrically expanding string in a field. Keep doubling its length until the product crashes. Use characters that are in some way distinct from your idea of “normal” or “expected”. Inject noise of any kind into a system and see what happens. Use Satisfice’s PerlClip utility to create strings of arbitrary length and content; use PerlClip’s counterstring feature to create a string that tells you its own length so that you can see where an application cuts off input.

People tend to talk a lot about input constraint attacks. Perhaps that’s because input constraint attacks are used by hackers to compromise systems; perhaps it’s because input constraint attacks can be performed relatively straightforwardly; perhaps it’s because they can be described relatively easily; perhaps it’s because input constraint attacks can produce dramatic and highly unexpected results. Yet they’re by no means the only kind of quick test, and they’re certainly not the only way to test using an exploratory approach.

Documentation Tour
Look in the online help or user manual and find some instructions about how to perform some interesting activity. Do those actions explicitly. Then improvise from them and experiment. If your product has a tutorial, follow it. You may expose a problem in the product or in the documentation; either way, you’ve found an inconsistency that is potentially important. Even if you don’t expose a problem, you’ll still be learning about the product.

File Tour
Have a look at the folder where the program’s .EXE file is found. Check out the directory structure, including subs. Look for READMEs, help files, log files, installation scripts, .cfg, .ini, .rc files.
Look at the names of .DLLs, and extrapolate on the functions that they might contain or the ways in which their absence might undermine the application. Use whatever supplemental material you’ve got to guide or focus your actions. Another way to gather information for this kind of test: use tools to monitor the installation, and take the output from the tool as a point of departure.

Complexity Tour
Tour a product looking for the most complex features, the most challenging data sets, and the biggest interdepencies. Look for hidden nooks and crannies, but also look for the program’s high-traffic areas, busy markets, big office buildings, and train stations—places where there’s lots of interactions, and where bugs might be blending in with the crowd.

Menu, Window, and Dialog Tour
Tour a product looking for all the menus (main and context menus), menu items, windows, toolbars, icons, and other controls. Walk through them. Try them. Catalog them, or construct a mind map.

Keyboard and Mouse Tour
Tour a product looking for all the things you can do with a keyboard and mouse. Hit all of the keys on the keyboard. Hit all the F-keys; hit Enter, Tab, Escape, Backspace; run through the alphabet in order, and combine each key with Shift, Ctrl, Alt, the Windows key, CMD or Option, on other platforms, the AltGr key in Europe. Click (right, left, both, double, triple) on everything. Combine clicks with shifted keys.

Interruptions
Start activities and stop them in the middle. Stop them at awkward times. Perform stoppages using cancel buttons, O/S level interrupts (ctrl-alt-delete or task manager). Arrange for other programs to interrupt (such as screensavers or virus checkers). Also, try suspending an activity and returning later. Put your laptop into sleep or hibernation mode.

Undermining
Start using a function when the system is in an appropriate state, then change the state part way through.
Delete a file while it is being edited; eject a disk; pull net cables or power cords) to get the machine an inappropriate state. This is similar to interruption, except you are expecting the function to interrupt itself by detecting that it no longer can proceed safely.

Adjustments
Set some parameter to a certain value, then, at any later time, reset that value to something else without resetting or recreating the containing document or data structure. Programmers often expect settings or variables to be adjusted through the GUI. Hackers and tinkerers expect to find other ways.

Dog Piling
Whatever you’re doing, do more of it and do other stuff as well while you’re at it. Get more processes going at once; try to establish more states existing concurrently. Invoke nested dialog boxes and non-modal dialogs. On multi-user systems, get more people using the system or simulate that with tools. If your test seems to trigger odd behaviour, pile on in the same place until the odd becomes crazy.

Continuous Use
While testing, do not reset the system. Leave windows and files open; let disk and memory usage mount.
You’re hoping to show that the system loses track of things or ties itself in knots over time.

Feature Interactions
Discover where individual functions interact or share data. Look for any interdependencies. Explore them, exploit them, and stress them out. Look for places where the program repeats itself or allows you to do the same thing in different places. For example, for data to be displayed in different ways and in different places, and seek inconsistencies. For example, load up all the fields in a form to their maximums and then traverse to the report generator.

Summon Help
Bring up the context-sensitive help feature during some operation or activity. Does the product’s help file explain things in a useful way, or does it offend the user’s intelligence by simply restating what’s already on the screen? Is help even available at all?

Click Frenzy
Ever notice how a cat or a kid can crash a system with ease? Testing is more than “banging on the keyboard”, but that phrase wasn’t coined for nothing. Try banging on the keyboard. Try clicking everywhere. Poke every square centimeter of every screen until you find a secret button.

Shoe Test
Use auto-repeat on the keyboard for a very cheap stress test. Look for dialog boxes constructed such that pressing a key leads to, say, another dialog box (perhaps an error message) that also has a button connected to the same key that returns to the first dialog box. Place a shoe on the keyboard and walk away. Let the test run for an hour. If there’s a resource or memory leak, this kind of test could expose it. Note that some lightweight automation can provide you with a virtual shoe.

Blink Test
Find an aspect of the product that produces huge amounts of data or does some operation very quickly. Look through a long log file or browse database records, deliberately scrolling too quickly to see in detail. Notice trends in line lengths, or the look or shape of the data. Use Excel’s conditional formatting feature to highlight interesting distinctions between cells of data. Soften your focus. If you have a test lab with banks of monitors, scan them or stroll by them; patterns of misbehaviour can be surprisingly prominent and easy to spot.

Error Message Hangover
Programmers are rewarded for implementing features. There’s a psychological problem with errors or exceptions: the label itself suggests that something has gone wrong. People often actively avoid thinking about problems or mistakes, and as a consequence, programmers sometimes handle errors poorly. Make error messages happen and test hard after they are dismissed. Watch for subtle changes in behaviour between error and normal paths. With automation, make the same error conditions appear thousands of times.

Resource Starvation
Progressively lower memory, disk space, display resolution, and other resources. Keep starving the product until it collapses, or gracefully (we hope) degrades.

Multiple Instances
Run a lot of instances of the app at the same time. Open, use, update, and save the same files. Manipulate them from different windows. Watch for competition for resources and settings.

Crazy Configs
Modify the operating system’s configuration in non-standard or non-default ways, either before or after installing the product. Turn on “high contrast” accessibility mode, or change the localization defaults. Change the letter of the system hard drive. Put things in non-default directories. Use RegEdit (for registry entries) or a text editor (for initialization files) to corrupt your program’s settings in a way that should trigger an error message, recovery, or an appropriate default behavior.

Again: quick tests tend to be highly exploratory, but they represent only one kind of exploratory testing. Don’t be fooled into believing that quick testing—or certain kinds of quick testing—is all there is to exploratory testing.

Testing Problems Are Test Results

Tuesday, September 6th, 2011

I often do an exercise in the Rapid Software Testing class in which I ask people to catalog things that, for them, make testing harder or slower. Their lists fit a pattern I hear over and over from testers (you can see an example of the pattern in this recent question on Stack Exchange). Typical points include:

  • I’m a tester working alone with several programmers (or one of a handful of testers working with many programmers).
  • I’m under enormous time pressure. Builds are coming in continuously, and we’re organized on one- or two-week development cycles.
  • The product(s) I’m testing is (are) very complex.
  • There are many interdependencies between modules within the product, or between products.
  • I’m seeing a consistent pattern of failures specifically related to those interdependencies; the tiniest change here can have devastating impact there—or anywhere.
  • I believe that I have to run a complete regression test on every build to try to detect those failures.
  • I’m trying to cope by using automated checks, but the complexity makes the automation difficult, the program’s testing hooks are minimal at best, and frequent product changes make the whole relationship brittle.
  • The maintenance effort for the test automation is significant, at a cost to other testing I’d like to do.
  • I’m feeling overwhelmed by all this, but I’m trying to cope.

On top of that,

  • The organization in which I’m working calls itself Agile.
  • Other than the two-week iterations, we’re actually using at most two other practices associated with Agile development, (typically) daily scrums or Kanban boards.

Oh, and for extra points,

  • The builds that I’m getting are very unstable. The system falls over under the most basic of smoke tests. I have to do a lot of waiting or reconfiguring or both before I can even get started on the other stuff.

How might we consider these observations?

We could choose to interpret them as problems for testing, but we could think of them differently: as test results.

Test results don’t tell us whether something is good or bad, but they may inform a decision or an evaluation or more questions. People observe test results and decide whether there are problems and what the problems are, what further questions are warranted, and what decisions should be made. Doing that requires human judgement and wisdom, consideration of lots of factors, and a number of possible interpretations.

Just as for automated checks and other test results, it’s important to consider a variety of explanations and interpretations for testing meta-results—observations about testing—lest we miss an important problem. As Jerry Weinberg points out in Perfect Software and Other Illusions About Testing, whatever else something might be, it’s information. If testing is, as Jerry says, gathering information with the intention of informing a decision, it seems a mistake to leave potentially valuable observations lying around on the floor. Indeed, rather than thinking of them as problems for testing, we could choose to think of them as symptoms of product or project problems—problems that testing can help to solve.

For example, when a tester feels outnumbered by programmers, or when a tester feels under time pressure, that’s a test result. The feeling often comes from the programmers generating more work and more complexity than the tester can handle. Yet complexity, like quality, is a relationship between some person and something else. Complexity on its own isn’t necessarily a problem; it’s how people deal with it and its attendant risks that’s a problem. When we observe the ways in which people react to a perception of complexity, we might learn a lot.

  • Are people conscious of the risks—especially the Black Swans—that typically accompany complexity?
  • If people are conscious of risk, are they paying attention to it? Are they panicking over it? Or are they ignoring it and whistling past the graveyard? Or…
  • Are people reacting calmly and pragmatically? Are they acknowledging and dealing with the complexity of the product? If they can’t make the product or the process that it models less complex, are they at least taking steps to make understanding of the product more tractable?
  • Might the programmers be generating or modifying code so quickly that they’re not taking the time to understand what’s really going on with it?
  • If someone feels that more testers are needed, what’s behind that feeling? (I took a stab at an answer to that question a few years back.)

How might we figure that out answers to those questions? One way might be to look at more of the test results and test meta-results.

  • Does someone perceive testing to be difficult or time-consuming? Who? What’s the basis for that perception? What assumptions underlie it?
  • Does the need to investigate and report bugs overwhelm the testers’ capacity to obtain good test coverage? (I wrote about that problem here.)
  • Does testing consistently reveal consistent patterns of failure?
  • Are programmers consistently surprised by such failures and patterns?
  • Do small changes in the code cause problems that are disproportionately large or hard to find?
  • Do the programmers understand the interdependencies clearly? Are those interdependencies necessary, or could they be eliminated?
  • Are programmers taking steps to anticipate or prevent problems related to interfaces and interactions?
  • If automated checks are difficult to develop and maintain, does that say something about the skill of the tester, the quality of the automation interfaces, or the scope of checks? Or about something else?
  • Are unstable builds a problem that get in the way of deeper testing? Or could we interpret them as a sign that the product has problems so numerous and serious that even shallow testing reveals them?
  • When a “stable” build appears after a long series of unstable builds, how stable is it really?

Perhaps, with the answers to those questions, we could raise even more questions.

  • What risks do those problems present for the success of the product, whether in the short term or the longer term?
  • When testing consistently reveals patterns of failures and attendant risk, what does the product team do with that information?
  • Are the programmers mandated to deliver code? Or are the programmers mandated to deliver code with a warrant that the code does what it should (and doesn’t do what it shouldn’t), to the best of their knowledge? Do the programmers adamantly prefer the latter mandate?
  • Is someone pressuring the programmers to make schedule or scope commitments that they can’t really fulfill?
  • Are the programmers and the testers empowered to push back on scope or schedule pressue when it adds to product or project risk?
  • Do the business people listen to the development team’s concerns? Are they aware of the risks that testers and programmers bring to their attention? When the development team points out risks, do managers and business people deal with them congruently?
  • Is the team working at a sustainable pace, or might we expect the product and the project to become overwhelmed by complexity, interdependencies, fragility, and problems that lurk just beyond the reach of our development and testing effort?
  • Is the development team really Agile, in the sense of the precepts of the Agile Manifesto? Or is “agility” being used in a cargo-cult way, using practices or artifacts to mask over an incoherent project?

Testers often feel that their role is to find, investigate, and report on bugs in the product. That’s usually true, but it’s also a pretty limited view of the kinds of information that testing reveals. When seen one way, the problems I’ve listed above sound like serious problems for testing. What if we also remembered Jerry’s definition of testing as “gathering information with the intention of informing a decision”? If that’s the case, then everything that we notice or discover during testing is a test result.

(See also this discussion for an example of looking beyond the test result for possible product and project risks.)

Can You Test a Clock in a Sealed Box?

Friday, September 2nd, 2011

A while ago, James Bach and I did a transpection session. The object of the conversation was to think critically about the common trope that every test consists of at least an input and an expected result. We wanted to go deeper than that, and in the process we discovered a number of useful ideas. A test can be informed by an expectation, but oracles can also be developed on the fly. Oracles can also be applied retrospectively, after the test has been “completed”, such that you never know when a test ends. James has a wonderful example of that here. We also came up with the notion of implicit and explicit inputs, and symbolic and non-symbolic inputs.

As the basis of our chat, James presented the thought experiment of testing a clock that you can’t get at. Just recently my friend Adam White pointed me to this little gem. Enjoy!

The Best Tour

Thursday, June 30th, 2011

Cem Kaner recently wrote a reply to my blog post Of Testing Tours and Dashboards. One way to address the best practice issue is to go back to the metaphor and ask “What would be the best tour of London?” That question should give rise to plenty of other questions.

  • Are you touring for your own purposes, or in support of someone else’s interests? To what degree are other people interested in what you learn on the tour? Are you working for them? Who are they? Might they be a travel agency? A cultural organization? A newspaper? A food and travel show on TV? The history department of a university? What’s your information objective? Does the client want quick, practical, or deep questions answered? What’s your budget?
  • How well do you know London already?  How much would you like to leave open the possibility of new discoveries?  What maps or books or other documentation do you have to help to guide or structure your tour?  Is updating those documents part of your purpose?
  • Is someone else guiding your tour? What’s their reputation? To what extent do you know and trust them? Are they going to allow you the opportunity and the time to follow your own lights and explore, or do they have a very strict itinerary for you to follow? What might you see—or miss—as a result?
  • Are you traveling with other people? What are they interested in? To what degree do you share your discovery and learning?
  • How would you prefer to get around? By Tube, to get around quickly? By a London Taxi (which includes some interesting information from the cabbie? By bus, so you can see things from the top deck? On foot? By tour bus, where someone else is doing all the driving and all the guiding (that’s scripted touring)?
  • What do you need to bring with you? Notepad? Computer? Mobile phone? Still camera? Video camera? Umbrella? Sunscreen? (It’s London; you’ll probably need the umbrella.)
  • How much time do you have available?   An afternoon?  A day?  A few days? A week?  A month?
  • What are you (or your clients) interested in? Historical sites? Art galleries? Food? Museums? Architecture? Churches? Shopping? How focused do you want your tour to be? Very specialized, or a little of this and a little of that? What do you consider “in London”, and what’s outside of it?
  • How are you going to organize your time? How are you going to account for time spent in active investigation and research versus moving from place to place, breaks, and eating? How are you going to budget time to collect your findings, structure and summarize your experience, and present a report?
  • How do you want to record your tour? If you’re working for a client, what kind of report do they want? A conversation? Written descriptions? Pictures? Do they want things in a specific format?

(Note, by the way, that these questions are largely structured around the CIDTESTD guidewords in the Heuristic Test Strategy Model (Customer, Information, Developer Relations, Equipment and Tools, Schedule, Test Item, and Deliverables)—and that there are context-specific questions that we can add as we model and explore the mission space and the testing assignment.)

There is no best tour of London; they have their strengths and weaknesses. Reasonable people who think about it for a moment realize that the “best” tour of London is a) relative to some person; b) relative to that person’s purposes and interests; c) relative to what the person already knows; d) relative to the amount of time available.  And such a reasonable person would be able to apply that metaphor to software testing tours too.

Common Languages Ain’t So Common

Tuesday, June 28th, 2011

A friend told me about a payment system he worked on once. In the system models (and in the source code), the person sending notification of a pending payment was the payer. The person who got that notice was called the payee. That person could designate somone else—the recipient—to pick up the money. The transfer agent would credit the account of the recipient, and debit the account of the person who sent notification—the payer, who at that point in the model suddenly became known as the sender. So, to make that clear: the payee sends email to the payer, who receives it. The sender pays money to the recipient (who accepts the payment.) Got that clear? It turns out there was a logical, historical reason for all this. Everything seemed okay at the beginning of the project; there was one entity named “payer” and another named “payee”. Payer A and Payee B exchanged both email and money, until someone realized that B might give someone else, C, the right to pick up the money. Needing another word for C, the development group settled on “recipient”, and then added “sender” to the model for symmetry, even though there was no real way for A to split into two roles as B had. Uh, so far.

There’s a pro-certification argument that keeps coming back to the discussion like raccoons to a garage: the claim that, whatever its flaws, “at least certification training provides us with a common language for testing.” It’s bizarre enough that some people tout this rationalization; it’s even weirder that people accept it without argument. Fortunately, there’s an appropriate and accurate response: No, it doesn’t. The “common language” argument is riddled with problems, several of which on their own would be showstoppers.

  • Which certification training, specifically, gives us a common language for testing? Aren’t there several different certification tribes? Do they all speak the same language? Do they agree, or disagree on the “common language”? What if we believe certification tribes present (at best) a shallow understanding and a shallow description of the ideas that they’re describing?
  • Who is the “us” referred to in the claim? Some might argue that “us” refers to the testing “industry”, but there isn’t one. Testing is practiced in dozens of industries, each with its own contexts, problems, and jargon.
  • Maybe “us” refers to our organization, or our development shop. Yet within our own organization, which testers have attended the training? Of those, has everyone bought into the common language? Have people learned the material for practical purposes, or have they learned it simply to pass the certification exam? Who remembers it after the exam? For how long? Even if they remember it, do they always and everafter use the language that has been taught in the class?
  • While we’re at it, have the programmers attended the classes? The managers? The product owners? Have they bought in too?
  • With that last question still hanging, who within the organization decides how we’ll label things? How does the idea of a universal language for testing fit with the notion of the self-organizing team? Shouldn’t choices about domain-specific terms in domain-specific teams be up to those teams, and specific to those domains?
  • What’s the difference between naming something and knowing something? It’s easy enough to remember a label, but what’s the underlying idea? Terms of art are labels for constructs—categories, concepts, ideas, thought-stuff. What’s in and what’s out with respect to a given category or label? Does a “common language” give us a deep understanding of such things? Please, please have a look at Richard Feynman’s take on differences between naming and knowing, http://www.youtube.com/watch?v=05WS0WN7zMQ.
  • The certification scheme has representatives from over 25 different countries, and must be translated into a roughly equivalent number of languages. Who translates? How good are the translations?
  • What happens when our understanding evolves? Exploratory testing, in some literature, is equated with “ad hoc” testing, or (worse) “error guessing”. In the 1990s, James Bach and Cem Kaner described exploratory testing as “simultaneous test design, test execution, and learning”. In 2006, participants in the Workshop on Heuristic and Exploratory Techniques discussed and elaborated their ideas on exploratory testing. Each contributed a piece to a definition synthesized by Cem Kaner: “Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.” That doesn’t roll off the tonque quite so quickly, but it’s a much more thorough treatment of the idea, identifying exploratory testing as an approach, a way that you do something, rather than something that you do. Exploratory work is going on all the time in any kind of complex cognitive activity, and our understanding of the work and of exploration itself evolves (as we’ve pointed out here, and here, and here, and here, and here.). Just as everyday, general-purpose languages adopt new words and ideas, so do the languages that we use in our crafts, in our communities, and with our clients.

In software development, we’re alway solving new problems. Those new problems may involve people to work with entirely new technological or business domains, or to bridge existing domains with new interactions and new relationships. What happens when people don’t have a common language for testing, or for anything else in that kind of development process? Answer: they work it out. As Peter Galison notes in his work on trading zones, “Cultures in interaction frequently establish contact languages, systems of discourse that can vary from the most function-specific jargons, through semispecific pidgins, to full-fledged creoles rich enough to support activities as complex as poetry and metalinguistic reflection.”  Each person in a development group brings elements of his or her culture along for the ride; each project community develops its own culture and its own language.

Yes, we do need common language for testing. Anthropology shows us that meaningful language develops organically when people gather for a common purpose in a particular context. Just as we need tests that are specific to a given context, we need terms that are that way too. So instead of focusing training on memorizing glossary entries, let’s teach testers more about the relationships between words and ideas. Let’s challenge each other to ask better questions about the language we’re using, and how it might be fooling us.

Exploratory Testing is All Around You

Monday, May 16th, 2011

I regularly converse with people who say they want to introduce exploratory testing in their organization. They say that up until now, they’ve only used a scripted approach.

I reply that exploratory testing is already going on all the time at your organization.  It’s just that no one notices, perhaps because they call it

  • “review”, or
  • “designing scripts”, or
  • “getting ready to test”, or
  • “investigating a bug”, or
  • “working around a problem in the script”, or
  • “retesting around the bug fix”, or
  • “going off the script, just for a moment”, or
  • “realizing the significance of what a programmer said in the hallway, and trying it out on the system”, or
  • “pausing for a second to look something up”, or
  • “test-driven development”, or
  • “Hey, watch this!”, or
  • “I’m learning how to use the product”, or
  • “I’m shaking out it a bit”, or
  • “Wait, let’s do this test first instead of that test”, or
  • “Hey, I wonder what would happen if…”, or
  • “Is that really the right phone number?”, or
  • “Bag it, let’s just play around for a while”, or
  • “How come what the script says and what the programmer says and what the spec says are all different from each other?”, or
  • “Geez, this feature is too broken to make further testing worthwhile; I’m going to go to talk to the programmer”, or
  • “I’m training that new tester in how to use this product”, or
  • “You know, we could automate that; let’s try to write a quickie Perl script right now”, or
  • “Sure, I can test that…just gimme a sec”, or
  • “Wow… that looks like it could be a problem; I think I’ll write a quick note about that to remind me to talk to my test lead”, or
  • “Jimmy, I’m confused… could you help me interpret what’s going on on this screen?”, or
  • “Why are we always using ‘tester’ as the login account? Let’s try ‘tester2′ today”, or
  • “Hey, I could cancel this dialog and bring it up again and cancel it again and bring it up again”, or
  • “Cool! The return value for each call in this library is the round-trip transaction time—and look at these four transactions that took thirty times longer than average!”, or
  • “Holy frijoles! It blew up! I wonder if I can make it blow up even worse!”, or
  • “Let’s install this and see how it works”, or
  • “Weird… that’s not what the Help file says”, or
  • “That could be a cool tool; I’m going to try it when I get home”, or
  • “I’m sitting with a new tester, helping her to learn the product”.

Now it’s possible that none of that stuff ever happens in your organization. Or maybe people aren’t paying attention or don’t know how to observe testing. Or both.

Then, just before I posted this blog post, James Bach offered me two more sure-fire clues that people are doing exploratory testing. If they say, “I am in no way doing exploratory testing”, or “We’re doing only highly rigorous formal testing”. In both cases, the emphatic nature of the claim guarantees that the claimant is not sufficiently observant about testing to realize that exploratory testing is happening all around them.

A Few Observations on Structure in Testing

Tuesday, April 12th, 2011

On Twitter, Johan Jonasson reported today that he was about to attend a presentation called “Structured Testing vs Exploratory Testing”. This led to a few observations and comments that I’d like to collect here.

Over the years, it’s been common for people in our community to mention exploratory testing, only to have someone reply, “Oh, so that’s like unstructured testing, right?” That’s a little like someone refer to a cadenza or a musical solo as “unstructured music”; to improv theatre as “unstructured theatre”; to hiking without a map or a precise schedule as “unstructured walking”; to all forms of education outside of a school as “unstructured learning”. When someone says “Exploratory testing is unstructured,” I immediately hear, “I have a very limited view of what ‘structure’ means.”

Typically, by “structured testing”, such people appear to mean “scripted testing”. To me, a script is a set of detailed, ordered set of steps of specific actions and specific observations. Some people call that a procedure, but I think a procedure is characterized by a set of activities that might be but are not necessarily highly specific, and that might be but are not necessarily strictly ordered. So what is structure?

Structure, to me, is any aspect of something (like an object or system) that is required to maintain the thing as it is. You can change some element of a system, add something to it or remove something from it. If the system explodes, or collapses, or changes such that we would consider it a different system, the element or the change was structural. If the system retains its identity, the element or the change wasn’t structural. Someone once described structure to me as “that which remains”.

A system, in James Bach’s useful definition, is a set of things in meaningful relationship. As such, structure is the specific set of things and factors and relationships that make the system a system. Note that the observer is central here: you may see a system where I see only a mess. That is, you see structure, but I don’t. When a system changes, you may call it a different system, where I might see the original system, but modified. In this case, I’m seeing structure that you don’t see or acknowledge as such. We see these things differently because systems, structures, elements, changes, meaningfulness, and relationships are all subject to the Relative Rule: “For any abstract X, X is X to some person.”

One of the principles of general systems thinking is that the extent to which you are able to observe and control a system is limited by the Law of Requisite Variety, coined by W. Ross Ashby: “only variety can destroy variety.” Jurgen Appelo recently produced a marvelous blog post on the subject, which you can read here.

Variety, says Ashby, is the number of distinct states of a system. In order to observe variety, “The observer and his powers of discrimination may have to be specified if the variety is to be well defined.” The observer is limited by the number of states that he (or she or it) can see. This also implies that however many states there are to a system, another system that controls it has to be able to observe that many states and have at least one state in which it can exert control. Wow, that’s pretty heady. What does it mean? The practical upshot is that if you intend to control something, you have to be able to observe it and to change something about it, and that means that you need to have more states available to you than the object has. Or, as Karl Weick put it in Sensemaking in Organizations, “if you want to understand something complicated, you have to complicate yourself.”

This pattern repeats in lots of places. Glenford Myers suggested that testing a program requires more creativity than writing one; more things can go wrong than go right in writing a program. Machines can do a lot of the work of keeping an aircraft aloft, but managing those machines and the work that they do requires more intelligent and more adaptable systems—people. The Law of Requisite Variety also suggests that we can never understand or control ourselves completely, people are insufficient to understand people completely.

So: some people might see exploratory testing as unstructured. I suspect they’re thinking of a particular kind of structure: that scripted procedure that I mentioned above. For me, that suggests an impoverished view of both exploratory testing and of structure. Exploratory testing has tons of structures.

Over the years, many colleagues have been collecting and cataloguing the structures of exploratory testing. James and Jon Bach have led the way, but there have been lots of other contributions from people like Cem Kaner, Mike Kelly, Elisabeth Hendrickson, Michael Hunter, Karen Johnston, Jonathan Kohl, and me (if there’s someone else who should appear in this list, remind me). I’ve collected some of these in this list of structures of exploratory testing. For those who are just joining us, and for the old hands who might need a refresher, you can find my list of exploratory testing structures here.

Programs had structures long before “structured programming” came along. Programming that isn’t structured programing is not “unstructured programming.” The structure in “structured programming” is a particular kind of structure. To many people, there is only one kind of testing structure: scripted procedures. I will guarantee that an incapacity to see alternative kinds of structure will limit your testing. And to paraphrase Weick, if you want to understand testing, you must test yourself.

Why Do Some Testers Find The Critical Problems?

Saturday, February 5th, 2011

Today, someone on Twitter pointed to an interesting blog post by Alan Page of Microsoft. He says:

“How do testers determine if a bug is a bug anyone would care about vs. a bug that directly impacts quality (or the customers perception of quality)? (or something in between?) Of course, testers should report anything that may annoy a user, but learning to differentiate between an ‘it could be better’ bug and a ‘oh-my-gosh-fix-this’ bug is a skill that some testers seem to learn slowly. … “So what is it that makes some testers zero in on critical issues, while others get lost in the weeds?”

I believe I have some answers to this. My answers are based on roughly 20 years of observation and experience in consulting, training, and working with other testers. The forms of interaction have included in-class training; online coaching via video, voice, and text; face-to-face conversation in workplaces, conferences, and workshops; direct collaboration with other working testers in mass-market commercial software, financial services, retail services, specialized mathematical applications, and several other domains.

My first answer is that testing, for a long time and in many places, has been myopically focused on functional correctness, rather than on value to people. Cem Kaner discusses this issue in his talk Software Testing as a Social Science, and later variations on it. This problem in testing is a subset of a larger problem in computer science and software engineering. Introductory texts often observe that a computer program is “a set of instructions for a computer”. Kaner’s defintion of a computer program as “a communication among several humans and computers, distributed over distance and time, that contains instructions that can be executed a computer” goes some distance towards addressing the problem; his explication that “the point of the program is to provide value to the stakeholders” goes further still. When the definition of programming is reduced to producing “a set of instructions for a computer”, it misses the point—value to people—and when testing is reduced to the checking of those instructions, the “testing” will miss the same point. I’ve suggested in recent talks that testing is “the investigation of systems composed of people, computer programs, related products and services.” Successful testers avoid a fascination with functional correctness, and focus on ways in which people might obtain value from a program—or have their value unfulfilled or threatened.

This first answer gives rise to my second: that when testing is focused on functional correctness, it becomes a confirmatory, verification-oriented task, rather than an exploratory, discovery-oriented set of processes. This is not a new problem. It’s old enough that Glenford Myers tried (more or less unsuccessfully, it seems) to argue against it in The Art of Software Testing in 1979. Myers’ point was the testing should be premised on trying to expose the program’s failures, rather than on trying to confirm that it works. Psychological research before and since Myers’ book (in particular Klayman and Ha’s paper on confirmation bias) shows that the positive test heuristic biases people towards choosing tests that demonstrate fit with a working hypothesis (showing THAT it works), rather than tests that drive towards final rule discovery (showing how it works, and more important, how it might fail). Worse yet, I’ve heard numerous reports of development and test managers urging testers to “make sure the tests pass”. The trouble with passing tests is that they don’t expose threats to value. Every function in the program code might be checked and found correct, but the product might be unusable. As in Alan’s example, the phone might make calls perfectly, but unless we model the way people actually use the product—talking for more than three minutes at a time, say—we will miss important problems. Every function might work perfectly, but we might fail to observe missing functionality. Every function might work perfectly, but we might miss terrible compatibility problems. Functional correctness is a very important thing in computer software, but it’s not the only thing. (See the “Quality Criteria” section of the Heuristic Test Strategy Model for suggestions.) Testers “who zero in critical issues” avoid the confirmation trap.

My third answer (related to the first two) is that when testing is focused on confirming functional correctness, a lot of other information gets left lying on the table. Testing becomes a search for finding errors, rather than on finding issues. That is, testers become oriented towards reporting bugs, and less oriented towards the discovery of issues—things that aren’t bugs, necessarily, but that threaten the value of testing and of the project generally. I’ve written recently about issues here. Successful testers recognize issues that represent obstacles to their missions and strategies, and work around them or seek help.

My fourth answer is that many (in my unscientific sample, most) testers are poorly versed in the skills of test framing. This is understandable, at least in part because test framing itself wasn’t known by that name as recently as a year ago as I write. Test framing is the set of logical connections that structure and inform a test. It involves the capacity to follow and express a line of perhaps informal yet reasonably structured logic that directly links the testing mission to the tests and their results. In my experience, most testers are unable to trace this logical line quickly and expertly. There are many roots for this problem. The earlier answers above provide part of the explanation; the mission of value to the customer is overwhelmed by the mission of proving functional correctness. In situations where the process of test design is separated from test execution (as in environments that take a highly scripted approach to testing), the steps to perform the test and observe the results are typically listed explicitly, but the motivation for performing the test is often left out. In situations where test execution, observation of outcomes, and reporting of test results is heavily delegated to automation, motivation is even further disconnected from the mission. In such environments, focus is directed towards getting the automation to follow a script, rather using than automation to assist in probing for problems. In such environments, focus is often on the quantity of tests or the quantity of bug reports, rather than on the quality, the value, of the information revealed by testing. Testers who find problems successfully can link tests, test activities, and test results to the mission. They’re far more concerned about the quality of the information they provide than the quantity.

My fifth answer is that in many organizations there is insufficient diversity of tester skills, mindsets, and approaches for finding the great diversity of problems that might lurk in the product. This problem starts in various ways. In some organizations, testers are drawn exclusively from the business. In others, testers are required to have programming skills before they can be considered for the job. And then things get left out. Testers who need training or experience in the business domain don’t get it, and are kept separated from the business people (that’s a classic example of an issue). Testers aren’t given training in software design, programming, or related skills. They’re not given training in testing, problem reporting and bug advocacy, design of experiments. They’re not given training or education in anthropology, critical thinking, systems thinking, or philosophy and other disciplines that inform excellent testing. Successful testers tend to take on diversified skills, knowledge, and tactics, and when those skills are lacking, they collaborate with people who have them.

Note that I’m not suggesting here that anyone become a Donald Knuth-level programmer, a Pierre Bourdieu-league anthropologist, a Ross Ashby-class systems thinker, a Wittgenstein-grade philosopher. I am suggesting that testers be given sufficient training and opportunity to learn to program to the level of Brian Marick’s Everyday Scripting with Ruby, and that they be given classes, experience, and challenges in observation, the business domain, systems thinking and critical thinking. I am suggesting that people who are testing computer software do need some exposure to core ideas about logic (if we see this, can we justifiably infer that?), about ontology (what are our systems of knowledge about the way things work—especially related to computer programs and to testing), and about epistemology (how do we know what we know?).

I’ve been told by people involved in the design of testing standards that “you can’t expect regular testers to learn epistemology, for goodness’ sake”. Well, I’m saying that we can and that we must at least provide opportunities for learning, to the degree that testers can frame their mission, their ideas about risk, their testing, and their evaluation of the product in the ways that their clients value. Moreover, I’ve worked with testing organizations that have done that, and the results have been impressive. Sometimes I hear people saying “what if we train our testers and they leave?” As one wag on Twitter replied (I wish I knew who), “What if you don’t train them and they stay?”

In our classes, James Bach and I have the experience of inspriring testers to become interested in and excited by these topics. We find that it’s not hard to do that. We remain concerned about the capacity of some organizations to sustain that enthusiasm, often because some middle managers’ misconceptions about the practice and value of testing can squash both enthusiasm and value in a hurry. Testers, to be successful, must be given the freedom and responsibility to explore and to contribute what they’ve learned back to their team and to the rest of the organization.

So, what would we advise?

Read this set of ideas as a system, rather than as a linear list:

  • The purpose of testing is to identify threats to the value of the program. Functional errors are only one kind of threat to the value of the program.
  • Take on expansive ideas about what might constitute—or threaten—the quality of the product.
  • Dynamically manage your focus to exercise the product and test those ideas about value.
  • In hiring, staffing, and training, focus on the mindset and the skill set of the individual tester as a member of a highly diversified team.
  • As an individual tester, develop and diversify your skills and your strategies.
  • Immediately identify report issues that threaten the value of the testing effort and of the project generally. Solve the ones you can; raise team and management awareness of the costs and risks of issues, in order to get attention and help.
  • Learn to frame your testing and to compose, edit, narrate and justify a compelling testing story.
  • Don’t try to control or restrain testers; grant them the freedom—along with the responsibility to discover what they will. Given that… they will.

One Test Per Requirement

Friday, February 4th, 2011

Despite all of the dragons that Agile approaches have attacked successfully, a few still live. As crazy as it is, the idea of one test check per requirement has managed to survive in some quarters.

Let’s put aside the fact that neither tests nor requirements are valid units of measurement, and focus on this: If you believe that there should be one test per requirement, then you have to assume that each requirement can have only one problem with it. Is that a valid assumption?