Blog Posts for the ‘Exploratory Testing’ Category

Why Would a User Do THAT?

Monday, March 4th, 2013

If you’ve been in testing for long enough, you’ll eventually report or demonstrate a problem, and you’ll hear this:

“No user would ever do that.”

Translated into English, that means “No user that I’ve thought of, and that I like, would do that on purpose, or in a way that I’ve imagined.” So here are a few ideas that might help to spur imagination.

  • The user made a simple mistake, based on his erroneous understanding of how the program was supposed to work.
  • The user had a simple slip of the fingers or the mind—inadvertently pasting a letter from his mother into the “Withdrawal Amount” field.
  • The user was distracted by something, and happened to omit an important step from a normal process.
  • The user was curious, and was trying to learn about the system.
  • The user was a hacker, and wanted to find specific vulnerabilities in the system.
  • The user was confused by the poor affordances in the product, and at that point was willing to try anything to get his task accomplished.
  • The user was poorly trained in how to use the product.
  • The user didn’t do that. The product did that, such that the user appeared to do that.
  • Users actually do that all the time, but the designer didn’t realize it, so the product’s design is inconsistent with the way the user actually works.
  • The product used to do it that way, but to the user’s surprise now does it this way.
  • The user was looking specifically for vulnerabilities in the product as a part of an evaluation of competing products.
  • The product did something that the user perceived as unusual, and the user is now exploring to get to the bottom of it.
  • The user did that because some other vulnerability—say, a botched installation of the product—led him there.
  • The user was in another country, where they use commas instead of periods, dashes instead of slashes, kilometres instead of miles… Or where dates aren’t rendered the way we render them here.
  • The user was testing the product.
  • The user didn’t realize this product doesn’t work the way that product does, even though the products have important and relevant similarities.
  • The user did that, prompted by an error in the documentation (which in turn was prompted by an error in a designer’s description of her intentions).
  • To the designer’s surprise, the user didn’t enter the data via the keyboard, but used the clipboard or a programming interface to enter a ton of data all at once.
  • The user was working for another company, and was trying to find problems in an active attempt to embarrass the programmer.
  • The user observed that this sequence of actions works in some other part of the product, and figured that the same sequence of actions would be appropriate here too.
  • The product took a long time to respond, the user got impatient, and started doing other stuff before the product responded to his earlier request.

And I’m not even really getting started. I’m sure you can supply lots more examples.

Do you see? The space of things that people can do intentionally or unintentionally, innocently or malevolently, capably or erroneously, is huge. This is why it’s important to test products not only for repeatability (which, for computer software, is relatively easy to demonstrate) but also for adaptability. In order to do this, we must do much more than show that a program can produce an expected, predicted result. We must also expose the product to reasonably foreseeable misuse, to stress, to the unexpected, and to the unpredicted.

What Exploratory Testing Is Not (Part 5): Undocumented Testing

Wednesday, December 21st, 2011

This week I had the great misfortune of reading yet another article which makes the false and ridiculous claim that exploratory testing is “undocumented”. After years and years of plenty of people talking about and writing about and practicing excellent documentation as part of an exploratory testing approach, it’s depressing to see that there are still people shovelling fresh manure onto a pile that should have been carted off years ago.

Like the other approaches to test activities that have been discussed in this series (“touring”, “after-everything-else”, “tool-free”, and “quick testing”), “documented vs. undocumented” is in a category orthogonal to “exploratory vs. scripted”. True: usually scripted activities are performed by some agency following a set of instructions that has been written down somewhere. But we could choose to think of “scripted” in a slightly different and more expansive way, as “prescriptive”, or “mimeomorphic”. A scripted activity, in this sense, is one for which the actions to be performed have been established in advance, and the choices of the actions are not determined by the agency performing them. In that sense, a cook at McDonald’s doesn’t read a script as he prepares your burger, but the preparation of a McDonald’s burger is a highly scripted activity.

Thus any kind of testing can be heavily documented or completely undocumented. A thoroughly documented test might be highly exploratory in nature, or it might be highly scripted.

In the Rapid Software Testing class, James Bach and I point out that when someone says “that should be documented”, what they’re really saying is “that should be documented if and how and when it serves our purposes.” So, let’s start by looking at the “when”.

When we question anything in order to evaluate it, there are moments in the process in which we might choose to record ideas or actions. I’ve broken these down into three basic categories that I hope you find helpful:

  • Before

  • During

  • After

There are “before”, “during”, and “after” moments with respect to any test activity, whether it’s a part of test design, test execution, result interpretation, or learning. Again, a hallmark of exploratory testing is the tester’s freedom and responsibility to optimize the value of the work as it’s happening. That means that when it’s important to record something, the tester is not only welcome but encouraged to

  • pick up a pen
  • take a screen shot
  • launch a session of Rapid Reporter
  • create or update a mind map
  • fire up a screen recorder
  • initiate logging (if it doesn’t start by default on the product you’re testing—and if logging isn’t available, you might consider identifying that as a testability problem and a related product and project risk)
  • sketch out a flowchart diagram
  • type notes into a private or shared repository
  • add to a table of data in Excel
  • fire off a note to a programmer or a product owner
and that’s an incomplete list. But they’re all forms of documentation.
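
As one small illustration of the note-taking end of that list, here is a minimal sketch in Python (my own invented example, not a tool mentioned in the post) of a timestamped note-taker a tester might leave running in a terminal during a session; the file name is an assumption.

    # A minimal sketch of "type notes into a private or shared repository":
    # append timestamped observations to a text file during a testing session.
    import datetime

    NOTES_FILE = "session-notes.txt"   # invented example; could be a shared path instead

    def take_notes():
        print("Type a note and press Enter; enter an empty line to stop.")
        with open(NOTES_FILE, "a", encoding="utf-8") as notes:
            while True:
                line = input("> ").strip()
                if not line:
                    break
                stamp = datetime.datetime.now().isoformat(timespec="seconds")
                notes.write(f"{stamp}  {line}\n")   # one timestamped observation per line

    if __name__ == "__main__":
        take_notes()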

Freedom to document at will should also mean that the tester is free to refrain from documenting something when the documentation doesn’t add value. At the same time, the tester is responsible and accountable for that decision. In Rapid Testing, we recommend writing down (or saving, or illustrating) only the things that are necessary or valuable to the project, and only when the value of doing so exceeds the cost. This doesn’t mean no documentation; it means the most informative yet fastest and least expensive documentation that completely fulfils the testing mission. Integrating that with testing work leads, we hold, to excellent testing—but it takes practice and skill.

For most test activities, it’s possible to relay information to other people orally, or even sometimes by allowing people to observe our behaviour. (At the beginning of the Rapid Testing class, I sometimes silently hold aloft a 5″ x 8″ index card in landscape orientation. I fold it in half along the horizontal axis, and write my first name on one side using a coloured marker. Everyone in the class mimics my actions. Without a single word of instruction being given or questions being asked, either verbally or in writing, the mission has been accomplished: each person now has a tent card in front of him.)

There’s a potential risk associated with an exploratory approach: that the tester might fail to document something important. In that case, we do what skilled people do with risk: we manage it. James Bach talks at length about managing exploratory testing sessions here. Producing appropriate documentation is partly a technical process, but the technical considerations are dominated by business imperatives: cost, value, and risk. There are social considerations, too. The tester, the test lead, the test manager, the programmers, other managers, and the product owner determine collaboratively what’s important to document and what’s not so important with respect to the current testing mission. In an exploratory approach, we’re more likely to be emphasizing the discovery of new information. So we’re less likely to spend time on documenting what we will do, and more likely to document what we are doing and what we have done. We could do a good deal of preparatory reading and writing, even in an exploratory approach—but we realize that there’s an ever-increasing risk that new discoveries will undermine the worth of what we write ahead of time.

That leads directly to “our purposes”, the task that we want to accomplish when documenting something. Just as testing itself has many possible missions, so too does test documentation. Here’s a decidedly non-exhaustive list, prepared over a couple of minutes:

  • to express testing strategy and tactics for an entire project, or for projects in general
  • to keep a set of personal notes to help structure a debriefing conversation
  • to outline testing activities for a test cycle
  • to report on activities during testing execution
  • to outline attributes of a particular quality criterion
  • to catalogue ideas about risk
  • to describe test coverage
  • to account for the work that we’ve done
  • to program a machine to perform a given set of actions
  • to alert people to potential problems in the product
  • to guide a tester’s actions over a test session
  • to identify structures in the application or service
  • to provide a description of how to use a particular test tool that we’ve crafted
  • to describe the tester’s role, skills, and qualifications
  • to explain business rules to someone else on the team
  • to outline scenarios in which the product might be used or tested
  • to identify, for a tester, a specific, explicit sequence of actions to perform, input to provide, and observations to make

That last item is the classic form of highly scripted testing, and that kind of documentation is usually absent from exploratory testing. Even so, a tester can take an exploratory approach using a script as a point of departure or as a reference, just as you might use a trail map to help guide an off-trail hike (among other things, you might want to discover shortcuts or avoid the usual pathways). So when someone says that “exploratory testing is undocumented”, I hear them saying something else. I hear them saying, “I only understand one form of test documentation, and I’ve successfully ignored every other approach to it or purpose for it.”

If you look in the appendices for the Rapid Software Testing class (you can find a .PDF at http://www.satisfice.com/rst-appendices.pdf), you’ll see a large number of examples of documentation that are entirely consistent with an exploratory approach. That’s just one source. For each item in my partial list above, here’s a partial list of approaches, examples, and tools.

Testing strategy and tactics for an entire project, or for projects in general.
Look at the Satisfice Heuristic Test Strategy Model and the Context Model for Heuristic Test Planning (these also appear in the RST Appendices).

An outline of testing activities for a test cycle.
Look at the General Functionality and Stability Test Procedure for Certified for Microsoft Windows Logo. See also the OWL Quality Plan (and the Risk and Task Correlation) in the RST Appendices.

Keeping a set of personal notes to help structure a debriefing or other conversation.
See the “Beans ‘R Us Test Report” in the RST Appendices; or see my notes on testing an in-flight entertainment system which I did for fun on a flight from India to Amsterdam.

Recording activities and ideas during test execution
A video camera or a screen recording tool can capture the specific actions of a tester for later playback and review. Well-designed log files may also provide a kind of retrospective record of what was tested. Still, neither of these provides insight into the tester’s mind. Recorded narration or conversation can do that; tools like BB Test Assistant, Camtasia, or Morae can help. The classic approach, of course, is to take notes. Have a look at my presentation, “An Exploratory Tester’s Notebook”, which has examples of freestyle notes taken during an impromptu testing session, and detailed, annotated examples of Session-Based Test Management sessions. Shmuel Gerson’s Rapid Reporter and Jonathan Kohl’s Session Tester are tools oriented towards taking notes (and, in the former case, including screen captures) of testing sessions on the fly.

Outlining many attributes of a particular quality criterion
See “Heuristics of Software Testability” in the RST Appendices for one example.

Cataloguing ideas about risk
There are several examples of this in the RST Appendices, most extensively in the “Deployment Planning and Risk Analysis” example. You’ll also find an “Install Risk Catalog”; “The Risk of Incompatibility”; the Risk vs. Tasks section in the “OWL Quality Plan”; the “Y2K Compliance Report”; and “Round Results Risk A”, which shows a mapping of Risk Areas vs. Test Strategy and Tasks.

Describing or outlining test coverage
A mapping establishes or illustrates relationships between things. In testing, a map might look like a road map, but it might also look like a list, a chart, a table, or a pile of stories; we can use any of these to help us think about test coverage. These can be constructed before, after, or during a given test activity, with the goal of covering the map with tests, or using testing to extend the map. I catalogued several ways of thinking about coverage and reporting on it in three articles: Got You Covered, Cover or Discover, and A Map By Any Other Name. Several examples of lightweight coverage outlines can be found in the RST Appendices (“Putt Putt Saves the Zoo” and “Table Formatting Test Notes”). There are also coverage ideas incorporated into the Apollo mission notes that we’ve titled “Guideword Heuristics for Astronauts”.

Accounting for testing work that we’ve done.
See Session-Based Test Management, and see “An Exploratory Tester’s Notebook“. Darren McMillan provides excellent examples of annotated mind maps; scroll down to the section headed “Session Reports”, and continue through “Simplifying feedback to management” and “Simplifying feedback to groups”. A forthcoming article, written by me, shows how a senior test manager tracks testing sessions at a half-day granularity level.

Programming a machine to help you to explore
See all manner of books on programming, both references and cookbooks, but for testers in particular, have a look at Brian Marick’s Everyday Scripting with Ruby. Check out Pete Houghton’s splendid examples of exploratory test automation that begin here. Cem Kaner (often in collaboration with Doug Hoffman) writes extensively about automation-assisted exploratory testing; an example is here.

Alerting people to potential problems in the product
In general, bug reporting systems provide one way to handle the task of recording and reporting problems in the product. James Bach provides an example of a report that he provided to a client (along with a more informal account of the session).

Guiding a tester’s actions over a test session
Guiding a tester involves skills like chartering and checklisting. Start with the documentation on Session Based Test Management (http://www.satisfice.com/sbtm). Selena Delesie has produced an excellent blog post on chartering exploratory testing sessions. The title of Cem Kaner’s presentation at CAST 2008, The Value of Checklists and the Danger of Scripts: What legal training suggests for testers, describes the content perfectly. Michael Hunter’s You Are Not Done Yet lists can be used and adapted to your context as a set of checklists.

To identify structures in the application or service
The “Product Elements” section in the Heuristic Test Strategy Model provides a kind of framework for documenting product structures. In the RST Appendices, the test notes for “Putt Putt Saves the Zoo” and “Diskmapper”, and the “OWL Quality Plan” provide examples of identifying several different structures in the programs under test. Mind mapping provides a means of describing and illustrating structures, too; see Darren McMillan’s examples here and here. Ruud Cox and Ru Cindrea used a mind map of product elements to help win the Best Bug Report award in the Test Lab at EuroSTAR 2011. I’ve created a list of structures that support exploratory testing, and many of these are related to structures in the product.

Providing a description of how to use a particular test tool that we’ve crafted
While working at a bank, I developed (in Excel and VBA) a tool that could be used as an oracle and as a way of recording test results. (Thanks to non-disclosure agreements, I can describe these, but cannot provide examples.) When I left the project, I was obliged to document my work. I didn’t work on the assumption that anyone off the street would be reading the document. Instead, I presumed that anyone assigned to that testing job and to using that tool, would have the rapid learning skill to explore the tool, the product, and the business domain in a mutually supportive way. So I crafted documentation that was intended to tell testers just enough to get them exploring.

Explaining business rules to someone else on the team
I did include documentation for novices of one kind: within the documentation for that testing tool, I included a general description of how foreign exchange transactions worked from the bank’s perspective, and how appropriate accounts got credited and debited. I had learned this by reverse-engineering use cases and consulting with the local business analyst. I summarized it with a two-page document written in simple, direct language, referring directly to the simpler use cases and explaining the more confusing bits in more detail. For those whose learning style was oriented toward code, I also described the tables and array formulas that applied the business rules.

Outlining scenarios in which the product might be used or tested
I discuss some issues about scenarios here—why they’re important, and why it’s important to keep them open-ended and open to interpretation. It’s more important to record than to prescribe, since in a good scenario, you’ll observe and discover much more than you’ve articulated in advance. Cem Kaner gives ideas on how to produce scenarios; Hans Buwalda presents examples of soap opera testing.

Identifying required tester skill
People with skill don’t need prescriptive documentation for every little thing. Responsible managers identify the skills needed to test, and commit to employing people who either have those skills or can develop them quickly. James Bach eliminated 50 pages of otiose documentation with two paragraphs. (Otiose is a marvelous word; it’s fun to look it up in a thesaurus.)

Identifying, for a tester, a particular explicit sequence of actions to perform, input to provide, and observations to make.
Again, a document that attempts to specify exactly what a tester should do is the hallmark of scripted testing. James Bach articulates a paradox that has not yet been noted clearly in our craft: in order to perform a scripted test well, you need significant amounts of skill and tacit knowledge (and you also need to ignore the script on occasion, and you need to know when those occasions are). There’s another interesting issue here: preparing such documents usually depends on exploratory activity. There’s no script to tell you how to write a script. (You might argue there’s one exception. You can follow this script to write a test script: take each line of a requirements document, and add the words “Verify that” to the beginning of each line.)

Now, just as you can perform testing badly using any approach, you can perform exploratory testing and document it inappropriately, either by under-documenting it OR over-documenting it using any of the kinds of documentation above. But, as this document shows, the notion that exploratory testing is by its nature undocumented is not only ignorant, but aggressively ignorant about both testing and documentation. Whenever you see someone claim that exploratory testing is undocumented, I’d ask you to help by setting the record straight. Feel free to refer to this blog post, if you find it helpful; also, please point me to other exemplars of excellent documentation that are consistent with exploratory approaches. If we all work together, we can bury this myth, while providing excellent records and reports for our clients.

This is the end of the series “What Exploratory Testing Is Not”, for me. But James Bach has one more.

And, of course, in the face of all these instances of what exploratory testing is not, you might want to know our current take on what exploratory testing is.

What Exploratory Testing Is Not (Part 4): Quick Tests

Sunday, December 18th, 2011

Quick testing is another approach to testing that can be done in a scripted way or an exploratory way. A tester using a highly exploratory approach is likely to perform many quick tests, and quick tests are often key elements in an exploratory approach. Nonetheless, quick testing and exploratory testing aren’t the same.

Quick tests are inexpensive tests that require little time or effort to prepare or perform. They may not even require a great deal of knowledge about the application being tested or its business domain, but they can help to develop new knowledge rapidly. Rather than emphasizing comprehensiveness or integrity, quick tests are designed to reveal information in a hurry at a minimal cost.

A quick test can be a great way to learn about the product, or to identify areas of risk, fragility, or confusion. A tester can almost always sneak a quick test or two into other testing activity. A burst of quick tests can help as the first activities during a smoke or sanity test. Cycles of relatively unplanned, informal quick tests may help you to discover or refine a more comprehensive or formal plan.

James Bach and I provide examples of many kinds of quick tests in the Rapid Software Testing class. You’ll notice that some of them are called tours. Note that not all tours are quick, and not all quick tests are tours. Here’s a catalog.

Happy Path
Perform a task, from start to finish, that an end-user might be expected to do. Use the product in the most simple, expected, straightforward way, just as the most optimistic programmer or designer might imagine users to behave. Look for anything that might confuse, delay, or irritate a reasonable person. Cem Kaner sometimes calls this “sympathetic testing”. Lean towards learning about the product, rather than finding bugs. If you do see obvious problems, it may be bad news for the product.

Variable Tour
Tour a product looking for anything that is variable and vary it. Vary it as far as possible, in every dimension possible. If you’re using quick tests for learning, seek and catalog the important variables. Look for potential relationships between them. Identifying and exploring variations is part of the basic structure of our testing when we first encounter a product.

Sample Data Tour
Employ any sample data you can, and all that you can. For one kind of quick test, prefer simple values whose effects are easy to see or calculate. For a different kind of quick test, choose complex or extreme data sets. Observe the units or formats in which data can be entered, and try changing them. Challenge the assumption that the programmers have thought to reject or massage inappropriate data. Once you’ve got a handle on your ideas about reasonable or mildly challenging data, you might choose to try…

Input Constraint Attack
Discover sources of input and attempt to violate constraints on that input. Try some samples of pathological data: use zeroes where large numbers are expected; use negative numbers where positive numbers are expected; use huge numbers where modestly-sized ones are expected; use letters in every place that’s supposed to handle only numbers, and vice versa. Use a geometrically expanding string in a field. Keep doubling its length until the product crashes. Use characters that are in some way distinct from your idea of “normal” or “expected”. Inject noise of any kind into a system and see what happens. Use Satisfice’s PerlClip utility to create strings of arbitrary length and content; use PerlClip’s counterstring feature to create a string that tells you its own length so that you can see where an application cuts off input.
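
To make the counterstring idea concrete, here is a minimal sketch in Python (my own illustration, not PerlClip itself) of generating a string in which each number marks the position of the asterisk that follows it, so a truncated field still tells you how many characters it kept.

    # A counterstring: "*3*5*7*10*13*..." -- each number gives the 1-based position
    # of the asterisk immediately after it, so you can read the length off a truncation.
    def counterstring(length, marker="*"):
        pieces = []
        position = length
        while position > 0:
            chunk = str(position) + marker     # e.g. "13*": the "*" lands at position 13
            if len(chunk) > position:          # not enough room left; pad with markers
                chunk = marker * position
            pieces.append(chunk)
            position -= len(chunk)
        return "".join(reversed(pieces))

    if __name__ == "__main__":
        s = counterstring(256)
        print(len(s))      # 256
        print(s[:40])      # paste the full string into a field and see where it gets cut off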

People tend to talk a lot about input constraint attacks. Perhaps that’s because input constraint attacks are used by hackers to compromise systems; perhaps it’s because input constraint attacks can be performed relatively straightforwardly; perhaps it’s because they can be described relatively easily; perhaps it’s because input constraint attacks can produce dramatic and highly unexpected results. Yet they’re by no means the only kind of quick test, and they’re certainly not the only way to test using an exploratory approach.

Documentation Tour
Look in the online help or user manual and find some instructions about how to perform some interesting activity. Do those actions explicitly. Then improvise from them and experiment. If your product has a tutorial, follow it. You may expose a problem in the product or in the documentation; either way, you’ve found an inconsistency that is potentially important. Even if you don’t expose a problem, you’ll still be learning about the product.

File Tour
Have a look at the folder where the program’s .EXE file is found. Check out the directory structure, including subs. Look for READMEs, help files, log files, installation scripts, .cfg, .ini, .rc files.
Look at the names of .DLLs, and extrapolate on the functions that they might contain or the ways in which their absence might undermine the application. Use whatever supplemental material you’ve got to guide or focus your actions. Another way to gather information for this kind of test: use tools to monitor the installation, and take the output from the tool as a point of departure.
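
Here is a sketch of the tool-assisted version of this tour, assuming nothing beyond Python’s standard library; the install path is an invented example, and the extensions to flag are simply the ones mentioned above.

    # Walk the product's install folder, print files worth a closer look,
    # and summarize what's there by extension.
    import os
    from collections import Counter

    INSTALL_DIR = r"C:\Program Files\SomeProduct"   # invented example path

    def file_tour(root, interesting=(".ini", ".cfg", ".rc", ".log", ".dll")):
        counts = Counter()
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                ext = os.path.splitext(name)[1].lower() or "(no extension)"
                counts[ext] += 1
                if ext in interesting:
                    print(os.path.join(dirpath, name))   # candidates for closer inspection
        for ext, n in counts.most_common():
            print(f"{n:5d}  {ext}")

    if __name__ == "__main__":
        file_tour(INSTALL_DIR)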

Complexity Tour
Tour a product looking for the most complex features, the most challenging data sets, and the biggest interdependencies. Look for hidden nooks and crannies, but also look for the program’s high-traffic areas, busy markets, big office buildings, and train stations—places where there are lots of interactions, and where bugs might be blending in with the crowd.

Menu, Window, and Dialog Tour
Tour a product looking for all the menus (main and context menus), menu items, windows, toolbars, icons, and other controls. Walk through them. Try them. Catalog them, or construct a mind map.

Keyboard and Mouse Tour
Tour a product looking for all the things you can do with a keyboard and mouse. Hit all of the keys on the keyboard. Hit all the F-keys; hit Enter, Tab, Escape, Backspace; run through the alphabet in order, and combine each key with Shift, Ctrl, Alt, and the Windows key (or with Cmd and Option on other platforms, or with the AltGr key in Europe). Click (right, left, both, double, triple) on everything. Combine clicks with shifted keys.

Interruptions
Start activities and stop them in the middle. Stop them at awkward times. Perform stoppages using cancel buttons or O/S-level interrupts (Ctrl-Alt-Delete or the task manager). Arrange for other programs to interrupt (such as screensavers or virus checkers). Also, try suspending an activity and returning later. Put your laptop into sleep or hibernation mode.

Undermining
Start using a function when the system is in an appropriate state, then change the state part way through.
Delete a file while it is being edited; eject a disk; pull network cables or power cords to get the machine into an inappropriate state. This is similar to interruption, except that you are expecting the function to interrupt itself by detecting that it can no longer proceed safely.

Adjustments
Set some parameter to a certain value, then, at any later time, reset that value to something else without resetting or recreating the containing document or data structure. Programmers often expect settings or variables to be adjusted through the GUI. Hackers and tinkerers expect to find other ways.

Dog Piling
Whatever you’re doing, do more of it and do other stuff as well while you’re at it. Get more processes going at once; try to establish more states existing concurrently. Invoke nested dialog boxes and non-modal dialogs. On multi-user systems, get more people using the system or simulate that with tools. If your test seems to trigger odd behaviour, pile on in the same place until the odd becomes crazy.
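
If you want a tool to help with the piling on, here is a minimal sketch, assuming a hypothetical local web endpoint (the URL is invented); the same shape works for any operation you can drive from code.

    # Dog piling with threads: many concurrent requests against one spot in the product.
    import concurrent.futures
    import urllib.request

    URL = "http://localhost:8080/search?q=penguin"   # invented example endpoint

    def hit(i):
        try:
            with urllib.request.urlopen(URL, timeout=10) as response:
                return i, response.status
        except Exception as exc:            # record failures rather than stopping the pile
            return i, repr(exc)

    if __name__ == "__main__":
        with concurrent.futures.ThreadPoolExecutor(max_workers=50) as pool:
            for i, outcome in pool.map(hit, range(500)):
                print(i, outcome)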

Continuous Use
While testing, do not reset the system. Leave windows and files open; let disk and memory usage mount.
You’re hoping to show that the system loses track of things or ties itself in knots over time.

Feature Interactions
Discover where individual functions interact or share data. Look for any interdependencies. Explore them, exploit them, and stress them out. Look for places where the program repeats itself or allows you to do the same thing in different places. For example, look for data that can be displayed in different ways and in different places, and seek inconsistencies; or load up all the fields in a form to their maximums and then traverse to the report generator.

Summon Help
Bring up the context-sensitive help feature during some operation or activity. Does the product’s help file explain things in a useful way, or does it offend the user’s intelligence by simply restating what’s already on the screen? Is help even available at all?

Click Frenzy
Ever notice how a cat or a kid can crash a system with ease? Testing is more than “banging on the keyboard”, but that phrase wasn’t coined for nothing. Try banging on the keyboard. Try clicking everywhere. Poke every square centimeter of every screen until you find a secret button.

Shoe Test
Use auto-repeat on the keyboard for a very cheap stress test. Look for dialog boxes constructed such that pressing a key leads to, say, another dialog box (perhaps an error message) that also has a button connected to the same key that returns to the first dialog box. Place a shoe on the keyboard and walk away. Let the test run for an hour. If there’s a resource or memory leak, this kind of test could expose it. Note that some lightweight automation can provide you with a virtual shoe.
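
A virtual shoe might look something like this minimal sketch, which assumes the third-party pyautogui library is installed; focus the dialog you want to stress before running it, and note that the key, count, and interval are arbitrary choices.

    # A "virtual shoe": press the same key over and over, as a resting shoe would.
    # Move the mouse to a screen corner to trigger pyautogui's fail-safe and abort.
    import time
    import pyautogui

    def virtual_shoe(key="enter", presses=3600, interval=1.0):
        for _ in range(presses):
            pyautogui.press(key)   # send one keystroke to whatever has focus
            time.sleep(interval)   # pace it so you can watch for leaks or slowdowns

    if __name__ == "__main__":
        virtual_shoe()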

Blink Test
Find an aspect of the product that produces huge amounts of data or does some operation very quickly. Look through a long log file or browse database records, deliberately scrolling too quickly to see in detail. Notice trends in line lengths, or the look or shape of the data. Use Excel’s conditional formatting feature to highlight interesting distinctions between cells of data. Soften your focus. If you have a test lab with banks of monitors, scan them or stroll by them; patterns of misbehaviour can be surprisingly prominent and easy to spot.
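
Here is one way a tool can support a blink test on a log file, sketched under the assumption that unusually long or short lines are worth a second look; the default file name and the threshold factor are arbitrary.

    # Flag lines whose length is far from the median -- a crude way to make
    # anomalies stand out in a log that's too long to read in detail.
    import statistics
    import sys

    def flag_odd_lines(path, factor=2.0):
        with open(path, encoding="utf-8", errors="replace") as f:
            lines = f.read().splitlines()
        median = statistics.median(len(line) for line in lines) or 1
        for number, line in enumerate(lines, start=1):
            if not line.strip():
                continue                      # blank lines aren't interesting here
            if len(line) > factor * median or len(line) < median / factor:
                print(f"{number:7d} ({len(line):5d} chars): {line[:80]}")

    if __name__ == "__main__":
        flag_odd_lines(sys.argv[1] if len(sys.argv) > 1 else "application.log")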

Error Message Hangover
Programmers are rewarded for implementing features. There’s a psychological problem with errors or exceptions: the label itself suggests that something has gone wrong. People often actively avoid thinking about problems or mistakes, and as a consequence, programmers sometimes handle errors poorly. Make error messages happen and test hard after they are dismissed. Watch for subtle changes in behaviour between error and normal paths. With automation, make the same error conditions appear thousands of times.

Resource Starvation
Progressively lower memory, disk space, display resolution, and other resources. Keep starving the product until it collapses, or gracefully (we hope) degrades.

Multiple Instances
Run a lot of instances of the app at the same time. Open, use, update, and save the same files. Manipulate them from different windows. Watch for competition for resources and settings.

Crazy Configs
Modify the operating system’s configuration in non-standard or non-default ways, either before or after installing the product. Turn on “high contrast” accessibility mode, or change the localization defaults. Change the letter of the system hard drive. Put things in non-default directories. Use RegEdit (for registry entries) or a text editor (for initialization files) to corrupt your program’s settings in a way that should trigger an error message, recovery, or an appropriate default behavior.
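
For initialization files, a tool can do the corrupting for you. This is a hypothetical sketch using Python’s standard configparser, with an invented path, section, key, and value; it backs up the original so you can restore the machine’s state afterwards.

    # Corrupt one setting in an .ini file and see whether the product recovers,
    # complains usefully, or falls over.
    import configparser
    import shutil

    SETTINGS = r"C:\ProgramData\SomeProduct\settings.ini"   # invented example path

    def corrupt_setting(path, section="Display", key="WindowWidth", bad_value="-1e999banana"):
        shutil.copy(path, path + ".bak")       # keep a copy so the state can be restored
        config = configparser.ConfigParser()
        config.read(path)
        if not config.has_section(section):
            config.add_section(section)
        config.set(section, key, bad_value)    # a value no reasonable user would enter
        with open(path, "w", encoding="utf-8") as f:
            config.write(f)

    if __name__ == "__main__":
        corrupt_setting(SETTINGS)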

Again: quick tests tend to be highly exploratory, but they represent only one kind of exploratory testing. Don’t be fooled into believing that quick testing—or certain kinds of quick testing—is all there is to exploratory testing.

Next (and last) in the series: What Exploratory Testing Is Not (Part 5): Undocumented Testing

And, of course, in the face of all these instances of what exploratory testing is not, you might want to know our current take on what exploratory testing is.

What Exploratory Testing Is Not (Part 3): Tool-Free Testing

Saturday, December 17th, 2011

People often make a distinction between “automated” and “exploratory” testing. This is like the distinction between “red” cars and “family” cars. That is, “red” (colour) and “family” (some notion of purpose) are in orthogonal categories. A car can be one colour or another irrespective of its purpose, and a car can be used for a particular purpose irrespective of its colour. Testing, whether exploratory or not, can make heavy or light use of tools. Testing, whether it entails the use of tools or not, can be highly scripted or highly exploratory.

“Exploratory” testing is not “manual” testing. “Manual” isn’t a useful word for describing software testing in any case. When you’re testing, it’s not the hands that do the testing, any more than when you’re riding a pedal bike it’s the feet that do the bike-riding. The brain does the testing; the hands, at best, provide one means of input and interaction with the thing we’re testing. And not even “manual” testing is manual in the sense of being tool- or machinery-free. You do use a computer when you’re testing, don’t you?

(Well, mostly, but not always. If you’re reviewing requirements, specifications, code, or documentation, you might be looking at paper, but you’re still testing. A thought experiment or a conversation about a product is a kind of a test; you’re questioning something in order to evaluate it, pitting ideas against other ideas in an unscripted way. While you’re reviewing, are you using a pen to annotate the paper you’re reading? A notepad to record your observations? Sticky tabs to mark important places in the text? Then you’re using tools, low-tech as they might be.)

Some people think of test automation in terms of a robot that pounds on virtual keys more quickly, more reliably, and more deterministically than a human could. That’s certainly one potential notion of test automation, but it’s very limiting. That traditional view of test automation focuses on performing checks, but that’s not the only way in which automation can help testing.

In the Rapid Software Testing class, James Bach and I suggest a more expansive view of test automation: any use of (software- or hardware-based) tools to support testing. This helps keep us open to the idea that machines can help us with almost any of the mimeomorphic, non-sapient aspects of testing, so that we can focus on and add power to the polimorphic, sapient aspects. Exploration is polimorphic activity, but it can include and be supported by mimeomorphic actions. Cem Kaner and Doug Hoffman take a similar tack: exploratory test automation is “computer-assisted testing that supports learning of new information about the quality of the software under test.” Learning new information is one of the hallmarks of exploratory testing, which usually points towards emphasizing variation rather than repetition.

That said, there can be a role for mechanized repetition, even when you’re using a highly exploratory approach: when repeating aspects of the test are intended to support discovery of something new or surprising. The key is not whether you’re mechanizing the activity. The key is what happens at the end of the activity. The less the results of one activity are permitted to inform the next, the more scripted the approach. If the repetition is part of a learning loop—a cycle of probing, discovering, investigating, and interpreting—that feeds back on itself immediately, then the approach is exploratory. James has also posted a number of motivations for repeating tests. Each one can (with the possible exception of “avoidance or indifference”) be entirely consistent with and supportive of exploration.

There are some actions that tools can perform better than humans, as long as the action doesn’t require human judgment or wisdom. Humanity can even get in the way of some desirable outcome. For example, when your exploration of some aspect of a product is based on statistical analysis, and randomization is part of the test design, it’s important to remember that people are downright lousy at generating randomized data. Even when people believe that they’re choosing numbers at random, there are underlying (and usually quite unconscious) patterns and biases that inform their choices. If you want random numbers, tools can help.
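
Here is a small sketch of what that tool support might look like, using nothing but Python’s standard library; the field names and value pools are invented, and the seed is recorded so that a surprising run can be reproduced and investigated later.

    # Tool-generated "random" data: less biased than a human improvising values,
    # and reproducible when you keep the seed with your notes.
    import random
    import string

    def random_record(rng):
        return {
            "amount": rng.choice([0, -1, 1, 2**31 - 1, rng.uniform(-1e9, 1e9)]),
            "name": "".join(rng.choice(string.printable) for _ in range(rng.randint(0, 64))),
            "country": rng.choice(["CA", "DE", "IN", "JP", ""]),
        }

    if __name__ == "__main__":
        rng = random.Random(20111217)          # record the seed in your session notes
        for _ in range(5):
            print(random_record(rng))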

Tools can support exploration in plenty of other ways: data generation; system configuration; simulation; logging and video capture; probes that examine the internal state of the system; oracles that detect certain kinds of error conditions in a product or generate plausible results for comparison; visualization of data sets, key elements to observe, relationships, or timing; recording and reporting of test activity.

A few years back, I was doing testing of a teller workstation application at a bank (I’ve written about this in How to Reduce the Cost of Software Testing). The other testers, working on domestic transactions, were working from scripts that contained painfully detailed and explicit steps and observations. (Part of the pain came from the fact that the scripts were supplemented with screen shots, and the text and the images didn’t always agree.) My testing assignment involved foreign exchange, and the testing tasks I had been given were unscripted and, to a large degree, self-determined. In order to learn the application quickly, I had to explore, but this in no way meant that I didn’t use tools. On the contrary, in fact. In that context, Excel was the most readily available and powerful tool on hand. I used it (and its embedded Visual Basic for Applications) to:

  • maintain and update (at a key stroke) enormous tables of currencies, rates, and transaction types
  • access appropriate entries from the table via regular expression parsing
  • model the business rules of the application under test
  • display the intended flow of money through a transaction
  • add visual emphasis to the salient outcomes of tests and test scenarios
  • provide, using a comparable algorithm, clear results to which the product’s results could be compared
  • help in performing extremely rapid evaluation of a test idea
  • create tables of customer data so that I could perform a test using a variety of personas
  • accelerate my understanding of the product and the test space
  • enhance my learning about Boolean algebra and how it could be used in algorithms
  • record my work and illustrate outcomes for my clients
  • perform quick calculations when necessary
  • help me find more actual problems than the other four testers combined

All of this activity happened in a highly exploratory way; each of the activities interacted with the others. I used very rapid cycles of looking at what I needed to learn next about the application, experimenting with and performing tests, programming, asking questions of subject matter experts and programmers and managers, reporting, reading reference documentation, debugging, and learning. Tight loops of activities happening in parallel are what characterize exploratory processes. Yet this was not tool-free work; tools were absolutely central to my exploration of the product, to my learning about it, and to the mission of finding bugs. Indeed, without the tools, I would have had much more limited ideas about what could be tested, and how it could be tested.
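
I can’t show the bank’s actual spreadsheet, but here is a minimal, hypothetical sketch of the comparable-algorithm idea in Python rather than VBA: an independent model of an invented conversion rule, used to check the product’s result. The rates, rounding rule, and amounts are all illustrative assumptions.

    # A "comparable algorithm" oracle: recompute the expected result independently
    # and compare it with what the product reported.
    from decimal import Decimal, ROUND_HALF_UP

    RATES = {("CAD", "USD"): Decimal("0.7400"),    # invented rate table
             ("USD", "CAD"): Decimal("1.3514")}

    def expected_settlement(amount, sell, buy):
        converted = Decimal(amount) * RATES[(sell, buy)]
        return converted.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

    def check(amount, sell, buy, product_result):
        expected = expected_settlement(amount, sell, buy)
        verdict = "consistent" if expected == Decimal(product_result) else "INVESTIGATE"
        print(f"{amount} {sell}->{buy}: product={product_result}, oracle={expected}: {verdict}")

    if __name__ == "__main__":
        check("1000.00", "CAD", "USD", "740.00")
        check("1000.00", "USD", "CAD", "1351.35")   # deliberately mismatched: the oracle flags it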

The explorers of old used tools: compasses and astrolabes, maps and charts, ropes and pulleys, ships and wagons. These days, software testers explore applications by using mind-mapping software and text editors; spreadsheets and calculators; data generation tools and search engines; scripting tools and automation frameworks. The concept that characterizes exploratory testing is not the input mechanism, which can be fingers on a keyboard, tables of data pumped into the program via API calls, bits delivered through the network, or signals from a variable voltage controller. Exploratory testing is about the way you work, and the extent to which test design, test execution, and learning support and reinforce each other. Tools are often a critical part of that process.

Next in the series: What Exploratory Testing Is Not (Part 4): Quick Tests

And, of course, in the face of all these instances of what exploratory testing is not, you might want to know our current take on what exploratory testing is.

What Exploratory Testing Is Not (Part 2): After-Everything-Else Testing

Friday, December 16th, 2011

Exploratory testing is not “after-everything-else-is-done” testing. Exploratory testing can (and does) take place at any stage of testing or development.

Indeed, TDD (test-driven development) is a form of exploratory development. TDD happens in loops, in which the programmer develops a check, then develops the code to make the check pass (along with all of the previous checks), then fixes any problems that she has discovered, and then loops back to implementing a new bit of behaviour and inventing a new check. The information obtained from each loop feeds into the next; and the activity is guided and structured by the person or people involved in the moment, rather than in advance. The checks themselves are scripted, but the activity required to produce them and to analyze the results is not. Compared to the complex cognitive activity—exploratory, iterative—that’s going on as code is being developed, the checks themselves—scripted, linear—are trivial.
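
As a tiny illustration of one such loop (my own example, not from the post), in pytest style: the checks are written first, then just enough code to make them and the earlier checks pass; the fee rule here is invented.

    # One TDD loop: the checks below were written first; the function is just enough
    # code to make them pass, and the next loop will refine it.
    def withdrawal_fee(amount):
        return 0 if amount <= 1000 else round(amount * 0.01, 2)

    def test_no_fee_for_small_withdrawals():
        assert withdrawal_fee(500) == 0

    def test_one_percent_fee_above_threshold():
        assert withdrawal_fee(2000) == 20.0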

Requirement review is an exploratory activity too. Review of requirements (or specifications, or user stories, or examples) tends to happen early on in a development cycle, whether it’s a long or a short cycle. While review might be guided by checklists, the people involved in the activity are making decisions on the fly as they go through loops of design, investigation, discovery, and learning. The outcome of each loop feeds back into the next activity, often immediately.

Code review can also be done in a scripted way or an exploratory way. When humans analyze the code, it’s an unscripted, self-directed activity that happens in loops; so it is exploratory. We call it review, but it’s gathering information with the intention of informing a decision; so it is testing. There is a way to review code that involves the application of scripted processes, via tools that people generally call “static testing tools”. When a machine parses code and produces a report, by definition it’s a form of checking, and it’s scripted. Yet using those tools productively requires a great deal of exploratory activity. Parsing and interpreting the report and responding to it is polimorphic, human action—unscripted, open-ended, iterative, and therefore exploratory.

Learning about a new product or a new feature is an exploratory activity if you want to do it or foster it well. Some suggest that test scripts provide a useful means of training testers. Research into learning shows that people tend to learn more quickly and more deeply when their learning is based on interaction and feedback; guided, perhaps, but not controlled. If you really want to learn about a product, try creating a mind map, documenting some aspect of the program’s behaviour, or creating plausible scenarios in which people might use—or misuse—the product. All of these activities promote learning, and they’re all exploratory activities. There’s far more information that you can use, apply, and discover than a script can tell you about. Come to think of it… where does the script come from?

Developing a test procedure—even developing a test script, whether for a machine or a human to follow, or developing the kind of “test” that skilled testers would call a demonstration—is an exploratory activity. There is no script that specifies how to write a new script for a particular purpose. Heard about a new feature and pondering how you might test it? You’ve already begun testing; you’re doing test design and you’re probably learning as you go. To the extent that you use the product or interact with it, bounce ideas off other people, or think critically about your design, you’re testing, and you’re doing it in an unscripted way. Some might suggest that certain tools create scripts that can perform automatic checks. Yet reviewing those checks for appropriateness, interpreting the results, and troubleshooting unexpected outcomes are all exploratory activities.

Suppose that a programmer, midway through a sprint, decides that she’d like some feedback on the work that she’s done so far on a new module, and hands you a bit of code to look at. You might interact with the code directly through a test tool that she provided, or (say) via the Ruby interpreter, or you might write some script code to exercise some of the functions in the module. In any event, you find some problems in it. In order to investigate a problem that you’ve discovered, you must explore, whether your recognition of the problem was triggered by your own interaction with the program or by a mechanically executed script. You’re in control of the activity; each new test around the problem feeds back into your choice of the next activity, and into the story that you’re going to tell about the product.

All of the larger activities that I’ve described above are exploratory, and they all happen before you have a completed function or story or sprint. Exploratory testing is not a stage or phase of testing to be performed after you’ve performed your other test techniques. Exploratory testing is not an “other” test technique, because it’s not a technique at all. Exploratory testing is not a thing that you do, but rather a way that you work (and think, and act), the hallmarks being who (or what) is in control, and the extent to which your activity is part of a loop, rather than a straight line. Any test technique can be applied in a scripted way or in an exploratory way. To those who say “we do exploratory testing after our acceptance tests are all running green”, I would suggest looking carefully and observing the extent to which you’re doing exploratory testing all the way along.

Next in the series: What Exploratory Testing Is Not (Part 3): Tool-Free Testing

And, of course, in the face of all these instances of what exploratory testing is not, you might want to know our current take on what exploratory testing is.

What Exploratory Testing Is Not (Part 1): Touring

Thursday, December 15th, 2011

Touring is one way of structuring exploratory testing, but exploratory testing is not necessarily touring, and touring is not necessarily exploratory.

At one extreme, a tourist might parachute into a territory for which there is no detailed knowledge of the landscape, flora and fauna, or human culture, with the goal of identifying what’s there to be learned. Except that in such cases, we wouldn’t call her a tourist; we’d call her an anthropologist, or a field botanist, or a field geologist, or an archaeologist. The activity in this case is interactive with the territory. At the other extreme, a tourist might visit a travel agent, get on a plane to Orlando, meet a chartered bus at the airport, and sit through the rides at Disney World. The touring activity there is largely passive, and we would call someone engaged in it a “tourist”—although “engagement” is something of an exaggeration here. For this kind of person, serious explorers and locals would probably use the word “tourist” in a somewhat deprecating kind of way.

Touring a program can be done in a more scripted or more exploratory way, just as touring a city can be done in a more scripted or more exploratory way. A tourist has many options. Before going on a trip, a tourist might study what is already known about a particular destination. To prepare, she might supply herself with maps and travel guides, and some ideas about destinations of interest. Upon arrival, she might choose a set of walking tours from a guidebook and follow the routes closely, eating only at the restaurants identified in the guidebook, noting buildings and artifacts and other objects of interest by matching them with the descriptions and illustrations. At a given site, she might listen to a prepared audio guide that directs her observations very specifically. She might spend all of her time in the presence of a tour guide who tells her what to observe and how to interpret it. She might choose to accept everything the tour guide told her as the complete story, and refrain from asking questions. Even though the experience would be new to her, and she might learn something from it, she would not likely add much to what is already known. We call that activity touring, but it isn’t very exploratory, and a report on such a tour would largely recapitulate the guidebook. Is your testing like that?

On the other hand, rather than touring like a tourist, she might cover a territory as a historian, or a social scientist, or a travel writer. In that kind of role, she would have a research goal based on the idea of obtaining new knowledge. Learning something new and imparting it to other people requires a more open agenda than sitting on the bus while someone or something else directs your attention. Our researcher might make her way directly to particular destinations or landmarks and begin her research there, or she might amble through neighbourhoods or historical sites to discover new things about them. She could choose to focus on specific aspects of what’s there to observe, or she could choose to let the observations come to her—and, of course, she might do both. She might work with descriptions that she had been given with the intention of adding to them, or she might work from a set of questions that haven’t been asked before. Depending on her mission, she might choose to look for specific patterns or problems, or she might seek deeper understanding that would help her to identify or refine what kind of patterns or problems to look for. Even though the mission to discover new information might come from someone else, she remains in control of the specifics of the itinerary and of each activity from one moment to the next. Is your testing more like that?

One of the hallmarks of exploratory activity is the extent to which it is guided and structured by the person performing that activity. Another hallmark is the extent to which new knowledge feeds into the choice of which action to perform next. Touring is not equivalent to exploration; touring can be done in a scripted way or an exploratory way.

This series continues with What Exploratory Testing Is Not (Part 2): After-Everything-Else Testing.

And, of course, in the face of all the forthcoming instances of what exploratory testing is not, you might want to know our current take on what exploratory testing is.

Testing Problems Are Test Results

Tuesday, September 6th, 2011

I often do an exercise in the Rapid Software Testing class in which I ask people to catalog things that, for them, make testing harder or slower. Their lists fit a pattern I hear over and over from testers (you can see an example of the pattern in this recent question on Stack Exchange). Typical points include:

  • I’m a tester working alone with several programmers (or one of a handful of testers working with many programmers).
  • I’m under enormous time pressure. Builds are coming in continuously, and we’re organized on one- or two-week development cycles.
  • The product(s) I’m testing is (are) very complex.
  • There are many interdependencies between modules within the product, or between products.
  • I’m seeing a consistent pattern of failures specifically related to those interdependencies; the tiniest change here can have devastating impact there—or anywhere.
  • I believe that I have to run a complete regression test on every build to try to detect those failures.
  • I’m trying to cope by using automated checks, but the complexity makes the automation difficult, the program’s testing hooks are minimal at best, and frequent product changes make the whole relationship brittle.
  • The maintenance effort for the test automation is significant, at a cost to other testing I’d like to do.
  • I’m feeling overwhelmed by all this, but I’m trying to cope.

On top of that,

  • The organization in which I’m working calls itself Agile.
  • Other than the two-week iterations, we’re actually using at most two other practices associated with Agile development (typically daily scrums or Kanban boards).

Oh, and for extra points,

  • The builds that I’m getting are very unstable. The system falls over under the most basic of smoke tests. I have to do a lot of waiting or reconfiguring or both before I can even get started on the other stuff.

How might we consider these observations?

We could choose to interpret them as problems for testing, but we could think of them differently: as test results.

Test results don’t tell us whether something is good or bad, but they may inform a decision, or an evaluation, or more questions. People observe test results and decide whether there are problems, what the problems are, what further questions are warranted, and what decisions should be made. Doing that requires human judgement and wisdom, consideration of lots of factors, and a number of possible interpretations.

Just as for automated checks and other test results, it’s important to consider a variety of explanations and interpretations for testing meta-results—observations about testing. If we don’t do that, we risk missing important problems that threaten the quality of the testing effort, and the quality of the product, too.

As Jerry Weinberg points out in Perfect Software and Other Illusions About Testing, whatever else something might be, it’s information. If testing is, as Jerry says, gathering information with the intention of informing a decision, it seems a mistake to leave potentially valuable observations lying around on the floor.

We often run into problems when we test. But instead of thinking of them as problems for testing, we could also choose to think of them as symptoms of product or project problems—problems that testing can help to solve.

For example, when a tester feels outnumbered by programmers, or when a tester feels under time pressure, that’s a test result. The feeling often comes from the programmers generating more work and more complexity than the tester can handle without help.

Complexity, like quality, is a relationship between some person and something else. Complexity on its own isn’t necessarily a problem, but the way people react to it might be. When we observe the ways in which people react to perceived complexity and risk, we might learn a lot.

  • Do we, as testers, help people to become conscious of the risks—especially the Black Swans—that typically accompany complexity?
  • If people are conscious of risk, are they paying attention to it? Are they panicking over it? Or are they ignoring it and whistling past the graveyard? Or…
  • Are people reacting calmly and pragmatically? Are they acknowledging and dealing with the complexity of the product?
  • If they can’t make the product or the process that it models less complex, are they at least taking steps to make that product or process easier to understand?
  • Might the programmers be generating or modifying code so quickly that they’re not taking the time to understand what’s really going on with it?
  • If someone feels that more testers are needed, what’s behind that feeling? (I took a stab at an answer to that question a few years back.)

How might we figure out answers to those questions? One way might be to look at more of the test results and test meta-results.

  • Does someone perceive testing to be difficult or time-consuming? Who?
  • What’s the basis for that perception? What assumptions underlie it?
  • Does the need to investigate and report bugs overwhelm the testers’ capacity to obtain good test coverage? (I wrote about that problem here.)
  • Does testing consistently reveal consistent patterns of failure?
  • Are programmers consistently surprised by such failures and patterns?
  • Do small changes in the code cause problems that are disproportionately large or hard to find?
  • Do the programmers understand the product’s interdependencies clearly? Are those interdependencies necessary, or could they be eliminated?
  • Are programmers taking steps to anticipate or prevent problems related to interfaces and interactions?
  • If automated checks are difficult to develop and maintain, does that say something about the skill of the tester, the quality of the automation interfaces, or the scope of checks? Or about something else?
  • Do unstable builds get in the way of deeper testing?
  • Could we interpret “unstable builds” as a sign that the product has problems so numerous and serious that even shallow testing reveals them?
  • When a “stable” build appears after a long series of unstable builds, how stable is it really?

Perhaps, with the answers to those questions, we could raise even more questions.

  • What risks do those problems present for the success of the product, whether in the short term or the longer term?
  • When testing consistently reveals patterns of failures and attendant risk, what does the product team do with that information?
  • Are the programmers mandated to deliver code? Or are the programmers mandated to deliver code with a warrant that the code does what it should (and doesn’t do what it shouldn’t), to the best of their knowledge? Do the programmers adamantly prefer the latter mandate?
  • Is someone pressuring the programmers to make schedule or scope commitments that they can’t really fulfill?
  • Are the programmers and the testers empowered to push back on scope or schedule pressure when it adds to product or project risk?
  • Do the business people listen to the development team’s concerns? Are they aware of the risks that testers and programmers bring to their attention? When the development team points out risks, do managers and business people deal with them congruently?
  • Is the team working at a sustainable pace? Or are the product and the project being overwhelmed by complexity, interdependencies, fragility, and problems that lurk just beyond the reach of our development and testing effort?
  • Is the development team really Agile, in the sense of the precepts of the Agile Manifesto? Or is “agility” being used in a cargo-cult way, using practices or artifacts to mask over an incoherent project?

Testers often feel that their role is to find, investigate, and report on bugs in a running software product. That’s usually true, but it’s also a pretty limited view of what testers could test. A product can be anything that someone has produced: a program, a requirements document, a diagram, a specification, a flowchart, a prototype, a development process model, a development process, an idea. Testing can reveal information about all of those things, if we pay attention.

When seen one way, the problems that appear at the top of this article look like serious problems for testing. They may be, but they’re more than that too. When we remember Jerry’s definition of testing as “gathering information with the intention of informing a decision”, then everything that we notice or discover during testing is a test result.

Here’s a follow-up to this post. (See also this discussion for an example of looking beyond the test result for possible product and project risks.)

This post was edited in small ways, for clarity, on 2017-03-11.

Can You Test a Clock in a Sealed Box?

Friday, September 2nd, 2011

A while ago, James Bach and I did a transpection session. The object of the conversation was to think critically about the common trope that every test consists of at least an input and an expected result. We wanted to go deeper than that, and in the process we discovered a number of useful ideas. A test can be informed by an expectation, but oracles can also be developed on the fly. Oracles can also be applied retrospectively, after the test has been “completed”, such that you never know when a test ends. James has a wonderful example of that here. We also came up with the notion of implicit and explicit inputs, and symbolic and non-symbolic inputs.
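To make the idea of a retrospectively applied oracle a little more concrete, here’s a minimal sketch in Python. It isn’t from the transpection session itself, and the names (record_observation, drift_oracle, the log file) are hypothetical: during a session, the tester simply records what the product reported and when; later, perhaps long after the session seems to be over, an oracle is applied to those recorded observations.

    import json
    from datetime import datetime, timezone

    LOG_FILE = "clock_observations.jsonl"  # hypothetical log of what the tester saw

    def record_observation(observed_time_iso):
        """During the session: note the time the product reported, and when we saw it.
        observed_time_iso is assumed to be an ISO 8601 timestamp with a UTC offset."""
        entry = {
            "recorded_at": datetime.now(timezone.utc).isoformat(),
            "observed_time": observed_time_iso,
        }
        with open(LOG_FILE, "a") as f:
            f.write(json.dumps(entry) + "\n")

    def drift_oracle(tolerance_seconds=2.0):
        """Applied after the fact: flag any observation in which the product's reported
        time differs too much from the moment at which we recorded the observation."""
        problems = []
        with open(LOG_FILE) as f:
            for line in f:
                entry = json.loads(line)
                recorded = datetime.fromisoformat(entry["recorded_at"])
                observed = datetime.fromisoformat(entry["observed_time"])
                drift = abs((recorded - observed).total_seconds())
                if drift > tolerance_seconds:
                    problems.append((entry, drift))
        return problems

The point of the sketch is only that the oracle lives apart from the run: tomorrow we could invent a quite different oracle (say, one that looks for daylight-saving anomalies in the same log) and apply it to observations we collected today, which is one sense in which you never quite know when a test ends.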

As the basis of our chat, James presented the thought experiment of testing a clock that you can’t get at. Just recently my friend Adam White pointed me to this little gem. Enjoy!

The Best Tour

Thursday, June 30th, 2011

Cem Kaner recently wrote a reply to my blog post Of Testing Tours and Dashboards. One way to address the best practice issue is to go back to the metaphor and ask “What would be the best tour of London?” That question should give rise to plenty of other questions.

  • Are you touring for your own purposes, or in support of someone else’s interests? To what degree are other people interested in what you learn on the tour? Are you working for them? Who are they? Might they be a travel agency? A cultural organization? A newspaper? A food and travel show on TV? The history department of a university? What’s your information objective? Does the client want quick, practical, or deep questions answered? What’s your budget?
  • How well do you know London already?  How much would you like to leave open the possibility of new discoveries?  What maps or books or other documentation do you have to help to guide or structure your tour?  Is updating those documents part of your purpose?
  • Is someone else guiding your tour? What’s their reputation? To what extent do you know and trust them? Are they going to allow you the opportunity and the time to follow your own lights and explore, or do they have a very strict itinerary for you to follow? What might you see—or miss—as a result?
  • Are you traveling with other people? What are they interested in? To what degree do you share your discovery and learning?
  • How would you prefer to get around? By Tube, to get around quickly? By a London Taxi (which includes some interesting information from the cabbie)? By bus, so you can see things from the top deck? On foot? By tour bus, where someone else is doing all the driving and all the guiding (that’s scripted touring)?
  • What do you need to bring with you? Notepad? Computer? Mobile phone? Still camera? Video camera? Umbrella? Sunscreen? (It’s London; you’ll probably need the umbrella.)
  • How much time do you have available?   An afternoon?  A day?  A few days? A week?  A month?
  • What are you (or your clients) interested in? Historical sites? Art galleries? Food? Museums? Architecture? Churches? Shopping? How focused do you want your tour to be? Very specialized, or a little of this and a little of that? What do you consider “in London”, and what’s outside of it?
  • How are you going to organize your time? How are you going to account for time spent in active investigation and research versus moving from place to place, breaks, and eating? How are you going to budget time to collect your findings, structure and summarize your experience, and present a report?
  • How do you want to record your tour? If you’re working for a client, what kind of report do they want? A conversation? Written descriptions? Pictures? Do they want things in a specific format?

(Note, by the way, that these questions are largely structured around the CIDTESTD guidewords in the Heuristic Test Strategy Model (Customer, Information, Developer Relations, Equipment and Tools, Schedule, Test Item, and Deliverables)—and that there are context-specific questions that we can add as we model and explore the mission space and the testing assignment.)

There is no best tour of London; each tour has its strengths and weaknesses. Reasonable people who think about it for a moment realize that the “best” tour of London is a) relative to some person; b) relative to that person’s purposes and interests; c) relative to what the person already knows; and d) relative to the amount of time available. And such a reasonable person would be able to apply that metaphor to software testing tours too.

Common Languages Ain’t So Common

Tuesday, June 28th, 2011

A friend told me about a payment system he worked on once. In the system models (and in the source code), the person sending notification of a pending payment was the payer. The person who got that notice was called the payee. That person could designate someone else—the recipient—to pick up the money. The transfer agent would credit the account of the recipient, and debit the account of the person who sent notification—the payer, who at that point in the model suddenly became known as the sender. So, to make that clear: the payer sends email to the payee, who receives it. The sender pays money to the recipient (who accepts the payment). Got that clear? It turns out there was a logical, historical reason for all this. Everything seemed okay at the beginning of the project; there was one entity named “payer” and another named “payee”. Payer A and Payee B exchanged both email and money, until someone realized that B might give someone else, C, the right to pick up the money. Needing another word for C, the development group settled on “recipient”, and then added “sender” to the model for symmetry, even though there was no real way for A to split into two roles as B had. Uh, so far.
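For what it’s worth, here’s a small, hypothetical sketch in Python of how that kind of naming tends to land in code; the class and field names are my guesses, not taken from my friend’s actual system. The same party answers to payer in the notification part of the model and to sender in the money part.

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Party:
        name: str
        account_id: str

    @dataclass
    class PaymentNotification:
        payer: Party   # A: the one who sends the email notice...
        payee: Party   # B: ...and the one who receives it

    @dataclass
    class FundsTransfer:
        sender: Party     # the same party as 'payer' above, under another name
        recipient: Party  # the payee, or whoever the payee designated to pick up the money

    def settle(notice: PaymentNotification, designated: Optional[Party] = None) -> FundsTransfer:
        """The payee may designate someone else (C) to receive the money; either way,
        the notification's 'payer' reappears here as the transfer's 'sender'."""
        return FundsTransfer(sender=notice.payer,
                             recipient=designated or notice.payee)

Whether or not the model could have been simplified, the sketch shows why a newcomer reading the code needs the history lesson: without it, there’s nothing to say that payer and sender are the same party.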

There’s a pro-certification argument that keeps coming back to the discussion like raccoons to a garage: the claim that, whatever its flaws, “at least certification training provides us with a common language for testing.” It’s bizarre enough that some people tout this rationalization; it’s even weirder that people accept it without argument. Fortunately, there’s an appropriate and accurate response: No, it doesn’t. The “common language” argument is riddled with problems, several of which on their own would be showstoppers.

  • Which certification training, specifically, gives us a common language for testing? Aren’t there several different certification tribes? Do they all speak the same language? Do they agree or disagree on the “common language”? What if we believe that certification tribes present (at best) a shallow understanding and a shallow description of the ideas that they’re describing?
  • Who is the “us” referred to in the claim? Some might argue that “us” refers to the testing “industry”, but there isn’t one. Testing is practiced in dozens of industries, each with its own contexts, problems, and jargon.
  • Maybe “us” refers to our organization, or our development shop. Yet within our own organization, which testers have attended the training? Of those, has everyone bought into the common language? Have people learned the material for practical purposes, or have they learned it simply to pass the certification exam? Who remembers it after the exam? For how long? Even if they remember it, do they always and ever after use the language that has been taught in the class?
  • While we’re at it, have the programmers attended the classes? The managers? The product owners? Have they bought in too?
  • With that last question still hanging, who within the organization decides how we’ll label things? How does the idea of a universal language for testing fit with the notion of the self-organizing team? Shouldn’t choices about domain-specific terms in domain-specific teams be up to those teams, and specific to those domains?
  • What’s the difference between naming something and knowing something? It’s easy enough to remember a label, but what’s the underlying idea? Terms of art are labels for constructs—categories, concepts, ideas, thought-stuff. What’s in and what’s out with respect to a given category or label? Does a “common language” give us a deep understanding of such things? Please, please have a look at Richard Feynman’s take on differences between naming and knowing, http://www.youtube.com/watch?v=05WS0WN7zMQ.
  • The certification scheme has representatives from over 25 different countries, and its materials must be translated into a roughly equivalent number of languages. Who translates? How good are the translations?
  • What happens when our understanding evolves? Exploratory testing, in some literature, is equated with “ad hoc” testing, or (worse) “error guessing”. In the 1990s, James Bach and Cem Kaner described exploratory testing as “simultaneous test design, test execution, and learning”. In 2006, participants in the Workshop on Heuristic and Exploratory Techniques discussed and elaborated their ideas on exploratory testing. Each contributed a piece to a definition synthesized by Cem Kaner: “Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.” That doesn’t roll off the tongue quite so quickly, but it’s a much more thorough treatment of the idea, identifying exploratory testing as an approach, a way that you do something, rather than something that you do. Exploratory work is going on all the time in any kind of complex cognitive activity, and our understanding of the work and of exploration itself evolves (as we’ve pointed out here, and here, and here, and here, and here). Just as everyday, general-purpose languages adopt new words and ideas, so do the languages that we use in our crafts, in our communities, and with our clients.

In software development, we’re always solving new problems. Those new problems may require people to work with entirely new technological or business domains, or to bridge existing domains with new interactions and new relationships. What happens when people don’t have a common language for testing, or for anything else in that kind of development process? Answer: they work it out. As Peter Galison notes in his work on trading zones, “Cultures in interaction frequently establish contact languages, systems of discourse that can vary from the most function-specific jargons, through semispecific pidgins, to full-fledged creoles rich enough to support activities as complex as poetry and metalinguistic reflection.” Each person in a development group brings elements of his or her culture along for the ride; each project community develops its own culture and its own language.

Yes, we do need common languages for testing (note the plural), but that commonality should be local, not global. Anthropology shows us that meaningful language develops organically when people gather for a common purpose in a particular context. Just as we need testing that is specific to a given context, we need terms that are that way too. So instead of focusing training on memorizing glossary entries, let’s teach testers more about the relationships between words and ideas. Let’s challenge each other to speak and to listen precisely, and to ask better questions about the language we’re using, and to be wary of how words might be fooling us. And let us, like responsible professionals and thinking human beings, develop and refine language as it suits our purposes as we interact with our colleagues—which means rejecting the idea of having canned, context-free, and deeply problematic language imposed upon us.

Follow-up, 2014-08-24: This post has been slightly edited to respond to a troubling fact: the certificationists have transmogrified into the standardisers. Here’s a petition where you can add your voice to stop this egregious rent-seeking.