Archive for the ‘Tester Skill’ Category

Worthwhile Documentation

Monday, December 19th, 2011

In the Rapid Software Testing class, we focus on ways of doing the fastest, least expensive testing that still completely fulfills the mission. That involves doing some things more quickly, and it also involves doing other things less, or less wastefully. One of the prime candidates for radical waste reduction is documentation that’s incongruent with the testing mission.

Medical device projects typically present a high degree of risk. Excellent testing helps teams and product owners to identify risks and problems in the product. The quality of testing is a function of the skill of the tester; one would not set loose an incapable tester on high-risk project. Yet some managers have told me that they commission people to write test documentation in a particular style. That style is, to me, overly elaborate and specific with respect to actions to perform and observations to make. Yet at the same time, that style is remarkably devoid of ideas about motivation or risk.

I sometimes ask managers why they use this style of instruction. They usually answer, “because we want anyone to be able to walk up to this system and test it.”

“Anyone?” I ask. “Why anyone?”

“You know how it is. If we have to test a new revision of this program a year from now, there’s a good chance that we won’t have the same testers.” (Dude. If you’re inflicting on your staff the idea of testing as writing or following instructions for an automaton, I might have an explanation for you.)

“Anyone?” I ask. “How about a cat?”

“Well, Michael, that’s silly. Cats can’t think. Cats can’t read.”

“How about my daughter? She’s seven, and she can read well enough to read that. And she could follow the steps pretty well, too.”

“We don’t hire children here!”

“Okay,” I offer. “Would you hire a completely incompetent tester who needed to be told absolutely everything, in painful detail?”

“We wouldn’t hire anyone like that.”

“Fair enough, and I’d hope not. So, why do you insist that people write instructions for them that way?

Let me be clear: when the situation calls for skilled testers, you don’t need overly specific instructions for them. On the other hand, if you don’t have skilled testers, you’ve got a problem that scripted testing won’t be able to solve.

Here’s a splendid example of a machete that we believe that managers could use to cut through jungles of waste. In a recent project that involved work with FDA-regulated medical devices, James Bach found a huge number of excruciatingly overspecified, low-value test cases aimed at “anyone”. The following two paragraphs replaced 50 pages of waste.

3.0 Test Procedures

3.1 General Testing Protocol

In the test descriptions that follow, the word “verify” is used to highlight specific items that must be checked. In addition to those items, the tester shall at all times be alert for any unexplained or erroneous behaviour of the product. The tester shall bear in mind that, regardless of any specific requirement for any specific test, there is the overarching general requirement that the product shall not pose an unacceptable risk of harm to the patient, including an unacceptable risk due to reasonably foreseeable misuse.

Test personnel requirements: The tester shall be thoroughly familiar with the Generator and Workstation Function Requirement Specifications, as well as the working principles of the devices themselves. The tester shall also know the workings of the power test jig and associated software, including how to configure and calibrate it and how to recognize it is not working correctly. The tester shall have sufficient skill in data analysis and measurement theory to make sense of statistical test results. The tester shall be sufficiently familiar with test design to complement this protocol with exploratory testing in the event that anomalies appear that require investigation. The tester shall know how to keep test records to a credible professional standard.

To me, that’s something worth writing down. Follow those instructions, and your team will save time, save work, and put the emphasis in the right places: on risk, and on meeting and mitigating that risk with skills.

Smoke Testing vs. Sanity Testing: What You Really Need to Know

Tuesday, November 8th, 2011

If you spend any time in forums in which new testers can be found, it won’t be long before someone asks “”What is the difference between smoke testing and sanity testing?”

“What is the difference between smoke testing and sanity testing?” is a unicorn question. That is, it’s a question that shouldn’t be answered except perhaps by questioning the question: Why does it matter to you? Who’s asking you? What would you do if I gave you an answer? Why should you trust my answer, rather than someone else’s? Have you looked it up on Google? What happens if people on Google disagree?

But if you persist and ask me, here’s what I will tell you:

The distinction between the smoke and sanity testing is not generally important. In fact, it’s one of the most trivial aspects of testing that I can think of, offhand. Yet it does point to something that is important.

Both smoke testing and sanity testing refer to a first-pass, shallow form of testing intended to establish whether a product or system can perform the most basic functions. Some people call such testing “smoke testing”; others call it “sanity testing”. “Smoke testing” derives from the hardware world; if you create an electronic circuit, power it up, and smoke comes out somewhere, the smoke test has failed. Sanity testing has no particular derivation that I’m aware of, other than the common dictionary definition of the word. Does the product behave in some crazy fashion? If so, it has failed the sanity test.

Do you see the similarity between these two forms of testing? Can you make a meaningful distinction between them? Maybe someone can. If so, let them make it. If you’re talking to some person, and that person want to make a big deal about the distinction, go with it. Some organizations make a distinction between the smoke and sanity testing; some don’t. If it seems important in your workplace, then ask in your workplace, and adapt your thinking accordingly while you’re there. If it’s important that you provide a “correct” answer on someone’s idiotic certification exam, give them the answer they want according to their “body of knowledge”. Otherwise, it’s not important. Don’t worry about it.

Here’s what is important: wherever you find yourself in your testing career, people will use language that has evolved as part of the culture of that organization. Some consultancies or certification mills or standards bodies claim the goal of providing “a common worldwide standard language for testing”. This is as fruitless and as pointless a goal as a common worldwide standard language for humanity. Throughout all of human history history, people have developed different languages to address things that were important in their cultures and societies and environments. Those languages continue to develop as change happens. This is not a bad thing. This is a good thing.

There is no term in testing of which I am aware whose meaning is universally understood and accepted. There’s nothing either wrong or unusual about that. It’s largely true outside the testing world too. Pick an English word at random, and odds are you’ll find multiple meanings for it. Examples:

  • Pick (choose, plectrum for a guitar)

  • English (a language, spin on a billiard ball)

  • word (a unit of speech, a 32-bit value)

  • random (without a definite path, of equal probability)

  • odds (probability, numbers not divisible by two)

  • multiple (more than one, divisible by)

  • meaning (interpretation, significance)

Never mind the shades and nuances of interpretation within each meaning of each word! And notice that “never mind”, in this context, is being used ironically. Here, “never mind” doesn’t mean “forget” or “ignore”; here, it really means the opposite: “also pay attention to”!

Not only is there no universally accepted term for anything, there’s no universally accepted authority that could authoritatively declare or enforce a given meaning for all time. (Some might point to law, claiming that there are specific terms which have solid interpretations. If that were true, we wouldn’t need courts or lawyers.)

If you find yourself in conversation (or in an interview) with someone who asks you “Do you do X?”, and you’re not sure what X is by their definition, a smart and pragmatic reply starts with, “I may do X, but not necessarily by that name.” After that,

  • You can offer to describe your notion of X (if you have one).

  • You can describe something that you do that could be interpreted as X. That can be risky, so offer this too: “Since I don’t know what you mean by X, here’s something that I do. I think it sounds similar to X, or could be interpreted as X. But I’d like to make sure that we both recognize that we could have different interpretations of what X means.”

  • You can say, “I’d like to avoid the possibility that we might be talking at cross-purposes. If you can describe what X means to you, I can tell you about my experiences doing similar things, if I’ve done them. What does X mean to you?” Upon hearing their defintion of X, then truthfully describe your experience, or say that you haven’t done it.

If you searched online for an answer to the smoke vs. sanity question, you’d find dozens, hundreds of answers from dozens, hundreds of people. (Ironically, the very post that introduces the notion of the unicorn question includes, in the second-to-last paragraph, a description of a smoke test. Or a sanity test. Whatever.) The people who answer the smoke vs. sanity question don’t agree, and neither do their answers. Yet many, even most, of the people will seem very sure of their own answers. People will have their own firm ideas about how many angels can fit on the head of a pin, too. However, there is no “correct” definition for either term outside of a specific context, since there is no authority that is univerally accepted. If someone claimed to be a universally accepted authority, I’d reject the claim, which would put an instant end to the claim of universal acceptance.

With the possibile exception of the skills of memorization, there is no testing skill involved in memorizing someone’s term for something. Terms and their meanings are slippery, indistinct, controversial, and context-dependent. The real testing skill is in learning to deal with the risk of ambiguity and miscommunication, and the power of expressing ourselves in many ways.

Testing: Difficult or Time-Consuming?

Thursday, September 29th, 2011

In my recent blog post, Testing Problems Are Test Results, I noted a question that we might ask about people’s perceptions of testing itself:

Does someone perceive testing to be difficult or time-consuming? Who? What’s the basis for that perception? What assumptions underlie it?

The answer to that question may provide important clues to the way people think about testing, which in turn influences the cost and value of testing.

As an example, an pseudonymous person (“PM Hut”) who is evidently associated with project management in some sense (s/he provides the URL http://www.pmhut.com) answered my questions above.

Just to answer your question “Does someone perceive testing to be difficult or time-consuming?” Yes, everyone, I can’t think of a single team member I have managed who doesn’t think that testing is time consuming, and they’d rather do something else.

This, alas, isn’t an unusual response. To someone like me who offers help in increasing the value and reducing the cost of testing, it triggers some questions that might prompt reframes or further questions.

  • What do the team members think testing is? Do they think that it’s something ancillary to the project, rather than an essential and integrated aspect of software development? To me, testing is about gathering information and raising awareness that’s essential for identifying product risks and steering the project. That’s incredibly important and valuable.

    So when the team members are driving a car, do they perceive looking out the windshield to be difficult or time-consuming? Do they perceive looking at the dashboard to be difficult or time-consuming? If so, why? What are the differences between the way they obtain awareness when they’re driving a car, versus the way they obtain awareness when they’re contributing to the development of a product or service?

  • Do the team members think testing is the mindless repetition of actions and observation of specific outputs, as prescribed by someone else? If so, I’d agree with them that testing is an unpalatable activity—except I don’t call that testing. I call it checking, and I’d rather let a machine do it. I’d also ask if checking is being done automatically by the programmers at lower levels where it tends to be fast, cheap, easy, useful and timely—or manually at higher levels, where it tends to be slower, more expensive, more difficult, less useful, and less timely—and tedious?
  • Is testing focused mostly on confirmation of things that we already know or hope to be true? Is it mostly focused on the functional aspects of the program (which are amenable to checking)? People tend to find this dull and tedious, and rightly so. Or is testing an active search for new information, problems, and risks? Does it include focus on parafunctional aspects of the product—the things that provide important perceptions of real value to real people? Are the testers given the freedom and responsibility to manage a good deal of their own investigation? Testers tend to find this kind of approach a lot more engaging and a lot more interesting, and the results are typically more wide-ranging, informative, and valuable to programmers and managers.
  • Is testing overburdened by meaningless and valueless paperwork, bureaucracy, and administrivia? How did that come to pass? Are team members aware that there are simple, lightweight, rapid, and highly effective ways of planning, recording, and reporting testing work and project status?
  • Are there political issues? Are testers (or people acting temporarily in a testing role) routinely blown off (as in this example)? Are the nuggets of information revealed by testing habitually dismissed? Is that because testing is revealing trivial information? If so, is there a problem with specific testing skills like modeling the test space, determining coverage, determining oracles, recording, or reporting?
  • Have people been trained on the basis of testing as a skilled, sophisticated thinking art? Or is testing something for which capability can be assessed by a trivial, 40-question multiple choice exam?
  • If testing is being done well (which given people’s attitudes expressed above would be a surprise), are programmers or managers afraid of having to deal with the information that testing reveals? Does that lead to recrimination and conflict?
  • If there’s a perception that testing is by its nature dull and slow, are the testers aware of the quick testing approaches in our Rapid Software Testing class (PDF, page 97-99) , in the Black Box Software Testing course offered by the Association for Software Testing, or in James Whittaker’s How to Break Software? Has anyone read and absorbed Lessons Learned in Software Testing?
  • If there’s a perception that technical reviews are slow, have the testers, programmers, or managers read Perfect Software and Other Illusions About Testing? Do they recognize the ways in which careful observation provides us with “instant reviews” (see Perfect Software, page 143)? Has anyone on the team read any other of Jerry Weinberg’s books on software management and measurement?
  • Have the testers, programmers, and managers recognized the extent to which exploratory testing is going on all the time? Do they recognize that issues revealed by testing might be even more important than bugs? Do they understand that every test result and every testing problem points to meta-information that can be extremely valuable in managing the project?

On PM Hut’s own Web site, there’s an article entitled “Why Project Managers Fail“. The author, Jim Benson, lists five common problems, each of which could be quickly revealed by looking at testing as a source of information, rather than by simply going through the motions. Take it from the former program manager of a product that, in its day, was the best-selling piece of commercial software in the world: testers, testing, and the information they reveal are a project manager’s best friends and most valuable assets—when you have the awareness to recognize them.

Testing need not be difficult, tedious or time-consuming. A perception that it is so, or that it must be so, suggests a problem with testing as practised or testing as perceived. Astute managers and teams will investigate that important and largely mistaken perception.

At Least Three Good Reasons for Testers to Learn to Program

Tuesday, September 20th, 2011

There is a common claim, especially in the Agile community, that suggests that all testers should be able to write programs. I don’t think that’s the case. In the Rapid Software Testing class, James Bach and I say that testing is “questioning a product in order to evaluate it”. That’s the short form of my personal definition of testing, “investigation of people, software, computers, and related goods and services, and the relationships between them”. Most people who use computer programs are not computer programmers, and there are many ways in which a tester can question and investigate a product without programming.

Yet there are at least three very good reasons why it might be a good idea for a tester to learn to program.

Tooling. Computer programs extend our capacity to sense what’s happening, to make decisions, and to perform actions. There are many wonderful packaged tools, in dozens of categories, available to us testers. Every tool offers some level of control through a variety of affordances. Some provide restrictive controls, like text fields or drop-down lists or check boxes. Other tools provide macros so that we can string together sequences of actions. Some tools come with full-blown scripting languages that provide the capacity for the tester to sense, decide, and act through the tool in very flexible and specific ways. There are also, of course, general-purpose programming and scripting languages in their own right. When we can use programming concepts and programming languages, we have a more powerful and adaptable tool set for our investigations.

Insight. When we learn to program, we develop understanding about the elements and construction of programs and the computers on which they run. We learn how data is represented inside the computer, and how bits can be interpreted and misinterpreted. We learn about flow control, decision points, looping, and branching—and how mistakes can be made. We might even be able to read the source code of the programs that we’re testing, which can be valuable in review, troubleshooting, or debugging. Even if we never see our programs’ source code, when we learn about how programs work, we gain insight into how they might not work.

Humility. Dan Spear, a great programmer at Quarterdeck and a good friend, once pointed out to me that programming a computer is one of the most humbling experiences available to us. When you program a computer to perform some task, it will reflect—often immediately and often dramatically—any imprecision in your instructions and your underlying ideas. When we learn to program, we get insight not only into how a program works, but also into how difficult programming can be. This should trigger respect for programmers, but it should also trigger something else: empathy, which—as Jerry Weinberg says—is the first and most important aspect of the emotional set and setting for the tester.

Testing Problems Are Test Results

Tuesday, September 6th, 2011

I often do an exercise in the Rapid Software Testing class in which I ask people to catalog things that, for them, make testing harder or slower. Their lists fit a pattern I hear over and over from testers (you can see an example of the pattern in this recent question on Stack Exchange). Typical points include:

  • I’m a tester working alone with several programmers (or one of a handful of testers working with many programmers).
  • I’m under enormous time pressure. Builds are coming in continuously, and we’re organized on one- or two-week development cycles.
  • The product(s) I’m testing is (are) very complex.
  • There are many interdependencies between modules within the product, or between products.
  • I’m seeing a consistent pattern of failures specifically related to those interdependencies; the tiniest change here can have devastating impact there—or anywhere.
  • I believe that I have to run a complete regression test on every build to try to detect those failures.
  • I’m trying to cope by using automated checks, but the complexity makes the automation difficult, the program’s testing hooks are minimal at best, and frequent product changes make the whole relationship brittle.
  • The maintenance effort for the test automation is significant, at a cost to other testing I’d like to do.
  • I’m feeling overwhelmed by all this, but I’m trying to cope.

On top of that,

  • The organization in which I’m working calls itself Agile.
  • Other than the two-week iterations, we’re actually using at most two other practices associated with Agile development, (typically) daily scrums or Kanban boards.

Oh, and for extra points,

  • The builds that I’m getting are very unstable. The system falls over under the most basic of smoke tests. I have to do a lot of waiting or reconfiguring or both before I can even get started on the other stuff.

How might we consider these observations?

We could choose to interpret them as problems for testing, but we could think of them differently: as test results.

Test results don’t tell us whether something is good or bad, but they may inform a decision or an evaluation or more questions. People observe test results and decide whether there are problems and what the problems are, what further questions are warranted, and what decisions should be made. Doing that requires human judgement and wisdom, consideration of lots of factors, and a number of possible interpretations.

Just as for automated checks and other test results, it’s important to consider a variety of explanations and interpretations for testing meta-results—observations about testing—lest we miss an important problem. As Jerry Weinberg points out in Perfect Software and Other Illusions About Testing, whatever else something might be, it’s information. If testing is, as Jerry says, gathering information with the intention of informing a decision, it seems a mistake to leave potentially valuable observations lying around on the floor. Indeed, rather than thinking of them as problems for testing, we could choose to think of them as symptoms of product or project problems—problems that testing can help to solve.

For example, when a tester feels outnumbered by programmers, or when a tester feels under time pressure, that’s a test result. The feeling often comes from the programmers generating more work and more complexity than the tester can handle. Yet complexity, like quality, is a relationship between some person and something else. Complexity on its own isn’t necessarily a problem; it’s how people deal with it and its attendant risks that’s a problem. When we observe the ways in which people react to a perception of complexity, we might learn a lot.

  • Are people conscious of the risks—especially the Black Swans—that typically accompany complexity?
  • If people are conscious of risk, are they paying attention to it? Are they panicking over it? Or are they ignoring it and whistling past the graveyard? Or…
  • Are people reacting calmly and pragmatically? Are they acknowledging and dealing with the complexity of the product? If they can’t make the product or the process that it models less complex, are they at least taking steps to make understanding of the product more tractable?
  • Might the programmers be generating or modifying code so quickly that they’re not taking the time to understand what’s really going on with it?
  • If someone feels that more testers are needed, what’s behind that feeling? (I took a stab at an answer to that question a few years back.)

How might we figure that out answers to those questions? One way might be to look at more of the test results and test meta-results.

  • Does someone perceive testing to be difficult or time-consuming? Who? What’s the basis for that perception? What assumptions underlie it?
  • Does the need to investigate and report bugs overwhelm the testers’ capacity to obtain good test coverage? (I wrote about that problem here.)
  • Does testing consistently reveal consistent patterns of failure?
  • Are programmers consistently surprised by such failures and patterns?
  • Do small changes in the code cause problems that are disproportionately large or hard to find?
  • Do the programmers understand the interdependencies clearly? Are those interdependencies necessary, or could they be eliminated?
  • Are programmers taking steps to anticipate or prevent problems related to interfaces and interactions?
  • If automated checks are difficult to develop and maintain, does that say something about the skill of the tester, the quality of the automation interfaces, or the scope of checks? Or about something else?
  • Are unstable builds a problem that get in the way of deeper testing? Or could we interpret them as a sign that the product has problems so numerous and serious that even shallow testing reveals them?
  • When a “stable” build appears after a long series of unstable builds, how stable is it really?

Perhaps, with the answers to those questions, we could raise even more questions.

  • What risks do those problems present for the success of the product, whether in the short term or the longer term?
  • When testing consistently reveals patterns of failures and attendant risk, what does the product team do with that information?
  • Are the programmers mandated to deliver code? Or are the programmers mandated to deliver code with a warrant that the code does what it should (and doesn’t do what it shouldn’t), to the best of their knowledge? Do the programmers adamantly prefer the latter mandate?
  • Is someone pressuring the programmers to make schedule or scope commitments that they can’t really fulfill?
  • Are the programmers and the testers empowered to push back on scope or schedule pressue when it adds to product or project risk?
  • Do the business people listen to the development team’s concerns? Are they aware of the risks that testers and programmers bring to their attention? When the development team points out risks, do managers and business people deal with them congruently?
  • Is the team working at a sustainable pace, or might we expect the product and the project to become overwhelmed by complexity, interdependencies, fragility, and problems that lurk just beyond the reach of our development and testing effort?
  • Is the development team really Agile, in the sense of the precepts of the Agile Manifesto? Or is “agility” being used in a cargo-cult way, using practices or artifacts to mask over an incoherent project?

Testers often feel that their role is to find, investigate, and report on bugs in the product. That’s usually true, but it’s also a pretty limited view of the kinds of information that testing reveals. When seen one way, the problems I’ve listed above sound like serious problems for testing. What if we also remembered Jerry’s definition of testing as “gathering information with the intention of informing a decision”? If that’s the case, then everything that we notice or discover during testing is a test result.

(See also this discussion for an example of looking beyond the test result for possible product and project risks.)

Can You Test a Clock in a Sealed Box?

Friday, September 2nd, 2011

A while ago, James Bach and I did a transpection session. The object of the conversation was to think critically about the common trope that every test consists of at least an input and an expected result. We wanted to go deeper than that, and in the process we discovered a number of useful ideas. A test can be informed by an expectation, but oracles can also be developed on the fly. Oracles can also be applied retrospectively, after the test has been “completed”, such that you never know when a test ends. James has a wonderful example of that here. We also came up with the notion of implicit and explicit inputs, and symbolic and non-symbolic inputs.

As the basis of our chat, James presented the thought experiment of testing a clock that you can’t get at. Just recently my friend Adam White pointed me to this little gem. Enjoy!

Common Languages Ain’t So Common

Tuesday, June 28th, 2011

A friend told me about a payment system he worked on once. In the system models (and in the source code), the person sending notification of a pending payment was the payer. The person who got that notice was called the payee. That person could designate somone else—the recipient—to pick up the money. The transfer agent would credit the account of the recipient, and debit the account of the person who sent notification—the payer, who at that point in the model suddenly became known as the sender. So, to make that clear: the payee sends email to the payer, who receives it. The sender pays money to the recipient (who accepts the payment.) Got that clear? It turns out there was a logical, historical reason for all this. Everything seemed okay at the beginning of the project; there was one entity named “payer” and another named “payee”. Payer A and Payee B exchanged both email and money, until someone realized that B might give someone else, C, the right to pick up the money. Needing another word for C, the development group settled on “recipient”, and then added “sender” to the model for symmetry, even though there was no real way for A to split into two roles as B had. Uh, so far.

There’s a pro-certification argument that keeps coming back to the discussion like raccoons to a garage: the claim that, whatever its flaws, “at least certification training provides us with a common language for testing.” It’s bizarre enough that some people tout this rationalization; it’s even weirder that people accept it without argument. Fortunately, there’s an appropriate and accurate response: No, it doesn’t. The “common language” argument is riddled with problems, several of which on their own would be showstoppers.

  • Which certification training, specifically, gives us a common language for testing? Aren’t there several different certification tribes? Do they all speak the same language? Do they agree, or disagree on the “common language”? What if we believe certification tribes present (at best) a shallow understanding and a shallow description of the ideas that they’re describing?
  • Who is the “us” referred to in the claim? Some might argue that “us” refers to the testing “industry”, but there isn’t one. Testing is practiced in dozens of industries, each with its own contexts, problems, and jargon.
  • Maybe “us” refers to our organization, or our development shop. Yet within our own organization, which testers have attended the training? Of those, has everyone bought into the common language? Have people learned the material for practical purposes, or have they learned it simply to pass the certification exam? Who remembers it after the exam? For how long? Even if they remember it, do they always and everafter use the language that has been taught in the class?
  • While we’re at it, have the programmers attended the classes? The managers? The product owners? Have they bought in too?
  • With that last question still hanging, who within the organization decides how we’ll label things? How does the idea of a universal language for testing fit with the notion of the self-organizing team? Shouldn’t choices about domain-specific terms in domain-specific teams be up to those teams, and specific to those domains?
  • What’s the difference between naming something and knowing something? It’s easy enough to remember a label, but what’s the underlying idea? Terms of art are labels for constructs—categories, concepts, ideas, thought-stuff. What’s in and what’s out with respect to a given category or label? Does a “common language” give us a deep understanding of such things? Please, please have a look at Richard Feynman’s take on differences between naming and knowing, http://www.youtube.com/watch?v=05WS0WN7zMQ.
  • The certification scheme has representatives from over 25 different countries, and must be translated into a roughly equivalent number of languages. Who translates? How good are the translations?
  • What happens when our understanding evolves? Exploratory testing, in some literature, is equated with “ad hoc” testing, or (worse) “error guessing”. In the 1990s, James Bach and Cem Kaner described exploratory testing as “simultaneous test design, test execution, and learning”. In 2006, participants in the Workshop on Heuristic and Exploratory Techniques discussed and elaborated their ideas on exploratory testing. Each contributed a piece to a definition synthesized by Cem Kaner: “Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.” That doesn’t roll off the tonque quite so quickly, but it’s a much more thorough treatment of the idea, identifying exploratory testing as an approach, a way that you do something, rather than something that you do. Exploratory work is going on all the time in any kind of complex cognitive activity, and our understanding of the work and of exploration itself evolves (as we’ve pointed out here, and here, and here, and here, and here.). Just as everyday, general-purpose languages adopt new words and ideas, so do the languages that we use in our crafts, in our communities, and with our clients.

In software development, we’re alway solving new problems. Those new problems may involve people to work with entirely new technological or business domains, or to bridge existing domains with new interactions and new relationships. What happens when people don’t have a common language for testing, or for anything else in that kind of development process? Answer: they work it out. As Peter Galison notes in his work on trading zones, “Cultures in interaction frequently establish contact languages, systems of discourse that can vary from the most function-specific jargons, through semispecific pidgins, to full-fledged creoles rich enough to support activities as complex as poetry and metalinguistic reflection.”  Each person in a development group brings elements of his or her culture along for the ride; each project community develops its own culture and its own language.

Yes, we do need common language for testing. Anthropology shows us that meaningful language develops organically when people gather for a common purpose in a particular context. Just as we need tests that are specific to a given context, we need terms that are that way too. So instead of focusing training on memorizing glossary entries, let’s teach testers more about the relationships between words and ideas. Let’s challenge each other to ask better questions about the language we’re using, and how it might be fooling us.

Exploratory Testing is All Around You

Monday, May 16th, 2011

I regularly converse with people who say they want to introduce exploratory testing in their organization. They say that up until now, they’ve only used a scripted approach.

I reply that exploratory testing is already going on all the time at your organization.  It’s just that no one notices, perhaps because they call it

  • “review”, or
  • “designing scripts”, or
  • “getting ready to test”, or
  • “investigating a bug”, or
  • “working around a problem in the script”, or
  • “retesting around the bug fix”, or
  • “going off the script, just for a moment”, or
  • “realizing the significance of what a programmer said in the hallway, and trying it out on the system”, or
  • “pausing for a second to look something up”, or
  • “test-driven development”, or
  • “Hey, watch this!”, or
  • “I’m learning how to use the product”, or
  • “I’m shaking out it a bit”, or
  • “Wait, let’s do this test first instead of that test”, or
  • “Hey, I wonder what would happen if…”, or
  • “Is that really the right phone number?”, or
  • “Bag it, let’s just play around for a while”, or
  • “How come what the script says and what the programmer says and what the spec says are all different from each other?”, or
  • “Geez, this feature is too broken to make further testing worthwhile; I’m going to go to talk to the programmer”, or
  • “I’m training that new tester in how to use this product”, or
  • “You know, we could automate that; let’s try to write a quickie Perl script right now”, or
  • “Sure, I can test that…just gimme a sec”, or
  • “Wow… that looks like it could be a problem; I think I’ll write a quick note about that to remind me to talk to my test lead”, or
  • “Jimmy, I’m confused… could you help me interpret what’s going on on this screen?”, or
  • “Why are we always using ‘tester’ as the login account? Let’s try ‘tester2′ today”, or
  • “Hey, I could cancel this dialog and bring it up again and cancel it again and bring it up again”, or
  • “Cool! The return value for each call in this library is the round-trip transaction time—and look at these four transactions that took thirty times longer than average!”, or
  • “Holy frijoles! It blew up! I wonder if I can make it blow up even worse!”, or
  • “Let’s install this and see how it works”, or
  • “Weird… that’s not what the Help file says”, or
  • “That could be a cool tool; I’m going to try it when I get home”, or
  • “I’m sitting with a new tester, helping her to learn the product”.

Now it’s possible that none of that stuff ever happens in your organization. Or maybe people aren’t paying attention or don’t know how to observe testing. Or both.

Then, just before I posted this blog post, James Bach offered me two more sure-fire clues that people are doing exploratory testing. If they say, “I am in no way doing exploratory testing”, or “We’re doing only highly rigorous formal testing”. In both cases, the emphatic nature of the claim guarantees that the claimant is not sufficiently observant about testing to realize that exploratory testing is happening all around them.

Questioning Test Cases, Part 2: Testers Learn, But Test Cases Don’t Teach

Wednesday, April 6th, 2011

In the last post, my LinkedIn correspondent provided a couple of reasons why she liked writing test cases, and why she thought it necessary to write them down in the early stages of the project. Then she gave a third reason:

When I’m on a project and I am the only one who knows how to test something, then I can’t move on to something new. I’d still be testing Cut/Copy/Paste if I were not able to pass that script on easily to a new recruit.

I often hear this kind of justification for test cases: “We need test cases so that testers can learn quickly how to test the product.” To me, that’s like saying that we need driving cases so that people can learn quickly how to drive in a new city. In a new city, a map or a general set of ideas might be helpful. Driving cases won’t, because to drive through a new city well, you have to know how to drive. In the same way, test cases are unlikely to be anything more than negligbly helpful. In order to learn quickly how to test a new program well, you have to know how to test.

It also seemed to me that she considers that writing to be the One True Way to impart information, and that reading to be the One True Way to receive it. If I’m wrong about that, it’s because I’ve misinterpreted something she’s written. Fancy that!

So for the sake of discussion, let’s say I were to to hire (say) an older domain expert, making a career change, who had no computer experience, and let’s use Cut/Copy/Paste as an example. With the new recruit, I’d do what I’ve done many times in coaching sessions. I’d sit with her down with a copy of Microsoft Word, and open up a document that had some text and some images in it. Then I’d say something like this:

“We’re going to learn how to move text around, cutting and pasting and making copies of it so we can save time on retyping stuff that the machine can do for us. Okay. Put your cursor there, and press Shift-RightArrow there. Good. See what happened?”

(There would be lots of pauses. I’d pause after each question to give the trainee time to answer; frequently to let the trainee experiment; and often to let the trainee ask questions of her own. I won’t do that explicitly here; I’ll let you put most of the pauses in yourself, as you read. For sure, assume that there’s one at least after every question, and usually after every sentence.)

“So try doing that a few more times. What happened? So you’ve highlighted the text; people call that marking or selecting the text. What happens when you add the Ctrl key, and press Ctrl-Alt-RightArrow at the same time? Right: it selects a word at a time. Now, try pressing Ctrl-X.

“Whoa, what happened just there?! Right: pressing Ctrl-X that makes the selected text go away; you can see that. What you might not know is that the text goes onto the Windows Clipboard, a kind of temporary storage area for all kinds of stuff. So you haven’t erased it; it’s more like you’ve cut it out, and the computer still has a copy of it stashed somewhere. Now, put your cursor here, and hit Ctrl-V. What did that do? Yeah; it pasted the text you cut before. That is, it retrieved the text from the clipboard, and made a copy right where you pressed Ctrl-V. So, Ctrl-X cuts, and Ctrl-V pastes a copy of the thing you cut. Try that again. What happened? Right; another copy. So it’s a little different from regular cutting and pasting, in that you can make many copies of the same thing. When might that come in handy? Good idea! Yep, that’s a good idea too. Try it a few more times, in other places. What does Ctrl-V do? What does Ctrl-X do?

“Try selecting something else, and pressing Ctrl-X again. What happens when you paste now? Can you get the old one back? Well, there might be special add-ons that allow you to keep copies of lots of things you pasted, but normally, you can only paste the last thing you put on the Clipboard.

“How are you going to remember that Ctrl-X means “cut”? That’s a great idea! Myself, I think of the X as a pair of open scissors, ready to cut—but your idea is great, and better than that, it’s your own, so you’ll remember it. How are you going to remember Ctrl-V as paste? Yeah, it’s tough to come up with a way to remember that, isn’t it? I think of it like the tip of one of those old glue bottles from grade school. On the other hand, what’s in between X and V on the keyboard? Now, go select some text again. Remember how? Good. This time, try pressing Ctrl-C, instead of Ctrl-X. Right; nothing seems to happen. And yet… move your cursor here, and try pasting. Yeah; it’s a copy, even though the original stays where it was. So Ctrl-C is Copy–C for copy, which makes some kind of sense, for a change—right there in between cut and paste. And there’s another way to remember; try pressing Alt-E. E stands for Edit, that’s right. What do you see listed on the menu, there?

“Now… there’s some documentation for these cut-and-paste features; it’s in the Windows documentation. Press F1. Great. Now look for “Keyboard Shortcuts”. There’s a list. How many different ways can you find of marking, cutting, and pasting text? Make me a list, and I’ll be back in ten minutes. Be ready to show me your report, and to demonstrate the ones you’ve discovered. Oh—and one more thing: create a list of at least five circumstances in which you might use this stuff, and we’ll talk about that too.”

It’s taken quite a while to type this example. It would take considerably less time to do it. Moreover, the trainee would learn more quickly and more deeply, because she’d be interacting with the product herself, experiencing the actions in her body, literally incorporating them. The questions, the answers, and the interactions make the learning more sticky. The practice makes it more sticky still. I try to encourage people to create mnemonics to help remember. While conversing with the trainee, I’d also be observing her. She’ll be making plenty of mistakes; people do that when they’re learning. In fact, people should be encouraged to do that when they’re learning. (Brian Marick’s book Everyday Scripting in Ruby encourages deliberate mistakes, which is one of the reasons I like it so much.) When I see mistakes, I’ll give her a chance to feel the mistake, puzzle out a way around it, and then—if she needs it—help her get out of it. (“Let’s figure out how to fix that. Hit Alt-E again. What’s listed on the Edit menu? So what’s the keyboard shortcut for Undo?” “Oops, you pressed Ctrl-Z one time too many… What does Ctrl-Y do?”). After a while, I won’t tell her how to do everything right away. I’ll start asking her to figure out some things on her own, and only help out when she’s stuck. That way she’ll be learning it for herself, which is more challenging but more affecting. I’ll keep trying to raise the bar, giving her more challenging assignments, increasing her independence while also increasing her responsibility to report. When I see her make a mistake, I might correct her right away, or I might let her continue with the mistake for a bit until she encountered some kind of minor consequence. That makes the learning even more sticky. Finally, I’d cut the conversation short if she told me that she already knew how to cut and paste—but I’d get her to show me that she knew it, and I’d give her some more exercises so that I could observe her.

The trouble that I see with “passing on a script to a new recruit” is that most test scripts I’ve seen are very long on what to do or how to do it, and correspondingly short on why to do it and how to generalize it. That is, they’re short on motivation, and they’re short on risk, and they rarely include a mission to learn. In our teaching, when we’re working with test groups, and when we’re coaching individual testers, James and I put less emphasis on techniques. Instead, we focus on making sure that people have the skills to know what to do when they receive a testing mission, whether that mission is written down in detail or not.

If there are specific things that I want a tester to look for, I’ll provide specific instructions (either written or spoken), but mostly I want to set out on their own to discover things. I want to encourage testers to vary their work and their observations. Variation makes it more that they will find problems that would go off hiding, were we to focus on the scripts. As a bonus, giving testers a concise, open-ended task keeps writing to a minimum, which saves on development cost, maintenance cost, and opportunity cost.

We’ll finish up this series in the next post.

Questioning Test Cases, Part 1

Monday, April 4th, 2011

Over the years, LinkedIn seems to have replaced comp.software.testing as the prime repository for wooly thinking and poorly conceived questions about testing.

Recently I was involved in a conversation with someone who, at least, seemed to be more articulate than most of the people on LinkedIn. Alas, I’ve since lost the thread, and after some searching I’ve been unable to find it. No matter: the points of the discussion are, to me, are still worth addressing. She was asking a question about how much time to allocate to writing test cases before starting testing, and I questioned the usefulness of doing that. The first part of her reply went like this:

I would like to point out that I doubt anyone wants to write something that they don’t need to write.

I agree that most people probably don’t want to write things they don’t need to write. But they often feel compelled, or are compelled to write things they don’t need to write.

I find value in writing test cases for a number of reason. One is that I train more junior engineers in testing and it is a good method to have them execute tests that I have written so they learn how a good test plan is put together.

If that were so, wouldn’t your junior engineers learn even more from writing test cases themselves, and getting feedback on their design and their writing? There’s a feedback loop in the design of a test, the execution of a test, the interpretation of a test result, and the learning that happens between them; wouldn’t it be a good idea to keep the feedback loop—and the learning—as rapid as possible? Wouldn’t your junior engineers learn still more from actually testing—under your close supervision, at first, and then with the freedom and responsibility to act more independently as they gain skill? You might want to have a look at this article: http://www.developsense.com/articles/2008-06-KnowWhereYourWheelsAre.pdf.

There’s a common misconception that testing happens in the characters of a written test case. It doesn’t. Testing happens in the mind and actions of the tester. It happens in the design of the test, in the execution of the test, in the observation and interpretation of the outcome of the test. Testing happens in the discovery of problems, in the investigation of those problems, and the learning about those problems and the product. At most, a fraction of this can be written down.

A test is far less something you execute, and far more a line of inquiry that you follow. To me, a good test case is idea-stuff; it’s a question that we want to ask of the program, based on some motivating idea about discovery or risk. In my observation, in writing test cases, people generally write down the least important stuff. They appear to be trying to program the tester’s actions, rather trying than to prime the tester’s thinking and observation.

Moreover, a test plan something quite different from a pile of test cases.

Secondly, [writing test cases] communicates the testing coverage with everyone involved in developing the software. If you are a contractor, this is very important since you want to leave with the client feeling like you did your job and they have the documentation to prove that they have done due diligence if they shop their company around or look for for VC money.

Were you, as a contractor, given the mission to produce test scripts specifically, or is that a mission that you have inferred? Bear in mind, I’ve been witness to many takeovers, as a program manager at a company where our senior managers were ambitiously acquiring technologies, products and companies, and as a consultant to several companies that were taken over by larger companies. In no case did anyone ever ask to see any test case documentation. Those who are investigating the company to be acquired typically don’t go to that level of detail. In my experience, alas, due diligence largely doesn’t happen at all. I’m puzzled, too, by the appeal to the least likely instances in which people might interact with test documention, rather than the everyday.

Meanwhile, there are many ways to communicate test coverage. (For example, see here, here, and here.) There are also many ways to fool yourself (and others) into believing that the more documentation the more coverage, or the more specific the documentation the more coverage—especially when that documentation is prospective, rather than retrospective. In Rapid Testing, we don’t encourage people to eliminate documentation. We encourage people to reduce documentation to what is actually necessary for the mission, and to eliminate wasteful documentation.

We focus on documenting test ideas concisely; on producing coverage outlines and risk lists that can be used to guide, rather than control a tester and her line of inquiry; on producing records of the tester’s thought process, observations, risk ideas, motivations, and so forth (see below). The goal is to capture things far more important than the tester’s mechanical actions. If someone wants a record of that, we recommend video capture software (tools like BB Test Assistant or Camtasia). An automatic log allows the tester to focus on testing the product and recording the ideas, rather than splitting focus between operating the product and writing about operating the product.

You can find examples of the kind of test documentation I’m talking about here, in the appendices to the Rapid Software Testing class, starting at page 47. Note that each example has varying degrees of polish and formal structure. Sometime it’s highly informal, used to aid memory, to frame a conversation, or trigger ideas. Sometimes it’s more formal and polished, when the audience is outside the test group. The over-riding point is to fit the documentation to the mission and to the task at hand.

More to come…