Blog Posts for the ‘Documentation’ Category

Breaking the Test Case Addiction (Part 3)

Thursday, January 17th, 2019

In the previous post, “Frieda”, my coaching client, asked about producing test cases for auditors or regulators. In Rapid Software Testing (RST), we find it helpful to frame that in terms of formal testing.

Testing is formal to the degree that it must be done in a specific way, or to verify specific facts. Formal testing typically has the goal of confirming or demonstrating something in particular about the product. There’s a continuum to testing formality in RST. My version, a tiny bit different from James Bach’s, looks like this:

Some terminology notes: checking is the process of operating and observing a product; applying decision rules to those observations; and then reporting on the outcome of those rules; all mechanistically, algorithmically. A check can be turned into a formally scripted process that can be performed by a human or by a machine.
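To make that concrete, here is a minimal sketch of a check that has been handed over to a machine. The product function, the input, and the expected value are all hypothetical, invented for the example; the point is only that every element (operating, observing, applying a decision rule, and reporting) has been specified in advance, so no judgement is required at execution time.

    # A minimal, hypothetical machine check (Python). The product code and the
    # expected value are invented for illustration.

    def apply_discount(price, percent):
        """Stands in for some part of the product being checked."""
        return round(price * (1 - percent / 100.0), 2)

    def check_discount():
        observed = apply_discount(100.00, 15)   # operate and observe the product
        expected = 85.00                        # decision rule, fixed in advance
        if observed == expected:                # apply the rule mechanistically...
            return "PASS"
        return "FAIL: expected {0}, observed {1}".format(expected, observed)

    if __name__ == "__main__":
        print(check_discount())                 # ...and report the outcome

Everything interesting here (choosing the input, deciding that 85.00 is the right answer, deciding that this check is worth running at all) happened before the check ran, in the informal, human part of the work.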

Procedurally scripted test cases are instances of human checking, where the tester is being substantially guided by what the script tells her to do. Since people are not machines and don’t stick to the algorithms, people are not checking in the strictest sense of our parlance.

A human transceiver is someone doing things based only on the instructions of some other person, behaving as that person’s eyes, ears, and hands.

Machine checking is the most formal mode of testing, in that machines perform checks in entirely specific ways, according to a program, entirely focused on specific facts. The motivation to check doesn’t come from the machine, but from some person. Notice that programs are formal, but programming is an informal activity. Toolsmiths and people who develop automated checks are not following scripts themselves.

The degree to which you formalize is a choice, based on a number of context factors. Your context guides your choices, and both of those evolve over time.

One of the most important context factors is your mission. You might be in a regulated environment, where regulators and auditors will eventually want you to demonstrate specific things about the product and the project in a highly formal way. If you are in that context, keeping the auditors and the regulators happy may require certain kinds of formal testing. Nonetheless, even in that context, you must perform informal testing—lots of it—for at least two big reasons.

The first big reason is to learn about the product and its context, to prepare for excellent formal testing that will stand up to the regulators’ scrutiny. This is tied to another context factor: where you are in the life of the project and your understanding of the product.

Formal testing starts with informal work: work that is more exploratory and tacit, with the goal of learning, rather than scripted and explicit, with the goal of demonstrating something in particular. All the way along, but especially in between those poles, we’re searching for problems. No less an authority than the Food and Drug Administration emphasizes how important this is.

Thorough and complete evaluation of the device during the exploratory stage results in a better understanding of the device and how it is expected to perform. This understanding can help to confirm that the intended use of the device will be aligned with sponsor expectations. It also can help with the selection of an appropriate pivotal study design.

Section 5: The Importance of Exploratory Studies in Pivotal Study Design
Design Considerations for Pivotal Clinical Investigations for Medical Devices
Guidance for Industry, Clinical Investigators, Institutional Review Boards
and Food and Drug Administration Staff

The pivotal stage of device development, says the FDA, focuses on developing what people need to know to evaluate the safety and effectiveness of a product. The pivotal stage usually consists of one or more pivotal studies. In other words, the FDA acknowledges that development happens in loops and cycles; that development is an iterative process.

James Bach emphasized this in his talk The Dirty Secret of Formal Testing and it’s an important point in RST. Development is an iterative process because at the beginning of any cycle of work, we don’t know for sure what all the requirements are; what they mean; what we can get; and how we might decide that we’ve got it. We don’t really know that until we’ve tested the product… and we don’t know how to test the product until we’ve tried to test the product!

Just like developing automated checks, developing formally scripted test cases is an informal process. You don’t follow a script when you’re interpreting a specification; when you’re having a conversation with a developer or a designer; when you’re exploring the product and the test space to figure out where checking might be useful or important. You don’t follow a script when you recognize a new way of using tools to learn something about the product, and apply them. And you don’t follow a script when you investigate bugs that you’ve found—either during informal testing or the formal testing that might follow it.

If you try to develop formal procedural test cases without testing the actual product, they stand a good chance of being out of sync with it. The dirty secret of formal testing is that all good formal testing begins with informal testing.

It might be a very good idea for programmers to develop some automated checks that help them with the discipline of building clean code and getting rapid feedback on it. It’s also a good idea for developers, designers, testers, and business people to develop clear ideas about intentions for a product, envisioning success. It might also be a good idea to develop some automated checks above the unit level and apply them to the build process—but not too many and certainly not too early. The beginning of the work is usually a terrible time for excessive formalization.

Which brings us to the second big reason to perform informal testing continuously throughout any project: to address the risk that our formal testing to date will fail to reveal how the product might disappoint customers; lose someone’s money; blow something up; or hurt or kill people. We must be open to discovery, and to performing the testing and investigation that supports it, all the way throughout the project, because neither epiphanies nor bugs follow scripts or schedules.

The overarching mission of testing is focused on a question: “are there problems that threaten the value of the product, or the on-time, successful completion of our work?” That’s not a question that formal testing can ever answer on its own. Fixation on automated checks or test cases runs the risk of displacing time for experimentation, exploration, discovery, and learning.

Next time, we’ll look at an example of breaking test case addiction on a real medical device project. Stay tuned.

Breaking the Test Case Addiction (Part 2)

Wednesday, January 16th, 2019

Last time out, I was responding to a coaching client, a tester who was working in an organization fixated on test cases. Here, I’ll call her Frieda. She had some more questions about how to respond to her managers.

What if they want another tester to do your tests if you are not available?

“‘Your tests’, or ‘your testing’?”, I asked.

From what I’ve heard, your tests. I don’t agree with this but trying to see it from their point of view, said Frieda.

I wonder what would happen if we asked them “What happens when you want another manager to do your managing if you are not available?” Or “What happens when you want another programmer to do programming if the programmer is not available?” It seems to me that the last thing they would suggest would be a set of management cases, or programming cases. So why the fixation on test cases?

Fixation is excessive, obsessive focus on something to the exclusion of all else. Fixation on test cases displaces people’s attention from other important things: understanding of how the testing maps to the mission; whether the testers have sufficient skill to understand and perform the testing; the learning that comes from testing and feeds back into more testing; whether formalization is premature or even necessary…

A big problem, as I suggested last time, is a lack of managers’ awareness of alternatives to test cases. That lack of awareness feeds into a lack of imagination, and then loops back into a lack of awareness. What’s worse is that many testers suffer from the same problem, and therefore can’t help to break the loop. Why do managers keep asking for test cases? Because testers keep providing them. Why do testers keep providing them? Because managers keep asking for them, because testers keep providing them…, and the cycle continues.

That cycle also continues because there’s an attractive, even seductive, aspect to test cases: they can make testing appear legible. Legibility, as Venkatesh Rao puts it beautifully here, “quells the anxieties evoked by apparent chaos”.

Test cases help to make the messy, complex, volatile landscape of development and testing seem legible, readable, comprehensible, quantifiable. A test case either fails (problem!) or passes (no problem!). A test case makes the tester’s behaviours seem predictable and clear, so clear that the tester could even be replaced by a machine. At the beginning of the project, we develop 782 test cases. When we’ve completed 527 of them, the testing is 67.39% done!

Many people see testing as rote, step-by-step, repetitive, mechanical keypressing to demonstrate that the product can work. That gets emphasized by the domain we’re in: one that values the writing of programs. If you think keypressing is all there is to it, it makes a certain kind of sense to write programs for a human to follow so that you can control the testing.

Those programs become “your tests”. We would call those “your checks”—where checking is the mechanistic process of applying decision rules to observations of the software.

On the other hand, if you are willing to recognize and accept testing as a complex, cognitive investigation of products, problems, and risks, your testing is a performance. No one else can do it just as you do it. No one can do again just what you’ve done before. You yourself will never do it the same way twice. If managers want people to do “your testing” when you’re not available, it might be more practical and powerful to think of it as “performing their investigation on something you’ve been investigating”.

Investigation is structured and can be guided, but good investigation can’t be scripted. That’s because in the course of a real investigation, you can’t be sure of what you’re going to find and how you’re going to respond to it. Checking can be algorithmic; the testing that surrounds and contains checking cannot.

Investigation can be influenced or guided by plenty of things that are alternatives to test cases: product coverage outlines, risk lists, test strategy ideas, lists of useful oracles, charters, checklists, and notes or mind maps from previous sessions, among others.

Last time out, I mentioned almost all of these as things that testers could develop while learning about the product or feature. That’s not a coincidence. Testing happens in tangled loops and spirals of learning, analysis, exploration, experimentation, discovery, and investigation, all feeding back into each other. As testing proceeds, these artifacts and—more importantly—the learning they represent can be further developed, expanded, refined, overproduced, put aside, abandoned, recovered, revisited…

Testers can use artifacts of these kinds as evidence of testing that has been done, problems that have been found, and learning that has happened. Testers can include these artifacts in test reports, too.

But what if you’re in an environment where you have to produce test cases for auditors or regulators?

Good question. We’ll talk about that next time.

Breaking the Test Case Addiction (Part 1)

Tuesday, January 15th, 2019

Recently, during a coaching session, a tester was wrestling with something that was a mystery to her. She asked:

Why do some tech leaders (for example, CTOs, development managers, test managers, and test leads) jump straight to test cases when they want to provide traceability, share testing efforts with stakeholders, and share feature knowledge with testers?

I’m not sure. I fear that most of the time, fixation on test cases is simply due to ignorance. Many people literally don’t know any other way to think about testing, and have never bothered to try. Alarmingly, that seems to apply not only to leaders, but to testers, too. Much of the business of testing seems to limp along on mythology, folklore, and inertia.

Testing, as we’ve pointed out (many times), is not test cases; testing is a performance. Testing, as we’ve pointed out, is the process of learning about a product through exploration and experimentation, which includes to some degree questioning, studying, modeling, observation, inference, etc. You don’t need test cases for that.

The obsession with procedurally scripted test cases is painful to see, because a mandate to follow a script removes agency, turning the tester into a robot instead of an investigator. Overly formalized procedures run a serious risk of over-focusing testing and testers alike. As James Bach has said, “testing shouldn’t be too focused… unless you want to miss lots of bugs.”

There may be specific conditions, elements of the product, notions of quality, interactions with other products, that we’d like to examine during a test, or that might change the outcome of a test. Keeping track of these could be very important. Is a procedurally scripted test case the only way to keep track? The only way to guide the testing? The best way? A good way, even?

Let’s look at alternatives for addressing the leaders’ desires (traceability, shared knowledge of testing effort, shared feature knowledge).

Traceability. It seems to me that the usual goal of traceability is to be able to narrate and justify your testing by connecting test cases to requirements. From a positive perspective, it’s a good thing to make those connections, to help ensure that the tester isn’t wasting time on unimportant stuff.

On the other hand, testing isn’t only about confirming that the product is consistent with the requirements documents. Testing is about finding problems that matter to people. Among other things, that requires us to learn about things that the requirements documents get wrong or don’t discuss at all. If the requirements documents are incorrect or silent on a given point, “traceable” test cases won’t reveal problems reliably.

For that reason, we’ve proposed a more powerful alternative to traceability: test framing, which is the process of establishing and describing the logical connections between the outcome of the test at the bottom and the overarching mission of testing at the top.

Requirements documents and test cases may or may not appear in the chain of connections. That’s okay, as long as the tester is able to link the test with the testing mission explicitly. In a reasonable working environment, much of the time, the framing will be tacit. If you don’t believe that, pause for a moment and note how often test cases provide a set of instructions for the tester to follow, but don’t describe the motivation for the test, or the risk that informs it.

Some testers may not have sufficient skill to describe their test framing. If that’s so, giving test cases to those testers papers over that problem in an unhelpful and unsustainable way. A much better way to address the problem, I believe, would be to train and supervise the testers to be powerful, independent, reliable agents, with freedom to design their work and responsibility to negotiate it and account for it.

Sharing efforts with stakeholders. One key responsibility for a tester is to describe the testing work. Again, using procedurally scripted test cases seems to be a peculiar and limited means for describing what a tester does. The most important things that testers do happen inside their heads: modeling the product, studying it, observing it, making conjectures about it, analyzing risk, designing experiments… A collection of test cases, and an assertion that someone has completed them, don’t represent the thinking part of testing very well.

A test case doesn’t tell people much about your modeling and evaluation of risk. A suite of test cases doesn’t either, and typical test cases certainly don’t do so efficiently. A conversation, a list, an outline, a mind map, or a report would tend to be more fitting ways of talking about your risk models, or the processes by which you developed them.

Perhaps the worst aspect of using test cases to describe effort is that tests—performances of testing activity—become reified, turned into things, widgets, testburgers. Effort becomes recast in terms of counting test cases, which leads to no end of mischief.

If you want people to know what you’ve done, record and report on what you’ve done. Tell the testing story, which is not only about the status of the product, but also about how you performed the work, and what made it more or less valuable; harder or easier; slower or faster.

Sharing feature knowledge with testers. There are lots of ways for testers to learn about the product, and almost all of them would foster learning better than procedurally scripted test cases. Giving a tester a script tends to focus the tester on following the script, rather than learning about the product, how people might value it, and how value might be threatened.

If you want a tester to learn about a product (or feature) quickly, provide the tester with something to examine or interact with, and give the tester a mission. Try putting the tester in front of  

  • the product to be tested (if that’s available)
  • an old version of the product (while you’re waiting for a newer one)
  • a prototype of the product (if there is one)
  • a comparable or competitive product or feature (if there is one)
  • a specification to be analyzed (or compared with the product, if it’s available)
  • a requirements document to be studied
  • a standard to review
  • a user story to be expanded upon
  • a tutorial to walk through
  • a user manual to digest
  • a diagram to be interpreted
  • a product manager to be interviewed
  • another tester to pair with
  • a domain expert to outline a business process

Give the tester the mission to learn something based on one or more of these things. Require the tester to take notes, and then to provide some additional evidence of what he or she learned.

(What if none of the listed items is available? If none of that is available, is any development work going on at all? If so, what is guiding the developers? Hint: it won’t be development cases!)

Perhaps some people are concerned not that there’s too little information, but too much. A corresponding worry might be that the available information is inconsistent. When important information about the product is missing, or unclear, or inconsistent, that’s a test result with important information about the project. Bugs breed in those omissions or inconsistencies.

What could be used as evidence that the tester learned something? Along with the notes, the tester could

  • have a conversation with a test lead or test manager
  • provide a report on the activities the tester performed, and what the tester learned (that is, a test report)
  • produce a description of the product or feature, bugs and all (see The Honest Manual Writer Heuristic)
  • offer proposed revisions, expansions, or refinements of any of the artifacts listed above
  • identify a list of problems about the product that the tester encountered
  • develop a list of ways in which testers might identify inconsistencies between the product and something desirable (that is, a list of useful oracles)
  • report on a list of problems that the tester had in fulfilling the information mission
  • in a mind map, outline a set of ideas about how the tester might learn more about the product (that is, a test strategy)
  • list out a set of ideas about potential problems in the product (that is, a risk list)
  • develop a set of ideas about where to look for problems in the product (that is, a product coverage outline)

Then review the tester’s work. Provide feedback, coaching and mentoring. Offer praise where the tester has learned something well; course correction where the tester hasn’t. Testers will get a lot more from this interactive process than from following step-by-step instructions in a test case.

My coaching client had some more questions about test cases. We’ll get to those next time.

Oracles from the Inside Out, Part 5: Oracles as References as Media

Tuesday, September 15th, 2015

Try asking testers how they recognize problems. Many will respond that they compare the product to its specification, and when they see an inconsistency between the product and its specification, they report a bug. Others will talk about creating and running automated checks, using tools to compare output from the product to specific, pre-determined, expected results; when the product produces a result inconsistent with expectations, the check identifies a bug which the tester then reports to the developer or manager. It might be tempting to think of this as moving from the bottom right quadrant on this table to the bottom left.

Traditional talk about oracles refers almost exclusively to references. W.E. Howden, who introduced “oracle” as a term of testing art, described an oracle as “an external mechanism which can be used to check test output for correctness”. Yet thinking of oracles in terms of correctness leads to some pretty serious problems. (I’ve outlined some of them here.)

In the Rapid Software Testing namespace, we take a different, broader view of oracles. Rather than focusing on correctness, we focus on problems: an oracle is a means by which we recognize a problem when we encounter one during testing. Checking for correctness, as Howden puts it, may severely limit our capacity to notice many kinds of problems. A product or service can be correct with respect to some principle, but have plenty of problems that aren’t identified by that principle; and a product can produce incorrect results without the incorrectness representing a problem for anyone. When testers fixate on documented requirements, there’s a risk that they will restrict their attention to looking for inconsistencies with specific claims; when testers fixate on automated checks, there’s a risk that they will restrict their focus to inconsistency with a comparable algorithm. Focus your attention too narrowly on a particular oracle—or a particular class of oracle—and you can be confident of one thing: you’ll miss lots of bugs.
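As a small illustration of that broader view, here is a hypothetical sketch (the functions and figures are my own inventions, not drawn from any particular product) of a reference oracle applied in this spirit: an inconsistency with a comparable product gets flagged as a potential problem worth investigating, not as a verdict of incorrectness, and agreement between the two is not taken as evidence that all is well.

    # A reference oracle (a comparable product) used to recognize potential
    # problems. Both implementations and the sample inputs are hypothetical.

    def product_under_test_tax(amount):
        return round(amount * 0.13, 2)

    def comparable_product_tax(amount):
        # A comparable product: trusted somewhat, but not absolutely.
        return int(amount * 13 + 0.5) / 100.0

    def investigate(amounts):
        for amount in amounts:
            ours = product_under_test_tax(amount)
            theirs = comparable_product_tax(amount)
            if ours != theirs:
                # An inconsistency is a signal to investigate; either product,
                # or the comparison itself, could be the one with the problem.
                print("Possible problem at {0}: ours={1}, theirs={2}".format(
                    amount, ours, theirs))

    if __name__ == "__main__":
        investigate([0.05, 1.07, 9.99, 1234.56, 1000000.01])

Note that the comparable product is itself a medium: it extends our ability to notice one narrow kind of inconsistency, while drawing our attention away from everything it cannot express.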

Documents and tools are media. In the most general sense, “medium” is descriptive of something in between, like “small” and “large”. But “medium” as a noun, a medium, can be between lots of things. A communication medium like radio sits between performers and an audience; a psychic medium, so the claim goes, provides a bridge between a person and the spirit world; when people want to exchange things of value, they often use money as a medium for the exchange. Marshall McLuhan, an early and influential media theorist, said that a medium is anything that humans create or use to effect change. Media are tools, technologies that people use to extend, enhance, enable, accelerate, or intensify human capabilities. Extension is the most obvious and prominent effect of media. Most people think of media in terms of communications media. A medium can certainly be printed pages or television screens that enable messages to be conveyed from one person to another. McLuhan viewed the phonetic alphabet as a technology—a medium that extended the range of speech over great distances and accelerated its transmission. But a cup of coffee is a medium too; it extends alertness and wakefulness, and when consumed socially with others, it can extend conversation and friendliness. Media, placed between a product and our observation of it, extend our capacity to recognize bugs.

McLuhan emphasized that media change things in many different ways at the same time. In addition to extending or enabling or accelerating our capabilities, McLuhan said, every new medium obsolesces one or more existing media, grabbing our attention away from old things; every new medium retrieves notions of formerly obsolescent media, making old things new again. McLuhan used heat as a metaphor for the degree to which media require the involvement of the user; a “cool” medium like radio, he said, requires the listener to participate and fill in the missing pieces of the experience; a “hot” medium like a movie provides stimulation to the ear and especially the eye, requiring less engagement from the viewer. Every medium, when “overheated” (McLuhan’s term for a medium that has been stretched or extended beyond its original or intended capacity), reverses into the opposite of what it might have been originally intended to accomplish. Socrates (and the King of Egypt) recognized that writing could extend memory, but could reverse into forgetfulness (see Plato’s dialogue Phaedrus). Coffee extends alertness and conversation, but too much of it and people become too wired to work and too addled to chat. A medium always draws attention to itself to some degree; an overheated medium may dazzle us so much that we begin to ignore what it contains or what we intended it to do for us. More importantly, a medium affects us. This is one of the implications of McLuhan’s famous but oblique statement “the medium is the message”. By “message”, he means “the change of scale or pace or pattern” that a new invention or innovation “introduces into human affairs.” (This explanation comes from Mark Federman, to whom I’m indebted for explaining McLuhan’s work to me over the years.)

When we pay attention, we can easily observe media overheating both in talk about testing and development work and in the work itself. Documents and tools frequently dominate conversations. In some organizations, a problem won’t be considered a bug unless it is inconsistent with an explicit statement in a specification or requirements document. Yet documents are only partial representations, subsets, of what people claim to have known or believed at some point in time, and times change. In some places, testing work is dominated by automated checking. Checks can be very valuable, providing great precision and fast feedback. But checks may focus on functional aspects of the product, and less on other parafunctional attributes.

McLuhan’s work emphasizes that media are essentially neutral, agnostic to our purposes. It is our engagement with media that produces good or bad outcomes—indeed, good and bad outcomes. Perhaps the most important implication of McLuhan’s work is that media amplify whatever we are. If we’re fabulous testers, our tools extend our capabilities, helping us to be even more fabulous. But if we’re incompetent, tools extend our incompetence, allowing us to do bad testing faster and worse than we’ve ever been able to do it before. To the degree that we are inclined to avoid conflict and arguments, we will use documents to help us avoid conflict and arguments; to the degree that we are inclined to welcome discussion and the refinement of ideas, documents can help us do that. If we are disposed to be alert to a wide range of problems, automated checks will help us as we diversify our scope; if we are oblivious to certain kinds of problems in the product, automated checks will amplify our oblivion.

Reference oracles—documents, checking tools, representative data, comparable products—are unquestionably media, extending all of the other kinds of oracles: private and shared mental models, private and shared feelings, conversations with others, and principles of consistency. How can we evaluate them? What do we use them for? And how can we use them to help us find problems without letting them overwhelm or displace all the other ways we might have of finding problems? That’s the subject of the next post.

Braiding The Stories (Test Reporting Part 2)

Friday, February 24th, 2012

We were in the middle of a testing exercise at the Amplifying Your Effectiveness conference in 2005. I was assisting James Bach in a workshop that he was leading on testing. He presented the group with a mysterious application written by James Lyndsay—an early version of one of the Black Box Test Machines. “How many test cases would you need to test this application?” he asked.

Just then Jerry Weinberg wandered into the room. “Ah! Jerry Weinberg!” said James. “One of the greatest testing experts in the world! He’ll know the answer to this one. How many test cases would you need to test this application, Jerry?”

Jerry looked at the screen for a moment. “Three,” he said, firmly and decisively.

James knew to play along. “Three?!“, he said, in a feigned combination of amazement, uncertainty, and curiosity. “How do you know it’s three? Is it really three, Jerry?”

“Yes,” said Jerry. “Three.” He paused, and then said drily, “Why? Were you expecting some other number?”

In yesterday’s post, I was harshly critical of pass vs. fail ratios, a very problematic yet startlingly common way of estimating the state of the product and the project. When I point out the mischief of pass vs. fail ratios, some people object. “In the real world,” they say, “we have to report pass vs. fail ratios to our managers, because that’s what they want.” Yet bogus reporting is antithetical to the “real world”. Pass vs. fail ratios come from the fake world, a world where numbers have magical properties to soothe troubled and uncertain souls. Still, there’s no question that managers want something. It’s our mandate to give them something of value.

Some people say that managers want numbers because they want to know that we’re measuring. I’ve found two ways of thinking about measurement that have been very useful to me. One is the definition from Kaner and Bond’s splendid paper “Software Engineering Metrics: What Do They Measure and How Do We Know?”: “Measurement is the empirical, objective assignment of numbers, according to a rule derived from a model or theory, to attributes of objects or events with the intent of describing them.” I think that’s a superb definition of quantitative measurement, and the paper includes a set of probing questions to test the validity of a quantitative measurement. Pass vs. fail ratios fall down badly when they’re subjected to those tests.

Jerry Weinberg offers another definition of measurement that I think is more in line with what managers really want: “Measurement is the art and science of making reliable (and significant) observations.” (The main part of the definition comes from Quality Software Management, Vol. 2: First-Order Measurement; the parenthetical comes from recent correspondence over Twitter.) That’s a more general, inclusive definition. It incorporates Kaner and Bond’s notion of quantitative measurement, but it’s more welcoming to qualitative, first-order approaches. First-order measurement, as Jerry describes it, provides answers to questions like “What seems to be happening?” and “What should I do now?” It entails a minimum of fuss, and tends to be direct, unobtrusive, inexpensive, and qualitative, leading either to immediate action or a decision to seek more information. It’s a common, misleading, and often expensive mistake in software development to leap over first-order measurement and reporting in favour of second-order—less direct, more quantified, more abstract, and based on more elaborate and vulnerable models.

My experience, as a tester, a programmer, a program manager, and a consultant, tells me that to manage a project well, you need a good deal of immediate and significant information. “Immediate” here doesn’t only mean timely; it also means unmediated, without a bunch of stuff getting in between you and the observation. In particular, managers need to know about problems that threaten the value of the product and the on-time, successful completion of the project. That knowledge requires more than abstract data; it requires information. So, as testers, how can we inform the decision-makers? In our Rapid Software Testing class, James Bach and I have lately taken to emphasizing this: We must learn to describe and report on the product, our testing, and the quality of our testing. This involves constructing, editing, narrating, and justifying a story in three lines that weave around each other like a braid. Each line, or level, is its own story.

Level 1: Tell the product story. The product story is a qualitative report on how the product can work, how it fails, and how it might fail in ways that matter to our clients. “Working”, “failure”, and “what matters” are all qualitative evaluations. Quality is value to some person; in a business setting, quality is value to some person who matters to the business. A qualitative report about a product requires us to relate the nature of the product, the people who matter, and the presence or absence of value, risks, and problems for those people. Qualitative information makes it possible for our clients to make informed decisions about quality.

Level 2: To make the product story credible, tell the testing story. The testing story is about how we configured, operated, observed, and evaluated the product; what we actually did and what we actually saw. The testing story gives warrant to the product story; it helps our clients understand why they should believe and trust the product story we’re giving. The testing story is centred around the coverage that we obtained and the oracles that we applied. Coverage is the extent to which we’ve tested the program; it’s about where we’ve looked and how we’ve looked, and it’s also about what’s uncovered—where we might not have looked yet, and where we don’t intend to look. Oracles are central to evaluation; they’re the principles and mechanisms that allow us to recognize a problem. The product story will likely feature problems in the product; the testing story, where necessary, includes an account of how we knew they were problems, for whom they would be problems, and inferences about how serious the problems might be. We can make inferences about the significance of problems, but not ultimate conclusions, since the decision of what matters and what constitutes a problem lies with the product owner. The product story and our clients’ reactions to it will influence the ongoing testing story, and vice versa.

Level 3: To make the testing story credible, tell a story about the quality of the testing. Just as the product story needs warrant, so too does the testing story. To tell a story about the quality of testing requires us to describe why the testing we’ve done has been good enough, and why the testing we haven’t done hasn’t been so important so far. The quality-of-testing story includes details on what made testing harder or slower, what made the product more or less testable, what the risks and costs of testing are, and what we might need or recommend in order to provide better, more accurate, more timely information. The quality-of-testing story will shape and be shaped by the other two stories.

Develop skills to tell and frame stories. People sometimes justify presenting invalid numbers in lieu of stories by saying that numbers are “efficient”. I think they mean “fast”, since efficiency of communication depends not only on speed, but also on value, relevance, validity, and the level of detail your client needs. In order to frame stories appropriately and hit the right level of detail…

Don’t think data feed; think the daily news. Testing is like investigative journalism, researching and delivering stories to people. The newspaper business knows how to direct attention efficiently to the stories in which we’re interested, such that we get the level of detail that we seek. Some of those strategies include:

  • Headlines. A quick glance over each page tells us immediately what, in the editors’ judgement, are the most salient aspects of any given story. Headlines come in different sizes, relative to the editors’ assessment of the importance of the story.
  • Front page. The paper comes folded. The stories that the paper deems most important to its reader are on the front page, above the fold. Other important stories are on the front page below the fold. The page is laid out to direct our attention to what we find most relevant, and to allow us to focus and refocus on items of interest.
  • Continuation. When an entire story is too long to fit on the front page, it’s abbreviated and the story continues elsewhere. This gives the reader the option of following the story or looking at other items on the front page.
  • Coverage areas. The newspaper is organized into sections (hard news, business, sports, life and leisure, arts, real estate, cars, travel, and so forth). Each section comes with its own front page, which generally includes headlines and continuations of its own.
  • Structured storytelling. Newspaper stories tend to be organized in spiralling levels of detail, such that the story is set up to follow the inverted pyramid (the link is well worth reading). The story typically begins with the most newsworthy information, usually immediately addressing the five W questions—who, what, where, why, and when, plus how—and the story builds from there. The key is that the reader can absorb information to the level of detail she seeks, continuing to the end of the story or jumping out when she’s satisfied.
  • Identifying who is involved and who is affected. Reporters and editors contextualize their stories. Just as in testing, people are the most important element of the context. A story is far more compelling when it affects the reader or people that the reader cares about. A good story often helps to clarify why the reader should care.
  • Varying approaches to delivering information. Newspapers often use a picture to help illustrate or emphasize an important aspect of a story. In the business or sports sections, where quantitative data is often crucial, information may be organized in tables, or trends may be illustrated with charts. Notice that the stories—first-order reports—are always given greater prominence than the tables of stock quotes, league standings, and line scores.
  • Sidebars. Some stories are illuminated by background information that might break the flow of the main story. That information is presented in parallel; in another thread, as we might say.
  • Daily (and in the world of the Web, continuous) delivery of information. My newspaper arrives at a regular time each day, a sort of daily heartbeat for the news cycle. The paper’s Web site is updated on a continuous basis. Information is available both on a supply and a demand basis; both when I expect it and when I seek it.
  • Identifiable sources. Well-researched stories gain credibility by identifying how, where, when, and from whom the information was obtained. This helps to set up degrees of trust and skepticism in the reader.

One important note: These approaches apply to more than text. Testers need to extend these patterns not only to written or mechanical forms, but to oral discourse.

I’ll have more suggestions and additional parallels between test reporting and newspapers in the next post in this series.

Scripts or No Scripts, Managers Might Have to Manage

Wednesday, December 21st, 2011

A fellow named Oren Reshef writes in response to my post on Worthwhile Documentation.

Let me be the devil’s advocate for a post.

Not having fully detailed test steps may lead to insufficient data in bug reports.

Yup, that could be a risk (although having fully detailed steps in a test script might also lead to insufficient data in bug reports; and insufficient to whom, exactly?).

So what do you do with a problem like that? You manage it. You train the tester, reminding her of the heuristic that each problem report needs a problem description; an example of something that shows the problem; and why she thinks it’s a problem (that is, the oracle; the principle or mechanism by which the tester recognizes the problem). Problem, example, and why; PEW. You praise and reward the tester for producing reports that follow the PEW heuristic; you critique reports that don’t have them. You show the tester lots of examples of bug reports, and ask her to differentiate between the good ones and the bad ones, why each one might be considered good or bad, and in what ways. If the tester isn’t getting it, you have the tester work with and be coached by someone who does get it. The coach talks the tester through the process of identifying a problem, deciding why it’s a problem, and outlining the necessary information. Sometimes it’s steps and specific data; sometimes the steps are obvious and it’s only the data you need to specify; sometimes the problem happens with any old data, and it’s the steps that are important. And sometimes the description of the problem contains enough information that you need supply neither steps nor data. As a tester under time pressure, she needs to develop the skill to do this rapidly and well—or, if nothing works, she might have to find a job for which she is better suited.

You can argue that a good tester should include the needed information and steps in her bug report, but this raise (at least) two problems:

– The same information may be duplicated across many bugs, and even worst it will not be consistent.

As a manager, I can not only argue that a tester should include the needed information; I can require that a tester include the needed information. Come on, Mr. Advocate… this is a problem that a capable tester and a capable test manager (and presumably your client) can solve. If “the same” information is duplicated across many bugs, might that be an interesting factor worth noting? A test result, if you will? Will this actually persist for long without the test manager (or test leads, or the test team) noticing or managing it?

And in any case, would a script solve the problem that you post above? If you can solve that problem in a script, can you solve it in a (set of) bug report(s)?

Writing test steps is not as trivial as it sounds (for example due to cognitive biases, or simply by overlooking steps that seems obvious to you), and to be efficient they also need to be peer reviewed and tested. You don’t want that to happen in a bug report.

“Writing test steps is not as trivial as it sounds.” I know. It’s non-trivial in terms of time, and it’s non-trivial in terms of skill, and it’s non-trivial in terms of cost. That’s why I write about those problems. That’s why James Bach writes about them.

Again: how do you solve problems like testers providing inefficient repro steps? You solve it with training, practice, coaching, review, supervision, observation, interaction… that is, if you don’t like the results you’re getting, you steer the testers in the direction you want them to go, with leadership and management.

The tester may choose the same steps over and over, or steps that are easier for her but does not represent real customers.

Yes, I often hear things like this to justify poor testing. “Real customers” according to whom? It seems as though many organizations have a problem recognizing that hackers are real; that people under pressure are real; that people who make mistakes are real; that people who can become distracted are real. That people who get up and go away from the keyboard, such that a transaction times out, are real.

Is it the role of testers to behave always like idealized “real” customers? That’s like saying that it’s the role of airport security to assume that all of the business class customers are “real” business people. I’d argue that it’s nice for testers to be able to act like customers, but it’s far more important for testers to act like testers. It’s the tester’s role to identify important vulnerabilities in the product. Sometimes that involves behaving like a typical customer, sometimes it involves behaving like an atypical customer, and sometimes it involves behaving like someone who is not a customer at all. But again, mostly it involves behaving like a tester.

Again you may argue that a good tester should take all that into account, but it’s not that simple to verify it especially for tests involving many short trivial steps.

Maybe it isn’t that simple. If that’s a problem, what about logging? What about screen capture tools? Such tools will track activities far more accurately than a script the tester allegedly followed. After all, a test script is just a rumour of how something should be done, and the claim that the script was followed is also a rumour. What about direct supervision and scrutiny? What about occasional pairing? What about reviewing the testers’ work? What about providing feedback to testers, while affording them both freedom and responsibility?

And would scripts solve that problem when (for example) you’re recording a bug that you’ve just discovered (probably after deviating from a script)? How, exactly? What happens when a problem identified by a script is fixed? Does the value of the script stay constant over time?

Detailed test steps (at least to some extent) might be important if your test activity might be transferred to another offshore team someday (happened to me a few weeks ago, I sent them a test document with only high level details and hoped for the best), or your customer requires in-depth understanding of your tests (a multi-billion Canadian telecommunication company insisted on getting those from us during the late 90’s, we chose the least readable TestDirector export format and shipped it to them…).

Ah, yes. “I sent them a test document with only high level details and hoped for the best.” What can I say about “hope” as a management approach? Does a pile of test scripts impart in-depth understanding? Or are they (as I suspect) a way of responding to a question that you didn’t know how to answer, which was in fact a question that the telco didn’t know how to ask?

Going through some set of actions by rote is not a test. A test script is not a test. A test is what you think and what you do. It is a complex, cognitive activity that requires the presence or the development of much tacit knowledge. Raw data or raw instructions at best provide you with a miniscule fraction of what you need to know. If someone wanted in-depth understanding of how a retail store works, would you send them a pile of uncontextualized cash register receipts?

The Devil’s Advocate never seems to have a thoughtful manager for a client. I would suggest that a tester neither hire nor work for the devil.

Thank you for playing the devil’s advocate, Oren.

What Exploratory Testing Is Not (Part 5): Undocumented Testing

Wednesday, December 21st, 2011

This week I had the great misfortune of reading yet another article which makes the false and ridiculous claim that exploratory testing is “undocumented”. After years and years of plenty of people talking about and writing about and practicing excellent documentation as part of an exploratory testing approach, it’s depressing to see that there are still people shovelling fresh manure onto a pile that should have been carted off years ago.

Like the other approaches to test activities that have been discussed in this series (“touring”, “after-everything-else”, “tool-free”, and “quick testing”), “documented vs. undocumented” is in a category orthogonal to “exploratory vs. scripted”. True: usually scripted activities are performed by some agency following a set of instructions that has been written down somewhere. But we could choose to think of “scripted” in a slightly different and more expansive way, as “prescriptive”, or “mimeomorphic”. A scripted activity, in this sense, is one for which the actions to be performed have been established in advance, and the choices of the actions are not determined by the agency performing them. In that sense, a cook at McDonalds doesn’t read a script as he prepares your burger, but the preparation of a McDonald’s burger is a highly scripted activity.

Thus any kind of testing can be heavily documented or completely undocumented. A thoroughly documented test might be highly exploratory in nature, or it might be highly scripted.

In the Rapid Software Testing class, James Bach and I point out that when someone says “that should be documented”, what they’re really saying is “that should be documented if and how and when it serves our purposes.” So, let’s start by looking at the “when”.

When we question anything in order to evaluate it, there are moments in the process in which we might choose to record ideas or actions. I’ve broken these down into three basic categories that I hope you find helpful:

  • Before

  • During

  • After

There are “before”, “during”, and “after” moments with respect to any test activity, whether it’s a part of test design, test execution, result interpretation, or learning. Again, a hallmark of exploratory testing is the tester’s freedom and responsibility to optimize the value of the work as it’s happening. That means that when it’s important to record something, the tester is not only welcome but encouraged to

  • pick up a pen
  • take a screen shot
  • launch a session of Rapid Reporter
  • create or update a mind map
  • fire up a screen recorder
  • initiate logging (if it doesn’t start by default on the product you’re testing—and if logging isn’t available, you might consider identifying that as a testability problem and a related product and project risk)
  • sketch out a flowchart diagram
  • type notes into a private or shared repository
  • add to a table of data in Excel
  • fire off a note to a programmer or a product owner
and that’s an incomplete list. But they’re all forms of documentation.

Freedom to document at will should also mean that the tester is free to refrain from documenting something when the documentation doesn’t add value. At the same time, the tester is responsible and accountable for that decision. In Rapid Testing, we recommend writing down (or saving, or illustrating) only the things that are necessary or valuable to the project, and only when the value of doing so exceeds the cost. This doesn’t mean no documentation; it means the most informative yet fastest and least expensive documentation that completely fulfils the testing mission. Integrating that with testing work leads, we hold, to excellent testing—but it takes practice and skill.

For most test activities, it’s possible to relay information to other people orally, or even sometimes by allowing people to observe our behaviour. (At the beginning of the Rapid Testing class, I sometimes silently hold aloft a 5″ x 8″ index card in landscape orientation. I fold it in half along the horizontal axis, and write my first name on one side using a coloured marker. Everyone in the class mimics my actions. Without a single word of instruction being given or questions being asked, either verbally or in writing, the mission has been accomplished: each person now has a tent card in front of him.)

There’s a potential risk associated with an exploratory approach: that the tester might fail to document something important. In that case, we do what skilled people do with risk: we manage it. James Bach talks at length about managing exploratory testing sessions here. Producing appropriate documentation is partly a technical process, but the technical considerations are dominated by business imperatives: cost, value, and risk. There are social considerations, too. The tester, the test lead, the test manager, the programmers, other managers, and the product owner determine collaboratively what’s important to document and what’s not so important with respect to the current testing mission. In an exploratory approach, we’re more likely to be emphasizing the discovery of new information. So we’re less likely to spend time on documenting what we will do, more likely to document what we are doing and what we have done. We could do a good deal of preparatory reading and writing, even in an exploratory approach—but we realize that there’s an ever-increasing risk that new discoveries will undermine the worth of what we write ahead of time.

That leads directly to “our purposes”, the task that we want to accomplish when documenting something. Just as testing itself has many possible missions, so too does test documentation. Here’s a decidedly non-exhaustive list, prepared over a couple of minutes:

  • to express testing strategy and tactics for an entire project, or for projects in general
  • to keep a set of personal notes to help structure a debriefing conversation
  • to outline testing activities for a test cycle
  • to report on activities during testing execution
  • to outline attributes of a particular quality criterion
  • to catalogue ideas about risk
  • to describe test coverage
  • to account for the work that we’ve done
  • to program a machine to perform a given set of actions
  • to alert people to potential problems in the product
  • to guide a tester’s actions over a test session
  • to identify structures in the application or service
  • to provide a description of how to use a particular test tool that we’ve crafted
  • to describe the tester’s role, skills, and qualifications
  • to explain business rules to someone else on the team
  • to outline scenarios in which the product might be used or tested
  • to identify, for a tester, a specific, explicit sequence of actions to perform, input to provide, and observations to make

That last item is the classic form of highly scripted testing, and that kind of documentation is usually absent from exploratory testing. Even so, a tester can take an exploratory approach using a script as a point of departure or as a reference, just as you might use a trail map to help guide an off-trail hike (among other things, you might want to discover shortcuts or avoid the usual pathways). So when someone says that “exploratory testing is undocumented”, I hear them saying something else. I hear them saying, “I only understand one form of test documentation, and I’ve successfully ignored every other approach to it or purpose for it.”

If you look in the appendices for the Rapid Software Testing class (you can find a .PDF at http://www.satisfice.com/rst-appendices.pdf), you’ll see a large number of examples of documentation that are entirely consistent with an exploratory approach. That’s just one source. For each item in my partial list above, here’s a partial list of approaches, examples, and tools.

Testing strategy and tactics for an entire project, or for projects in general.
Look at the Satisfice Heuristic Test Strategy Model and the Context Model for Heuristic Test Planning (these also appear in the RST Appendices).

An outline of testing activities for a test cycle.
Look at the General Functionality and Stability Test Procedure for Certified for Microsoft Windows Logo. See also the OWL Quality Plan (and the Risk and Task Correlation) in the RST Appendices.

Keeping a set of personal notes to help structure a debriefing or other conversation.
See the “Beans ‘R Us Test Report” in the RST Appendices; or see my notes on testing an in-flight entertainment system which I did for fun on a flight from India to Amsterdam.

Recording activities and ideas during test execution
A video camera or a screen recording tool can capture the specific actions of a tester for later playback and review. Well-designed log files may also provide a kind of retrospective record about what was tested. Still, neither of these provides insight into the tester’s mind. Recorded narration or conversation can do that; tools like BB Test Assistant, Camtasia, or Morae can help. The classic approach, of course, is to take notes. Have a look at my presentation, “An Exploratory Tester’s Notebook”, which has examples of freestyle notes taken during an impromptu testing session, and detailed, annotated examples of Session-Based Test Management sessions. Shmuel Gerson’s Rapid Reporter and Jonathan Kohl’s Session Tester are tools oriented towards taking notes (and, in the former case, including screen captures) of testing sessions on the fly.

Outlining many attributes of a particular quality criterion
See “Heuristics of Software Testability” in the RST Appendices for one example.

Cataloguing ideas about risk
Several examples of this in the RST Appendices, most extensively in the “Deployment Planning and Risk Analysis” example. You’ll also find an “Install Risk Catalog”; “The Risk of Incompatibility”; the Risk vs. Tasks section in the “OWL Quality Plan”; the “Y2K Compliance Report”; “Round Results Risk A”, which shows a mapping of Risk Areas vs. Test Strategy and Tasks.

Describing or outlining test coverage
A mapping establishes or illustrates relationships between things. In testing, a map might look like a road map, but it might also look like a list, a chart, a table, or a pile of stories; we can use any of these to help us think about test coverage. These can be constructed before, after, or during a given test activity, with the goal of covering the map with tests, or using testing to extend the map. I catalogued several ways of thinking about coverage and reporting on it in three articles: Got You Covered, Cover or Discover, and A Map By Any Other Name. Several examples of lightweight coverage outlines can be found in the RST Appendices (“Putt Putt Saves the Zoo” and “Table Formatting Test Notes”). There are also coverage ideas incorporated into the Apollo mission notes that we’ve titled “Guideword Heuristics for Astronauts”.
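
As a rough illustration of what “covering the map” might mean in its simplest form, here’s a minimal sketch of a coverage outline as a data structure. The product areas and test ideas are invented; a real outline would come from your own analysis of the product.

```python
# A minimal sketch of a coverage outline: product areas mapped to test ideas,
# marked off as testing touches them. Areas and ideas are invented examples.
coverage_map = {
    "Login":   {"wrong password": False, "expired account": False},
    "Search":  {"empty query": False, "huge result set": False, "Unicode input": False},
    "Reports": {"export to CSV": False, "time zone boundaries": False},
}

def mark_covered(area, idea):
    coverage_map[area][idea] = True

def summarize():
    # A report like this says where testing has and hasn't been, not how good it was.
    for area, ideas in coverage_map.items():
        touched = sum(ideas.values())
        print(f"{area}: {touched}/{len(ideas)} test ideas touched")

mark_covered("Search", "empty query")
summarize()
```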

Accounting for testing work that we’ve done.
See Session-Based Test Management, and see “An Exploratory Tester’s Notebook“. Darren McMillan provides excellent examples of annotated mind maps; scroll down to the section headed “Session Reports”, and continue through “Simplifying feedback to management” and “Simplifying feedback to groups”. A forthcoming article, written by me, shows how a senior test manager tracks testing sessions at a half-day granularity level.

Programming a machine to help you to explore
See all manner of books on programming, both references and cookbooks, but for testers in particular, have a look at Brian Marick’s Everyday Scripting with Ruby. Check out Pete Houghton’s splendid examples of exploratory test automation that begin here. Cem Kaner (often in collaboration with Doug Hoffman) writes extensively about automation-assisted exploratory testing; an example is here.
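
To give a flavour of what that can look like, here’s a minimal sketch of automation-assisted exploration under invented assumptions: the function under test and the rough oracle are hypothetical stand-ins, not anyone’s published approach.

```python
# A minimal sketch of automation-assisted exploration: randomized inputs,
# a rough oracle, and a log of surprises for a human to investigate later.
# The function under test and the oracle are hypothetical stand-ins.
import json
import random

def product_round(amount, places=2):
    # Stand-in for a call into the product; replace with its real API.
    return round(amount, places)

def looks_plausible(amount, result, places=2):
    # Rough oracle: the result should be within half a unit in the last place.
    return abs(result - amount) <= 0.5 * 10 ** -places + 1e-9

surprises = []
for _ in range(10_000):
    amount = random.uniform(-1e6, 1e6)
    result = product_round(amount)
    if not looks_plausible(amount, result):
        surprises.append({"input": amount, "output": result})

# The log is raw material for exploration, not a verdict.
print(json.dumps(surprises[:20], indent=2))
```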

Alerting people to potential problems in the product
In general, bug reporting systems provide one way to handle the task of recording and reporting problems in the product. James Bach provides an example of a report that he delivered to a client (along with a more informal account of the session).

Guiding a tester’s actions over a test session
Guiding a tester involves skills like chartering and checklisting. Start with the documentation on Session Based Test Management (http://www.satisfice.com/sbtm). Selena Delesie has produced an excellent blog post on chartering exploratory testing sessions. The title of Cem Kaner’s presentation at CAST 2008, The Value of Checklists and the Danger of Scripts: What legal training suggests for testers, describes the content perfectly. Michael Hunter’s You Are Not Done Yet lists can be used and adapted to your context as a set of checklists.

Identifying structures in the application or service
The “Product Elements” section in the Heuristic Test Strategy Model provides a kind of framework for documenting product structures. In the RST Appendices, the test notes for “Putt Putt Saves the Zoo” and “Diskmapper”, and the “OWL Quality Plan” provide examples of identifying several different structures in the programs under test. Mind mapping provides a means of describing and illustrating structures, too; see Darren McMillan’s examples here and here. Ruud Cox and Ru Cindrea used a mind map of product elements to help win the Best Bug Report award in the Test Lab at EuroSTAR 2011. I’ve created a list of structures that support exploratory testing, and many of these are related to structures in the product.

Providing a description of how to use a particular test tool that we’ve crafted
While working at a bank, I developed (in Excel and VBA) a tool that could be used as an oracle and as a way of recording test results. (Thanks to non-disclosure agreements, I can describe these things, but cannot provide examples.) When I left the project, I was obliged to document my work. I didn’t work on the assumption that anyone off the street would be reading the document. Instead, I presumed that anyone assigned to that testing job and to using that tool would have the rapid learning skill to explore the tool, the product, and the business domain in a mutually supportive way. So I crafted documentation that was intended to tell testers just enough to get them exploring.

Explaining business rules to someone else on the team
I did include documentation for novices of one kind: within the documentation for that testing tool, I included a general description of how foreign exchange transactions worked from the bank’s perspective, and how the appropriate accounts got credited and debited. I had learned this by reverse-engineering use cases and consulting with the local business analyst. I summarized it in a two-page document written in simple, direct language, referring directly to the simpler use cases and explaining the more confusing bits in more detail. For those whose learning style was oriented toward code, I also described the tables and array formulas that applied the business rules.
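
Purely as an illustration (the real tables and formulas were covered by the non-disclosure agreement), a rule of that kind might be sketched in code something like this; the account names, the spread, and the posting rule are all invented.

```python
# A purely hypothetical sketch of the kind of business rule such a document
# might describe: a customer buys foreign currency, and the bank records a
# balanced pair of postings. Account names and the spread are invented.
from decimal import Decimal

def post_fx_purchase(amount_foreign, rate, spread):
    """Return balanced ledger postings for a simple FX purchase."""
    cost_domestic = (amount_foreign * rate * (1 + spread)).quantize(Decimal("0.01"))
    return [
        {"account": "Customer Settlement", "debit": cost_domestic, "credit": Decimal("0")},
        {"account": "FX Position", "debit": Decimal("0"), "credit": cost_domestic},
    ]

# Example: buying 1,000 EUR at 1.45 CAD/EUR with a 0.5% spread.
for posting in post_fx_purchase(Decimal("1000"), Decimal("1.45"), Decimal("0.005")):
    print(posting)
```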

Outlining scenarios in which the product might be used or tested
I discuss some issues about scenarios here—why they’re important, and why it’s important to keep them open-ended and open to interpretation. It’s more important to record than to prescribe, since in a good scenario, you’ll observe and discover much more than you’ve articulated in advance. Cem Kaner gives ideas on how to produce scenarios; Hans Buwalda presents examples of soap opera testing.

Identifying required tester skill
People with skill don’t need prescriptive documentation for every little thing. Responsible managers identify the skills needed to test, and commit to employing people who either have those skills or can develop them quickly. James Bach eliminated 50 pages of otiose documentation with two paragraphs. (Otiose is a marvelous word; it’s fun to look it up in a thesaurus.)

Identifying, for a tester, a specific, explicit sequence of actions to perform, input to provide, and observations to make.
Again, a document that attempts to specify exactly what a tester should do is the hallmark of scripted testing. James Bach articulates a paradox that has not yet been noted clearly in our craft: in order to perform a scripted test well, you need significant amounts of skill and tacit knowledge (and you also need to ignore the script on occasion, and you need to know when those occasions are). There’s another interesting issue here: preparing such documents usually depends on exploratory activity. There’s no script to tell you how to write a script. (You might argue that there’s one exception. You can follow this script to write a test script: take each line of a requirements document, and add the words “Verify that” to the beginning of each line.)
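
For what it’s worth, that entire “script” fits in a few lines of code. The requirements file name here is hypothetical; the utter absence of thinking is exactly the point.

```python
# The "script for writing test scripts" from the parenthetical above, taken
# literally. The filename is hypothetical; the lack of thinking is the point.
with open("requirements.txt") as requirements:
    for line in requirements:
        line = line.strip()
        if line:
            print("Verify that " + line)
```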

Now, just as you can perform testing badly using any approach, you can perform exploratory testing and document it inappropriately, either by under-documenting it OR over-documenting it using any of the kinds of documentation above. But, as this document shows, the notion that exploratory testing is by its nature undocumented is not only ignorant, but aggressively ignorant about both testing and documentation. Whenever you see someone claim that exploratory testing is undocumented, I’d ask you to help by setting the record straight. Feel free to refer to this blog post, if you find it helpful; also, please point me to other exemplars of excellent documentation that are consistent with exploratory approaches. If we all work together, we can bury this myth, while providing excellent records and reports for our clients.

Worthwhile Documentation

Monday, December 19th, 2011

In the Rapid Software Testing class, we focus on ways of doing the fastest, least expensive testing that still completely fulfills the mission. That involves doing some things more quickly, and it also involves doing other things less, or less wastefully. One of the prime candidates for radical waste reduction is documentation that’s incongruent with the testing mission.

Medical device projects typically present a high degree of risk. Excellent testing helps teams and product owners to identify risks and problems in the product. The quality of testing is a function of the skill of the tester; one would not set an incapable tester loose on a high-risk project. Yet some managers have told me that they commission people to write test documentation in a particular style: one that is, to me, overly elaborate and specific with respect to actions to perform and observations to make, and yet remarkably devoid of ideas about motivation or risk.

I sometimes ask managers why they use this style of instruction. They usually answer, “because we want anyone to be able to walk up to this system and test it.”

“Anyone?” I ask. “Why anyone?”

“You know how it is. If we have to test a new revision of this program a year from now, there’s a good chance that we won’t have the same testers.” (Dude. If you’re inflicting on your staff the idea of testing as writing or following instructions for an automaton, I might have an explanation for you.)

“Anyone?” I ask. “How about a cat?”

“Well, Michael, that’s silly. Cats can’t think. Cats can’t read.”

“How about my daughter? She’s seven, and she can read well enough to read that. And she could follow the steps pretty well, too.”

“We don’t hire children here!”

“Okay,” I offer. “Would you hire a completely incompetent tester who needed to be told absolutely everything, in painful detail?”

“We wouldn’t hire anyone like that.”

“Fair enough, and I’d hope not. So, why do you insist that people write instructions for them that way?”

Let me be clear: when the situation calls for skilled testers, you don’t need overly specific instructions for them. On the other hand, if you don’t have skilled testers, you’ve got a problem that scripted testing won’t be able to solve.

Here’s a splendid example of a machete that we believe managers could use to cut through jungles of waste. In a recent project involving FDA-regulated medical devices, James Bach found a huge number of excruciatingly overspecified, low-value test cases aimed at “anyone”. The following two paragraphs replaced 50 pages of waste.

3.0 Test Procedures

3.1 General Testing Protocol

In the test descriptions that follow, the word “verify” is used to highlight specific items that must be checked. In addition to those items, the tester shall at all times be alert for any unexplained or erroneous behaviour of the product. The tester shall bear in mind that, regardless of any specific requirement for any specific test, there is the overarching general requirement that the product shall not pose an unacceptable risk of harm to the patient, including an unacceptable risk due to reasonably foreseeable misuse.

Test personnel requirements: The tester shall be thoroughly familiar with the Generator and Workstation Function Requirement Specifications, as well as the working principles of the devices themselves. The tester shall also know the workings of the power test jig and associated software, including how to configure and calibrate it and how to recognize it is not working correctly. The tester shall have sufficient skill in data analysis and measurement theory to make sense of statistical test results. The tester shall be sufficiently familiar with test design to complement this protocol with exploratory testing in the event that anomalies appear that require investigation. The tester shall know how to keep test records to a credible professional standard.

To me, that’s something worth writing down. Follow those instructions, and your team will save time, save work, and put the emphasis in the right places: on risk, and on meeting and mitigating that risk with skills.

Doing Development Work vs. Doing Quality Assurance

Saturday, June 5th, 2010

Here’s a case where a comment and question were worthy of a post of their own.  In reference to my recent post, Testers:  Get Out of the Quality Assurance Business, Selim Mia writes:

Hi Michael,

I started following your blog just a few days ago, and I would like to thank you for all of your thoughtful posts, which reflect your craftsmanship.

Thank you for reading, and thank you for thanking me.

I wholeheartedly agree with all of your points, advice, and discussions in this post. I have had a lot of confusion about the terms QA and QC since the start of my testing career, and I still do; I think other testers have the same confusion. I have been working in a department called “QA” in my organization, but doing mostly testing tasks, as at other companies in Bangladesh. Along with testing, though, we have also been doing some QA tasks (I think), some of which I have listed below:

  • Check-in Review: we check that each developer checks their source code into the SVN repository (our source code management system) at least once a day, with a comment describing what was changed in that particular check-in, and with the name of the reviewer who pair-reviewed the code before check-in.
  • Code review: we check that code is reviewed at regular intervals by an expert in the technology in which the project is being developed (at least for new developers’ code, code for complex functionality, and so on), and we also ensure that action has been taken on all of the review comments.
  • Audit Process Framework: we check that all of the development processes are being followed by all project members, except where there is sufficient justification and approval not to follow a particular process.
  • Audit Bug Repository: we ensure that all reported bugs have been acted upon (not a bug, assigned, work in progress, fixed, won’t fix).
  • Audit Document Management System: we ensure that the current versions of all documents for a particular project are stored in the DMS.

Aren’t all of the above activities part of QA (though, of course, not all of it)? Your kind words would be very helpful to me.

Regards,
– Selim

What a great question! Thank you for asking.

The overarching mission for a tester, in my view, is to be of service to the project. Now, that’s not only the case for testers; I think it’s the overarching mission of anyone, everyone, on the project. We’re all in service to our paramount clients—the product owners, the business owners, the gold owners and the goal donors (as some Agile wags have said)—but we’re also in service to each other. When we’re thinking that way, the testers help the programmers by testing the product using a different skill set and mind set from the programmers; the programmers help the testers by providing a more testable product (log files, scriptable interfaces, and so on). Testers may help programmers to pinpoint the circumstances in which a bug happens; programmers help testers by providing explanations, test programs, hints on what to test. Testers learn to program; programmers learn to test. We support each other and learn from each other.

The Agile people for years have been advocating the idea of the self-organizing team. I believe in that too. That means that, in principle, anyone on the team is empowered to do whatever work needs to be done. So if a programmer takes on the tasks of setting up and configuring test environments, or if the tester is recruited to review code or models or bugs—activities that help to assure quality as a part of collaborative process, I’d say that’s cool.

The audit stuff gives me pause. Auditing, in my view, is a kind of testing role: gathering information with the intention of informing a decision. Auditors don’t set policy or enforce rules; they provide information to management. In many process-model-obsessed organizations (here in the West, at least) the role has taken on a different slant: auditors are a kind of process police. In such organizations, people rearrange and reprioritize their work not to optimize its value, but to keep the auditors happy. This is a form of goal displacement. To me, the priority should be on providing service and value to our clients, including each other.

In my view, if auditors discover some deviation from a set policy or a process model, I’d argue that the first step is to question the reasons for the deviation. Maybe someone is being sloppy; maybe someone is cutting corners; maybe someone is adding risk. But maybe someone has discovered a faster, less expensive, more efficient, more informative, more productive way of handling a task. Models always leave out something. Process models often leave out means by which we can encourage beneficial variation and change. I’ve never heard of an auditor reporting on some fabulous new problem-solving approach that someone has discovered internally. Most often, in my experience, process models leave out adaptability and people, as this remarkable TED talk describes.

It’s neither a tester’s job nor an auditor’s job, in my view, to set or enforce policy, and I think it’s politically dangerous for us to be perceived that way. As soon as we are perceived to be responsible for enforcement, we run the risk of being seen as tattletales, busybodies, quality police. In that kind of environment, information will soon start to be hidden, which undermines the task of investigating the product and identifying problems with it that threaten its value.

So, to the extent that you’re doing development work that helps to assure quality; to the extent that your teammates themselves are asking you to assist them; to the extent that you’re providing a service to them; to the extent that they appreciate what you’re doing as a service to them; and to the extent that they thank you for it, I’d say “rock on”, and congratulations.

In another forum, a correspondent suggested, “Maybe it’s all down to the ‘overall’ thing – be part of the process, not a megalomaniac who thinks he owns it.” I absolutely agree with that. To the extent that you’re doing “quality assurance”; to the extent that your managers are requiring you to impose on your teammates (or even worse, to the extent that you’re imposing without being asked by anyone); to the extent that you’re slowing down the project or inflicting help; to the extent that the programmers see your work as enforcing the contents of a process model or policy document; to the extent that you are barely tolerated or outright resented—well, as always, that’s up to you and your organization. But it’s not the kind of work that I would condone or accept myself.

Again, thanks for writing.

Automation Bias, Documentation Bias, and the Power of Humans

Tuesday, June 30th, 2009

A few weeks ago, I went down to the U.S. Consulate in Toronto to register Ariel, my daughter, as an American citizen born abroad. (She’s a dualie, because she was born in Canada to an American parent: me. I’m a dualie too, born in the U.S. to Canadian parents. Being born a dual citizen is a wonderful example of a best practice. You should follow it. But I digress.)

The application process is, naturally, fraught with complication and bureaucracy. There’s also a chilling and intimidating level of security; one isn’t allowed to bring anything electronic into the Consulate at all. No cell phones, no PDAs, and certainly no laptop computers. That means no electronic records, and no hope of looking anything up. So one has to prepare.

There’s a Web site for the Consular services. One of the first items that one sees on the site is a link for telephone inquiries. Note a couple of things here: the telephone services are for visa information, not for general information, and that information costs USD$0.90 per minute for a recorded system with no operator. (Oddly, that’s the price for calls from the U.S.; calls from Canada are cheaper, at CAD$0.69 per minute.) I didn’t test that.

With only a little digging, I was able to find information related to registering a birth abroad. I gathered the information and documents that I figured I needed, and took it all down to the Consulate. I was getting ready to travel the next day, and so in typical fashion, I pushed things out to the noon deadline for receiving applications. I watched the clock on the car anxiously, parking at 11:53 and getting to the Consulate at 11:55. “Wow, that’s pushing it,” said the security guard. “Last one today.”

When I spoke to the friendly, helpful lady behind the counter (I mean that; she was genuinely friendly and helpful), she told me some things that the Web site didn’t.

  • The application form itself is online, and these days it’s one of those PDFs that has input fields, so everything can be nice and tidy. Again, though, there are some fields in the form that have several possible answers. There is some helpful information available, but I still had questions.
  • The consular officers want to see original documents, but accept and keep only photocopies of them. You need to come with your own photocopies. If you don’t, it costs you $1.00 per document—and there are lots of documents. This isn’t noted anywhere on the Web site that I could see.
  • On one of the Web pages listing documentation requirements, it says “In certain cases, it may be necessary to submit additional documents, including affidavits of paternity and support, divorce decrees from prior marriages, or medical reports of blood compatibility.” Well, what cases? The page doesn’t tell me, and getting it wrong means an extra trip. The lady behind the counter reviewed what I had brought, answered a number of questions, and told me exactly what to bring next time.

As I travel around, I sometimes see an implicit assumption that documents tell us all we need to know. Yet documents are always a stand-in for some person, an incomplete representation of what they know or what they want. They’re time-bound, in that they represent someone’s ideas frozen at some point in the past. They can’t, and don’t, answer follow-up questions. As Northrop Frye once said, “A book always says the same thing.” Yet if we look more closely, not even ideas that are carefully and thoroughly debated can be expressed unambiguously. That’s why we have judges. And lawyers.

The next thing that happened emphasized this. After I left the Consulate, I returned to my car. At the collection booth, the posted time was 12:20. I’d been away for less than half an hour, which was good, because parking at that garage costs $3.00 per half-hour. I handed the attendant my ticket. The charge was $6.00.

“What?! I’ve only been gone for 25 minutes.”

She looked at the ticket. “Sorry, sir. You checked in at 11:40.”

“No way,” I said. “I know what time I checked in; I was running late. It was at least 12 minutes later than 11:40. I got to the entrance to the Consulate, just over there, at 11:55. No way I could have taken 15 minutes to walk 75 metres!” She showed me the ticket. It said 11:40. “That’s impossible. I want to check the clock.”

The difference was only $3.00, but I was furious. I exited the garage, drove around to the entrance, and checked the display. It read 12:24, the correct time. I pushed the button and pulled out a ticket; it too read 12:24. To her credit, the attendant appeared, checked the clock, and asked to see the ticket I had just printed. “12:24. I’m sorry, sir, there’s nothing I can do.” Quite true, no doubt.

In this case, the (clearly fallible) machinery and the (clearly fallible) documentation were more credible than I. I didn’t check the ticket on the way in. And yet I know when I arrived, and I know that there must have been some kind of failure with the machinery. A one-off? A consistent pattern? Happens only at a certain time of the day? A mechanical problem? A software problem?

All the way home, I pondered how the failure had occurred, and how one might test for it. But what impressed me most about my experience with the Consulate’s Web site, and the consular officer, and the parking ticket machine, and the parking attendant, was the way in which we invest trust, to varying degrees and at various times, in machines and in documents and in people. When is that trust warranted, and when is it not?

Postscript: Just now, as I attempted to publish this post, the net connection at this hotel was suddenly unavailable. Again.