Blog Posts for the ‘Testing Story’ Category

Breaking the Test Case Addiction (Part 10)

Monday, June 8th, 2020

This post serves two purposes. It is yet another installment in The Series That Ate My Blog; and it’s a kind of personal exploration of work in progress on the Rapid Software Testing Guide to Test Reporting. Your feedback and questions on this post will help to inform the second project, so I welcome your comments.

As a tester, your mission is to evaluate the product and report on its status, typically with a special emphasis on finding problems that matter. We’ve discussed bug reporting in the Rapid Testing Guide to Making Good Bug Reports. In this installment of Breaking the Test Case Addiction, I’m describing test reporting as something that responsible testers do.

Sounds straightforward, right? But right away, I want to address the risk of misunderstanding, so let me clear up what I mean by certain terms here.

Responsible Testers
Responsible testers are people who assume the role of tester on a project, and who commit themselves to doing that job well over time. Supporting testers (whom we used to call “helpers”) help the test effort temporarily or intermittently, but are not committed to the testing role. Supporting testers are generally not required to report on their testing work to the same degree as responsible testers are.

Test Project
In this post, when I say test project, I’m referring to any set of activities focused on testing of any product or service, or any part of it: a low-level unit, a function, a component, a feature, a story, a service, an entire system… A test project can contain lots of little test projects. Accordingly, depending on the level of granularity we’re referring to, a test project might happen over moments or minutes, days, weeks, or months. A report on a test project might cover similar spans of time—instants, episodes, sprints, releases…

“Test project” here could refer to something that happens outside of development. More typically, it refers to testing activity that happens inside a development project, in parallel with the other aspects of development, like design, programming, or other testing.

Product
When I say product here, I mean anything that anyone has produced that might be subject to testing. While that includes running code, “product” could include code that is not running yet; prototypes and mockups; specifications and other requirement documents; flowcharts, diagrams, or state models; user documentation; sales and marketing material; or ideas about any of those things. When we refer to testing activity pointed at things that are static, like most of the items in the preceding list, we usually call it “review”; we might also call it “performing a thought experiment”. Review is a kind of testing activity that may be closely or distantly associated with performing a test—which brings us to what we mean by “testing”.

Testing, Test Activities, and Review
When I say testing here, I am using the Rapid Software Testing definition. To us, testing is the process of evaluating a product by learning about it through experiencing, exploring, and experimenting.

Testing includes many activities: questioning, studying, modeling, operating the product, manipulating it, making inferences, analyzing risk, thinking critically, recording the process, reporting on it, etc. Testing activities also include investigating and analyzing bugs and suspicious behaviour. Testing typically includes applying tools to help with any testing activities.

A test is an instance of testing, and to perform a test means to explore, experiment with, and gain experience of a product. In general, to perform a test implies that we will operate and observe a product or its output by some means.

In review, operation of the product as such typically isn’t available. In review, though, we engage in other testing activities as mentioned above. We can’t perform experiments on the running product but, as I mentioned above, we might perform thought experiments on it, imagining interactions between the product and the people using it. Of course, a thought experiment isn’t the same as a real-world experiment; that’s a key difference between review and performing a test.

Why go on about all this? Because reporting is central to our role as testers. We test; we learn; and we report on what we’ve learned.

Are you doing testing work of any kind, or even thinking about doing testing? Then you’ve got a test project on the go, and you can report on its status, even if your report starts with “I haven’t started testing the product yet, but here are some ideas about how we might go about it.”

Report
Next, let’s unpack the idea of a report. A report is a description, explanation, or justification of something. A report is a communication, but a report is not necessarily a document.

Communicating a report might happen as conversation in a hallway, or beside a coffee machine or a water cooler; as a couple of sentences uttered at a stand-up meeting; as a quick mention of a bug in passing to a developer; or as a lengthy description of the status of the product and the status of testing at a go-live meeting. A report might be conveyed in writing as a paragraph, a page, or several pages of text; as (heaven help us) a PowerPoint presentation; or as hundreds of pages in bound books, formally presented to a government or regulatory body.

We might include or refer to artifacts collected or produced during the activity that led to the report—the reporter’s raw notes, data sets, program code, design notes for the activity itself. A report might be supplemented with illustrations, charts, graphs, or diagrams, sketched on a whiteboard or formally rendered on glossy paper. Or a report might be accompanied by photographs, audio, video, mind maps, tables, and references to other artifacts.

Test Report
A test report is any description, explanation, or justification of the status of a test project.

A comprehensive test report is all of those things together.

A professional test report is one that is competently, thoughtfully, and ethically designed to serve your clients in their context. A professional test report need not be a comprehensive test report, nor vice versa.

Some might say that a test report is “just the facts”, but it isn’t; it cannot be. A test report is based on facts, but it’s a story about facts—a story framed for the person or people receiving it. Stories always emphasize some things and leave other things out. We never have all the facts, and facts are sometimes in dispute. Stories are always, to some degree, biased by the storyteller and focused by what the storyteller wants the audience to hear, to learn, and to know. Those biases can be seen as problems in the report, features of it, or both.

The audience for your test report might include insiders who are directly involved in the testing and development work; other insiders (who might be overseeing that work, or affected by it without being directly involved); or outsiders.

For now, I’m going to assume your audience is in the first two categories. On that basis, it helps to consider what the audience for a test report probably wants to know above all else.

  • They almost certainly don’t want to know about test case counts (although they might think they do).
  • They almost certainly don’t want to know about pass-fail ratios (although they might think they do).
  • They almost certainly don’t want to know about when the testing is going to be done (although they might think they do).

(I realize that these claims may sound strange to you. I will address these (non-)desires in a future post.)

Having been a program manager and a developer, and having worked with lots of both, I can tell you what those people almost certainly do want to know:

What is the actual status of the product? Are there problems that threaten the value of the product? Do these problems threaten the on-time, successful completion of our work?

A test report addresses those questions.

Three Aspects of Test Reporting
A good test report braids three strands of story together:

  • a story about the product and its status; what the product is, what it does, how it works, how it doesn’t work, and how it might not work in ways that matter to our various clients. This is a story about bugs, problems, and risks in the product.
  • a story about how the testing was done—how the product story was obtained; how we configured, operated, observed, and evaluated the product. A thread in this second strand of the testing story involves describing the ways in which we recognized problems; our oracles. Another thread in this strand involves where we looked for problems; our coverage. Yet another thread includes what we haven’t covered yet, or won’t cover at all unless something changes.
  • a story about the quality of the testing work—why the testing that was done can be trusted, or to the degree that it is untrustworthy, issues that present obstacles to the fastest, least expensive, most powerful testing we can do. In this strand, we also identify what we might need or recommend to make the testing better, and we may also provide a context for and an evaluation of the quality of the report itself.

Most of the time, the client of the testing will be most interested in that first strand. Sometimes the client might be more interested in one of the other two. Nonetheless, whatever form the report might take, the reporter should at least be prepared to address all three strands.

(I’ve written more about this pattern here, here, and here.)
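If it helps to see the shape of such a three-strand report, here is a minimal sketch in Python. The class, field names, and sample content are invented for illustration; they are not part of Rapid Software Testing or of any particular tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TestReport:
    """A skeleton that keeps the three strands of a test report distinct."""
    product_story: List[str] = field(default_factory=list)   # bugs, problems, and risks in the product
    testing_story: List[str] = field(default_factory=list)   # coverage, oracles, and what isn't covered yet
    quality_story: List[str] = field(default_factory=list)   # obstacles, needs, and limits of the testing

    def render(self) -> str:
        sections = [
            ("Product status", self.product_story),
            ("How we tested (coverage and oracles)", self.testing_story),
            ("How good the testing was (issues and needs)", self.quality_story),
        ]
        lines = []
        for title, items in sections:
            lines.append(f"== {title} ==")
            lines.extend(f"- {item}" for item in items or ["(nothing to report yet)"])
        return "\n".join(lines)

# Invented example content, just to show how the strands stay separate:
report = TestReport(
    product_story=["Reassigning an open task silently fails; risk of lost updates."],
    testing_story=["Covered task updates from the manager role; reporting module not yet covered."],
    quality_story=["Developers working in the test environment cost about 20 minutes of setup time."],
)
print(report.render())
```

Whatever the medium, the point is that each strand gets addressed; the particular rendering matters much less.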

Credibility
If you’re not credible, your reports won’t be taken seriously. In your reporting, you may be delivering surprising or uncomfortable information. Your clients, unconsciously or deliberately, may assume that you’re mistaken or that you’re exaggerating risks, and they may try to micro-manage your reporting. Credibility is an antidote to all this.

To build and maintain credibility, it’s important to actually care about the project and the people on it. It’s important to take your work and your skills seriously, and to demonstrate that seriousness in your attitude, commitments, and behaviour. There will be more to say about this later, but for now…

  • Actually know how to do your job.
  • Gain experience with the product.
  • Study the technology in and around your project.
  • Read all of the relevant requirement, specification, and standards documents carefully, especially when you’re in a regulated environment.
  • Take notes diligently on your own work to inform your reporting.
  • Sweat the details in your own work.
  • Find things to appreciate about the work of others.
  • Acknowledge mistakes, correct them and learn from them.
  • Do not tell lies or exaggerate.

Examples
Note that Part 7 of this series included a number of test reports delivered verbally. Here I’m providing examples of test report documents.

As you survey them, you might want to consider the context for which they’re intended; the reporting levels that they focus on (product, testing, or quality-of-testing); the evidence or references included to support the report; and what the report might need or could leave out.

Note that while a couple of reports refer to specific things to be checked, there is rarely even a mention of test cases. The focus, instead, is usually on bugs or potential problems in the product that represent risk to the value of the product, and therefore risk to the business.

Spot Check Test Report

(PDF: mpim-report.pdf)


Here is an example of a real, comprehensive, professional test report, prepared by James Bach and edited by me. Over five pages, it describes a paired exploratory testing session that found problems in a real medical device. (The names, nouns and verbs have been changed to shield the identity of the company and the product.)

Cheese Grater Incident Report

(PDF: cheesegrater.pdf)


This is two reports in one: a whimsical yet serious report on repairing a broken Parmesan cheese dispenser; and a much longer, detailed set of notes on how to perform an investigation and report on it. Indeed, the latter section is a really worthwhile complement to this blog post.

OEW Case Tool

(PDF: OEWCaseToolReport.pdf)


An example of a two-page summary report (from 1994!) about a computer-aided software engineering (CASE) tool at Borland.

Y2K Compliance Report

(PDF: Y2KComplianceReport.pdf)


An eight-page report prepared for compliance with Y2K requirements, including notes on strategy; the test approaches that were applied (and risks that prompted those approaches); the results; and a list of specific items that needed to be checked.

OWL Quality Plan

(PDF: OWLQualityPlan.pdf)


This is a report on proposed plans for testing another Borland product, the Object Windows Library. The report includes a table linking product risks to testing work necessary to investigate those risks. It also includes a listing of components and sub-components in the product.

An Exploratory Tester’s Notebook

(PDF: etnotebook.pdf)


This paper on recording and reporting includes a report on my spontaneous investigation of an in-flight entertainment system, and a couple of session-based test management session sheets.

A Sticky Situation

(PDF: 2012-02-AStickySituation.pdf)


This is an example of a form of reporting that’s sometimes called an “information radiator”. It visualizes the status of a test project (and some degree of test coverage) using sticky notes.

The Low-Tech Testing Dashboard

(PDF: dashboard.pdf)


Of this, James Bach says, “Back in 1997, I was challenged by top management to create a way to convey testing status at a glance. Thus was born the “low-tech testing dashboard”, which has since been rendered in various electronic, distributed forms. The important thing about the dashboard is that there are no “measurements.” We don’t count anything. Instead there are assessments. These are subjective, yes, but always grounded in evidence.”

Who Killed My Battery?

(PDF: boneh-www2012.pdf)


A splendid research paper on what drains mobile phone batteries… and why. Also a presentation on YouTube: https://www.youtube.com/watch?v=_uv057DP2Vs

Once again, these reports don’t focus on test cases, but on testing. They’re examples of powerful and reasonable test reports that offer an alternative to management that is fixated on test cases.

Managers are more likely to relax their obsession with test cases when we provide them with reports that tell the product and testing stories.

Breaking the Test Case Addiction (Part 8)

Monday, December 9th, 2019

Throughout this series, we’ve been looking at an alternative to artifact-based approaches to performing and accounting for testing: an activity-based approach.

Frieda, my coaching client, and I had been discussing how to manage testing without dependence on formalized, scripted, procedural test cases. Part of any approach to making work accountable is communication between a manager or test lead and the person who has done the work. In session-based test management, one part of this communication is a conversation that we call a debrief, and that’s what we talked about last time.

One of the important elements of a debrief is accounting for the time spent on the work. And that’s why one of the most important questions in the debrief is What did you spend your time doing in this session?

“Ummm… That would be ‘testing’, presumably, wouldn’t it?” Frieda asked.

“Well,” I replied, “there’s testing, and then there’s other work that happens in the session. And there are pretty much inevitably interruptions of some kind.”

“For sure,” Frieda agreed. “I’m getting interrupted every day, all the time: instant messages, phone calls, other testers asking me for help, programmers claiming they can’t reproduce the bug on their machines…”

“Interruptions are a Thing, for sure,” I said. “Let’s talk about those in a bit. First, though, let’s consider what you’d be doing during a testing session in which you weren’t interrupted. Or if we didn’t talk about the interruptions, for a moment. What would you be doing?”

“Testing. Performing tests. Looking for bugs,” said Frieda.

“Right. Can you go deeper? More specific?”

“OK. I’d be learning about the product, exercising test conditions, increasing test coverage. I’d be keeping notes. If I were making a mind map, I’d be adding to it, filling in the empty areas where I hadn’t been before. Each bit of testing I performed would add to coverage.”

“‘Each bit of testing,’” I repeated. “All right; let’s imagine that you set up a 90-minute session where you could be uninterrupted. Lock the office door…”

“…the one that I don’t have…”, Frieda said.

“Natch. It’s cubicle-land where you work. But let’s say you put up a sign that said “Do not disturb! Testing is in Session!” Set the phone to Send Calls, shut off Slack and Skype and iMessage and what-all… In that session, let’s just say that you could do a bunch of two-minute tests, and with each one of those tests, you could learn something specific about the product.”

“That’s not how testing really works! That sounds like… test cases!” Frieda said.

“I know,” I grinned. “You’re right. I agree. But let’s suspend that objection for a bit while we work through this. Imagine that 90-minute session rendered as a nine-by-five table of 45 little microbursts of test activity. The kind of manager that you’ve been role-playing here thinks this will happen.”

A Manager's Fantasy of an Ideal Test Session

Frieda chuckled. “Manager’s Fantasy Edition. That’s about right.”

“Indeed,” I said. “But why?”

“Well, obviously, when I’m testing, I find bugs. When I do, I start investigating. I start figuring out how to reproduce the bug, so I can write it up. And then I write it up.”

“Right,” I said. “But even though it’s part of testing, it’s got a different flavour than the learning-focused stuff, doesn’t it?”

“Definitely,” said Frieda. “When I find a bug, I’m not covering new territory. It’s like I’m not adding to the map I’m making of the product. It’s more like I’m staying in the same place for a while while I investigate.”

“Is that a good thing to do?”

“Well…, yes,” Frieda replied. “Obviously. Investigating bugs is a big part of my job.”

“Right. And it takes time. How much?”

“Well,” Frieda began, “A lot of the time I repeat the test to make sure I’m really seeing a bug. Then I try to find out how to reproduce it reliably, in some minimum set of steps, or with some particular data. Sometimes I try some variations to see if I can find other problems around that problem. Then I’ve got to turn all that into a bug report, and log it in the tracking system. Even if I don’t write it up formally, I have to talk to the developer about it.”

“So, quite a bit of time,” I said.

“Yep,” she said. “And another thing: some bugs block me and prevent me from getting to part of the product I want to test. Trying to work around the blockers takes time too. So… like I said, while I’m doing all those things, I’m not covering new ground. It’s like being stuck in the mud on a flooded road.”

“If I were your manager, and if I were concerned about your productivity, I’d want to know about stuff like that,” I said. “That’s why, in session-based test management, we keep track of several kinds of testing time. Let’s start with two: test design and execution, in which we’re performing tests, learning about the product, gaining a better understanding of it. Of course, our focus is on activity that will either find a bug, or help us to find a bug. We call that T-time, for short, and distinguish it from bug investigation and reporting—B-time—which includes the stuff that you were just talking about. The key thing is that B-time interrupts T-time.”

Frieda’s brow furrowed. “Or, to put it another way, investigating bugs reduces test coverage.”

“Yes. And when it does, it’s important for managers to know about it. As a manager, I don’t want to be fooled about coverage—that is, how much of the product we’ve examined with respect to some model.

“You start a session with a charter that’s intended to cover something we want to know about. In a 90-minute session, it’s one thing if a tester spends 80 minutes covering some product area with testing and only ten minutes investigating bugs. It’s a completely different thing if the tester spends 80 minutes investigating bugs, and only ten minutes on tests that produced new coverage. If you only spend ten percent of the time addressing the charter, and the rest on investigating a bug that you’ve found, I’d hope you’d report that you hadn’t accomplished your charter.”

“Wait… what if I were nervous about that?” Frieda asked. “Doesn’t it look bad if I haven’t achieved the goal for the session?”

“Not necessarily,” I replied. “We can have the best of intentions and aspirations for a session before it starts. But the product is what it is, and whatever happens, happens. Whatever the charter suggests, there’s an overarching mission for every session: investigate the product and report on the problems in it. If you’re having to report lots of bugs because they’re there, and you’re doing it efficiently, that shouldn’t be held against you. Testers don’t put the bugs in. If there are problems to report, that takes time, and that’s going to reduce coverage time. If you’re finding and investigating a lot of bugs, there’s no shame in not covering what we might hope you’d cover. Plus, bug investigation helps the developers to understand what they’re dealing with, so that’s a service to the team.”

Frieda looked concerned. “Not very many managers I’ve worked with would understand that. They’d just say, ‘Finish the test cases!’ and be done with it.”

“That can be an issue, for sure. But a key part of testing work these days is to help managers to learn how to become good clients for testing. That sometimes means spelling out certain things explicitly. For instance: if you find a ton of bugs during a session, that’s bad enough, in that you’ve got a lot less than a session’s worth of test coverage. But there’s something that might be even worse on top of that: you have found only the shallowest bugs. By definition, the bugs you’ve found already were the easiest bugs to find. A swarm of shallow bugs is often associated with an infestation of deeper bugs.”

“So, in that situation, I’m going to need a few more sessions to obtain the coverage we intended to achieve with the first one,” said Frieda.

“Right. And if you’re concerned about risk, you may want to charter more, deeper testing sessions, because—again, by definition—deeper bugs are harder to find.”

Frieda paused. “You said there were several kinds of testing time. You mentioned T-time and B-time. That’s only two.”

“Yes. At the very least, there’s also Setup time, S-time. While you’re setting up for a test, you aren’t obtaining coverage, and you’re not investigating or reporting a bug. Actually, setting up is only one of the things covered by our notion of “Setup”. S-time is a kind of catch-all for time within the session in which you couldn’t have found a bug. Maybe you’re configuring the product or some tool; maybe you’re resetting the system after a problem; maybe you’re tidying up your notes.”

“Or reading about the product? Or talking with somebody about it?”, Frieda asked.

“Right. Anything that’s necessary to get the work done, but that isn’t T-time or B-time. So instead of that Manager’s Fantasy Version of the session, a real session often looks like this:”

A More Plausible Test Session

“Or even this.”

A Common Test Session

“Wow,” said Frieda. “I mean, that second one is totally realistic to me. And look at how little gets covered, and how much doesn’t get covered.”

“Yeah. When we visualize it like this, it makes an impression, doesn’t it? Trouble is, not very many testers help managers connect those dots. As you said, if you want to achieve the coverage that the manager hoped for in the Fantasy Edition, this helps to show that you’ll need something like four sessions to get it, not just one. Plus the bugs that you’ve found in that one session are by definition the shallowest bugs, the ones closest to the surface. Hidden, rare, subtle, intermittent, emergent bugs… they’re deeper.”
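To put a little arithmetic behind those pictures, here is a minimal sketch in Python. The bursts, minutes, and labels are invented for illustration, and this isn’t part of any session-based test management tool; it simply tallies T-time, B-time, and Setup time and shows what they imply for coverage.

```python
# Invented example: how one 90-minute session's time might break down.

session_minutes = 90

# Each burst of activity in the session, labelled T (test design and execution),
# B (bug investigation and reporting), or S (setup and other non-coverage work).
bursts = [
    ("S", 20),  # resetting a test environment that someone else had disturbed
    ("T", 10),
    ("B", 25),  # investigating and reporting a blocking bug
    ("T", 15),
    ("B", 15),  # another bug: repeating it, narrowing it down, writing it up
    ("T", 5),
]

totals = {"T": 0, "B": 0, "S": 0}
for kind, minutes in bursts:
    totals[kind] += minutes

assert sum(totals.values()) == session_minutes  # the whole session is accounted for

labels = {
    "T": "Test design and execution",
    "B": "Bug investigation and reporting",
    "S": "Setup and other work",
}
for kind, label in labels.items():
    print(f"{label}: {totals[kind]} minutes ({totals[kind] / session_minutes:.0%})")

# Only T-time produced new coverage.
sessions_needed = session_minutes / totals["T"]
print(f"Roughly {sessions_needed:.1f} sessions like this one "
      "for one full session's worth of coverage.")
```

In this made-up session, only a third of the time produced new coverage, so it would take about three sessions like it to deliver the coverage that the Fantasy Edition promised in one.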

Frieda still had a few more questions, which we’ll get to next time.

Breaking the Test Case Addiction (Part 7)

Monday, June 10th, 2019

Throughout this series, we’ve been looking at an alternative to artifact-based approaches to testing: an activity-based approach.

In the previous post, we looked at a kind of scenario testing, using a one-page sheet to guide a tester through a session of testing. The one-pager replaces explicit, formal, procedural test cases with a theme and a set of test ideas, a set of guidelines, or a checklist. The charter helps to steer the tester to some degree, but the tester maintains agency over her work. She has substantial freedom to make her own choices from one moment to the next.

Frieda, my coaching client, anticipated what her managers would say. In our coaching session, she played the part of her boss. “With test cases,” she said, in character, “I can be sure about what has been tested. Without test cases, how will anyone know what the tester has done?”

A key first step in breaking the test case addiction is acknowledging the client’s concern. I started my reply to “the manager” carefully. “There’s certainly a reasonable basis for that question. It’s important for managers and other clients of testing to know what testing has been done, and how the testers have done it. My first step would be to ask them about those things.”

“How would that work?”, asked Frieda, still in her role. “I can’t be talking to them all the time! With test cases, I know that they’ve followed the test cases, at least. How am I supposed to trust them without test cases?”

“It seems to me that if you don’t trust them, that’s a pretty serious problem on its own—one of the first things to address if you’re a manager. And if you mistrust them, can you really trust them when they tell you that they’ve followed the test cases? And can you trust that they’ve done a good job in terms of the things that the test cases don’t mention?”

“Wait… what things?” asked “the manager” with a confused expression on her face. Frieda played the role well.

“Invisible things. Unwritten things. Most of the written test cases I’ve seen refer only to conditions or factors that can be observed or manipulated; behaviours that can be described or encoded in strings or sentences or numbers or bits. It seems to me that a test case rarely includes the motivation for the test; the intention for it; how to interpret the steps. Test cases don’t usually raise new questions, or encourage testers to look around at the sides of the path.

“Now,” I continued, “some testers deal with that stuff really well. They act on those unspoken, unwritten things as they perform the test. Other testers might follow the test case to the letter — yet not find any bugs. A tester might not even follow the test case at all, and just say that he followed it. Yet that tester might find lots of important bugs.”

“So what am I supposed to do? Watch them every minute of every day?”

“Oh, I don’t think you can do that,” I replied. “Watching everybody all the time isn’t reasonable and it isn’t sustainable. You’ve got plenty of important stuff to do, and besides, if you were watching people all the time, they wouldn’t like it any more than you would. As a manager, you must be able to give a fair degree of freedom and responsibility to your testers. You must be able to extend some degree of trust to them.”

“Why should I trust them? They miss lots of bugs!” Frieda seemed to have had a lot of experience with difficult managers.

“Do you know why they miss bugs?” I asked. “Maybe it’s not because they’re ignoring the test cases. Maybe it’s because they’re following them too closely. When you give someone very specific, formalized instructions and insist that they follow them, that’s what they’ll do. They’ll focus on following the instructions, but not on the overarching testing task, which is learning about the product and finding problems in it.”

“So how should I get them to do that?”, asked “the manager”.

“Don’t turn test cases into the mission. Make their mission learning about the product and finding problems in it.”

“But how can I trust them to do that?”

“Well,” I replied, “let’s look at other people who focus on investigation: journalists; scientific researchers; police detectives. Their jobs are to make discoveries. They don’t follow scripted procedures. No one sees that as a problem. They all work under some degree of supervision—journalists report to editors; researchers in a lab report to senior researchers and to managers; detectives report to their superiors. How do those bosses know what their people are doing?”

“I don’t know. I imagine they check in from time to time. They meet? They talk?”

“Yes. And when they do, they describe the work they’ve done, and provide evidence to back up the description.”

“A lot of the testers I work with aren’t very good at that,” said Frieda, suddenly as herself. “I worry sometimes that I’m not good at that.”

“That’s a good thing to be concerned about. As a tester, I would want to focus on that skill; the skill of telling the story of my testing. And as a manager, I’d want to prepare my testers to tell that story, and train them in how to do it any time they’re asked.”

“What would that be like?”, asked Frieda.

“It varies. It depends a lot on tacit knowledge.”

“Huh?”

“Tacit knowledge is what we know that hasn’t been made explicit—told, or written down, or diagrammed, or mapped out, or explained. It’s stuff that’s inside someone’s head; or it’s physical things that people do that have become second nature, like touch typing; or it’s cultural, social—The Way We Do Things Around Here.

“The profile of a debrief after a testing session varies pretty dramatically depending on a bunch of context factors: where we are in the project, how well the tester knows the product, and how well we know each other.

“Let me take you through one debrief. I’ll set the scene: we’re working on a product—a project management system. Karla is an experienced tester who’s been testing the product for a while. We’ve worked together for a long time too, and I know a lot about how she tests. When I debrief her, there’s a lot that goes unsaid, because I trust her to tell me what I need to know without me having to ask her too much. We both summarize. Here’s how the conversation with Karla might play out.”

Me: (scanning the session sheet) The charter was to look at task updates from the management role. Your notes look fine. How did it go?

Karla: Yeah. It’s not in bad shape. It feels okay, and I’m mostly done with it. There’s at least one concurrency problem, though. When a manager tries to reassign a task to another tester, and that task is open because the assigned tester is updating it, the reassignment doesn’t stick. It’s still assigned to the original tester, not the one the manager assigned. Seems to me that would be pretty rare, but it could happen. I logged that, and I talked about it to Ron.

Me: Anything else?

Karla: Given that bug, we might want to do another session on any kind of update. Maybe part of a session. Ron tells me async stuff in Javascript can be a bear. He’s looking into a way of handling the sequence properly, and he should have a fix by the end of the day. I wouldn’t mind using part of a session to script out some test data for that.

Me: Okay. Want to look at that tomorrow, when you look at the reporting module? And anything else I should know?

Karla: I can get to that stuff in the morning. It’d be cool to make sure the programmers aren’t mucking around in the test environment, though. That was 20 minutes of Setup.

Me: Okay, I’ll tell them to stay out.

“And that’s it,” I said.

“That’s it?”, asked Frieda. “I figured a debrief would be longer than that.”

“Oh, it could be,” I replied. “If the tester is inexperienced or new to me; if the test notes have problems; if the product or feature is new or gnarly; or if the tester found lots of bugs or ran into lots of obstacles, the debrief can take a while longer.

“When I want to co-ordinate testing work for a bunch of people, or when I anticipate that someone might want to scrutinize the work, or when I’m in a regulated environment, I might want to be extra-careful and structure the conversation more formally. I might even want to checklist the debriefing.

“No matter what, though, I have a kind of internal checklist. In broad terms, I’ve got three big questions: How’s the product? How do we know? Why should I trust what we know, and what do we need to get a better handle on things?”

“That sounds like four questions,” Frieda smiled. “But it also sounds like the three-part testing story.”

“Right you are. So when I’m asking focused questions, I’d start with the charter:

  • Did you fulfill your charter? Did you cover everything that the charter was intended to cover?
  • If you didn’t fulfill the charter, what aspects of the charter didn’t get done?
  • What else did you do, even if it was outside the scope of the mission?

“What I’m doing here is trying to figure out whether the charter was met as written, or if we need to adjust it to reflect what really happened. After we’ve established that, I’ll ask questions in three areas that overlap to some degree. I won’t necessarily ask them in any particular order, since each answer will affect my choice of the next question.”

“So a debriefing is an exploratory process too!” said Frieda.

“Absolutely!” I grinned. “I’ll tend to start by asking about the product:

  • How’s the product? What is it supposed to do? Does it do that?
  • How do you know it’s supposed to do that?
  • What did you find out or learn? In particular, what problems did you find?

“I’ll ask about the testing:

  • What happened in the course of the session?
  • What did you cover, and how did you cover it?
  • What product factors did you focus on?
  • What quality criteria were you paying the most attention to?
  • If you saw problems, how did you know that they were problems? What were your oracles?
  • Was there anything important from the charter that you didn’t cover?
  • What testing around this charter do you see as important, but has not yet been done?

“Based on things that come up in response to these questions, I’ll probably have some others:

  • What work products did you develop?
  • What evidence do you have to back the story? What makes it credible?
  • Where can people find that evidence? Why, or why not, should we hang on to it?
  • What testing activity should, or could, happen next or in the more distant future?
  • What might be necessary to enable that activity?

“That last question is about practical testability.”

“Geez, that’s a lot of questions,” said Frieda.

“I don’t necessarily ask them all every time. I usually don’t have to. I will go through a lot of them when a tester is new to this style of working, or new to me. In those cases, as a manager, I have to take more responsibility for making sure of what was tested—what we know and what we don’t. Plus these kinds of questions—and the answers—help me to figure out whether the tester is learning to be more self-guided.

“And then I’ve got three more on my list:

  • What factors might have affected the quality of the testing?
  • What got in the way, made things harder, made things slower, made the testing less valuable?
  • What ongoing problems are you having?
Frieda frowned. “A lot of the managers I’ve worked with don’t seem to want to know about the problems. They say stuff like, ‘Don’t come to me with problems; come to me with solutions.’”

I laughed. “Yeah, I’ve dealt with those kinds of managers. I usually don’t want to go to them at all. But when I do, I assure them that I’m really stuck and that I need management help to get unstuck. And I’ve often said this: ‘You probably don’t want to hear about problems; no one really does. But I think it would be worse for everyone if you didn’t know about them.’

“And that leads to one more important question:

  • What did you spend your time doing in this session?”

“Ummm… That would be ‘testing’, presumably, wouldn’t it?” Frieda asked.

“Well,” I replied, “there’s testing, and then there’s other work that happens in the session.”

We’ll talk about that next time.

I Represent the User! And We All Do

Saturday, December 15th, 2018

As a tester, I try to represent the interests of users. Saying the user, in the singular, feels like a trap to me. There are usually lots of users, and they tend to have diverse and sometimes competing interests. I’d like to represent and highlight the interests of users that might have been forgotten or overlooked.

There’s another trap, though. As Cem Kaner has pointed out, it’s worth remembering that practically everyone else on the team represents the interests of end users in some sense. “End users want this product in a timely way at a reasonable price, so let’s get things happening on schedule and on budget,” say the project managers. “End users like lots of features,” say the marketers. “End users want this specific feature right away,” say the sales people. “End users want this feature optimized like I’m making it now,” say the programmers. I’d be careful about claiming that I represent the end user—and especially insinuating that I’m the only one who does—when lots of other people can credibly make that claim.

Meanwhile, I aspire to test and find problems that threaten the value of the product for anyone who matters. That includes anyone who might have an interest in the success of the product, like managers and developers, of course. It also includes anyone whose interests might have been forgotten or neglected. Technical support people, customer service representatives, and documentors spring to mind as examples. There are others. Can you think of them? People who live in other countries or speak other languages, whether they’re end users or business partners or colleagues in remote offices, are often overlooked or ignored.

All of the people in our organization play a role in assuring quality. I can assure the quality of my own work, but not of the product overall. For that reason, it seems inappropriate to dub myself and my testing colleagues as “quality assurance”. The “quality assurance” moniker causes no end of confusion and angst. Alas, not much has changed over the last 35 years or so: no one, including most of the testing community, seems willing to call testers what they are: testers.

That’s a title I believe we should wear proudly and humbly. Proudly, because we cheerfully and diligently investigate the product, learning deeply about it where most others merely prod it a little. Humbly, because we don’t create the product, design it, code it, or fix it if it has problems. Let’s honour those who do that, and not make the over-reaching claim that we assure the quality of their work.

Very Short Blog Posts (35): Make Things Visible

Tuesday, April 24th, 2018

I hear a lot from testers who discover problems late in development, and who get grief for bringing them up.

On one level, the complaints are baseless, like holding an investigative journalist responsible for a corrupt government. On another level, there’s a way for testers to anticipate bad news and reduce the surprises. Try producing a product coverage outline and a risk list.

A product coverage outline is an artifact (a mind map, or list, or table) that identifies factors, dimensions, or elements of a product that might be relevant to testing it. Those factors might include the product’s structure, the functions it performs, the data it processes, the interfaces it provides, the platforms upon which it depends, the operations that people might perform with it, and the way the product is affected by time. (Look here for more detail.) Sketches or diagrams can help too.

As you learn more through deeper testing, add branches to the map, or create more detailed maps of particular areas. Highlight areas that have been tested so far. Use colour to indicate the testing effort that has been applied to each area—and where coverage is shallower.

A risk list is a list of bad things that might happen: Some person(s) will experience a problem with respect to something desirable that can be detected in some set of conditions because of a vulnerability in the system. Generate ideas on that, rank them, and list them.
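To make that concrete, here is a minimal sketch of what those two artifacts might contain, rendered as plain text by a few lines of Python. The product areas, depth labels, and risks are invented for illustration; a real outline would be richer, and a mind map or table might serve just as well.

```python
# Invented example data: a tiny product coverage outline with a rough depth
# rating for each area, plus a ranked risk list, printed as a plain-text radiator.

coverage_outline = {
    "Structure: client, sync service, local store": "some",
    "Functions: create / update / reassign tasks": "deep",
    "Data: large projects, non-ASCII names, imports": "shallow",
    "Interfaces: API, browser UI, mobile UI": "shallow",
    "Platforms: browsers, operating systems, offline mode": "none",
    "Operations: manager vs. tester roles, concurrent use": "shallow",
    "Time: due dates, time zones, daylight saving": "none",
}

risk_list = [  # ranked, worst first
    "Managers could lose task reassignments under concurrent updates and mis-schedule the team.",
    "Large imports might corrupt the local store, costing users their project data.",
    "Offline edits might silently fail to sync, so work appears done when it isn't.",
]

print("PRODUCT COVERAGE OUTLINE (testing depth so far)")
for area, depth in coverage_outline.items():
    print(f"  [{depth:>7}] {area}")

print()
print("RISK LIST (ranked)")
for rank, risk in enumerate(risk_list, start=1):
    print(f"  {rank}. {risk}")
```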

At the beginning of the project, or as early as possible, post your coverage outline and risk list in places where people will see and read them. Update them daily. Invite questions and conversations. This can help you change “why didn’t you find that bug?” to “why didn’t we find that bug?”

Very Short Blog Posts (33): Insufficient Information and Insufficient Time

Monday, March 19th, 2018

Here’s a question I get from testers quite a lot:

“What do I do when the developers give me something to test with insufficient information and time to test it?”

Here’s my quick answer: test it.

Here’s my practical answer: test it with whatever time and information you have available. (Testing is evaluating a product by learning about it through exploration and experimentation.) When your time is up, provide a report on what you have learned about the product, with particular focus on any problems you have found.

Identify the important risks and product factors of which you are aware, and which you have covered. (A product factor, or product element, is something that can be examined during a test, or that could influence the outcome of a test.) Identify important risks and product factors that you’re aware of and that you haven’t covered. Note the time and sources of information that you had available to you.

If part of the product or feature is obscure to you because you perceive that you have had insufficient information or time or testability to learn about it, include that in your report.

(I’ll provide a deep answer to the question eventually, too.)


How is the testing going?

Thursday, February 8th, 2018

Last week on Twitter, I posted this:

“The testing is going well.” Does this mean the product is in good shape, or that we’re obtaining good coverage, or finding lots of bugs? “The testing is going badly.” The product is in good shape? Testing is blocked? We’re noting lots of bugs erroneously?

People replied offering their interpretations. That wasn’t too surprising. Their interpretations differed; that wasn’t too surprising either. I was more surprised at how many people seemed to believe that there was a single basis on which we could say “the testing is going well” or “the testing is going badly”—along with the implicit assumption that people would automatically understand the answer.

To test is—among many other things—to construct, edit, narrate, and justify a story. Like any really good story, a testing story involves more than a single subject. An excellent, expert testing story has at least three significant elements, three plot lines that weave around each other like a braid. Miss one of those elements, and the chance of misinterpretation skyrockets. I’ve talked about this before, but it seems it’s time for a reminder.

In Rapid Software Testing, we emphasize the importance of a testing story with three strands, each of which is its own story.

We must tell a story about the product and its status. As we have tested, we have learned things about the product: what it is, what it does, how it works, how it doesn’t work, and how it might not work in ways that matter to our various clients. The overarching reason that most clients hire testers is to learn about problems that threaten the value of the product, so bugs—actual, manifest problems—tend to lead in the product story.

Risks—unobserved but potential problems—figure prominently in the product story too. From a social perspective, good news about the product is easier to deliver, and it does figure in a well-rounded report about the product’s state. But it’s the bad news—and the potential for more of it—that requires management attention.

We must tell a story about the testing. If we want management to trust us, our product story needs a warrant. Our product story becomes justified and is more trustworthy when we can describe how we configured, operated, observed, and evaluated the product. Part of this second strand of the testing story involves describing the ways in which we recognized problems; our oracles. Another part of this strand involves where we looked for problems; our coverage.

It’s important to talk about what we’ve covered with our testing. It may be far more important to talk about what we haven’t covered yet, or won’t cover at all unless something changes. Uncovered areas of the test space may conceal bugs and risks worse than any we’ve encountered so far.

Since we have limited time and resources, we must make choices about what to test. It’s our responsibility to make sure that our clients are aware of those choices, how we’re making them, and why we’re making them. We must highlight potentially important testing that hasn’t been done. When we do that, our clients can make informed decisions about the risks of leaving areas of the product untested—or provide the direction and resources to make sure that they do get tested.

We must tell a story about how good the testing is. If the second strand of the testing story supports the first, this third strand supports the second. Here it’s our job to describe why our testing is the most fabulous testing we could possibly do—or to the degree that it isn’t, why it isn’t, and what we need or recommend to make it better.

In particular, we must describe the issues that present obstacles to the fastest, least expensive, most powerful testing we can do. In the Rapid Software Testing namespace, a bug is a problem that threatens the value of the product; an issue is a problem that threatens the value of the testing. (Some people say “issue” for what we mean by “bug”, and “concern” for what we mean by “issue”. The labels don’t matter much, as long as people recognize that there may be problems that get in the way of testing, and bring them to management’s attention.)

A key element in this third strand of the testing story is testability. Anything that makes testing harder, slower, or weaker gives bugs more time and more opportunity to survive undetected. Managers need to know about problems that impede testing, and must make management decisions to address them. As testers, we’re obliged to help managers make informed decisions.

On an unhappy day, some time in the future, when a manager asks “Why didn’t you find that bug?”, I want to be able to provide a reasonable response. For one thing, it’s not only that I didn’t notice the bug; no one on the team noticed the bug. For another, I want to be able to remind the manager that, during development, we all did our best and that we collaboratively decided where to direct our attention in testing and testability. Without talking about testing-related issues during development, those decisions will be poorly informed. And if we missed bugs, I want to make sure that we learn from whatever mistakes we’ve made. Allowing issues to remain hidden might be one of those mistakes.

In my experience, testers tend to recognize the importance of the first strand—reporting on the status of the product. It’s not often that I see testers who are good at the second strand—modeling and describing their coverage. Worse, I almost never encounter test reports in which testers describe what hasn’t been covered yet or will not be covered at all; important testing not done. As for the third strand, it seems to me that testers are pretty good at reporting problems that threaten the value of the testing work to each other. They’re not so good, alas, at reporting those problems to managers. Testers also aren’t necessarily so good at connecting problems with the testing to the risk that we’ll miss important problems in the product.

Managers: when you want a report from a tester and don’t want to be misled, ask about all three parts of the story. “How’s the product doing?” “How do you know? What have you covered, and what important testing hasn’t been done yet?” “Why should we be happy with the testing work? Why should we be concerned? What’s getting in the way of your doing the best testing you possibly could? How can we make the testing go faster, more easily, more comprehensively?”

Testers: when people ask “How is the testing going?”, they may be asking about any of the three strands in the testing story. When we don’t specify what we’re talking about, and reply with vague answers like “the testing is going well”, “the testing is going badly”, the person asking may apply the answer to the status of the product, the test coverage, or the quality of the testing work. The report that they hear may not be the report that we intended to deliver. To be safe, even when you answer briefly, make sure to provide a reply that touches on all three strands of the testing story.

Deeper Testing (2): Automating the Testing

Saturday, April 22nd, 2017

Here’s an easy-to-remember little substitution that you can perform when someone suggests “automating the testing”:

“Automate the evaluation
and learning
and exploration
and experimentation
and modeling
and studying of the specs
and observation of the product
and inference-drawing
and questioning
and risk assessment
and prioritization
and coverage analysis
and pattern recognition
and decision making
and design of the test lab
and preparation of the test lab
and sensemaking
and test code development
and tool selection
and recruiting of helpers
and making test notes
and preparing simulations
and bug advocacy
and triage
and relationship building
and analyzing platform dependencies
and product configuration
and application of oracles
and spontaneous playful interaction with the product
and discovery of new information
and preparation of reports for management
and recording of problems
and investigation of problems
and working out puzzling situations
and building the test team
and analyzing competitors
and resolving conflicting information
and benchmarking…”

And you can add things to this list too. Okay, so maybe it’s not so easy to remember. But that’s what it would mean to automate the testing.

Use tools? Absolutely! Tools are hugely important to amplify and extend and accelerate certain tasks within testing. We can talk about using tools in testing in powerful ways for specific purposes, including automated (or “programmed“) checking. Speaking more precisely costs very little, helps us establish our credibility, and affords deeper thinking about testing—and about how we might apply tools thoughtfully to testing work.

Just like research, design, programming, and management, testing can’t be automated. Trouble arises when we talk about “automated testing”: people who have not yet thought about testing too deeply (particularly naïve managers) might sustain the belief that testing can be automated. So let’s be helpful and careful not to enable that belief.

What Is A Tester?

Thursday, June 25th, 2015

A junior tester relates some of the issues she’s encountering in describing her work.

To the people who think she “just breaks stuff all day”, here’s what I might reply:

It’s not that I don’t just break stuff; I don’t break stuff at all. The stuff that I’ve been given to test is what it is; if it’s broken, it was broken when I got it. If I break anything, consider what my colleague James Bach says: I break dreams; I break the illusion that the software is doing what people want.

And when somebody doesn’t understand what a tester does, these are some of the metaphors upon which I can start a conversation. These are some things that, in my testing work, I am or that I aspire to be.

I’m a research scientist. My field of study is a product that’s in development. I research the product and everything around it to discover things that no one else has noticed so far. An important focus of my research is potential problems that threaten the value of the product. Other people—builders and managers—may know an immense amount about the product, but the majority of their attention is necessarily directed towards trying to make things work, and satisfaction about things that appear to work already. As a scientist, I’m attempting to falsify the theory that everything is okay with the product. So I study the technologies on which the product is built. I model the tasks and the problem space that the product is intended to address. I analyze each feature in the product, looking for problems in the way it was designed. I experiment with each part of the product, trying to disprove the theory that it will behave reasonably no matter what people throw at it. I recognize the difference between an experiment (investigating whether something works) and a demonstration (showing that something can work).

I’m an explorer. I start with a fuzzy idea of the product, and a large, empty notebook. I treat the product as a set of territories to be investigated, a country or city or landscape to learn about. I move through the space, sometimes following a safe route, and sometimes deviating from the usual path, and sometimes going to extremes. I might follow some of the same paths over and over again, but when I really want to learn about the territory, I turn off the marked roads, bushwhacking, branching and backtracking, getting lost sometimes, but always trying to see the landscape from new angles. I observe and reflect as I go. In my notebook I create pages of maps, diagrams, lists, journal entries, tables, photos, procedures. Mind you, I know that the book is only a pale representation of what I’ve seen and what I’ve learned, no matter how much I write and illustrate. I also know that many of the pages in the book are for myself, and that I’ll only show a few pages to others. The notebook is not the story of my exploration; it helps me tell the story of my exploration. (Here’s some more on notebooks.)

I’m a social scientist. I’m a sociologist and anthropologist, studying how people live and work; how they organize and interact; how things happen in their culture; and how the product will help them get things done. That’s because a product is not merely machinery and some code to make it work. A product fits into society, to fulfill a social purpose of some kind, and humans must repair the differences between what machines and humans can do. Thus testing requires a complex social judgement—which is much more than a matter of making sure that the wheels spin right. (I am indebted to Harry Collins for putting this idea so clearly.) What I’m doing has hard-science elements (just as anthropology has a strong biological component), but social sciences don’t always return hard answers. Instead, they provide “partial answers that might be useful”. (I am indebted to Cem Kaner for putting this idea so clearly.) As a social scientist, I strive to become aware of my biases so that I can manage them, thereby addressing certain threats to the validity of my research. So, I use and interact with the product in ways that represent actual customers’ behaviour, to discover problems that I and everyone else might have missed otherwise. I gather facts about the product; how it fits into the tasks that users perform with it, and how people might have to adapt themselves to handle the things that the product doesn’t do so well.

I’m a tool user. I’m always interacting with hardware, software, and other contrivances that help me to get things done. I use tools as media in the McLuhan sense: tools extend, enhance, intensify, enable, accelerate, amplify my capabilities. Tools can help me set systems up, generate data, and see things that might otherwise be harder to see. Tools can help me to sort and search through data. Tools can help me to produce results that I can compare to my product’s results. Tools can check to help me see what’s there and what might be missing. Tools can help me to feed input to the product, to control it, and to observe its output. Tools can help me with record-keeping and reporting. Sometimes the tools I’ve got aren’t up to the task at hand, so I use tools to help me build tools—whereupon I am also a tool builder. I’m aware of another aspect of McLuhan’s ideas about media: when extended beyond their original or intended capacity, tools reverse into producing the opposite of their original or intended effects.

I’m a critic. Like my favourite film critics, I study the work and how it might appeal—or not—to the audience for which it is intended. I study the technical aspects of the product, just as a film critic looks at lighting, framing of the shots, and other aspects of cinematography; at sound; at editing; at story construction; and so forth. I study culture and history—I study the culture and history of software—as a critic studies those of film—and of societies generally—to evaluate how well the product (story) fits in relation to its culture and its period and the genre in which the work fits. I might like the work or not, but as a critic, my personal preferences aren’t as important as analyzing the work on behalf of an audience. To do this well, I must recognize my preferences and my biases, and manage them. I fit all those things and more into an account that helps a potential audience decide whether they’ll like it or not. (A key difference is that the reader of my review is not the audience of a finished product; my review is for the cast, crew, and producers as the product is being built.)

I’m an investigative reporter. My beat is the product and everything and everyone around it. I ask the who, what, where, when, why, and how questions that reporters ask, and I’m continually figuring out and refining the next set of questions I need to ask. I’m interacting with the product myself, to learn all I can about it. I’m interviewing people who are asking for it, the people who are building it, and other people who might use it. I’m telling a story about what I discover, one that leads with a headline, begins with a summary overview and delves into more detail. My story might be illustrated with charts, tables, and pictures. My story is truthful, but I realize the existence of different truths for different people, so I’m also prepared to bring several perspectives to the story.

There are other metaphors, of course. These are the prominent ones for me. What other ones can you see in your own work?

Very Short Blog Posts (24): You Are Not a Bureaucrat

Saturday, February 7th, 2015

Here’s a pattern I see fairly often at the end of bug reports:

Expected: “Total” field should update and display correct result.
Actual: “Total” field updates and displays incorrect result.

Come on. When you write a report like that, can you blame people for thinking you’re a little slow? Or that you’re a bureaucrat, and that testing work is mindless paperwork and form-filling? Or perhaps that you’re being condescending?

It is absolutely important that you describe a problem in your bug report, and how to observe that problem. In the end, a bug is an inconsistency between a desired state and an observed state; between what we want and what we’ve got. It’s very important to identify the nature of that inconsistency; oracles are our means of recognizing and describing problems. But in the relationship between your observation and the desired state, the expectation is the middleman. Your expectation is grounded in a principle based on some desirable consistency. If you need to make that principle explicit, leave out the expectation, and go directly for a good oracle instead.