DevelopsenseLogo

Braiding The Stories (Test Reporting Part 2)

We were in the middle of a testing exercise at the Amplifying Your Effectiveness conference in 2005. I was assisting James Bach in a workshop that he was leading on testing. He presented the group with a mysterious application written by James Lyndsay—an early version of one of the Black Box Test Machines. “How many test cases would you need to test this application?” he asked.

Just then Jerry Weinberg wandered into the room. “Ah! Jerry Weinberg!” said James. “One of the greatest testing experts in the world! He’ll know the answer to this one. How many test cases would you need to test this application, Jerry?”

Jerry looked at the screen for a moment. “Three,” he said, firmly and decisively.

James knew to play along. “Three?!“, he said, in a feigned combination of amazement, uncertainty, and curiosity. “How do you know it’s three? Is it really three, Jerry?”

“Yes,” said Jerry. “Three.” He paused, and then said drily, “Why? Were you expecting some other number?”

In yesterday’s post, I was harshly critical of pass vs. fail ratios, a very problematic yet startlingly common way of estimating the state of the product and the project. When I point out the mischief of pass vs. fail ratios, some people object. “In the real world,” they say, “we have to report pass vs. fail ratios to our managers, because that’s what they want.” Yet bogus reporting is antithetical to the “real world”. Pass vs. fail ratios come from the the fake world, a world where numbers have magical properties to soothe troubled and uncertain souls. Still, there’s no question that managers want something. It’s our mandate to give them something of value.

Some people say that managers want numbers because they want to know that we’re measuring. I’ve found two ways of thinking about measurement that have been very useful to me. One is the definition from Kaner and Bond’s splendid paper “Software Engineering Metrics: What Do They Measure and How Do We Know?”: “Measurement is the empirical, objective assignment of numbers, according to a rule derived from a model or theory, to attributes of objects or events with the intent of describing them.” I think that’s a superb definition of quantitative measurement, and the paper includes a set of probing questions to test the validity of a quantitative measurement. Pass vs. fail ratios fall down badly when they’re subjected to those tests.

Jerry Weinberg offers another definition of measurement that I think is more in line with what managers really want: “Measurement is the art and science of making reliable (and significant) observations.” (The main part of the definition comes from Quality Software Management, Vol. 2: First-Order Measurement; the parenthetical comes from recent correspondence over Twitter.) That’s a more general, inclusive definition. It incorporates Kaner and Bond’s notion of quantitative measurement, but it’s more welcoming to qualitative, first-order approaches. First-order measurement, as Jerry describes it, provides answers to questions like “What seems to be happening? and What should I do now?” It entails a minimum of fuss, and tends to be direct, unobtrusive, inexpensive, and qualitative, leading either to immediate action or a decision to seek more information. It’s a common, misleading, and often expensive mistake in software development to leap over first-order measurement and reporting in favour of second-order—less direct, more quantified, more abstract, and based on more elaborate and vulnerable models.

My experience, as a tester, a programmer, a program manager, and a consultant, tells me that to manage a project well, you need a good deal of immediate and significant information. “Immediate” here doesn’t only mean timely; it also means unmediated, without a bunch of stuff getting in between you and the observation. In particular, managers need to know about problems that threaten the value of the product and the on-time, successful completion of the project. That knowledge requires more than abstract data; it requires information. So, as testers, how can we inform the decision-makers? In our Rapid Software Testing class, James Bach and I have lately taken to emphasizing this: We must learn to describe and report on the product, our testing, and the quality of our testing. This involves constructing, editing, narrating, and justifying a story in three lines that weave around each other like a braid. Each line, or level, is its own story.

Level 1: Tell the product story. The product story is a qualitative report on how the product can work, how it fails, and how it might fail in ways that matter to our clients. “Working”, “failure”, and “what matters” are all qualitative evaluations. Quality is value to some person; in a business setting, quality is value to some person who matters to the business. A qualitative report about a product requires us to relate the nature of the product, the people who matter, and the presence or absence of value, risks, and problems for those people. Qualitative information makes it possible for our clients to make informed decisions about quality.

Level 2: To make the product story credible, tell the testing story. The testing story is about how we configured, operated, observed, and evaluated the product; what we actually did and what we actually saw. The testing story gives warrant to the product story; it helps our clients understand why they should believe and trust the product story we’re giving. The testing story is centred around the coverage that we obtained and the oracles that we applied. Coverage is the extent to which we’ve tested the program; it’s about where we’ve looked and how we’ve looked, and it’s also about what’s uncovered—where we might not have looked yet, and where we don’t intend to look. Oracles are central to evaluation; they’re the principles and mechanisms that allow us to recognize a problem. The product story will likely feature problems in the product; the testing story, where necessary, includes an account of how we knew they were problems, for whom they would be problems, and inferences about how serious the problems it might be. We can make inferences about the significance of problems, but not ultimate conclusions, since the decision of what matters and what constitutes a problem lies with the product owner. The product story and our clients’ reactions to it will influence the ongoing testing story, and vice versa.

Level 3: To make the testing story credible, tell a story about the quality of the testing. Just as the product story needs warrant, so too does the testing story. To tell a story about the quality of testing requires us to describe why the testing we’ve done has been good enough, and why the testing we haven’t done hasn’t been so important so far. The quality-of-testing story includes details on what made testing harder or slower, what made the product more or less testable, what the risks and costs of testing are, and what we might need or recommend in order to provide better, more accurate, more timely information. The quality-of-testing story will shape and be shaped by the other two stories.

Develop skills to tell and frame stories. People sometimes justify presenting invalid numbers in lieu of stories by saying that numbers are “efficient”. I think they mean “fast”, since efficiency of communication depends not only on speed, but also on value, relevance, validity, and the level of detail your client needs. In order to frame stories appropriately and hit the right level of detail…

Don’t think data feed; think the daily news. Testing is like investigative journalism, researching and delivering stories to people. The newspaper business knows how to direct attention efficiently to the stories in which we’re interested, such that we get the level of detail that we seek. Some of those strategies include:

  • Headlines. A quick glance over each page tells us immediately what, in the editors’ judgement, are the most salient aspects of any given story. Headlines come in different sizes, relative to the editors’ assessment of the importance of the story.
  • Front page. The paper comes folded. The stories that the paper deems most important to its reader are on the front page, above the fold. Other important stories are on the front page below the fold. The page is laid out to direct our attention to what we find most relevant, and to allow us to focus and refocus on items of interest.
  • Continuation. When an entire story is too long to fit on the front page, it’s abbreviated and the story continues elsewhere. This gives the reader the option of following the story or looking at other items on the front page.
  • Coverage areas. The newspaper is organized into sections (hard news, business, sports, life and leisure, arts, real estate, cars, travel, and so forth). Each section comes with its own front page, which generally includes headlines and continuations of its own.
  • Structured storytelling. Newspaper stories tend to be organized in spiralling levels of detail, such that the story is set up to follow the inverted pyramid (the link is well worth reading). The story typically begins with the most newsworthy information, usually immediately addressing the five W questions—who, what where, why, and when, plus how—and the the story builds from there. The key is that the reader can absorb information to the level of detail she seeks, continuing to the end of the story or jumping out when she’s satisfied.
  • Identifying who is involved and who is affected. Reporters and editors contextualize their stories. Just as in testing, people are the most important element of the context. A story is far more compelling when it affects the reader or people that the reader cares about. A good story often helps to clarify why the reader should care.
  • Varying approaches to delivering information. Newspapers often use a picture to helps to illustrate or emphasize an important aspect of a story. In the business or sports sections, where quantitative data is often crucial, information may be organized in tables, or trends may be illustrated with charts. Notice that the stories—first-order reports—are always given greater prominence than the tables of stock quotes league standings, and line scores.
  • Sidebars. Some stories are illuminated by background information that might break the flow of the main story. That information is presented in parallel; in another thread, as we might say.
  • Daily (and in the world of the Web, continuous) delivery of information. My newspaper arrives at a regular time each day, a sort of daily heartbeat for the news cycle. The paper’s Web site is updated on a continuous basis. Information is available both on a supply and a demand basis; both when I expect it and when I seek it.
  • Identifiable sources. Well-researched stories gain credibility by identifying how, where, when, and from whom the information was obtained. This helps to set up degrees of trust and skepticism in the reader.

One important note: These approaches apply to more than text. Testers need to extend these patterns not only to written or mechanical forms, but to oral discourse.

I’ll have more suggestions and additional parallels between test reporting and newspapers in the next post in this series.

22 replies to “Braiding The Stories (Test Reporting Part 2)”

  1. Newspapers are so 20th century and (lamentably) dying. Is there another, more modern metaphor that could apply to test reporting?

    Michael replies: I’d say that for the purpose of this argument, newspapers are just right. They’ve done information architecture very well for generations. They’re (lamentably) dying for reasons that aren’t related to that.

    Reply
  2. Great series.

    Especially the parallels between test & news reporting.

    Would like to share a poke around that I attempted on a similar comparison between testing & investigative journalism approaches : http://thereluctanttester.wordpress.com/2011/05/13/journalism-and-testing/
    (inspired from a movie that I really liked )

    Coming back to the argument between descriptive approach versus numbers in test reporting, I agree that numbers should not be “frontal” but stories should.
    However I feel that raw test data (especially from automated checks) can not be ignored .

    I see test reporting as a layered approach with descriptive stories ( as you mentioned ) forming the top(usually a stakeholder facing layer) layer based descriptive analytical information derived from lower layers .
    Lower layers might include layers of raw numbers or pass/fail ratios,description of our test methods,tools,circumstances and people performing the testing.

    Reply
  3. I’m definitely interested to see the suggestions. In fact, I’ve got a request – would you please address how to tell stories to people who are seriously hung-up on numbers?

    Reply
  4. Good article, like it a lot.

    “we have to report pass vs. fail ratios to our managers, because that’s what they want.”

    I’d propose that this is actually not always the case. I had people trying to give me these rations because they thought I wanted them.
    After a good discussion that wasn’t a problem anymore but it showed that there seems to be an assumption that all test managers automatically want meaningless numbers. Let me tell these people, that is not the case. I reckon that previous bad experiences let to this behaviour.

    I’ll point my team to this article, thanks for putting your test approach into this fine level of detail.

    Reply
  5. I like this guide. I’d love to see an example. I work on a product that sends email newsletters, so I used it to create my test reports in a newsletter format. I keep meaning to write up a blog post about it, but here’s an example: http://cl.ly/0f2Z3t3a221Q182n1V2r

    The structure of these reports evolved over time, and I learned many of the points that you covered in your “daily news” analogy through trial and error.

    Reply
  6. What a great way to actually report on the testing and the product under test.

    This approach is something that I’m keen to implement.

    Thanks for sharing.

    Reply
  7. […] that would cause loss or harm or annoyance, loss of value, to somebody who matters. (See “Braiding the Stories (Test Reporting Part 2)” 2. “Testability” is it just a trendy word or the product attribute […]

    Reply
  8. Yes, yes. I’ve always been an advocate of providing a ‘quality assessment’ as part of the status. It really is about providing information — not just providing data. This article resonates with me as it elaborates on much of what I tell my team all the time.

    Reply
  9. That’s sounds good:) “One important note: These approaches apply to more than text. Testers need to extend these patterns not only to written or mechanical forms, but to oral discourse.”

    Reply

Leave a Comment