
Statistician or Journalist?

Eric Jacobson has a problem, which he thoughtfully relates on his thoughtful blog in a post called “How Can I Tell Users What Testers Did?”. In this post, I’ll try to answer his question, so you might want to read his original post for context.

I see something interesting here: Eric tells a clear story to relate to his readers some problem that he’s having with explaining his work to others who, by his account, don’t seem to understand it well. In that story, he mentions some numbers in passing. Yet the numbers that he presents are incidental to the story, not central to it. On the contrary, in fact: when he uses numbers, he’s using them as examples of how poorly numbers tell the kind of story he wants to tell. Yet he tells a fine story, don’t you think?

In the Rapid Software Testing course, we present this idea (Note to Eric: we’ve added this since you took the class): To test is to compose, edit, narrate, and justify two parallel stories. You must tell a story about the product: how it works, how it fails, and how it might not work in ways that matter to your client (and in the context of a retrospective, you might like to talk about how the product was failing and is now working). But in order to give that story its warrant, you must tell another story: you must tell a story about your testing. In a case like Eric’s, that story would take the form of a summary report focused on two things: what you want to convey to your clients, and what they want to know from you (and, ideally, those two things should be in sync with each other).

To do that, you might like to consider various structures to frame your story. Let’s start with the elements of what we (somewhat whimsically) call The Universal Test Procedure (you can find it in the course notes for the class). From a retrospective view, that would include

  • your model of the test space (that is, what was inside and outside the scope of your testing, and in particular the risks that you were trying to address)
  • the oracles that you used
  • the coverage that you obtained
  • the test techniques you applied
  • the ways in which you configured the product
  • the ways in which you operated the product
  • the ways in which you observed the product
  • the ways in which you evaluated the product
  • the heuristics by which you decided to stop testing; and
  • what you discovered and reported, and how you reported it

You might also consider the structures of exploratory testing. Even if your testing isn’t highly exploratory, a lot of the structures have parallels in scripted testing.

Jon Bach says (and I agree) that testing is journalism, so look at the way journalists structure a story: they often start with the classic inverted-pyramid lead. They might also start with a compelling anecdote as recounted in What’s Your Story, by Craig Wortmann, or Made to Stick, by Chip and Dan Heath. If you’re in the room with your clients, you can use a whiteboard talk with diagrams, as in Dan Roam’s The Back of the Napkin. At the centre of your story, you could talk about risks that you addressed with your testing; problems that you found and that got addressed; problems that you found and that didn’t get addressed; things that slowed you down as you were testing; effort that you spent in each area; coverage that you obtained. You could provide testimonials from the programmers about the most important problems you found; the assistance that you provided to them to help prevent problems; your contributions to design meetings or bug triage sessions; obstacles that you surmounted; a set of charters that you performed, and the feature areas that they covered. Again, focus on what you want to convey to your clients, and what they want to know from you.

Incidentally, the more often and the more coherently you tell your story, the less explaining you’ll have to do about the general stuff. That means keeping as close to your clients as you can, so that they can observe the story unfolding as it happens. But when you ask “What metric or easily understood information can my test team provide users, to show our contribution to the software we release?”, ask yourself this: “Am I a statistician or a journalist?”


Other resources for telling testing stories:

Thread-Based Test Management: Introducing Thread-Based Test Management, by James Bach; and A New Thread, by Jon Bach (as of this writing, this is brand new stuff)


Constructing the Quality Story (from Better Software, November 2009): Knowledge doesn’t just exist; we build it. Sometimes we disagree on what we’ve got, and sometimes we disagree on how to get it. Hard as it may be to imagine, the experimental approach itself was once controversial. What can we learn from the disputes of the past? How do we manage skepticism and trust and tell the testing story?

On Metrics:

Three Kinds of Measurement (And Two Ways to Use Them) (from Better Software, July 2009): How do we know what’s going on? We measure. Are software development and testing sciences, subject to the same kind of quantitative measurement that we use in physics? If not, what kinds of measurements should we use? How could we think more usefully about measurement to get maximum value with a minimum of fuss? One thing is for sure: we waste time and effort when we try to obtain six-decimal-place answers to whole-number questions. Unquantifiable doesn’t mean unmeasurable. We measure constantly WITHOUT resorting to numbers. Goldilocks did it.

Issues About Metrics About Bugs (from Better Software, May 2009): Managers often use metrics to help make decisions about the state of the product or the quality of the work done by the test group. Yet measurements derived from bug counts can be highly misleading because a “bug” isn’t a tangible, countable thing; it’s a label for some aspect of some relationship between some person and some product, and it’s influenced by when and how we count… and by who is doing the counting.

On Coverage:

Got You Covered (from Better Software, October 2008): Excellent testing starts by questioning the mission. So, the first step when we are seeking to evaluate or enhance the quality of our test coverage is to determine for whom we’re determining coverage, and why.

Cover or Discover (from Better Software, November 2008): Excellent testing isn’t just about covering the “map”—it’s also about exploring the territory, which is the process by which we discover things that the map doesn’t cover.

A Map By Any Other Name (from Better Software, December 2008): A mapping illustrates a relationship between two things. In testing, a map might look like a road map, but it might also look like a list, a chart, a table, or a pile of stories. We can use any of these to help us think about test coverage.

7 replies to “Statistician or Journalist?”

  1. Michael, again a nice blog post. Keep them coming!

    I have an addition to your blog post. It’s very important that a tester is aware of whom he or she is talking to. Telling a compelling story to end-users with lots of details, like which techniques you used, tends to go in one ear and out the other. Talking about techniques might not interest them. Luckily, you did add “you might like to consider various structures to frame your story” to your story. In addition, you might also tell several stories about the same testing to different listeners. One approach might be to split the audience into technical listeners (developers, application managers, etc.) and non-technical listeners (end-users, functional managers, etc.).

  2. Thanks for the ideas, Michael. As I feared, this is going to take some work on my part. This is also going to take some convincing to help the team understand why it may not be possible to communicate our work via a pie chart.

    Michael’s reply: Give them an exercise.

    • Divide them into two teams, a “management” team, and a “development” team. (Ideally, reverse their real-life roles.)
    • The “management” team assigns the “development” team a task to build something (Tinkertoys, houses of cards, paper airplanes, or toothpicks & marshmallow objects will do).
    • The “development” team builds the thing in another room.
    • The “development” team returns to the room without the thing. They report on what they’ve built—with the stipulation that they can only report quantitatively. They can use pie charts, but they can’t describe the object or tell about the process; they can’t tell about what worked and what didn’t, draw a diagram, or anything like that.
    • Have the “management” team write a description and draw a picture of what has been built. For extra points, they can also describe the problems that the “development” team faced, and either got stuck on or surmounted.
    • Then have the “development” team reveal what has been built, and tell the story about what happened, how they built it, the problems they had, and so forth.
    • Debrief. Make sure everyone on both teams gets a good chance to speak about their experience.

    I’ll read your suggested links and possibly the books (which look fun). Maybe that will help me get my brain around this more.

    Of course, if there is no audience, I’m not sure if I will bother becoming a journalist. At least not one that is going to win a Peabody award.

  3. Love that exercise. It’s similar to an exercise Alistair Cockburn taught us: having Product Owners write specs, having developers “develop” those specs in another room within a strict time limit, and seeing what comes out. The trick is that the POs should go over into the other room. Only through communication can you get anything close – and I’m not talking about metrics.

  4. I’ll submit my least substantive complaint first.

    Hi, John… thanks for complaining. 🙂

    If I’ve read with full comprehension your instructions for performing the exercise, you’ve suggested that the Management team could, without any knowledge of the Development team’s activities, earn extra points by describing problems that Development faced in implementing Management’s requirements. Or, in other words, you are suggesting that they declare their speculations to be fact. Is it necessary to incent people to do what they already do?

    First response: Did I mention points? Extra points?

    That might sound flippant, but it’s a serious question. The purpose of this exercise is not to score points, but to allow people to see things from a different perspective. By bringing up the idea that points or scoring might be involved, you’ve helped me to recognize that when some people see exercises, they see points, or competition, or ranking as a necessary part of them. Thank you for that.

    Second response: in experiential exercises, some people need incentives to behave in a certain way, and other people don’t. Experiential exercises, to me, are highly exploratory. Here, you’ve identified a question that could come up in a debrief: how might various kinds of incentives have changed things? Not only that, but you’ve identified something that could come up in the course of the exercise itself: someone could decide to present incentives. The structure of exercises like these is intended to be very open. People will behave as they will, and that gives us a chance to observe and discuss people’s behaviour and interactions. So thank you for that too.

    Secondly, I would point out a subtle ambiguity. While I believe that you want us to regard the Development team’s work in Room #2 and the Management team’s writing and drawing in Room #1 as serial events, there is nothing inherently contradictory, in terms of your description, in having them perform their respective tasks in parallel. I raise this otherwise petty matter only because I believe you might use the alternate interpretation to demonstrate the problematic outcome of marketing-driven development.

    All ambiguities in experiential exercises are opportunities to observe how people deal with the ambiguity. You’ve described something that might indeed come to pass, so thank you for that, too.

    Without further dilation, I move to my main purpose in replying.
    The Management team is expected to write its description and draw its picture of the completed thing without seeing it. That is, they must work with reference to Development’s quantitative reports only. (Why else would the Exercise stipulate that “the development team returns to the room without the thing”?) Are you not asking them to do the impossible?

    No, I don’t think so. Note that I didn’t say an accurate picture; I said “a picture”. I think it would be relatively easy for the Managers to draw a picture of what they might expect, based on their requirements and on the quantitative data. It’s the contrast (and similarity, for that matter) between what they expect and what actually gets delivered that would be more interesting.

    For it would appear that the only people who would be able to base a description and picture of the finished thing on Development’s quantitative reports alone would be those who had never seen the requirements at all, a quality fulfilled by Sales and Marketing, perhaps, but not Management.

    For the purpose of the exercise, it’s okay that the Managers have seen the requirements; they produced them. Again, to me, one immediately interesting thing would be the likely difference between what the Managers asked for and what the Developers produced. Other interesting things? We’d only be able to find that out by doing the exercise. And I expect that it would be fairly and interestingly different each time.

    How would you have us execute this exercise in such a way that we obtain full support from the Management team?

    I’m not sure I understand. The Managers in the exercise, or the management team in the organization? Either way, full support might not be necessary. As my colleague Dale Emery would quickly point out, if anyone resists, what might we learn from resistance?

