Blog Posts for the ‘Words and Semantics’ Category

Breaking the Test Case Addiction (Part 10)

Monday, June 8th, 2020

This post serves two purposes. It is yet another installment in The Series That Ate My Blog; and it’s a kind of personal exploration of work in progress on the Rapid Software Testing Guide to Test Reporting. Your feedback and questions on this post will help to inform the second project, so I welcome your comments.

As a tester, your mission is to evaluate the product and report on its status, typically with a special emphasis on finding problems that matter. We’ve discussed bug reporting in the Rapid Testing Guide to Making Good Bug Reports. In this installment of Breaking the Test Case Addiction, I’m describing test reporting as something that responsible testers do.

Sounds straightforward, right? But right away, I want to address the risk of misunderstanding, so let me clear up what I mean by certain terms here.

Responsible Testers
Responsible testers are people who assume the role of tester on a project, and who commit themselves to doing that job well over time. Supporting testers (which we used to call “helpers”) help the test effort temporarily or intermittently, but are not committed to the testing role. Supporting testers are generally not required to report on their testing work to the same degree as responsible testers are.

Test Project
In this post, when I say test project, I’m referring to any set of activities focused on testing of any product or service, or any part of it: a low-level unit, a function, a component, a feature, a story, a service, an entire system… A test project can contain lots of little test projects. Accordingly, depending on the level of granularity we’re referring to, a test project might happen over moments or minutes, days, weeks, or months. A report on a test project might cover similar spans of time—instants, episodes, sprints, releases…

“Test project” here could refer to something that happens outside of development. More typically, it refers to testing activity that happens inside a development project, in parallel with the other aspects of development, like design, programming, or other testing.

Product
When I say product here, I mean anything that anyone has produced that might be subject to testing. While that includes running code, “product” could include code that is not running yet; prototypes and mockups; specifications and other requirement documents; flowcharts, diagrams, or state models; user documentation; sales and marketing material; or ideas about any of those things. When we refer to testing activity pointed at things that are static, like most of the items in the preceding list, we usually call it “review”; we might also call it “performing a thought experiment”. Review is a kind of testing activity that may be closely or distantly associated with performing a test—which brings us to what we mean by “testing”.

Testing, Test Activities, and Review
When I say testing here, I am using the Rapid Software Testing definition. To us, testing is the process of evaluating a product by learning about it through experiencing, exploring, and experimenting.

Testing includes many activities: questioning, studying, modeling, operating the product, manipulating it, making inferences, analyzing risk, thinking critically, recording the process, reporting on it, etc. Testing activities also include investigating and analyzing bugs and suspicious behaviour. Testing typically includes applying tools to help with any testing activities.

A test is an instance of testing, and to perform a test means to explore, experiment with, and gain experience of a product. In general, to perform a test implies that we will operate and observe a product or its output by some means.

In review, operation of the product as such typically isn’t available. In review, though, we engage in other testing activities as mentioned above. We can’t perform experiments on the running product but, as I mentioned above, we might perform thought experiments on it, imagining interactions between the product and the people using it. Of course, a thought experiment isn’t the same as a real-world experiment; that’s a key difference between review and performing a test.

Why go on about all this? Because reporting is central to our role as testers. We test; we learn; and we report on what we’ve learned.

Are you doing testing work of any kind, or even thinking about doing testing? Then you’ve got a test project on the go, and you can report on its status, even if your report starts with “I haven’t started testing the product yet, but here are some ideas about how we might go about it.”

Report
Next, let’s unpack the idea of a report. A report is a description, explanation, or justification of something. A report is a communication, but a report is not necessarily a document.

Communicating a report might happen as conversation in a hallway, or beside a coffee machine or a water cooler; as a couple of sentences uttered at a stand-up meeting; as a quick mention of a bug in passing to a developer; as a lengthy description of the status of the product and the status of testing at a go-live meeting. A report might be conveyed in writing as a paragraph, a page, or several pages of text; as (heaven help us) a PowerPoint presentation; or as hundreds of pages in bound books, formally presented to a government or regulatory body.

We might include or refer to artifacts collected or produced during the activity that led to the report—the reporter’s raw notes, data sets, program code, design notes for the activity itself. A report might be supplemented with illustrations, charts, graphs, or diagrams, sketched on a whiteboard or formally rendered on glossy paper. Or a report might be accompanied by photographs, audio, video, mind maps, tables, and references to other artifacts.

Test Report
A test report is any description, explanation, or justification of the status of a test project.

A comprehensive test report is all of those things together.

A professional test report is one that is competently, thoughtfully, and ethically designed to serve your clients in their context. A professional test report need not be a comprehensive test report, nor vice versa.

Some might say that a test report is “just the facts”, but it isn’t; it cannot be. A test report is based on facts, but it’s a story about facts: a story framed for the person or people receiving it. Stories always emphasize some things and leave other things out. We never have all the facts, and facts are sometimes in dispute. Stories are always, to some degree, biased by the storyteller and focused by what the storyteller wants the audience to hear, to learn, and to know. Those biases can be seen as problems in the report, features of it, or both.

The audience for your test report might include insiders who are directly involved in the testing and development work; other insiders (who might be overseeing that work, or affected by it without being directly involved); or outsiders.

For now, I’m going to assume your audience is in the first two categories. On that basis, it helps to consider what the audience for a test report probably wants to know above all else.

They almost certainly don’t want to know about test case counts (although they might think they do).
They almost certainly don’t want to know about pass-fail ratios (although they might think they do).
They almost certainly don’t want to know about when the testing is going to be done (although they might think they do).

(I realize that these claims may sound strange to you. I will address these (non-)desires in a future post.)

Having been a program manager and a developer, and having worked with lots of them, I can tell you what those people almost certainly do want to know:

What is the actual status of the product? Are there problems that threaten the value of the product? Do these problems threaten the on-time, successful completion of our work?

A test report addresses those questions.

Three Aspects of Test Reporting
A good test report braids three strands of story together:

  • a story about the product and its status; what the product is, what it does, how it works, how it doesn’t work, and how it might not work in ways that matter to our various clients. This is a story about bugs, problems, and risks about the product.
  • a story about how the testing was done: how the product story was obtained; how we configured, operated, observed, and evaluated the product. A thread in this second strand of the testing story involves describing the ways in which we recognized problems; our oracles. Another thread in this strand involves where we looked for problems; our coverage. Yet another thread includes what we haven’t covered yet, or won’t cover at all unless something changes.
  • a story about the quality of the testing work: why the testing that was done can be trusted or, to the degree that it is untrustworthy, the issues that present obstacles to the fastest, least expensive, most powerful testing we can do. In this strand, we also identify what we might need or recommend to make the testing better, and we may also provide a context and an evaluation of the quality of the report itself.

Most of the time, the client of the testing will be most interested in that first strand. Sometimes the client might be more interested in one of the other two. Nonetheless, whatever form the report might take, the reporter should at least be prepared to address all three strands.

(I’ve written more about this pattern here, here, and here.)

Credibility
If you’re not credible, your reports won’t be taken seriously. In your reporting, you may be delivering surprising or uncomfortable information. Your clients, unconsciously or deliberately, may assume that you’re mistaken or that you’re exaggerating risks, and they may try to micro-manage your reporting. Credibility is an antidote to all this.

To build and maintain credibility, it’s important to actually care about the project and the people on it. It’s important to take your work and your skills seriously, and to demonstrate that seriousness in your attitude, commitments, and behaviour. There will be more to say about this later, but for now…

  • Actually know how to do your job.
  • Gain experience with the product.
  • Study the technology in and around your project.
  • Read all of the relevant requirement, specification, and standards documents carefully, especially when you’re in a regulated environment.
  • Take notes diligently on your own work to inform your reporting.
  • Sweat the details in your own work.
  • Find things to appreciate about the work of others.
  • Acknowledge mistakes, correct them and learn from them.
  • Do not tell lies or exaggerate.

Examples
Note that Part 7 of this series included a number of test reports delivered verbally. Here I’m providing examples of test report documents.

As you survey them, you might want to consider the context for which they’re intended; the reporting levels that they focus on (product, testing, or quality-of-testing); the evidence or references included to support the report; and what the report might need or could leave out.

Note that while a couple of reports refer to specific things to be checked, there is rarely even a mention of test cases. The focus, instead, is usually on bugs or potential problems in the product that represent risk to the value of the product, and therefore risk to the business.

Spot Check Test Report

Here is an example of a real, comprehensive, professional test report, prepared by James Bach and edited by me. Over five pages, it describes a paired exploratory testing session that found problems in a real medical device. (The names, nouns and verbs have been changed to shield the identity of the company and the product.)

Cheese Grater Incident Report

This is two reports in one: a whimsical yet serious report on repairing a broken Parmesan cheese dispenser; and a much longer, detailed set of notes on how to perform an investigation and report on it. Indeed, the latter section is a really worthwhile complement to this blog post.

OEW Case Tool

An example of a two-page summary report (from 1994!) about a computer-aided software engineering (CASE) tool at Borland.

Y2K Compliance Report

An eight-page report prepared for compliance with Y2K requirements, including notes on strategy; the test approaches that were applied (and risks that prompted those approaches); the results; and a list of specific items that needed to be checked.

OWL Quality Plan

This is a report on proposed plans for testing another Borland product, the Object Windows Library. The report includes a table linking product risks to testing work necessary to investigate those risks. It also includes a listing of components and sub-components in the product.

An Exploratory Tester’s Notebook

This paper on recording and reporting includes a report on my spontaneous investigation of an in-flight entertainment system, and a couple of session-based test management session sheets.

A Sticky Situation

This is an example of a form of reporting that’s sometimes called an “information radiator”. It visualizes the status of a test project (and some degree of test coverage) using sticky notes.

The Low-Tech Testing Dashboard

Of this, James Bach says, “Back in 1997, I was challenged by top management to create a way to convey testing status at a glance. Thus was born the ‘low-tech testing dashboard’ which has since been rendered in various electronic, distributed forms. The important thing about the dashboard is that there are no ‘measurements.’ We don’t count anything. Instead there are assessments. These are subjective, yes, but always grounded in evidence.”

Who Killed My Battery?

A splendid research paper on what drains mobile phone batteries… and why. Also a presentation on YouTube: https://www.youtube.com/watch?v=_uv057DP2Vs

Once again, these reports don’t focus on test cases, but on testing. They’re examples of powerful and reasonable test reports that offer an alternative to management that is fixated on test cases.

Managers are more likely to relax their obsession with test cases when we provide them with reports that tell the product and testing stories.

I Represent the User! And We All Do

Saturday, December 15th, 2018

As a tester, I try to represent the interests of users. Saying the user, in the singular, feels like a trap to me. There are usually lots of users, and they tend to have diverse and sometimes competing interests. I’d like to represent and highlight the interests of users that might have been forgotten or overlooked.

There’s another trap, though. As Cem Kaner has pointed out, it’s worth remembering that practically everyone else on the team represents the interests of end users in some sense. “End users want this product in a timely way at a reasonable price, so let’s get things happening on schedule and on budget,” say the project managers. “End users like lots of features,” say the marketers. “End users want this specific feature right away,” say the sales people. “End users want this feature optimized like I’m making it now,” say the programmers. I’d be careful about claiming that I represent the end user—and especially insinuating that I’m the only one who does—when lots of other people can credibly make that claim.

Meanwhile, I aspire to test and find problems that threaten the value of the product for anyone who matters. That includes anyone who might have an interest in the success of the product, like managers and developers, of course. It also includes anyone whose interests might have been forgotten or neglected. Technical support people, customer service representatives, and documentors spring to mind as examples. There are others. Can you think of them? People who live in other countries or speak other languages, whether they’re end users or business partners or colleagues in remote offices, are often overlooked or ignored.

All of the people in our organization play a role in assuring quality. I can assure the quality of my own work, but not of the product overall. For that reason, it seems inappropriate to dub myself and my testing colleagues as “quality assurance”. The “quality assurance” moniker causes no end of confusion and angst. Alas, not much has changed over the last 35 years or so: no one, including most of the testing community, seems willing to call testers what they are: testers.

That’s a title I believe we should wear proudly and humbly. Proudly, because we cheerfully and diligently investigate the product, learning deeply about it where most others merely prod it a little. Humbly, because we don’t create the product, design it, code it, or fix it if it has problems. Let’s honour those who do that, and not make the over-reaching claim that we assure the quality of their work.

If We Do Sanity Testing Before Release, Do We Have To Do Regression Testing?

Monday, December 3rd, 2018

Here is an edition of the reply I offered to a question that someone asked on Quora. Bear in mind that it might be a good idea to follow the links for context.

If we do sanity testing before release, do we have to do regression testing?

What if I told you Yes? What if I told you No?

Some questions shouldn’t be answered. That is: some questions shouldn’t be answered with a Yes or a No without addressing the context first. No one can give you a good answer to your question unless they know you, your product, and your project’s context.

Even after that problem is addressed, people outside your context may not know what you mean by regression testing or sanity testing, and you can’t be sure of knowing what they mean. That applies to other terms in the conversation, too; maybe they’ll talk about “manual testing”; I don’t believe there’s such a thing as “manual testing”. Maybe you agree with them now; maybe you’ll agree with me after you’ve read the linked post. Or maybe after you read this one.

Some people will suggest that regression testing and sanity testing are fundamentally different somehow; I’d contend that a sanity test may be a shallow form of regression testing, when the sanity test is what I’ve talked about here, and when regression testing is testing focused on regression- or change-related risk. In order to sort that out, you’d have to talk it through to make sure that you’re not in shallow agreement.

Nonetheless, some people will try to answer your question. To prepare you for some of those answers: it’s probably not very helpful to think about needing to do one kind of testing or the other. It’s probably more helpful to think in terms of what you and your organization want to do, and choosing what to do based on what (you believe) you know about your product, and what (you believe) you want to know about it, given the situation.

While this is not an exhaustive list, here are a few factors to consider:

  • Do you and the developers already have a lot of experience with your product?
  • Is your product being developed in a careful, disciplined way?
  • Are the developers making small, simple, incremental changes whose risks they comprehend well?
  • Is the product relatively well insulated from dependencies on platforms (hardware, operating systems, middleware, browsers…) that vary a lot?
  • Are there already plenty of unit-level checks in place, such that the developers are likely to be aware of low-level coding errors early and easily?
  • Is it unusual to do a shallow pass through the features of the product and find bugs that are sticking out like a sore thumb?
  • Do you and the developers feel like you’re working at a sustainable pace?

If the answer to all of those questions is Yes, then maybe your regression testing can afford to be more focused on deep, rare, hidden, subtle, emergent problems, which are unlikely to be revealed by a sanity test. Or maybe your product (or a given feature, or a given change, or whatever you’re focused on) entails relatively low risk, so deep regression testing isn’t necessary and a sanity test will do. Or maybe your product is poorly-understood and has changed a lot, so both sanity checking and deep regression testing could be important to you.

I can offer things for you to think about, but I don’t think it’s appropriate for me or anyone else to answer your question for you. The good news is that if you study testing seriously, practice testing, and learn to test, you’ll be able to make this determination in collaboration with your team, and answer the question for yourself.

James Bach and I teach Rapid Software Testing to help people to become smart, powerful, helpful, independent testers, with agency over their work. If you want help with learning about Rapid Software Testing for yourself or for your team, find out how you can attend a public class, live or on line, or request one in-house.

Deeper Testing (2): Automating the Testing

Saturday, April 22nd, 2017

Here’s an easy-to-remember little substitution that you can perform when someone suggests “automating the testing”:

“Automate the evaluation
and learning
and exploration
and experimentation
and modeling
and studying of the specs
and observation of the product
and inference-drawing
and questioning
and risk assessment
and prioritization
and coverage analysis
and pattern recognition
and decision making
and design of the test lab
and preparation of the test lab
and sensemaking
and test code development
and tool selection
and recruiting of helpers
and making test notes
and preparing simulations
and bug advocacy
and triage
and relationship building
and analyzing platform dependencies
and product configuration
and application of oracles
and spontaneous playful interaction with the product
and discovery of new information
and preparation of reports for management
and recording of problems
and investigation of problems
and working out puzzling situations
and building the test team
and analyzing competitors
and resolving conflicting information
and benchmarking…”

And you can add things to this list too. Okay, so maybe it’s not so easy to remember. But that’s what it would mean to automate the testing.

Use tools? Absolutely! Tools are hugely important to amplify and extend and accelerate certain tasks within testing. We can talk about using tools in testing in powerful ways for specific purposes, including automated (or “programmed”) checking. Speaking more precisely costs very little, helps us establish our credibility, and affords deeper thinking about testing, and about how we might apply tools thoughtfully to testing work.

Just like research, design, programming, and management, testing can’t be automated. Trouble arises when we talk about “automated testing”: people who have not yet thought about testing too deeply (particularly naïve managers) might sustain the belief that testing can be automated. So let’s be helpful and careful not to enable that belief.

A Context-Driven Approach to Automation in Testing

Sunday, January 31st, 2016

(We interrupt the previously-scheduled—and long—series on oracles for a public service announcement.)

Over the last year James Bach and I have been refining our ideas about the relationships between testing and tools in Rapid Software Testing. The result is this paper. It’s not a short piece, because it’s not a light subject. Here’s the abstract:

There are many wonderful ways tools can be used to help software testing. Yet, all across industry, tools are poorly applied, which adds terrible waste, confusion, and pain to what is already a hard problem. Why is this so? What can be done? We think the basic problem is a shallow, narrow, and ritualistic approach to tool use. This is encouraged by the pandemic, rarely examined, and absolutely false belief that testing is a mechanical, repetitive process.

Good testing, like programming, is instead a challenging intellectual process. Tool use in testing must therefore be mediated by people who understand the complexities of tools and of tests. This is as true for testing as for development, or indeed as it is for any skilled occupation from carpentry to medicine.

You can find the article here. Enjoy!

What Is A Tester?

Thursday, June 25th, 2015

A junior tester relates some of the issues she’s encountering in describing her work.

To the people who think she “just breaks stuff all day”, here’s what I might reply:

It’s not that I don’t just break stuff; I don’t break stuff at all. The stuff that I’ve been given to test is what it is; if it’s broken, it was broken when I got it. If I break anything, consider what my colleague James Bach says: I break dreams; I break the illusion that the software is doing what people want.

And when somebody doesn’t understand what a tester does, these are some of the metaphors upon which I can start a conversation. These are some things that, in my testing work, I am or that I aspire to be.

I’m a research scientist. My field of study is a product that’s in development. I research the product and everything around it to discover things that no one else has noticed so far. An important focus of my research is potential problems that threaten the value of the product. Other people—builders and managers—may know an immense amount about the product, but the majority of their attention is necessarily directed towards trying to make things work, and satisfaction about things that appear to work already. As a scientist, I’m attempting to falsify the theory that everything is okay with the product. So I study the technologies on which the product is built. I model the tasks and the problem space that the product is intended to address. I analyze each feature in the product, looking for problems in the way it was designed. I experiment with each part of the product, trying to disprove the theory that it will behave reasonably no matter what people throw at it. I recognize the difference between an experiment (investigating whether something works) and a demonstration (showing that something can work).

I’m an explorer. I start with a fuzzy idea of the product, and a large, empty notebook. I treat the product as a set of territories to be investigated, a country or city or landscape to learn about. I move through the space, sometimes following a safe route, and sometimes deviating from the usual path, and sometimes going to extremes. I might follow some of the same paths over and over again, but when I really want to learn about the territory, I turn off the marked roads, bushwhacking, branching and backtracking, getting lost sometimes, but always trying to see the landscape from new angles. I observe and reflect as I go. In my notebook I create pages of maps, diagrams, lists, journal entries, tables, photos, procedures. Mind you, I know that the book is only a pale representation of what I’ve seen and what I’ve learned, no matter how much I write and illustrate. I also know that many of the pages in the book are for myself, and that I’ll only show a few pages to others. The notebook is not the story of my exploration; it helps me tell the story of my exploration. (Here’s some more on notebooks.)

I’m a social scientist. I’m a sociologist and anthropologist, studying how people live and work; how they organize and interact; how things happen in their culture; and how the product will help them get things done. That’s because a product is not merely machinery and some code to make it work. A product fits into society, to fulfill a social purpose of some kind, and humans must repair the differences between what machines and humans can do. Thus testing requires a complex social judgement—which is much more than a matter of making sure that the wheels spin right. (I am indebted to Harry Collins for putting this idea so clearly.) What I’m doing has hard-science elements (just as anthropology has a strong biological component), but social sciences don’t always return hard answers. Instead, they provide “partial answers that might be useful”. (I am indebted to Cem Kaner for putting this idea so clearly.) As a social scientist, I strive to become aware of my biases so that I can manage them, thereby addressing certain threats to the validity of my research. So, I use and interact with the product in ways that represent actual customers’ behaviour, to discover problems that I and everyone else might have missed otherwise. I gather facts about the product; how it fits into the tasks that users perform with it, and how people might have to adapt themselves to handle the things that the product doesn’t do so well.

I’m a tool user. I’m always interacting with hardware, software, and other contrivances that help me to get things done. I use tools as media in the McLuhan sense: tools extend, enhance, intensify, enable, accelerate, amplify my capabilities. Tools can help me set systems up, generate data, and see things that might otherwise be harder to see. Tools can help me to sort and search through data. Tools can help me to produce results that I can compare to my product’s results. Tools can check to help me see what’s there and what might be missing. Tools can help me to feed input to the product, to control it, and to observe its output. Tools can help me with record-keeping and reporting. Sometimes the tools I’ve got aren’t up to the task at hand, so I use tools to help me build tools—whereupon I am also a tool builder. I’m aware of another aspect of McLuhan’s ideas about media: when extended beyond their original or intended capacity, tools reverse into producing the opposite of their original or intended effects.

I’m a critic. Like my favourite film critics, I study the work and how it might appeal—or not—to the audience for which it is intended. I study the technical aspects of the product, just as a film critic looks at lighting, framing of the shots, and other aspects of cinematography; at sound; at editing; at story construction; and so forth. I study culture and history—I study the culture and history of software—as a critic studies those of film—and of societies generally—to evaluate how well the product (story) fits in relation to its culture and its period and the genre in which the work fits. I might like the work or not, but as a critic, my personal preferences aren’t as important as analyzing the work on behalf of an audience. To do this well, I must recognize my preferences and my biases, and manage them. I fit all those things and more into an account that helps a potential audience decide whether they’ll like it or not. (A key difference is that the reader of my review is not the audience of a finished product; my review is for the cast, crew, and producers as the product is being built.)

I’m an investigative reporter. My beat is the product and everything and everyone around it. I ask the who, what, where, when, why, and how questions that reporters ask, and I’m continually figuring out and refining the next set of questions I need to ask. I’m interacting with the product myself, to learn all I can about it. I’m interviewing people who are asking for it, the people who are building it, and other people who might use it. I’m telling a story about what I discover, one that leads with a headline, begins with a summary overview and delves into more detail. My story might be illustrated with charts, tables, and pictures. My story is truthful, but I realize the existence of different truths for different people, so I’m also prepared to bring several perspectives to the story.

There are other metaphors, of course. These are the prominent ones for me. What other ones can you see in your own work?

On a Role

Monday, June 15th, 2015

This article was originally published in the February 2015 edition of Testing Trapeze, an excellent online testing magazine produced by our testing friends in New Zealand. There are small edits here from the version I submitted.

Once upon a time, before I was a tester, I worked in theatre. Throughout my career, I took on many roles—but maybe not in the way you’d immediately expect.

In my early days, I was a performer, acting in roles in the sense that springs to mind for most people when they think of theatre: characters in a play. Most of the time, though, I was in the role of a stage manager, which is a little like being a program manager in a software development group. Sometimes my role was that of a lighting designer, sound engineer, or stagehand. I worked in the wardrobe of the Toronto production of CATS for six months, too.

Recent discussions about software development have prompted me to think about the role of roles in our work, and in work generally. For example, in a typical theatre piece, an actor performs in three different roles at once. Here, I’ll classify them…

a first-order role, in which a person is a member of the theatre company throughout the rehearsal period and run of the play. If someone asks him “What are you working on these days?”, he’ll reply “I’m doing a show with the Mistytown Theatre Company.”

a second-order role that the person takes on when he arrives at the theatre, defocusing from his day-to-day role as a husband and father, and focusing his energy on being an actor, or stagehand, or lighting designer. He typically holds that second-order role over the course of the working day, and abandons it when it’s time to go home.

a third-order role that the actor performs as a specific character at some point during the show. In many cases, the actor takes on one character per performance. Occasionally an actor takes on several different characters throughout the course of the performance, playing a new third-order role from one moment to another. In an improvisational theatre company, a performer may pick up and drop third-order roles as quickly as you or I would don or doff a hat. In a more traditional style of theatre, roles are more sharply defined, and things can get confusing when actors suddenly and unexpectedly change roles mid-performance.

(I saw that happen once during my theatre career. An elderly performer took ill during the middle of the first act, and her much younger understudy stepped in for the remainder of the show. It was necessary on that occasion, of course, but the relationships between the performers were shaken up for the rest of the evening, and there was no telling what sense the audience was able to make of the sudden switch until intermission when the stage manager made an announcement.)

It’s natural and normal to deal simultaneously with roles of different orders, but it’s hard to handle two roles of the same order at exactly the same time. For example, a person may be both a member of a theatre company and a parent, but it’s not easy to supervise a child while you’re on stage in the middle of a show. In a small theatre company, the same person might hold two second-order roles—as both an actor and a costume designer, say—but in a given moment, that person is focusing on either acting or costume design, but not both at once.

People in a performer role tend not to play two different third-order roles (two different characters) at the same moment. There are rare exceptions, as in those weird Star Trek episodes or in movies like All of Me, in which one character is inhabiting the body of another. To perform successfully in two simultaneous third-order roles takes spectacular amounts of discipline and skill, and the occasions where it’s necessary to do so aren’t terribly common.

Some roles are more temporary than others. At the end of the performance, people drop their second-order roles to go home and live out their other, more long-term roles; husbands and wives, parents, daughters and sons. They may adopt other roles too: volunteer in the community soup kitchen; declarer in this hand of the bridge game; parishioners at the church; pitcher on the softball team.

Roles can be refined and redefined; in a dramatic television series, an actor performs in a third-order role in each episode, as a particular character. If it’s an interesting character, aspects of the role change and develop over time.

At the end of the run of a show, people may continue in their first-order roles with the same theatre company; they may become directors or choreographers with that company; or they may move on to another role in another company. They may take on another career altogether. Other roles evolve too, from friend to lover to spouse to parent.

In theatre, a role is an identity that a person takes to fulfill some purpose in service of the theatre company, production, or the nightly show. More generally, a role is a position or function that a person adopts and performs temporarily. A role represents a set of services offered, and often includes tacit or explicit commitments to do certain things for and with other people.

A role is a way to summarize ideas about services people offer, activities they perform, and the goals that guide them.

Now: to software. As a member of a software development team within an organization, I’m an individual contributor. In that first-order role, I’m a generalist. I’ve been a program manager, programmer, tech support person, technical writer, network administrator, phone system administrator, business owner, bookkeeper, teacher, musician… Those experiences have helped me to be aware of the diversity of roles on a project, to recognize and respect the people who perform them, and to be able to perform them effectively to some extent if necessary.

In the individual contributor role, I commit to taking on work to help the company to achieve success, just as (I hope) everyone else in the company does.

Normally I’m taking on the everyday, second-order role of a tester, just as a member of a theatre company might walk through the door in the evening as a lighting technician. By adopting the testing role, I’m declaring my commitment to specialize in providing testing services for the project.

That doesn’t limit me to testing, of course. If I’m asked, I might also do some programming or documentation work, especially in small development groups—just as an actor in a very small theatre company might help in the box office and take ticket orders from time to time. Nonetheless, my commitment and responsibility to provide testing services requires me to be very cautious about taking on things outside the testing role.

When I’m hired as a tester, my default belief is that there’s going to be more than enough testing work to do. If I’m being asked to perform in a different role such that important testing work might be neglected or compromised, I must figure out the priorities with my client.

Within my testing role, I might take on a third-order role as a responsible tester (James Bach has blogged on the role of the responsible tester) for a given project, but I might take on a variety of third-order roles as a test jumper (James has blogged about test jumpers, too).

Like parts of an outfit that I choose to wear, a role is a heuristic that can help to suggest who I am and what I do. In a hospital, the medical staff are easy to identify, wearing uniforms, lab coats, or scrubs that distinguish them from civilian life. Everyone wears badges that allow others to identify them. Surgical staff wear personalized caps—some plain and ordinary, others colourful and whimsical. Doctors often have stethoscopes stuffed into a coat pocket, and certificates from medical schools on their walls.

Yet what we might see remains a hint, not a certainty; someone dressed like a nurse may not be a nurse. The role is not a guarantee that the person is qualified to do the work, so it’s worthwhile to see if the garb is a good fit for the person wearing it.

The “team member” role is one thing; the role within the team is another. In a FIFA soccer match, the goalkeeper is dressed differently to make the distinct role—with its special responsibilities and expectations—clearly visible to everyone else, including his team members.

The goalkeeper’s role is to mind the net, not to run downfield trying to score goals. There’s no rule against a goalie trying to do what a striker does, but to do so would be disruptive to the dynamics of the team. When a goalkeeper runs downfield trying to score goals, he leaves the net unattended—and those who chose to defend the goal crease aren’t allowed to use their hands.

In well-organized, self-organized teamwork, roles help to identify whether people are in appropriate places. If I’m known as a tester on the project and I am suddenly indisposed, unavailable, or out of position, people are more likely to recognize that some of the testing work won’t get done.

Conversely, if someone else can’t fulfill their role for some reason, I’m prepared to step up and volunteer to help. Yet to be helpful, I need to coordinate consistently with the rest of the team to make sure our perceptions line up. On the one hand, I may not have noticed important and necessary work. On the other, I don’t want to inflict help on the project, nor would it be respectful or wise for me to usurp anyone else’s role.

Shifting positions to adapt to a changing situation can be a lot easier when roles help to frame where we’re coming from, where we are, and where we’re going.

A role is not a full-body tattoo, permanently inscribed on me, difficult and painful to remove. A role is not a straitjacket. I wouldn’t volunteer to wear a straitjacket, and I’ll resist if someone tries to put me into one. As Kent Beck has said, “Responsibility cannot be assigned; it can only be accepted. If someone tries to give you responsibility, only you can decide if you are responsible or if you aren’t.” (from Extreme Programming Explained: Embrace Change)

I also (metaphorically) study escape artistry in the unlikely event that someone manages to constrain me. When I adopt a role, I must do so voluntarily, understanding the commitment I’m making and believing that I can perform it well—or learn it in a hurry.

I might temporarily adopt a third-order role normally taken by someone else, but in the long run, I can’t commit to a role without full and ongoing understanding, agreement, and consent between me and my clients.

If I resist accepting a role, I don’t do so capriciously or arbitrarily, but for deeply practical reasons related to three important problems.

The Expertise Problem. I’m willing to do or to learn almost anything, but there is often work for which I may be incompetent, unprepared or underqualified. Each set of tasks in software development requires a significant and distinct set of skills which must be learned and practiced if they are to be performed expertly.

I don’t want to fool my client or my team into believing that the work will be done well until I’m capable, so I’ll push back on working in certain roles unless my client is willing to accept the attendant risks.

For example, becoming an expert programmer takes years of focused study, experience, and determination. As Collins and Evans suggest, real expertise requires not only skill, but also ongoing maintenance; immersion in a way of life. James Bach remarked to me recently, “The only reason that I’m not an expert programmer now is that I haven’t tried it. I’ve been in the software business for thirty years, and if I had focused on programming, I’d be a kick-ass programmer by now. But I chose to be a tester instead.”

I feel the same way. Programming is a valuable means to an end for me; it helps me get certain kinds of testing work done. I can be a quite capable programmer when I put my mind to it, but I find I have to do programming constantly, almost obsessively, to maintain my skills to my own standards. (These days, if I were asked to do any kind of production programming, even minor changes to the code, I would insist on both close collaboration with peers and careful review by an expert.)

I believe I can perform competently, adequately, eventually, in any role. Yet competence and adequacy aren’t enough when I aspire to achieving excellence and mastery.

At a certain point in my life, I decided to focus my time and energy on testing and the teaching of it; the testing and teaching roles are the ones that attract me most. Their skills are the ones that I am most interested in trying to master—just as others are focused on mastering programming skills.

So: roles represent a heuristic for focusing my development of expertise, and for distributing expertise around the team.

The Mindset Problem. Building a product demands a certain mindset; testing it deeply demands another. When I’m programming or writing (as I’m doing now), I tend to be in the builder’s mindset. As such, I’m at close “critical distance” to the work. I’m seeing it from the position of an insider—me—rather than as an outsider.

When I’m in the builder’s mindset, it’s relatively easy for me to perform shallow testing and spot coding errors, or spelling and grammatical mistakes—although after I’ve been looking at the work for a while, I may start to miss those as well.

In the builder’s mindset, it’s quite a bit harder for me to notice deeper structural or thematic problems, because I’ve invested time and energy in building the piece as I have, converging towards something I believe that I want. To see deeper problems, I need the greater critical distance that’s available in the tester’s mindset—what testers or editors do.

It’s not a trivial matter to switch between mindsets, especially with respect to one’s own work. Switching mindsets is not impossible, but shifting from building into good critical and analytical work is effortful and time-consuming, and messes with the flow.

One heuristic for identifying deep problems in my writing work would be to walk away from writing—from the builder’s mindset—and come back later with the tester’s mindset—just as I’ve done several times with this essay. However, the change in mindset takes time, and even after days or weeks, part of me remains in the writer’s mindset—because it’s my writing.

Similarly, a programmer in the flow of developing a product may find it disruptive—both logistically and intellectually—to switch mindsets and start looking for problems. In fact, the required effort likely explains a good deal of some programmers’ stated reluctance to do deep testing on their own.

So another useful heuristic is for the builder to show the work to other people. As they are different people, other builders naturally have critical distance, but that distance gets emphasized when they agree to take on a testing role.

I’ve done that with this article too, by enlisting helpers—other writers who adopt the roles of editors and reviewers. A reviewer might usually identify herself as a writer, just as someone in a testing role might normally identify as a programmer. Yet temporarily adopting a reviewer’s role and a testing mindset frames the approach to the task at hand—finding important problems in the work that are harder to see quickly from the builder’s mindset.

In publishing, some people by inclination, experience, training, and skills specialize in editing, rather than writing. The editing role is analogous to that of the dedicated tester—someone who remains consistently in the tester’s mindset, at even farther critical distance from the work than the builder-helpers are—more quickly and easily able to observe deep, rare, or subtle problems that builders might not notice.

The Workspace Problem. Tasks in software development may require careful preparation, ongoing design, and day-to-day, long-term maintenance of environments and tools. Different jobs require different workspaces.

Programmers, in the building role, set up their environments and tools to do development and building work most simply and efficiently. Setting up a test lab for all of its different purposes—investigation of problems from the field; testing for adaptability and platform support; benchmarking for performance—takes time and focus away from valuable development tasks. The testing role provides a heuristic for distributing and organizing the work of maintaining the test lab.

People sometimes say “on an Agile project, everybody does everything” or “there are no roles on an Agile project”. To me, that’s like saying that there is no particular focusing heuristic for the services that people offer; throwing out the baby of skill with the bathwater of overspecialization and isolation.

Indeed, “everybody doing everything” seems to run counter to another idea important to Agile development: expertise and craftsmanship. A successful team is one in which people with diversified skills, interests, temperaments, and experiences work together to produce something that they could not have produced individually.

Roles are powerful heuristics for helping to organize and structure the relationships between those people. Even though I’m willing to do anything, I can serve the project best in the testing role, just as others serve the project best in the developer role.

That’s the end of the article. However, my colleague James Bach offered these observations on roles, which were included as a sidebar to the article in the magazine.

A role is probably not:

  • a declaration of the only things you are allowed to do. (It is neither a prison cell nor a destiny from which escape is not possible.)
  • a declaration of the things that you and you only are allowed to do. (It is not a fortress that prevents entry from anyone outside.)
  • a one-size, exclusive, permanent, or generic structure.

A role is:

  • a declaration of what one can be relied upon to do; a promise to perform a service or services well. (Some of those services may be explicit; others are tacit.)
  • a unifying idea serving to focus commitment, preparation, performance, and delivery of services.
  • a heuristic for helping people manage their time on a project, and to be able to determine spontaneously who to approach, consult with, or make requests to (or sometimes avoid), in order to get things done.
  • a heuristic for fostering personal engagement and responsibility.
  • a heuristic for defining or explaining the meaning of your work.
  • a flexible and non-exclusive structure that may exist over a span of moments or years.
  • a label that represents these things.
  • a voluntary commitment.

A role may or may not be:

  • an identity
  • a component of identity.

—James Bach

Exploratory Testing 3.0

Tuesday, March 17th, 2015

This blog post was co-authored by James Bach and me. In the unlikely event that you don’t already read James’ blog, I recommend you go there now.

The summary is that we are beginning the process of deprecating the term “exploratory testing”, and replacing it with, simply, “testing”. We’re happy to receive replies either here or on James’ site.

Very Short Blog Posts (25): Testers Don’t Break the Software

Tuesday, February 17th, 2015

Plenty of testers claim that they break the software. They don’t really do that, of course. Software doesn’t break; it simply does what it has been designed and coded to do, for better or for worse. Testers investigate systems, looking at what the system does; discovering and reporting on where and how the software is broken; identifying when the system will fail under load or stress.

It might be a good idea to consider the psychological and public relations problems associated with claiming that you break the software. Programmers and managers might subconsciously harbour the idea that the software was fine until the testers broke it. The product would have shipped on time, except the testers broke it. Normal customers wouldn’t have problems with the software; it’s just that the testers broke it. There are no systemic problems in the project that lead to problems in the product; nuh-uh, the testers broke it.

As an alternative, you could simply say that you investigate the software and report on what it actually does—instead of what people hope or wish that it does. Or as my colleague James Bach puts it, “We don’t break the software. We break illusions about the software.”

Give Us Back Our Testing

Saturday, February 14th, 2015

“Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

“The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.”

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

Once upon a time, computers were used solely for computation. Humans did most of the work that preceded or followed the computation, so the scope of a computer program was limited. In the earliest days, testing a program mostly involved checking to see if the computations were being performed correctly, and that the hardware was working properly before and after the computation.

Over time, designers and programmers became more ambitious and computers became more powerful, enabling more complex and less purely numerical tasks to be encoded and delegated to the machinery. Enormous memory and blinding speed largely replaced the physical work associated with storing, retrieving, revising, and transmitting records. Computers got smaller and became more powerful and protean, used not only by mathematicians but also by scientists, business people, specialists, consumers, and kids.

Software is now used for everything from productivity to communications, control systems, games, audio playback, video displays, thermostats… Yet many of the software development community’s ideas about testing haven’t kept up. In fact, in many ways, they’ve gone backwards.

Ask people in the software business to describe what testing means to them, and many will begin to talk about test cases, and about comparing a program’s output to some predicted or expected result. Yet outside of software development, “testing” has retained its many more expansive meanings.

A teenager tests his parents’ patience. When confronted with a mysterious ailment, doctors perform diagnostic tests (often using very sophisticated tools) with open expectations and results that must be interpreted. Writers in Cook’s Illustrated magazine test techniques for roasting a turkey, and report on the different outcomes that they obtain by varying factors—flavours, colours, moisture, textures, cooking methods, cooking times… The Mythbusters, says Wikipedia, “use elements of the scientific method to test the validity of rumors, myths, movie scenes, adages, Internet videos, and news stories.”

Notice that all of these things called “testing” are focused on exploration, investigation, discovery, and learning. Yet over the last several decades, Howden’s notions of testing as checking for correctness, and of an oracle as a mechanism (or an artifact) became accepted by many people in the development and testing communities at large. Whether or not people were explicitly aware of those notions, they certainly seem tacitly to have subscribed to the idea that testing should be focused on analysis of the output, displacing those broader and deeper meanings of testing.

That idea might have been more reasonable when computers did nothing but compute. Today, computers and their software are richly intertwined with daily social life and things that we value. Yet for many in software development, “testing” has this narrow, impoverished meaning, limited to what James Bach and I call checking. Checking is a tactic of testing; the part of testing that can be encoded as algorithms and that therefore can be performed entirely by machinery. It is analogous to compiling, the part of programming that can be performed algorithmically.
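To make the idea of a check concrete, here is a minimal sketch in Python, under some loud assumptions: the function `celsius_to_fahrenheit`, the table `ORACLE_TABLE`, and the helper `run_checks` are all hypothetical names invented for illustration, and the table stands in for the kind of "hand calculated values" that Howden lists as one form of oracle. It shows only the algorithmic part: run the program over sample data, then compare the output to an expected result.

```python
# Hypothetical function standing in for the product under check.
def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32


# Hypothetical oracle: a table of hand-calculated expected values,
# keyed by input.
ORACLE_TABLE = {
    -40.0: -40.0,
    0.0: 32.0,
    100.0: 212.0,
}


def run_checks(function, oracle_table, tolerance=1e-9):
    """Apply the function to each input in the table and compare the
    output to the expected value. Returns one tuple per input:
    (input, expected, actual, passed)."""
    results = []
    for given, expected in oracle_table.items():
        actual = function(given)
        passed = abs(actual - expected) <= tolerance
        results.append((given, expected, actual, passed))
    return results


if __name__ == "__main__":
    for given, expected, actual, passed in run_checks(
        celsius_to_fahrenheit, ORACLE_TABLE
    ):
        status = "pass" if passed else "FAIL"
        print(f"{status}: input={given} expected={expected} actual={actual}")
```

Everything in this sketch can be performed by machinery, which is what makes it a check. Choosing the function, the inputs, and the expected values in the table happened before the code ran, and none of that is encoded in the algorithm itself.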

Oddly, since we started distinguishing between testing and checking, some people have claimed that we’re “redefining” testing. We disagree. We believe that we are recovering testing’s meaning, restoring it to its original, rich, investigative sense. Testing’s meaning was stolen; we’re stealing it back.