Blog Posts for the ‘Words and Semantics’ Category

Deeper Testing (2): Automating the Testing

Saturday, April 22nd, 2017

Here’s an easy-to-remember little substitution that you can perform when someone suggests “automating the testing”:

“Automate the evaluation
and learning
and exploration
and experimentation
and modeling
and studying of the specs
and observation of the product
and inference-drawing
and questioning
and risk assessment
and prioritization
and coverage analysis
and pattern recognition
and decision making
and design of the test lab
and preparation of the test lab
and sensemaking
and test code development
and tool selection
and recruiting of helpers
and making test notes
and preparing simulations
and bug advocacy
and triage
and relationship building
and analyzing platform dependencies
and product configuration
and application of oracles
and spontaneous playful interaction with the product
and discovery of new information
and preparation of reports for management
and recording of problems
and investigation of problems
and working out puzzling situations
and building the test team
and analyzing competitors
and resolving conflicting information
and benchmarking…”

And you can add things to this list too. Okay, so maybe it’s not so easy to remember. But that’s what it would mean to automate the testing.

Use tools? Absolutely! Tools are hugely important to amplify and extend and accelerate certain tasks within testing. We can talk about using tools in testing in powerful ways for specific purposes, including automated (or “programmed“) checking. Speaking more precisely costs very little, helps us establish our credibility, and affords deeper thinking about testing—and about how we might apply tools thoughtfully to testing work.

Just like research, design, programming, and management, testing can’t be automated. Trouble arises when we talk about “automated testing”: people who have not yet thought about testing too deeply (particularly naïve managers) might sustain the belief that testing can be automated. So let’s be helpful and careful not to enable that belief.

A Context-Driven Approach to Automation in Testing

Sunday, January 31st, 2016

(We interrupt the previously-scheduled—and long—series on oracles for a public service announcement.)

Over the last year James Bach and I have been refining our ideas about the relationships between testing and tools in Rapid Software Testing. The result is this paper. It’s not a short piece, because it’s not a light subject. Here’s the abstract:

There are many wonderful ways tools can be used to help software testing. Yet, all across industry, tools are poorly applied, which adds terrible waste, confusion, and pain to what is already a hard problem. Why is this so? What can be done? We think the basic problem is a shallow, narrow, and ritualistic approach to tool use. This is encouraged by the pandemic, rarely examined, and absolutely false belief that testing is a mechanical, repetitive process.

Good testing, like programming, is instead a challenging intellectual process. Tool use in testing must therefore be mediated by people who understand the complexities of tools and of tests. This is as true for testing as for development, or indeed as it is for any skilled occupation from carpentry to medicine.

You can find the article here. Enjoy!

What Is A Tester?

Thursday, June 25th, 2015

A junior tester relates some of the issues she’s encountering in describing her work.

To the people who thinks she “just breaks stuff all day”, here’s what I might reply:

It’s not that I don’t just break stuff; I don’t break stuff at all. The stuff that I’ve given to test is what it is; if it’s broken, it was broken when I got it. If I break anything, consider what my colleague James Bach says: I break dreams; I break the illusion that the software is doing what people want.

And when somebody doesn’t understand what a tester does, these are some of the metaphors upon which I can start a conversation. These are some things that, in my testing work, I am or that I aspire to be.

I’m a research scientist. My field of study is a product that’s in development. I research the product and everything around it to discover things that no one else has noticed so far. An important focus of my research is potential problems that threaten the value of the product. Other people—builders and managers—may know an immense amount about the product, but the majority of their attention is necessarily directed towards trying to make things work, and satisfaction about things that appear to work already. As a scientist, I’m attempting to falsify the theory that everything is okay with the product. So I study the technologies on which the product is built. I model the tasks and the problem space that the product is intended to address. I analyze each feature in the product, looking for problems in the way it was designed. I experiment with each part of the product, trying to disprove the theory that it will behave reasonably no matter what people throw at it. I recognize the difference between an experiment (investigating whether something works) and a demonstration (showing that something can work).

I’m an explorer. I start with a fuzzy idea of the product, and a large, empty notebook. I treat the product as a set of territories to be investigated, a country or city or landscape to learn about. I move through the space, sometimes following a safe route, and sometimes deviating from the usual path, and sometimes going to extremes. I might follow some of the same paths over and over again, but when I really want to learn about the territory, I turn off the marked roads, bushwhacking, branching and backtracking, getting lost sometimes, but always trying to see the landscape from new angles. I observe and reflect as I go. In my notebook I create pages of maps, diagrams, lists, journal entries, tables, photos, procedures. Mind you, I know that the book is only a pale representation of what I’ve seen and what I’ve learned, no matter how much I write and illustrate. I also know that many of the pages in the book are for myself, and that I’ll only show a few pages to others. The notebook is not the story of my exploration; it helps me tell the story of my exploration. (Here’s some more on notebooks.)

I’m a social scientist. I’m a sociologist and anthropologist, studying how people live and work; how they organize and interact; how things happen in their culture; and how the product will help them get things done. That’s because a product is not merely machinery and some code to make it work. A product fits into society, to fulfill a social purpose of some kind, and humans must repair the differences between what machines and humans can do. Thus testing requires a complex social judgement—which is much more than a matter of making sure that the wheels spin right. (I am indebted to Harry Collins for putting this idea so clearly.) What I’m doing has hard-science elements (just as anthropology has a strong biological component), but social sciences don’t always return hard answers. Instead, they provide “partial answers that might be useful”. (I am indebted to Cem Kaner for putting this idea so clearly.) As a social scientist, I strive to become aware of my biases so that I can manage them, thereby addressing certain threats to the validity of my research. So, I use and interact with the product in ways that represent actual customers’ behaviour, to discover problems that I and everyone else might have missed otherwise. I gather facts about the product; how it fits into the tasks that users perform with it, and how people might have to adapt themselves to handle the things that the product doesn’t do so well.

I’m a tool user. I’m always interacting with hardware, software, and other contrivances that help me to get things done. I use tools as media in the McLuhan sense: tools extend, enhance, intensify, enable, accelerate, amplify my capabilities. Tools can help me set systems up, generate data, and see things that might otherwise be harder to see. Tools can help me to sort and search through data. Tools can help me to produce results that I can compare to my product’s results. Tools can check to help me see what’s there and what might be missing. Tools can help me to feed input to the product, to control it, and to observe its output. Tools can help me with record-keeping and reporting. Sometimes the tools I’ve got aren’t up to the task at hand, so I use tools to help me build tools—whereupon I am also a tool builder. I’m aware of another aspect of McLuhan’s ideas about media: when extended beyond their original or intended capacity, tools reverse into producing the opposite of their original or intended effects.

I’m a critic. Like my favourite film critics, I study the work and how it might appeal—or not—to the audience for which it is intended. I study the technical aspects of the product, just as a film critic looks at lighting, framing of the shots, and other aspects of cinematography; at sound; at editing; at story construction; and so forth. I study culture and history—I study the culture and history of software—as a critic studies those of film—and of societies generally—to evaluate how well the product (story) fits in relation to its culture and its period and the genre in which the work fits. I might like the work or not, but as a critic, my personal preferences aren’t as important as analyzing the work on behalf of an audience. To do this well, I must recognize my preferences and my biases, and manage them. I fit all those things and more into an account that helps a potential audience decide whether they’ll like it or not. (A key difference is that the reader of my review is not the audience of a finished product; my review is for the cast, crew, and producers as the product is being built.)

I’m an investigative reporter. My beat is the product and everything and everyone around it. I ask the who, what, where, when, why, and how questions that reporters ask, and I’m continually figuring out and refining the next set of questions I need to ask. I’m interacting with the product myself, to learn all I can about it. I’m interviewing people who are asking for it, the people are who building it, and other people who might use it. I’m telling a story about what I discover, one that leads with a headline, begins with a summary overview and delves into to more detail. My story might be illustrated with charts, tables, and pictures. My story is truthful, but I realize the existence of different truths for different people, so I’m also prepared to bring several perspectives to the story.

There are other metaphors, of course. These are the prominent ones for me. What other ones can you see in your own work?

On a Role

Monday, June 15th, 2015

This article was originally published in the February 2015 edition of Testing Trapeze, an excellent online testing magazine produced by our testing friends in New Zealand. There are small edits here from the version I submitted.

Once upon a time, before I was a tester, I worked in theatre. Throughout my career, I took on many roles—but maybe not in the way you’d immediately expect. In my early days, I was a performer, acting in roles in the sense that springs to mind for most people when they think of theatre: characters in a play. Most of the time, though, I was in the role of a stage manager, which is a little like being a program manager in a software development group. Sometimes my role was that of a lighting designer, sound engineer, or stagehand. I worked in the wardrobe of the Toronto production of CATS for six months, too.

Recent discussions about software development have prompted me to think about the role of roles in our work, and in work generally. For example, in a typical theatre piece, an actor performs in three different roles at once. Here, I’ll classify them…

a first-order role, in which a person is a member of the theatre company throughout the rehearsal period and run of the play. If someone asks him “What are you working on these days?”, he’ll reply “I’m doing a show with the Mistytown Theatre Company.”

a second-order role that the person takes on when he arrives at the theatre, defocusing from his day-to-day role as a husband and father, and focusing his energy on being an actor, or stagehand, or lighting designer. He typically holds that second-order role over the course of the working day, and abandons it when it’s time to go home.

a third-order role that the actor performs as a specific character at some point during the show. In many cases, the actor takes on one character per performance. Occasionally an actor takes on several different characters throughout the course of the performance, playing a new third-order role from one moment to another. In an improvisational theatre company, a performer may pick up and drop third-order roles as quickly as you or I would don or doff a hat. In a more traditional style of theatre, roles are more sharply defined, and things can get confusing when actors suddenly and unexpectedly change roles mid-performance. (I saw that happen once during my theatre career. An elderly performer took ill during the middle of the first act, and her much younger understudy stepped in for the remainder of the show. It was necessary on that occasion, of course, but the relationships between the performers were shaken up for the rest of the evening, and there was no telling what sense the audience was able to make of the sudden switch until intermission when the stage manager made an announcement.)

It’s natural and normal to deal simultaneously with roles of different orders, but it’s hard to handle two roles of the same order at exactly the same time. For example, a person may be both a member of a theatre company and a parent, but it’s not easy to supervise a child while you’re on stage in the middle of a show. In a small theatre company, the same person might hold two second-order roles—as both an actor and a costume designer, say—but in a given moment, that person is focusing on either acting or costume design, but not both at once. People in a perfomer role tend not to play two different third-order roles—two different characters—at the same moment. There are rare exceptions, as in those weird Star Trek episodes or in movies like All of Me, in which one character is inhabiting the body of another. To perform successfully in two simultaneous third-order roles takes spectacular amounts of discipline and skill, and the occasions where it’s necessary to do so aren’t terribly common.

Some roles are more temporary than others. At the end of the performance, people drop their second-order roles to go home and live out their other, more long-term roles; husbands and wives, parents, daughters and sons. They may adopt other roles too: volunteer in the community soup kitchen; declarer in this hand of the bridge game; parishioners at the church; pitcher on the softball team.

Roles can be refined and redefined; in a dramatic television series, an actor performs in a third-order role in each episode, as a particular character. If it’s an interesting character, aspects of the role change and develop over time. At the end of the run of a show, people may continue in their first-order roles with the same theatre company; they may become directors or choreographers with that company; or they may move on to another role in another company. They may take on another career altogether. Other roles evolve too, from friend to lover to spouse to parent.

In theatre, a role is an identity that a person takes to fulfill some purpose in service of the theatre company, production, or the nightly show. More generally, a role is a position or function that a person adopts and performs temporarily. A role represents a set of services offered, and often includes tacit or explicit commmitments to do certain things for and with other people. A role is a way to summarize ideas about services people offer, activities they perform, and the goals that guide them.

Now: to software. As a member of a software development team within an organization, I’m an individual contributor. In that first-order role, I’m a generalist. I’ve been a program manager, programmer, tech support person, technical writer, network administrator, and phone system administrator, business owner, bookkeeper, teacher, musician… Those experiences have helped me to be aware of the diversity of roles on a project, to recognize and respect the the people who perform them, and to be able to perform them effectively to some extent if necessary. In the individual contributor role, I commit to taking on work to help the company to achieve success, just as (I hope) everyone else in the company does.

Normally I’m taking on the everyday, second-order role of a tester, just as member of a theatre company might walk through the door in the evening as a lighting technician. By adopting the testing role, I’m declaring my commitment to specialize in providing testing services for the project. That doesn’t limit me to testing, of course. If I’m asked, I might also do some programming or documentation work, especially in small development groups—just as an actor in a very small theatre company might help in the box office and take ticket orders from time to time. Nonetheless, my commitment and responsibility to provide testing services requires me to be very cautious about taking on things outside the testing role. When I’m hired as a tester, my default belief is that there’s going to be more than enough testing work to do. If I’m being asked to perform in a different role such that important testing work might be neglected or compromised, I must figure out the priorities with my client.

Within my testing role, I might take on a third-order role as a responsible tester (James Bach has blogged on the role of the responsible tester) for a given project, but I might take on a variety of third-order roles as a test jumper (James has blogged about test jumpers, too).

Like parts of an outfit that I choose to wear, a role is a heuristic that can help to suggest who I am and what I do. In a hospital, the medical staff are easy to identify, wearing uniforms, lab coats, or scrubs that distinguish them from civilian life. Everyone wears badges that allow others to identify them. Surgical staff wear personalized caps—some plain and ordinary, others colourful and whimsical. Doctors often have stethoscopes stuffed into a coat pocket, and certificates from medical schools on their walls. Yet what we might see remains a hint, not a certainty; someone dressed like a nurse may not be a nurse. The role is not a guarantee that the person is qualified to do the work, so it’s worthwhile to see if the garb is a good fit for the person wearing it.

The “team member” role is one thing; the role within the team is another. In a FIFA soccer match, the goalkeeper is dressed differently to make the distinct role—with its special responsibilities and expectations—clearly visible to everyone else, including his team members. The goalkeeper’s role is to mind the net, not to run downfield trying to score goals. There’s no rule against a goalie trying to do what a striker does, but to do so would be disruptive to the dynamics of the team. When a goalkeeper runs downfield trying to score goals, he leaves the net unattended—and those who chose to defend the goal crease aren’t allowed to use their hands.

In well-organized, self-organized teamwork, roles help to identify whether people are in appropriate places. If I’m known as a tester on the project and I am suddenly indisposed, unavailable, or out of position, people are more likely to recognize that some of the testing work won’t get done. Conversely, if someone else can’t fulfill their role for some reason, I’m prepared to step up and volunteer to help. Yet to be helpful, I need to coordinate consistently with the rest of the team to make sure our perceptions line up. On the one hand, I may not have have noticed important and necessary work. On the other, I don’t want to inflict help on the project, nor would it be respectful or wise for me to usurp anyone else’s role. Shifting positions to adapt to a changing situation can be a lot easier when roles help to frame where we’re coming from, where we are, and where we’re going.

A role is not a full-body tattoo, permanently inscribed on me, difficult and painful to remove. A role is not a straitjacket. I wouldn’t volunteer to wear a straitjacket, and I’ll resist if someone tries to put me into one. As Kent Beck has said, “Responsibility cannot be assigned; it can only be accepted. If someone tries to give you responsibility, only you can decide if you are responsible or if you aren’t.” (from Extreme Programming Explained: Embrace Change) I also (metaphorically) study escape artistry in the unlikely event that someone manages to constrain me. When I adopt a role, I must do so voluntarily, understanding the commitment I’m making and believing that I can perform it well—or learn it in a hurry. I might temporarily adopt a third-order role normally taken by someone else, but in the long run, I can’t commit to a role without full and ongoing understanding, agreement, and consent between me and my clients. If I resist accepting a role, I don’t do so capriciously or arbitrarily, but for deeply practical reasons related to three important problems.

The Expertise Problem. I’m willing to do or to learn almost anything, but there is often work for which I may be incompetent, unprepared or underqualified. Each set of tasks in software development requires a significant and distinct set of skills which must be learned and practiced if they are to be performed expertly. I don’t want fool my client or my team into believing that the work will be done well until I’m capable, so I’ll push back on working in certain roles unless my client is willing to accept the attendant risks.

For example, becoming an expert programmer takes years of focused study, experience, and determination. As Collins and Evans suggest, real expertise requires not skill, but also ongoing maintenance; immersion in a way of life. James Bach remarked to me recently, “The only reason that I’m not an expert programmer now is that I haven’t tried it. I’ve been in the software business for thirty years, and if I had focused on programming, I’d be a kick-ass programmer by now. But I chose to be a tester instead.” I feel the same way. Programming is a valuable means to end for me—it helps me get certain kinds of testing work done. I can be a quite capable programmer when I put my mind to it, but I find I have to do programming constantly—almost obsessively—to maintain my skills to my own standards. (These days, if I were asked to do any kind of production programming—even minor changes to the code—I would insist on both close collaboration with peers and careful review by an expert.) I believe I can perform competently, adequately, eventually, in any role. Yet competence and adequacy aren’t enough when I aspire to achieving excellence and mastery. At a certain point in my life, I decided to focus my time and energy on testing and the teaching of it; the testing and teaching roles are the ones that attract me most. Their skills are the ones that I am most interested in trying to master—just as others are focused on mastering programming skills. So: roles represent a heuristic for focusing my development of expertise, and for distributing expertise around the team.

The Mindset Problem. Building a product demands a certain mindset; testing it deeply demands another. When I’m programming or writing (as I’m doing now), I tend to be in the builder’s mindset. As such, I’m at close “critical distance” to the work. I’m seeing it from the position of an insider—me—rather than as an outsider. It’s relatively easy for me to perform shallow testing and spot coding errors, or spelling and grammatical mistakes—although after I’ve been looking at the work for a while, I may start to miss those as well. It’s quite a bit harder for me to notice deeper structural or thematic problems, because I’ve invested time and energy in building the piece as I have, converging towards something I believe that I want. To see deeper problems, I need the greater critical distance that’s available in the tester’s mindset—what testers or editors do. It’s not a trivial matter to switch between mindsets, especially with respect to one’s own work. Switching mindsets is not impossible, but shifting from building into good critical and analytical work is effortful and time-consuming, and messes with the flow.

One heuristic for identifying deep problems in my writing work would be to walk away from writing—from the builder’s mindset—and come back later with the tester’s mindset—just as I’ve done several times with this essay. However, the change in mindset takes time, and even after days or weeks, part of me remains in the writer’s mindset—because it’s my writing. Similarly, a programmer in the flow of developing a product may find it disruptive—both logistically and intellectually—to switch mindsets and start looking for problems. In fact, the required effort likely explains a good deal of some programmers’ stated reluctance to do deep testing on their own.

So another useful heuristic is for the builder to show the work to other people. As they are different people, other builders naturally have critical distance, but that distance gets emphasized when they agree to take on a testing role. I’ve done that with this article too, by enlisting helpers—other writers who adopt the roles of editors and reviewers. A reviewer might usually identify herself as a writer, just as someone in a testing role might normally identify as a programmer. Yet temporarily adopting a reviewer’s role and a testing mindset frames the approach to the task at hand—finding important problems in the work that are harder to see quickly from the builder’s mindset. In publishing, some people by inclination, experience, training, and skills specialize in editing, rather than writing. The editing role is analogous to that of the dedicated tester—someone who remains consistently in the tester’s mindset, at even farther critical distance from the work than the builder-helpers are—more quickly and easily able to observe deep, rare, or subtle problems that builders might not notice.

The Workspace Problem. Tasks in software development may require careful preparation, ongoing design, and day-to-day, long-term maintenance of environments and tools. Different jobs require different workspaces. Programmers, in the building role, set up their environments and tools to do development and building work most simply and efficiently. Setting up a test lab for all of its different purposes—investigation of problems from the field; testing for adaptability and platform support; benchmarking for performance—takes time and focus away from valuable development tasks. The testing role provides a heuristic for distributing and organizing the work of maintaining the test lab.

People sometimes say “on an Agile project, everybody does everything” or “there are no roles on an Agile project”. To me, that’s like saying that there is no particular focusing heuristic for the services that people offer; throwing out the baby of skill with the bathwater of overspecialization and isolation. Indeed, “everybody doing everything” seems to run counter to another idea important to Agile development: expertise and craftsmanship. A successful team is one in which people with diversified skills, interests, temperaments, and experiences work together to produce something that they could not have produced individually. Roles are powerful heuristics for helping to organize and structure the relationships between those people. Even though I’m willing to do anything, I can serve the project best in the testing role, just as others serve the project best in the developer role.

That’s the end of the article. However, my colleague James Bach offered these observations on roles, which were included as a sidebar to the article in the magazine.

A role is probably not:

  • a declaration of the only things you are allowed to do. (It is neither a prison cell nor a destiny from which escape is not possible.)
  • a declaration of the things that you and you only are allowed to do. (It is not a fortress that prevents entry from anyone outside.)
  • a one-size, exclusive, permanent, or generic structure.

A role is:

  • a declaration of what one can be relied upon to do; a promise to perform a service or services well. (Some of those services may be explict; others are tacit.)
  • a unifying idea serving to focus commitment, preparation, performance, and delivery of services.
  • a heuristic for helping people manage their time on a project, and to be able to determine spontaneously who to approach, consult with, or make requests to (or sometimes avoid), in order to get things done.
  • a heuristic for fostering personal engagement and responsibility.
  • a heuristic for defining or explaining the meaning of your work.
  • a flexible and non-exclusive structure that may exist over a span of moments or years.
  • a label that represents these things.
  • a voluntary commitment.

A role may or may not be:

  • an identity
  • a component of identity.

—James Bach

Exploratory Testing 3.0

Tuesday, March 17th, 2015

This blog post was co-authored by James Bach and me. In the unlikely event that you don’t already read James’ blog, I recommend you go there now.

The summary is that we are beginning the process of deprecating the term “exploratory testing”, and replacing it with, simply, “testing”. We’re happy to receive replies either here or on James’ site.

Very Short Blog Posts (25): Testers Don’t Break the Software

Tuesday, February 17th, 2015

Plenty of testers claim that they break the software. They don’t really do that, of course. Software doesn’t break; it simply does what it has been designed and coded to do, for better or for worse. Testers investigate systems, looking at what the system does; discovering and reporting on where and how the software is broken; identifying when the system will fail under load or stress.

It might be a good idea to consider the psychological and public relations problems associated with claiming that you break the software. Programmers and managers might subconsciously harbour the idea that the software was fine until the testers broke it. The product would have shipped on time, except the testers broke it. Normal customers wouldn’t have problems with the software; it’s just that the testers broke it. There are no systemic problems in the project that lead to problems in the product; nuh-uh, the testers broke it.

As an alternative, you could simply say that you investigate the software and report on what it actually does—instead of what people hope or wish that it does. Or as my colleague James Bach puts it, “We don’t break the software. We break illusions about the software.”

Give Us Back Our Testing

Saturday, February 14th, 2015

“Program testing involves the execution of a program over sample test data followed by analysis of the output. Different kinds of test output can be generated. It may consist of final values of program output variables or of intermediate traces of selected variables. It may also consist of timing information, as in real time systems.

“The use of testing requires the existence of an external mechanism which can be used to check test output for correctness. This mechanism is referred to as the test oracle. Test oracles can take on different forms. They can consist of tables, hand calculated values, simulated results, or informal design and requirements descriptions.”

—William E. Howden, A Survey of Dynamic Analysis Methods, in Software Validation and Testing Techniques, IEEE Computer Society, 1981

Once upon a time, computers were used solely for computation. Humans did most of the work that preceded or followed the computation, so the scope of a computer program was limited. In the earliest days, testing a program mostly involved checking to see if the computations were being performed correctly, and that the hardware was working properly before and after the computation.

Over time, designers and programmers became more ambitious and computers became more powerful, enabling more complex and less purely numerical tasks to be encoded and delegated to the machinery. Enormous memory and blinding speed largely replaced the physical work associated with storing, retrieving, revising, and transmitting records. Computers got smaller and became more powerful and protean, used not only by mathematicians but also by scientists, business people, specialists, consumers, and kids. Software is now used for everything from productivity to communications, control systems, games, audio playback, video displays, thermostats… Yet many of the software development community’s ideas about testing haven’t kept up. In fact, in many ways, they’ve gone backwards.

Ask people in the software business to describe what testing means to them, and many will begin to talk about test cases, and about comparing a program’s output to some predicted or expected result. Yet outside of software development, “testing” has retained its many more expansive meanings. A teenager tests his parents’ patience. When confronted with a mysterious ailment, doctors perform diagnostic tests with open expectations and results that must be interpreted. Writers in Cook’s Illustrated magazine test techniques for roasting a turkey, and report on the different outcomes involving the different factors—flavours, colours, moisture, textures—that they obtain. The Mythbusters, says Wikipedia, “use elements of the scientific method to test the validity of rumors, myths, movie scenes, adages, Internet videos, and news stories.”

Notice that all of these things called “testing” are focused on exploration, investigation, discovery, and learning. Yet over the last several decades, Howden’s notions of testing as checking for correctness, and of an oracle as a mechanism (or an artifact) became accepted by many people in the development and testing communities at large. Whether people were explicitly aware of those notions, they certainly seemed tacitly to have subscribed to the idea that testing should be focused on analysis of the output, displacing those deeper meanings of testing. That idea might have been more reasonable when computers did nothing but compute. Today, computers and their software are richly intertwined with daily social life and things that we value. Yet for many in software development, “testing” has this narrow, impoverished meaning, limited to what James Bach and I call checking. Checking is a tactic of testing; the part of testing that can be encoded as algorithms and that therefore can be performed entirely by machinery. It is analogous to compiling, the part of programming that can be performed algorithmically.

Oddly, since we started distinguishing between testing and checking, some people have claimed that we’re “redefining” testing. We disagree. We believe that we are recovering testing’s meaning, restoring it to its original, rich, investigative sense. Testing’s meaning was stolen; we’re stealing it back.

The Rapid Software Testing Namespace

Monday, February 2nd, 2015

Just as no one has the right to tell you what language to speak at home, nobody outside of your project has the authority to tell you how to speak inside your project. Every project develops its own namespace, so to speak, and its own formal or informal criteria for naming things inside it. Rapid Software Testing is, among other things, a project in that sense. For years, James Bach and I have been developing labels for ideas and activities that we talk about in our work and in our classes. While we’re happy to adopt useful ideas and terms from other places, we have the sole authority (for now) to set the vocabulary formally within Rapid Software Testing (RST). We don’t have the right to impose our vocabulary on anyone else. So what do we do when other people use a word to mean something different from what we mean by the same word?

We invoke “the RST namespace” when we talk about testing and checking, for example, so that we can speak clearly and efficiently about ideas that we bring up in our classes and in the practice of Rapid Software Testing. From time to time, we also try to make it clear why we use words in a specific way. For example, we make a big deal about testing and checking. We define checking as “the process of making evaluations by applying algorithmic decision rules to specific observations of a product” (and a check is an instance of checking). We define testing as “the process of evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, etc.” (and a test is an instance of testing).

This is in contrast with the ISTQB, which in its Glossary defines “test” as “a set of test cases”—along with “test case” as “a set of input values, execution preconditions, expected results and execution postconditions, developed for a particular objective or test condition, such as to exercise a particular program path or to verify compliance with a specific requirement.” Interesting, isn’t it: the ISTQB’s definition of test looks a lot like our definition of check. In Rapid Software Testing, we prefer to put learning and experimentation (rather than satisfying requirements and demonstrating fitness for purpose) at the centre of testing. We prefer to think of a test as something that people do as an act of investigation; as a performance, not as an artifact.

Because words convey meaning, we converse (and occasionally argue, and sometimes passionately) the value we see in the words we choose and the ways we think of them. Our goal is to describe things that people haven’t noticed, or to make certain distinctions clear, with the goal of reducing the risk that someone will misunderstand—or miss—something important. Nonetheless, we freely acknowledge that we have no authority outside of Rapid Software Testing. There’s nothing to stop people from using the words we use in a different way; there are no language police in software development. So we’re also willing to agree to use other people’s labels for things when we’ve had the conversation about what those labels mean, and have come to agreement.

People who tout a “common language” often mean “my common language”, or “my namespace”. They also have the option to certify you as being able to pass a vocabulary test, if anyone thinks that’s important. We don’t. We think that it’s important for people to notice when words are being used in different ways. We think it’s important for people to become polyglots—and that often means working out which namespace we might be using from one moment to the next. In our future writing, conversation, classes, and other work, you might wonder what we’re talking about when we refer to “the RST namespace”. This post provides your answer.

Taking Severity Seriously

Wednesday, January 14th, 2015

There’s a flaw in the way most organizations classify the severity of a bug. Here’s an example from the Elementool Web site (as of 14 January, 2015); I’m sure you’ve seen something like it:

Critical: The bug causes a failure of the complete software system, subsystem or a program within the system.
High: The bug does not cause a failure, but causes the system to produce incorrect, incomplete, inconsistent results or impairs the system usability.
Medium: The bug does not cause a failure, does not impair usability, and does not interfere in the fluent work of the system and programs.
Low: The bug is an aesthetic (sic —MB), is an enhancement (ditto) or is a result of non-conformance to a standard.

These are serious problems, to be sure—and there are problems with the categorizations, too. (For example, non-conformance to a medical device standard can get you publicly reprimanded by the FDA; how is that low severity?) But there’s a more serious problem with models of severity like this: they’re all about the system as though no person used that system. There’s no empathy or emotion here; there’s no impact on people. The descriptions don’t mention the victims of the problem, and they certainly don’t identify consequences for the business. What would happen if we thought of those categories a little differently?

Critical: The bug will cause so much harm or loss that customers will sue us, regulators will launch a probe of our management, newspapers will run a front-page story about us, and comedians will talk about us on late night talk shows. Our company will spend buckets of money on lawyers, public relations, and technical support to try to keep the company afloat. Many capable people will leave voluntarily without even looking for a new job. Lots of people will get laid off. Or, the bug blocks testing such that we could miss problems of this magnitude; go back to the beginning of this paragraph.

High: The bug will cause loss, harm, or deep annoyance and inconvenience to our customers, prompting them to flood the technical support phones, overwhelm the online chat team, return the product demanding their money back, and buy the competitor’s product. And they’ll complain loudly on Twitter. The newspaper story will make it to the front page of the business section, and our product will be used for a gag in Dilbert. Sales will take a hit and revenue will fall. The Technical Support department will hold a grudge against Development and Product Management for years. And our best workers won’t leave right away, but they’ll be sufficiently demoralized to start shopping their résumés around.

Medium: The bug will cause our customers to be frustrated or impatient, and to lose faith in our product such that they won’t necessarily call or write, but they won’t be back for the next version. Most won’t initiate a tweet about us, but they’ll eagerly retweet someone else’s. Or, the bug will annoy the CEO’s daughter, whereupon the CEO will pay an uncomfortable visit to the development group. People won’t leave the company, but they’ll be demotivated and call in sick more often. Tech support will handle an increased number of calls. Meanwhile, the testers will have—with the best of intentions—taken time to investigate and report the bug, such that other, more serious bugs will be missed (see “High” and “Critical” above). And a few months later, some middle manager will ask, uncomprehendingly, “Why didn’t you find that bug?”

Low: The bug is visible; it makes our customers laugh at us because it makes our managers, programmers, and testers look incompetent and sloppy‐and it causes our customers to suspect deeper problems. Even people inside the company will tease others about the problem via grafitti in the stalls in the washroom (written with a non-washable Sharpie).Again, the testers will have spent some time on investigation and reporting, and again test coverage will suffer.

Of course, one really great way to avoid many of these kinds of problems is to focus on diligent craftsmanship supported by scrupulous testing. But when it comes to that discussion in that triage meeting, let’s consider the impact on real customers, on the real people in our company, and on our own reputations.

Testing is…

Tuesday, October 28th, 2014

Every now and again, someone makes some statement about testing that I find highly questionable or indefensible, whereupon I might ask them what testing means to them. All too often, they’re at a loss to reply because they haven’t really thought deeply about the matter; or because they haven’t internalized what they’ve thought about; or because they’re unwilling to commit to any statement about testing. And then they say something vague or non-commital like “it depends” or “different things to different people” or “that’s a matter of context”, without suggesting relevant dependencies, people, or context factors. So, for those people, I offer a set of answers from which they can choose one; or they can adopt the entire list wholesale; or they use one or more items as a point of departure for something of their own invention. You don’t have to agree with any of these things; in that case, invent your own ideas about testing from whole cloth But please: if you claim to be a tester, or if you are making some claim about testing, please prepare yourself and have some answer ready when someone asks you “what is testing?”. Please. Here are some possible replies; I believe everything is Tweetable, or pretty close.

Testing is—among other things—reviewing the product and ideas and descriptions of it, looking for significant and relevant inconsistencies.
Testing is—among other things—experimenting with the product to find out how it may be having problems—which is not “breaking the product”.
Testing is—among other things—something that informs quality assurance, but is not in and of itself quality assurance.
Testing is—among other things—helping our clients to make empirically informed decisions about the product, project, or business.
Testing is—among other things—a process by which we systematically examine any aspect of the product with the goal of preventing surprises.
Testing is—among other things—a process of interacting with the product and its systems in many ways that challenge unwarranted optimism.
Testing is—among other things—observing and evaluating the product, to see where all those defect prevention ideas might have failed.
Testing is—among other things—a special part of the development process focused on discovering what could go badly (or what is going badly).
Testing is—among other things—exploring, discovering, investigating, learning, and reporting about the product to reveal new information.
Testing is—among other things—gathering information about the product, its users, and conditions of its use, to help defend value.
Testing is—among other things—raising questions to help teams to develop products that more quickly and easily reveal their own problems.
Testing is—among other things—helping programmers and the team to learn about unanticipated aspects of the product we’re developing.
Testing is—among other things—helping our clients to understand the product they’ve got so they can decide if it’s the product they want.
Testing is—among other things—using both tools and direct interaction with the product to question and evaluate its behaviours and states.
Testing is—among other things—exploring products deeply, imaginatively, and suspiciously, to help find problems that threaten value.
Testing is—among other things—performing actual and thought experiments on products and ideas to identify problems and risks.
Testing is—among other things—thinking critically and skeptically about products and ideas around them, with the goal of not being fooled.
Testing is—among other things—evaluating a product by learning about it through exploration, experimentation, observation and inference.

You’re welcome.