Blog Posts for the ‘Tester Skill’ Category

On Red

Friday, June 26th, 2015

What actually happens when a check returns a “red” result?

Some people might reflexively say “Easy: we get a red; we fix the bug.” Yet that statement is too simplistic, concealing a good deal of what really goes on. The other day, James Bach and I transpected on the process. Although it’s not the same in every case, we think that for responsible testers, the process actually goes something more like this:

First, we ask, “Is the check really returning a red?” The check provides us with a result which signals some kind of information, but by design the check hides lots of information too. The key here is that we want to see the problem for ourselves and apply human sensemaking to the result and to the possibility of a real problem.

Sensemaking is not a trivial subject. Karl Weick, in Sensemaking in Organizations, identifies seven elements of sensemaking, saying it is:

  • grounded in identity construction (which means that making sense of something is embedded in a set of “who-am-I-and-what-am-I-doing here?” questions);
  • social (meaning that “human thinking and social functioning are essential aspects of each other”, and that making sense of something tends to be oriented towards sharing the meanings);
  • ongoing (meaning that it’s happening all the time, continuously; yet it’s…)
  • retrospective (meaning that it’s based on “what happened” or “what just happened?”; even though it’s happening in the present, it’s about things that have happened in the past, however recent that might be);
  • enactive of sensible environments (meaning that sensemaking is part of a process in which we try to make the world a more understandable place);
  • based on plausibility, rather than accuracy (meaning that when people make sense of something, they tend to rely on heuristics, rather than things that are absolutely 100% guaranteed to be correct);
  • focused on extracted cues (extracted cues are simple, familiar bits of information that lead to a larger sense of what is occurring, like “Flimsy!->Won’t last!” or “Shouting, with furrowed brow!->Angry!” or “Check returns red!->Problem!”).

The reason that we need to apply sensemaking is that it’s never clear that a check is signaling an actual problem in the product. Maybe there’s a problem in the instrumentation, or a mistake in the programming of the check. So when we see a “red” result, we try to make sense of it by seeking more information (or examining other extracted cues, as Weick might say).

  • We might perform the check a second time, to see if we’re getting a consistent result. (Qualitative researchers would call this a search for diachronic reliability; are we getting the same result over time?)
  • If the second result isn’t consistent with the first, we might perform the check again several times, to see if the result recurs only occasionally and intermittently.
  • We might look for secondary indicators of the problem, other oracles or other evidence that supports or refutes the result of the check.
  • If we’re convinced that the check is really red, we then ask “where is the trouble?” The trouble might be in the product or in the check.

    • We might inspect the state of our instrumentation, to make sure that all of the equipment is in place and set up correctly.
    • We might work our way back through the records produced by the check, tracing through log files for indications of behaviours and changes of state, and possible causes for them.
    • We might perform the check slowly, step by step, observing more closely to see where things went awry. We might step through the code in the debugger, or perform a procedure interactively instead of delegating the activity to the machinery.
    • We might perform the check with different values, to assess the extents or limits of the problem.
    • We might perform the check using different pacing or different sequences of actions to see if time is a factor.
    • We might perform the check on other platforms, to see if the check is revealing a problem of narrow or more general scope. (Those qualitative researchers would call this a search for synchronic reliability; could the same thing happen at the same time in different places?)
    • Next, if the check appears to be producing a result that makes sense—the check is accurately identifying a condition that we programmed it to identify—it might be easy to conclude that there’s a bug, and now it’s time to fix it. But we’re not done, because although the check is pointing to an inconsistency between the actual state of the product and some programmed result, there’s yet another decision to be made: is that inconsistency a problem with respect to something that someone desires? In other words, does that inconsistency matter?

      • Maybe the check is archaic, checking for some condition that is no longer relevant, and we don’t need it any more.
      • Maybe the check is one of several that are still relevant, but this specific check is wrong in some specific respect. Perhaps something that used to be permitted is now forbidden, or vice versa.
      • When the check returns a binary result based on a range of possible results, we might ask “is the result within a tolerable range?” In order to do that, we might have to revisit our notions of what is tolerable. Perhaps the output deviated from the range insignificantly, or momentarily; that is, the check may be too restrictive or too fussy.
      • Maybe the check has not been set up with explicit pass/fail criteria, but to alert us about some potentially interesting condition that is not necessarily a failure. In this case, the check doesn’t represent a problem per se, but rather a trigger for investigation.
      • We might look outside of the narrow scope of the check to see if there’s something important that the check has overlooked. We might do this interactively, or by applying different checks.
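The first two steps in the list above (performing the check again, and then several more times if the results disagree) can be sketched in Python. This is only an illustration under stated assumptions: `check` is a hypothetical zero-argument callable that returns True for "green" and False for "red", and `rerun_check` is a name invented for this sketch, not part of any real tool.

```python
from collections import Counter
from typing import Callable


def rerun_check(check: Callable[[], bool], runs: int = 5) -> str:
    """Perform a check several times and summarize how consistent it is.

    `check` is a hypothetical callable: True means "green", False means "red".
    """
    if runs < 1:
        raise ValueError("runs must be at least 1")

    # Tally the outcomes over repeated runs (a rough look at diachronic reliability).
    tally = Counter(check() for _ in range(runs))
    reds, greens = tally[False], tally[True]

    if reds == runs:
        return "consistently red"
    if greens == runs:
        return "consistently green"
    return f"intermittent: {reds} red, {greens} green"


# Example: a stand-in for a flaky check that replays a fixed series of outcomes.
outcomes = iter([False, True, False, True, True])
print(rerun_check(lambda: next(outcomes), runs=5))
# prints "intermittent: 2 red, 3 green"
```

The summary string is itself just another extracted cue. An "intermittent" result does not say whether the trouble is in the product, the check, or the instrumentation; it only suggests where to start looking.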

      In other words: after making an observation and concluding that it fits the facts, we might choose to apply our tacit and explicit oracles to make a different sense of the outcome. Rather than concluding “The product doesn’t work the way we wanted it to”, we may realize that we didn’t want the product to do that after all. Or we might repair the outcome (as Harry Collins would put it) by saying, “That check sometimes appears to fail when it doesn’t, just ignore it” or “Oh… well, probably this thing happened… I bet that’s what it was… don’t bother to investigate.”

      In the process of developing the check, we were testing (evaluating the product by learning about it through exploration and experimentation). The check itself happens mechanically, algorithmically. As it does so, it projects a complex, multi-dimensional space down to a single-dimensional result, “red” or “green”. In order to make good use of that result, we must unpack the projection. After a red result, the check turns into the centre of a test as we hover over it and examine it. In other words, the red check result typically prompts us to start testing again.
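One way to make the projection easier to unpack is for a check to keep the observations behind its colour. The sketch below is a hypothetical illustration, not a description of any particular tool: `CheckResult`, `compare_with_tolerance`, and the tolerance value are all assumptions made for this example. It also shows how a "tolerable range" turns a measurement into red or green.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    """Keeps the observations behind a red/green result (hypothetical structure)."""
    expected: float
    actual: float
    tolerance: float

    @property
    def deviation(self) -> float:
        # How far the observed value is from the programmed expectation.
        return abs(self.actual - self.expected)

    @property
    def colour(self) -> str:
        # The projection: many details collapsed into one of two values.
        return "green" if self.deviation <= self.tolerance else "red"


def compare_with_tolerance(actual: float, expected: float, tolerance: float = 0.0) -> CheckResult:
    """Compare an observed value with an expected one, within a tolerable range."""
    return CheckResult(expected=expected, actual=actual, tolerance=tolerance)


result = compare_with_tolerance(actual=100.5, expected=100.0, tolerance=0.25)
print(result.colour, result.deviation)
# prints "red 0.5"
```

With the deviation in hand, a person can ask whether 0.5 actually matters to anyone, or whether the tolerance itself is too fussy. The code cannot answer that question; it can only keep the evidence available for the human who has to.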

      That’s what usually happens when a check returns a “red” result. What happens when it returns nothing but “green” results?

    On a Role

    Monday, June 15th, 2015

    This article was originally published in the February 2015 edition of Testing Trapeze, an excellent online testing magazine produced by our testing friends in New Zealand. There are small edits here from the version I submitted.

    Once upon a time, before I was a tester, I worked in theatre. Throughout my career, I took on many roles—but maybe not in the way you’d immediately expect. In my early days, I was a performer, acting in roles in the sense that springs to mind for most people when they think of theatre: characters in a play. Most of the time, though, I was in the role of a stage manager, which is a little like being a program manager in a software development group. Sometimes my role was that of a lighting designer, sound engineer, or stagehand. I worked in the wardrobe of the Toronto production of CATS for six months, too.

    Recent discussions about software development have prompted me to think about the role of roles in our work, and in work generally. For example, in a typical theatre piece, an actor performs in three different roles at once. Here, I’ll classify them…

    a first-order role, in which a person is a member of the theatre company throughout the rehearsal period and run of the play. If someone asks him “What are you working on these days?”, he’ll reply “I’m doing a show with the Mistytown Theatre Company.”

    a second-order role that the person takes on when he arrives at the theatre, defocusing from his day-to-day role as a husband and father, and focusing his energy on being an actor, or stagehand, or lighting designer. He typically holds that second-order role over the course of the working day, and abandons it when it’s time to go home.

    a third-order role that the actor performs as a specific character at some point during the show. In many cases, the actor takes on one character per performance. Occasionally an actor takes on several different characters throughout the course of the performance, playing a new third-order role from one moment to another. In an improvisational theatre company, a performer may pick up and drop third-order roles as quickly as you or I would don or doff a hat. In a more traditional style of theatre, roles are more sharply defined, and things can get confusing when actors suddenly and unexpectedly change roles mid-performance. (I saw that happen once during my theatre career. An elderly performer took ill during the middle of the first act, and her much younger understudy stepped in for the remainder of the show. It was necessary on that occasion, of course, but the relationships between the performers were shaken up for the rest of the evening, and there was no telling what sense the audience was able to make of the sudden switch until intermission when the stage manager made an announcement.)

    It’s natural and normal to deal simultaneously with roles of different orders, but it’s hard to handle two roles of the same order at exactly the same time. For example, a person may be both a member of a theatre company and a parent, but it’s not easy to supervise a child while you’re on stage in the middle of a show. In a small theatre company, the same person might hold two second-order roles—as both an actor and a costume designer, say—but in a given moment, that person is focusing on either acting or costume design, but not both at once. People in a performer role tend not to play two different third-order roles—two different characters—at the same moment. There are rare exceptions, as in those weird Star Trek episodes or in movies like All of Me, in which one character is inhabiting the body of another. To perform successfully in two simultaneous third-order roles takes spectacular amounts of discipline and skill, and the occasions where it’s necessary to do so aren’t terribly common.

    Some roles are more temporary than others. At the end of the performance, people drop their second-order roles to go home and live out their other, more long-term roles; husbands and wives, parents, daughters and sons. They may adopt other roles too: volunteer in the community soup kitchen; declarer in this hand of the bridge game; parishioners at the church; pitcher on the softball team.

    Roles can be refined and redefined; in a dramatic television series, an actor performs in a third-order role in each episode, as a particular character. If it’s an interesting character, aspects of the role change and develop over time. At the end of the run of a show, people may continue in their first-order roles with the same theatre company; they may become directors or choreographers with that company; or they may move on to another role in another company. They may take on another career altogether. Other roles evolve too, from friend to lover to spouse to parent.

    In theatre, a role is an identity that a person takes to fulfill some purpose in service of the theatre company, production, or the nightly show. More generally, a role is a position or function that a person adopts and performs temporarily. A role represents a set of services offered, and often includes tacit or explicit commitments to do certain things for and with other people. A role is a way to summarize ideas about services people offer, activities they perform, and the goals that guide them.

    Now: to software. As a member of a software development team within an organization, I’m an individual contributor. In that first-order role, I’m a generalist. I’ve been a program manager, programmer, tech support person, technical writer, network administrator, and phone system administrator, business owner, bookkeeper, teacher, musician… Those experiences have helped me to be aware of the diversity of roles on a project, to recognize and respect the people who perform them, and to be able to perform them effectively to some extent if necessary. In the individual contributor role, I commit to taking on work to help the company to achieve success, just as (I hope) everyone else in the company does.

    Normally I’m taking on the everyday, second-order role of a tester, just as a member of a theatre company might walk through the door in the evening as a lighting technician. By adopting the testing role, I’m declaring my commitment to specialize in providing testing services for the project. That doesn’t limit me to testing, of course. If I’m asked, I might also do some programming or documentation work, especially in small development groups—just as an actor in a very small theatre company might help in the box office and take ticket orders from time to time. Nonetheless, my commitment and responsibility to provide testing services requires me to be very cautious about taking on things outside the testing role. When I’m hired as a tester, my default belief is that there’s going to be more than enough testing work to do. If I’m being asked to perform in a different role such that important testing work might be neglected or compromised, I must figure out the priorities with my client.

    Within my testing role, I might take on a third-order role as a responsible tester (James Bach has blogged on the role of the responsible tester) for a given project, but I might take on a variety of third-order roles as a test jumper (James has blogged about test jumpers, too).

    Like parts of an outfit that I choose to wear, a role is a heuristic that can help to suggest who I am and what I do. In a hospital, the medical staff are easy to identify, wearing uniforms, lab coats, or scrubs that distinguish them from civilian life. Everyone wears badges that allow others to identify them. Surgical staff wear personalized caps—some plain and ordinary, others colourful and whimsical. Doctors often have stethoscopes stuffed into a coat pocket, and certificates from medical schools on their walls. Yet what we might see remains a hint, not a certainty; someone dressed like a nurse may not be a nurse. The role is not a guarantee that the person is qualified to do the work, so it’s worthwhile to see if the garb is a good fit for the person wearing it.

    The “team member” role is one thing; the role within the team is another. In a FIFA soccer match, the goalkeeper is dressed differently to make the distinct role—with its special responsibilities and expectations—clearly visible to everyone else, including his team members. The goalkeeper’s role is to mind the net, not to run downfield trying to score goals. There’s no rule against a goalie trying to do what a striker does, but to do so would be disruptive to the dynamics of the team. When a goalkeeper runs downfield trying to score goals, he leaves the net unattended—and those who chose to defend the goal crease aren’t allowed to use their hands.

    In well-organized, self-organized teamwork, roles help to identify whether people are in appropriate places. If I’m known as a tester on the project and I am suddenly indisposed, unavailable, or out of position, people are more likely to recognize that some of the testing work won’t get done. Conversely, if someone else can’t fulfill their role for some reason, I’m prepared to step up and volunteer to help. Yet to be helpful, I need to coordinate consistently with the rest of the team to make sure our perceptions line up. On the one hand, I may not have noticed important and necessary work. On the other, I don’t want to inflict help on the project, nor would it be respectful or wise for me to usurp anyone else’s role. Shifting positions to adapt to a changing situation can be a lot easier when roles help to frame where we’re coming from, where we are, and where we’re going.

    A role is not a full-body tattoo, permanently inscribed on me, difficult and painful to remove. A role is not a straitjacket. I wouldn’t volunteer to wear a straitjacket, and I’ll resist if someone tries to put me into one. As Kent Beck has said, “Responsibility cannot be assigned; it can only be accepted. If someone tries to give you responsibility, only you can decide if you are responsible or if you aren’t.” (from Extreme Programming Explained: Embrace Change) I also (metaphorically) study escape artistry in the unlikely event that someone manages to constrain me. When I adopt a role, I must do so voluntarily, understanding the commitment I’m making and believing that I can perform it well—or learn it in a hurry. I might temporarily adopt a third-order role normally taken by someone else, but in the long run, I can’t commit to a role without full and ongoing understanding, agreement, and consent between me and my clients. If I resist accepting a role, I don’t do so capriciously or arbitrarily, but for deeply practical reasons related to three important problems.

    The Expertise Problem. I’m willing to do or to learn almost anything, but there is often work for which I may be incompetent, unprepared or underqualified. Each set of tasks in software development requires a significant and distinct set of skills which must be learned and practiced if they are to be performed expertly. I don’t want to fool my client or my team into believing that the work will be done well until I’m capable, so I’ll push back on working in certain roles unless my client is willing to accept the attendant risks.

    For example, becoming an expert programmer takes years of focused study, experience, and determination. As Collins and Evans suggest, real expertise requires not only skill, but also ongoing maintenance; immersion in a way of life. James Bach remarked to me recently, “The only reason that I’m not an expert programmer now is that I haven’t tried it. I’ve been in the software business for thirty years, and if I had focused on programming, I’d be a kick-ass programmer by now. But I chose to be a tester instead.” I feel the same way. Programming is a valuable means to an end for me—it helps me get certain kinds of testing work done. I can be a quite capable programmer when I put my mind to it, but I find I have to do programming constantly—almost obsessively—to maintain my skills to my own standards. (These days, if I were asked to do any kind of production programming—even minor changes to the code—I would insist on both close collaboration with peers and careful review by an expert.) I believe I can perform competently, adequately, eventually, in any role. Yet competence and adequacy aren’t enough when I aspire to achieving excellence and mastery. At a certain point in my life, I decided to focus my time and energy on testing and the teaching of it; the testing and teaching roles are the ones that attract me most. Their skills are the ones that I am most interested in trying to master—just as others are focused on mastering programming skills. So: roles represent a heuristic for focusing my development of expertise, and for distributing expertise around the team.

    The Mindset Problem. Building a product demands a certain mindset; testing it deeply demands another. When I’m programming or writing (as I’m doing now), I tend to be in the builder’s mindset. As such, I’m at close “critical distance” to the work. I’m seeing it from the position of an insider—me—rather than as an outsider. It’s relatively easy for me to perform shallow testing and spot coding errors, or spelling and grammatical mistakes—although after I’ve been looking at the work for a while, I may start to miss those as well. It’s quite a bit harder for me to notice deeper structural or thematic problems, because I’ve invested time and energy in building the piece as I have, converging towards something I believe that I want. To see deeper problems, I need the greater critical distance that’s available in the tester’s mindset—what testers or editors do. It’s not a trivial matter to switch between mindsets, especially with respect to one’s own work. Switching mindsets is not impossible, but shifting from building into good critical and analytical work is effortful and time-consuming, and messes with the flow.

    One heuristic for identifying deep problems in my writing work would be to walk away from writing—from the builder’s mindset—and come back later with the tester’s mindset—just as I’ve done several times with this essay. However, the change in mindset takes time, and even after days or weeks, part of me remains in the writer’s mindset—because it’s my writing. Similarly, a programmer in the flow of developing a product may find it disruptive—both logistically and intellectually—to switch mindsets and start looking for problems. In fact, the required effort likely explains a good deal of some programmers’ stated reluctance to do deep testing on their own.

    So another useful heuristic is for the builder to show the work to other people. As they are different people, other builders naturally have critical distance, but that distance gets emphasized when they agree to take on a testing role. I’ve done that with this article too, by enlisting helpers—other writers who adopt the roles of editors and reviewers. A reviewer might usually identify herself as a writer, just as someone in a testing role might normally identify as a programmer. Yet temporarily adopting a reviewer’s role and a testing mindset frames the approach to the task at hand—finding important problems in the work that are harder to see quickly from the builder’s mindset. In publishing, some people by inclination, experience, training, and skills specialize in editing, rather than writing. The editing role is analogous to that of the dedicated tester—someone who remains consistently in the tester’s mindset, at even farther critical distance from the work than the builder-helpers are—more quickly and easily able to observe deep, rare, or subtle problems that builders might not notice.

    The Workspace Problem. Tasks in software development may require careful preparation, ongoing design, and day-to-day, long-term maintenance of environments and tools. Different jobs require different workspaces. Programmers, in the building role, set up their environments and tools to do development and building work most simply and efficiently. Setting up a test lab for all of its different purposes—investigation of problems from the field; testing for adaptability and platform support; benchmarking for performance—takes time and focus away from valuable development tasks. The testing role provides a heuristic for distributing and organizing the work of maintaining the test lab.

    People sometimes say “on an Agile project, everybody does everything” or “there are no roles on an Agile project”. To me, that’s like saying that there is no particular focusing heuristic for the services that people offer; throwing out the baby of skill with the bathwater of overspecialization and isolation. Indeed, “everybody doing everything” seems to run counter to another idea important to Agile development: expertise and craftsmanship. A successful team is one in which people with diversified skills, interests, temperaments, and experiences work together to produce something that they could not have produced individually. Roles are powerful heuristics for helping to organize and structure the relationships between those people. Even though I’m willing to do anything, I can serve the project best in the testing role, just as others serve the project best in the developer role.

    That’s the end of the article. However, my colleague James Bach offered these observations on roles, which were included as a sidebar to the article in the magazine.

    A role is probably not:

    • a declaration of the only things you are allowed to do. (It is neither a prison cell nor a destiny from which escape is not possible.)
    • a declaration of the things that you and you only are allowed to do. (It is not a fortress that prevents entry from anyone outside.)
    • a one-size, exclusive, permanent, or generic structure.

    A role is:

    • a declaration of what one can be relied upon to do; a promise to perform a service or services well. (Some of those services may be explicit; others are tacit.)
    • a unifying idea serving to focus commitment, preparation, performance, and delivery of services.
    • a heuristic for helping people manage their time on a project, and to be able to determine spontaneously who to approach, consult with, or make requests to (or sometimes avoid), in order to get things done.
    • a heuristic for fostering personal engagement and responsibility.
    • a heuristic for defining or explaining the meaning of your work.
    • a flexible and non-exclusive structure that may exist over a span of moments or years.
    • a label that represents these things.
    • a voluntary commitment.

    A role may or may not be:

    • an identity
    • a component of identity.

    —James Bach

    Very Short Blog Posts (26): You Don’t Need Acceptance Criteria to Test

    Tuesday, February 24th, 2015

    You do not need acceptance criteria to test.

    Reporters do not need acceptance criteria to investigate and report stories; scientists do not need acceptance criteria to study and learn about things; and you do not need acceptance criteria to explore something, to experiment with it, to learn about it, or to provide a description of it.

    You could use explicit acceptance criteria as a focusing heuristic, to help direct your attention toward specific things that matter to your clients; that’s fine. You might choose to use explicit acceptance criteria as claims, oracles that help you to recognize a problem that happens as you test; that’s fine too. But there are many other ways to identify problems; quality criteria may be tacit, not explicit; and you may discover many problems that explicit acceptance criteria don’t cover.

    You don’t need acceptance criteria to decide whether something is acceptable or unacceptable. As a tester you don’t have decision-making authority over acceptability anyway. You might use acceptance criteria to inform your testing, and to identify threats to the value of the product. But you don’t need acceptance criteria to test.

    Very Short Blog Posts (24): You Are Not a Bureaucrat

    Saturday, February 7th, 2015

    Here’s a pattern I see fairly often at the end of bug reports:

    Expected: “Total” field should update and display correct result.
    Actual: “Total” field updates and displays incorrect result.

    Come on. When you write a report like that, can you blame people for thinking you’re a little slow? Or that you’re a bureaucrat, and that testing work is mindless paperwork and form-filling? Or perhaps that you’re being condescending?

    It is absolutely important that you describe a problem in your bug report, and how to observe that problem. In the end, a bug is an inconsistency between a desired state and an observed state; between what we want and what we’ve got. It’s very important to identify the nature of that inconsistency; oracles are our means of recognizing and describing problems. But in the relationship between your observation and the desired state, the expectation is the middleman. Your expectation is grounded in a principle based on some desirable consistency. If you need to make that principle explicit, leave out the expectation, and go directly for a good oracle instead.

    When Programmers (and Testers) Do Their Jobs

    Monday, December 22nd, 2014

    For a long time, I’ve admired Robert (“Uncle Bob”) Martin’s persistent advocacy of craftsmanship in programming and software development. Recently on Twitter, he said

    One of the most important tasks in the testing role is to identify alternative interpretations of apparently clear and simple statements. Uncle Bob’s statement appears clear and simple, but as with any sentence that can be read by a human, it affords multiple interpretations. One interpretation might be that “when programmers do their jobs, testers find nothing and therefore have nothing useful to contribute”. I’m pretty sure Uncle Bob didn’t mean to say that, although it seems that at least one of my colleagues might have taken that interpretation. I prefer to think Uncle Bob’s intention was to remind programmers to take responsibility for the integrity and quality of their work, and not to slight testers.

    As a tester, part of my job is to help reduce the chance that statements could be misinterpreted or taken in an overly simplistic way. I think Uncle Bob probably meant the first item on this list of a few possible interpretations (and I hope he’d agree with the other ones that I offer here, too):

    • When programmers do their jobs, testers find nothing that takes the form of blatant coding errors.
    • When programmers do their jobs, testers find nothing inconsistent with what the programmers have been asked to do—although the testers might discover problems in the design or the requirements that were given to the programmers to implement.
    • When programmers do their jobs, testers find nothing that indicates the programmer has been negligent or sloppy, although even the best programmers are not perfect.
    • When programmers do their jobs, testers find nothing that makes the product hard to test; instead, they receive a highly testable product that provides access to things like log files and testable interfaces.
    • When programmers do their jobs, testers find nothing problematic, although they might discover unanticipated value in the product.
    • When programmers do their jobs, testers find nothing that interferes with deep testing—looking for rare, hidden, subtle, or platform-related problems that could escape even the most diligent programmers.
    • When programmers do their jobs, testers find nothing that slows them down in developing a more comprehensive understanding of the business needs, making their testing more relevant.
    • When programmers do their jobs, testers find nothing that takes time away from developing rich test ideas, scenarios, and experiments that yield a deep understanding of the product and its emergent behaviours.
    • When programmers do their jobs, testers find nothing more to ask for in terms of useful tools that would aid testing.

    In the same thread, James Bach pointed out that even when programmers do their jobs, testers find that the product is doing its job, and that testers find important truths about the product. Neither of these is exactly “nothing”. So…

    • When programmers do their jobs, testers shine light on exactly how well the programmers have done their jobs.
    • When programmers do their jobs, testers identify ways in which other people might have different interpretations of a job well done.
    • When programmers do their jobs, testers have more time to compare our product with competitors’ products, pointing out areas of strength and weakness in each one.

    Programmers are also in the business of clearing up misinterpretations. I posted a simpler version of one of the ideas above on Twitter:

    “When programmers do their jobs, testers find deep, rare, hidden, subtle, or platform-related problems.”

    That sentence was constrained by Twitter’s 140-character limit, and further by the Twitter handles of a couple of addressees to whom I was responding. Ron Jeffries, on a mission similar to mine, pointed out that some testers find deep, rare, hidden, subtle, or platform-related problems. I agree with Ron, and I’ll add that even the best testers, just like the best developers, are human, and limited, and can occasionally miss problems. So:

    • Testers (and programmers) who focus on excellence, craftsmanship, skill, and collaboration will help each other, and will tend to find problems that can be addressed before the product is released—and will tend to produce more valuable products as a result.

    Very Short Blog Posts (20): More About Testability

    Monday, July 14th, 2014

    A few weeks ago, I posted a Very Short Blog Post on the bare-bones basics of testability. Today, I saw a very good post from Adam Knight talking about telling the testability story. Adam focused, as I did, on intrinsic testability: things in the product itself that make it more testable. But testability isn’t just a product attribute. In Heuristics of Testability (material we developed in a session of Rapid Software Testing Intensive Online), James Bach shows that testability is a set of relationships between product (“intrinsic testability”); project (“project-related testability”); tester (“subjective testability”); what we want from the product (“value-related testability”); and how we know what we know and what we need to know (“epistemic testability”).

    Be sure of this: anything that makes testing harder or slower gives bugs more time or more opportunities to hide. In telling an expert and compelling story of our testing, it’s essential to identify and address things that make it harder to understand the product we’ve got—things that help to increase the risk that it won’t be the product our clients want.

    Harry Collins and The Motive for Distinctions

    Monday, March 3rd, 2014

    “Computers and their software are two things. As collections of interacting cogs they must be ‘checked’ to make sure there are no missing teeth and the wheels spin together nicely. Machines are also ‘social prostheses’, fitting into social life where a human once fitted. It is a characteristic of medical prostheses, like replacement hearts, that they do not do exactly the same job as the thing they replace; the surrounding body compensates.

    “Contemporary computers cannot do just the same thing as humans because they do not fit into society as humans do, so the surrounding society must compensate for the way the computer fails to reproduce what it replaces. This means that a complex judgment is needed to test whether software fits well enough for the surrounding humans to happily ‘repair’ the differences between humans and machines. This is much more than a matter of deciding whether the cogs spin right.”

    —Harry Collins

    Harry Collins—sociologist of science, author, professor at Cardiff University, a researcher in the fields of the public understanding of science, the nature of expertise, and artificial intelligence—was slated to give a keynote speech at EuroSTAR 2013. Due to illness, he was unable to do so. The quote above is the abstract from the talk that Harry never gave. (The EuroSTAR community was very lucky and grateful to have his colleague, Rob Evans, step in at the last minute with his own terrific presentation.)

    Since I was directed to Harry’s work in 2010 (thank you, Simon Schaffer), James Bach and I have been galvanized by it. As we’ve been trying to remind people for years, software testing is a complex, cognitive, social task that requires skill, tacit knowledge, and many kinds of expertise if we want people to do it well. Yet explaining testing is tricky, precisely because so much of what skilled testers do is tacit, and not explicit; learned by practice and by immersion in a culture, not from documents or other artifacts; not only mechanical and algorithmic, but heuristic and social.

    Harry helps us by taking a scalpel to concepts and ideas that many people consider obvious or unimportant, and dissecting those ideas to reveal the subtle and crucial details under the surface. As an example, in Tacit and Explicit Knowledge, he takes the idea of tacit knowledge—formerly, any kind of knowledge that was not told—and divides it into three kinds: relational, the kind of knowledge that resides in an individual human mind, and that in general could be told; somatic, resident in the system of a human body and a human mind; and collective, residing in society and in the ever-changing relationships between people in a culture.

    How does that matter? Consider the Google car. On the surface, operating a car looks like a straightforward activity, easily made explicit in terms of the laws of physics and the rules of the road. Look deeper, and you’ll realize that driving is a social activity, and that interaction among drivers, cyclists, and pedestrians is negotiated in real time, in different ways, all over the world. So we’ve got Google cars on the road experimentally in California and Washington; how will they do in Beijing, in Bangalore, or in Rome? How will they interact with human drivers in each society? How will they know, as human drivers do, the extent to which it is socially acceptable to bend the rules, and socially unacceptable not to bend them? In many respects, machinery can do far better than humans in the mechanical aspects of driving. Yet testing the Google car will require far more than unit checks or a Cucumber suite. It will require complex evaluation and judgement by human testers to see whether the machinery (with no awareness or understanding of social interactions, for the foreseeable future) can be accommodated by the surrounding culture. That will require a shift from the way testing is done at Google according to some popular stories. If you want to find problems that matter to people before inflicting your product on them, you must test the product not only in isolation, but also in its relationships with people.

    Our goal, all the way along, has been to probe into the nature of testing and the way we talk about it, with the intention of empowering people to do it well. Part of this task involves taking relational tacit knowledge and making it explicit. Another part involves realizing that certain skills cannot be transferred by books or diagrams or video tutorials, but must be learned through experience and immersion in the task. Rather than hand-waving about “intuition” and “error guessing”, we’d prefer to talk about and study specific, observable, trainable, and manageable skills. We could talk about “test automation” as though it were a single subject, but it’s more helpful to distinguish the many ways that we could use tools to support and amplify our testing—for checking specific facts or states, for generating data, for visualization, for modeling, for coverage analysis… Instead of talking about “automated testing” as though machines and people were capable of the same things, we’d rather distinguish between checking (something that machines can do, an activity embedded in testing) and testing (which requires humans), so as to make both our checking and our testing more powerful.

    The abstract for Prof. Collins’ talk, quoted above, is an astute, concise description of why skilled testing matters. It’s also why the distinction between testing and checking matters. For that, we are grateful.

    There will be much more to come in these pages relating Harry’s work to our craft of testing; stay tuned. Meanwhile, I give his books my highest recommendation.

    Tacit and Explicit Knowledge
    Rethinking Expertise (co-authored with Rob Evans)
    The Shape of Actions: What Humans and Machines Can Do (co-authored with Martin Kusch)
    The Golem: What You Should Know About Science (co-authored with Trevor Pinch)
    The Golem at Large: What You Should Know About Technology (co-authored with Trevor Pinch)
    Changing Order: Replication and Induction in Scientific Practice
    Artificial Experts: Social Knowledge and Intelligent Machines

    Very Short Blog Posts (11): Passing Test Cases

    Wednesday, January 29th, 2014

    Testing is not about making sure that test cases pass. It’s about using any means to find problems that harm or annoy people. Testing involves far more than checking to see that the program returns a functionally correct result from a calculation. Testing means putting something to the test, investigating and learning about it through experimentation, interaction, and challenge. Yes, tools may help in important ways, but the point is to discover how the product serves human purposes, and how it might miss the mark. So a skilled tester does not ask simply “Does this check pass or fail?” Instead, the skilled tester probes the product and asks a much richer and more fundamental question: Is there a problem here?
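
    To make the gap between “the check passes” and “there is no problem here” a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and invented for illustration: format_total stands in for some product code that displays a total, and check_total is one check that a programmer might reasonably write against it.

```python
# Hypothetical product code, invented for this illustration:
# return the sum of some prices as a string for display.
def format_total(prices):
    return str(sum(prices))


# A check: one specific, machine-decidable fact about the output.
# It asks only whether the displayed value is numerically equal
# to the sum that was computed.
def check_total():
    return float(format_total([0.10, 0.20])) == 0.10 + 0.20


if __name__ == "__main__":
    print("check passes:", check_total())
    print("what a customer would see:", format_total([0.10, 0.20]))
```

    The check returns green, because the displayed value is numerically equal to the computed sum. A person looking at the screen would see a total of 0.30000000000000004, and might well decide that there is a problem here. The check was never asked that question.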

    Very Short Blog Posts (10): Planning and Preparation

    Wednesday, January 15th, 2014

    A plan is not a document. A plan is a set of ideas that may be represented by a document or by other kinds of artifacts. In Rapid Testing, we emphasize preparing your mind, your skills, and your tools, and sharpening them all as you go. We don’t reject planning, but we de-emphasize it in favour of preparation. We also recommend that you keep the artifacts that represent your plans as concise and as flexible as they can reasonably be.

    The world of technology is complex and constantly changing. If you’re prepared, you have a much better chance of adapting and reacting appropriately to a situation when the plans have gone awry. But all the planning in the world can’t help you if you’re not prepared.

    Very Short Blog Posts (7): Planning vs. Preparation

    Sunday, November 3rd, 2013

    Imagine a software project. Imagine the things that you want to accomplish, the problems you might encounter, the workarounds you could apply, the accidents (both happy and sad) that might happen, the missteps you may take, the steps you can take to prevent them; all of the actions you can perform to manage the project. Now, make a detailed plan that takes all of your expectations into account.

    The more detailed your plan, the more likely it is to differ from reality in important respects. Unexpected things will happen, some positive, some negative, and many of them out of your control. You can’t predict future events reliably, but you can prepare to respond to them. Therefore: you might want to relax your effort on specific plans somewhat, and emphasize developing skills and resources that will help you to deal capably with surprises.