“Computers and their software are two things. As collections of interacting cogs they must be ‘checked’ to make sure there are no missing teeth and the wheels spin together nicely. Machines are also ‘social prostheses’, fitting into social life where a human once fitted. It is a characteristic of medical prostheses, like replacement hearts, that they do not do exactly the same job as the thing they replace; the surrounding body compensates.
“Contemporary computers cannot do just the same thing as humans because they do not fit into society as humans do, so the surrounding society must compensate for the way the computer fails to reproduce what it replaces. This means that a complex judgment is needed to test whether software fits well enough for the surrounding humans to happily ‘repair’ the differences between humans and machines. This is much more than a matter of deciding whether the cogs spin right.”
Harry Collins—sociologist of science, author, professor at Cardiff University, a researcher in the fields of the public understanding of science, the nature of expertise, and artificial intelligence—was slated to give a keynote speech at EuroSTAR 2013. Due to illness, he was unable to do so. The quote above is the abstract from the talk that Harry never gave. (The EuroSTAR community was very lucky and grateful to have his colleague, Rob Evans, step in at the last minute with his own terrific presentation.)
Since I was directed to Harry’s work in 2010 (thank you, Simon Schaffer), James Bach and I have been galvanized by it. As we’ve been trying to remind people for years, software testing is a complex, cognitive, social task that requires skill, tacit knowledge, and many kinds of expertise if we want people to do it well. Yet explaining testing is tricky, precisely because so much of what skilled testers do is tacit, and not explicit; learned by practice and by immersion in a culture, not from documents or other artifacts; not only mechanical and algorithmic, but heuristic and social.
Harry helps us by taking a scalpel to concepts and ideas that many people consider obvious or unimportant, and dissecting those ideas to reveal the subtle and crucial details under the surface. As an example, in Tacit and Explicit Knowledge, he divides tacit knowledge—formerly, any kind of knowledge that was not told—and divided it into three kinds: relational, the kind of knowledge that resides in an individual human mind, and that general could be told; somatic, resident in the system of a human body and a human mind; and collective, residing in society and in the ever-changing relationships between people in a culture.
How does that matter? Consider the Google car. On the surface, operating a car looks like a straightforward activity, easily made explicit in terms of the laws of physics and the rules of the road. Look deeper, and you’ll realize that driving is a social activity, and that interaction between drivers, cyclists, and other pedestrians is negotiated in real time, in different ways, all over the world. So we’ve got Google cars on the road experimentally in California and Washington; how will they do in Beijing, in Bangalore, or in Rome? How will they interact with human drivers in each society? How will they know, as human drivers do, the extent to which it is socially acceptable to bend the rules—and socially unacceptable not to bend them? In many respects, machinery can do far better than humans in the mechanical aspects of driving. Yet testing the Google car will require far more than unit checks or a Cucumber suite—it will require complex evaluation and judgement by human testers to see whether the machinery—with no awareness or understanding of social interactions, for the foreseeable future—can be accommodated by the surrounding culture. That will require a shift from the way testing is done at Google according to some popular stories. If you want to find problems that matter to people before inflicting your product on them, you must test—not only the product in isolation, but in its relationships with other people.
Our goal, all the way along, has been to probe into the nature of testing and the way we talk about it, with the intention of empowering people to do it well. Part of this task involves taking relational tacit knowledge and making it explicit. Another part involves realizing that certain skills cannot be transferred by books or diagrams or video tutorials, but must be learned through experience and immersion in the task. Rather than hand-waving about “intuition” and “error guessing”, we’d prefer to talk about and study specific, observable, trainable, and manageable skills. We could talk about “test automation” as though it were a single subject, but it’s more helpful to distinguish the many ways that we could use tools to support and amplify our testing—for checking specific facts or states, for generating data, for visualization, for modeling, for coverage analysis… Instead of talking about “automated testing” as though machines and people were capable of the same things, we’d rather distinguish between checking (something that machines can do, an activity embedded in testing) and testing (which requires humans), so as to make both our checking and our testing more powerful.
The abstract for Prof. Collins’ talk, quoted above, is an astute, concise description of why skilled testing matters. It’s also why the distinction between testing and checking matters, too. For that, we are grateful.
There will be much more to come in these pages relating Harry’s work to our craft of testing; stay tuned. Meanwhile, I give his books my highest recommendation.
Tacit and Explicit Knowledge
Rethinking Expertise (co-authored with Rob Evan)
The Shape of Actions: What Humans and Machines Can Do (co-authored with Martin Kusch)
The Golem: What You Should Know About Science (co-authored with Trevor Pinch)
The Golem at Large: What You Should Know About Technology (co-authored with Trevor Pinch)
Changing Order: Replication and Induction in Scientific Practice
Artificial Experts: Social Knowledge and Intelligent Machines