Resources on Exploratory Testing, Metrics, and Other Stuff
Here are some resources on the Web that I've either written, found very useful, or both. I'm constantly referring people to the writings and resources on this list.
Evolving Understanding of Exploratory Testing
My community defines exploratory testing as a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to optimize the quality of his or her work by treating test design, test execution, test interpretation, and test-related learning as mutually supportive activities that continue in parallel throughout the project.
Yes, that's quite a mouthful. It was synthesized by Cem Kaner in 2006, based upon discussions at the Exploratory Testing Research Summit and the Workshop on Heuristic and Exploratory Techniques.
Sometimes, when we want to save time, we refer to exploratory testing more concisely: "parallel test design, test execution, and learning". These definitions are not contradictory. The former is more explicit; the latter can be uttered more quickly, and is intended to incorporate the former. (We used to say "simultaneous", instead of "parallel", but people had trouble, apparently, with the idea that simultaneous activities don't have to be happening at equal intensity at every moment, so parallel seems like a better word these days.)
Exploratory approaches to testing live at one end of a continuum. Scripted testing—writing out a list of step-by-step actions to perform, each step paired with a specific condition to observe—is at the other end of this continuum. Scripted approaches are common in software testing. Typically tests are designed by a senior tester, and given to someone else—typically a more junior tester—to execute.
A number of colleagues and I have serious objections to the scripted approach. It is expensive and time-consuming, and seems likely to lead to inattentional blindness. It also separates test design from test execution and result interpretation, and thereby lengthens and weakens the learning and feedback loops that would otherwise support and strengthen them.
Irrespective of any other dimension of it, we call a test more exploratory and less scripted to the extent that
- elements of design, execution, interpretation, and learning are performed by the same person;
- the design, execution, interpretation, and learning happen together, rather than being separated in time;
- the tester is making her own choices about what to test, when to test it, and how to test it—the tester may use any automation or tools in support of her testing, or none at all, as she sees fit;
- everything that has been learned so far, including the result of the last test, informs the tester’s choices about the next test;
- the tester is focused on revealing new information, rather than confirming existing knowledge about the product;
- in general, the tester is varying aspects of her tests rather than repeating them, except where the repeating aspects of the test are intended to support the discovery of new information.
James Bach and I recorded a conversation on this subject, which you can listen to here. At the 2008 Conference for the Association for Software Testing, Cem Kaner presented a talk called The Value of Checklists and the Danger of Scripts: What Legal Training Suggests for Testers. You can read that here.
Some claim that exploratory testing is "unstructured", equating it with "ad hoc testing" or "fooling around with the computer". In our definition of exploratory testing, such claims are false and unsupportable, and we reject them. Some may say that they are doing "exploratory testing" when they are behaving in an unskilled, unprofessional manner, but we reject this characterization as damaging not only to exploratory testing, but to the reputation of testers and testing generally. If you are not using the learning garnered from test design and test execution in a continuous and rapid loop to optimize the quality of the work, you are not doing exploratory testing. If exploratory testing is "fooling around with the computer", then forensic investigation is "poking around inside a dead body".
"Ad hoc" presents an interesting problem, because those who equate "ad hoc" with "exploratory" not only misunderstand the latter, but misrepresent the former as meaning "sloppy", "slapdash", "unplanned", or "undocumented". "Ad hoc" means literally "to this", or "to the purpose". The Rogers Commission on the Challenger explosion was an ad hoc commission, but it wasn't the Rogers Sloppy Commission or the Rogers Slapdash Commission. The Commission planned its work and its report, and was thoroughly documented. The Rogers Commission was formed for a specific purpose, did its work, and was dissolved when its work was done. In that sense, all testing should be "ad hoc". But alas "ad hoc" and its original meaning parted company several years ago. Exploratory testing is certainly not ad hoc in its revised sense.
Structures of Exploratory Testing
Exploratory testing is not structured in the sense of following a prescribed, step-by-step list of instructions, since that's not what structure means. Structure, per the Oxford English Dictionary, means "the arrangement of and relations between the parts or elements of something complex". In this definition, there is no reference to sequencing or to lists of instructions to follow. So, just as education, nutrition, driving an automobile, juggling, child-rearing, and scientific revolutions are structured and have structure, exploratory testing is also structured. In fact, there are many structures associated with exploratory testing. What follows is an evolving list of lists of those structures:
- Evolving Work Products, Skills and Tactics, ET Polarities, and Test Strategy. James Bach, Jon Bach, and I have been working on the Exploratory Skills and Dynamics list for some time now. This is a kind of evolving master list of exploratory testing structures. James describes it here.
- Oracles. The HICCUPPS consistency heuristics, which James Bach initiated and which I wrote about in this article for Better Software in 2005. (Actually, at the time it was only HICCUPP: History, Image, Comparable Products, Claims, User Expectations, Purpose, Product. Since then we've also added S, for Standards and Statutes.) Mike Kelly also talks about HICCUPP here.
- Test Strategy. James Bach's Heuristic Test Strategy Model isn't restricted to exploratory approaches, but certainly helps to guide and structure them.
- Test Framing. Test framing can be seen as a retrospective view of test strategy. Test framing is a key skill that helps us to compose, edit, narrate, and justify the story of our testing in a logical, coherent, and rapid way. The goal of test framing is to link each testing activity with the testing mission via a direct line of logic that connects the mission to the tests. Test framing is described in detail here.
- Speculation. While this post refers to the overall development process, it makes some important points that are relevant to the nature of rapid, iterative, and exploratory testing. Jim Highsmith on "Don't Plan, Speculate".
- Data Type Attacks, Web Tests, Testing Wisdom, Heuristics, and Frameworks. Elisabeth Hendrickson's Test Heuristics Cheat Sheet is a rich set of guideword heuristics and helpful reference information.
- Context Factors, Information Objectives. Cem Kaner most recently delivered his Tutorial on Exploratory Testing for the QAI Quest Conference in Chicago, 2008. There's a similar, but not identical talk here. James Bach's Context Model is a diagram containing many guidewords to context.
- Quick Tests. In our Rapid Software Testing course, James Bach and I talk about quick tests. The course notes are available for free. Fire up Acrobat and search for "Quick Tests".
- Coverage (specific). Michael Hunter's You Are Not Done Yet is a detailed set of coverage ideas to help prompt further exploration when you think you're done.
- Coverage (general). I wrote a series of three articles for Better Software magazine in which I explored the concept of coverage. They're called Got You Covered, Cover or Discover, and A Map By Any Other Name. The summary I gave for the middle article makes the main point for the purpose of this discussion: excellent testing isn't just about covering the "map"—it's also about exploring the territory, which is the process by which we discover things that the map doesn't cover.
- Coverage (general). James Bach wrote this article in 2001, in which he summarizes test coverage ideas under the mnemonic "San Francisco Depot" (SFDPO): Structure, Function, Data, Platform, and Operations. Several years later, I convinced him to add an element to the list, so now it's "San Francisco DePOT" (SFDPOT). The last T is for...
- Time. I realized a few years ago that some guideword heuristics might help us to pay attention to the ways in which products relate to time, and vice versa. That turned into a Better Software article called "Time for New Test Ideas".
- Tours. Mike Kelly's FCC CUTS VIDS Touring Heuristics (note the date) provides a set of structured approaches for touring the application.
- The Value of Checklists and the Danger of Scripts. Cem Kaner's keynote presentation from the 2008 Conference for the Association for Software Testing describes experiments in overly patterned learning, and the ways in which the legal profession uses checklists as heuristic tools to guide decision-making.
- Stopping Heuristics. There are structures to deciding when to stop a given test, a line of investigation, or a test cycle. I catalogued them here, and Cem Kaner made a vital addition here.
- Accountability, Reporting Progress. James and Jon Bach's description of Session-Based Test Management is a set of structures for making exploratory testing sessions more accountable.
- Procedure. The General Functionality and Stability Test Procedure. It was designed for Microsoft in the late 1990s by James Bach, and may be the first documented procedure to guide exploratory test execution and investigation.
- Regression Testing. Karen Johnson provides a list of heuristic guidewords to lend structure to testing for regression in an exploratory way: RCRCRC, for Recent, Core, Risk, Configuration-sensitive, Repaired, and Chronic.
- The Role of Repetition. In the Rapid Software Testing class, James Bach and I teach heuristics on why you might choose to repeat a test, instead of running a new test. We note that repeated tests and regression tests may intersect, but they're orthogonal categories.
- Emotions. I gave a talk on emotions as powerful pointers to test oracles at STAR West in 2007. That helped to inspire some ideas about...
- Noticing, Observation. At STAR East 2009, I did a keynote talk on noticing, which can be important for exploratory test execution. The talk introduces a number of areas in which we might notice, and some patterns to sharpen noticing.
- Leadership. For the 2009 QAI Conference in Bangalore, India, I did a plenary talk in which I noted several important structural similarities between exploratory testing and leadership.
This is a blog posting that I wrote in September, 2008, summarizing some important points about exploratory testing.
The software development and testing business seems to have a very poor understanding of measurement theory and measurement-related pitfalls, so conversations about measurement are often frustrating for me. People assume that I don't like measurement of any kind. Not true; the issue is that I don't like bogus measurement, and there's an overwhelming amount of it out there.
I've written three articles that explain my position on the subject:
I agree with Jerry Weinberg's definition of measurement: the art and science of making reliable and significant observations. I'll suggest that anyone who wants to have a reasonable discussion with me on measurement should read and reflect deeply upon Software Engineering Metrics: What Do They Measure and How Do We Know? (Kaner and Bond).
This paper provides an excellent description of quantitative measurement, identifies dangerous measurement pitfalls, and suggests some helpful questions to avoid them. One key insight that this paper triggered for me: a metric is a measurement function, the model- and theory-driven operation by which we attach a number to an observation. Metrics are not measurement. A metric is a tool used in the practice of measurement, and to me using "measurement" and "metric" interchangeably highlights confusion. When someone from some organization says "we're establishing a metrics program", it's like a cabinet-maker saying "I'm establishing a hammering program."
Here are some more important references on measurement, both quantitative and qualitative, and on the risks of invalid measurement, distortion, and dysfunction:
- "The Dark Side of Software Metrics" (Doug Hoffman)
- "Meaningful Metrics" (Anna Allison)
- How to Lie With Statistics (Darrell Huff)
- Measuring and Managing Performance in Organizations (Robert D. Austin)
- Quality Software Management, Vol. 2: First Order Measurement (Gerald M. Weinberg)
- Why Does Software Cost So Much? (Tom DeMarco)
- Experimental and Quasi-Experimental Designs for Generalized Causal Inference (Shadish, Cook, & Campbell)
- Reliability and Validity in Qualitative Research (Kirk & Miller)
Show me measurements that have been thoughtfully conceived, reliably obtained, carefully and critically reviewed, and that avoid the problems identified in these works, and I'll buy into them. Otherwise I'll point out the risks, or recommend that they be trashed. As James Bach says, "Helping to mislead our clients is not a service that we offer."
Investigating Hard-To-Reproduce Bugs
Finding it hard to reproduce the circumstances in which you noticed a bug?
- Here's a set of suggestions in a post on Jerry Weinberg's blog.
- Here's another, on Jonathan Kohl's blog, and this is the article that followed the post.
- And here's a very thorough treatment of the issue from James Bach.
The Heuristic Test Strategy Model
This document by James Bach describes the test strategy model that is central to the practice of Rapid Testing.
Context-Driven Testing Explained
Cem Kaner and James Bach collaborated on a detailed description of context-driven testing, explaining it and contrasting it with other approaches.
A test matrix is a handy approach to organizing test ideas, tracking results, and visualizing test coverage. Read more here.
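For readers who want something concrete before following the link, here is a minimal sketch of a test matrix in Python. It is only one possible layout, not the one described in the linked article: the test ideas, configurations, and status names are made-up placeholders, and the helper functions (`new_matrix`, `record`, `summarize`, `render`) are hypothetical names chosen for this illustration.

```python
from collections import Counter

# Hypothetical statuses; a real matrix might track more (or different) states.
STATUSES = ("untested", "pass", "fail", "blocked")


def new_matrix(ideas, configs):
    """Create a matrix with one cell per (test idea, configuration) pair."""
    return {(idea, config): "untested" for idea in ideas for config in configs}


def record(matrix, idea, config, status):
    """Record the result of trying one test idea on one configuration."""
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    if (idea, config) not in matrix:
        raise KeyError(f"no such cell: {(idea, config)!r}")
    matrix[(idea, config)] = status


def summarize(matrix):
    """Count cells by status, as a rough picture of coverage so far."""
    return Counter(matrix.values())


def render(matrix, ideas, configs):
    """Lay the matrix out as a plain-text grid: ideas as rows, configs as columns."""
    width = max(len(s) for s in list(ideas) + list(configs) + list(STATUSES)) + 2
    header = "".ljust(width) + "".join(c.ljust(width) for c in configs)
    rows = [
        idea.ljust(width) + "".join(matrix[(idea, c)].ljust(width) for c in configs)
        for idea in ideas
    ]
    return "\n".join([header] + rows)


if __name__ == "__main__":
    # Made-up example data, purely for illustration.
    ideas = ["install", "open file", "save file", "print"]
    configs = ["Windows", "macOS", "Linux"]
    matrix = new_matrix(ideas, configs)
    record(matrix, "install", "Windows", "pass")
    record(matrix, "save file", "Linux", "fail")
    print(render(matrix, ideas, configs))
    print(dict(summarize(matrix)))
```

The grid makes untested cells visible at a glance, which is the point of the technique: it shows where testing has and has not yet gone. It says nothing about how deep or how good the testing in any cell was.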
Visual SourceSafe Defects
While developing a utility to migrate files from Visual SourceSafe (VSS) to another version control package, I had to test Visual SourceSafe itself. These tests demonstrated to me that VSS's file and database management is so defect-ridden as to present a danger to customers using the product in reasonable scenarios and circumstances. Although it's an older article (circa 2002), it did turn out to be an excellent example of rapid and exploratory testing approaches, and an example of the kind of test report that I would issue to a client. Your mileage may vary, but these are my findings.
A Review of Error Messages
Creating a good error message is challenging. On the one hand, it needs to be informative, to assist the user, and to suggest reasonable actions to mitigate the problem. On the other hand, it needs to avoid giving hackers and other disfavoured users the kind of information that they seek to compromise security or robustness. Here are some suggestions.
Pairwise Testing and Orthogonal Arrays
Pairwise and orthogonal array test techniques may allow us to obtain better test coverage—or maybe not. Over the years, I've changed my views on these techniques. I explain all-pairs and orthogonal arrays here, and I then include some tempering of the basic story—and some counter-tempering too.
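As a rough illustration of the all-pairs idea, here is a minimal greedy sketch in Python. It is not the method from the linked article, and it does not construct an orthogonal array; it simply keeps picking, from the full set of combinations, whichever test covers the most not-yet-covered value pairs. The function names and the example parameters (browser, OS, locale) are assumptions made up for this example.

```python
from itertools import combinations, product


def required_pairs(parameters):
    """Every pairing of a value of one parameter with a value of another."""
    pairs = set()
    for a, b in combinations(list(parameters), 2):
        for va, vb in product(parameters[a], parameters[b]):
            pairs.add(((a, va), (b, vb)))
    return pairs


def pairs_in(test, names):
    """The value pairs that a single test (a dict of name -> value) covers."""
    return {((a, test[a]), (b, test[b])) for a, b in combinations(names, 2)}


def all_pairs(parameters):
    """Greedily choose tests until every value pair appears in at least one test.

    Enumerates the full Cartesian product, so this is only practical for
    small models; it makes no claim to produce the smallest possible suite.
    """
    names = list(parameters)
    remaining = required_pairs(parameters)
    candidates = [dict(zip(names, values)) for values in product(*parameters.values())]
    suite = []
    while remaining:
        # Pick the candidate that covers the most still-uncovered pairs.
        best = max(candidates, key=lambda t: len(pairs_in(t, names) & remaining))
        suite.append(best)
        remaining -= pairs_in(best, names)
    return suite


if __name__ == "__main__":
    # Hypothetical model: 3 x 3 x 2 = 18 exhaustive combinations.
    model = {
        "browser": ["Firefox", "Chrome", "Safari"],
        "os": ["Windows", "macOS", "Linux"],
        "locale": ["en", "fr"],
    }
    suite = all_pairs(model)
    print(f"{len(suite)} pairwise tests instead of 18 exhaustive ones")
    for test in suite:
        print(test)
```

The reduction in test count is the appeal, and also the caveat: a suite like this guarantees that each pair of values appears together somewhere, and nothing more. Whether that amounts to better coverage depends on whether the chosen parameters and values model the risks that matter.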