Blog Posts from July, 2009

Three Kinds of Measurement and Two Ways to Use Them

Wednesday, July 22nd, 2009

In the testing business, we’ve been wrestling with the measurement problem for quite a while. I think there are two prongs to the problem. The first is the aphorism that “you can’t control what you can’t measure”. The second is the confusion between measurement (which can be either quantitative or qualitative) and metrics, which are mathematical functions of measurements, and which are therefore quantitative, and only quantitative.

I don’t know if you can’t control something that you can’t measure, but you can certainly make responsible, defensible choices to control things based on non-quantitative measures. For example, I’m hungry right now, and the non-bald parts of my head are a little shaggy. I’m not really comfortable with the keyboard on my new ThinkPad, but I like the display even though the default fonts seem to be a little on the small side for an astigmatic guy approaching his 50s. I can measure and manage all of these things without applying numbers.

I’m going to go grab a bite after I’ve finished this note; I’ll get my wife to give me a haircut before she heads out on the canoe trip, and I’ll trim my beard on my own. I can’t do much about the keyboard, although I can measure it by saying that I liked my old machine’s keys better. And I can grow the fonts in the browser by pressing Ctrl-+ until I’m happy again. In each case, I’m measuring to manage just the effects that I want, even though I’m doing it without quantitative measures. (Thanks to Matt Heusser for pointing out the haircut example to me; and thanks to Cem Kaner for pointing out the significance of the fact that I griped about the keyboard before complimenting the display.)

Apropos of all this, another of my Test Connection columns has been posted on StickyMinds. This one is about measurement and metrics, and the way that people use and confuse them. You can read it by clicking here.

I’m grateful for the guidance and compliments given to me by Jerry Weinberg on this one.

I’m also delighted by the appearance of a recent article by Tom DeMarco in IEEE Computer, in which he re-evaluates his thoughts on metrics as expressed in his early and influential book, Controlling Software Projects: Management, Measurement, and Estimation (Prentice Hall/Yourdon Press, 1982). He also questions his thoughts on software engineering, as evinced by the title of the piece, “Software Engineering: An Idea Whose Time Has Come and Gone?”. It’s brilliant, and high time that someone of Mr. DeMarco’s stature raised these questions. You can read the article here.


Monday, July 6th, 2009

On Twitter, Kindly Reader @jrl7 (in real life, John Lambert at Microsoft) asks “Is there an example of testability that doesn’t involve improving ability to automate? (improved specs?)”.

(Update, June 5 2014: For a fast and updated answer, see Heuristics of Software Testability.)

Yup. If testing is questioning a product in order to evaluate it, then testability is anything that makes it easier to question or evaluate that product. So testability is anything that makes the program faster or easier to test on some level. Anything that slows down testing or makes it harder reduces testability, which gives bugs an opportunity to hide for longer, or to conceal themselves better.

To me, testability is enabled, extended, enhanced, accelerated or intensified by initiating or improving on some of the things below, either on their own or in combination with others. That suggests that testability ideas are media, in the McLuhan sense. Thus each idea comes with a cautionary note.  As McLuhan pointed out, when a medium is stretched beyond its original or intended capacities, it reverses into the opposite of the intended effect. So the following ideas are heuristic, which means that any of these things could help, but might fail to help or might make things worse if considered or applied automatically or unwisely.

In accordance with Joel Spolsky’s Law of Leaky Abstractions, I’ve classified them into a few leaky categories.

Product Elements

  • Scriptable interfaces to the product, so that we can drive it more easily with automation.
  • Logging of inputs, outputs, or activities within the program. Structure helps; time stamps help; a variety of levels of detail help.
  • Real-time monitoring of the internals of the application via another window, a debug port, or output over the network—anything like that. Printers, for example, often come with displays that can tell us something about what’s going on.
  • Internal consistency checks within the program. For example, if functions depend on network connectivity, and the connection is lost, the application can let us know that instead of simply failing a function.
  • Overall simplicity and modularity of the application, especially in the separation of user interface code from program code. This one needs to be balanced with the knowledge that simpler modules tend to mean more numerous modules, which leads to growth in the number of interfaces, which in turn may mean more interface problems. There are no free lunches, but maybe there are less expensive lunches. Note also that simplicity and complexity are not attributes of a program; they’re relationships between the program and the person observing it. What looks horribly complex to a tester might look simple and elegant to a programmer; what looks frightfully complex to the programmer might look straightforward to a marketer.
  • Use of resource files for localization support, rather than hard-coding of location-dependent strings, dialogs, currencies, time formats, and the like.
  • Readable and maintainable code, thanks to pairing or other forms of technical review, and to refactoring.
  • Restraint in platform support. Very generally, the fewer computers or operating systems or browsers or application framework versions or third-party library versions that we have to support, the easier the testing job will be.
  • Restraint in feature support. Very generally, and all other things being equal, the more features, the longer it takes to test a program.
  • Finally, but perhaps most importantly, an application that’s in good shape before the testers get to it.  That can be achieved, at least to some degree, by diligent testing by programmers.  That testing can be based on unit tests or (perhaps better yet) a test-first approach such as test- or behaviour-driven development.  Why does this kind of testing make a program more testable? If we’re investigating and reporting bugs that we find or (worse) spending time trying to work around blocking bugs, we slow down, and we’re compromised in our ability to obtain test coverage.
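The logging idea above can be sketched briefly. This is a minimal illustration, not a prescription; the logger name and message content are hypothetical placeholders, but it shows the qualities that help testers: time stamps, severity levels, and a consistent, parseable message structure.

```python
import io
import logging

# Configure a logger whose output carries a time stamp, a severity
# level, and a structured message -- the features that make logs
# useful when we're questioning a product.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s: %(message)s"))
logger = logging.getLogger("product.orders")   # hypothetical component name
logger.addHandler(handler)
logger.setLevel(logging.DEBUG)                 # "variety of levels of detail"

logger.debug("input received: order_id=%s qty=%d", "A-1001", 3)
logger.warning("inventory low for item %s", "A-1001")

log_text = stream.getvalue()
print(log_text)
```

A tester reading this output can tell what happened, when, at what level of severity, and in which component, without attaching a debugger.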


Usability

Things that make a program faster or easier to use tend to make it faster or easier to test, especially when testing at the user interface level. Any time you see a usability problem in an application, you may be seeing a testability problem too.

  • Ease of learning—that is, the extent to which the application allows the user to achieve expertise in its use.
  • Ease of use—that is, the extent to which the application supports the user in completing a task quickly and reliably.
  • Affordance—that is, the extent to which the application advertises its available features and functions to the user.
  • Clearer error and/or exception messages. This could include unique identifiers to help us to target specific points in the code, a notion of what the problem was, or which file was not found, thank you.
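The last point above can be made concrete with a small sketch. The error-code scheme and file path here are hypothetical, but the pattern is the point: a unique identifier to target the failure point in the code, a statement of the problem, and the specific file involved.

```python
# A sketch of an error message that supports testability: a unique
# identifier, a description of the problem, and the detail (which
# file was not found, thank you). "CFG-0042" is a made-up code.
class ConfigError(Exception):
    def __init__(self, code, problem, detail):
        self.code = code
        super().__init__(f"[{code}] {problem}: {detail}")

def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except FileNotFoundError:
        # Name the file, rather than reporting a bare "file not found".
        raise ConfigError("CFG-0042", "configuration file not found", path)

try:
    load_config("settings/missing.ini")
except ConfigError as e:
    msg = str(e)
    print(msg)  # [CFG-0042] configuration file not found: settings/missing.ini
```

With a message like this, a tester can report precisely, and a programmer can search the code for the identifier instead of guessing.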


Oracles

An oracle is a principle or mechanism by which we might recognize a problem. Information about how the system is intended to work is a source of oracles.

  • Better documentation. Note that, for testability purposes, “better” documentation doesn’t necessarily mean “more complete” or “more elaborate” or “thicker”. It might mean “more concise” or “more targeted towards testing” or “more diagrams and illustrations that allow us to visualize how things happen”.
  • Clear reference to relevant standards, and information as to when and why those standards might be relevant or not.
  • “Live oracles”—people who can help us in determining whether we’re seeing appropriate behaviour from the application, when that’s the most efficient mode of knowledge transfer. Programmers, business analysts, product owners, technical support people, end-users, more experienced testers—all are candidates for being live oracles.
  • Programs that give us a comparable result for some feature, function, or state within our product. Such programs may have been created within our organization or outside; they may have been created for testing or for some other purpose; they may be products that are competitors to our own.
  • Availability of old versions is a special case of the comparable program heuristic. Having an old version of a product around for comparison may help to make the current version of our program easier to test.
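The comparable-program idea above can be sketched in a few lines. The "product" function here is a stand-in for a real feature under test; the oracle is an independent implementation that should agree with it. Disagreement doesn't tell us which side is wrong, but it flags something worth investigating.

```python
def product_sort(items):
    # Stand-in for the feature under test.
    return sorted(items)

def oracle_sort(items):
    # Independent, comparable implementation (insertion sort),
    # deliberately written a different way from the product.
    result = list(items)
    for i in range(1, len(result)):
        j = i
        while j > 0 and result[j - 1] > result[j]:
            result[j - 1], result[j] = result[j], result[j - 1]
            j -= 1
    return result

cases = [[3, 1, 2], [], [5, 5, 1], [2, -7, 0, 2]]
for case in cases:
    assert product_sort(case) == oracle_sort(case), case
print("all comparisons agree")
```

An old version of the product, or a competitor's product, can play the oracle role in exactly the same way.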

Equipment and Tools

  • Access to existing ad hoc (in the sense of “purpose-built”, not sloppy) test tools, and help in creating them where needed. Note that a test tool is not merely a program that probes the application under test. It might be a data-generation tool, an oracle program that supplies us with a comparable result, or a program that allows us to set up a new platform with minimal fuss.
  • Availability of test environments. In big organizations and on big projects, I’ve never worked with a test organization that believed it had sufficient isolated platforms for testing.

Build, Setup, and Configuration

  • More rapid building and integration of the product, including automated smoke tests that help us to determine if the program has been built correctly.
  • Simpler or accelerated setup of the application.
  • The ability to change settings or configuration of the application on the fly.
  • Access to source control logs helps us to identify where a newly-discovered problem or a regression might have crept into the code.
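The smoke-test idea above might look something like this. The checks are placeholders (the standard json module stands in for the product); real smoke tests would exercise your own entry points, but the shape is the same: a few fast checks that tell us whether the build is worth deeper testing.

```python
import json

def smoke_checks():
    # A handful of quick checks run right after a build. Each answers
    # "was the program built correctly?" rather than "is it good?"
    results = {}
    # The core module exposes what we expect at all.
    results["loads"] = hasattr(json, "dumps")
    # A basic round trip behaves sanely.
    results["round_trip"] = json.loads(json.dumps({"ok": 1})) == {"ok": 1}
    return results

results = smoke_checks()
for name, passed in results.items():
    print(f"{name}: {'PASS' if passed else 'FAIL'}")
```

If any check fails, there's no point in deeper testing until the build is repaired; that's the testability payoff.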

Project and Process

  • Availability of modules separately for early testing, especially at the integration level.
  • Information about what has already been tested, so we can leverage the information or avoid repeating someone else’s efforts.
  • Access to source code for those of us who can read and interpret it.
  • Proximity of testers, programmers, and other members of the project community.
  • Project community support for testing. Testing is often made much longer and more complicated by constraints imposed by forces that are external to the project. IT managers, for good reasons of their own, are often reluctant to grant autonomy to test groups.
  • Tester skill is inter-related with testability. It might not make sense to put a scriptable interface into your product if you’re not going to use it yourself and you don’t anticipate your testers having the skill to use it either. That might sound undesirable, yet over the years much great software has been produced without test automation assistance. Still, it’s usually worthwhile to have at least some members of the test team skilled in automation, and to give them a program for which those skills are useful.
  • Stability, or an absence of turbulence, both in the product and in the team that’s producing it. Things that are changing all the time are usually harder to test.

Want more ideas? Have a look at James Bach’s Heuristics of Testability. But in general, ask yourself, “What’s slowing me down in my ability to test this product, and how might I solve that problem?”

Postscript: Bret Pettichord contacted me with a link to this paper, at the end of which he surveys several different definitions of testability.