Blog Posts from September, 2007

If a test passes in a forest, and no one sees it…

Wednesday, September 19th, 2007

Pradeep Soundararajan is a colleague of mine and of James Bach. Pradeep would say he’s a student, but in this case the student has surpassed the teacher. Pradeep writes and tests and thinks with passion. In a recent blog post, he came up with this gem:

“…it is not a test that finds a bug but it is a human that finds a bug and a test plays a role in helping the human find it.”

That’s very insightful. It puts the tester, rather than the test, at the centre of testing. It underscores the idea that we produce and perform tests with the intention of revealing information, but until some human observes and evaluates some outcome of the test, the test is silent. It also emphasizes that a test might provide us with the opportunity to observe one bug or several.

Bravo, Pradeep. I’ll be quoting that a lot.

The Most Useful Metrics

Monday, September 17th, 2007

A correspondent on LinkedIn asked recently, “What are the useful metrics for software quality?” Here’s my answer (lightly edited for the blog).

Update, September 20: Whoops! Thanks to Anonymous, I notice that I’ve mistranscribed the original question, which was “What are the useful metrics for code quality?”

Measurement is “the empirical, objective assignment of numbers, according to a rule derived from a model or theory, to attributes of objects or events with the intent of describing them.” (Kaner and Bond, 2004)

So what are you trying to describe about the code?

Quality is “value to some person(s)” (Weinberg), to which James Bach and I add “who matter”.

It’s not clear to me that the quality of a software application depends mostly on the stability and quality of its code base. Lots of other factors could come into play. If the product is the only thing that does what it does, then the quality and stability of the code base might not matter much. If the product isn’t available to its users, then the quality and stability of the code base don’t matter at all to them–although they may matter a lot to the development organization. A product could have dozens of bugs per thousand lines of code, but if those bugs don’t matter to someone who matters, then the metric doesn’t matter either.

In my opinion, it’s not worthwhile to talk about metrics until you’ve determined attributes of the code that you’d like to evaluate. Most of those attributes are bound to be subjective, and putting a number on them is a case of reification error. But we could still evaluate the product by assessment, rather than measurement–describing it, or telling a story about it, rather than subjecting it to some numerical model. Here are some of the things that I would consider important in evaluating a code base:

1) Is it testable? (Does it include logging, scriptable interfaces, real-time monitoring capabilities?)

2) Is it supportable? (Does it contain helpful error messages? Can it guide the user, the technical support department, the testers, and the developers on what to do when it gets into trouble?)

3) Is it maintainable? (Is it clearly written? Nicely modular? Easily reviewable? Accessible to those who need access to it? Is it subject to version control? Is it accompanied by unit tests? Are there assertions built into debug builds of the code?)

4) Is it portable? (Can it be adapted easily to new platforms? Operating systems? Does it depend on OS-specific third-party libraries?)

5) Is it localizable? (Can it be adapted easily to some geographic or temporal region that is different from its current one?)

Those are just a few of the questions I would ask. Note that few, if any, of them are expressible as a number–but all of them could be highly relevant to the quality of the code–that is, its value to some person.
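To make the checklist above a little more concrete, here’s a minimal sketch of what “testable” and “supportable” code might look like in practice. The function, its name, and its messages are all hypothetical, invented for illustration; they aren’t from any particular code base. The point is that logging, helpful error messages, and debug assertions are design choices you can observe and assess, even if you can’t usefully reduce them to a number.

```python
import logging

logger = logging.getLogger(__name__)

def parse_port(value):
    """Parse a TCP port number from a string (hypothetical example)."""
    # Testability: log inputs so a tester or support person can see
    # what the code actually received at runtime.
    logger.debug("parse_port called with %r", value)
    try:
        port = int(value)
    except ValueError:
        # Supportability: the error says what was received, what was
        # expected, and hints at what the user might do about it.
        raise ValueError(
            f"Expected a numeric port, got {value!r}; "
            "check the 'port' entry in your configuration"
        )
    # Maintainability: an assertion documents an internal expectation;
    # it runs in debug builds and can be stripped with python -O.
    assert isinstance(port, int)
    if not 1 <= port <= 65535:
        raise ValueError(f"Port {port} is out of range 1-65535")
    return port
```

A reviewer could look at code like this and tell a story about it–it announces what it’s doing, it fails informatively, and it states its own assumptions–without ever assigning it a score.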

Update (2009/10/19): I’ve written a couple of articles on this subject for Better Software Magazine:

Three Kinds of Measurement (And Two Ways to Use Them)
Better Software, Vol. 11, No. 5, July 2009

How do we know what’s going on? We measure. Are software development and testing sciences, subject to the same kind of quantitative measurement that we use in physics? If not, what kinds of measurements should we use? How could we think more usefully about measurement to get maximum value with a minimum of fuss? One thing is for sure: we waste time and effort when we try to obtain six-decimal-place answers to whole-number questions.

Issues About Metrics About Bugs
Better Software, Vol. 11, No. 4, May 2009

Managers often use metrics to help make decisions about the state of the product or the quality of the work done by the test group. Yet measurements derived from bug counts can be highly misleading because a “bug” isn’t a tangible, countable thing; it’s a label for some aspect of some relationship between some person and some product, and it’s influenced by when and how we count… and by who is doing the counting.

These columns were also reprinted in LogiGear’s Insider’s Guide to Strategic Software Testing Newsletter.

Unquantifiable doesn’t mean unmeasurable. We measure constantly without resorting to numbers. Goldilocks did it.