Blog Posts from October, 2011

xMMwhy

Friday, October 28th, 2011

Several years ago, I worked for a few weeks as a tester on a big retail project. The project was spectacularly mismanaged, already a year behind schedule by the time I arrived. Just before I left, the oft-revised target date slipped by another three months. Three months later, the project was deployed, then pulled out of production for another six months to be fixed. Project managers and a CIO, among many others, lost their jobs. The company pinned an eight-figure loss on the project.

The software infrastructure was supplied by a big database company, and the software to glue everything together was supplied by a development organization in another country. That software was an embarrassment—bloated, incoherent, hard to use, and buggy. Fixes were rarely complete and often introduced new bugs. At one point during my short tenure, all effective work stopped for five days because the development organization’s servers crashed and no backups were available. All this despite the fact that the software development company claimed CMMI Level 5.

This morning, I was greeted by a Tweet that said

“Deloittes show how a level 5 CMMi company has bad test process at #TMMi conf in Korea! So CMMi needs TMMi – good.”

The TMMi is the Test Maturity Model Integration. Here’s what the TMMi Foundation says about it:

“The Test Maturity Model Integration has been developed to complement the existing CMMI framework. It provides a structured presentation of maturity levels, allowing for standard TMMi assessments and certification, enabling a consistent deployment of the standards and the collection of industry metrics.”

Here’s what the SEI—the CMMI’s co-ordinator and sponsor—says about it:

“CMMI (Capability Maturity Model Integration) is a process improvement approach that provides organizations with the essential elements of effective processes, which will improve their performance. CMMI-based process improvement includes identifying your organization’s process strengths and weaknesses and making process changes to turn weaknesses into strengths.”

What conclusions could we draw from these three statements?

If a company has achieved CMMI Level 5, yet has a bad test process, then there’s a logical problem here. Either testing isn’t an essential element of effective processes (in which case the TMMi should be unnecessary) or it is (in which case the SEI’s claim that CMMI provides the essential elements of effective processes is unsupportable).

One clear solution to the problem would be to adjudicate all this by way of a Maturity Model Maturity Model (Integrated), the MMMMI, whereby your organization can determine (in a mature fashion, of course) what essential processes are in the first place. Mind you, that could be flawed too. You’d need a set of essential processes to determine how to determine essential processes, so you’ll also need a Maturity Model Maturity Model Maturity Model (Integrated), an MMMMMMI. And in fairly short order, your organization will disappear up its own ass.

Jerry Weinberg points in a different direction, using very strong language. This is from Quality Software Management, Volume 1: Systems Thinking, p. 21:

“…cultural patterns are not more or less mature, they are just more or less fitting. Of course, some people have an emotional need for perfection, and they will impose this emotional need on everything they do. Their comparisons have nothing to do with the organization’s problems, but with their own.

“The quest for unjustified perfection is not mature, but infantile.

“Hitler was quite clear on who was the ‘master race’. His definition of Aryan race was supposed to represent the mature end product of all human history, and that allowed Hitler and the Nazis to justify atrocities on “less mature” cultures such as Gypsies, Catholics, Jews, Poles, Czechs, and anyone else who got in their way. Many would-be reformers of software engineering require their ‘targets’ to confess to their previous inferiority. These little Hitlers have not been very successful.

“Very few healthy people will make such a confession voluntarily, and even concentration camps didn’t cause many people to change their minds. This is not ‘just a matter of words’. Words are essential to any change project because they give us models of the world as it was and as we hope it to be. So if your goal is changing an organization, start by dropping the comparisons such as those implied in the loaded term ‘maturity.'”

It’s time for us, the worldwide testing community, to urge Deloitte, the SEI, the TMMi Foundation, and the unfortunate testers in Korea who are presently being exposed to the nonsense to recognize what many of us have known for years: maturity models have it backwards.

Should Testers Play Planning Poker?

Wednesday, October 26th, 2011

My colleague and friend Eric Jacobson, who recently (as I write) did a bang-up job on his first conference presentation at STAR West 2011, asks a question in response to this blog post from 2006. (I like it when people reflect on an issue for a few years.) Eric asks:

You are suggesting it may not make sense for testers to give time-based estimates to their teams, but what about relative estimates? Let’s say a Rapid Software Tester is asked to participate in Planning Poker (relative-based story estimation) on an Agile Scrum team. I’ve always considered this a golden opportunity. Are you suggesting said tester may want to refuse to participate in the Planning Poker?

Having observed Planning Poker in action, I’m conflicted. Estimating anything is always a bit of a dodgy business, even at the best of times. That’s especially true for investigation and in particular for discovery. (I’ve written about some of the problems with estimation here and in subsequent posts, and with how those problems pertain to testing here.) Yet Planning Poker may be one way to get a good deal closer to the best of times. I like the idea of testers hearing what’s going on in planning sessions, and of offering perspective on the possible implications of work or change. On the other hand, at Planning Poker sessions I’ve observed or participated in, testers are often pressured to lower their numbers. In an environment where there’s trust, there tends to be much less pressure; in an environment where there’s less trust, I’d take pressure to lower the estimate as a test result with several possible interpretations. (I leave those interpretations as an exercise for the reader, but don’t stop until you get to five, at least.)

In any case, some fundamental problems remain: First, testing is oriented towards discovering things, not building things. At the root of it all, any estimate of how long it will take to test something is like an estimate of how long it will take you to evaluate someone’s ability to speak Spanish (which I wrote about here) and to discover problems in their ability to express themselves. If you already know something or can reasonably anticipate it, that helps a lot, and the Planning Poker approach (among many others) can help with that to some degree.

The second problem is that there’s not necessarily symmetry between the effort in creating something and the effort in testing it. A function or feature that takes very little effort to program might take an enormous amount of effort to test. What kinds of variation could we put into data, workflow, timing, platform dependencies and interactions, scenarios, and so forth? Meanwhile, a feature that takes significant amounts of programming effort could take almost no time to test (since “programming effort” could include an enormous amount of testing effort). There are dozens of factors involved, including the amount of testing the programmers do as they code; what kind of review is being done; what the scope of the change is; when particular discoveries get made (during “development time” or “testing time”); the skill of the parties involved; the testability of the product under test; how buggy the finished feature is (buggier features demand more time for investigation and reporting)… Planning Poker doesn’t solve the asymmetry problem, but it provides a venue for discussing it and getting started on sorting it out.
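To make that asymmetry concrete, here’s a tiny, contrived sketch of my own (the function and its details are hypothetical, purely for illustration): a routine that takes a minute to program, yet immediately invites far more test ideas than its size suggests.

from datetime import date

def days_between(start: str, end: str) -> int:
    """Return the number of days between two ISO 8601 dates. One line of logic; a minute to write."""
    return (date.fromisoformat(end) - date.fromisoformat(start)).days

# A few of the test questions this one-liner immediately invites:
# leap years; an end date earlier than the start date; malformed, empty, or
# non-string inputs; date formats other than ISO 8601 that callers might
# reasonably supply; extreme ranges; what callers do with a negative result.
# Each question is cheap to ask and potentially expensive to investigate.

Planning Poker gives the team a place to surface questions like these; it doesn’t make them go away.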

The third problem, closely related to the second, is this idea that all testing work associated with developing something must and shall happen within the same iteration. Testing never ends; it only stops. So it’s folly to think that all testing for a given amount of programming work can always fit into the same iteration in which the work is done. I’d argue that we need a more nuanced perspective and more options than that. The decision as to how much testing we’ll need is informed by many factors. Paradoxically, we’ll need some testing to help reveal and inform our notions of how much testing we’ll need.

I understand the desire to close the book on a development story within the sprint. I often—even usually—share that desire. Yet many kinds of testing work must respond to development work, and in such cases the development work has to be complete in some lesser sense than “fully tested”. Many kinds of confirmatory checking work, it seems to me, can be done within the same sprint as the programming work; no problem there. Yet it seems to me that other kinds of testing can reasonably wait for subsequent sprints—indeed, must wait for subsequent sprints, unless we’d like to have programmers stop all programming work altogether after a certain day in the sprint. Let me give you an example: in big banks, some kinds of transactions take several days to wend their way through batch processes that are run overnight. The testing work associated with that can be simulated, for sure (indeed, one would hope that most of such work would be simulated), but only at the expense of some loss of realism. Whether that realism is important for a given test is always an open question with a fallible answer. Instead of making sure that there’s NO testing debt, consider reasonable, small, and sustainable amounts of testing debt that span iterations. Agile can be about actual agility, instead of dogma.

So… If playing Planning Poker is part of the context, go for it. It’s a heuristic approach to getting people to consider testing more consciously and thoughtfully, and there’s something to that. It’s oriented towards estimating things in a more comprehensible time frame, and in digestible chunks of task and effort. Planning Poker is fallible, and one approach among many possible approaches. Like everything else, its usefulness depends mostly on the people using it, and how they use it.

Confusion as an Oracle

Monday, October 17th, 2011

A couple of weeks back, Sylvia Killinen (@skillinen on Twitter) tweeted:

“Seems to me that much of #testing relies on noticing when one is confused rather than accepting it as Something Computer Programs Do.”

That’s a beautiful observation, near and dear to my heart since 2007 at least. The night I read Sylvia’s tweet, I wanted to blog more on the subject, but sometimes blog posts go in a different direction from where I intend them to go. At the time, I went here. And now I’m back.

Sylvia’s tweet reminded me of a story that Jon Bach tells about learning testing with his brother James. Jon had been working in a number of less-than-prestigious jobs. James suggested that Jon become a tester, and offered to train him how to be an excellent tester. Jon agreed to the idea. Everything went fine for the first couple of weeks, but one day Jon slumped into James’ office looking dejected and demoralized. The conversation went something like this.

“What’s the problem?” asked James.

“I dunno,” said Jon. “I don’t think this whole becoming-a-tester thing is going to work out.”

“Not work out? But you’re doing great!” said James.

“Well, it might look that way to you, but…” Jon paused.

“So what’s the problem?”

“Well, you gave me this program to test,” Jon began. “But I’m just so confused.”

James peered over his glasses. “When you’re confused,” he said, “that’s a sign that there’s something confusing going on. I gave you a confusing product to test. Confusion might not be fun, but it’s a natural consequence when you’re dealing with a confusing product.” James was tacitly suggesting that Jon’s confusion could be used as an oracle—a heuristic principle or mechanism by which we recognize a problem.

This little story suggests and emphasizes a number of serious and important points.

As I mentioned before, here, feelings don’t tell us what they’re about. Confusion doesn’t come with an arrow that points directly back to its source. Jon felt confused, and thought that the confusion was about him. But that confusion wasn’t just about Jon’s internal state; it was also about the state of the product and how Jon felt about it. Feelings—internal, non-specific and ambiguous—don’t tell us what’s going on; they tell us to pay attention to what’s happening around us. When you’re a novice, you might be inclined to believe that your feelings are telling you about yourself, but that’s likely not the whole story, since emotions don’t happen in isolation from everything else. It’s more probable that feelings are telling you about the relationship between you and something else, or someone else, or the situation.

Which reminds me of another story. It happened at Jerry Weinberg’s Problem Solving Leadership workshop in 2008. PSL is full of challenging and rich and tricky exercises, and one day, one team had fallen into a couple of traps and had done rather badly. During the debrief, Jerry remarked on it. “You guys handled a much harder problem than this yesterday, you know. What happened this time?”

One of the participants answered, “The problem screwed us up.”

With only the briefest pause, Jerry peered at the fellow and replied in a gently admonishing way, “Your reaction to the complexity of the problem screwed you up.”

Methodologists and process enthusiasts regularly ignore the complex human and emotional aspects of testing, and so don’t take them into account or use them as a resource. Some actively reject feelings as a rich source of information. One colleague reports that she told her boss about a presentation of mine in which I had discussed the role of emotions in software testing.

“There’s no role for emotions in software testing,” he said quietly.

“I’m not sure I agree,” she said. “I think there might be. I think at least it’s worth considering.”

Abruptly he shouted, “THERE’S NO ROLE FOR EMOTIONS IN SOFTWARE TESTING!”

She remarked that he had seemed agitated—a strange reaction, considering the mismatch between what he was saying and what he appeared to be feeling. What more might we learn by noting his feelings and considering possible interpretations? What inferences might we draw about the differences between his reaction and hers?

As we increasingly emphasize in the Rapid Software Testing course, recognizing and dealing with your feelings is a key self-management skill. Indeed, for testers, feelings are a kind of first-order measurement. It’s okay to be confused. The confusion is valuable and even desirable if it leads you to the right control action, which is to investigate what your emotions might be telling you and why. If we’re willing to acknowledge our feelings, we can use them productively as cues to start looking for oracles and problems in the product that trigger the feelings—before those problems lead our customers to distress.

In my article Testing Without a Map, I discuss some of the oracles that we present in the Rapid Software Testing class and methodology.

Thanks to Sylvia for the inspiration.

I’ll be bringing Rapid Testing to the Netherlands (October 31-November 2), London (November 28-30), and Oslo (December 14-16). See the right-hand panel for registration details. Join us! Spread the word! Thank you!