Archive for the ‘Certification’ Category

Smoke Testing vs. Sanity Testing: What You Really Need to Know

Tuesday, November 8th, 2011

If you spend any time in forums where new testers can be found, it won’t be long before someone asks, “What is the difference between smoke testing and sanity testing?”

“What is the difference between smoke testing and sanity testing?” is a unicorn question. That is, it’s a question that shouldn’t be answered except perhaps by questioning the question: Why does it matter to you? Who’s asking you? What would you do if I gave you an answer? Why should you trust my answer, rather than someone else’s? Have you looked it up on Google? What happens if people on Google disagree?

But if you persist and ask me, here’s what I will tell you:

The distinction between smoke testing and sanity testing is not generally important. In fact, it’s one of the most trivial aspects of testing that I can think of, offhand. Yet it does point to something that is important.

Both smoke testing and sanity testing refer to a first-pass, shallow form of testing intended to establish whether a product or system can perform the most basic functions. Some people call such testing “smoke testing”; others call it “sanity testing”. “Smoke testing” derives from the hardware world; if you create an electronic circuit, power it up, and smoke comes out somewhere, the smoke test has failed. Sanity testing has no particular derivation that I’m aware of, other than the common dictionary definition of the word. Does the product behave in some crazy fashion? If so, it has failed the sanity test.
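To make the idea concrete, here is a minimal sketch of such a first-pass check in Python. The `Calculator` class and the `smoke_check` function are hypothetical stand-ins for a product and its most basic function; they don't come from any real project. Call the check a smoke test or a sanity test, whichever your workplace prefers.

```python
class Calculator:
    """Hypothetical product under test, used only for illustration."""

    def add(self, a, b):
        return a + b


def smoke_check(product_factory):
    """Shallow first-pass check: can the product start and do one basic thing?

    Returns (passed, reason). A pass says nothing about deeper problems.
    """
    try:
        product = product_factory()   # does it even start?
        result = product.add(2, 2)    # does the most basic function run?
    except Exception as exc:          # "smoke came out somewhere"
        return False, f"crashed: {exc!r}"
    if result != 4:                   # "behaves in some crazy fashion"
        return False, f"expected 4, got {result!r}"
    return True, "basic function works"


passed, reason = smoke_check(Calculator)
print(passed, reason)
```

A pass here only tells you that deeper testing is worth starting; it is not evidence that the product is good.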

Do you see the similarity between these two forms of testing? Can you make a meaningful distinction between them? Maybe someone can. If so, let them make it. If you’re talking to some person, and that person wants to make a big deal about the distinction, go with it. Some organizations make a distinction between smoke and sanity testing; some don’t. If it seems important in your workplace, then ask in your workplace, and adapt your thinking accordingly while you’re there. If it’s important that you provide a “correct” answer on someone’s idiotic certification exam, give them the answer they want according to their “body of knowledge”. Otherwise, it’s not important. Don’t worry about it.

Here’s what is important: wherever you find yourself in your testing career, people will use language that has evolved as part of the culture of that organization. Some consultancies or certification mills or standards bodies claim the goal of providing “a common worldwide standard language for testing”. This is as fruitless and as pointless a goal as a common worldwide standard language for humanity. Throughout all of human history, people have developed different languages to address things that were important in their cultures and societies and environments. Those languages continue to develop as change happens. This is not a bad thing. This is a good thing.

There is no term in testing of which I am aware whose meaning is universally understood and accepted. There’s nothing either wrong or unusual about that. It’s largely true outside the testing world too. Pick an English word at random, and odds are you’ll find multiple meanings for it. Examples:

  • Pick (choose, plectrum for a guitar)

  • English (a language, spin on a billiard ball)

  • word (a unit of speech, a 32-bit value)

  • random (without a definite path, of equal probability)

  • odds (probability, numbers not divisible by two)

  • multiple (more than one, divisible by)

  • meaning (interpretation, significance)

Never mind the shades and nuances of interpretation within each meaning of each word! And notice that “never mind”, in this context, is being used ironically. Here, “never mind” doesn’t mean “forget” or “ignore”; here, it really means the opposite: “also pay attention to”!

Not only is there no universally accepted term for anything, there’s no universally accepted authority that could authoritatively declare or enforce a given meaning for all time. (Some might point to law, claiming that there are specific terms which have solid interpretations. If that were true, we wouldn’t need courts or lawyers.)

If you find yourself in conversation (or in an interview) with someone who asks you “Do you do X?”, and you’re not sure what X is by their definition, a smart and pragmatic reply starts with, “I may do X, but not necessarily by that name.” After that,

  • You can offer to describe your notion of X (if you have one).

  • You can describe something that you do that could be interpreted as X. That can be risky, so offer this too: “Since I don’t know what you mean by X, here’s something that I do. I think it sounds similar to X, or could be interpreted as X. But I’d like to make sure that we both recognize that we could have different interpretations of what X means.”

  • You can say, “I’d like to avoid the possibility that we might be talking at cross-purposes. If you can describe what X means to you, I can tell you about my experiences doing similar things, if I’ve done them. What does X mean to you?” Upon hearing their definition of X, truthfully describe your experience, or say that you haven’t done it.

If you searched online for an answer to the smoke vs. sanity question, you’d find dozens, hundreds of answers from dozens, hundreds of people. (Ironically, the very post that introduces the notion of the unicorn question includes, in the second-to-last paragraph, a description of a smoke test. Or a sanity test. Whatever.) The people who answer the smoke vs. sanity question don’t agree, and neither do their answers. Yet many, even most, of the people will seem very sure of their own answers. People will have their own firm ideas about how many angels can fit on the head of a pin, too. However, there is no “correct” definition for either term outside of a specific context, since there is no authority that is universally accepted. If someone claimed to be a universally accepted authority, I’d reject the claim, which would put an instant end to the claim of universal acceptance.

With the possible exception of the skills of memorization, there is no testing skill involved in memorizing someone’s term for something. Terms and their meanings are slippery, indistinct, controversial, and context-dependent. The real testing skill is in learning to deal with the risk of ambiguity and miscommunication, and the power of expressing ourselves in many ways.

xMMwhy

Friday, October 28th, 2011

Several years ago, I worked for a few weeks as a tester on a big retail project. The project was spectacularly mismanaged, already a year behind schedule by the time I arrived. Just before I left, the oft-revised target date slipped by another three months. Three months later, the project was deployed, then pulled out of production for another six months to be fixed. Project managers and a CIO, among many others, lost their jobs. The company pinned an eight-figure loss on the project.

The software infrastructure was supplied by a big database company, and the software to glue everything together was supplied by a development organization in another country. That software was an embarrassment: bloated, incoherent, hard to use, and buggy. Fixes were rarely complete and often introduced new bugs. At one point during my short tenure, all effective work stopped for five days because the development organization’s servers crashed and no backups were available. All this despite the fact that the software development company claimed CMMI Level 5.

This morning, I was greeted by a Tweet that said

“Deloittes show how a level 5 CMMi company has bad test process at #TMMi conf in Korea! So CMMi needs TMMi – good.”

The TMMi is the Testing Maturity Model Integration. Here’s what the TMMi Foundation says about it:

“The Test Maturity Model Integration has been developed to complement the existing CMMI framework. It provides a structured presentation of maturity levels, allowing for standard TMMi assessments and certification, enabling a consistent deployment of the standards and the collection of industry metrics.”

Here’s what the SEI, the CMMI’s co-ordinator and sponsor, says about the CMMI:

“CMMI (Capability Maturity Model Integration) is a process improvement approach that provides organizations with the essential elements of effective processes, which will improve their performance. CMMI-based process improvement includes identifying your organization’s process strengths and weaknesses and making process changes to turn weaknesses into strengths.”

What conclusions could we draw from these three statements?

If a company has achieved CMMI Level 5, yet has a bad test process, then there’s a logical problem here. Either testing isn’t an essential element of effective processes (in which case the TMMI should be unnecessary) or it is (in which case the SEI’s claim of providing the essential processes is unsupportable).

One clear solution to the problem would be to adjudicate all this by way of a Maturity Model Maturity Model (Integrated), the MMMMI, whereby your organization can determine (in a mature fashion, of course) what essential processes are in the first place. Mind you, that could be flawed too. You’d need a set of essential processes to determine how to determine essential processes, so you’ll also need a Maturity Model Maturity Model Maturity Model (Integrated), an MMMMMMI. And in fairly short order, your organization will disappear up its own ass.

Jerry Weinberg points in a different direction, using very strong language. This is from Quality Software Management, Volume 1: Systems Thinking, p. 21:

“…cultural patterns are not more or less mature, they are just more or less fitting. Of course, some people have an emotional need for perfection, and they will impose this emotional need on everything they do. Their comparisons have nothing to do with the organization’s problems, but with their own.

“The quest for unjustified perfection is not mature, but infantile.

“Hitler was quite clear on who was the ‘master race’. His definition of Aryan race was supposed to represent the mature end product of all human history, and that allowed Hitler and the Nazis to justify atrocities on “less mature” cultures such as Gypsies, Catholics, Jews, Poles, Czechs, and anyone else who got in their way. Many would-be reformers of software engineering require their ‘targets’ to confess to their previous inferiority. These little Hitlers have not been very successful.

“Very few healthy people will make such a confession voluntarily, and even concentration camps didn’t cause many people to change their minds. This is not ‘just a matter of words’. Words are essential to any change project because they give us models of the world as it was and as we hope it to be. So if your goal is changing an organization, start by dropping the comparisons such as those implied in the loaded term ‘maturity.’”

It’s time for us, the worldwide testing community, to urge Deloitte, the SEI, the TMMI, and the unfortunate testers in Korea who are presently being exposed to the nonsense to recognize what many of us have known for years: maturity models have it backwards.

Common Languages Ain’t So Common

Tuesday, June 28th, 2011

A friend told me about a payment system he worked on once. In the system models (and in the source code), the person sending notification of a pending payment was the payer. The person who got that notice was called the payee. That person could designate someone else, the recipient, to pick up the money. The transfer agent would credit the account of the recipient, and debit the account of the person who sent notification: the payer, who at that point in the model suddenly became known as the sender. So, to make that clear: the payer sends email to the payee, who receives it. The sender pays money to the recipient (who accepts the payment). Got that clear? It turns out there was a logical, historical reason for all this. Everything seemed okay at the beginning of the project; there was one entity named “payer” and another named “payee”. Payer A and Payee B exchanged both email and money, until someone realized that B might give someone else, C, the right to pick up the money. Needing another word for C, the development group settled on “recipient”, and then added “sender” to the model for symmetry, even though there was no real way for A to split into two roles as B had. Uh, so far.
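The naming tangle is easier to see in a small sketch. What follows is a guess at the shape of such a model, not my friend’s actual code: the `Party` class and the `notify` and `transfer` functions are hypothetical, as are the amounts. Notice that the same object, A, is called “payer” in one function and “sender” in the other.

```python
from dataclasses import dataclass


@dataclass
class Party:
    name: str
    balance: int = 0  # whole currency units, to keep the sketch simple


def notify(payer: Party, payee: Party) -> str:
    """The payer sends notice of a pending payment; the payee receives it."""
    return f"{payer.name} notifies {payee.name}"


def transfer(sender: Party, recipient: Party, amount: int) -> None:
    """The transfer agent debits the sender and credits the recipient."""
    sender.balance -= amount
    recipient.balance += amount


a = Party("A", balance=100)  # "payer" in notify(), "sender" in transfer()
b = Party("B")               # "payee", who designates C to collect
c = Party("C")               # "recipient"

print(notify(payer=a, payee=b))
transfer(sender=a, recipient=c, amount=40)
print(a.balance, b.balance, c.balance)
```

Two labels for one party, and two parties where the original model had one: anyone new to the code has to learn that history before the names make sense.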

There’s a pro-certification argument that keeps coming back to the discussion like raccoons to a garage: the claim that, whatever its flaws, “at least certification training provides us with a common language for testing.” It’s bizarre enough that some people tout this rationalization; it’s even weirder that people accept it without argument. Fortunately, there’s an appropriate and accurate response: No, it doesn’t. The “common language” argument is riddled with problems, several of which on their own would be showstoppers.

  • Which certification training, specifically, gives us a common language for testing? Aren’t there several different certification tribes? Do they all speak the same language? Do they agree, or disagree on the “common language”? What if we believe certification tribes present (at best) a shallow understanding and a shallow description of the ideas that they’re describing?
  • Who is the “us” referred to in the claim? Some might argue that “us” refers to the testing “industry”, but there isn’t one. Testing is practiced in dozens of industries, each with its own contexts, problems, and jargon.
  • Maybe “us” refers to our organization, or our development shop. Yet within our own organization, which testers have attended the training? Of those, has everyone bought into the common language? Have people learned the material for practical purposes, or have they learned it simply to pass the certification exam? Who remembers it after the exam? For how long? Even if they remember it, do they always and ever after use the language that has been taught in the class?
  • While we’re at it, have the programmers attended the classes? The managers? The product owners? Have they bought in too?
  • With that last question still hanging, who within the organization decides how we’ll label things? How does the idea of a universal language for testing fit with the notion of the self-organizing team? Shouldn’t choices about domain-specific terms in domain-specific teams be up to those teams, and specific to those domains?
  • What’s the difference between naming something and knowing something? It’s easy enough to remember a label, but what’s the underlying idea? Terms of art are labels for constructs—categories, concepts, ideas, thought-stuff. What’s in and what’s out with respect to a given category or label? Does a “common language” give us a deep understanding of such things? Please, please have a look at Richard Feynman’s take on differences between naming and knowing, http://www.youtube.com/watch?v=05WS0WN7zMQ.
  • The certification scheme has representatives from over 25 different countries, and must be translated into a roughly equivalent number of languages. Who translates? How good are the translations?
  • What happens when our understanding evolves? Exploratory testing, in some literature, is equated with “ad hoc” testing, or (worse) “error guessing”. In the 1990s, James Bach and Cem Kaner described exploratory testing as “simultaneous test design, test execution, and learning”. In 2006, participants in the Workshop on Heuristic and Exploratory Techniques discussed and elaborated their ideas on exploratory testing. Each contributed a piece to a definition synthesized by Cem Kaner: “Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.” That doesn’t roll off the tongue quite so quickly, but it’s a much more thorough treatment of the idea, identifying exploratory testing as an approach, a way that you do something, rather than something that you do. Exploratory work is going on all the time in any kind of complex cognitive activity, and our understanding of the work and of exploration itself evolves (as we’ve pointed out here, and here, and here, and here, and here). Just as everyday, general-purpose languages adopt new words and ideas, so do the languages that we use in our crafts, in our communities, and with our clients.

In software development, we’re always solving new problems. Those new problems may require people to work with entirely new technological or business domains, or to bridge existing domains with new interactions and new relationships. What happens when people don’t have a common language for testing, or for anything else in that kind of development process? Answer: they work it out. As Peter Galison notes in his work on trading zones, “Cultures in interaction frequently establish contact languages, systems of discourse that can vary from the most function-specific jargons, through semispecific pidgins, to full-fledged creoles rich enough to support activities as complex as poetry and metalinguistic reflection.” Each person in a development group brings elements of his or her culture along for the ride; each project community develops its own culture and its own language.

Yes, we do need common language for testing. Anthropology shows us that meaningful language develops organically when people gather for a common purpose in a particular context. Just as we need tests that are specific to a given context, we need terms that are that way too. So instead of focusing training on memorizing glossary entries, let’s teach testers more about the relationships between words and ideas. Let’s challenge each other to ask better questions about the language we’re using, and how it might be fooling us.

Gaming the Tests

Monday, September 27th, 2010

Let’s imagine, for a second, that you had a political problem at work. Your CEO has promised his wife that their feckless son Ambrose, having flunked his university entrance exams, will be given a job at your firm this fall. Company policy is strict: in order to prevent charges of nepotism, anyone holding a job must be qualified for it. You know, from having met him at last year’s Christmas party, that Ambrose is (how to put this gently?) a couple of tomatoes short of a thick sauce. Yet the policy is explicit: every candidate must not only pass a multiple choice test, but must get every answer right. The standard number of correct answers required is (let’s say) 40.

So, the boss has a dilemma. He’s not completely out to lunch. He knows that Ambrose is (how can I say this?) not the sharpest razor in the barbershop. Yet the boss adamantly wants his son to get a job with the firm. At the same time, the boss doesn’t want to be seen to be violating his own policy. So he leaves it to you to solve the problem. And if you solve the problem, the boss lets you know subtly that you’ll get a handsome bonus. Equally subtly, he lets you know that if Ambrose doesn’t pass, your career path will be limited.

You ponder for a while, and you realize that, although you have to give Ambrose an exam, you have the authority to set the content and conditions of the exam. This gives you some possibilities.

A. You could give a multiple choice test in which all the answers were right. That way, anyone completing the test would get a perfect score.

B. You could give a multiple choice test for which the answers were easy to guess, but irrelevant to the work Ambrose would be asked to do. For example, you could include questions like, “What is the very bright object in the sky that rises in the morning and sets in the evening?” and provide “The Sun” as one choice of answer, and the names of hockey players for the other choices.

C. You could find out what questions Ambrose might be most likely to answer correctly in the domain of interest, and then craft an exam based on that.

D. You could give a multiple choice test in which, for every question, one of A, B, or C was the correct answer, and answer D was always “One of the above.”

E. You might give a reasonably difficult multiple-choice exam, but when Ambrose got an answer wrong, you could decide that there’s another way to interpret the answer, and quietly mark it right.

F. You might give Ambrose a very long set of multiple-choice questions (say 400 of them), and then, of his answers, pick 40 correct ones. You then present those questions and answers as the completed exam.

G. You could give Ambrose a set of questions, but give him as much time as he wanted to provide an answer. In addition, you don’t watch him carefully (although not watching carefully is a strategy that nicely supports most of these options).

H. You could ask Ambrose one multiple choice question. If he got it wrong, correct him until he gets it right. Then you could develop another question, ask that, and if he gets it wrong, correct him until he gets it right. Then continue in a loop until you get to 40 questions.

I. This approach is like H, but instead you could give a multiple choice test for which you had chosen an entire set of 40 questions in advance. If Ambrose didn’t get them all right, you could correct him, and then give him the same set of questions again. And again. And over and over again, until he finally gets them all right. You don’t have to publicize the failed attempts; only the final, successful one. That might take some time and effort, and Ambrose wouldn’t really be any more capable of anything except answering these specific questions. But, like all the other approaches above, you could effect a perfect score for Ambrose.

When the boss is clamoring for a certain result, you feel under pressure and you’re vulnerable. You wouldn’t advise anyone to do any of the things above, and you wouldn’t do them yourself. Or at least, you wouldn’t do them consciously. You might even do them with the best of intentions.

There’s an obvious parallel here—or maybe not. You may be thinking of the exam in terms of a certain kind of certification scheme that uses only multiple-choice questions, the boss as the hiring manager for a test group, and Ambrose as a hapless tester that everyone wants to put into a job for different reasons, even though no one is particularly thrilled about the idea. Some critical outsider might come along and tell you point-blank that your exam wasn’t going to evaluate Ambrose accurately. Even a sympathetic observer might offer criticism. If that were to happen, you’d want to keep the information under your hat—and quite frankly, the other interested parties would probably be complacent too. Dealing with the critique openly would disturb the idea that everyone can save face by saying that Ambrose passed a test.

Yet that’s not what I had in mind—not specifically, at least. I wanted to point out some examples of bad or misleading testing, which you can find in all kinds of contexts if you put your mind to it. Imagine that the exam is a set of tests—checks, really. The boss is a product owner who wants to get the product released. The boss’ wife is a product marketing manager. Hapless Ambrose is a program—not a very good program to be sure, but one that everyone wants to release for different reasons, even though no one is particularly thrilled by the idea. You, whether a programmer or a tester or a test manager, are responsible for “testing”, but you’re really setting up a set of checks. And you’re under a lot of pressure. How might your judgement—consciously or subconsciously—be compromised? Would your good intentions bend and stretch as you tried to please your stakeholders and preserve your integrity? Would you admit to the boss that your testing was suspect? If you were under enough pressure, would you even notice that your testing was suspect?

So this story is actually about any circumstance in which someone might set up a set of checks that provide some illusion of success. Can you think of any more ways that you might game the tests… or worse, fool yourself?
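Here is one way that gaming might look when the “exam” really is a set of checks. This is a sketch of option F only, under made-up assumptions: the `ambrose` function stands in for a weak candidate (or a weak program) that always gives the same answer, and the answer key simply cycles through A, B, C, and D. None of these names come from a real test framework.

```python
from itertools import cycle, islice


def make_questions(n):
    """Hypothetical answer key: the correct choice cycles A, B, C, D."""
    return list(islice(cycle("ABCD"), n))


def ambrose(_question_number):
    """Hypothetical candidate (or program) that always answers 'A'."""
    return "A"


def honest_score(answer_key):
    """Ask every question once and count the correct answers."""
    return sum(ambrose(i) == correct for i, correct in enumerate(answer_key))


def cherry_picked_score(answer_key, required):
    """Option F: ask many questions, then report only the correct ones."""
    correct_only = [
        i for i, correct in enumerate(answer_key) if ambrose(i) == correct
    ]
    return len(correct_only[:required])


print(honest_score(make_questions(40)))               # 10 of 40
print(cherry_picked_score(make_questions(400), 40))   # 40 of 40
```

Nothing about Ambrose changed between the two lines of output. Only the reporting did, and a reader who sees only the second number has no way to know that.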

Why I Am Not Yet Certified — EuroSTAR Presentation

Wednesday, December 5th, 2007

Today, December 4, 2007, I gave a presentation at EuroSTAR on “Why I Am Not (Yet) Certified”. James Bach was originally slated to give a different presentation with the same title, but I got the nod due to the untimely illness of James’ wife Lenore, which caused him to cancel his fall schedule (she’s much better now).

Stuart Reid, the chair of the conference, strongly supports the notion of certifications in their current forms. I disagree with that, but I have considerable respect for people who are willing to provide a platform for opposing views, and I therefore thank him for providing the opportunity to speak. I think the controversy opens up the discussion, and thereby strengthens the conference and the craft of testing.

As I said as I finished the presentation, I felt a little like Martin Luther nailing 42 PowerPoint slides to the screen. The talk was generally well received, but there were several conversations that I found rather sobering.

At least two people to whom I spoke–one a former ISEB instructor–told me that they had wanted to effect change in the multiple choice Foundation exams, but their experience was that that couldn’t happen unless the ISEB/ISTQB Syllabus were to change–and changing that proved an insurmountable obstacle for them.

Almost everyone who approached me afterwards said that they were glad that I had said the things that they had been thinking privately for several years. They tended to be enthusiastic but they also tended to check to see whether they were among friends before they spoke freely. The latter is a tendency we need to break. As it was, it felt like revolution and insurrection were in the air–but nobody was quite brave enough to speak up. I encourage people to talk about this stuff, out loud and in public. Open criticism of things that are damaging to the craft is a form of self-certification in my community.

The complacence and chill were disturbing, but once a group of people were together, the complaints started to flow. Many had taken the ISEB/ISTQB certifications. All but one found little to no value in it. They complained about the triviality and the one-and-only-one-answer nature of the Foundation Level exam. Saddest of all, they noted that in Britain and in several countries on the continent, almost all businesses that are hiring testers require applicants for entry-level jobs to have the ISEB/ISTQB certification. I’m pretty certain that this will have several nasty effects. First, it is likely to discourage people from entering the testing field the way many of our best testers have done–by accident and opportunity. In turn, this will make the profession more insular and less diverse. In turn, this will prevent new ideas from reaching the craft. This is very bad.

We’re already learning this business slowly enough. If you attend conferences–especially the major commercial ones–you’ll hear near endless repetition of the same themes: heavyweight planning and estimation for a task that should be nimble, rapid, and responsive; bloated approaches to test documentation and artifacts; relentless focus on confirmation, verification, and validation, and very little talk of investigation, exploration, and discovery. It’s narcotic–the conferences seem addicted to these talks, and they make the craft sleepy. If we’re going to repeat anything, let’s repeat Einstein’s notion that we can’t solve problems by using the same level of thinking that we used when we created them.