There’s a mildly rollicking little discussion going on in the Software Testing Club at the moment, in which Rob Lambert observes, “I’ve seen a couple of conversations recently where people are talking about red, green and yellow box testing.” Rob then asks, “There’s the obvious black and white. How many more are there?”
(For what it’s worth, I’ve already made some comments about a related question here.)
At one point a little later in the conversation, Jaffamonkey (I hope that’s a pseudonym) replies,
If applied in modern context Black Box is essentially pure functional testing (or unit testing) whereas White Box testing is more of what testers are required to do, which is more about testing user journeys, and testing workflows, usability etc.
Of course, that’s not what I understand the classical distinction to be.
The classical distinction started with the notion of “black box” testing. You can’t see what’s inside the box, and so you can’t see how it’s working internally. But that may not be so important to you for a particular testing mission; instead, you care about inputs and outputs, and the internal implementation isn’t such a big deal.
You’d probably take a black box approach when a) you don’t have source code; or b) you’re intentionally seeking problems that you might not notice so quickly by inspection, but that you might notice by empirical experiments and observation; or maybe c) you believe that the internal implementation is going to be varied or variable, so there’s no point in taking it into account with respect to the current focus of your attention. I’m sure you can come up with more reasons.
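By way of illustration, here’s a minimal sketch in Python. The discount() function and its 10%-off-$100 rule are invented for this example; imagine that its body is hidden from us, and that all we have is the advertised behaviour.

```python
# Hypothetical system under test: a discount rule we can only exercise
# from the outside. (The function and its rule are invented for this post.)
def discount(total_cents):
    """Pretend this body is invisible; we see only inputs and outputs."""
    return total_cents * 90 // 100 if total_cents >= 10000 else total_cents

def test_discount_black_box():
    # All we know is the advertised rule: 10% off orders of $100.00 or more.
    assert discount(5000) == 5000     # $50.00: below the threshold, no discount
    assert discount(10000) == 9000    # $100.00: at the threshold, 10% off
    assert discount(20000) == 18000   # $200.00: above the threshold, 10% off

test_discount_black_box()
print("black-box checks passed")
```

Notice that these checks say nothing about how discount() arrives at its answers. A complete rewrite of the internals that preserved the advertised behaviour would leave them untouched; that’s exactly the point.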
This “black box” idea suggests a contrast: “glass box” testing. Since glass is transparent, you can see the inner workings, and the insight into what is happening internally gives you a different perspective for risks and test ideas.
Glass box testing might be especially important when a) your mission involves testing what’s happening inside the box (programmers take this perspective more often than not); or b) your overall mission will be simpler, in some dimension, because of your understanding of the internals; or maybe c) you want to learn something about how someone has solved a particular problem. Again, I’m sure you can come up with lots more reasons; these are examples, not definitive lists.
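And here’s a glass-box sketch against the same hypothetical function. This time we’ve read the body, noticed that the branch flips at exactly 10000 cents and that the arithmetic uses floor division, and aimed our probes at those internals.

```python
# The same hypothetical function, but now we've read its source.
def discount(total_cents):
    return total_cents * 90 // 100 if total_cents >= 10000 else total_cents

def test_discount_glass_box():
    # Inspection shows the branch condition is `>= 10000`, so we probe
    # one cent on either side of that exact boundary.
    assert discount(9999) == 9999     # $99.99: just below the branch
    assert discount(10000) == 9000    # $100.00: exactly at the branch
    # Inspection also shows floor division: $100.01 * 0.9 is $90.009,
    # which this implementation truncates to 9000 cents ($90.00).
    assert discount(10001) == 9000

test_discount_glass_box()
print("glass-box checks passed")
```

The boundary values and the truncation check came straight from reading the code; a specification alone would probably not have suggested the 10001-cent probe.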
Unhelpfully (to me), someone somewhere along the way decided that the opposite of “black” must be “white”; that black box testing was the kind where you can’t see inside the box; and that therefore white (rather than glass) box testing must be the name for the other stuff. At this point, the words and the model began to part company.
Even less helpfully, people stopped thinking in terms of a metaphor and started thinking in terms of labels dissociated from the metaphor. The result is an interpretation like Jaffa’s above, where he (she?) seems to have inverted the earlier interpretations, for reasons unknown to me. Who knows? Maybe it’s just a typo.
More unhelpfully still (to me), someone has (or several someones have) apparently come along with color-coding systems for other kinds of testing. Bill Matthews reports that he’s found
Red Box = “Acceptance testing” or “Error message testing” or “networking, peripherals testing and protocol testing”
Yellow Box = “testing warning messages” or “integration testing”
Green Box = “co-existence testing” or “success message testing”
Sources:
http://www.testrepublic.com/forum/topics/define-red-box-testing-yellow
http://www.geekinterview.com/question_details/27985
http://www.allinterview.com/showanswers/7077.html
http://www.coolinterview.com/interview/10080/
For me, there are at least four big problems here.
First, there is already disagreement on which colours map to which concepts. Second, there is no compelling reason that I can see to associate a given colour with any of the given ideas. Third, the box metaphor doesn’t have a clear relationship to what’s going on in the mind or the practice of a tester. The colour is an arbitrary label on an unconstrained container. Fourth, since the definitions appear on interview sites and the sites disagree, there’s a risk that some benighted hiring manager will assume that there is only one interpretation, and will deprive himself of an otherwise skilled tester who read a different site.
(To defend yourself against this fourth problem in interviews, use safety language: “Here’s what I understand by ‘chartreuse-box testing’. This is the interpretation given by this person or group, but I’m aware there may be other interpretations in your context.” For extra points, try saying something like, “Is that consistent with your interpretation? If not, I’d be happy to adopt the term the way you use it around here.” And meaning it. If they refuse to hire you because of that answer, it’s unlikely that working there would have been much fun.)
All of this paintbox of terms is unhelpful (to me) because it means another 30,000 messages on LinkedIn and QAForums, wherein enormous numbers of testers weigh in with their (mis)understandings of some other author’s terms and intentions—and largely, it seems, in the service of asking or answering homework questions.
The next step is that, at some point, some standards-and-certification body will have to come along and lay down the law about what colour testing you would have to do to find out how many angels can dance on the head of a pin, what colour the pin is, and whether the angels are riding unicorns. And then another, competing standards-and-certification body will object, saying that it’s not angels, it’s fairies, and it’s not unicorns, it’s centaurs, and they’re not dancing, they’re doing gymnastics. And don’t even get us started on the pin!
Courses and certifications on colour-mapping to mythological figures will be available (at a fee) to check (not test!) your ability to memorize a proprietary table of relationships.
Meanwhile, most of the people involved in the discussion will have forgotten—in the unlikely event that they ever knew— that the point of the original black-and-glass exercise was to make things more usefully understandable. Verification vs. validation, anyone? One is building the right thing; the other is building the thing right. Now, quick: which is which? Did you have to pause to think about it? And if you find a problem wherein the thing was built wrong, or that the wrong thing was built, does anyone really care whether you were doing validation testing or verification testing at the time?
Well… maybe they do. So, all that said, remember this: no one outside your context can tell you what words you can or can’t use. And remember this too: no one outside your context can tell you what you can or can’t find useful. Some person, somewhere, might find it handy to refer to a certain kind of testing as “sky testing” and another kind of testing as “ground testing”, and still another as “water testing”. (No, I can’t figure it out either.) If people find those labels helpful, there’s nothing to stop them, and more power to them. But if the labels are unhelpful to you and only make your brain hurt, it’s probably not worth a lot of cycles to try to make them fit for you.
So here are some tests that you can apply to a term or metaphor, whether you produce it yourself or someone else produced it:
- Is it vivid? That is (for a testing metaphor), does it allow you to see easily in your mind’s eye (hear in your mind’s ear, etc.) something in the realm of common experience but outside the world of testing?
- Is it clear? That is, does it allow you to make a connection between that external reference and something internal to testing? Do people tend to get it the first time they hear it, or with only a modicum of explanation? Do people retain the connection easily, such that you don’t have to explain it over and over to the same people? Do people in a common context agree easily, without arguments or nit-picking?
- Is it sticky? Is it easy to remember without having to consult a table, a cheat sheet, or a syllabus? Do people adopt the expression naturally and easily, and do they use it?
If the answer to these questions is Yes across the board, it might be worthwhile to spread the idea. If you’re in doubt, field-test the idea. Ask for (or offer) explanations, and see if understanding is easy to obtain. Meanwhile, if people don’t adopt the idea outside of a particular context, do everyone a favour: ditch it, or ignore it, or keep it within a much closer community.
In his book The Educated Imagination (based on the Massey Lectures, a set of broadcasts he did for the Canadian Broadcasting Corporation in 1963), Northrop Frye said,
“Outside literature, the main motive for writing is to describe this world. But literature itself uses language in a way which associates our minds with it. As soon as you use associative language, you begin using figures of speech. If you say, “this talk is dry and dull”, you’re using figures associating it with bread and breadknives. There are two main kinds of association, analogy and identity, two things that are like each other and two things that are each other (my emphasis –MB). One produces a figure of speech called the simile. The other produces a figure called metaphor.”
When we’re trying to describe our work in testing, I think most people would agree that we’re outside the world of literature. Yet we often learn most easily and most powerfully by association—by relating things that we don’t understand well to things that we understand a little better in some specific dimension. In reporting on our testing, we’re often dealing with things that are new to us, and telling stories to describe them. The same is true in learning about testing. Dealing with the new and telling stories leads us naturally to use associative language.
Frye explains why we have to be cautious:
“In descriptive writing, you have to be careful of associative language. You’ll find that analogy, or likeness to something else, is very tricky to handle in description, because the differences are as important as the resemblances. As for metaphor, where you’re really saying “this is that,” you’re turning your back on logic and reason completely because logically two things can never be the same thing and still remain two things.”
Having given that caution, Frye goes on to explain why we use metaphor, and does so in a way that I think might be helpful for our work:
“The poet, however, uses these two crude, primitive, archaic forms of thought in the most uninhibited way, because his job is not to describe nature but to show you a world completely absorbed and possessed by the human mind…The motive for metaphor, according to Wallace Stevens, is a desire to associate, and finally to identify, the human mind with what goes on outside it, because the only genuine joy you can have is in those rare moments when you feel that although we may know in part, as Paul says, we are also a part of what we know.”
So the final test of a term or a metaphor or a heuristic, for me, is this:
- Is it useful? That is, does it help you make sense of the world to the degree that you can identify an idea with something deeper and more resonant than a mere label? Does it help you to own your ideas?
Postscript, 2013/12/10:
“A study published in January in PLOS ONE examined how reading different metaphors—’crime is a virus’ and ‘crime is a beast’—affected participants’ reasoning when choosing solutions to a city’s crime problem…. (Researcher Paul) Thibodeau recommends giving more thought to the metaphors you use and hear, especially when the stakes are high. ‘Ask in what ways does this metaphor seem apt and in what ways does this metaphor mislead,’ he says. Our decisions may become sounder as a result.”
Excerpted from Salon.
Further reading: Round Earth Test Strategy (James Bach)