Blog Posts from September, 2010

Test Framing

Wednesday, September 29th, 2010

A few months ago, James Bach introduced me to the idea of test framing. He identified it as a testing skill, and did some work in developing the concept by field-testing it with some of his online students. We’ve been refining it lately. I’ll be giving a brief talk on it at the Kitchener-Waterloo Software Quality Association on Thursday, September 30, 2010, and I’ll be leading a half-day workshop on it at EuroSTAR. Here’s our first public cut at a description.

The basic idea is this: in any given testing situation

  • You have a testing mission (a search for information, and your mission may change over time).
  • You have information about requirements (some of that information is explicit, some implicit; and it will likely change over time).
  • You have risks that inform the mission (and awareness of those risks will change over time).
  • You have ideas about what would provide value in the product, and what would threaten it (and you’ll refine those ideas as you go).
  • You have a context in which you’re working (and that context will change over time).
  • You have oracles that will allow you to recognize a problem (and you’ll discover other oracles as you go).
  • You have models of the product that you intend to cover (and you’ll extend those models as you go).
  • You have test techniques that you may apply (and choices about which ones you use, and how you apply them).
  • You have lab procedures that you follow (that you may wish to follow more strictly, or relax).
  • You configure, operate, and observe the product (using test techniques, as mentioned above), and you evaluate the product (by comparing it to the oracles mentioned above, in relation to the value of the product and threats to that value).
  • You have skills and heuristics that you may apply.
  • You have issues related to the cost versus the value of your activities that you must assess.
  • You have time (which may be severely limited) in which to perform your tests.
  • You have tests that you (may) perform (out of an infinite selection of possible tests that you could perform).

Test framing involves the capacity to follow and express a direct line of logic that connects the mission to the tests. Along the way, the line of logical reasoning will typically touch on elements between the top and the bottom of the list above. The goal of framing the test is to be able to answer questions like

  • Why are you running (did you run, will you run) this test (and not some other test)?
  • Why are you running that test now (did you run that test then, will you run that test later)?
  • Why are you testing (did you test, will you test) for this requirement, rather than that requirement?
  • How are you testing (did you test, will you test) for this requirement?
  • How does the configuration you used in your tests relate to the real-world configuration of the product?
  • How does your test result relate to your test design?
  • Was the mission related to risk? How does this test relate to that risk?
  • How does this test relate to other tests you might have chosen?
  • Are you qualified (were you qualified, can you become qualified) to test this?
  • Why do you think that is (was, would be) a problem?
The form of the framing is a line of propositions and logical connectives that relate the test to the mission. A proposition is a statement that expresses a concept that can be true or false. We could think of these as affirmative declarations or assumptions. Connectives are words or phrases (“and”, “not”, “if”, “therefore”, “and so”, “unless”, “because”, “since”, “on the other hand”, “but maybe”, and so forth) that link or relate propositions to each other, generating new propositions by inference. This is not a strictly formal system, but one that is heuristically and reasonably well structured. Here’s a fairly straightforward example:

    GIVEN: (The Mission:) Find problems that might threaten the value of the product, such as program misbehaviour or data loss.

    Proposition: There’s an input field here.
    Proposition: Upon the user pressing Enter, the input field sends data to a buffer.
    Proposition: Unconstrained input may overflow a buffer.
    Proposition: Buffers that overflow clobber data or program code.
    Proposition: Clobbered data can result in data loss.
    Proposition: Clobbered program code can result in observable misbehaviour.

    Connecting the propositions: IF this input field is unconstrained, AND IF it consequently overflows a buffer, THEREFORE there’s a risk of data loss OR program misbehaviour.

    Proposition: The larger the data set that is sent to this input field, the greater the chance of clobbering program code or data.

    Connection: THEREFORE, the larger the data set, the better the chance of triggering an observable problem.

    Connection: IF I put an extremely long string into this field, I’ll be more likely to observe the problem.

    Conclusion: (Test:) THEREFORE I will try to paste an extremely long string in this input field AND look for signs of mischief such as garbage in records that I observed as intact before, or memory leaks, or crashes, or other odd behaviour.
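
    To make that conclusion concrete, here is a minimal sketch of how such a test might look if automated, assuming (purely for illustration) a hypothetical command-line program called recordkeeper that reads a field from standard input and maintains a data file called records.dat. The point is the shape of the test, not the specifics.

        import subprocess

        PROGRAM = ["./recordkeeper", "--add-note"]   # hypothetical product under test
        DATA_FILE = "records.dat"                    # hypothetical data file it maintains

        # Snapshot records that we observed as intact before the test.
        with open(DATA_FILE, "rb") as f:
            before = f.read()

        # An extremely long input string, far beyond any plausible field size.
        long_string = "A" * 1_000_000

        # Configure and operate: paste the long string into the input field (here, stdin).
        result = subprocess.run(PROGRAM, input=long_string.encode(), capture_output=True)

        # Observe and evaluate against the oracles named in the framing above.
        with open(DATA_FILE, "rb") as f:
            after = f.read()
        print("exit code:", result.returncode)                   # a crash shows up here
        print("records changed unexpectedly:", before != after)  # garbage in intact records?
        print("stderr:", result.stderr[:200])                    # other signs of mischief

    A non-zero exit code, or records that changed when they shouldn’t have, would be evidence of exactly the risk identified in the propositions above.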

    Now, to some, this might sound quite straightforward and, well, logical. However, in our experience, some testers have surprising difficulty with tracing the path from mission down to the test, or from the test back up to mission—or with expressing the line of reasoning immediately and cogently.

    Our approach, so far, is to give testers something to test and a mission. We might ask them to describe a test that they might choose to run, and to describe their reasoning. As an alternative, we might ask them why they chose to run a particular test, and to explain that choice in terms of tracing a logical path back to the mission.

    If you have an unframed test, try framing it. You should be able to do that for most of your tests, but if you can’t frame a given test right away, it might be okay. Why? Because as we test, we not only apply information; we also reveal it. Therefore, we think it’s usually a good idea to alternate between focusing and defocusing approaches. After you’ve been testing very systematically using well-framed tests, mix in some tests that you can’t immediately or completely justify. One of the possible justifications for an unframed test is that we’re always dealing with hidden frames. Revealing hidden or unknown frames is a motivation behind randomized high-volume automated tests, or stress tests, or galumphing, or any other test that might (but not certainly) reveal a startling result. The fact that you’re startled provides a reason, in retrospect, to have performed the test. So, you might justify unframed tests in terms of plausible outcomes or surprises, rather than known theories of error. You might encounter a “predictable” problem, or one more surprising to you. In that case, better that you should say “Who knew?!” than a customer.

    To test is to tell two parallel stories: a story of the product, and the story of our testing. James and I believe that test framing is a key skill that helps us to compose, edit, narrate, and justify the story of our testing in a logical, coherent, and rapid way. Expect to hear more about test framing, and please join us (or, if you like, argue with us) as we develop the idea.

    See http://www.developsense.com/resources/TestFraming.pdf for current updates.

    Gaming the Tests

    Monday, September 27th, 2010

    Let’s imagine, for a second, that you had a political problem at work. Your CEO has promised his wife that their feckless son Ambrose, having flunked his university entrance exams, will be given a job at your firm this fall. Company policy is strict: in order to prevent charges of nepotism, anyone holding a job must be qualified for it. You know, from having met him at last year’s Christmas party, that Ambrose is (how to put this gently?) a couple of tomatoes short of a thick sauce. Yet the policy is explicit: every candidate must not only pass a multiple choice test, but must get every answer right. The standard number of correct answers required is (let’s say) 40.

    So, the boss has a dilemma. He’s not completely out to lunch. He knows that Ambrose is (how can I say this?) not the sharpest razor in the barbershop. Yet the boss adamantly wants his son to get a job with the firm. At the same time, the boss doesn’t want to be seen to be violating his own policy. So he leaves it to you to solve the problem. And if you solve the problem, the boss lets you know subtly that you’ll get a handsome bonus. Equally subtly, he lets you know that if Ambrose doesn’t pass, your career path will be limited.

    You ponder for a while, and you realize that, although you have to give Ambrose an exam, you have the authority to set the content and conditions of the exam. This gives you some possibilities.

    A. You could give a multiple choice test in which all the answers were right. That way, anyone completing the test would get a perfect score.

    B. You could give a multiple choice test for which the answers were easy to guess, but irrelevant to the work Ambrose would be asked to do. For example, you could include questions like, “What is the very bright object in the sky that rises in the morning and sets in the evening?” and provide “The Sun” as one choice of answer, and the names of hockey players for the other choices.

    C. You could find out what questions Ambrose might be most likely to answer correctly in the domain of interest, and then craft an exam based on that.

    D. You could give a multiple choice test in which, for every question, one of A, B, or C was the correct answer, and answer D was always “One of the above.”

    E. You might give a reasonably difficult multiple-choice exam, but when Ambrose got an answer wrong, you could decide that there’s another way to interpret the answer, and quietly mark it right.

    F. You might give Ambrose a very long set of multiple-choice questions (say 400 of them), and then, of his answers, pick 40 correct ones. You then present those questions and answers as the completed exam.

    G. You could give Ambrose a set of questions, but give him as much time as he wanted to provide an answer. In addition, you don’t watch him carefully (although not watching carefully is a strategy that nicely supports most of these options).

    H. You could ask Ambrose one multiple choice question. If he got it wrong, correct him until he gets it right. Then you could develop another question, ask that, and if he gets it wrong, correct him until he gets it right. Then continue in a loop until you get to 40 questions.

    I. This approach is like H, but instead you could give a multiple choice test for which you had chosen an entire set of 40 questions in advance. If Ambrose didn’t get them all right, you could correct him, and then give him the same set of questions again. And again. And over and over again, until he finally gets them all right. You don’t have to publicize the failed attempts; only the final, successful one. That might take some time and effort, and Ambrose wouldn’t really be any more capable of anything except answering these specific questions. But, like all the other approaches above, you could effect a perfect score for Ambrose.

    When the boss is clamoring for a certain result, you feel under pressure and you’re vulnerable. You wouldn’t advise anyone to do any of the things above, and you wouldn’t do them yourself. Or at least, you wouldn’t do them consciously. You might even do them with the best of intentions.

    There’s an obvious parallel here—or maybe not. You may be thinking of the exam in terms of a certain kind of certification scheme that uses only multiple-choice questions, the boss as the hiring manager for a test group, and Ambrose as a hapless tester that everyone wants to put into a job for different reasons, even though no one is particularly thrilled about the idea. Some critical outsider might come along and tell you point-blank that your exam wasn’t going to evaluate Ambrose accurately. Even a sympathetic observer might offer criticism. If that were to happen, you’d want to keep the information under your hat—and quite frankly, the other interested parties would probably be complacent too. Dealing with the critique openly would disturb the idea that everyone can save face by saying that Ambrose passed a test.

    Yet that’s not what I had in mind—not specifically, at least. I wanted to point out some examples of bad or misleading testing, which you can find in all kinds of contexts if you put your mind to it. Imagine that the exam is a set of tests—checks, really. The boss is a product owner who wants to get the product released. The boss’ wife is a product marketing manager. Hapless Ambrose is a program—not a very good program to be sure, but one that everyone wants to release for different reasons, even though no one is particularly thrilled by the idea. You, whether a programmer or a tester or a test manager, are responsible for “testing”, but you’re really setting up a set of checks. And you’re under a lot of pressure. How might your judgement—consciously or subconsciously—be compromised? Would your good intentions bend and stretch as you tried to please your stakeholders and preserve your integrity? Would you admit to the boss that your testing was suspect? If you were under enough pressure, would you even notice that your testing was suspect?

    So this story is actually about any circumstance in which someone might set up a set of checks that provide some illusion of success. Can you think of any more ways that you might game the tests… or worse, fool yourself?

    Why Exploratory? Isn’t It All Just Testing?

    Friday, September 24th, 2010

    The post “Exploratory Testing and Review” continues to prompt comments whose responses, I think, are worthy of their own posts. Thank you to Parthi, who provides some thoughtful comments and questions.

    I have always wondered about, and attempted to see, the difference between the Exploratory testing that you are talking about and the testing that I am doing. Unlike the rest of the commenters, this post made this question all the more valid and haunting.

    From what you have written, as long as there is a loop between the test design and execution, it’s exploratory testing? And the shorter the loop, the more exploratory it is?

    Yes, that’s right. A completely linear process would be entirely scripted, with no exploratory element to it. The existence of a loop suggests that the testing is to some degree exploratory. This suggests (to me, at least) a link to one of the points of Jerry Weinberg’s Perfect Software and Other Illusions About Testing. Testing, he suggests, is gathering information with the intention of informing a decision, and he also says that if you’re not going to use that information, you might as well not test. I’ll go a little further and suggest that if you “test” with no intention of using the information in any way, you might be doing something, but you’re not really testing.

    As we’ve said before, some people seem to have interpreted the fact that there’s a distinction between exploratory testing and scripted testing as meaning that you can only be doing one or the other. That’s a misconception. It’s like saying that there are only two kinds of liquid water: hot or cold. Yet there are varying gradations of water: almost freezing, extremely cold, chilly, cool, room temperature, tepid, warm, hot, scalding, boiling. To stretch the metaphor, a test that is being done by a machine (that is, a check) is like ice. It’s frozen and it’s not going anywhere. An investigation of a piece of software done by a tester with no purpose other than to assuage his curiosity is like steam; it’s invisible and vaporous. But testing in most cases is to some extent scripted and to some extent exploratory. No matter how exploratory, a test is to some degree informed by a mission that typically came from someone else, at some point in the past; that is, the test is to some degree scripted. No matter how scripted, a test is to some degree informed by decisions and actions that come from the individual tester in the moment—otherwise the tester would freeze and stop working, just like a machine, as soon as he or she was unable to perform some step specified in the script. That is, all testing is to some degree exploratory.

    In addition to the existence of loops, there are other elements too. Very generally,

    • the extent to which the tester has freedom to make his or her own choices about which step to take next, which tests to perform, which tools to use, which oracles to apply, and which coverage to obtain (more freedom means more exploratory and less scripted; more control means less exploratory and more scripted);
    • the extent to which the tester is held responsible for the choices being made and the quality of his or her work. More responsibility on the tester means more exploratory and less scripted; more responsibility on some other agency means less exploratory and more scripted.
    • the extent to which all available information (including the most recent information) informs the design and execution of the next test. The broader the scope of the information that informs the test, the more exploratory; the narrower the scope of information that informs the test, the more scripted.
    • the extent to which the mission—the search for information—is open-ended and new information is welcomed. The more new information will be embraced, the more exploratory the mission; the more new information will be ignored or rejected, the less exploratory the mission.
    • again, very generally, the length of the loops that include designing, performing, and interpreting an activity and learning from it, and then feeding that information back into the next cycle of design, performance, interpretation, and learning. I’m not talking here so much about timing and sequences of actions as about cognitive engagement. Timing is a factor; that’s one reason that we now favour “parallel” over “simultaneous”. But more importantly, the more difficult it is to unsnarl the tangle of your interactions and your ideas, the more exploratory a process you’re in. The more rapidly you are able to shift from one heavy focus (say on executing the test) to another heavy focus (pondering the implications of what you’ve just seen) to another (running a thought experiment in your head) to yet another (revising your design), very generally, the more exploratory the process. Another way to put it: the more organic the process, the more exploratory it is; the more linear the process, the more scripted it is.

    Is this what you are saying? If yes, there is hardly any difference between what I do at my work and what you preach, and this is true for most of my team (I am talking about 600+ testers in my organization), and we simply call this Testing.

    I’d smilingly suggest that you can “simply” call it whatever you like. The more important issue is whether you want to simply call it something, or whether you want to achieve a deeper understanding of it. The risk associated with “simply” calling it something is that you’ll end up doing it simply, and that may fail to serve your clients when they are producing and using very complex products and services and systems. Which is, these days, mostly what’s happening.

    For example, is there really a difference between what I’m talking about and what your 600+ testers are doing? Can you describe what they’re doing? How would you describe it? How would you frame their actions in terms of risk, cost, value, skill, diversity, heuristics, oracles, coverage, procedures, context, quality criteria, product elements, recording, reporting? Is all that stuff “simply” testing? For any one of those elements of testing, where are your testers in control of their own process, and when are they being controlled? Are all 600+ at equivalent stages of development and experience? Are they all simply testing simply, or are some testing in more complex ways?

    Watch out for the words “simply” or “just”. Those are magic words. They cast a spell, blinding and deafening people to complexity. Yet the blindness and deafness don’t make the complexity go away. Even though these words have all the weight of snowflakes, their cumulative effect is to cover up complexity like a heavy snowfall covers up a garden.

    Maybe these posts should be titled “Testing” rather than “Exploratory Testing”?

    There is already a good number of groups/people taking advantage of the (confused state of the larger) testing community (like certification boards). Why add fuel to this instead of simplifying things?

    There’s a set of important answers to that, in my view.

    • Testing is a complex cognitive activity comprising many other complex cognitive activities. If we want to understand testing and learn how to do it well, we need to confront and embrace that complexity, instead of trying to simplify it away.
    • If we want our clients to understand the value, the costs, the extents, and the limitations of the services we can provide for them, we need to be able to explain what we’re doing, how we’re doing it, and why we’re doing it. That’s important so that both we and they can collaborate in making better informed choices about the information that we’re all seeking and the ways we go about obtaining that information.
    • One way to “simplify” matters is to pretend that testing is “simply” the preparation and then following of a script, or that exploratory testing is “simply” fooling around with the computer. If you’re upset at all about the certification boards that trivialize testing (as I am), it’s important to articulate and demonstrate the fact that testing is not at all a simple activity, and that comprehension of it cannot be assessed with any validity via a 40-question multiple choice test. The claim that it can be, in my opinion, is false, and charging money for such a test while making such a claim is, in my opinion, morally equivalent to theft. The whole scheme is founded on the premise that testing a tester is “simply” a matter of putting the tester through 40 checks. If we really wanted to evaluate and qualify a tester, we’d use an exploratory process: interviews, auditions, field testing, long sequence tests, compatibility tests, and so on. And we wouldn’t weed people out on the basis of their failing to take a bogus exam, any more than we’d reject a program for not being run against a set of automated checks that were irrelevant to what the program was actually supposed to do.
    • Just as software development is done in many contexts, so testing is done in many contexts. As we say in the Rapid Testing class, in excellent testing, your context informs your choices and vice versa. And in excellent testing, both your context and your choices evolve over time. I would argue that a heavily scripted process is more resistant to this evolution. That might be a good thing for certain purposes and certain contexts, and a not-at-all good thing for other purposes and other contexts.

    Many people say, for example, that to test medical devices, you must do scripted testing. There is indeed much in medical device testing that must be checked. Problems of a certain class yield very nicely to scripted tests (checks), such that a scripted approach is warranted. The trouble comes with the implicit suggestion that if you must do scripted testing, you must not do exploratory testing. Yet if we agree that problems in a product don’t follow scripts; if we agree that there will be problems in requirements as well as in code; if we agree that we can’t recognize incompleteness or ambiguity in advance of encountering their consequences; if we agree that although we can address the unexpected we can’t eliminate it; and if we agree that people’s lives may be at stake: isn’t it the case that we must do exploratory testing in addition to any scripted testing that we might or might not do?

    The answer is, to my mind, certainly Yes. So, to what extent, from moment to moment, are we emphasising one approach or the other? That’s not a question that we can answer by saying that we’re “just” testing.

    Thanks again, Parthi, for prompting this post.

    Can Exploratory Testing Be Automated?

    Wednesday, September 22nd, 2010

    In a comment on the previous post, Rahul asks,

    One doubt which is lingering in my mind for quite some time now, “Can exploratory testing be automated?”

    There are (at least) two ways to interpret and answer that question. Let’s look first at answering the literal version of the question, by looking at Cem Kaner’s definition of exploratory testing:

    Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.

    If we take this definition of exploratory testing, we see that it’s not a thing that a person does, so much as a way that a person does it. An exploratory approach emphasizes the individual tester, and his/her freedom and responsibility. The definition identifies design, interpretation, and learning as key elements of an exploratory approach. None of these are things that we associate with machines or automation, except in terms of automation as a medium in the McLuhan sense: an extension (or enablement, or enhancement, or acceleration, or intensification) of human capabilities. The machine to a great degree handles the execution part, but the work in getting the machine to do it is governed by exploratory—not scripted—work.

    Which brings us to the second way of looking at the question: can an exploratory approach include automation? The answer there is absolutely Yes.

    Some people might have a problem with the idea, because of a parsimonious view of what test automation is, or does. To some, test automation is “getting the machine to perform the test”. I call that checking. I prefer to think of test automation in terms of what we say in the Rapid Software Testing course: test automation is any use of tools to support testing.

    If yes, then to what extent? While I do exploration (investigation) on a product, I do whatever comes to my mind, thinking in the reverse direction: how would this piece of functionality break? I am not sure if my approach is correct, but so far it’s been working for me.

    That’s certainly one way of applying the idea. Note that when you think in a reverse direction, you’re not following a script. “Thinking backwards” isn’t an algorithm; it’s a heuristic approach that you apply and that you interact with. Yet there’s more to test automation than breaking. I like your use of “investigation”, which to me suggests that you can use automation in any way to assist learning something about the program.

    I read somewhere on Shrini Kulkarni’s blog that automating exploratory testing is an oxymoron, is it so?

    In the first sense of the question, Yes, it is an oxymoron. Machines can do checking, but they can’t do testing, because they’re missing the ability to evaluate. Here, I don’t mean “evaluation” in the sense of performing a calculation and setting a bit. I mean evaluation in the sense of making a determination about what people value; what they might choose or prefer.

    In the second way of interpreting the question, automating exploratory testing is impossible—but using automation as part of an exploratory process is entirely possible. Moreover, it can be exceedingly powerful, about which more below.

    I see a general perception among junior testers (even among ignorant seniors) that in exploratory testing there are no scripts (read: test cases) to follow. But the first version of the definition, i.e. “simultaneous test design, test execution, and learning”, talks about test design also, which I have been following by writing basic test cases and building my understanding; then, observing the application’s behavior once it is done, I move back to update the test cases, and this continues till stakeholders agree with the state of the application.

    Please guide me: is this what you call exploratory testing, or does my understanding of exploratory testing need modification?

    That is an exploratory process, isn’t it? Let’s use the rubric of Kaner’s definition: it’s a style of working; it emphasizes your freedom and responsibility; it’s focused on optimizing the quality of your work; it treats design, execution, interpretation, and learning in a mutually supportive way; and it continues throughout the project. Yet it seems that the focus of what you’re trying to get to is a set of checks. Automation-assisted exploration can be very good for that, but it can be good for so much more besides.

    So, modification? No, probably not much, so it seems. Expansion, maybe. Let me give you an example.

    A while ago, I developed a program to be used in our testing classes. I developed that program test-first, creating some examples of input that it should accept and process, and input that it should reject. That was an exploratory process, in that I designed, executed, and interpreted unit checks, and I learned. It was also an automated process, to the degree that the execution of the checks and the aggregating and reporting of results was handled by the test framework. I used the result of each test, each set of checks, to inform both my design of the next check and the design of the program. So let me state this clearly:

    Test-driven development is an exploratory process.

    The running of the checks is not an exploratory process; that’s entirely scripted. But the design of the checks, the interpretation of the checks, the learning derived from the checks, the looping back into more design or coding of either program code or test code, or of interactive tests that don’t rely on automation so much: that’s all exploratory stuff.
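
    To make the distinction concrete, here is a minimal sketch of one such check, assuming (purely for illustration) a hypothetical puzzle module with a parse_input function that raises ValueError on input it rejects, and pytest as the test framework. The framework executes the checks mechanically; choosing the examples and interpreting a failure is the exploratory part.

        import pytest

        from puzzle import parse_input   # hypothetical module and function under test

        # Each example records a design decision made while exploring:
        # which inputs the program should accept, and which it should reject.
        ACCEPT = ["7", "42", "0007"]
        REJECT = ["", "forty-two", "4 2"]

        @pytest.mark.parametrize("value", ACCEPT)
        def test_accepts_valid_input(value):
            # The framework runs this check mechanically (the scripted part)...
            assert parse_input(value) is not None

        @pytest.mark.parametrize("value", REJECT)
        def test_rejects_invalid_input(value):
            # ...but deciding what belongs in ACCEPT and REJECT, and what a
            # failure means, is the exploratory part.
            with pytest.raises(ValueError):
                parse_input(value)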

    The program that I wrote is a kind of puzzle that requires class participants to test and reverse-engineer what the program does. That’s an exploratory process; there aren’t scripted approaches to reverse engineering something, because the first unexpected piece of information derails the script. In workshopping this program with colleagues, one in particular—James Lyndsay—got curious about something that he saw. Curiosity can’t be automated. He decided to generate some test values to refine what he had discovered in earlier exploration. Sapient decisions can’t be automated. He used Excel, which is a powerful test automation tool, when you use it to support testing. He invented a couple of formulas. Invention can’t be automated. The formulas allowed Excel to generate a great big table. The actual generation of the data can be automated. He took that data from Excel, and used the Windows clipboard to throw the data against the input mechanism of the puzzle. Sending the output of one program to the input of another can be automated. The puzzle, as I wrote it, generates a log file automatically. Output logging can be automated. James noticed the logs without me telling him about them. Noticing can’t be automated. Since the program had just put out 256 lines of output, James scanned it with his eyes, looking for patterns in the output. Looking for specific patterns and noticing them can’t be automated unless and until you know what to look for, BUT automation can help to reveal hitherto unnoticed patterns by changing the context of your observation. James decided that the output he was observing was very interesting. Deciding whether something is interesting can’t be automated. James could have filtered the output by grepping for other instances of that pattern. Searching for a pattern, using regular expressions, is something that can be automated. James instead decided that a visual scan was fast enough and valuable enough for the task at hand. Evaluation of cost and value, and making decisions about them, can’t be automated. He discovered the answer to the puzzle that I had expressed in the program… and he identified results that blew my mind—ways in which the program was interpreting data in a way that was entirely correct, but far beyond my model of what I thought the program did.

    Learning can’t be automated. Yet there is no way that we would have learned this so quickly without automation. The automation didn’t do the exploration on its own; instead, automation super-charged our exploration. There were no automated checks in the testing that we did, so no automation in the record-and-playback sense, no automation in the expected/predicted result sense. Since then, I’ve done much more investigation of that seemingly simple puzzle, in which I’ve fed back what I’ve learned into more testing, using variations on James’ technique to explore the input and output space a lot more. And I’ve discovered that the program is far more complex than I could have imagined.
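
    For readers who want a feel for the mechanics, here is a rough sketch in the spirit of James’ approach, with a short script standing in for Excel and the clipboard; the program name, the input format, and the generating formula are all assumptions, since his actual formulas aren’t reproduced here.

        import itertools
        import subprocess

        # Hypothetical invocation of the puzzle program under investigation.
        PUZZLE = ["python", "puzzle.py"]

        # Generate a big table of candidate inputs: the part a machine does well.
        # (256 values, to echo the 256 lines of output in the story above.)
        values = ["".join(pair) for pair in itertools.product("0123456789ABCDEF", repeat=2)]

        # Throw the whole table against the program's input mechanism in one go.
        result = subprocess.run(PUZZLE, input="\n".join(values).encode(), capture_output=True)

        # Dump the output for a human to scan. Generating and feeding the data can
        # be automated; noticing interesting patterns in what comes back cannot.
        print(result.stdout.decode(errors="replace"))

    The script handles only the generating and the feeding; everything James did with the output (noticing, wondering, deciding what was interesting) still has to happen in a human head.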

    So: is that automating exploratory testing? I don’t think so. Is that using automation to assist an exploratory process? Absolutely.

    For a more thorough treatment of exploratory approaches to automation, see

    Investment Modeling as an Exemplar of Exploratory Test Automation (Cem Kaner)

    Boost Your Testing Superpowers (James Bach)

    Man and Machine: Combining the Power of the Human Mind with Automation Tools (Jonathan Kohl)

    “Agile Automation” an Oxymoron? Resolved and Testing as a Creative Endeavor (Karen Wysopal)

    …and those are just a few.

    Thank you, Rahul, for the question.

    Exploratory Testing and Review

    Wednesday, September 22nd, 2010

    The following is a lightly-edited version of something that I wrote on the software-testing mailing list, based on a misapprehension that we who advocate exploratory testing suggest that review or other forms of testing should be dropped.

    Exploratory testing was, for many years, described as “simultaneous test design, test execution, and learning”. In 2006, a few of us who have been practising and studying exploratory testing got together to exchange some of what we had learned over the years, and to see if we could work on refining the definition. I did a presentation that described some of my experience at those meetings. Cem Kaner wrote a synthesis of our ideas, and several of us who were there (and many who weren’t) have since explicitly agreed with it.

    Exploratory software testing is a style of software testing that emphasizes the personal freedom and responsibility of the individual tester to continually optimize the value of her work by treating test-related learning, test design, test execution, and test result interpretation as mutually supportive activities that run in parallel throughout the project.

    I use that definition when I want to be explicit. Much of the time, though, I keep things shorter. I still use something close to the older definition, with one minor change.

    There’s a problem with the word “simultaneous”. When we say “simultaneous”, people seem to think that means that everything is happening at the same time and to the same degree; that is, all three of design, execution, and learning are turned up to 10, all the time. Some people believe that exploratory testing is something that happens only as direct, hands-on interaction with a working product, only after code has been written and compiled and linked. But that’s only one of the times at which something can be explored. In fact, exploratory approaches can be applied to any idea or artifact, at any stage of development of the product or service. That means you can be emphasizing test execution and learning, and relaxing emphasis on test design for a while; you can be designing a test and learning from that, while not executing the test immediately. You can be designing and executing in very short cycles, but learning less now than you might learn later. So, for that reason, we’ve started to say “parallel”, rather than “simultaneous”.

    One of the things that I get from Cem’s synthesis is the notion of mutually supporting activities. The traditional, more linear approaches suggest that excellent test execution depends on excellent test design. It’s hard to disagree with that. But excellent test design—and improving test design— depends on feedback from test execution, too. In general, when the loops of design, execution, and learning are shorter, the feedback from one can inform the others more quickly. But that’s not to say that you can’t design a test and then wait to act on it, if that’s the most appropriate thing to do for the moment. However, when there are very long loops (or no loops), then you’re working in a scripted way, rather than an exploratory way. Shorter loops mean that testing is more exploratory.

    In addition, something is more exploratory when an individual or a group of people (rather than a process or a script) is in charge. You can do test design and test planning in a less exploratory way by mixing it with only a little test execution. You can do test design and test planning in a more exploratory way by mixing it with a lot of test execution. (Even in a heavily scripted process, that exploratory activity happens a lot without people noticing it, so it seems.)

    For example, review is a testing activity—questioning a product in order to evaluate it. There are scripted and exploratory forms of review. Consider code review. A completely scripted form of code review is a static analysis tool that looks for problems that it has been programmed to identify. A more exploratory form of code review is a bunch of people looking over a couple of pages of code, looking for specific problems that have been outlined on a checklist. A still more exploratory form of code review is a bunch of people looking for problems from a checklist, but also looking for any other problems that they might see. Perhaps the most exploratory form of code review is pair programming—people looking over code that is sort-of working, creating unit checks, revising the code, running the checks, and iterating right then and there.

    Other forms of technical review can take the same arc. In the most scripted form, people receive (say) a functional design document, run it through a spelling and grammar checker, and sign off on it—and that’s the only review of the document that ever happens. In a less scripted form, people receive the design document and review it, comparing it to a list of specified requirements and quality criteria. In a more exploratory form, people look at examples or a prototype of various functions, and discuss what they’ve seen; at the end of the conversation, the designer takes the notes away and goes back to build a new prototype. In an extremely exploratory form of design, people sit around a projector and work on a Fitnesse page, raising ideas and concerns, discussing them, resolving them, and updating the examples and notes on the prototype in real time.

    No one who talks seriously about exploratory testing, so far as I know, talks about getting rid of review. What we do talk about is getting rid of things that waste time and mental power by introducing interruptions, needless documentation, and processes or tools that over-mediate interaction between the tester and the product. Don’t get rid of documentation; get rid of excessive amounts of documentation, or unhelpful documentation. Don’t test thoughtlessly, and don’t get rid of thinking; get rid of overthinking or freezing in the headlights. Don’t get rid of test design; shorten the feedback loops between getting an idea and acting on an idea, and then feed what you’ve learned through action back into the design. Don’t control testers’ activities through a script; guide a tester with concise documentation (charters, checklists, coverage outlines, or risk lists) that helps the tester to keep focused, but that allows them to defocus and investigate using their own mindsets and skill sets when it’s appropriate to do so.

    Testing is investigation of a product. Investigation can be applied at any time, to any idea or artifact. That investigation is ongoing, and it comprises design, execution, and learning. From one moment to the next, one might take precedence over the others, but which one is at the fore can flip at any instant. What distinguishes the exploratory mindset from the scripted mindset is the degree to which the tester, rather than some other agency, has the freedom and responsibility to make decisions about what he or she will do next.

    Encouraging Programmers to be Testers

    Monday, September 20th, 2010

    A colleague wrote to me recently and asked about a problem that he’s had in hiring. He says…

    The kind of test engineers we’re looking for are ones that can think their way around a system and look for all the ways that things can go wrong (pretty standard, so far), and then code up a tool or system that can automatically verify that those things haven’t gone wrong (a bit more rare, especially, for some reason, in the part of the country where I’m working).

    The problem is that I suspect there is a larger pool of candidates who fit this description, but think of themselves as software developers. They never consider a software engineer in test role because they think “oh, that’s QA”.

    It’s a little more easy on the west coast—in Silicon Valley and the Pacific Northwest, I mean—because big companies like Microsoft, Google, and others define a lot of their test engineering roles as “software developers who build systems that just happen to test other systems as their product”.

    Any thoughts on how to shift that perception here to be more closely aligned with the west coast?

    My reply went like this:

    I don’t like that particular perception, myself. I observe that the “Software Development Engineer in Test” view often presupposes, on the one hand, that the SDET role is a consolation prize for not being hired as a Real Programmer. On the other hand, the view poses a barrier to entry for non-programmers who want to be testers.

    If your prospects think, “Oh, that’s QA” in this dismissive kind of way, what kind of testers are they going to be? Moreover, what kind of programmers are they going to be? To me and to our community, testing is questioning a product in order to evaluate it. Testing is a service to the project, wherein testers help to discover risks and problems that threaten the value of the product and the goals of the business. While testers are specialists in that kind of role, testing is something that any programmer should be doing too, as he or she is writing, reviewing, and revising code. So if prospects think that testing is some kind of second-class role or task, who needs them? Encourage them to take a hike and come back when they’ve learned not to be such prima donnas.

    If offered a gaggle of programming enthusiasts to choose from, I’d prefer to hire the person who is most genuinely interested in testing. I’d give that person a toolsmith role, wherein (s)he provides programming services to other testers. In addition, the toolsmith can provide the service of aiding testers in learning to program, while the testers help the toolsmith to sharpen his/her testing skills.

    If you’d like to motivate programmer-types to test, here’s a suggestion: Instead of treating testing as a programming exercise, treat it as a more general problem-solving task in which hand-crafted tools might be very helpful. James Bach and I do this by giving people testing puzzles that look simple but that have devious twists and traps. In my experience, most programmers (and all of the best ones) enjoy challenging intellectual problems, irrespective of whether programming is at the centre of the solution. In our classes and at conferences and in jam sessions, we show programmers over and over again how they can be fooled by puzzles, just as anyone else can. We hasten to point out that programming can help to solve problems in a manner that can be far more practical and useful than other approaches for solving some problems in some contexts. But as it goes with production code, so it goes with test code: working out how to solve the problem is the most challenging part, and writing a program is a means to that end, rather than the end in itself. The trick is to help people to recognize when they might want to emphasize writing code and when they might want to emphasize other approaches and other lines of investigation.

    As Cem Kaner points out in his talk “Investment Modeling as an Exemplar of Exploratory Test Automation“, the best skilled testers (and especially those with programming skills) have the kind of mindset that we might easily associate with the best quantitative analysts. On Wall Street, they’re called quants, and they make a lot of money. In order to be successful and to reduce risk, both their programs and their models have to be useful, reliable, accurate—which means they have to be tested, and tested well.

    Trouble is, many managers treat testing as a rote, clerical, bureaucratic, mechanistic activity. If that were all there is to testing, it would be a dead-end role and ripe for automation, but testing is so much more than coding up a set of checks. It’s up to us to help others to recognize what testing can do, by offering conversation, challenges, and leadership by example—and information about products and projects that people genuinely value. To do that, we need a community of testers who are passionate about their craft and practice it with skill, recognizing that it’s thinking—about risk, cost, value, and learning—that’s central.

    But if you put the hammer and hammering at the centre of the task, you’ll get enthusiastic hammerers applying for the job, where I suspect you really want cabinet makers.

    Done, The Relative Rule, and The Unsettling Rule

    Thursday, September 9th, 2010

    The Agile community (to the degree that such a thing exists at all; it’s a little like talk about “the software industry”) appears to me to be really confused about what “done” means.

    Whatever “done” means, it’s subject to the Relative Rule. I coined the Relative Rule, inspired by Jerry Weinberg‘s definition of quality (“quality is value to some person(s)”). The Relative Rule goes like this:

    For any abstract X, X is X to some person, at some time.

    For example, the idea of a “bug” is subject to the Relative Rule. A bug is not a thing that exists in the world; it doesn’t have a tangible form. A bug is a relationship between the product and some person. A bug is a threat to the value of the product to some person. The notion of a bug might be shared among many people, or it might be exclusive to some person.

    Similarly: “done” is “done to some person(s), at some time,” and implicitly, “for some purpose“. To me, a tester’s job is to help people who matter—most importantly, the programmers and the product owner—make an informed decision about what constitutes “done” (and as I’ve said before, testers aren’t there to make that determination themselves). So testers, far from worrying about “done”, can begin to relax right away.

    Let’s look at this in terms of a story.

    A programmer takes on an assignment to code a particular function. She goes into cycles of test-driven development, writing a unit check, writing code to make the check pass, running a suite of prior unit checks and making sure that they all run green, and repeating the loop, adding more and more checks for that function as she goes. Meanwhile, the testers have, in consultation with the product owner, set up a suite of examples that demonstrate basic functionality, and they automate those examples as checks.

    The programmer decides that she’s done writing a particular function. She feels confident. She runs her code against the examples. Two examples don’t work properly. Ooops, not done. Now she doesn’t feel so confident. She writes fixes. Now the examples all work, so now she’s done. That’s better.

    A tester performs some exploratory tests that exercise that function, to see if it fulfills its explicit requirements. It does. Hey, the tester thinks, based on what I’ve seen so far, maybe we’re done programming… but we’re not done testing. Since no one—not testers, not programmers, not even requirements document writers—imagine!—is perfect, the tester performs other tests that explore the space of implicit requirements.

    The tester raises questions about the way the function might or might not work. The tester expands the possibilities of conditions and risks that might be relevant. Some of his questions raise new test ideas, and some of those tests raise new questions, and some of those questions reveal that certain implicit requirements haven’t been met. Not done!

    The tester is done testing, for now, but no one is now sure that programming is done. The programmer agrees that the tester has raised some significant issues. She’s mildly irritated that she didn’t think of some of these things on her own, and she’s annoyed that others didn’t make them explicit in the specs that were given to her. Still, she works on both sets of problems until they’re addressed too. (Done.)

    For two of the issues the tester has raised, the programmer disagrees that they’re really necessary (that is, things are done, according to the programmer). The tester tries to make sure that this isn’t personal, but remains concerned about the risks (things are not done, according to the tester). After a conversation, the programmer persuades the tester that these two issues aren’t problems (oh, done after all), and they both feel better.

    Just to be sure, though, the tester brings up the issues with the product owner. The product owner has some information about business risk that neither the tester nor the programmer had, and declares emphatically that the problem should be fixed (not done).

    The programmer is reasonably exasperated, because this seems like more work. Upon implementing one fix, the programmer has an epiphany; everything can be handled by a refactoring that simultaneously makes the code easier to understand AND addresses both problems AND takes much less time. She feels justifiably proud of herself. She writes a few more unit checks, refactors, and all the unit checks pass. (Done!)

    One of the checks of the automated examples doesn’t pass. (Damn; not done.) That’s frustrating. Another fix; the unit checks pass, the examples pass, the tester does more exploration and finds nothing more to be concerned about. Done! Both the programmer and the tester are happy, and the product owner is relieved and impressed.

    Upon conversation with other programmers on the project team, our programmer realizes that there are interactions between her function and other functions that mean she’s not done after all. That’s a little deflating. Back to the drawing board for a new build, followed by more testing. The tester feels a little pressured, because there’s lots of other work to do. Still, after a little investigation, things look good, so, okay, now, done.

    It’s getting to the end of the iteration. The programmers all declare themselves done. All of the unit checks are running green, and all of the ATDD checks are running green too. The whole team is ready to declare itself done. Well, done coding the new features, but there’s still a little uncertainty because there’s still a day left in which to test, and the testers are professionally uncertain.

    On the morning of the last day of the iteration, the programmers get into scoping out the horizon for the next iteration, while testers explore and perform some new tests. They apply oracles that show the product isn’t consistent with a particular point in a Request-For-Comment that, alas, no one has noticed before. Aaargh! Not done.

    Now the team is nervous; people are starting to think that they might not be done what they committed to do. The programmers put in a quick fix and run some more checks (done). The testers raise more questions, perform more investigations, consider more possibilities, and find that more and more stopping heuristics apply (you’ll find a list of those here: http://www.developsense.com/blog/2009/09/when-do-we-stop-test/). It’s 3:00pm. Okay, finally: done. Now everyone feels good. They set up the demo for the iteration.

    The Customer (that is, the product owner) says “This is great. You’re done everything that I asked for in this iteration.” (Done! Yay!) “…except, we just heard from The Bank, and they’ve changed their specifications on how they handle this kind of transaction. So we’re done this iteration (that is, done now, for some purpose), but we’ve got a new high-priority backlog item for next Monday, which—and I’m sorry about this—means rolling back a lot of the work we’ve done on this feature (not done for some other purpose). And, programmers, the stuff you were anticipating for next week is going to be back-burnered for now.”

    Well, that’s a little deflating. But it’s only really deflating for the people who believe in the illusion that there’s a clear finish line for any kind of development work—a finish line that is algorithmic, instead of heuristic.

    After many cycles like the above, eventually the programmers and the testers and the Customer all agree that the product is indeed ready for deployment. That agreement is nice, but in one sense, what the programmers and the testers think doesn’t matter. Shipping is a business decision, and not a technical one; it’s the product owner that makes the final decision. In another sense, though, the programmers and testers absolutely matter, in that a responsible and effective product owner must seriously consider all of the information available to him, weighing the business imperatives against technical concerns. Anyway, in this case, everything is lined up. The team is done! Everyone feels happy and proud.

    The product gets deployed onto the bank’s system on a platform that doesn’t quite match the test environment, at volumes that exceed the test volumes. Performance lags, and the bank’s project manager isn’t happy (not done). The testers diligently test and find a way to reproduce the problem (they’re done, for now).

    The programmers don’t make any changes to the code, but find a way to change a configuration setting that works around the problem (so now they’re done). The testers show that the fix works in the test environments and at heavier loads (done). Upon evaluation of the original contract, recognition of the workaround, and after its own internal testing, the bank accepts the situation for now (done) but warns that it’s going to contest whether the contract has been fulfilled (not done).

    Some people are tense; others realize that business is business, and they don’t take it personally. After much negotiation, the managers from the bank and the development shop agree that the terms of the contract have been fulfilled (done), but that they’d really prefer a more elegant fix for which the bank will agree to pay (not done). And then the whole cycle continues. For years.

    So, two things:

    1) Definitions and decisions about “done” are always relative to some person, some purpose, and some time. Decisions about “done” are always laden with context. Not only technical considerations matter; business considerations matter too. Moreover, the process of deciding about doneness is not merely logical, but also highly social. Done is based not on What’s Right, but on Who Decides and For What Purpose and For Now. And as Jerry Weinberg points out, decisions about quality are political and emotional, but made by people who would like to appear rational.

    However, if you want to be politically, emotionally, and rationally comfortable, you might want to take a deep breath and learn to accept—with all of your intelligence, heart, and good will—not only the first point, but also the second…

    2) “Done” is subject to another observation that Jerry often makes, and that I’ve named The Unsettling Rule:

    Nothing is ever settled.

    Update, 2013-10-16: If you’re still interested, check out the esteemed J.B. Rainsberger on A Better Path To the “Definition of Done”.

    The Motive for Metaphor

    Friday, September 3rd, 2010

    There’s a mildly rollicking little discussion going on in the Software Testing Club at the moment, in which Rob Lambert observes, “I’ve seen a couple of conversations recently where people are talking about red, green and yellow box testing.” Rob then asks, “There’s the obvious black and white. How many more are there?”

    (For what it’s worth, I’ve already made some comments about a related question here.)

    At one point a little later in the conversation, Jaffamonkey (I hope that’s a pseudonym) replies,

    If applied in modern context Black Box is essentially pure functional testing (or unit testing) whereas White Box testing is more of what testers are required to do, which is more about testing user journeys, and testing workflows, usability etc.

    Of course, that’s not what I understand the classical distinction to be.

    The classical distinction started with the notion of “black box” testing. You can’t see what’s inside the box, and so you can’t see how it’s working internally. But that may not be so important to you for a particular testing mission; instead, you care about inputs and outputs, and the internal implementation isn’t such a big deal. You’d take this kind of approach when a) you don’t have source code; or b) you’re intentionally seeking problems that you might not notice so quickly by inspection, but that you might notice by empirical experiments and observation; or maybe c) you may believe that the internal implementation is going to be varied or variable, so no point in taking it into account with respect to the current focus of your attention. I’m sure you can come up with more reasons.

    This “black box” idea suggests a contrast: “glass box” testing. Since glass is transparent, you can see the inner workings, and the insight into what is happening internally gives you a different perspective for risks and test ideas. This is especially important when a) your mission involves testing what’s happening inside the box (programmers take this perspective more often than not); or b) your overall mission will be simpler, in some dimension, because of your understanding of the internals; or maybe c) you want to learn something about how someone has solved a particular problem. Again, I’m sure you can come up with lots more reasons; these are examples, not definitive lists.
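
    To make the classical contrast concrete, here is a minimal sketch in Python. The round_to_cents function and its values are hypothetical, invented here purely for illustration; they don’t come from the discussion above. The black-box test is designed from the stated behaviour alone: choose inputs, check outputs. The glass-box test is designed after reading the implementation, deliberately aiming at the half-up rounding branch.

        # Hypothetical example for illustration only.
        from decimal import Decimal, ROUND_HALF_UP

        def round_to_cents(amount: Decimal) -> Decimal:
            """Round a monetary amount to two decimal places, halves rounding up."""
            return amount.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

        def test_black_box():
            # Designed from the specification alone: typical and boundary inputs,
            # checked only by their outputs.
            assert round_to_cents(Decimal("1.234")) == Decimal("1.23")
            assert round_to_cents(Decimal("0")) == Decimal("0.00")

        def test_glass_box():
            # Designed after reading the code: the ROUND_HALF_UP setting means an
            # exact half rounds away from zero, so target that branch directly.
            assert round_to_cents(Decimal("1.005")) == Decimal("1.01")

        if __name__ == "__main__":
            test_black_box()
            test_glass_box()
            print("Both tests pass.")

    The arithmetic isn’t the point; the point is that the same function invites different test ideas depending on whether you can see inside the box, which is all the original metaphor was meant to convey.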

    Unhelpfully (to me), someone somewhere along the way decided that the opposite of “black” must be “white”; that black box testing was the kind where you can’t see inside the box; and that therefore white (rather than glass) box testing must be the name for the other stuff. At this point, the words and the model begin to part company.

    Even less helpfully, people stopped thinking in terms of a metaphor and started thinking in terms of labels dissociated from the metaphor. The result is an interpretation like Jaffa’s above, where he (she?) seems to have inverted the earlier interpretations, for reasons I can’t fathom. Who knows? Maybe it’s just a typo.

    More unhelpfully still (to me), someone has (or several someones have) apparently come along with color-coding systems for other kinds of testing. Bill Matthews reports that he’s found

    Red Box = “Acceptance testing” or “Error message testing” or “networking, peripherals testing and protocol testing”
    Yellow Box = “testing warning messages” or “integration testing”
    Green Box = “co-existence testing” or “success message testing”

    Sources:
    http://www.testrepublic.com/forum/topics/define-red-box-testing-yellow
    http://www.geekinterview.com/question_details/27985
    http://www.allinterview.com/showanswers/7077.html
    http://www.coolinterview.com/interview/10080/

    For me, there are at least four big problems here. First, there is already disagreement on which colours map to which concepts. Second, there is no compelling reason that I can see to associate a given colour with any of the given ideas. Third, the box metaphor doesn’t have a clear relationship to what’s going on in the mind or the practice of a tester. The colour is an arbitrary label on an unconstrained container. Fourth, since the definitions appear on interview sites and the sites disagree, there’s a risk that some benighted hiring manager will assume that there is only one interpretation, and will deprive himself of an otherwise skilled tester who read a different site. (To defend yourself against this fourth problem, use safety language: “Here’s what I understand by ‘chartreuse-box testing’. This is the interpretation given by this person or group, but I’m aware there may be other interpretations in your context.” For extra points, try saying something like, “Is that consistent with your interpretation? If not, I’d be happy to adopt the term the way you use it around here.” And meaning it. If they refuse to hire you because of that answer, it’s unlikely that working there would have been much fun.)

    All of this paintbox of terms is unhelpful (to me) because it means another 30,000 messages on LinkedIn and QAForums, wherein enormous numbers of testers weigh in with their (mis)understandings of some other author’s terms and intentions—and largely with the intention of asking or answering homework questions, so it seems. The next step is that, at some point, some standards-and-certification body will have to come along and lay down the law about what colour testing you would have to do to find out how many angels can dance on the head of a pin, what colour the pin is, and whether the angels are riding unicorns. And then another, competing standards-and-certification body will object, saying that it’s not angels, it’s fairies, and it’s not unicorns, it’s centaurs, and they’re not dancing, they’re doing gymnastics. And don’t get us started on the pin! Courses and certifications on colour-mapping to mythological figures will be available (at a fee) to check (not test!) your ability to memorize a proprietary table of relationships. Meanwhile, most of the people involved in the discussion will have forgotten—in the unlikely event that they ever knew—that the point of the original black-and-glass exercise was to make things more usefully understandable. Verification vs. validation, anyone? One is building the right thing; the other is building the thing right. Now, quick: which is which? Did you have to pause to think about it? And if you find a problem wherein the thing was built wrong, or that the wrong thing was built, does anyone really care whether you were doing validation testing or verification testing at the time?

    Well… maybe they do. So, all that said, remember this: no one outside your context can tell you what words you can or can’t use. And remember this too: no one outside your context can tell you what you can or can’t find useful. Some person, somewhere, might find it handy to refer to a certain kind of testing as “sky testing” and another kind of testing as “ground testing”, and still another as “water testing”. (No, I can’t figure it out either.) If people find those labels helpful, there’s nothing to stop them, and more power to them. But if the labels are unhelpful to you and only make your brain hurt, it’s probably not worth a lot of cycles to try to make them fit for you.

    So here are some tests that you can apply to a term or metaphor, whether you produce it yourself or someone else produced it:

    • Is it vivid? That is (for a testing metaphor), does it allow you to see easily in your mind’s eye (hear in your mind’s ear, etc.) something in the realm of common experience but outside the world of testing?
    • Is it clear? That is, does it allow you to make a connection between that external reference and something internal to testing? Do people tend to get it the first time they hear it, or with only a modicum of explanation? Do people retain the connection easily, such that you don’t have to explain it over and over to the same people? Do people in a common context agree easily, without arguments or nit-picking?
    • Is it sticky? Is it easy to remember without having to consult a table, a cheat sheet, or a syllabus? Do people adopt the expression naturally and easily, and do they use it?

    If the answer to these questions is Yes across the board, it might be worthwhile to spread the idea. If you’re in doubt, field-test the idea. Ask for (or offer) explanations, and see if understanding is easy to obtain. Meanwhile, if people don’t adopt the idea outside of a particular context, do everyone a favour: ditch it, or ignore it, or keep it within a much closer community.

    In his book The Educated Imagination (based on the Massey Lectures, a set of broadcasts he did for the Canadian Broadcasting Corporation in 1963), Northrop Frye said, “Outside literature, the main motive for writing is to describe this world. But literature itself uses language in a way which associates our minds with it. As soon as you use associative language, you begin using figures of speech. If you say, “this talk is dry and dull”, you’re using figures associating it with bread and breadknives. There are two main kinds of association, analogy and identity, two things that are like each other and two things that are each other (my emphasis –MB). One produces a figure of speech called the simile. The other produces a figure called metaphor.”

    When we’re trying to describe our work in testing, I think most people would agree that we’re outside the world of literature. Yet we learn most easily and most powerfully by association—by relating things that we don’t understand well to things that we understand a little better in some specific dimension. In reporting on our testing, we’re often dealing with things that are new to us, and telling stories to describe them. The same is true in learning about testing. Dealing with the new and telling stories leads us naturally to use associative language. Frye explains why we have to be cautious: “In descriptive writing, you have to be careful of associative language. You’ll find that analogy, or likeness to something else, is very tricky to handle in description, because the differences are as important as the resemblances. As for metaphor, where you’re really saying “this is that,” you’re turning your back on logic and reason completely because logically two things can never be the same thing and still remain two things.”

    Having given that caution, Frye goes on to explain why we use metaphor, and does so in a way that I think might be helpful for our work: “The poet, however, uses these two crude, primitive, archaic forms of thought in the most uninhibited way, because his job is not to describe nature but to show you a world completely absorbed and possessed by the human mind…The motive for metaphor, according to Wallace Stevens, is a desire to associate, and finally to identify, the human mind with what goes on outside it, because the only genuine joy you can have is in those rare moments when you feel that although we may know in part, as Paul says, we are also a part of what we know.”

    So the final test of a term or a metaphor or a heuristic, for me, is this:

    • Is it useful? That is, does it help you make sense of the world to the degree that you can identify an idea with something deeper and more resonant than a mere label? Does it help you to own your ideas?

    Postscript, 2013/12/10: “A study published in January in PLOS ONE examined how reading different metaphors—’crime is a virus’ and ‘crime is a beast’—affected participants’ reasoning when choosing solutions to a city’s crime problem…. (Researcher Paul) Thibodeau recommends giving more thought to the metaphors you use and hear, especially when the stakes are high. ‘Ask in what ways does this metaphor seem apt and in what ways does this metaphor mislead,’ he says. Our decisions may become sounder as a result.” Excerpted from Salon.