DevelopsenseLogo

Expected Results

Klára Jánová is a dedicated tester who studies and practices and advocates Rapid Software Testing. Recently, on LinkedIn, she said:

I might EXPECT something to happen. But that doesn’t necessarily mean that I WANT IT/DESIRE for IT to happen. I even may want it to happen, but it not happening doesn’t have to automatically mean that there’s a problem.

The point of this post: no more “expected results” in the bug reports, please!

In reply, Derek Charles asked:

Then how else would you communicate to the developer or the team what is SUPPOSED to happen? I think that expected results are very necessary especially when regressions are found during testing.

Klara replied:

I suggest to describe the behavior that the tester recognizes as problematic and explain WHY it might be a problem for someone—the reasoning why the behavior is perceived as a bug—that’s what really matters.

Exactly so. Klára is referring here to problems and oracles—means by which we recognize problems when we encounter them in testing.

There’s an issue with the “what is supposed to happen” stuff: in development work, what is supposed to happen is not always entirely clear. Moreover, and more importantly, since testers don’t run the project or the business, we don’t mandate what is supposed to happen.

For instance, while testing, I may observe something in the product that I find confusing, or surprising, or wrong. When I look up the intended behaviour in the specification, it says one thing; the developer, claiming that the spec is out of date, contradicts it; and the product owner confirms that the spec is outdated. But she also says that the developer’s interpretation of what should happen is not what she wants him to implement. And then, when I consult an RFC, the product owner’s interpretation is inconsistent with what the RFC says should be the appropriate behaviour.

Fortunately, I don’t have to decide, and I don’t have to say what should happen. My job as a tester is to report on an apparent inconsistency between the product and presumably desirable things, or between the product and someone’s expressed desire or requirement. In the case above, I let the product owner know about the inconsistency between her interpretation and the standard, and she makes the call on what she and the business want from the product.

That is, even though I have certain expectations, I might be wrong about them and about what I think should be. For instance, she might decide that our product is not going to support that standard. She might point out that the standard I’m considering has been superseded by a later one. In any case, what is supposed to happen gets decided not by me, but by the people who run things. That’s what they’re paid for. This is a good thing, not a bad thing.

But still, I’d like to honour Derek’s question: as testers, how should we report a problem without referring to “expected results”?

  • Instead of saying “expected result” and leaving it that, we could say “inconsistent with the specification”.

    Inconsistency with the specification is a special case of a more general way of recognizing and describing a problem: inconsistency with claims. “Inconsistency with claims” is an oracle heuristic. (A heuristic is a fallible means for solving a problem; an oracle is a special kind of heuristic which, fallibly, helps you to solve the problem of identifying and describing a bug.) When a product is inconsistent with a claim that someone important makes about it, there’s likely a problem, either with the product or the claim. As a tester, I don’t have to decide which.

    The specification is a particular form of a claim that someone is making about what the product is like, or what it should be like. Claims can be made in design sessions, planning meetings, pair programming, hallway conversations, training workshops… Claims can be represented in help files, marketing materials, workflow diagrams, lookup tables, user manuals, whiteboard sketches, UML diagrams… Claims can also be represented in the code of an automated check, where someone has written code to compare the output of the product with an anticipated and presumably desirable result. Recognizing many sources of claims and inconsistencies with them makes us more powerful testers.

    Whatever relevant claim you’re referring to, having said “inconsistent with a claim” (and having identified the nature of the claim, and where or whom it comes from), you don’t need to say “expected result”.

  • Instead of saying “expected result” and leaving it that, you could say “inconsistent with how the product used to work”.

    Inconsistency with history is an oracle heuristic. After a change, the product might have a new bug in it. On the other hand, the product might have been wrong all along, and now it’s right. (This is an example of how oracles can mislead us or conflict with each other, which is why it’s a good idea to identify the oracles we’re applying in problem reports.) If you (or others) aren’t aware of why the desirable change was made, that’s a different kind of problem, but a problem nonetheless.

    Either way, having said “inconsistent with how the product used to work” (and having described that in terms of a problem), you don’t need to say “expected result”.

  • Instead of saying “expected result” and leaving it that, you could say “inconsistent with respect to the product itself”.

    Inconsistency within the product is an oracle heuristic. This can takes a number of forms: the product might return inconsistent results from one run to the next; the product could afford a tidy, smooth interface in one place, and a frustrating, confusing interface in another; the product could present output very precisely in one part of the product, and imprecisely in another; one component in the product could log output using one format, while another component’s log output is in a different format, which makes analysis more difficult…

    The inconsistency might be undesirable (because of a reliability problem), or it might be completely desirable (a Web page for a newspaper should change from day to day), or it might desirable or undesirable in ways that you’re not aware of (since, like me, you probably don’t know everything).

    In general, people tend to prefer things that present themselves in a consistent way. Here’s a trivial example from Microsoft Office (Office 365, these days): to search for text in Word, the keyboard command is Ctrl-F. In Outlook, part of the same product suite, Ctrl-F triggers the Forward Message action instead; F4 triggers a search. Had Outlook and Word been designed by the same teams at the same time, this probably would have been identified as a bug, and addressed. In the end, the Office suite’s program managers decided that consistency with history dominated inconsistency within the product, and now we all have to live with that. Oh well.

    In any case, having said “inconsistent with respect to some aspect of the same product” (and having identified the specifics of the inconsistency), you don’t need to say “expected result”.

  • Instead of saying “expected result” and leaving it that, you could say “inconsistency with a comparable product” (and identify the product, and the nature of the inconsistency).

    Inconsistency with a comparable product is an oracle heuristic. Any product (something that someone has produced) that provides a relevant point of comparison is, by defintion, a comparable product. That includes competitive products, of course; Microsoft Word and Google Docs are comparable products, in that sense. Microsoft Word and WordPad are comparable products too; they have many features in common. If Word can’t open an .RTF file generated by WordPad, we have reason to suspect a problem in one product or the other. If WordPad prints an RTF file properly, and Word does not, we have reason to suspect a problem in Word.

    Is the Unix program wc (wc stands for “word count”) a comparable product to Microsoft Word? All wc does is count words in text files, so no, except… Word has a word-counting feature. If Word’s calculation for the number of words in a text file is inexplicably different from wc‘s count, we have reason to suspect a problem in one product or the other.

    Test tools and suites of automated output checks represent comparable products too. If the output from your product is inconsistent with the specified and desired results provided by your test tool, or with some data that it processes to produce such results, you have reason to suspect a problem somewhere.

    In any case, having said “inconsistent with a comparable product”, and having identified the product and the basis for comparison, you don’t need to say “expected result”.

Those are just a few examples. When we teach Rapid Software Testing, we offer a set of oracle heuristics that identify principles of desirable (and undesirable) consistency (and inconsistency) for identifying bugs; you can read more about those here.

James Bach has recently identified another principle that might apply to bugs but that, in my view, more powerfully applies to enhancement requests: we desire the product to be consistent with acceptable quality: that is, not only good, but every bit as good as it can be.

Why is all this a big deal? Several reasons, I think.

First, “expected result” begs the question of where the expectation comes from. It’s just a middleman for something we could say more specifically. Why not get to the point and say it while at the same time sounding like a pro? Because…

Second, being specific about where the expectation comes from saves time and focuses conversation on the (un)desirable (in)consistencies that matter when developers and product owners are deciding whether something is a bug worth fixing. It also helps to focus repair in the appropriate claim (for example, if the product is right and the spec is wrong, it’s a prompt to repair the spec).

Third, it helps for us to remember that our job as testers is not to confirm that the product works “as expected”, but to ask “is there a problem here?” A product can fulfill an expectation and nonetheless have terrible problems about it. It’s our job to seek and find and describe inconsistencies and problems that matter before it’s too late.

And finally…

Fourth, speaking in terms of an oracle instead of an “expected result” can help to avoid patronizing, condescending, time-wasting, and obvious elements of bug reports that cause developers to feel insulted or to roll their eyes.

Actual result: Product crashes.

Expected result: Product does not crash.

Don’t be that tester.

Further reading:

Not-So-Great Expectations
Oracles From the Inside Out
FEW HICCUPPS

Want to learn how to observe, analyze, and investigate software? Want to learn how to talk more clearly about testing with your clients and colleagues? Rapid Software Testing Explored, presented by me and set up for the daytime in North America and evenings in Europe and the UK, November 9-12. James Bach will be teaching Rapid Software Testing Managed November 17-20, and a flight of Rapid Software Testing Explored from December 8-11. There are also classes of Rapid Software Testing Applied coming up. See the full schedule, with links to register here.

3 replies to “Expected Results”

  1. I do not agree fully with this.

    Michael replies: Okay. I don’t mind if people don’t agree, but I love it when they think, add to my argument, or bring a counter-argument.

    Its a way of sometimes getting an excuse of your product knowledge.

    I’m not sure I understand this. Help me out?

    If you are aware it was behaving the x way before and now its failing, you should mention whats expected.

    I’d put this differently, I think. If the product was behaving in some way before and is behaving differently now and is failing, it must be inconsistent with some oracle—so it’s a good idea to identify the nature of that inconsistency. There’s a problem with “what is expected”, though: depending on the source of information or the oracle, “expected” can mean anticipated, desired, required, tolerable, unsurprising, or specified—and different information sources and different oracles might conflict. As the post above suggest, it’s not really up to the tester to declare what should happen in any absolute kind of sense, but it might be okay for the tester to say “according this [principle, mechanism, artifact, person, comparable product, or feeling], the product should do this.”

    Otherwise it waste a lot of time

    It might. It might waste more time if the tester is wrong, though.

    Reply

Leave a Comment