Blog: Transpection Transpected

Part of the joy of producing this blog is in seeing what happens when other people pick up the ideas and run with them.  That happened when I posted a scenario on management mistakes a few weeks ago, and Markus Gärtner responded with far more energy and thought than I would have expected. Thanks, Markus.

Last week I posted a transcript of a transpection session between me and James Bach.  The responses and the comments were very gratifying, but Oliver Vilson’s comment has sparked a discussion of its own. Oliver says,

I would have to say it is not only possible to test the clock-in-the-box but actually necessary.

I see it as an exercise in which you have to test part of a system over which you have no control.

For example, I’ve had problems with integrations to third-party systems that gave absolutely nonsensical errors about things nobody could have thought of at the time, and they messed up the correct behavior of the primary system pretty badly. We could do nothing but observe what happened. There was almost no way for the end user to change the input data; it either happened or it didn’t. But it ended up as a very useful experience in testing.

I discussed your exercise with my colleague Rasmus, and we found at least a few ways to test it without giving it direct input:

1) Expectations – for example: In what format does it show the time? Is it understandable?
2) End-values – turnover of seconds/minutes/hours where, for example, 59 -> 00
3) Load testing – how much does it start to lie in 10 seconds, 1 minute, 1 hour, 1 day, 1 month, 1 year, etc., compared to, let’s say, a quantum clock or NIST-F1?
4) What time zone is it showing? That can be tricky; look at India’s time zone, for example.
5) How long does the battery last before it shuts down, or before it starts to “lie”? How rapidly does it start to lie when the batteries are running low?
6) How are the digits shown? Are they visible from other angles? Are they too small or too big?

And a few ways to have direct input without moving or touching the box itself:
1) Put a powerful-enough magnet next to the box to see what happens.
2) Set off an EMP bomb near the box to see what happens.

With best regards
Oliver V.

I’ve had the pleasure of meeting Oliver Vilson a couple of times.  I find his thinking to be incisive and insightful, and he has provided me with a couple of excellent stories.  The first thing that Oliver has done here is to help with transfer:  the idea that our odd little thought experiment about the clock can be transferred to real-world contexts.  Oliver is right:  no matter what we test, much of the time we interact with things that are black boxes, closed to us.  Sometimes we have to take the operation of the black boxes on trust.  Other times we have to test them, and as we’re testing them, we’re nastily constrained by our inability to control or influence the factors in the experiment.  Identifying those factors, getting around those constraints (to the degree that we can), and figuring out what and how to observe are all central to testing skill.

As I was reading, it also occurred to me that Oliver’s list of test ideas could provide a very nice example of the way to use the HICCUPPS(F) mnemonic for oracle heuristics and the CRUSSPICSTMPL mnemonic (!) for quality criteria in both a retrospective and a generative way.

Let’s recall:  HICCUPPS(F) is a mnemonic by which we remember consistency heuristics for oracles, the principles or mechanisms by which we might recognize a problem.  We perceive no problem when all of the following heuristics hold, and we suspect a problem when any one of the following heuristics is violated:

History: The present version of the system is consistent with past versions of itself.
Image: The system is consistent with an image that the organization wants to project.
Comparable Products: The system is consistent with comparable systems.
Claims: The system is consistent with what important people say it’s supposed to be.
Users’ Expectations: The system is consistent with what users want.
Product: Each element of the system is consistent with comparable elements in the same system.
Purpose: The system is consistent with its purposes, both explicit and implicit.
Statutes: The system is consistent with applicable laws.

That’s the HICCUPPS part.  What’s with the (F)?  “F” stands for “Familiar problems”:

Familiarity: The system is not consistent with the pattern of any familiar problem.

That is, we suspect a problem in the item to be tested if we see some consistency with a problem that we’ve seen before.  We perceive “no problem” in the item to be tested when it doesn’t present a familiar problem to us while we’re testing.  I’ve written about an earlier version of this list of oracle heuristics here.
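The “no problem when all heuristics hold; suspect a problem when any one is violated” logic can be sketched in a few lines of code.  This is only an illustration of the shape of the idea — the observation keys below are hypothetical placeholders, not a real oracle implementation:

```python
# A minimal sketch of the HICCUPPS(F) logic: each heuristic is a named
# consistency check, and we suspect a problem when any one is violated.
# The observation keys are hypothetical placeholders for real oracles.

def evaluate_oracles(observations):
    """Return the names of any violated consistency heuristics."""
    heuristics = {
        "History": observations.get("consistent_with_past", True),
        "Image": observations.get("consistent_with_image", True),
        "Comparable Products": observations.get("consistent_with_peers", True),
        "Claims": observations.get("consistent_with_claims", True),
        "Users' Expectations": observations.get("consistent_with_users", True),
        "Product": observations.get("internally_consistent", True),
        "Purpose": observations.get("consistent_with_purpose", True),
        "Statutes": observations.get("consistent_with_law", True),
        # The (F): a problem is suspected if the behavior *matches* a
        # familiar problem, so the check is inverted.
        "Familiarity": not observations.get("matches_familiar_problem", False),
    }
    return [name for name, holds in heuristics.items() if not holds]

# We perceive no problem only when every heuristic holds:
print(evaluate_oracles({}))                                  # []
print(evaluate_oracles({"matches_familiar_problem": True}))  # ['Familiarity']
```

Note that the heuristics are fallible and human-applied in practice; the point of the sketch is only that one violated consistency, of any kind, is enough to raise suspicion.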

The quality criteria for a product are those aspects of it that would tend to please favoured users—customers, or people who benefit from the efficient and accurate work of that customer.  Quality criteria can also be seen as things that would stymie disfavoured users—users that we don’t like, such as intruders, black hat hackers, snoops, denial-of-service enthusiasts, thieves, and so forth.

In the Rapid Software Testing course, we talk about quality criteria in terms of a set of guideword heuristics—labels for groups of ideas that trigger deeper analysis.  Our quality criteria include:

  • Capability
  • Reliability
  • Usability
  • Security
  • Scalability
  • Performance
  • Installability
  • Compatibility
  • Supportability
  • Testability
  • Maintainability
  • Portability
  • Localizability

These criteria are part of the Heuristic Test Strategy Model, first developed by James Bach.

So let’s look at Oliver’s example in terms of the oracles that are being used and the quality criteria that are being questioned here. I’ll start by tagging each test idea with one or more oracle heuristics and one or more quality criteria.

1) Expectations – for example: In what format does it show the time? Is it understandable?

Oracles:  user expectations, (implicit) purpose.  Quality criteria:  usability, localizability.

2) End-values – turnover of seconds/minutes/hours where, for example, 59 -> 00

Oracles:  user expectations, relevant standards.  Quality criteria:  capability, reliability.

3) Load testing – how much does it start to lie in 10 seconds, 1 minute, 1 hour, 1 day, 1 month, 1 year, etc., compared to, let’s say, a quantum clock or NIST-F1?

Oracles:  history, comparable products; familiar problem (clocks gaining or losing time).  Quality criteria:  reliability, performance.

4) What time zone is it showing? That can be tricky; look at India’s time zone, for example.

Oracles:  user expectations; implicit purpose.  Quality criteria:  usability, localizability.

5) How long does the battery last before it shuts down, or before it starts to “lie”? How rapidly does it start to lie when the batteries are running low?

Oracles:  history, user expectations.  Quality criteria:  reliability, performance.

6) How are the digits shown? Are they visible from other angles? Are they too small or too big?

Oracles:  user expectations, implicit purpose.  Quality criteria:  usability, testability.
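One way to see the pattern in the tagging above is to record it as data and count the tags.  Here’s a minimal sketch in Python; the idea names are my own shorthand, and the tags are the (admittedly arbitrary) ones assigned above:

```python
from collections import Counter

# Each of Oliver's test ideas, tagged with oracle heuristics and
# quality criteria as in the text above.
test_ideas = {
    "time format": {"oracles": ["users' expectations", "purpose"],
                    "criteria": ["usability", "localizability"]},
    "end-values": {"oracles": ["users' expectations", "standards"],
                   "criteria": ["capability", "reliability"]},
    "drift vs. reference": {"oracles": ["history", "comparable products", "familiarity"],
                            "criteria": ["reliability", "performance"]},
    "time zone": {"oracles": ["users' expectations", "purpose"],
                  "criteria": ["usability", "localizability"]},
    "battery life": {"oracles": ["history", "users' expectations"],
                     "criteria": ["reliability", "performance"]},
    "digit display": {"oracles": ["users' expectations", "purpose"],
                      "criteria": ["usability", "testability"]},
}

# Which oracle heuristics dominate?  Counting makes the bias visible.
oracle_counts = Counter(o for idea in test_ideas.values() for o in idea["oracles"])
print(oracle_counts.most_common(3))
# → [("users' expectations", 5), ('purpose', 3), ('history', 2)]
```

Counting the tags like this isn’t the point of the exercise; it’s merely one way of making a tendency in our test ideas visible at a glance.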

Now, I’d like you to notice a few things.  First, the classifications that I’ve set here are my own.  They’re arbitrary.  You can agree with them or disagree.  That doesn’t matter so much.

What matters more, I think, is the exercise in which we think about the relationships between the test ideas, the quality criteria, the oracles, and the risks.  For a product of any kind, there’s risk associated with the idea that a relevant quality criterion of some kind will not be fulfilled.  By using the oracle and quality criteria guidewords, we can become conscious of the chain of logic or “framing” of the test, which in turn helps us to compose, edit, narrate, and justify the product story and the testing story.

After we’ve applied oracle and quality-criteria tags to each of Oliver’s test ideas, we might start to notice some things. First, he has used a number of diverse heuristics by which he might recognize a problem.  In doing that, he has also identified tests that would address a number of quality criteria.  He did that quite spontaneously, without specifications or other documentation.  That is, as we’ve emphasized so often, it’s perfectly possible to test with incomplete or insufficient or inconsistent or ambiguous or out-of-date information, because

When information is missing, testing is a great way to generate it.
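Oliver’s drift idea (test idea 3) is a concrete instance of generating information:  even without controlling the clock, comparing its display against a trusted reference over longer and longer intervals tells us something about its behaviour.  Here’s a minimal simulation of that bookkeeping, with an assumed drift rate standing in for the real clock:

```python
# A minimal simulation of test idea (3): compare a drifting clock against
# a trusted reference over longer and longer intervals.  Both clocks are
# simulated here; in a real session the reference might be an
# NTP-synchronized machine, and each "read" a photograph of the display.

def drifting_clock(true_seconds, drift_per_second=0.001):
    """A hypothetical clock that gains one millisecond per true second."""
    return true_seconds * (1 + drift_per_second)

def drift_errors(clock, intervals=(10, 60, 3600, 86400)):
    """Accumulated error, in seconds, shown by `clock` over each interval."""
    return [(t, clock(t) - t) for t in intervals]

for interval, error in drift_errors(drifting_clock):
    print(f"after {interval:>6} s, the clock is off by {error:.3f} s")
```

A drift of a millisecond per second is, of course, absurdly large for any real clock; the sketch only shows how small per-second errors compound into easily observable ones over a day.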

In providing a set of test ideas as he’s done, Oliver also brings to the surface a number of ideas and assumptions about the clock.  Whether those assumptions turn out to be right or wrong isn’t so important.  What’s far more important is getting started in observing similarities and differences between the assumptions and the reality.  The process of doing this is central to generating knowledge about the product.  This is very similar to Karl Weick’s observation, responding to a story in which a platoon of soldiers had a map that didn’t match the territory, but found their way home anyway:

“This raises the intriguing possibility that when you’re lost, any old map will do … maybe when you are confused any old strategic plan will do. Strategic plans are a lot like maps. They animate and orient people. Once people begin to act, they generate tangible outcomes in some context, and this helps them discover what is occurring, what needs to be explained, and what should be done next. Managers keep forgetting that it is what they do, not what they plan, that explains their success. They keep giving credit to the wrong thing—namely, the plan—and having made this error, they then spend more time planning and less time acting. They are astonished when more planning improves nothing.”  (Karl Weick, Sensemaking in Organizations, p. 54-55)

Oliver’s list (implicitly) includes test ideas that take advantage of the user expectations, comparable product, purpose, standards, and familiar problem heuristics.  We can see and justify what’s there by comparing it with the HICCUPPS(F) list, and noting that inconsistency with those items would point us to a problem.  “User expectations” seems to dominate the list of oracle heuristics.  One question we could ask is “how might we refine or expand the set of user expectations that we have?”  Another question is “are our ideas about oracles overloaded in the direction of user expectations?”

We can use the HICCUPPS(F) list to see what’s there, but with the list we can also see what might be missing:  questions about history (is there another clock like this?  is this the first one that we’ve ever seen?); about image (who is our client here?  what are possible perceptions that the client might want to project?); claims (what do people say about this clock, anyway? how is it supposed to work?  is there any useful information, whether documented or not, on this?); product (can we learn anything about the product by observing parts of it that should be consistent with one another? does the product include any internal sanity checks?).

Similarly, we can use the quality criteria list to help us generate ideas based on the things that might threaten the value of the product.  We can see some test ideas based on capability, reliability, usability, performance, and localizability.  What other factors might we choose to consider?  Which ones might be more important in our testing mission?  Less important?  Are there any that are crucial, or irrelevant?

Security:  Are there security concerns related to the clock?  Why is it in this box?  What would happen if someone were to get inside?  Could the functioning of the clock be affected by heat, cold, light, acceleration, bombardment?  What are the boundaries between the clock, its containers, and other systems?

Scalability:  Is this a prototype clock, or are there going to be many like it?  Could it be used for very short-term or long-term measurements of time?  What if large numbers of people need access to the information it provides?

Installability:  How did it get there?  Can it be updated?  How would we get rid of it?

Compatibility:  Does the clock interface with anything else?  How?

Supportability:  What do we do if someone has a problem with the clock?  Can we get at it then?  How?  And if we can get at it then, why not now?

Testability:  You say that there is no way to provide input to the clock.  Really?  Is there some other way that you might be interpreting “input”?  What interfaces might be available?  What reference material?  What oracles?  Does the clock produce any information other than its display?  Are there any markings on it?  Guides to its internals?

Maintainability:  Supposedly I’m testing this because you want to be able to identify problems with it.  Do you want to be able to fix those problems?  Who would be responsible for doing that?  Is there source code, or are there architectural drawings for the program that runs the clock?

Portability:  Does that program work on other clocks?  What information can we learn about this clock that might be transferable to other clocks?

As tools to help us see what’s there and see what’s missing, we can use the HICCUPPS(F) list to evaluate our oracles.  We can use the quality criteria list to evaluate our requirements coverage and make decisions about it.  At some point, we’ll also talk about product elements that point to coverage ideas.  We’ll also talk about the project environment that influences our context and our choices, both of which evolve over time.  But for now, that’s for later.  Thank you to Oliver for providing an excellent example on which, in this space, we could do a little something like transpection.

4 Responses to “Transpection Transpected”

  1. Veretax says:

    Another great blog entry, Michael, and thanks for pointing out James Bach’s “Heuristic Test Strategy Model”.  Only in the last 18 months have I begun to become more than what I call a ‘dumb’ tester, actually trying to be more sapient in how I go about what I do.  Blogs like this one go a long way toward filling gaps in my knowledge of the discipline of testing.  Thank you very much.

    My pleasure. Thank you for the kind words. Keep up the study and the practice!

  2. [...] list is copied from Michael Bolton’s recent blog post, I hope he doesn’t [...]

  3. [...] “Transpection Transpected”, by Michael Bolton [...]

  4. Karlo Smid says:

    Hi Michael!

    Like in the movie Inception, your Rapid Software Testing class, held in Las Vegas at the STP conference last year, subconsciously planted a lot of ideas in my mind about how to become a better tester.  I have been reading your blog since then, and those ideas, one by one, have become a conscious part of my mind.

    Thanks!
