How To Get What You Want From Testing (for Managers): The Q & A

November 22nd, 2017

On November 21, 2017, I delivered a webinar as part of the regular Technobility Webinar Series presented by my friend and colleague Peter de Jager. The webinar was called “How To Get What You Want from Testing (for Managers)”, and you can find it here.

Alas, we had only a one-hour slot, and there were plenty of questions afterwards. Each of the questions I received is potentially worthy of a blog post on its own. Here, though, I’ll try to summarize. I’ll give brief replies, suggesting links to where I might have already provided some answers, and offering an IOU for future blog posts if there are further questions or comments.

If the CEO doesn’t appreciate testing and QA enough to support the development of a high-quality application, how can a QA manager fight that battle?

First, in my view, testers should think seriously about whether they’re in the quality assurance business at all. We don’t run the business, we don’t manage the product, and we can’t assure quality. It’s our job to shine light on the product we’ve got, so that people who do run the business can decide whether it’s the product they want.

Second, the CEO is also the CQAO—the Chief Quality Assurance Officer. If there’s a disconnect between what you and the CEO believe to be important, at least one of two things is probably true: the CEO knows or values something you don’t, or you know or value something the CEO doesn’t. The knowledge problem may be resolved through communication—primarily conversation. If that doesn’t solve the disconnect, it’s more likely a question of a different set of values. If you don’t share those, your options are to adopt the CEO’s values, or to accept them, or to find a CEO whose values you do share.

How to change managers to focus on the whole (e.g. lead time) instead of testing metrics (when vendor X is coding and vendor Y is testing)?

As a tester, it’s not my job to change a manager, as such. I’m not the manager’s manager, and I don’t change people. I might try to steer the manager’s focus to some degree, which I can do by pointing to problems and to risks.

Is there a risk associated with focusing on testing metrics instead of on the whole? My answer, based on experience, is Yes; many managers have trouble observing software and software development. If you also believe that there’s risk associated with metrics fixation, have you made those risks clear to the manager? If your answer to that question is “I’ve tried, but I haven’t been successful,” get in touch with me and perhaps I can be of help.

One way I might be able to help right away is to recommend some reading: “Software Engineering Metrics: What Do They Measure and How Do We Know?” (Kaner and Bond); Measuring and Managing Performance in Organizations, by Robert Austin; and Jerry Weinberg’s Quality Software Management series, especially Volume 2, First-Order Measurement (also available as two e-books, “How to Observe Software Systems” and “Responding to Significant Software Events”). Those works outline the problems and suggest some solutions.

An important suggestion related to this is to offer, to agree upon, and to live up to a set of commitments, along with an understanding of what each role involves.

How do you ensure testers put under the microscope the important bugs and ignore the trivial stuff?

My answers here include: training, continuous feedback, and continuous learning; continuous development, review, and refinement of risk models; providing testers with knowledge of and access to stakeholders that matter. Good models of quality criteria, product elements, and test techniques (the Heuristic Test Strategy Model is an example) help a lot, too.

Suggestions for the scalability of regression testing, specifically when developers say “this touches everything.”

When the developer says “this touches everything”, one interpretation is “I’m not sure what this touches” (since “this touches everything” is, at best, rarely true). “I’m not sure what this touches,” in turn, really means “I don’t actually understand what I’m modifying”. That interpretation (like others revealed in the “Testing without Machinery” chapter in Perfect Software and Other Illusions About Testing) points to a Severity 0 product and project risk.

So this is not simply a question about the scalability of regression testing. This is a question about the scalability of architecture, design, development, and programming. These are issues for the whole team—including managers—to address. The management suggestion is to steer things towards getting the developer’s model of the code more clear, refactoring the code and the design until it’s comprehensible.

I’ve given some presentations on regression testing before, and written some blog posts here and here (although, ten years later, we no longer talk about “manual tests” and “automated tests”; we do talk about testing and checking).

How can we avoid a lot of back and forth, e.g. submitting issues piecemeal vs. submitting them in a larger set?

One way would be to practice telling a three-part testing story and delivering the news. That said, I’m not sure I have enough context to help you out. Please feel free to leave a comment or otherwise get in touch.

How much of what you teach can expand to testing in other contexts?

Plenty of it, I’d say. In Rapid Software Testing, we take an expansive view of testing: evaluating a product by learning about it through exploration and experimentation, which includes, to some degree, questioning, study, modeling, observation, inference, and plenty of other stuff. That’s applicable to lots of contexts. We also teach applied critical thinking—that is, thinking about thinking with the intention of avoiding being fooled. That’s extensible to lots of domains.

I’m curious about the other contexts you might have in mind, and why you ask. If I can be of assistance, please let me know.

I agree 100% with everything you say, and I feel daily meetings are not working, because everybody says what they did and what they will do and there is no talk about the state of the product.

That’s a significant test result—another example of the sort of thing that Jerry Weinberg is referring to in the “Testing without Machinery” chapter in Perfect Software and Other Illusions About Testing. I’m not sure what your role is. As a manager, I would mandate reports on the status of the product as part of the daily conversation. As a tester, I would provide that information to the rest of the team, since that’s my job. What I found about the product, what I did, and what happened when I did are part of that three-part testing story.

Perhaps test cases are used because they are quantifiable and help organise the testing of parts of the product.

Sure. Email messages are also quantifiable and they also help to organise the testing of parts of the product. Yet we don’t evaluate a business by the number of email messages it produces. Let’s agree to think all the way through this.

“Writing” is not “factory work” either, but as you say, many, many tools help check spelling, grammar, awkward phrases, etc.

Of course. As I said in a recent blog post, “A good editor uses the spelling checker, while carefully monitoring and systematically distrusting it.” We do not dispute the power and usefulness of tools. We do remain skeptical about what will happen when tools are put in the hands of those who don’t have the skill or wisdom to apply them appropriately.

I appreciate the “testing/checking” distinction, but wish you had applied the labels in the other way, given the rise of “the checklist” as a crucial tool in both piloting and surgery—application of the checklist is NOT algorithmic, when done as recommended.

Indeed. I’m confident, though, that smart people can keep track of the differences between “checking” or “a check” and a “checklist”, just as they can keep track of the differences between “testing” or “a test” and “testimony”.

I would include “breaking” the product (to see if it fails gracefully and to find its limits).

I prefer to say “stressing” the product, and I agree absolutely with the goal to discover its limits, whether it fails gracefully, and what happens when it doesn’t.

“Breaking” is a common metaphor, to be sure. It worries me to some degree because of the public relations issue: “The software was fine until the testers broke it.” “We could ship our wonderful product on time if only the testers would stop breaking it.” “Normal customers wouldn’t have problems with our wonderful product; it’s just that the testers break it.” “There are no systemic management or development problems that have been leading to problems in the product. Nuh-uh. No way. The testers broke it.”

What about problems with the customer/user? Uninformed use or, even more problematical, unexpected use (especially if many users do the unexpected — using Lotus 123 as a word processor, for instance). I would argue you need to “model the human” (including his or her context of machine, attempted use, level of understanding, et al.).

And I would not argue with you. I’d agree, especially with respect to the tendency of users to do surprising things.

I think stories about the product should also include items about what the product did right (or less bad than expected), especially items that were NOT the focus of the design (i.e., how well does Lotus 123 work as a word processor?).

They could certainly include that to some degree. After all, the first items in our list of what to report in the product story are what it is and what it does, presumably successfully. Those things are important, and it’s worth reporting good news when there’s some available. My observation and experience suggest that reporting the good news is less urgent than noting what the product doesn’t do, or doesn’t do well, relative to what people want it to do, or hope that it does. It’s important, to me, that the good news doesn’t displace the important bad news. We’re not in the quality reassurance business.

So far, you appear to be missing much of the user side — what will they know, what context will they be in, what will they be doing with the product? (A friend was a hero for getting over-sized keyboards for football players to deal with all the typing problems that appeared in the software but were the consequence of huge fingers trying to hit normal-sized keytops.)

That’s a great story.

I am really missing any mention of users, especially in an age of sensitivity to disabilities like lack of sight, hearing, manual dexterity, et al. It wasn’t until 11:45 or so that I heard about “use” and “abuse.” Again, all good stuff, but missing a huge part of what I think is the source of risk.

I was mostly talking to project managers and their relationships with testers in this talk (and at that, I went over time). In other talks (and especially when talking to testers), I would underscore the importance of modeling the end user’s desires and actions.

Please in the blog, talk about risk from unexpected use (abuse and unusual uses).

The short blog post that I mentioned above talks about that a bit. Here is a more detailed one. Lately, I’ve been enjoying Gojko Adzic’s book Humans vs. Computers, and can recommend it.

The End of Manual Testing

November 8th, 2017

Testers: when we speak of “manual testing”, we help to damage the craft.

That’s a strong statement, but it comes from years of experience in observing people thinking and speaking carelessly about testing. Damage arises when some people who don’t specialize in testing (and even some who do) become confused by the idea that some testing is “manual” and some testing is “automated”. They don’t realize that software development and the testing within it are design studio work, not factory work. Those people are dazzled by the speed and reliability of automation in manufacturing. Very quickly, they begin to fixate on the idea that testing can be automated. Manual bad; automated good.

Soon thereafter, testers who have strong critical thinking skills and who are good at finding problems that matter have a hard time finding jobs. Testers with only modest programming skills and questionable analytical skills get hired instead, and spend months writing programs that get the machine to press its own buttons. The goal becomes making the automated checks run smoothly, rather than finding problems that matter to people. Difficulties in getting the machine to operate the product take time away from interaction with and observation of the product. As a result, we get products that may or may not be thoroughly checked, but that have problems that diminish or even destroy value.

(Don’t believe me? Here’s an account of testing from the LinkedIn engineering blog, titled “How We Make Our UI Tests Stable”. It’s wonderful that LinkedIn’s UI tests (checks, really) are stable. Has anyone inside LinkedIn noticed that LinkedIn’s user interface is a hot, confusing, frustrating, unusable mess? That LinkedIn Groups have lately become well-nigh impossible to find? That LinkedIn rudely pops up a distracting screen after each time you’ve accepted a new invitation to connect, interrupting your flow, rather than waiting until you’ve finished accepting or rejecting invitations? That these problems dramatically reduce the desire of people to engage with LinkedIn and see the ads on it?)

Listen: there is no “manual testing”; there is testing. There are no “manual testers”; there are testers. Checking—an element of testing, a tactic of testing—can be automated, just as spell checking can be automated. A good editor uses the spelling checker, while carefully monitoring and systematically distrusting it. We do not call spell checking “automated editing”, nor do we speak of “manual editors” and “automated editors”. Editors, just “editors”, use tools.
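
To make the editing analogy concrete, here is a tiny sketch (the dictionary and the sentence are invented for illustration) of why the editor distrusts the tool: a spelling check can pass while the writing is still wrong.

```python
# Every word below is in the dictionary, yet the sentence is wrong.
# A check can pass while the prose (or the product) is broken.
DICTIONARY = {"the", "tests", "were", "run", "form", "from", "a", "script"}

def spell_check(sentence):
    """Return the words the checker would flag as misspelled."""
    return [word for word in sentence.lower().split() if word not in DICTIONARY]

print(spell_check("the tests were run form a script"))  # []: nothing flagged
# The checker is satisfied; a human editor notices "form" should be "from".
```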

All doctors use tools. Some specialists use or design very sophisticated tools. No one refers to those who don’t as “manual doctors”. No one speaks of “manual researchers”, “manual newspaper reporters”, “manual designers”, “manual programmers”, “manual managers”. They all do brain- and human-centred work, and they all use tools.

Here are seven kinds of testers. The developer tests as part of coding the product, and the good ones build testability into the product, too. The technical tester builds tools for herself or for others, uses tools, and in general thinks of her testing in terms of code and technology. The administrative tester focuses on tasks, agreements, communication, and getting the job done. The analytical tester develops models, considers statistics, creates diagrams, uses math, and applies these approaches to guide her exploration of the product. The social tester enlists the aid of other people (including developers) and helps organize them to cover the product with testing. The empathic tester immerses himself in the world of the product and the way people use it. The user expert comes at testing from the outside, typically as a supporting tester aiding responsible testers.

Every tester interacts with the product by various means, perhaps directly and indirectly, maybe at high levels or low levels, possibly naturalistically or artificially. Some testers are, justifiably, very enthusiastic about using tools. Some testers who specialize in applying and developing specialized tools could afford to develop more critical thinking and analytical skill. Correspondingly, some testers who focus on analysis or user experience or domain knowledge seem to be intimidated by technology. It might help everyone if they could become more familiar and more comfortable with tools.

Nonetheless, referring to any of the testing skill sets, mindsets, and approaches as “manual” spectacularly misses the mark, and suggests that we’re confused about the key body part for testing: it’s the brain, rather than the hands. Yet testers commonly refer to “manual testing” without so much as a funny look from anyone. Would a truly professional community play along, or would it do something to stop that?

On top of all this, the “manual tester” trope leads to banal, trivial, clickbait articles about whether “manual testing” has a future. I can tell you: “manual testing” has no future. It doesn’t have a past or a present, either. That’s because there is no manual testing. There is testing.

Instead of focusing on the skills that excellent testing requires, those silly articles provide shallow advice like “learn to program” or “learn Selenium”. (I wonder: are these articles being written by manual writers or automated writers?) Learning to program is a good thing, generally. Learning Selenium might be a good thing too, in context. Thank you for the suggestions. Let’s move on. How about we study how to model and analyze risk? More focus on systems thinking? How about more talk about describing more kinds of coverage than code coverage? What about other clever uses for tools, besides for automated checks?

(Some might reply “Well, wait a second. I use the term ‘manual testing’ in my context, and everybody in my group knows what I mean. I don’t have a problem with saying ‘manual testing’.” If it’s not a problem for you, I’m glad. I’m not addressing you, or your context. Note, however, that your reply is equivalent to saying “it works on my machine.”)

Our most important goal as testers, typically, is to learn about problems that threaten value in our products, so that our clients can deal with those problems effectively. Neither our testing clients nor people who use software divide the problems they experience into “manual bugs” and “automated bugs”. So let’s recognize and admire technical testers, testing toolsmiths and the other specialists in our craft. Let us not dismiss them as “manual testers”. Let’s put an end to “manual testing”.

(At Least) Four Things for Testers To Do in Planning Meetings

October 18th, 2017

There’s much talk these days of DevOps, and Agile development, and “shift left”. Apparently, in these process models, it’s a revelation that testers can do more than test a built product, and that testers can and should be involved at every step of development.

In Rapid Software Testing, that’s not exactly news. From the beginning, we’ve rejected the idea that the product has to be complete, or has to pass some kind of “quality gate” or meet “acceptance criteria” before we start testing. We welcome the opportunity to test anything that anyone is willing to give us. We’ll happily do testing at any time from the moment someone has an idea for a product until long after the product has been released.

When testers are invited to planning meetings, there’s clearly no product to test. So what are we there for?

We’re there to learn. Testing is evaluating a product by learning about it through exploration and experimentation. At the meeting, there is a product to test. Running code is not the only kind of product we can test—not by a long shot. Ideas, designs, documents, drawings, and prototypes are products too. We can explore them, and perform thought experiments on them—and we can learn about them and evaluate them.

At the meeting, we’re there to learn about the product; to learn about the technology; to learn about the contexts in which the product will be used; to learn about plans for building the product. Our role is to become aware of all of the sources of information that might aid in our testing, and in development of the product generally. We’re there to find out about risks that threaten the value of the product in the short and long term, and about problems that might threaten the on-time, successful completion of the product.

We’re there to advocate for testability. Testability might happen by accident, without our help. It’s the role of a responsible tester to make sure that testability happens intentionally, by design. Note that testability is not just about stuff that’s intrinsic to the product. There are factors in the project, in our notions of value, and in our understanding of the risk gap that influence testability. Testability is also subjective with respect to us, our knowledge and skills, and our relationship to the team. So part of our jobs during preparation for development is to ask for the help we’ll need to make ourselves more powerful testers.

We’re there to challenge. Other people are in roles oriented towards building the product. They are focused on synthesis, and envisioning success. If they’re designers, they might be focused on helping the user to accomplish a task, on efficiency, or effectiveness, or on esthetics. If they’re business people, they might be focused on accomplishing some business goal, or meeting a deadline. Developers are often focused more on the details than on the big pictures. All of those people may be anxious to declare and meet a definition of “done”.

The testing role is to think critically about the product and the project; to ask how we might be fooling ourselves. We’re tilted towards asking good questions instead of getting “the right answer”; towards analysis more than synthesis; towards skepticism and suspicion more than optimism; towards anticipating problems more than seeking solutions. We can do those other things, but when we do, we pop for that moment out of a testing role and into a building role.

As testers, we’re trying to notice problems in what people are talking about in the meetings. We’re trying to identify obstacles that might hinder the user’s task; ways in which the product might be ineffective, inefficient, or unappealing. We’re trying to recognize how the business goal might not be met, or how the deadline could be blown. We’re alternating between small details and the big picture. We’re trying to figure out how our definition of done might be inadequate; how we might be fooling ourselves into believing we’re done when we’re not. We’re here to challenge the idea that something is okay when it might not be okay.

We’re there to establish our roles as testers. A role is a heuristic that helps in managing time, focus, and responsibility. The testing role is a commitment to perform valuable and necessary services: to focus on discovering problems, ideally early when they’re small, so that they can be prevented from turning into bigger problems later; to build a product and a project that affords rapid, inexpensive discovery and learning; and to challenge the ideas and artifacts that represent what we think we know about the product and its design. These tasks are socially, psychologically, emotionally, and politically difficult. Unless we handle them gracefully, our questioning, problem-focused role will be seen as merely disruptive, rather than an essential part of the process of building something excellent.

In Rapid Software Testing, we don’t claim that someone must be in the testing role, or must have the job title “tester”. We do believe that having someone responsible for the testing role helps to put focus on the task of providing helpful feedback. This should be a service to the project, not an obstacle. It requires us to maintain close social distance while maintaining a good deal of critical distance.

Of course, the four things that I’ve mentioned here can be done in any development model. They can be done not only in planning meetings, but at any time when we are engaging with others, at any time in the product’s development, at any level of granularity or formality. DevOps and Agile and “shift left” are context. Testing is always testing.

Some related posts:

What Exploratory Testing Is Not (Part 2): After-Everything-Else Testing

Exploratory Testing and Review

Exploratory Testing is All Around You

Testers Don’t Prevent Problems

What Is A Tester?

Testing is…

RST Slack Channel

October 8th, 2017

Over the last few months, we’ve been inviting people from the Rapid Software Testing class to a Slack channel. We’re now opening it up to RST alumni.

If you’ve taken RST in the past, you’re welcome to join. Click here (or email me at slack@developsense.com), let me know where and when you took the class, and with which instructor. I’ll reply with an invitation.

Dev*Ops

October 4th, 2017

A while ago, someone pointed out that Development and Operations should work together in order to fulfill the needs and goals of the business, and lo, the DevOps movement was born. On the face of it, that sounds pretty good… except when I wonder: how screwed up must things have got for that to sound like a radical, novel, innovative idea?

Once or twice, I’ve noticed people referring to DevTestOps, which seemed to be a somewhat desperate rearguard attempt to make sure that Testing doesn’t get forgotten in the quest to fulfill the needs and goals of the business. And today—not for the first time—I noticed a reference to DevSecOps, apparently suggesting that Security is another discipline that should also be working with other groups in order to fulfill the needs and goals of the business.

Wow! This is great! Soon everyone who is employed by a business will be working together to fulfill the needs and goals of the business! Excelsior!

So, in an attempt to advance this ground-breaking, forward-thinking, transformative concept, I hereby announce the founding of a new movement:

DevSecDesTestDocManSupSerFinSalMarHumCEOOps

Expect a number of conferences and LinkedIn groups about it real soon now, along with much discussion about how to shift it left and automate it.

(How about we all decide from the get-go that we’re all working together, collaboratively, using tools appropriately, supporting each other to fulfill the needs and goals of the business? How about we make that our default assumption?)

Deeper Testing (3): Testability

September 29th, 2017

Some testers tell me that they are overwhelmed at the end of “Agile” sprints. That’s a test result—a symptom of unsustainable pace. I’ve written about that in a post called “Testing Problems are Test Results“.

In Rapid Software Testing, we say that testing is evaluating a product by learning about it through exploration and experimentation, which includes to some degree: questioning, study, modeling, observation, inference, and plenty of other stuff—perhaps including the design and programming of automated checks, and analysis of the outcome after they’ve been run. Note that the running of these checks—the stuff that the machine does—does not constitute testing, just as the compiling of the product—also the stuff that the machine does—does not constitute programming.

If you agree with this definition of testing, when someone says “We don’t have enough time for testing,” that literally means “we don’t have enough time to evaluate our product.” In turn, that literally means “we don’t have time to learn about this thing that (presumably) we intend to release.” That sounds risky.

If you believe, as I do, that evaluating a product before you release it is usually a pretty good idea, then it would probably also be a good idea to make testing as fast and as easy as possible. That is the concept of testability.

Most people think of testability quite reasonably in terms of visibility and controllability in the product. Typically, visibility refers to log files, monitoring, and other ways of seeing what the product is up to; controllability usually refers to interfaces that allow for easy manipulation of the product, most often via scriptable application programming interfaces (APIs).
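
As a rough sketch of those two ideas (the endpoint, log path, and field names here are all invented for illustration), a tool-assisted check might drive the product through a scriptable interface, then consult the product’s log to see what it claims to have done:

```python
# Controllability: drive the product via a (hypothetical) REST API.
# Visibility: read the product's (hypothetical) log file afterwards.
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # assumed local test instance

def create_order(items):
    request = urllib.request.Request(
        BASE_URL + "/api/orders",
        data=json.dumps({"items": items}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)

def recent_log_lines(path="/var/log/product/app.log", count=50):
    with open(path) as log:
        return log.readlines()[-count:]

order = create_order(["widget"])
assert order["status"] == "accepted", order
assert not any("ERROR" in line for line in recent_log_lines()), "errors in log"
```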

It’s a good thing to have logging and scriptable interfaces, but testability isn’t entirely a property of the product. Testability is a set of relationships between the product, the team, the tester, and the context in which the product is being developed and maintained. Changes in any one of these can make a product more or less testable. In Rapid Software Testing, we refer to the set of these relationships as practical testability. This breaks down into five subcategories that overlap and interact to some degree.

Epistemic testability. (Yes, it’s a ten-dollar word. Epistemology is the study of how we know what we know. Excellent testing requires us to study epistemology if we want to avoid being fooled about what we know.) As we’re building a product, there’s a risk gap, the difference between what we know and what we need to know. A key purpose of testing is to explore the risk gap, shining light on the things that we don’t know, and identifying places that are beyond the current extent of our knowledge. A product that we don’t know well in a test space that we don’t know much about tends to make testing harder or slower.

Value-related testability. It’s easier to test a product when we know something about how people might use it and how they might intend to get value from it. That means understanding people’s goals and purposes, and how the product is designed to fulfill and support them. It means considering who matters—not just end users or customers, but also anyone who might have a stake in the success of the product. It means learning about dimensions of quality that might be more important or not so important to them.

Intrinsic testability. It’s easier to test a product when it is designed to help us understand its behaviour and its state. When the parts of the product are built cleanly and simply, and tested as we go, testing the assembled parts will be easier. When we have logging and visibility into the workings of the product, and when we have interfaces that allow us to control it using tools, we can induce more variation that helps to shake the bugs out.

Project-related testability. It’s easier to test when the project is organized to support evaluation, exploration, experimentation, and learning. Testing is faster and easier when testers have access to other team members, to information, to tools, and to help.

Subjective testability. The tester is at the centre of the relationships between the product, the project, and the testing mission. Testing will be faster, easier, and better when the tester’s skills—and testing skill on the team—are sufficient to deal with the situation at hand.

Each one of these dimensions of testability fans out into specific ideas for making a product faster and easier to test. You can find a set of ideas and guidewords in a paper called Heuristics of Software Testability.

On an Agile team, a key responsibility for the tester is to ask and advocate for testability, and to highlight things that make testing harder or slower. Testability doesn’t come automatically. Teams and their managers are often unaware of obstacles. Programmers may have created unit checks for the product, which may help to reduce certain kinds of coding and design errors. Still, those checks will tend to be focused on testing functions deep in the code. Testability for other quality criteria—usability, compatibility, performance, or installability, to name only a few—may not get much attention without testers speaking up for them.

A product almost always gets bigger and more complex with every build. Testability helps us to keep the pace of that growth sustainable. A less testable product contributes to an unsustainable pace. Unsustainable pace ratchets up the risk of problems that threaten the value of the product, the project, and the business.

So here’s a message for the tester to keep in front of the team during that sprint planning meeting, during the sprint, and throughout the project:

Let’s remember testability. When testing is harder or slower, bugs have more time and more opportunity to stay hidden. The hidden bugs are harder to find than any bugs we’ve found so far—otherwise we would have found them already. Those bugs—deeper, less obvious, more subtle, more intermittent—may be far worse than any bugs we’ve found so far. Right now, testability is not as good as it could be. Is everyone okay with that?

Deeper Testing (2): Automating the Testing

April 22nd, 2017

Here’s an easy-to-remember little substitution that you can perform when someone suggests “automating the testing”:

“Automate the evaluation
and learning
and exploration
and experimentation
and modeling
and studying of the specs
and observation of the product
and inference-drawing
and questioning
and risk assessment
and prioritization
and coverage analysis
and pattern recognition
and decision making
and design of the test lab
and preparation of the test lab
and sensemaking
and test code development
and tool selection
and recruiting of helpers
and making test notes
and preparing simulations
and bug advocacy
and triage
and relationship building
and analyzing platform dependencies
and product configuration
and application of oracles
and spontaneous playful interaction with the product
and discovery of new information
and preparation of reports for management
and recording of problems
and investigation of problems
and working out puzzling situations
and building the test team
and analyzing competitors
and resolving conflicting information
and benchmarking…”

And you can add things to this list too. Okay, so maybe it’s not so easy to remember. But that’s what it would mean to automate the testing.

Use tools? Absolutely! Tools are hugely important to amplify and extend and accelerate certain tasks within testing. We can talk about using tools in testing in powerful ways for specific purposes, including automated (or “programmed”) checking. Speaking more precisely costs very little, helps us establish our credibility, and affords deeper thinking about testing—and about how we might apply tools thoughtfully to testing work.
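
To keep that vocabulary precise, here is a minimal example of a programmed check, with a toy function standing in for some behaviour of the product. The machine applies the decision rule quickly and reliably; a human chose what to check, and a human must still interpret the result.

```python
# A programmed check: the machine evaluates a human-designed decision rule.
# parse_price is a toy stand-in for a behaviour of the product under test.
def parse_price(text: str) -> float:
    return float(text.replace("$", "").replace(",", ""))

def check_parse_price():
    cases = {"$1,234.50": 1234.50, "$0.99": 0.99, "10": 10.0}
    for text, expected in cases.items():
        actual = parse_price(text)
        assert actual == expected, f"{text!r}: expected {expected}, got {actual}"

check_parse_price()
print("All checks passed; that tells us only that these outputs matched.")
```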

Just like research, design, programming, and management, testing can’t be automated. Trouble arises when we talk about “automated testing”: people who have not yet thought about testing too deeply (particularly naïve managers) might sustain the belief that testing can be automated. So let’s be helpful and careful not to enable that belief.

Deeper Testing (1): Verify and Challenge

March 16th, 2017

What does it mean to do deeper testing? In Rapid Software Testing, James Bach and I say:

Testing is deep to the degree that it has a probability of finding rare, subtle, or hidden problems that matter.

Deep testing requires substantial skill, effort, preparation, time, or tooling, and reliably and comprehensively fulfills its mission.

By contrast, shallow testing does not require much skill, effort, preparation, time, or tooling, and cannot reliably and comprehensively fulfill its mission.

Expressing ourselves precisely is a skill. Choosing and using words more carefully can sharpen the ways we think about things. In the next few posts, I’m going to offer some alternative ways of expressing the ideas we have, or interpreting the assignments we’ve been given. My goal is to provide some quick ways to achieve deeper, more powerful testing.

Many testers tell me that their role is to verify that the application does something specific. When we’re asked to do that, it can be easy to fall asleep. We set things up, we walk through a procedure, we look for a specific output, and we see what we anticipated. Huzzah! The product works!

Yet that’s not exactly testing the product. It can easily slip into something little more than a demonstration—the kinds of things that you see in a product pitch or a TV commercial. The demonstration shows that the product can work, once, in some kind of controlled circumstance. To the degree that it’s testing, it’s pretty shallow testing. The product seems to work; that is, it appears to meet some requirement to some degree.

If you want bugs to survive, don’t look too hard for them! Show that the product can work. Don’t push it! Verify that you can get a correct result from a prescribed procedure. Don’t try to make the product expose its problems.

But if you want to discover the bugs, present a challenge to the product. Give it data at the extremes of what it should be able to handle, just a little beyond, and well beyond. Stress the product out; overfeed it, or starve it of something that it needs. See what happens when you give the product data that it should reject. Make it do things more complex than the “verification” instructions suggest. Configure the product (or misconfigure it) in a variety of ways to learn how it responds. Violate an explicitly stated requirement. Rename or delete a necessary file, and observe whether the system notices. Leave data out of mandatory fields. Repeat things that should only happen once. Start a process and then interrupt it. Imagine how someone might accidentally or maliciously misuse the product, and then act on that idea. While you’re at it, challenge your own models and ideas about the product and about how to test it.
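
Here is a sketch of what a few of those challenges might look like in code (the transfer function and its limit are toy stand-ins, invented for illustration; in real life, the behaviour would belong to the product):

```python
import math
import pytest  # assumes pytest is available to run this sketch

MAX_AMOUNT = 10_000  # stand-in for a documented limit

def transfer(amount):
    """Toy stand-in for the behaviour under test."""
    if not isinstance(amount, (int, float)) or math.isnan(amount):
        return "rejected"
    if amount <= 0 or amount > MAX_AMOUNT:
        return "rejected"
    return "completed"

# Challenges: extremes, just beyond the limit, well beyond it, and junk.
CHALLENGES = [0, 0.01, MAX_AMOUNT, MAX_AMOUNT + 0.01, 10**12, -1,
              float("nan"), "ten"]

@pytest.mark.parametrize("amount", CHALLENGES)
def test_transfer_survives_challenge(amount):
    # The product should either complete sensibly or refuse explicitly;
    # a crash, a hang, or silent acceptance of junk is a problem to report.
    assert transfer(amount) in ("completed", "rejected")
```

Even when every row comes back green, that tells us only that the product is not known to be bad under these particular challenges.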

We can never prove by experiment—by testing—that we’ve got a good product; when the product stands up to the challenge, we can only say that it’s not known to be bad. To test a product in any kind of serious way is to probe the extents and limits of what it can do; to expose it to variation; to perform experiments to show that the product can’t do something—or will do something that we didn’t want it to do. When the product doesn’t meet our challenges, we reveal problems, which is the first step towards getting them fixed.

So whenever you see, or hear, or write, or think “verify”, try replacing it with “challenge”.

The Test Case Is Not The Test

February 16th, 2017

A test case is not a test.

A recipe is not cooking. An itinerary is not a trip. A score is not a musical performance, and a file of PowerPoint slides is not a conference talk.

All of the former things are artifacts; explicit representations. The latter things are human performances.

When the former things are used without tacit knowledge and skill, the performance is unlikely to go well. And with tacit knowledge and skill, the artifacts are not central, and may not be necessary at all.

The test case is not the test. The test is what you think and what you do. The test case may have a role, but you, the tester, are at the centre of your testing.

Further reading: http://www.satisfice.com/blog/archives/1346

Drop the Crutches

January 5th, 2017

This post is adapted from a recent blast of tweets. You may find answers to some of your questions in the links; as usual, questions and comments are welcome.

Update, 2017-01-07: In response to a couple of people asking, here’s how I’m thinking of “test case” for the purposes of this post: Test cases are formally structured, specific, proceduralized, explicit, documented, and largely confirmatory test ideas. And, often, excessively so. My concern here is directly proportional to the degree to which a given test case or a given test strategy emphasizes these things.

I had a fun chat with a client/colleague yesterday. He proposed—and I agreed—that test cases are like crutches. I added that the crutches are regularly foisted on people who weren’t limping to start with. It’s as though before the soccer game begins, we hand all the players a crutch. The crutches then hobble them.

We also agreed that test cases often lead to goal displacement. Instead of a thorough investigation of the product, the goal morphs into “finish the test cases!” Managers are inclined to ask “How’s the testing going?” But they usually don’t mean that. Instead, they almost certainly mean “How’s the product doing?” But, it seems to me, testers often interpret “How’s the testing going?” as “Are you done those test cases?”, which ramps up the goal displacement.

Of course, “How’s the testing going?” is an important part of the three-part testing story, especially if problems in the product or project are preventing us from learning more deeply about the product. But most of the time, that’s probably not the part of the story we want to lead with. In my experience, both as a program manager and as a tester, managers want to know one thing above all:

Are there problems that threaten the on-time, successful completion of the project?

The most successful and respected testers—in my experience—are the ones that answer that question by actively investigating the product and telling the story of what they’ve found. The testers that overfocus on test cases distract themselves AND their teams and managers from that investigation, and from the problems investigation would reveal.

For a tester, there’s nothing wrong with checking quickly to see that the product can do something—but there’s not much right—or interesting—about it either. Checking seems to me to be a reasonably good thing to work into your programming practice; checks can be excellent alerts to unwanted low-level changes. But when you’re testing, showing that the product can work—essentially, demonstration—is different from investigating and experimenting to find out how it does (or doesn’t) work in a variety of circumstances and conditions. Sometimes people object, saying that they have to confirm that the product works and that they don’t have time to investigate. To me, that’s getting things backwards. If you actively, vigorously look for problems and don’t find them, you’ll get that confirmation you crave, as a happy side effect.
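
As a sketch of that programming-practice idea (slugify is a toy stand-in for some low-level behaviour), a few known-answer checks can raise an alert when something changes underneath; deciding whether the change matters is still a human’s job:

```python
import re
import unittest

def slugify(title: str) -> str:
    """Toy stand-in: turn a title into a lowercase, hyphen-separated slug."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

class SlugifyChecks(unittest.TestCase):
    def test_known_answers_still_hold(self):
        # If a refactoring changes either answer, this check raises an alert.
        self.assertEqual(slugify("Drop the Crutches!"), "drop-the-crutches")
        # Note what the check quietly accepts: accented letters get mangled.
        self.assertEqual(slugify("  Déjà vu  "), "d-j-vu")

if __name__ == "__main__":
    unittest.main()
```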

No matter what, you must prepare yourself to realize this:

Nobody can be relied upon to anticipate all of the problems that can beset a non-trivial product.

We call it “development” for a reason. The product and everything around it, including the requirements and the test strategy, do not arrive fully-formed. We continuously refine what we know about the product, and how to test it, and what the requirements really are, and all of those things feed back into each other. Things are revealed to us as we go, not as a cascade of boxes on a process diagram, but more like a fractal.

The idea that we could know entirely what the requirements are before we’ve discussed and decided we’re done seems like total hubris to me. We humans have a poor track record in understanding and expressing exactly what we want. We’re no better at predicting the future. Deciding today what will make us happy ten months—or even days—from now combines both of those weaknesses and multiplies them.

For that reason, it seems to me that any hard or overly specific “Definition of Done” is antithetical to real agility. Let’s embrace unpredictability, learning, and change, and treat “Definition of Done” as a very unreliable heuristic. Better yet, consider a Definition of Not Done Yet: “we’re probably not done until at least These Things are done”. The “at least” part of DoNDY affords the possibility that we may recognize or discover important requirements along the way. And who knows?—we may at any time decide that we’re okay with dropping something from our DoNDY too. Maybe the only thing we can really depend upon is The Unsettling Rule.

Test cases—almost always prepared in advance of an actual test—are highly vulnerable to a constantly shifting landscape. They get old. And they pile up. There usually isn’t a lot of time to revisit them. But there’s typically little need to revisit many of them either. Many test cases lose relevance as the product changes or as it stabilizes.

Many people seem prone to say “We have to run a bunch of old test cases because we don’t know how changes to the code are affecting our product!” If you have lost your capacity to comprehend the product, why believe that you still comprehend those test cases? Why believe that they’re still relevant?

Therefore: just as you (appropriately) remain skeptical about the product, remain skeptical of your test ideas—especially test cases. Since requirements, products, and test ideas are subject to both gradual and explosive change, don’t overformalize or otherwise constrain your testing to stuff that you’ve already anticipated. You WILL learn as you go.

Instead of overfocusing on test cases and worrying about completing them, focus on risk. Ask “How might some person suffer loss, harm, annoyance, or diminished value?” Then learn about the product, the technologies, and the people around it. Map those things out. Don’t feel obliged to be overly or prematurely specific; recognize that your map won’t perfectly match the territory, and that that’s okay—and it might even be a Good Thing. Seek coverage of risks and interesting conditions. Design your test ideas and prepare to test in ways that allow you to embrace change and adapt to it. Explain what you’ve learned.

Do all that, and you’ll find yourself throwing away the crutches that you never needed anyway. You’ll provide a more valuable service to your client and to your team. You and your testing will remain relevant.

Happy New Year.

Further reading:

Testing By Percentages
Very Short Blog Posts (11): Passing Test Cases
A Test is a Performance
“Test Cases Are Not Testing: Toward a Culture of Test Performance” by James Bach & Aaron Hodder (in http://www.testingcircus.com/documents/TestingTrapeze-2014-February.pdf#page=31)