Blog Posts from April, 2021

“Manual Testing”: What’s the Problem?

Tuesday, April 27th, 2021

I used to speak at conferences. For the HUSTEF 2020 conference, I had intended to present a talk called “What’s Wrong with Manual Testing?” In the age of COVID, we’ve all had to turn into movie makers, so instead of delivering a speech, I delivered a video.

After I had proposed the talk, and it was accepted, I went through a lot of reflection on what the big deal really was. People have been talking about “manual testing” and “automated testing” for years. What’s the problem? What’s the point? I mulled this over, and the video contains some explanations of why I think it’s an important issue. I got some people — a talented musician, an important sociologist, a perceptive journalist and systems thinker, a respected editor and poet, and some testers — to help me out.

In the video, I offer some positive alternatives to “manual testing” that are much less ambiguous, more precise, and more descriptive of what people might be talking about: experiential testing (which we could contrast with “instrumented testing”); exploratory testing (which we have already contrasted with “scripted testing”); attended testing (which we could contrast with “unattended testing”); and there are some others. More about all that in a future post.

I also propose an explanation for how important parts of testing — the rich cognitive, intellectual, social process of evaluating a product by learning about it through experiencing, exploring and experimenting — came to be diminished and pushed aside by an obsessive, compulsive fascination with automated checking.

But there’s a much bigger problem that I didn’t discuss in the video.

You see, a few days before I had to deliver the video, I was visiting an online testing forum. I read a question from a test manager who wanted to interview and qualify “manual testers”. I wanted to provide a helpful reply, and as part of that, I asked him what he meant by “manual testing”. (As I do. A lot of people take this as being fussy.)

His reply was that he wanted to identify candidates who don’t use “automated testing” as part of their tool set, but who would be given the job of creating and executing manually scripted, human-language tests, and of applying all the critical thinking skills that both approaches require.

(Never mind the fact that testing can’t be automated. Never mind that scripting a test is not what testing is all about. Never mind that no one even considers the idea of scripting programmers, or management. Never mind all that. Wait for what comes next.)

Then he said that “the position does not pay as much as the positions that primarily target automated test creation and execution, but it does require deeper engagement with product owners”. He went on to say that he didn’t want to get into the debate about “manual and automated testing”; he said that he didn’t like “holy wars”.

And there we have it, ladies and gentlemen; that’s the problem. Money talks. And here, the money—the fact that these testers are going to be paid less—is implicitly suggesting that talking to machines is more valuable, more important, than deeper engagement with people.

The money is further suggesting that skills stereotypically associated with men (who are over-represented in the ranks of programmers) are worth more than skills stereotypically associated with women (who are not only under-represented but also underpaid and also pushed out of the ranks of programmers by chauvinism and technochauvinism). (Notice, by the way, that I said “stereotypically” and not “justifiably”; there’s no justification available for this.)

Of course, money doesn’t really talk. It’s not the money that’s doing the talking.  It’s our society, and people within it, who are saying these things. As so often happens, people are using money to say things they dare not speak out loud.

This isn’t a “holy war” about some abstract, obscure point of religious dogma. This is a class struggle that affects very real people and their very real salaries. It’s a struggle about what we value. It’s a humanist struggle. And the test manager’s statement shows that the struggle is very, very real.

Suggestions for the (New) Testers

Friday, April 23rd, 2021

A friend that I’m just getting to know runs a training and skills development program for new testers. Today he said, “My students are now starting a project which includes test design, test techniques, and execution of testing. Do you have any input or advice for them?” Here’s my reply.

Test design, test techniques, and execution of testing are all good things. I’d prefer performing tests to “test execution”. In that preference, I’m trying to emphasize that a test is a performance, by an engaged person who adapts to what he or she is experiencing. “Test execution” sounds more like following a recipe, or a programmed set of instructions.

Of these things, my advice is to perform testing first. But that advice can be a little confusing to people who believe that testing is only operating some (nearly) finished product in a search for coding errors. In Rapid Software Testing, we take a much more expansive view: testing is the process of evaluating a product by learning about it through experiencing, exploring and experimenting, which includes to some degree questioning, studying, modeling, observation, inference, etc.

Testing includes analysis of the product, its domain, the people using it, and risk related to all of those. Testing includes critical thinking and scientific thinking. Testing includes performing experiments—that is, tests—all the way along. But I emphasized the learning part just back there, because testing starts with learning, ends with reporting what we’ve learned, feeds back into more learning, and is about learning every step of the way.

We learn most powerfully from experiencing, exploring, and experimenting; from performing experiments; from performing tests. So, my advice to the new tester is to start with performing tests to study the product, without focusing too much on test design and test techniques, at first.

Side note: the “product” that you’ve been asked to test may not be a full, working, running piece of software. It may be a feature or component or function that is a part of a product. It may be a document, or a design drawing, a diagram, or even an idea for a product or feature that you’re being asked to review. In the latter cases, “performing a test” might mean the performance of a thought experiment. That’s not the same as the real-world experience of the running product, hence the quotes around “performing a test”. A thought experiment can be a great and useful thing to help nip bugs in the bud, before bugs in an idea turn into bugs in a product. But if we want to determine the real status of the real product, we’ll need to perform real testing on the real product.

So: learn the product (or feature, or design, or document, or idea), and identify how people might get value from it. Survey the product to identify its functions, features, and interfaces. Explore the product, and gain experience with it by engaging in a kind of purposeful play. Don’t look for bugs, particularly—not right away. Look for benefits. Look for how the product is intended to help people get their work done, to help them to communicate with other people, to help them to get something they want or need, to help them to have fun. Try doing things with the product—accomplishing a task, having a conversation, playing the game.

Record your thoughts and ideas and feelings reasonably thoroughly. Pay attention to things that surprise you, or that trigger your interest, or that prompt curiosity. Note things that you find confusing, and notice when the confusion lifts. If you have been learning the product for a while, and that confusion hasn’t gone away, that’s significant; it means there’s something confusing going on. If you get ideas about potential problems (that is, risks), note those. If you get ideas for designing tests, or applying tools, note those too.

Capture what you’re learning in point form, or in mind maps, or in narratives of what you’re doing. Sketches and diagrams can help too. Don’t make your notes too formal; formality tends to be expensive, and it’s premature at this stage. It might be a good idea to test with someone else, with one person focusing on interacting with the product, and the other minding the task of taking notes and observations. Or you might choose to narrate and record your survey of the product on video, to review later on; or to use, like the black boxes on airplanes, to figure out what led to problems or crashes.
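If you like to keep some of those notes in a form that you can sort and filter later, a tiny structure like the one below might help. This is only a sketch of my own, in Python, with invented field names and categories; point form on paper or in a mind map is every bit as legitimate, and often faster.

```python
# A lightweight, sortable structure for session notes.
# The field names and categories are invented for illustration, not a standard.
from dataclasses import dataclass, field
from datetime import datetime
from typing import List

@dataclass
class SessionNote:
    kind: str        # e.g. "surprise", "confusion", "risk", "test idea", "bug"
    note: str        # a quick, informal description in your own words
    timestamp: datetime = field(default_factory=datetime.now)

notes: List[SessionNote] = []
notes.append(SessionNote("surprise", "import succeeds even when the file is empty"))
notes.append(SessionNote("risk", "what happens with very large attachments?"))
notes.append(SessionNote("test idea", "paste text containing emoji into the search field"))

# Later, it's easy to pull out just the risks or test ideas for refinement.
risks = [n for n in notes if n.kind == "risk"]
```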

You’ll probably see some bugs right away. If you do, note them quickly, but don’t investigate them. If you spotted a bug this easily, this early, and you take a quick note about it, you’ll almost certainly be able to see the bug again later. Investigating shallow bugs is not the job at the moment. The job right now is to develop your mental model of the product, so that you become prepared to find bugs that are more subtle, more deeply hidden, and potentially much more important or damaging.

Identify the people who might use the product… and then consider other groups of people you might have forgotten. That would include novice users of the product; expert users of the product; experts in the product domain who are novice users of the product; impatient users; plodding users; users under pressure; disabled users… Consider the product in terms of things that people value: capability, reliability, usability, charisma, security, scalability, compatibility, performance, installability… (As a new tester, or a tester in training, you might know these as quality criteria.)

You might also want to survey the product from the perspective of people who are not users as such, but who are definitely affected by the product: customer support people; infrastructure and operations people; other testers (like testing toolsmiths, or accessibility specialists); future testers; current developers; future developers… Think in terms of what they might value from the product: supportability, testability, maintainability, portability, localizability… (These are quality criteria too, but they’re focused more on the needs of people inside the organization than on direct benefit to the end user.)

Refine your notes. Create lists, mind maps, tables, sketches, diagrams, flowcharts, stories… whatever helps you to reflect on your experience.

Share your findings with other people in the test or development (or in this case, study) group. That’s very important. It’s a really good way to share knowledge, to de-bias ourselves, and to reveal things that we might have forgotten, ignored, or dismissed too quickly.

Have these questions in mind as you go: What is this that we’re building? Who are we building it for? How would they get value from it? As time goes by, you’ll start to raise other questions: What could go wrong? How would we know? How might people’s value be threatened or compromised? How could we test this? How should we test this? Then you’ll be ready to make better choices about test design, and applying test techniques.

Of course, this isn’t just advice for the new tester. It applies to anyone who wants to do serious testing. Testing that starts by reading a document and leaps immediately to creating formal, procedurally scripted test cases will almost certainly be weak testing, uninformed by knowledge of the product and how people will engage with it. Testing that starts with being handed some API documentation and leaps to the creation of automated checks for correct results will miss lots of problems that programmers will encounter—problems that we could discover if we tried to experience the API the way programmers—especially outside programmers—will.

As we’re developing the product, we’re learning about it. As we’re learning the product, we’re developing ideas about what it is, what it does, how people might use it, and how they might get value from it, and that learning feeds back into more development. As we develop our understanding of the product more deeply, we can be much better prepared to consider how people might try to use it unsuccessfully, how they might misuse it, and how their value might be threatened. That’s why it’s important, I believe, to perform testing first—to prepare ourselves better for test design and for identifying and applying test techniques—so we can find better bugs.

This post has been greatly influenced by ideas on sympathetic testing that came to me—over a couple of decades—from Jon Bach, James Bach, and Cem Kaner.

Evaluating Test Cases, Checks, and Tools

Sunday, April 11th, 2021

For testers who are being asked to focus on test cases and testing tools, remember this: a test case never finds a bug. The tester finds a bug, and the test case may play a role in finding the bug. (Credit to Pradeep Soundararajan for putting this so succinctly, all those years ago.)

Similarly, an automated check never finds a bug. The tester finds a bug, and the check may play a role in finding the bug.

A testing tool never finds a bug. The tester finds a bug, and the tool may play a role in finding the bug.

If you suspect that managers are putting too much emphasis on test cases, or automated checks, or testing tools—artifacts—try this:

Start a list.

Whenever you find a bug, make a quick note about the bug and how you found it. Next to that, put a score on the value of the artifact. Write another quick note to describe and explain why you gave the artifact a particular score.

Score 3 when you notice that an artifact was essential in finding the bug; there’s no way you could have found the bug without the artifact.

Score 2 if the artifact was significant in finding the bug; you could have found the bug, but the artifact was reasonably helpful.

Score 1 if the artifact helped, but not very much.

Score 0 if the artifact played no role either way.

Score -1 whenever you notice the artifact costing you some small amount of time, or distracting you somewhat.

Score -2 whenever you notice the artifact costing you significant time, or disrupting you from the task of finding problems that matter.

Score -3 whenever you notice that the artifact is actively preventing you from finding problems—when your attention has been completely diverted from the product, learning about it, and discovering possible problems in it, and has been directed towards the care and feeding of the artifact.

Notice that you don’t need to find a bug to offer a score. Pause your work periodically to evaluate your status and take a note. If you haven’t found a bug in the last little while, note that. In any case, every now and then, identify how long you’ve been on a particular thread of investigation using a test case, or a set of checks, or a tool. Evaluate your interaction with the artifact.

Periodically review the list with your manager and your team. The current total score might be interesting; if it’s high, that might suggest that your tools or test cases or other artifacts are helping you. If it’s low or negative, that might suggest that the tools or test cases or other artifacts are getting in your way.
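If it helps to see what such a list might look like, here is a minimal sketch in Python; the structure, the example entries, and the scores are all invented for illustration, and a notebook, spreadsheet, or plain text file would do the job just as well.

```python
# A minimal, illustrative log of artifact scores; the names and entries here
# are invented for the example, not part of any tool or standard.
from dataclasses import dataclass
from typing import List

@dataclass
class ArtifactScore:
    artifact: str   # the test case, check, or tool in question
    score: int      # -3 (actively preventing discovery) to +3 (essential to it)
    note: str       # what happened, and why you gave this score

log: List[ArtifactScore] = [
    ArtifactScore("login smoke check", +2,
                  "flagged the redirect bug quickly; I investigated it by hand"),
    ArtifactScore("regression suite", -2,
                  "spent the morning repairing brittle locators instead of testing"),
    ArtifactScore("data generator tool", 0,
                  "ran fine, but played no role in anything I found today"),
]

# The total might be mildly interesting at review time...
print("total:", sum(entry.score for entry in log))

# ...but the conversation about each entry matters far more than the number.
for entry in log:
    print(f"{entry.score:+d}  {entry.artifact}: {entry.note}")
```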

Don’t take too long on the aggregate score; practically no time at all. It’s far more important to go through the list in detail. The more extreme numbers might be the most interesting. You might want to pay the greatest or earliest attention to the items with the lowest and highest scores, but maybe not. You might prefer to go through the list in order.

In any case, as soon as you begin your review of a particular item, throw away the score, because the score doesn’t really mean anything. It’s arbitrary. You could call it data, but it’s probably not valid data, and it’s almost certainly not reliable data. If people start using the data to control the decisions, eventually the data will be used to control you. Throw the score away.

What matters is your experience, and what you and the rest of the team can learn from it. Turn your attention to your notes and your experience. Then start having a real conversation with your manager and team about the bug, about the artifact or tool, and about your testing. If the artifact was helpful, identify how it helped, how it might help next time, and how it could fool you if you became over-reliant on it. If the artifact wasn’t helpful, consider how it interfered with your testing, how you might improve or adjust it, or whether you should put it to bed for a while or throw it away.

Learn from every discovery. Learn from every bug.

Related reading:

Assess Quality, Don’t Measure It