Blog Posts for the ‘Regression’ Category

Very Short Blog Posts (17): Regression Obsession

Thursday, April 24th, 2014

Regression testing is focused on the risk that something that used to work in some way no longer works that way. A lot of organizations (Agile ones in particular) seem fascinated by regression testing (or checking) above all other testing activities. It’s a good idea to check for the risk of regression, but it’s also a good idea to test for it. Moreover, it’s a good idea to make sure that, in your testing strategy, a focus on regression problems doesn’t overwhelm a search for problems generally—problems rooted in the innumerable risks that may beset products and projects—that may remain undetected by the current suite of regression checks.

One thing is for sure: if your regression checks are detecting a large number of regression problems, there’s likely a significant risk of other problems that those checks aren’t detecting. In that case, a tester’s first responsibility may not be to report any particular problem, but to report a much bigger one: regression-friendly environments ratchet up not only product risk but also project risk, by giving bugs more time and more opportunity to hide. Lots of regression problems suggest that a project is not currently maintaining a sustainable pace.

And after all, if a bug clobbers your customer’s data, is the customer’s first question “Is that a regression bug, or is that a new bug?” And if the answer is “That wasn’t a regression; that was a new bug,” do you expect the customer to feel any better?

Related material:

Regression Testing (a presentation from STAR East 2013)
Questions from Listeners (2a): Handling Regression Testing
Testing Problems Are Test Results
You’ve Got Issues

Questions from Listeners (2a): Handling Regression Testing

Saturday, August 7th, 2010

This is a followup to an earlier post, Questions from Listeners (2): Is Unit Testing Automated? The original question was

Unit testing is automated. When functional, integration, and system test cannot be automated, how to handle regression testing without exploding the manual test with each iteration?

Now I’ll deal with the second part of the question.

Part One: What Do We Really Mean By “Automation”?

Some people believe that “automation” means “getting the computer to do the testing”. Yet computers don’t do testing any more than compilers do programming, cruise control does driving, or blenders do cooking. In Rapid Software Testing, James Bach and I teach that test automation is any use of tools to support testing.

When we perform tests on a running program, there’s always a computer involved, so automation is always around to some degree. We can use tools to help us configure the program, to help us observe some aspect of the program as it’s running, to generate data, to supply input to the program, to monitor outputs, to parse log files, to provide an oracle against which outputs can be compared, to aggregate and visualize results, to reconfigure the program or the system, and so on. In that sense, all tests can be automated.
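As a concrete illustration of tool-supported testing, here is a minimal sketch in Python of one item from that list: parsing a log file so that a human tester can decide what the flagged lines mean. The patterns and the command-line usage are hypothetical stand-ins, not a recommendation:

```python
import re
import sys

# Patterns that might make a line worth a tester's attention.
# These are illustrative; a real project would tune its own list.
SUSPICIOUS = re.compile(r"ERROR|WARN|Traceback|timed out", re.IGNORECASE)

def scan_log(path):
    """Yield (line_number, line) pairs that look worth investigating."""
    with open(path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if SUSPICIOUS.search(line):
                yield number, line.rstrip()

if __name__ == "__main__":
    for number, line in scan_log(sys.argv[1]):
        print(f"{number}: {line}")
```

The tool extends the tester’s observation; it doesn’t replace it. Deciding whether a flagged line matters is still a human judgement.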

Some people believe that tests can be automated. I disagree. Checks can be automated. Checks are a part of an overall program of testing, and can aid it, but testing itself can’t be automated. Testing requires human judgement to determine what will be observed and how it will be observed; testing requires human judgement to ascribe meaning to the test result. Human judgement is needed to ascribe significance to the meaning(s) that we ascribe, and human judgement is required to formulate a response to the information we’ve revealed with the test. Is there a problem with the product under test? The test itself? The logical relationship between the test and the product? Is the test relevant or not? Machines can’t answer those questions. In that sense, no test can be automated.
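To make the distinction concrete, here is a minimal sketch of a check in Python, runnable with pytest. The sales_tax function, its rate, and the expected value are hypothetical:

```python
# A check is a machine-decidable comparison of an output against an
# expected result. The function under test is a hypothetical stand-in.
def sales_tax(amount, rate=0.13):
    return round(amount * rate, 2)

def test_sales_tax_on_round_amount():
    # The machine can confirm agreement with the expected value...
    assert sales_tax(100.00) == 13.00
    # ...but only a human can judge whether two-decimal rounding is the
    # right behaviour, whether 13% is the right rate for the customer's
    # jurisdiction, and whether a failure here would matter.
```

The assertion is the automatable part. Everything around it, choosing what to check, interpreting a failure, deciding what to do next, is testing.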

Automation is a medium. That is, it’s an extension of some human capability, not a replacement for it. If we test well, automation can extend that. If we’re testing badly, then automation can help us to test badly at an accelerated rate.

Part Two: Why Re-run Every Test?

My car is 25 years old. Aside from some soon-to-be-addressed rust and threadbare upholstery, it’s in very good shape. Why? One big reason is that my mechanic and I are constantly testing it and fixing important problems.

When I’m about to set out on a long journey in my car, I take it in to Geoffrey, my mechanic. He performs a bunch of tests on the car. Some of those tests are forms of review: he checks his memory and looks over the service log to see which tests he should run. He addresses anything that I’ve identified as being problematic or suspicious. Some of Geoffrey’s tests are system-level tests, performed by direct observation: he listens to the engine in the shop and takes the car out for a spin on city streets and on the highway. Some of his tests are functional tests: he applies the brakes to check whether they lose pressure. Some of his tests are unit tests, assisted by automation: he uses a machine to balance the tires and a gauge to pressure-test the cooling system. Some of his smoke tests are refined by tools: a look at the tires is refined by a pressure gauge; when he sees wear on the tires, he uses a gauge to measure the depth of the tread. Some of his tests are heavily assisted by automation: he has a computer that hooks up to a port on the car, and the computer runs checks that give him gobs of data that would be difficult or impossible for him to obtain otherwise.

When I set out on a medium-length trip, I don’t take the car in, but I still test for certain things. I walk around the car, checking the brake lights and turn signals. I look underneath for evidence of fluid leaks. I fill the car with gas, and while I’m at the gas station, I lift the hood and check the oil and the windshield wiper fluid. For still shorter trips, I do less. I get in, turn on the ignition, and look at the fuel gauge and the rest of the dashboard. I listen to the sound of the engine. I sniff the air for weird smells–gasoline, coolant, burning rubber.

As I’m driving, I’m making observations all the time. Some of those observations happen below the level of my consciousness, only coming to my attention when I’m surprised by something out of the ordinary, like a bad smell or a strange sound. On the road, I’m looking out the window, glancing at the dashboard, listening to the engine, feeling the feedback from the pedals and the steering wheel. If I identify something as a problem, I might ignore it until my next scheduled visit to the mechanic, I might leave it for a little while but still take it in earlier than usual, or I might take the car in right away.

When Geoffrey has done some work, he tells me what he has done, so to some degree I know what he’s tested. I also know that he might have forgotten something in the fix, and that he might not have tested completely, so after the car has been in the shop, I need to be more alert to potential problems, especially those closely related to the fix.

Notice two things: 1) Both Geoffrey and I are testing all the time. 2) Neither Geoffrey nor I repeat all of the tests that we’ve done on every trip, nor on every visit.

When I’m driving, I know that the problems I’m going to encounter as I drive are not restricted to problems with my car. Some problems might have to do with others—pedestrians or animals stepping out in front of me, or other drivers making turns in my path, encroaching on my lane, tailgating. So I must remain attentive, aware of what other people are doing around me. Some problems might have to do with me. I might behave impatiently or incompetently. So it’s important for me to keep track of my mental state, managing my attention and my intention. Some problems have to do with context. I might have to deal with bad weather or driving conditions. On a bright sunny day, I’ll be more concerned about the dangers of reflected glare than about wet roads. If I’ve just filled the tank, I don’t have to think about fuel for another couple hundred miles at least. Because conditions around me change all the time, I might repeat certain patterns of observation and control actions, but I’m not going to repeat every test I’ve ever performed.

Yes, I recognize that software is different. If software were a car, programmers would constantly be adding new parts to the vehicle and refining the parts that are there. On a car, we don’t add new parts very often; more typically, old parts wear out and get replaced. Either way, change is happening. After a change, we concentrate our observation and testing on the things that are most likely to be affected by the change, and on the things that are most important. In software, we do exactly the same thing. But in software, we can take an extra step to reduce risk: low-level, automated unit tests that provide change detection and rapid feedback, and that form the first level of defense against accidental breakage. I wrote about that here.
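As a rough sketch of what that first level of defense can look like, consider a change-detecting unit test. The normalize_phone function and the behaviours it pins down are hypothetical examples:

```python
import unittest

def normalize_phone(raw):
    """Reduce a North American phone number to its last ten digits."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    return digits[-10:] if len(digits) >= 10 else digits

class NormalizePhoneChangeDetector(unittest.TestCase):
    # These checks pin down current behaviour. If a later change to
    # normalize_phone alters either result, the suite says so in seconds.
    def test_strips_punctuation(self):
        self.assertEqual(normalize_phone("(416) 555-0199"), "4165550199")

    def test_drops_country_code(self):
        self.assertEqual(normalize_phone("+1 416 555 0199"), "4165550199")

if __name__ == "__main__":
    unittest.main()
```

Cheap to write, fast to run, and tied to the code most likely to change: that’s where automated change detection pays off.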

Part Three: Think About Cost, Value, Risk, and Coverage

Testing involves interplay between cost, value, and risk. The risk is generally associated with the unknown—problems that you’re not aware of, and the unknown consequences of those problems. The value is in the information you obtain from performing the test, and in the capacity to make better-informed decisions. There are lots of costs associated with tests. Automation reduces many of those costs (like execution time) and increases others (like development and maintenance time). Every testing activity, irrespective of the level of automation, introduces opportunity costs against potentially more valuable activities. A heavy focus on running tests that we’ve run before—and which have not been finding problems—represents an opportunity cost against tests that we’ve never run, and against problems that our repeated tests won’t find. A focus on the care and feeding of repeated tests diminishes our responsiveness to new risk. A focus on repetition limits our test coverage.

Some people object to the idea of relaxing attention on regression tests, because their regression tests find so many problems. Oddly, these people are often the same people who trot out the old bromide that bugs that are found earlier are less expensive to fix. To those people, I would say this: If your regression tests consistently find problems, you’ll probably want to fix most of them. But there’s another, far more important problem that you’ll want to fix: someone has created an environment that’s favourable to backsliding.

Regression Testing, part I

Wednesday, September 6th, 2006

More traffic from the Agile Testing mailing list; Grig Gheorghiu is a programmer in Los Angeles who has some thoughtful observations and questions.

I’m well aware of the flame wars that are going on between the ‘automate everything’ camp and the ‘rapid testing’ camp. I was hoping you can give some practical, concrete, non-sweeping-generalization-based examples of how your testing strategy looks like for a medium to large project that needs to ensure that every release goes out with no regressions. I agree that not all tests can be automated. For those tests that are not automated, my experience is that you need a lot of manual testers to make sure things don’t slip through the cracks.

That sounds like a sweeping generalization too. 🙂

I can’t provide you with a strategy that ensures that every release goes out with no regressions. Neither can anyone else.

Manual tests are typically slower to run than automated tests. However, they take almost no development time, and they have diversity and high cognitive value, so they tend to stand a better chance of revealing new bugs. Manual tests can also reveal regression bugs, especially when they’re targeted for that purpose. Human testers are able to change their tests and their strategies based on observations and choices, and they can do it in an instant.

Automated tests can’t do that. At the end-user, system, and integration level, they tend to have a high development cost. At the unit level, the development cost is typically much lower. Unit tests (and even higher-level tests) are typically super-fast to run, and they quickly show whether everything that used to pass still passes.

When the discussion about manual tests vs. automated tests gets pathological, it’s because some people seem to miss the point that testing is about making choices. We have an infinite number of tests that we could run. Whatever subset of those tests we choose, and no matter how many we think we’re running, we’re still running a divided-by-infinity fraction of all the possibilities. That means that we’re rejecting huge numbers of tests with every test cycle. One of the points of the rapid testing philosophy is that by making intelligent choices about risks and coverage, we change our approach from compulsively rejecting huge numbers of tests to consciously rejecting huge numbers of tests.

So the overall strategy for regression testing is to make choices that address the regression risk. All that any strategy can do–automated, manual, or some combination of the two–is to provide you with some confidence that there has been minimal regression (where minimal might be equal to zero, but no form of testing can prove that).

Other factors can contribute to that confidence. Those factors could include small changes to the code; well-understood changes, based on excellent investigation of the original bug; smart, thoughtful developers; strolls through the code with the debugger; unit testing at the developer level; automated unit testing; paired developers; community ownership of the code; developers experienced with the code base; readable, maintainable, modular code; technical review; manual and automated developer testing of the fix at some higher level than the unit level; good configuration management. People called “testers” may or may not be involved in any of these activities, and I’m probably missing a few. All of these are filters against regression problems, and many of them are testing activities to some degree. Automated tests of some description might be part of the picture too.

When it comes to system-level regression testing, here are the specific parts of one kind of strategy. As soon as some code, any code, is available, rapid testers will test the change immediately after the repair is done. In addition, they’ll do significant manual testing around the fix. This need not require a whole bunch of testers, but investigative skills are important. Generally speaking, a fix for a well-investigated and well-reproduced bug can be tested pretty quickly, and feedback about success or failure of the fix can also be provided rapidly. We could automate tests at this higher level, but in my experience, at this point it is typically far more valuable to take advantage of the human tester’s cognitive skills to make sure that nothing else is broken. (Naturally, if a tool extends our capabilities in some form, we’ll use it if the cost is low and the value high.)

In an ideal scenario, it’s the unit tests that help to ensure that this particular problem doesn’t happen again. But even in a non-ideal scenario, things tend to be okay. Other filters heuristically capture the regression problems–and when regressions make it as far as human testers, thoughtful people will usually recognize that it’s a serious problem. If that happens, developers and development management tend to make changes farther up the line.

It was ever thus. In 1999, at the first Los Altos Workshop on Software Testing, a non-scientific survey of a roundtable of very experienced testers indicated that regression bugs represented about 15% of the total bugs found over the course of the project, and that the majority of these were found in development or on the first run of the automated tests. Cem Kaner, in a 1996 class that I was attending, mentioned that an empirical survey had noted a regression rate of about 6%. Both he and I are highly skeptical about empirical studies of software engineering, but those results could be translated, with a suitable margin of error, into “doesn’t happen very much”.

With agile methods–including lots of tests at the level of functional units–that turns into “hardly happens at all”, especially since the protocol is to write a new test, at the lowest level possible, for every previous failure determined by other means. This extra check is both necessary and welcome in agile models, because the explicit intention is to move at high velocity. Agile models mitigate the risk of high velocity, I perceive, by putting in low-level automated unit tests, where it’s cheap to automate–low cost and high value. Automated regression at the system level is relatively high cost and low value–that is, it’s expensive, if all we’re addressing is the risk of developers making silly mistakes.
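Here is a sketch of that protocol in Python; the bug number, the cart_discount function, and the failure it describes are all hypothetical:

```python
# Suppose a (hypothetical) bug report, #1234, showed that an empty cart
# was mishandled by the discount calculation. After the fix, a new check
# at the lowest possible level makes the same slip visible on the next run.
def cart_discount(prices, rate):
    if not prices:          # the fix: an empty cart earns no discount
        return 0.0
    return sum(prices) * rate

def test_bug_1234_empty_cart_discount_is_zero():
    assert cart_discount([], 0.10) == 0.0

def test_discount_on_nonempty_cart_still_works():
    assert abs(cart_discount([10.0, 20.0], 0.10) - 3.0) < 1e-9
```

The new check costs a minute to write and runs with every build, which is part of what lets a team move at high velocity without re-running every system-level test by hand.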

I’ll have more to say about some specific circumstances in a future post. Meanwhile, I hope this helps.