Regression Testing Strategies

There is a lot written about regression testing, and yet there seems to be a lot of confusion about regression testing as well. Just to make sure we are all on the same page, by regression I am referring to the denotation of the word to indicate a relapse to a less perfect or developed state (American Heritage Dictionary). So, the primary objective of regression testing is to determine whether or not modifications or additions in each new build of a product cause previous functionality to regress to a non-functioning state or error condition. It is important to note the purpose of a regression test suite is not necessarily to expose new defects. The primary purpose of a regression test is to identify changes in behavior from a previously established baseline, which is supported by Beizer’s and Myers’ definitions of regression testing.

However, even on small projects the number of tests required to ensure new builds do not regress or change previous functionality can be quite large. So, regression testing demands a strategy in which we limit the number of tests to establish an effective baseline measurement. IEEE 610 defines regression testing as selective retesting. Thus, the key to an effective regression testing strategy is to design a test suite that provides a high degree of confidence without retesting everything. To limit the number of tests in the regression test suite, we must systematically reduce the number of possible tests. So, we must decide which tests to include in a regression test suite.

Deciding what tests to include in the regression test suite

The most effective regression test suites I have seen include two categories of tests. The first category of tests includes high priority tests for commonly expected functionality (e.g. the 20% of the product that 80% of the customers demand or rely on). The second category of tests includes any functional defects that are found and fixed. Found and fixed functional defects are included because fixed defects do occasionally regress, and if a business decision was made previously to fix a defect then we probably want it fixed before we release the product.
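
As a rough sketch of the two categories above, suite membership could be computed from a test catalog along these lines. The RegressionCase fields, the priority scale (1 is highest), and the test and defect names are all hypothetical and are not taken from any particular test management tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RegressionCase:
    name: str
    priority: int                          # assumed scale: 1 = highest priority
    fixed_defect_id: Optional[str] = None  # set when the test guards a fixed defect


def build_regression_suite(catalog: List[RegressionCase],
                           max_priority: int = 1) -> List[RegressionCase]:
    """Keep high-priority tests plus every test that covers a fixed defect."""
    return [
        case for case in catalog
        if case.priority <= max_priority or case.fixed_defect_id is not None
    ]


# Hypothetical catalog: one core scenario, one low-priority test,
# and one low-priority test that exists because a defect was fixed.
catalog = [
    RegressionCase("open_document", priority=1),
    RegressionCase("print_preview_zoom", priority=3),
    RegressionCase("save_as_long_filename", priority=3, fixed_defect_id="BUG-1042"),
]
suite = build_regression_suite(catalog)
print([case.name for case in suite])
```

The low-priority test tied to a fixed defect stays in the suite regardless of its priority, which mirrors the reasoning above: if the defect was worth fixing once, a regression of it is worth catching.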

Prioritized feature area/functionality buckets

The tests in the regression test suite should also be partitioned into functional areas, and each test in each functional partition or bucket should be prioritized based on risk assessment criteria. If the regression test suite is especially large or time is limited, and the suite is partitioned into functional areas (with those areas mapped to the project files or modules that contain that functionality and any dependencies), the regression test pass can execute a limited subset of tests that strategically target the modules that have changed (and the modules that depend on them). Simple directory comparison tools (such as Diff2Dirs) and tools to identify dependencies between modules (such as Depends) are useful for identifying which modules changed between builds and for mapping out dependencies between the modules in each build.
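
One way to picture this targeted selection is the minimal sketch below. It assumes the set of changed modules has already been produced by a build-to-build comparison and that a dependency map is available. The module names, bucket names, tests, and priority values are all made up for illustration.

```python
from collections import deque

# Hypothetical dependency map: module -> modules it depends on.
DEPENDS_ON = {
    "editor.dll": {"textcore.dll"},
    "printing.dll": {"textcore.dll", "spooler.dll"},
    "textcore.dll": set(),
    "spooler.dll": set(),
}

# Hypothetical functional buckets: modules they cover and (priority, test) pairs.
BUCKETS = {
    "editing": {
        "modules": {"editor.dll", "textcore.dll"},
        "tests": [(1, "type_text"), (2, "undo_redo")],
    },
    "printing": {
        "modules": {"printing.dll"},
        "tests": [(1, "print_single_page"), (3, "print_booklet")],
    },
}


def affected_modules(changed, depends_on):
    """Return the changed modules plus every module that depends on them,
    directly or transitively."""
    # Invert the map so we can walk from a module to its dependents.
    dependents = {module: set() for module in depends_on}
    for module, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(module)

    affected = set(changed)
    queue = deque(changed)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected


def select_tests(buckets, affected, max_priority=2):
    """Pick tests from every bucket that touches an affected module,
    keeping only tests at or above the priority cutoff (1 = highest)."""
    selected = []
    for bucket in buckets.values():
        if bucket["modules"] & affected:
            selected.extend(t for t in bucket["tests"] if t[0] <= max_priority)
    return [name for _, name in sorted(selected)]


changed = {"spooler.dll"}  # assumed output of a directory comparison
print(select_tests(BUCKETS, affected_modules(changed, DEPENDS_ON)))
```

Walking the dependency map in reverse is what pulls in tests for dependent modules: a change to a low-level module selects the buckets of everything built on top of it, while a change to a leaf module selects only its own bucket.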

Automate, Automate, Automate

Also, since the regression test suite will ideally be run on each new build, this is one suite of tests that should be 99.999% automated. Similar to the BVT/BAT test suite, the purpose of the regression test suite is not necessarily to expose defects; a regression test suite provides a baseline measurement of functionality. Therefore, since these are tests that will be run several times during the software development lifecycle and are not necessarily designed to expose new defects, the ROI for automation is very high. In fact, any test that cannot be automated is suspect for inclusion in the regression test suite.

These are a few ideas to develop a highly successful automation strategy. What other tactics have you found to be successful?

Comments

  • Anonymous
    January 12, 2007
    >>> So, the primary objective of regression testing is to determine whether or not modifications or additions in each new build of a product cause previous functionality to regress to a non-functioning state or error condition. This objective of regression testing is nearly "IMPOSSIBLE" to achieve to its totality. (I added nearly because in some contexts, owing to business and market conditions, one might claim that they have achieved it.) How do you qualify "non-functioning state" or "error-condition"? What if there are multiple (infinite) ways to define these? What is your test oracle here? The previous version of the application and the corresponding regression suite. The effectiveness of your regression testing thus depends upon how much you know about these non-functioning states and error conditions. Right? Checking whether the new build caused previous functionality to regress to a non-functioning state or error condition is nothing short of saying "Test to make sure that everything that was working previously is working now, too." When people say "By doing regression testing, we make sure that new code has not broken any existing functionality" or any variation of this statement, what they actually mean or do is to make sure that "Tests (as part of their regression suite) that were passing earlier are passing now, too." (I owe this statement of wisdom to Michael Bolton.) Please note that "old tests passing" is way different from "old code working now, too". This is a big GAP I observe when people talk about one thing ("tests passing") and in reality mean the other ("old code working now"). Shrini

  • Anonymous
    January 12, 2007
    Shrini, Please read the post carefully, because I think you are assuming a regression test suite should retest everything; and that is not what is written. Could a regression test suite be large? Absolutely. But, I hope your statement that "regression testing is nearly IMPOSSIBLE" doesn't imply a defeatist attitude and mean that we shouldn't create a smart regression test strategy that establishes a baseline measure of specific expectations as part of our overall testing strategy. In response to your rhetorical questions, IMHO a tester is responsible to understand the product's attributes and capabilities, and thus should know what a non-functioning state or error condition is. (Please refer to my post on the role of testing, if you don't know my position on this.) I view Michael's "wisdom" as simply a play on words. So, just to make it clear, when I discuss regression testing I am not talking about "testing my tests." I design a regression test suite composed of specific tests capable of tactically analyzing possible regressions (or other potential changes) in the product from the pre-established baseline. (Of course, I am assuming you know how to establish baseline measurements.) Hopefully in future replies you will present useful information or at least refute statements with specific and accurate facts rather than simply playing with words, making simple assumptions, or asking rhetorical questions.

  • Anonymous
    January 19, 2007
    The comment has been removed

  • Anonymous
    January 20, 2007
    Yes, it is quite possible that I misinterpreted your rebuttals to my post suggesting possible regression testing strategies because you have not stated your position with definitive clarity or substantiated your position with logic, reason, or fact. The difference between you and I is that I take a position and attempt to put forth clear, unambiguous, and concise arguments to explain or defend my position. (My position may be incorrect in which case I will admit I am wrong or not completely aware of the facts and modify accordingly.) Conversely, vague rhetorical statements such as “regression testing is nearly IMPOSSIBLE to achieve to its totality” or analyzing “to find out the feasibility of fulfilling such objective” are without meaningful substance. It’s sort of like telling a 5 year old child to do something and they reply “No!”, and when you ask why they simply reply, “Because I said so!” Responsibility is the capacity of rational thought or action, the ability to discharge obligations, characterized by good judgment and sound thinking, involves accountability, and yes, is given to individuals who are reliable and dependable. Thus, responsibility is usually given to individuals who are accountable for something within their ability. That is why I stated, a “tester is responsible to understand the product's attributes and capabilities, and thus should know what a non-functioning state or error condition is.” If they don’t have that ability, then I will not rely on them for rational thoughts regarding the product’s attributes or capabilities, or give them the responsibility for making sound judgments regarding the product. As a test manager I was responsible and accountable to my managers to provide them with information that I could defend based on hard data and facts. To accomplish that, I relied on the talented people on my teams whom I trusted and could depend upon. If someone lacked the ability we attempted to train them. 
If a person was incapable of performing necessary obligations they were often reassigned roles that matched their abilities. Also, your statement that “if we knew complete product features and ALL error conditions – there would not be any bugs slipped out of testing” is simply foolish and erroneous. (Also, if you are going to quote someone, quote them correctly.) I have taught a variety of subjects to a variety of people for many years. It is truly a gift to be able to rephrase a statement, or use synonymous terminology, or even apply a metaphor to clearly communicate a complex idea or fact under a variety of circumstances. (For example, it is quite difficult for some people to fully understand JBS Haldane’s concept without understanding Boyle’s and Dalton’s laws, while some people simply accept the theory at face value.) This is quite different than “playing with words” in which a person twists or manipulates the meaning of words to imply some alternate philosophy or definition because they are attempting to disguise their message. Yes, people who play with words are masterful with the language also, but the intent of playing with words is not to inform or educate; it is simply to mislead or to obfuscate the problem with extraneous information.

  • Anonymous
    January 22, 2007
    The comment has been removed

  • Anonymous
    January 23, 2007
    The comment has been removed

  • Anonymous
    January 23, 2007
    The comment has been removed

  • Anonymous
    January 23, 2007

    "regression testing demands a strategy in which we limit the number of tests to establish an effective baseline measurement." Can we consider BVTs a part of regression testing strategy? v-vinaku@microsoft.com

  • Anonymous
    January 24, 2007
    Hi Vinayak, This is a great question! The short answer is I don't consider the BVT as part of the regression test suite. I do consider the BVT as a separate (and most likely the first) baseline measurement established after each new build. But, in my experience the regression test suite (assuming it to be 99.999% automated) is often run (almost) immediately following the BVT, and generally using the same lab machines as used for the BVT (or more if necessary to distribute the load of the regression test suite). I should also add here, tests in the BVT suite were not included in the regression suite (that would simply be redundant). When I designed the BVT test suite for the Windows 95 international versions produced in Redmond I had a constraint of 30 minutes to validate the integrity of 4 language versions of each new weekly build of the operating system. (This was before the single world-wide binary development models often used today, when each language version was recompiled with #ifdef's, meaning there were functional differences in each language version.) At the pinnacle the intl. BVTs were distributed across 12 machines in my office. (I still remember the warmth and constant whir of the fans when curling up to catch some sleep under the bench.) As I developed the BVT suite from mostly manual tests to 99% automated my manager would occasionally ask me the status of the BVT and I would reply with vague, non-committal answers such as "well, it's pretty good, but...blah blah blah." To which he replied I must make a firm decision...no sitting on the fence post...it was my responsibility to make a decision. My manager made it very clear that the purpose of the BVT was to establish 1 of 3 possible outcomes. 1) If the build failed pre-determined critical tests it was rejected and kicked back to development.
Now this was a decision not to be taken lightly, because this meant that not only were the devs going to be working around the clock to get the weekly build out, it probably also meant most of the team would have to work the weekend to make up for lost time. 2) The second outcome could be what we referred to as Release for Test Only. This meant the build had some problems detected by the BVT, but some of the problems could be "worked around" and the build was stable enough to regress fixed defects, and continue testing large areas of the product. Some minor areas may not have been testable, but they were not usually critical areas. For example, in one build the menu links in the Start menu failed to launch the applets (Wordpad, Notepad, etc.) but a work-around to the problem was to launch the applet via the Run dialog or command prompt, or by double clicking the executable, so the build was released to the test team only. 3) The third option was to determine whether or not the build was Released for self-host (this was later called dog-fooding). This typically meant the new build passed the BVT (at least above 90%) and was deemed stable enough for everyone on the team (including managers) to format their main machines and install the latest build to conduct their day to day work. Trust me, when I got this wrong I was in the dog house for the entire week. I should take this and make a new post about BVTs because I have more to add here. But, I hope you get the gist of the BVT and the distinction between the BVT and the regression suite. By the way Vinayak, I like your blog...the last few posts are indeed interesting.

  • Anonymous
    January 29, 2007
    The comment has been removed

  • Anonymous
    January 30, 2007
    The comment has been removed

  • Anonymous
    February 08, 2007
    >The difference between you and I is that I take a position and attempt to put forth clear, unambiguous, and concise arguments to explain or defend my position. The inference that Shrini does not do this strikes me as fairly odious. ---Michael B.

  • Anonymous
    February 08, 2007
    Pardon me; I meant implication, rather than inference. ---Michael B.

  • Anonymous
    February 10, 2007
    Well, everyone is entitled to their own opinion. Personally, I find rhetorical questions such as "How do you qualify 'non-functioning state' or 'error-condition'" or "what if there are multiple (infinite) ways to define these?" and statements without a basis of fact such as "If we knew 'complete product features and ALL error conditions' - there would not be any bugs slipped out of testing" preposterous. But, I think Shrini has some good points to make and encourage him to refute my writings with strong arguments based on factual evidence and specific examples or suggestions.

  • Anonymous
    October 17, 2011
    Compare the tables before and after the change. Any rows/columns that have unexpectedly changed were either an error before the change, or an error after the change, or both. If it was an error before the change, then you are missing a test case. If it was an error after the change, then you need to check if you have a test case. In any case, it is much faster to use table before/after tests to establish regression than trying to build an exhaustive regression test.