Combinatorial testing is the use of tests that cover *t*-way combinations of parameter values, up to some specified criterion of coverage. For example, if we have three boolean parameters, *P*_{1}, *P*_{2}, and *P*_{3}, then 2-way coverage is achieved if we cover all four combinations of values (00, 01, 10, 11) for every pair of these parameters. There are three pairs in this example: (*P*_{1}, *P*_{2}), (*P*_{1}, *P*_{3}), and (*P*_{2}, *P*_{3}). A structure called a *covering array* can compress all *t*-way combinations of values into a remarkably small set of tests. For example, there are 1,329,336,000 3-way combinations of 1,000 boolean variables. The ACTS tool generates covering arrays, and for this example produces a test set of only 71 tests that covers all 1,329,336,000 of these combinations.
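The combination count above is easy to verify directly; a quick sketch using only the Python standard library:

```python
from math import comb

# Each 3-way combination chooses 3 of the 1,000 boolean variables,
# and each chosen triple can take 2^3 = 8 value settings.
n_settings = comb(1000, 3) * 2**3
print(n_settings)  # 1329336000
```

The covering array achieves such compression because each of its 71 tests assigns values to all 1,000 variables at once, so a single test covers many 3-way settings simultaneously.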

The key insight underlying combinatorial testing's effectiveness is that most software bugs and failures are caused by one or two parameters, with progressively fewer caused by three or more. Failures involving two or more parameters are *interaction faults*: they are only revealed when multiple conditions are true simultaneously. For example, a 2-way interaction fault could be "altitude = 0 AND volume < 2.2", so testing all 2-way combinations of parameter values could detect this problem. __But it is not enough to test all pairs of values, because many failures are only revealed when more than two conditions are true.__ The distribution of failures by number of interacting factors (X axis) is shown below for a range of applications.
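To make the pairwise idea concrete, here is a small illustrative sketch (the parameters and the 4-test set are my own example, not output from a specific tool): with three boolean parameters, just four well-chosen tests cover every pair setting, so any fault triggered by a particular setting of any two parameters must be triggered by some test.

```python
from itertools import combinations, product

# A 2-way covering array for three boolean parameters (p1, p2, p3):
# only 4 tests, yet every pair of parameters takes all 4 value settings.
tests = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]

# Check the covering property: for every pair of parameter positions,
# all four settings 00, 01, 10, 11 appear somewhere in the test set.
for i, j in combinations(range(3), 2):
    settings = {(t[i], t[j]) for t in tests}
    assert settings == set(product((0, 1), repeat=2))

# Consequence: a hypothetical 2-way interaction fault, say one that
# occurs only when p1 == 1 and p3 == 0, is guaranteed to be triggered.
assert any(t[0] == 1 and t[2] == 0 for t in tests)
```

Exhaustive testing of three booleans would take 8 tests; the covering array needs only 4, and the gap widens dramatically as the number of parameters grows.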

As can be seen in the graph, most failures were caused by one or two parameters, with progressively fewer by three or more. This finding, referred to as the *interaction rule*, has important implications for software testing because

- it is nearly always impossible to do exhaustive testing, but
- __we don't have to test all possible combinations of inputs__; we only have to test all of the combinations that trigger faults.

__We can't do exhaustive testing, but the interaction rule says we don't have to__; we can still provide very strong assurance by testing all 4-way to 6-way combinations. Multiple studies have found that 4-way to 6-way combination coverage detected all faults found with exhaustive testing, so this type of testing can reasonably be called "effectively exhaustive".

This depends on the level of assurance required. As can be seen in the graph above, most failures are triggered by a single factor or the interaction of two factors, with progressively fewer by three or more factors. We have not seen more than six factors involved in a failure, so 7-way or higher faults appear to be extremely rare. Note also:

- Variability decreases rapidly with interaction strength, from more than 50 percentage points difference between high and low values for 1-way and 2-way faults, to less than 25 points for 4-way, to less than 5 points difference for 5-way faults. In other words, testing effectiveness may be more predictable with higher strength tests (4-way or higher).
- Easily found faults decline with use and testing. As might be expected, the more software is used and tested, the harder it becomes to find remaining errors. We refer to the distribution of faults by interaction strength as the *fault profile*. The best fault profiles in the graph below are for the Apache server and NeoKylin, a Linux variant, probably because both have hundreds of millions of users. The poorest fault profiles are the DBMS, which was from initial testing, and FDA medical device recalls, for devices that may have only a few thousand users. We have developed a mathematical model that is closely consistent with the empirical data (see Kuhn, D.R., Kacker, R.N. and Lei, Y., "A Model for T-Way Fault Profile Evolution during Testing," *2017 IEEE International Conference on Software Testing, Verification and Validation Workshops*, March 2017).

As with all test methods, it will not be possible to include billions of values in tests, so parameter values should be partitioned into subsets that are relevant to the system requirements. In general, it is best to keep the partition of test parameters to no more than 10 or so values per parameter. Standard practices for equivalence partitions and boundary value analysis should be applied for determining representative parameter values.
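As a hedged sketch of this practice (the parameter and its partitions are hypothetical, chosen only to illustrate equivalence partitioning and boundary value analysis), a numeric input might be reduced to a handful of representative values:

```python
# Hypothetical parameter: a "transfer_size" input accepted in 1..65535 bytes.
# Rather than testing all 65,535 values, pick one representative from each
# equivalence class, plus the boundary values of the valid range.
transfer_size_values = [
    0,       # invalid: just below the accepted range
    1,       # boundary: minimum valid value
    1024,    # representative of typical valid values
    65535,   # boundary: maximum valid value
    65536,   # invalid: just above the accepted range
]

# Five values per parameter keeps the combinatorial test space manageable,
# well under the suggested limit of about 10 values per parameter.
assert len(transfer_size_values) <= 10
```

The covering array generator then works over these few representative values rather than the full input domain.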

The number of tests required grows *proportional* to *v ^{t}* log *n*, for *n* parameters with *v* values each, when a (strength *t*) covering array can be used. The question below on combinatorial coverage explains why this heuristic is important.
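As a rough back-of-the-envelope check of this growth rate (a proportionality estimate, not a covering array computation): for the 1,000 boolean variables and 3-way coverage mentioned earlier, *v ^{t}* log *n* is about 80, the same order of magnitude as the 71 tests ACTS produces.

```python
from math import log2

v, t, n = 2, 3, 1000  # boolean values, 3-way coverage, 1,000 parameters

# v^t log n growth estimate; the constant of proportionality is omitted,
# so this gives an order of magnitude, not an exact test count.
estimate = v**t * log2(n)
print(round(estimate))  # 80
```

The logarithmic dependence on *n* is what makes combinatorial testing practical: doubling the number of parameters adds only a constant number of tests.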

No, combinatorial coverage is a completely different concept. It measures the degree to which *combinations of input values* have been covered in tests, which is a static property of the test set. Measures such as statement or branch coverage are dynamic properties, as they measure the proportion of statements and branches covered when the program is running.
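A minimal sketch of measuring 2-way combinatorial coverage as a static property of a test set (the test values here are arbitrary examples, and the helper is illustrative, not an ACTS API):

```python
from itertools import combinations, product

def two_way_coverage(tests, num_params, values=(0, 1)):
    """Fraction of all 2-way value settings that appear in the test set.

    This is computed by inspecting the tests alone -- the program under
    test is never executed, unlike statement or branch coverage.
    """
    covered = 0
    total = 0
    for i, j in combinations(range(num_params), 2):
        seen = {(t[i], t[j]) for t in tests}
        for setting in product(values, repeat=2):
            total += 1
            if setting in seen:
                covered += 1
    return covered / total

# Two tests over three boolean parameters cover 6 of the 12 pair settings.
print(two_way_coverage([(0, 0, 0), (1, 1, 1)], 3))  # 0.5
```

A full 2-way covering array would drive this measure to 1.0 by construction.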

There is a relationship between combinatorial coverage and structural coverage, captured in the *branch coverage condition theorem*, which relates *M _{t}*, the proportion of *t*-way input combinations covered by a test set, to the branch coverage that the test set achieves.

The reason this condition is important in combinatorial testing is that we will normally be using covering arrays, which by definition have *M _{t}* = 100% for a given strength *t*. For example, consider

`if (A&&B) line1; if (C) line2;`

where A and B are Boolean variables.

If we take parameters with *v* values each, and form *t*-way combinations, each combination can have *v ^{t}* possible settings. The number of combinations that can be taken from *n* parameters is C(*n*, *t*), so the total number of *t*-way value settings is *v ^{t}* × C(*n*, *t*).

Consider a small example, with five parameters - *a, b, c, d, e* - of two values each: 0 or 1. If we take any two, such as (*b*,*e*), the possible value settings are 00, 01, 10, 11. We can systematically list the 2-way combinations: (*a,b*), (*a,c*), (*a,d*), (*a,e*), (*b,c*), (*b,d*), (*b,e*), (*c,d*), (*c,e*), (*d,e*). The number of 2-way value settings is thus 2^{2} × C(5,2) = 4 × 10 = 40.
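The enumeration above can be checked mechanically; a small sketch over the five parameters *a..e*:

```python
from itertools import combinations, product
from math import comb

params = ["a", "b", "c", "d", "e"]

# All 2-way value settings: choose a pair of parameters, then one of
# the 2^2 = 4 settings (00, 01, 10, 11) for that pair.
settings = [
    (pair, values)
    for pair in combinations(params, 2)
    for values in product((0, 1), repeat=2)
]

assert len(list(combinations(params, 2))) == comb(5, 2)  # 10 pairs
assert len(settings) == 2**2 * comb(5, 2)                # 40 settings
```

The same enumeration scales to any *n*, *v*, and *t*, which is how the *v ^{t}* × C(*n*, *t*) count arises in general.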

Hundreds of corporations and organizations use the ACTS combinatorial testing tools. See our case studies for published examples from some of the world's largest organizations, including Adobe, Avaya, Daimler AG, IBM, Jaguar Land Rover, Lockheed Martin, Red Hat, Rockwell Collins, Siemens, the US Air Force, the US Marine Corps, and others. Many universities also use the tools in software testing courses.

Software on this site is free of charge and will remain free in the future. It is public domain; no license is required and there are no restrictions on use. You are free to include it and redistribute it in commercial products if desired. NIST is an agency of the United States Government, conducting research in advanced measurement and test methods.

To obtain the tools, please send a request to Rick Kuhn - kuhn@nist.gov including your name and the name of your organization. No other information is required, but we like to have a list of organizations so that we can show our management where the software is being used. We will send you a download link.

Created May 24, 2016, Updated August 20, 2019