Measuring Combinatorial Coverage
For a given test set, what can we say about the combinatorial coverage it provides?
Some useful measures are the following:
1. Simple t-way combination coverage: of the total number of t-way combinations for a given collection of variables, what percentage are covered by the test set? If the test set is a covering array, then coverage is 100%, by definition, but many test sets not based on covering arrays may still provide significant t-way coverage.
2. t+k-way combination coverage: A test set that provides some percentage of coverage for t-way combinations will also provide some degree of coverage for t+1-way combinations, t+2-way combinations, etc.
3. Configuration-spanning coverage: Josh Maximoff and Mike Trela have proposed the following measure: (p,q)-spanning is defined as the percentage p of t-way combinations that cover at least q percent of the possible configurations. For example, in pairwise (2-way) coverage of binary variables, every 2-way combination has four configurations: 00, 01, 10, 11. Here's an example with four binary variables, a, b, c, and d, where each row represents a test.
a b c d
0 0 0 0
0 1 1 0
1 0 0 1
0 1 1 1
For this set, there are 6 possible variable combinations (4 choose 2) and 24 possible variable-value combinations ((4 choose 2) * 2^2). Of these, 19 variable-value combinations are covered and the only ones missing are ab=11, ac=11, ad=10, bc=01, bc=10. But only two, bd and cd, are covered with all 4 value pairs. So for our basic definition of simple t-way coverage, we have only 33% (2/6) coverage, but 79% (19/24) for the spanning metric. For a better understanding of this test set, we can compute the configuration coverage for each of the six variable combinations:
ab: 00, 01, 10 = .75
ac: 00, 01, 10 = .75
ad: 00, 01, 11 = .75
bc: 00, 11 = .50
bd: 00, 01, 10, 11 = 1.0
cd: 00, 01, 10, 11 = 1.0
So for this test set, 17% (1/6) of the variable combinations have 50% of their configurations covered, 50% (3/6) have 75% covered, and 33% (2/6) have 100% covered. And, as noted above, for the whole set of tests, 79% of variable-value configurations are covered.
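The figures above can be checked with a short script. This is only a minimal sketch in Python, not the tool mentioned later in this note; the names `tests`, `names`, and `configuration_coverage` are made up for illustration, and it assumes every variable is binary.

```python
from itertools import combinations

# The four tests from the example; columns are a, b, c, d.
names = "abcd"
tests = [
    (0, 0, 0, 0),
    (0, 1, 1, 0),
    (1, 0, 0, 1),
    (0, 1, 1, 1),
]


def configuration_coverage(tests, t):
    """Map each t-way column combination to the set of value tuples the tests cover."""
    n = len(tests[0])
    return {
        cols: {tuple(row[c] for c in cols) for row in tests}
        for cols in combinations(range(n), t)
    }


covered = configuration_coverage(tests, t=2)
possible = 2 ** 2  # configurations per pair of binary variables (assumption: all binary)

# Per-combination coverage, e.g. ab covers 3 of 4 configurations.
for cols, seen in covered.items():
    label = "".join(names[c] for c in cols)
    print(label, sorted(seen), len(seen) / possible)

# Totals over all variable-value configurations.
total_covered = sum(len(seen) for seen in covered.values())
total_possible = possible * len(covered)

# Number of variable combinations with every configuration covered.
fully_covered = sum(len(seen) == possible for seen in covered.values())

print("variable-value coverage:", total_covered, "/", total_possible)
print("simple 2-way coverage:", fully_covered, "/", len(covered))
```

Run on the example, the totals come out to 19 of 24 variable-value configurations and 2 of 6 fully covered pairs, matching the 79% and 33% figures in the text.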
Note that simple t-way coverage is (p,100)-spanning, where p is the percentage of simple t-way coverage. A covering array is thus by definition (100,100)-spanning since it includes 100% of all possible t-way configurations.
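To make the (p,q) definition concrete, here is a hedged sketch that computes p for a chosen q. The function name `spanning_p` and its parameters are hypothetical, and it assumes all variables take the same number of values (two by default). Because t is a parameter, the same function also illustrates t+k-way coverage: calling it with t=3 on the four-test example shows how much 3-way coverage a small test set provides.

```python
from itertools import combinations


def spanning_p(tests, t, q, values_per_variable=2):
    """Return p, the percentage of t-way variable combinations whose
    covered configurations amount to at least q percent of those possible.

    Assumes every variable has the same number of values.
    """
    n = len(tests[0])
    possible = values_per_variable ** t
    combos = list(combinations(range(n), t))
    meeting = 0
    for cols in combos:
        # Distinct value tuples this combination takes across the tests.
        seen = {tuple(row[c] for c in cols) for row in tests}
        # Integer comparison avoids floating-point rounding at the threshold.
        if 100 * len(seen) >= q * possible:
            meeting += 1
    return 100 * meeting / len(combos)


# The example test set (columns a, b, c, d).
tests = [
    (0, 0, 0, 0),
    (0, 1, 1, 0),
    (1, 0, 0, 1),
    (0, 1, 1, 1),
]

for q in (50, 75, 100):
    print("2-way, q =", q, "-> p =", round(spanning_p(tests, 2, q), 1))

# The same measure one level up (t+1 = 3).
for q in (50, 100):
    print("3-way, q =", q, "-> p =", round(spanning_p(tests, 3, q), 1))
```

Since the definition says "at least q percent", the values are cumulative: for the example, all six pairs reach the 50% level, five of six reach 75%, and two of six reach 100%, so p at q=100 equals the simple 2-way coverage of 33%.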
I've developed a tool to calculate configuration-spanning coverage and applied it to a few test sets. Here is an example of coverage for a 2873245 set of input variables (blue=2-way, pink=3-way, yellow=4-way). This particular test set was not a covering array, but pairwise coverage is still quite good, with about 95% of the 2-way variable combinations having all possible configurations covered. Even for 4-way combinations we see that every variable combination has at least 28% of its configurations covered, and about 25% of them have about 98% or more of their 4-way configurations covered.
A great deal of work remains to be done to develop an understanding of combinatorial coverage and its relationship with software defect detection.