Producing Production Quality Software

Producing Production Quality Software

Producing Production Quality Software

Binary Search: Invariants—Repeated Assertions
Prof. Arthur P. Goldberg

Fall, 2004

In the first class we removed the bugs from the following implementation of binary search. You gave good reasons for fixing the bugs in this program. But a lot of us, including me, were uncertain about our fixes and how the program would behave. We had to create examples, like arrays with a small number of possibly particular entries to test. And we couldn’t reason convincingly to prove to ourselves that our fixes were right—we’d only be able to say they worked with our examples.

Invariants are permanent truths, like 0<1. In thinking about programs we can put invariants in our code and try to demonstrate that they are always true. Assuming we succeed with the demonstration then we can reason about the correctness of the program from the invariant.

Here’s the incorrect binary search again (note the nice indentation—Eclipse does this automatically).

class binarySearch

{

static int NotFound = -1;

public static int IncorrectDoBinarySearch(

int SearchValue,

int[] SortedArray)

{

int low = 1; // index of start of array

int high = SortedArray.length; // index of end of array

int mid;

while (low < high)

{

mid = (low + high) / 2;

if (SearchValue == SortedArray[mid])

{

return mid;

}

if (SearchValue < SortedArray[mid])

{

high = mid;

continue;

}

else

{

low = mid;

continue;

}

return NotFound;

}

return NotFound;

}

}

Let’s put some invariants in this binary search and try to convince ourselves of the truth of the program’s properties.

Here I list all the important properties of the search (note that I assume that the array is actually sorted):

  1. Suppose the search value is not in the sorted array
  2. the program will terminate and return NotFound.
  3. the program will take at most log 2 (n) steps.
  4. Alternatively, suppose the search value is in the array
  5. the program will terminate and return the index of the array element holding the search value.
  6. the program will take at most log 2 (n) steps.

What invariants (assertions) might help us demonstrate and prove these properties?

If the size of the array being examined decreases by at least a factor of two each time through the while loop, then we know that the loop will take at most log 2 (n) steps. We codify this in invariant I1:

I1: ‘high – low is at most half the size it was last time through the while loop’

With respect to finding the search value if it is in the array, it seems crucial that the value be in the portion of the array that the program examines as it cycles through the while loop, that is SortedArray[ low ] … SortedArray[ high ]. So we define the invariant I2:

I2: ‘if the SearchValue is in SortedArray then it is in at least one of the elements SortedArray[ low ] … SortedArray[ high ]’

First, let’s reason about I1:

class binarySearch

{

static int NotFound = -1;

public static int DoBinarySearch(int SearchValue, int[] SortedArray)

{

int low = 0; // index of start of array

int high = SortedArray.length - 1; // index of end of array

int mid;

while (low <= high)

{

mid = (low + high) / 2;

if (SearchValue == SortedArray[mid])

{

return mid;

}

if (SearchValue < SortedArray[mid])

{

high = mid - 1;

}

else

{

low = mid + 1;

}

/*

The if statement above is the only code that changes high or low. So if this if statement shrinks high – low by at least a factor of 2, then I1 will be true.

Let’s consider 2 cases that cover the situation

Before the if statement, low + high was even:

then (high – low)/2 == mid – low == high – mid and the second if statement shrinks high – low to ((high – low)/2) – 1 which does not violate I1.

Before the if statement, low + high was odd:

then (high – low + 1)/2 = mid – low + 1 = high – mid and the second if statement shrinks high – low as follows:

if( SearchValue < SortedArray[ mid ] )

high – low shrinks to (high – low - 3)/2

else

high – low shrinks to (high – low - 1)/2

which does not violate I1.

Therefore, since I1 is not violated for any case, I1 is true here.

*/

}

return NotFound;

}

}

Next, let’s reason about I2:

class binarySearch

{

static int NotFound = -1;

public static int DoBinarySearch(int SearchValue, int[] SortedArray)

{

int low = 0; // index of start of array

int high = SortedArray.length - 1; // index of end of array

int mid;

// I2 is obviously true

while (low <= high)

{

/*

Is I2 true here? I2 is certainly true the first time through the while loop. What about subsequent times through the loop?

The first if statement finds an element SortedArray[ mid ] which does not equal SearchValue. Since we assume that the array is indeed sorted, the second if statement changes low and high such that I2 remains true. And since the second if statement is the only place where low or high change (and SearchValue and SortedArray do not change) I2 must be true every time through the while loop.

*/

mid = (low + high) / 2;

if (SearchValue == SortedArray[mid])

{

return mid;

}

if (SearchValue < SortedArray[mid])

{

high = mid - 1;

}

else

{

low = mid + 1;

}

}

return NotFound;

}

}

Now that we’ve shown that I1 and I2 are true, what can we say about the properties?

First, I1 enables us to prove both properties A.b and B.b. In fact, we can simplify them to ‘the binarySearch will take at most log 2 (n) passes through the while loop’. This follows directly from invariant I1.

Do we know enough to conclude that properties A.a and B.a hold? Not quite, because the invariants would still be true if the loop condition was

while ( low + 10 < high )

and then properties A.a and B.a would not hold. So how do we show that properties A.a and B.a are true?[A1]

1

[A1]We need another invariant, which says something like I3: if the SearchValue is not in the array then the program will not return until low < high. Then we can combine this with I2 to prove these properties.