671: DAA II More on Average Behavior
In the previous lecture, we stated the five equivalent formulations of average behavior. We usually assume the uniform distribution on the set In of inputs of size n to an algorithm, so that probability calculations are done by simply counting. Under this assumption, the five equivalent formulations of A(n) = E[τ] are as follows, where |S| denotes the (cardinality) number of elements in the set S, and if Y is a random variable (such as τ), then the set {y : Y(y) = i} is simply denoted by Y = i.
(I)* (the “person on the street” notion of average)
(II)*
(III)*
(IV)* If τ= τ1+ τ2+ … + τm , then A(n) = E[τ] = E[τ1] + E[τ2]+ … + E[τm]
(Note: This is same as (IV))
(V)* If Y is a random variable defined on In, and taking values in {1,2,…,m} then
IMPORTANT NOTE: The last expression in (V)*, while correct, is usually not the form used in practice, since E[τ | Y= i] and P(Y = i) are often easily directly calculated from the fact that the distribution is uniform (see, for example, the analysis of Quicksort given below). We now use these formulations,and the more general formulations (non-uniform distribution),to establish A(n) for some familiar algorithms.
A(n) for Linear Search
Suppose p is the probability that a search element X is in a list L[0:n – 1], and each element in the list is equally likely to be X. Then
A(n) = p(n + 1)/2 + (1 – p)n
Proof: Let Y : In→ {0,1} be defined by Y(L[0:n – 1], X) = 1 if XL[0:n – 1], Y = 0 otherwise. Then, by Formulation (V), we have
A(n) = (E[τ | Y = 1])P(Y = 1) + (E[τ | Y = 0])P(Y = 0)
= (E[τ | Y = 1])p + (E[τ | Y = 0])(1 – p)
= ((1/n)(1 + 2 + … + n))p + n(1 – p) = p(n + 1)/2 + (1 – p)n.
A(n) for Quicksort
We use formulations (IV)* and (V)*, where In is the set of all permutations of {1,2,…,n}, with each permutation equally likely. Let τ= τ1+ τ2 , where τ1is the number of comparisons performed by the call to Partition, and τ2 is the number of comparisons done by the recursive calls. Then
A(n) = E[τ] = E[τ1] + E[τ2] = n + 1 + E[τ2].
Define Y:In→ {1,2,…,n} by Y(π) = π(1). Then by (V)*,
= (1/n)[A(0) + A(n – 1) + A(1) + … + A(n – 1 ) + A(0)]
= (2/n)[ A(0) + A(1) + … + A(n – 1 )].
Hence,
A(n) = n + 1 + (2/n)[ A(0) + A(1) + … + A(n – 1 )] (a so-called “full-history” recurrence)
Where A(0) = A(1) = 0.
The rest is algebra, (see p. 197 in the text), yielding
A(n)/(n + 1) = 2/(n + 1) + A(n – 1 )/n,
which unpeels to yield A(n) = 2(n + 1)Hn+1 – 3(n + 1) ~ 2nlnn.
A(n) for MaxMin2
Recall the following simple algorithm for finding the maximum and minimum values in a list L[0:n – 1].
procedureMaxMin2(L[0:n – 1], Max, Min)
Input: L[0:n – 1] (a list of size n)
Output: Max, Min (the maximum and minimum values in L[0:n – 1], respectively)
Max ← L[0]
Min ← L[0]
fori ← 1 ton – 1 do
Temp← L[i]
ifTempMaxthen
Max←Temp
else
ifTempMinthen
Min←Temp
endif
endif
endfor
endMaxMin2
Note that W(n) = 2n – 2, and occurs for a decreasing list. Hence, the worst case is no better than simply making a linear scan twice through the list. On the other hand, the best case B(n) = n – 1 occurs for a decreasing list, which is a lower bound for any comparison-based algorithm for merely finding the maximum value alone. A natural question is, what is A(n)? The answer might surprise you!
Let τ= τ1+ τ2 , where τ1is the number of time the comparison TempMaxis performed, and τ2is the number of time the comparison TempMin is performed. Then
A(n) = E[τ] = E[τ1] + E[τ2] = n – 1 + E[τ2].
Note that the second comparison is performed precisely when the comparison TempMaxisfalse. Thus, τ2 = n – 1 – τ3 , where τ3 is the number of times that Max is updated.
Let A*(n) = E[τ3], and define Y: In→ {1,2,…,n} by Y(π) = π(n). Then by (V)*,
Now it is easy to see that E[τ3 | Y = n] = A*(n – 1) + 1, and E[τ3 | Y = i ≠ n] = A*(n – 1), i = 1, 2, …, n – 1. Thus, we have
Unwinding this recurrence yields A*(n) = E[τ3] = Hn. Hence,
A(n) = E[τ] = E[τ1] + E[τ2] = n – 1 + E[τ2] = 2n – 2 – Hn. Thus, A(n) ~ 2n – 2 = W(n).
In other words, the average behavior of MaxMin2 is no better than the simple algorithm of making two independent linear scans to find Max and Min.
Remarks. It turns out that ceiling(3n/2) – 2 is a worst-case lower bound for the problem of finding the maximum and minimum values in a list, and the algorithm MaxMin3 given on p. 40 in the text is an algorithm achieving this lower bound.
A(n) for Link-Ordered Search
Suppose we have a list L[0: n – 1] and a link array Link[0: n – 1] and variable Start pointing to the list L[0: n – 1] in increasing order, that is,
L[Start] ≤ L[Link[Start]] ≤ L[Link[Link[Start]]] ≤ … ≤ L[Link(n-1)[Start]].
Note that we can find a search element with no more than log2n comparisons by emulating binary search. However, it takes n/2 link updates to get to the midpoint element in a sorting of L. So comparisons is not the appropriate basic operation for an algorithm emulating binary search, but rather link updates. The question is, while we must do n link updates when searching for an element X in the worst case, can we do better on average than simply following the Link array to find X? The answer, somewhat surprisingly, is yes. The idea is to look for a better place (if any) in the sublist L[0: k – 1] to start the search than at L[Start], where k is appropriately chosen. Note that L[i] is a better start if L[i] ≤ X, and the best better start in L[0: k – 1] would be the maximum better start, if there are any. Note also that k can’t be a linear function of n, otherwise, we won’t improve on linear behavior of the algorithm. It turns out that if we take k = n1/2, then we obtain an O(n1/2) algorithm! The performance of our algorithm will be dominated by comparisons of list elements to the search element X, so we will take list comparisons to X as our basic operation.
We will analyze the average behavior of our algorithm as a function of k. We assume that the search element X is in the list L[0: n – 1], and that each list element has an equal chance of 1/n of being X. We execute the following code segment as we look for a better start Probe in L[0: k – 1].
Max ← L[Start]
Probe ← Start
fori ← 0 to k – 1 do
ifL[i] ≤Xthen
if L[i] > Maxthen
Max ← L[i]
Probe ← i
endif
endif
endfor
Note that this code sequence performs k list comparisons to X. We then perform the following code sequence to search for X.
W← Probe
whileXL[W] do
W ← Link[W]
ifW = -1 then return(-1) endif
endwhile
ifX = L[W] then
return(W)
else
return(-1)
endif
Then A(n) = E[τ] = k + 1 + E[τ2], where τ2 is the number of comparisons made by the above while loop. To compute E[τ2], let Y: In→ {1,2,…,n} be defined by Y(L[0: n – 1], xm) = m, where X = xm is the mth smallest element in L[0: n – 1]. Then
To simplify the notation, let βm denote the mapping which denotes the number of comparisons done by the while loop when Y = m, so that E[βm] = E[τ2 | Y = m].
We compute E[βm] using
E[βm] = q1 + q2 + … + qm, where qiis the probability that βm≥ i. We have
q1= 1
q2 = probability that xm is not in
q3 = probability that xm-1 , xm are not in
. . .
qi= probability that xm-i+2 ,..., xm-1, xm are not in
Hence, (by Proposition 3.2.2 in text).
Since the above final inequality is independent of m, we have
, and
A(n) = E[τ] = k + 1 + E[τ2] ≤ k + 1 + n/(k + 1) + O(1).
Using calculus, we see immediately that A(n) is minimized when k = n1/2 – 1, so if we take k = floor(n1/2), we will obtain an O(n1/2) algorithm.