






CSE3358 Problem Set 4 Solution
Problem 1: S0rt1ng
(a) (10 points) Consider an array A of n numbers each of which is either 0 or 1. We will refer to such an array as binary. Describe an asymptotically efficient algorithm for sorting a binary array A of size n. For full credit you should:
ANSWER: Since the array contains only zeros and ones, we can count the number of ones by scanning the array and adding up the entries. If we find c ones, we set A[1]...A[n − c] to 0 and A[n − c + 1]...A[n] to 1. Note that this would not work if the array contained objects with keys in {0, 1}, because overwriting the entries discards the objects (and any satellite data) instead of moving them.
Here’s pseudocode for the algorithm.

    c ← 0
    for i ← 1 to n do
        c ← c + A[i]
    for i ← 1 to n − c do
        A[i] ← 0
    for i ← n − c + 1 to n do
        A[i] ← 1
The running time of this algorithm is dominated by the three for loops, each of which performs a constant amount of work per iteration. Therefore, the running time is Θ(n) + Θ(n − c) + Θ(c) = Θ(2n) = Θ(n). This is the best asymptotically achievable bound, since any sorting algorithm has to read the entire input A[1]...A[n], which takes Ω(n) time.
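This counting approach can be sketched in C++ (a minimal sketch; the function name `sortBinary` is ours):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sort a binary (0/1) array in Theta(n): count the ones in one pass,
// then overwrite the array with n - c zeros followed by c ones.
std::vector<int> sortBinary(std::vector<int> A) {
    std::size_t n = A.size();
    std::size_t c = 0;                          // number of ones
    for (std::size_t i = 0; i < n; i++)
        c += A[i];                              // first Theta(n) pass
    for (std::size_t i = 0; i < n - c; i++)
        A[i] = 0;                               // block of zeros
    for (std::size_t i = n - c; i < n; i++)
        A[i] = 1;                               // block of ones
    return A;
}
```

The three loops touch each index a constant number of times, matching the Θ(n) bound above.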
(b) (10 points) We define a Lomuto Quicksort algorithm as a Quicksort algorithm that (recursively) partitions an array into a left sub-array with all elements ≤ Pivot, and a right sub-array with all elements > Pivot.
Argue (your argument must be very clear, organized, and convincing, if not a formal proof) that any Lomuto Quicksort algorithm will have an average running time of Θ(n^2) on binary arrays of size n (i.e., even when the pivot is chosen randomly). Explain why this does not contradict the result that QUICKSORT has an average running time of Θ(n log n).
ANSWER: The first key observation is that if the pivot is x = 1, then the right partition is empty, because no element is greater than 1. Consider the first time t at which a pivot x = 0 is chosen. The 1st, 2nd, ..., t-th partitions are therefore all left partitions, each containing one less element than the previous one. Since the partitioning algorithm takes linear time, we spend

    Σ_{i=0}^{t} Θ(n − i)

time on those partitions.

The t-th partition, with pivot x = 0, produces a left partition L containing all the zeros and a right partition R containing all the ones. The second key observation is that L always produces empty right partitions, because every element of L is ≤ the pivot x = 0. Similarly, R always produces empty right partitions, because every element of R is ≤ the pivot x = 1. Pick the larger of L and R; it has size at least ⌈(n − t)/2⌉ = n′. Since the partitioning algorithm takes linear time, we spend

    Σ_{i=0}^{n′} Ω(n′ − i)

time on those partitions. The total running time is therefore

    Σ_{i=0}^{t} Θ(n − i) + Σ_{i=0}^{n′} Ω(n′ − i),   where n′ = ⌈(n − t)/2⌉.

Hence T(n) = Ω(Σ_{i=0}^{t} (n − i)) and T(n) = Ω(Σ_{i=0}^{n′} (n′ − i)). If t ≥ n/2, then T(n) = Ω(Σ_{i=0}^{⌈n/2⌉} (n − i)) = Ω(n^2). If t ≤ n/2, then n′ ≥ n/4, so T(n) = Ω(Σ_{i=0}^{n′} (n′ − i)) = Ω(n′^2) = Ω(n^2). Since T(n) = O(n^2) in any case, T(n) = Θ(n^2).

This does not contradict the Θ(n log n) average-case bound for QUICKSORT, because that result is an average over inputs with distinct keys (equivalently, over random permutations of n distinct elements). A binary array of size n contains many equal keys, a case the standard average-case analysis does not cover.
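The quadratic behavior can be checked empirically. The sketch below uses the textbook last-element-pivot Lomuto partition (one member of the Lomuto family defined above; the instrumentation and names are ours) and counts element-to-pivot comparisons. On an all-ones binary array every right partition is empty, so the count is exactly n(n − 1)/2:

```cpp
#include <cassert>
#include <utility>
#include <vector>

static long comparisons = 0;   // element-vs-pivot comparisons performed

// Lomuto partition with the last element as pivot:
// afterwards A[p..q-1] <= A[q] < A[q+1..r].
int partition(std::vector<int>& A, int p, int r) {
    int pivot = A[r];
    int i = p - 1;
    for (int j = p; j < r; j++) {
        comparisons++;
        if (A[j] <= pivot)
            std::swap(A[++i], A[j]);
    }
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

void quicksort(std::vector<int>& A, int p, int r) {
    if (p < r) {
        int q = partition(A, p, r);
        quicksort(A, p, q - 1);   // on all ones, this has size n - 1
        quicksort(A, q + 1, r);   // on all ones, this is always empty
    }
}
```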
Here’s an example when the pivot is the last element.
[Figure: trace of the partitions. Each of the 1st through t-th calls picks pivot 1, producing an empty right partition and a left partition one element smaller; the t-th call, with pivot 0, splits the remaining elements into L = 0 0 0 0 0 0 (all zeros) and R = 1 1 1 (all ones), each of which then keeps producing empty right partitions and shrinks by one element per call.]
struct Element {
    Element *prev;
    Element *next;
    void *data;     // anything you need for the specific application
};

struct List {
    Element *head;
    List() { head = NULL; }
};

void insert(List *L, Element *x) {
    x->next = L->head;
    if (L->head != NULL)
        L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

void del(List *L, Element *x) {
    if (x->prev != NULL)
        x->prev->next = x->next;
    else
        L->head = x->next;
    if (x->next != NULL)
        x->next->prev = x->prev;
}
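A short self-contained usage sketch of the list (the definitions are repeated so the fragment compiles on its own; the `demo` function and integer payloads are ours):

```cpp
#include <cassert>
#include <cstddef>

struct Element { Element *prev; Element *next; void *data; };

struct List {
    Element *head;
    List() { head = NULL; }
};

void insert(List *L, Element *x) {
    x->next = L->head;
    if (L->head != NULL) L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

void del(List *L, Element *x) {
    if (x->prev != NULL) x->prev->next = x->next;
    else L->head = x->next;
    if (x->next != NULL) x->next->prev = x->prev;
}

// insert() prepends, so the most recently inserted element is the head.
void demo() {
    List L;
    int a = 1, b = 2;
    Element x = {NULL, NULL, &a};
    Element y = {NULL, NULL, &b};
    insert(&L, &x);                                  // list: x
    insert(&L, &y);                                  // list: y -> x
    assert(L.head == &y && y.next == &x && x.prev == &y);
    del(&L, &y);                                     // list: x
    assert(L.head == &x && x.prev == NULL && x.next == NULL);
}
```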
(c) (5 points) Using the linked list of part (b), implement a stack that keeps track of its depth. The stack should provide the following functionality:
ANSWER:
class Stack {
    List *L;
    int current_depth;
    int max_depth;

public:
    Stack() {
        L = new List();
        current_depth = 0;
        max_depth = 0;
    }

    void push(Element *x) {
        current_depth++;
        if (current_depth > max_depth)
            max_depth = current_depth;
        insert(L, x);
    }

    Element *pop() {
        if (current_depth > 0) {
            Element *temp = L->head;
            del(L, L->head);
            current_depth--;
            return temp;
        }
        return NULL;
    }

    int depth() { return current_depth; }

    int max() { return max_depth; }
};
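A self-contained usage sketch of the stack (definitions repeated so it compiles on its own; the test scenario is ours). Note that `max()` keeps reporting the historical maximum even after pops:

```cpp
#include <cassert>
#include <cstddef>

struct Element { Element *prev; Element *next; void *data; };

struct List {
    Element *head;
    List() { head = NULL; }
};

void insert(List *L, Element *x) {
    x->next = L->head;
    if (L->head != NULL) L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

void del(List *L, Element *x) {
    if (x->prev != NULL) x->prev->next = x->next;
    else L->head = x->next;
    if (x->next != NULL) x->next->prev = x->prev;
}

class Stack {
    List *L;
    int current_depth;
    int max_depth;
public:
    Stack() { L = new List(); current_depth = 0; max_depth = 0; }
    void push(Element *x) {
        current_depth++;
        if (current_depth > max_depth) max_depth = current_depth;
        insert(L, x);
    }
    Element *pop() {
        if (current_depth > 0) {
            Element *temp = L->head;   // LIFO: head is the newest element
            del(L, L->head);
            current_depth--;
            return temp;
        }
        return NULL;                   // pop on empty stack
    }
    int depth() { return current_depth; }
    int max() { return max_depth; }
};
```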
            t->r = r;
            e->data = &t;
            s.push(e);
            e = new Element();
            t = new Tuple();
            t->p = p;
            t->r = q;
            e->data = &t;
            s.push(e);
        }
    }
}
Obtain the maximum stack depth for several examples including:
and report the maximum depth of the stack ( as returned by s.max() ) as well as the running time in Θ notation.
ANSWER:

                                     max. stack depth    running time
    sorted array                     Θ(1)                Θ(n^2)
    array with all elements equal    Θ(log n)            Θ(n log n)
    reverse sorted array             Θ(n)                Θ(n^2)
(e) (5 points) Can you see that the depth of the stack is Θ(n) in the worst-case? Which scenarios lead to such a worst-case?
ANSWER: The case of a reverse sorted array. If the order of the two PUSH operations in the code above is changed, the table will be the symmetric one, so the worst-case in terms of stack depth will be the case of a sorted array.
The stack depth that you explicitly observed in part (d) mirrors the stack depth that the compiled program uses for recursion. A deep stack means more internal stack operations and more memory usage. It is possible to modify the QUICKSORT code to make the stack depth always Θ(log n), even when the worst case occurs, without affecting its average-case performance of Θ(n log n). We can do this by first reducing the number of recursive calls using tail recursion elimination: when the recursive call is the last instruction in the function, it can be eliminated. Here’s how:
QUICKSORT(A, p, r)
    while p < r do
        q ← PARTITION(A, p, r)
        QUICKSORT(A, p, q)
        p ← q + 1
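A runnable C++ rendering of this version (a sketch: we supply a Hoare-style PARTITION, which returns an index q such that every element of A[p..q] is ≤ every element of A[q+1..r], matching the QUICKSORT(A, p, q) call above; all names are ours):

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hoare-style partition: returns q with A[p..q] <= A[q+1..r],
// and both sides nonempty, so the loop always makes progress.
int partition(std::vector<int>& A, int p, int r) {
    int pivot = A[p];
    int i = p - 1, j = r + 1;
    while (true) {
        do { j--; } while (A[j] > pivot);
        do { i++; } while (A[i] < pivot);
        if (i < j) std::swap(A[i], A[j]);
        else return j;
    }
}

// Tail recursion eliminated: the second recursive call on (q+1, r)
// becomes another iteration of the while loop.
void quicksort(std::vector<int>& A, int p, int r) {
    while (p < r) {
        int q = partition(A, p, r);
        quicksort(A, p, q);
        p = q + 1;
    }
}
```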
(f) (5 points) Argue that this new code for QUICKSORT still correctly sorts the array A.
ANSWER: When the first recursive call returns, the values of p, r, and q are restored to what they were before the call. Since p is then set to q + 1, the while loop guarantees that QUICKSORT is executed with arguments q + 1 and r, which is identical to making a second recursive call with these parameters.
(g) (5 points) Describe a scenario where the depth of the internal stack is still Θ(n) even with the above modification.
ANSWER: The stack depth will be Θ(n) if there are Θ(n) recursive calls to QUICKSORT (which now must happen on left partitions). If the array is sorted, then every left partition contains a single element (try it), so recursion stops immediately and we never go beyond one level of recursion. However, if the array is reverse sorted, we get Θ(n) recursive calls on left partitions.
QUICKSORT(A, p, r)
    QUICKSORT(A, p, r-1)
        QUICKSORT(A, p+1, r-2)
            QUICKSORT(A, p+2, r-3)
                ...

Therefore, we have almost n/2 recursive calls. To see this, try QUICKSORT on a small array, say A = [8, 7, 6, 5, 4, 3, 2, 1].
(h) (5 points) How can you further modify the code to make the stack depth Θ(log n) without affecting the average case performance of QUICKSORT? Hint: since we use recursion on one part of the array only, why not use it on the smaller part?
ANSWER: The problem demonstrated by the above scenario is that each invocation of QUICKSORT calls QUICKSORT again with almost the same range. To avoid such behavior, we must change QUICKSORT so that the recursive call is on a smaller interval of the array. The following variation of QUICKSORT checks which of the two subarrays returned from PARTITION is smaller and recurses on the smaller subarray, which is at most half the size of the current array. Since the array size is reduced by at least half on each recursive call, the number of recursive calls, and hence the stack depth, is Θ(log n) in the worst case. The expected running time is not affected, because exactly the same work is done as before: the same partitions are produced, and the same subarrays are sorted.
QUICKSORT(A, p, r)
    while p < r do
        q ← PARTITION(A, p, r)
        if q − p + 1 < r − q then
            QUICKSORT(A, p, q)
            p ← q + 1
        else
            QUICKSORT(A, q + 1, r)
            r ← q
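A runnable sketch of this variant with the recursion depth instrumented (the Hoare-style PARTITION and the depth bookkeeping are ours). Since every recursive call is on a side of size at most half the current range, the depth on an array of n elements never exceeds ⌊log2 n⌋ + 1:

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hoare-style partition: returns q with A[p..q] <= A[q+1..r],
// both sides nonempty.
int partition(std::vector<int>& A, int p, int r) {
    int pivot = A[p];
    int i = p - 1, j = r + 1;
    while (true) {
        do { j--; } while (A[j] > pivot);
        do { i++; } while (A[i] < pivot);
        if (i < j) std::swap(A[i], A[j]);
        else return j;
    }
}

// Recurse only on the smaller half; loop on the larger one.
// depth is the current recursion depth; maxDepth records the deepest call.
void quicksort(std::vector<int>& A, int p, int r, int depth, int& maxDepth) {
    maxDepth = std::max(maxDepth, depth);
    while (p < r) {
        int q = partition(A, p, r);
        if (q - p + 1 < r - q) {
            quicksort(A, p, q, depth + 1, maxDepth);
            p = q + 1;
        } else {
            quicksort(A, q + 1, r, depth + 1, maxDepth);
            r = q;
        }
    }
}
```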
Another way of building a heap (in place) is to sort A in decreasing order using any in-place sorting algorithm, such as Insertionsort (Θ(n^2) time) or Quicksort (Θ(n log n) time on average); a reverse-sorted array satisfies the max-heap property.
(b) (5 points) Does your algorithm produce the same result as the one based on repeated HEAPIFY? If yes, provide an argument. If no, provide a counter example.
ANSWER: Not necessarily. Consider the array A = [1, 2, 3]. BUILD-HEAP with HEAPIFY produces A = [3, 2, 1], while BUILD-HEAP with repeated inserts produces A = [3, 1, 2].
(c) (5 points) Given n distinct elements, how many different heaps of these elements can you build? How does this number change with n asymptotically?
ANSWER: Let H(n) be the number of heaps of n distinct elements. Given a heap of n elements, the root is always the maximum element, so we only have to count the arrangements of the remaining n − 1 elements. Let x be the number of elements in the root’s left subtree; the left subtree is itself a heap of x elements, and similarly the right subtree is a heap of n − 1 − x elements. Since the heap is a nearly complete binary tree, x can always be determined from n: for a given n, x is fixed. The x elements of the left subtree can be any x of the n − 1 remaining elements. Therefore, we can write the expression for H(n) as follows:
    H(n) = C(n − 1, x) · H(x) · H(n − 1 − x)

where C(n, k) denotes the binomial coefficient “n choose k”. Let’s not worry about how to compute x in general and focus on complete heaps only. In a complete heap, x = (n − 1)/2 = ⌊n/2⌋. Therefore, for n = 2^p − 1 for some p ≥ 0, we have the following formula:

    H(n) = C(n − 1, ⌊n/2⌋) · H^2(⌊n/2⌋)

Using the expression C(n − 1, ⌊n/2⌋) = (n − 1)! / (⌊n/2⌋! · ⌊n/2⌋!) and Stirling’s approximation n! = Θ(n^{n+1/2} e^{−n}), we get C(n − 1, ⌊n/2⌋) = Θ(2^n / √n). Let h(n) = log H(n). Then (ignoring floors)

    h(n) = log C(n − 1, ⌊n/2⌋) + 2h(n/2) = log Θ(2^n / √n) + 2h(n/2) = Θ(n) + 2h(n/2)

Using the Master theorem, h(n) = Θ(n log n). Therefore, H(n) = 2^{h(n)} = n^{Θ(n)}. Here’s a table showing the number of heaps for n = 0, 1, ..., 20.
n #heaps
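The recurrence can be sanity-checked numerically. The sketch below is ours: `binom` computes C(n, k) exactly for small n, and `leftSize` computes x for a general nearly complete binary tree (the general-x computation the answer above deliberately skipped). Hand-counting gives, e.g., H(3) = 2 and H(5) = 8.

```cpp
#include <cassert>

// Binomial coefficient C(n, k); exact for small n because at each step
// the running product of i consecutive integers is divisible by i.
long long binom(int n, int k) {
    if (k < 0 || k > n) return 0;
    long long res = 1;
    for (int i = 1; i <= k; i++)
        res = res * (n - k + i) / i;
    return res;
}

// Number of nodes x in the left subtree of a nearly complete
// binary tree with n nodes.
int leftSize(int n) {
    if (n <= 1) return 0;
    int h = 0;                          // height = floor(log2(n))
    while ((1 << (h + 1)) <= n) h++;
    int last = n - ((1 << h) - 1);      // nodes on the last level
    int cap = 1 << (h - 1);             // last-level slots under the left child
    return ((1 << (h - 1)) - 1) + (last < cap ? last : cap);
}

// H(n) = C(n-1, x) * H(x) * H(n-1-x), with H(0) = H(1) = 1.
long long H(int n) {
    if (n <= 1) return 1;
    int x = leftSize(n);
    return binom(n - 1, x) * H(x) * H(n - 1 - x);
}
```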