Asymptotically Efficient Algorithm for Sorting Binary Arrays and Lomuto Quicksort Analysis (Exams of Data Structures and Algorithms)

An asymptotically efficient algorithm for sorting a binary array A of size n, achieving the best possible asymptotic running time, together with an argument that the Lomuto Quicksort algorithm has an average running time of Θ(n^2) on binary arrays of size n, despite its average running time of Θ(n log n) in the general case.

Typology: Exams, academic year 2012/2013, uploaded on 04/07/2013 by seshu_lin3

CSE3358 Problem Set 4 Solution

Problem 1: S0rt1ng

(a) (10 points) Consider an array A of n numbers each of which is either 0 or 1. We will refer to such an array as binary. Describe an asymptotically efficient algorithm for sorting a binary array A of size n. For full credit you should:

  • Describe your algorithm in English
  • Provide a clear, neat, and complete pseudocode for your algorithm similar to the way we do it in class
  • Analyze correctly the running time of your algorithm and express the running time in either a Θ or an O notation.
  • Achieve a running time that is the best possible asymptotically

ANSWER: Since the array contains only zeros and ones, we can count the number of ones by scanning the array and adding up the entries. If we find c ones, we can set A[1]...A[n − c] to 0 and A[n − c + 1]...A[n] to 1. Note that this would not work if the array contains objects with keys in {0, 1}, since we overwrite keys rather than move the objects around.

Here’s pseudocode for the algorithm.

c ← 0
for i ← 1 to n
    do c ← c + A[i]
for i ← 1 to n − c
    do A[i] ← 0
for i ← n − c + 1 to n
    do A[i] ← 1

The running time of this algorithm is dominated by the three for loops. Each loop iteration performs a constant amount of work. Therefore, the running time is Θ(n) + Θ(n − c) + Θ(c) = Θ(2n) = Θ(n). This is the best asymptotically achievable bound, since any sorting algorithm has to read the entire input A[1]...A[n], which takes Ω(n) time.
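Translated to C++, the same two-scan scheme looks as follows (a minimal sketch; the name `binary_sort` and the raw-pointer interface are ours, and indexing is 0-based rather than 1-based):

```cpp
#include <cassert>
#include <cstddef>

// Sort a 0/1 array in place: count the ones in a first scan, then
// overwrite the array with n - c zeros followed by c ones.
void binary_sort(int *A, std::size_t n) {
    std::size_t c = 0;                       // number of ones found
    for (std::size_t i = 0; i < n; i++)
        c += A[i];
    for (std::size_t i = 0; i < n - c; i++)  // first n - c slots become 0
        A[i] = 0;
    for (std::size_t i = n - c; i < n; i++)  // last c slots become 1
        A[i] = 1;
}
```

As in the pseudocode, this makes one read pass and one write pass over the array, so the total work is Θ(n).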

(b) (10 points) We define a Lomuto Quicksort algorithm as a Quicksort algorithm that (recursively) partitions an array into a left sub-array with all elements ≤ Pivot, and a right sub-array with all elements > Pivot.

Argue (your argument must be very clear, organized, and convincing, if not a formal proof) that any Lomuto Quicksort algorithm will have an average running time of Θ(n^2) on binary arrays of size n (i.e., even when the pivot is chosen randomly). Explain why this does not contradict the result that QUICKSORT has an average running time of Θ(n log n).

ANSWER: The first key observation is that if the pivot is x = 1, then the right partition is empty, because no element is greater than 1. Consider the first time t at which a pivot x = 0 is chosen. The 1st, 2nd, ..., t-th partitions are then all left partitions, each containing one element fewer than the previous one. Since the partitioning algorithm takes linear time, we spend ∑_{i=0}^{t} Θ(n − i) time on those partitions.

The t-th partition, with pivot x = 0, produces a left partition L of all zeros and a right partition R of all ones. The second key observation is that L always produces empty right partitions, because all elements in L are ≤ any pivot x = 0; similarly, R always produces empty right partitions, because all elements in R are ≤ any pivot x = 1. Pick the larger of L and R; it has size at least ⌈(n − t)/2⌉ = n′. Since the partitioning algorithm takes linear time, we spend ∑_{i=0}^{n′} Ω(n′ − i) time on it.

The total running time is therefore ∑_{i=0}^{t} Θ(n − i) + ∑_{i=0}^{n′} Ω(n′ − i), where n′ = ⌈(n − t)/2⌉. Hence T(n) = Ω(∑_{i=0}^{t} (n − i)) and T(n) = Ω(∑_{i=0}^{n′} (n′ − i)). If t ≥ n/2, then T(n) = Ω(∑_{i=0}^{⌈n/2⌉} (n − i)) = Ω(n^2). If t ≤ n/2, then n′ ≥ n/4, and therefore T(n) = Ω(∑_{i=0}^{n′} (n′ − i)) = Ω(n′^2) = Ω(n^2). Since also T(n) = O(n^2), we conclude that T(n) = Θ(n^2).

This does not contradict the Θ(n log n) average-case result for QUICKSORT: that analysis averages over orderings of n distinct keys, whereas binary arrays form a different family of inputs with many equal keys, and the average over that family can be asymptotically worse.
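The quadratic behavior is easy to observe directly. The sketch below instruments the textbook Lomuto partition (last element as pivot, recursing on A[p..q−1] and A[q+1..r], a slightly different convention from the one defined above; the counter and function names are ours). On an all-ones binary array every element is ≤ the pivot, so every right partition is empty, the subproblem shrinks by one element per call, and exactly n(n−1)/2 comparisons are performed:

```cpp
#include <cassert>
#include <utility>

static long comparisons = 0;   // element-vs-pivot comparisons performed

// Textbook Lomuto partition with A[r] as the pivot.
int lomuto_partition(int *A, int p, int r) {
    int x = A[r];
    int i = p - 1;
    for (int j = p; j < r; j++) {
        comparisons++;
        if (A[j] <= x) { i++; std::swap(A[i], A[j]); }
    }
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

void lomuto_quicksort(int *A, int p, int r) {
    if (p < r) {
        int q = lomuto_partition(A, p, r);
        lomuto_quicksort(A, p, q - 1);   // elements <= pivot
        lomuto_quicksort(A, q + 1, r);   // elements > pivot (empty here)
    }
}
```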

Here’s an example when the pivot is the last element.

[Figure: trace of the partitions on the binary array 0 0 0 0 0 0 0 1 1 1 1. The 1st through t-th partitions each pick pivot 1, leave the right side empty, and shrink the left side by one element; the t-th partition picks pivot 0 and splits the array into an all-zeros left part and an all-ones right part, after which every subsequent right partition is empty.]

struct Element {
    Element * prev;
    Element * next;
    void * data;    // anything you need for the specific application
};

ANSWER:

struct Element {
    Element * prev;
    Element * next;
    void * data;
};

struct List {
    Element * head;
    List() { head = NULL; }
};

void insert(List * L, Element * x) {
    x->next = L->head;
    if (L->head != NULL)
        L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

void del(List * L, Element * x) {
    if (x->prev != NULL)
        x->prev->next = x->next;
    else
        L->head = x->next;
    if (x->next != NULL)
        x->next->prev = x->prev;
}
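A quick exercise of insert and del (the structures are restated here so the snippet compiles on its own):

```cpp
#include <cassert>
#include <cstddef>

// Restatement of the answer's doubly linked list.
struct Element {
    Element * prev;
    Element * next;
    void * data;
};

struct List {
    Element * head;
    List() { head = NULL; }
};

// Head insertion: the most recently inserted Element is always L->head.
void insert(List * L, Element * x) {
    x->next = L->head;
    if (L->head != NULL) L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

// Unlink x in O(1) without searching, fixing the head when needed.
void del(List * L, Element * x) {
    if (x->prev != NULL) x->prev->next = x->next;
    else                 L->head = x->next;
    if (x->next != NULL) x->next->prev = x->prev;
}
```

Insertion is at the head, so deleting the head immediately after an insert restores the previous list.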

(c) (5 points) Using the linked list of part (b), implement a stack that keeps track of its depth. The stack should provide the following functionality:

  • s.push(x): pushes an Element x onto the stack
  • s.pop(): pops an Element off the stack and returns it
  • s.depth(): returns the number of Elements in the stack
  • s.max(): returns the maximum depth ever attained by the stack

ANSWER:

class Stack {
    List * L;
    int current_depth;
    int max_depth;

public:

    Stack() {
        L = new List();
        current_depth = 0;
        max_depth = 0;
    }

    void push(Element * x) {
        current_depth++;
        if (current_depth > max_depth)
            max_depth = current_depth;
        insert(L, x);
    }

    Element * pop() {
        if (current_depth > 0) {
            Element * temp = L->head;
            del(L, L->head);
            current_depth--;
            return temp;
        }
        return NULL;
    }

    int depth() { return current_depth; }

    int max() { return max_depth; }
};
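A short check of the depth bookkeeping (list and stack restated so the snippet stands alone):

```cpp
#include <cassert>
#include <cstddef>

// List from part (b), restated.
struct Element {
    Element * prev;
    Element * next;
    void * data;
};

struct List {
    Element * head;
    List() { head = NULL; }
};

void insert(List * L, Element * x) {
    x->next = L->head;
    if (L->head != NULL) L->head->prev = x;
    L->head = x;
    x->prev = NULL;
}

void del(List * L, Element * x) {
    if (x->prev != NULL) x->prev->next = x->next;
    else                 L->head = x->next;
    if (x->next != NULL) x->next->prev = x->prev;
}

// Stack from part (c): depth() is the current size, max() the high-water mark.
class Stack {
    List * L;
    int current_depth;
    int max_depth;
public:
    Stack() { L = new List(); current_depth = 0; max_depth = 0; }
    void push(Element * x) {
        current_depth++;
        if (current_depth > max_depth) max_depth = current_depth;
        insert(L, x);
    }
    Element * pop() {
        if (current_depth > 0) {
            Element * temp = L->head;
            del(L, L->head);
            current_depth--;
            return temp;
        }
        return NULL;
    }
    int depth() { return current_depth; }
    int max() { return max_depth; }
};
```

Note that max() never decreases: popping reduces depth() but leaves the high-water mark intact.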

    t->r = r;
    e->data = t;
    s.push(e);
    e = new Element();
    t = new Tuple();
    t->p = p;
    t->r = q;
    e->data = t;
    s.push(e);
} } }

Obtain the maximum stack depth for several examples including:

  • large sorted array
  • large array with all elements equal
  • large reverse sorted array

and report the maximum depth of the stack (as returned by s.max()) as well as the running time in Θ notation.

ANSWER:

                                    max. stack depth    running time
    sorted array                    Θ(1)                Θ(n^2)
    array with all elements equal   Θ(log n)            Θ(n log n)
    reverse sorted array            Θ(n)                Θ(n^2)

(e) (5 points) Can you see that the depth of the stack is Θ(n) in the worst case? Which scenarios lead to such a worst case?

ANSWER: The case of a reverse sorted array. If the order of the two PUSH operations in the code above is changed, the table is mirrored, so the worst case in terms of stack depth becomes the case of a sorted array.

The stack depth that you explicitly observed in part (d) is the stack depth that the compiler uses for recursion. A deep stack means more internal stack operations and more memory usage. It is possible to modify the QUICKSORT code to make the stack depth always Θ(log n) even when the worst case occurs, without affecting its average-case performance of Θ(n log n). We can do this by first reducing the number of recursive calls using a technique called tail recursion: when the recursive call is the last instruction in the function, it can be eliminated. Here’s how:

QUICKSORT(A, p, r)
    while p < r
        do q ← PARTITION(A, p, r)
           QUICKSORT(A, p, q)
           p ← q + 1

(f) (5 points) Argue that this new code for QUICKSORT still correctly sorts the array A.

ANSWER: When the first recursive call returns, the values of p, r, and q that held before the call was made are restored. Since p is then set to q + 1, the while loop guarantees that QUICKSORT executes with arguments q + 1 and r, which is identical to making a second recursive call with these parameters.
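The claim is easy to check in code. This sketch uses the standard Lomuto partition (so the recursive call covers A[p..q−1] rather than A[p..q], a slightly different convention from the pseudocode); the while loop plays the role of the eliminated second call:

```cpp
#include <cassert>
#include <utility>

// Standard Lomuto partition, last element as pivot.
int partition(int *A, int p, int r) {
    int x = A[r], i = p - 1;
    for (int j = p; j < r; j++)
        if (A[j] <= x) { i++; std::swap(A[i], A[j]); }
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

// Tail-recursion-eliminated QUICKSORT: after the recursive call sorts the
// left part, p = q + 1 makes the next loop iteration sort the right part,
// exactly as a second recursive call QUICKSORT(A, q + 1, r) would.
void quicksort(int *A, int p, int r) {
    while (p < r) {
        int q = partition(A, p, r);
        quicksort(A, p, q - 1);
        p = q + 1;
    }
}
```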

(g) (5 points) Describe a scenario where the depth of the internal stack is still Θ(n) even with the above modification.

ANSWER: The stack depth will be Θ(n) if there are Θ(n) nested recursive calls to QUICKSORT (and these now have to be on left partitions). If the array is sorted, the left partitions will always be one-element partitions (try it), so recursion stops immediately and we never go beyond one level of recursion. However, if the array is reverse sorted, we will have Θ(n) recursive calls on left partitions.

QUICKSORT(A, p, r)
  QUICKSORT(A, p, r-1)
    QUICKSORT(A, p+1, r-2)
      QUICKSORT(A, p+2, r-3)
        ...

Therefore, we have almost n/2 recursive calls. To see this, try QUICKSORT on a small array, say A = [8, 7, 6, 5, 4, 3, 2, 1].

(h) (5 points) How can you further modify the code to make the stack depth Θ(log n) without affecting the average case performance of QUICKSORT? Hint: since we use recursion on one part of the array only, why not use it on the smaller part?

ANSWER: The problem demonstrated by the above scenario is that each invocation of QUICKSORT calls QUICKSORT again with almost the same range. To avoid such behavior, we must change QUICKSORT so that the recursive call is on a smaller interval of the array. The following variation of QUICKSORT checks which of the two subarrays returned from PARTITION is smaller and recurses on the smaller subarray, which is at most half the size of the current array. Since the array size is reduced by at least half on each recursive call, the number of recursive calls, and hence the stack depth, is Θ(log n) in the worst case. The expected running time is not affected, because exactly the same work is done as before: the same partitions are produced, and the same subarrays are sorted.

QUICKSORT(A, p, r)
    while p < r
        do q ← PARTITION(A, p, r)
           if q − p + 1 < r − q
              then QUICKSORT(A, p, q)
                   p ← q + 1
              else QUICKSORT(A, q + 1, r)
                   r ← q
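The same idea, stated for the standard Lomuto partition (recursing on A[p..q−1] / A[q+1..r]) and instrumented to record the nesting depth; the depth counters are ours. Since each recursive call handles the smaller side, the depth stays logarithmic even on inputs whose running time is Θ(n^2):

```cpp
#include <cassert>
#include <utility>

static int depth = 0, max_depth = 0;   // recursion-depth bookkeeping

// Lomuto partition, last element as pivot.
int partition(int *A, int p, int r) {
    int x = A[r], i = p - 1;
    for (int j = p; j < r; j++)
        if (A[j] <= x) { i++; std::swap(A[i], A[j]); }
    std::swap(A[i + 1], A[r]);
    return i + 1;
}

// Recurse only on the smaller side and loop on the larger one; each nested
// call therefore handles fewer than half of the current elements.
void quicksort(int *A, int p, int r) {
    depth++;
    if (depth > max_depth) max_depth = depth;
    while (p < r) {
        int q = partition(A, p, r);
        if (q - p < r - q) {            // left side is smaller
            quicksort(A, p, q - 1);
            p = q + 1;
        } else {                        // right side is smaller (or equal)
            quicksort(A, q + 1, r);
            r = q - 1;
        }
    }
    depth--;
}
```

A call at nesting depth d handles fewer than n/2^(d−1) elements, so the depth never exceeds log2(n) + 1.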

Another way of building a heap (in place) is to sort A using any in-place sorting algorithm such as Insertion sort (Θ(n^2) time) or Quicksort (Θ(n log n) time on average).

(b) (5 points) Does your algorithm produce the same result as the one based on repeated HEAPIFY? If yes, provide an argument. If no, provide a counter example.

ANSWER: Not necessarily. Consider the array A = [1, 2, 3]. BUILD-HEAP with HEAPIFY will produce A = [3, 2, 1]. BUILD-HEAP with inserts will produce A = [3, 1, 2].
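The two builds can be compared directly (a sketch with 0-based indexing; the function names are ours):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Sift A[i] down in the max-heap A[0..n-1] (0-based children 2i+1, 2i+2).
void max_heapify(std::vector<int>& A, int n, int i) {
    int largest = i, l = 2 * i + 1, r = 2 * i + 2;
    if (l < n && A[l] > A[largest]) largest = l;
    if (r < n && A[r] > A[largest]) largest = r;
    if (largest != i) {
        std::swap(A[i], A[largest]);
        max_heapify(A, n, largest);
    }
}

// BUILD-HEAP by repeated HEAPIFY, bottom-up over the internal nodes.
std::vector<int> build_by_heapify(std::vector<int> A) {
    for (int i = (int)A.size() / 2 - 1; i >= 0; i--)
        max_heapify(A, (int)A.size(), i);
    return A;
}

// BUILD-HEAP by repeated inserts: append, then sift the new element up.
std::vector<int> build_by_insert(const std::vector<int>& in) {
    std::vector<int> A;
    for (int x : in) {
        A.push_back(x);
        int i = (int)A.size() - 1;
        while (i > 0 && A[(i - 1) / 2] < A[i]) {
            std::swap(A[(i - 1) / 2], A[i]);
            i = (i - 1) / 2;
        }
    }
    return A;
}
```

Both outputs are valid max-heaps of {1, 2, 3}, but they are different arrays, which is exactly the counterexample.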

(c) (5 points) Given n distinct elements, how many different heaps of these elements can you build? How does this number change with n asymptotically?

ANSWER: Let H(n) be the number of heaps of n distinct elements. Given a heap of n elements, the root is always the maximum element, so we have to consider arrangements of the remaining n − 1 elements. Let x be the number of elements in the root’s left subtree. The left subtree is itself a heap of x elements; similarly, the right subtree is a heap of n − 1 − x elements. Since the heap is a nearly complete binary tree, x can always be determined from n, and for a given n, x is fixed. The x elements in the left subtree can be any x elements among the n − 1 elements. Therefore, we can write the expression of H(n) as follows:

H(n) = C(n − 1, x) · H(x) · H(n − 1 − x)

where C(n − 1, x) denotes the binomial coefficient "n − 1 choose x".

Let’s not worry about how to compute x and focus on complete heaps only. In a complete heap, x = (n − 1)/2 = ⌊n/2⌋. Therefore, for n = 2^p − 1 for some p ≥ 0, we have the following formula:

H(n) = C(n − 1, ⌊n/2⌋) · H^2(⌊n/2⌋)

Using the expression C(n − 1, ⌊n/2⌋) = (n − 1)! / (⌊n/2⌋! ⌊n/2⌋!) and Stirling’s approximation n! = Θ(n^{n + 1/2} e^{−n}), we get C(n − 1, ⌊n/2⌋) = Θ(2^n / √n). Let h(n) = log H(n). Then (ignoring floors)

h(n) = log C(n − 1, ⌊n/2⌋) + 2h(n/2) = log Θ(2^n / √n) + 2h(n/2) = Θ(n) + 2h(n/2)

Using the Master theorem, h(n) = Θ(n log n). Therefore, H(n) = 2^{h(n)} = n^{Θ(n)}. Here’s a table showing the number of heaps for n = 0, 1, ..., 20.

n #heaps
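The recurrence can also be evaluated numerically to regenerate the table’s first entries. The sketch below computes x from the shape of a nearly complete binary tree (full levels plus a clamped last level); the function names are ours, and integer overflow limits it to small n:

```cpp
#include <cassert>

// Size of the root's left subtree in a nearly complete binary tree of n nodes.
long long left_size(long long n) {
    if (n <= 1) return 0;
    long long h = 0;
    while ((1LL << (h + 1)) <= n) h++;     // h = floor(log2 n)
    long long m = n - ((1LL << h) - 1);    // nodes on the last level
    long long half = 1LL << (h - 1);       // last-level capacity of the left subtree
    return half - 1 + (m < half ? m : half);
}

// Binomial coefficient C(n, k); the division is exact at every step.
long long binom(long long n, long long k) {
    long long r = 1;
    for (long long i = 1; i <= k; i++)
        r = r * (n - k + i) / i;
    return r;
}

// H(n) = C(n-1, x) * H(x) * H(n-1-x), with x = left_size(n).
long long H(long long n) {
    if (n <= 1) return 1;
    long long x = left_size(n);
    return binom(n - 1, x) * H(x) * H(n - 1 - x);
}
```

For example, H(7) = C(6, 3) · H(3)^2 = 20 · 4 = 80, matching the hand computation for a complete heap of 7 elements.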