CSE3358 Problem Set 6: K-Way Merging, Sorting, and Hashing, Exams of Data Structures and Algorithms

Problem set 6 from a computer science engineering course focusing on algorithms and data structures. The problems cover various topics including k-way merging using a heap data structure, sorting unsorted lists, counting inversions, sorting almost sorted arrays, heap height, heap sort, worst-case bucketsort, open addressing with uniform hashing, and constructing a universal family of hash functions.

Typology: Exams

2012/2013

Uploaded on 04/07/2013

seshu_lin3

CSE3358 Problem Set 6
Practice Problems
02/22/05
Due 02/25/05
Decision trees and lower bounds


Problem 1: Merging k sorted lists

We have so far considered a number of variations of this problem. Through our study of merge sort, we saw how to merge two sorted lists containing a total of n elements in Θ(n) time. Moreover, Problem Set 2 introduced a modification to merge sort that calls insertion sort on small inputs (≤ k in size). This modification of merge sort can be viewed essentially as merging n/k sorted lists of size k each. We have seen how to do this in Θ(n log(n/k)) time.

Now we will revisit the problem in a generalized form: we consider k sorted lists containing a total of n elements, and we need to merge them into one sorted list. One could generalize two-way merging to k-way merging in the following way: We start with an empty list which will eventually contain all n elements in order. We keep k pointers, one per list. Each pointer initially points to the first element of the corresponding list. By comparing all k pointed-to elements, the smallest among them is chosen and placed in the big list, and the corresponding pointer is incremented. This procedure is repeated until all elements are taken. The drawback of this approach is that determining the smallest among k elements takes O(k) time, leading to an O(nk) running time for the merge. A better way is to perform 2-way merging of pairs of lists in a tree-like fashion, as we did previously for the modified merge sort. Here’s another way:

(a) Using a heap data structure, describe an O(n log k) time algorithm for performing k-way merging of k lists containing a total of n elements.

(b) Show that any algorithm for merging k sorted lists containing a total of n elements that uses those elements only in comparisons has to run in Ω(n log k) time. To do this, count all possible interleavings of k sorted lists and use a decision tree argument similar to the one we used to obtain the sorting lower bound.

Note: Merge sort is nothing but merging n lists of size 1 each.
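
The k-pointer procedure described in Problem 1 (the O(nk) baseline, not the heap-based algorithm asked for in part (a)) can be sketched in Python as follows. The function name `naive_k_way_merge` and the representation of the input as a list of Python lists are assumptions made for illustration only.

```python
def naive_k_way_merge(lists):
    """Merge k sorted lists by scanning all k head elements in every round."""
    pointers = [0] * len(lists)              # one pointer per list
    total = sum(len(lst) for lst in lists)   # n, the total number of elements
    merged = []
    while len(merged) < total:
        best = None                          # index of the list with the smallest head
        for j, lst in enumerate(lists):      # O(k) scan over the current heads
            if pointers[j] < len(lst):
                if best is None or lst[pointers[j]] < lists[best][pointers[best]]:
                    best = j
        merged.append(lists[best][pointers[best]])
        pointers[best] += 1                  # advance the pointer of the chosen list
    return merged
```

Each of the n output elements costs one scan over at most k heads, which is where the O(nk) bound in the text comes from.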

Problem 2: Yet another variation

Consider k lists that are not necessarily sorted, containing a total of n elements. In this variation, all the elements in the first list are less than or equal to all the elements in the second list, and so on. One possible method for sorting the elements is to sort the individual lists independently and then concatenate the sorted results. This will take $\Theta(\sum_i n_i \log n_i)$ time, where $n_i$ is the number of elements in list i.

Algorithm OBVIOUS
    for i ← 1 to k
        do sort list i        ▷ for example, using merge sort
    concatenate the lists

Although the above algorithm is OBVIOUS, nothing better can be done. Regardless of what algorithm we use for this variation of the sorting problem, show that $\Omega(\sum_i n_i \log n_i)$ time is needed. Note that it is not rigorous to sum the individual lower bounds because an algorithm does not necessarily work like the OBVIOUS algorithm.

Note: If all lists have the same size, this bound will be Ω(n log(n/k)).
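
For concreteness, Algorithm OBVIOUS from Problem 2 might look like this in Python. This is only a sketch: the name `obvious_sort` is hypothetical, and Python's built-in `sorted` (Timsort) stands in for the merge sort mentioned in the pseudocode. It assumes the precondition of the problem, namely that every element of list i is at most every element of list i + 1.

```python
def obvious_sort(lists):
    """Sort each list independently, then concatenate the results in order.

    Assumes every element of lists[i] is <= every element of lists[i + 1].
    """
    result = []
    for lst in lists:
        # sorted() stands in for merge sort; any O(n_i log n_i) sort would do
        result.extend(sorted(lst))
    return result
```

For example, `obvious_sort([[3, 1, 2], [5, 4], [9, 7, 8]])` sorts three blocks of sizes 3, 2 and 3 and joins them.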

Sorting

Problem 3: Counting inversions

Given an array A, we define (i, j) to be an inversion of A if i < j and A[i] > A[j]. Therefore, a sorted array has 0 inversions.

(a) Find the inversions of A = [2, 3, 8, 6, 1].

(b) What is the maximum number of inversions that an array of length n can have? Which arrays achieve that maximum?

(c) Show that the running time of insertion sort on an array A[1..n] is Θ(n + I) where I is the number of inversions of A.
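
As an empirical illustration of the relationship in part (c), not a proof, the sketch below counts inversions by brute force and counts the element shifts performed by insertion sort. Both function names are hypothetical, and Python lists are 0-indexed, whereas the problem statement uses A[1..n].

```python
def count_inversions_brute_force(a):
    """Count pairs (i, j) with i < j and a[i] > a[j] by checking every pair."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def insertion_sort_with_shift_count(a):
    """Return a sorted copy of a and the number of element shifts performed."""
    a = list(a)                      # work on a copy, leave the input untouched
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]          # shift a larger element one slot right
            i -= 1
            shifts += 1
        a[i + 1] = key
    return a, shifts
```

Running both functions on the same input (for instance `[4, 1, 3, 2]`) and comparing the shift count with the inversion count is a quick sanity check before attempting the proof.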

Problem 4: Almost sorted

Consider an array A[1..n] that, for some positive constant k, satisfies the following condition for every index i:

|A[i] − i| ≤ k

(a) Give an algorithm for sorting array A that satisfies the following properties:

  • the running time of the algorithm is Θ(n)
  • the algorithm is in place
  • the algorithm is stable

Note: If the elements are integers, counting sort would run in linear time because A[i] ≤ i + k ≤ n + k = O(n). However, counting sort is not in place. Therefore, counting sort cannot be used.

(b) Show that any algorithm to sort A has to run in Ω(n log k) time.
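
To make the condition of Problem 4 concrete, here is a small Python check. The name `is_almost_sorted` is hypothetical, the elements are assumed to be numbers, and the Python list `a` is assumed to store A[1..n] with `a[0]` holding A[1].

```python
def is_almost_sorted(a, k):
    """Return True if |A[i] - i| <= k for every 1-based index i."""
    # enumerate(..., start=1) supplies the 1-based index used in the problem
    return all(abs(value - index) <= k for index, value in enumerate(a, start=1))
```

With k = 1, the array `[2, 1, 4, 3]` satisfies the condition, while `[4, 1, 2, 3]` does not, because its first element is 3 away from its index.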

Heaps

Problem 5: Height of a heap

We have been using the fact that the height of a heap of n elements is Θ(log n). This is true because a heap is a nearly complete binary tree (i.e. all levels are filled except possibly the last one, which is filled from left to right). Prove this fact by finding both lower and upper bounds on the number of nodes in a heap of height h.
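
The height of a heap can also be explored numerically before proving the bounds. The sketch below assumes the usual array layout A[1..n] in which the left child of node i is node 2i (an assumption, since the layout is not stated in this problem), and measures the height as the number of edges on the path from the root to the leftmost leaf. The name `heap_height` is hypothetical.

```python
def heap_height(n):
    """Height of an n-node heap: edges on the path from the root to the leftmost leaf."""
    if n < 1:
        raise ValueError("a heap needs at least one node")
    height = 0
    node = 1                    # the root sits at index 1
    while 2 * node <= n:        # follow left children while they exist
        node = 2 * node
        height += 1
    return height
```

Because the last level is filled from left to right, the leftmost path is a longest root-to-leaf path. Tabulating `heap_height(n)` for a range of n shows where the height changes.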

(a) What is the size of H?

(b) What needs to be done to pick a hash function from H uniformly at random?

Now we want to see if H is universal. This means we need to verify whether $|S_{kl}| \le |H|/m$ for every pair of keys k and l. Let’s assume that $k = x_1 \ldots x_n$ and $l = y_1 \ldots y_n$ collide. This means

$$\sum_{i=1}^{n} a_i x_i \equiv \sum_{i=1}^{n} a_i y_i \pmod{m}, \qquad k \ne l.$$
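
The collision condition above can be checked directly in code. The definition of H is not part of this preview, so the sketch below assumes, based only on the collision equation, that each hash function is given by a coefficient vector a = (a_1, ..., a_n) and maps a key with digits x_1, ..., x_n to the sum of a_i x_i modulo m. The names `hash_value` and `collide` are hypothetical.

```python
def hash_value(a, x, m):
    """Assumed form of the hash function: (a_1*x_1 + ... + a_n*x_n) mod m."""
    if len(a) != len(x):
        raise ValueError("coefficient vector and key must have the same length")
    return sum(ai * xi for ai, xi in zip(a, x)) % m


def collide(a, x, y, m):
    """True if keys x and y hash to the same slot under coefficients a."""
    return hash_value(a, x, m) == hash_value(a, y, m)
```

For instance, with m = 7 and a = (3, 5), the keys (1, 2) and (2, 0) both hash to 6, so they collide under this particular choice of a.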

Since $k \ne l$, they must differ in at least one digit. Without loss of generality, let’s say they differ in the first digit, i.e. $x_1 \ne y_1$. Therefore, we can rewrite the above as:

$$a_1 (x_1 - y_1) \equiv \sum_{i=2}^{n} a_i (y_i - x_i) \pmod{m}, \qquad x_1 - y_1 \ne 0.$$

This is where the choice of m being prime becomes important. From number theory, if m is prime, then every integer $z \in \{1, 2, \ldots, m-1\}$ has a unique inverse $z^{-1} \in \{1, 2, \ldots, m-1\}$ such that $z z^{-1} \equiv 1 \pmod{m}$.
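
The number-theoretic fact above is easy to observe for a small prime. The sketch below uses Python's built-in three-argument `pow` with exponent −1, which computes modular inverses in Python 3.8 and later; the function name `inverses_mod_prime` is hypothetical, and the function assumes m is prime.

```python
def inverses_mod_prime(m):
    """Map each z in {1, ..., m-1} to its inverse modulo m (m assumed prime)."""
    # pow(z, -1, m) returns the unique w in {1, ..., m-1} with z*w % m == 1
    return {z: pow(z, -1, m) for z in range(1, m)}
```

For m = 7 the inverses form a permutation of {1, ..., 6}, so no two elements share an inverse. For a composite modulus such as 4, `pow(2, -1, 4)` raises `ValueError`, which illustrates why the primality of m matters.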

(c) Argue that $x_1 - y_1$ has a unique inverse $(x_1 - y_1)^{-1}$ such that $(x_1 - y_1)(x_1 - y_1)^{-1} \equiv 1 \pmod{m}$.

(d) By multiplying both sides of the equation above by $(x_1 - y_1)^{-1}$, show that $a_1$ is uniquely determined by $a_2, a_3, \ldots, a_n$.

(e) Using part (d), find the number of hash functions $|S_{kl}|$ that can cause k and l to collide, and verify that H is universal.