Order Statistics and Selection Algorithm: Finding the Kth Order Statistic, Thesis of Engineering

Covers order statistics and the selection problem, which involves finding the kth smallest element in a data set, along with the partition algorithm, its runtime analysis, and the importance of choosing a good pivot. The document also introduces the median-of-medians algorithm for finding a good pivot and analyzes its recurrence.

What you will learn

  • How does the selection problem differ from searching and sorting?
  • What is the kth order statistic in a data set?
  • How does the median-of-medians algorithm improve the worst-case bounds of the selection algorithm?
  • Why is it important to choose a good pivot in the partition algorithm?
  • What is the role of the partition algorithm in finding the kth order statistic?

Typology: Thesis

2020/2021

Uploaded on 12/28/2022

IshaShadija 🇮🇳

Divide-and-Conquer Algorithms

Part Four

Announcements

● Problem Set 2 due right now.
  ● Can submit by Monday at 2:15PM using one late period.
● Problem Set 3 out, due July 22.
  ● Play around with divide-and-conquer algorithms and recurrence relations!
  ● Covers material up through and including today's lecture.

Outline for Today

● The Selection Problem
  ● A problem halfway between searching and sorting.
● A Linear-Time Selection Algorithm
  ● A nonobvious algorithm with a nontrivial runtime.
● The Substitution Method
  ● Solving recurrences the Master Theorem can't handle.

Order Statistics

● Given a collection of data, the kth order statistic is the kth smallest value in the data set.
● For the purposes of this course, we'll use zero-indexing, so the smallest element would be given by the 0th order statistic.
● To give a robust definition: the kth order statistic is the element that would appear at position k if the data were sorted.

An Initial Solution

● Any ideas how to solve this?
● Here is one simple solution:
  ● Sort the array.
  ● Return the element at the kth position.
● Unless we know something special about the array, this will run in time O(n log n).
● Can we do better?
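
The sort-then-index solution can be sketched in a few lines of Python. The function name `select_by_sorting` and the sample data are illustrative assumptions, not part of the lecture; the sketch relies on Python's built-in `sorted`, which runs in O(n log n) time.

```python
def select_by_sorting(values, k):
    """Return the kth order statistic (zero-indexed) by sorting a copy of the data."""
    if not 0 <= k < len(values):
        raise IndexError("k must satisfy 0 <= k < len(values)")
    # sorted() returns a new list, so the caller's data is left untouched.
    return sorted(values)[k]


data = [31, 4, 15, 9, 26, 5]       # hypothetical sample input
print(select_by_sorting(data, 0))  # the smallest element
print(select_by_sorting(data, 3))  # the element at index 3 of the sorted order
```

Sorting does more work than the problem asks for: it fixes the position of every element when only one position matters.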

A Useful Subroutine: Partition

● Given an input array, a partition algorithm chooses some element p (called the pivot), then rearranges the array so that
  ● All elements less than or equal to p are before p.
  ● All elements greater than p are after p.
  ● p is in the position it would occupy if the array were sorted.
● The algorithm then returns the index of p.
● We'll talk about how to choose which element should be the pivot later; right now, assume the algorithm chooses one arbitrarily.

Partitioning an Array
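
One possible way to carry out a partition step in linear time is sketched below. This is a Lomuto-style scheme chosen for brevity, not necessarily the algorithm outlined in the course handout; the function signature and the `pivot_index` parameter are assumptions made for illustration.

```python
def partition(a, lo, hi, pivot_index):
    """Partition a[lo..hi] in place around a[pivot_index]; return the pivot's final index."""
    pivot = a[pivot_index]
    # Park the pivot at the right end so the scan below never moves it.
    a[pivot_index], a[hi] = a[hi], a[pivot_index]
    store = lo  # next slot for an element that is <= pivot
    for i in range(lo, hi):
        if a[i] <= pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    # Drop the pivot between the "<= pivot" block and the "> pivot" block.
    a[store], a[hi] = a[hi], a[store]
    return store


values = [7, 2, 9, 4, 1, 8]        # hypothetical sample input
p = partition(values, 0, len(values) - 1, 3)  # use the element 4 as the pivot
print(p, values)
```

The single pass over `a[lo..hi]` performs a constant amount of work per element, which matches the linear-time bound claimed for partitioning.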

Partitioning and Selection

● There is a close connection between partitioning and the selection problem.
● Let k be the desired index and p be the pivot index after a partition step. Then:
  ● If p = k, return A[k].
  ● If p > k, recursively select element k from the elements before the pivot.
  ● If p < k, recursively select element (k - p - 1) from the elements after the pivot.
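
The three cases above translate almost directly into a recursive routine. The following is a minimal sketch, assuming the pivot is simply the last element of the current list (an arbitrary choice) and using list slices for the two pieces; the names `partition_last` and `select` are hypothetical.

```python
def partition_last(a):
    """Partition list a in place around its last element; return the pivot's final index."""
    pivot = a[-1]
    store = 0
    for i in range(len(a) - 1):
        if a[i] <= pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[-1] = a[-1], a[store]
    return store


def select(a, k):
    """Return the kth order statistic (zero-indexed) of a."""
    if not 0 <= k < len(a):
        raise IndexError("k must satisfy 0 <= k < len(a)")
    a = list(a)                  # work on a copy; the caller's list is unchanged
    p = partition_last(a)
    if p == k:
        return a[k]
    if p > k:
        return select(a[:p], k)              # answer lies before the pivot
    return select(a[p + 1:], k - p - 1)      # answer lies after the pivot


data = [31, 4, 15, 9, 26, 5, 15]  # hypothetical sample input with a duplicate
print([select(data, k) for k in range(len(data))])
```

The index shift `k - p - 1` accounts for the `p` elements before the pivot plus the pivot itself, all of which are discarded when recursing on the right piece. With an unlucky pivot on every call, the recursion depth grows linearly in the input size, so this sketch is suited to small inputs only.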

Some Facts

● The partitioning algorithm on an array of length n can be made to run in time Θ(n).
  ● Check the Problem Set Advice handout for an outline of an algorithm to do this.
● Partitioning algorithms give no guarantee about which element is selected as the pivot.
● Each recursive call does Θ(n) work, then makes a recursive call on a smaller array.

Analyzing the Runtime

● The runtime of our algorithm depends on our choice of pivot.
● In the best case, if we pick a pivot that ends up at position k, the runtime is Θ(n).
● In the worst case, we always pick a pivot that is the minimum or maximum value in the array. The runtime is given by this recurrence:

T(1) = Θ(1)
T(n) = T(n - 1) + Θ(n)
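
To see where the worst case leads, the recurrence can be unrolled by hand. The derivation below is a sketch under a simplifying assumption: the Θ(n) term is replaced by exactly cn for some constant c > 0, and T(1) by a constant c_0 (both constants are introduced here for illustration).

```latex
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(1) + c\sum_{i=2}^{n} i \\
     &= c_0 + c\left(\frac{n(n+1)}{2} - 1\right) \\
     &= \Theta(n^2)
\end{align*}
```

So a consistently bad pivot makes selection quadratic, which is worse than simply sorting the array.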

The Story So Far

● If we have no control over the pivot in the partition step, our algorithm has runtime Ω(n) and O(n^2).
● Using heapsort, we could guarantee O(n log n) behavior.
● Can we improve our worst-case bounds?

Finding a Good Pivot

● Recall: We recurse on one of the two pieces of the array if we don't immediately find the element we want.
● A good pivot should split the array so that each piece is some constant fraction of the size of the array.
  ● (Those sizes don't have to be the same, though.)
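
A short calculation suggests why a constant-fraction split is the right target. The sketch below assumes that a good pivot is somehow available at no extra cost, that every recursive call receives at most αn elements for some constant 0 < α < 1, and that a partition step costs at most cn; the symbols α and c are introduced here for illustration.

```latex
\begin{align*}
T(n) &\le T(\alpha n) + cn \\
     &\le T(\alpha^2 n) + c\alpha n + cn \\
     &\;\;\vdots \\
     &\le \Theta(1) + cn\sum_{i=0}^{\infty} \alpha^i \\
     &= \Theta(1) + \frac{cn}{1-\alpha} \\
     &= O(n)
\end{align*}
```

The work shrinks geometrically from one level to the next, so the total is dominated by the first partition step. The remaining difficulty is the cost of finding such a pivot, which this calculation ignores.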

Analyzing the Runtime

● Our algorithm
  ● Recursively calls itself on the first 2/3 of the array.
  ● Runs a partition step.
  ● Then, either immediately terminates, or recurses in a piece of size n / 3 or a piece of size 2n / 3.
● This gives the following recurrence:

T(1) = Θ(1)
T(n) ≤ 2T(2n / 3) + Θ(n)

Analyzing the Runtime

● We have the following recurrence:

T(1) = Θ(1)
T(n) ≤ 2T(2n / 3) + Θ(n)

● Can we apply the Master Theorem?
● What are a, b, and d?
  ● a = 2, b = 3 / 2, and d = 1.
  ● Since log_{3/2} 2 > 1, the runtime is O(n^{log_{3/2} 2}) ≈ O(n