

































A comprehensive guide to searching and sorting algorithms, including linear search, binary search, hashing, bubble sort, selection sort, insertion sort, merge sort, quick sort, and radix sort. It explains the concepts behind each algorithm, provides code examples in C, and analyzes their time complexity. Suitable for students learning about data structures and algorithms.
1. Linear search, Binary search

Searching: Searching means to find whether a particular value is present in an array or not. If the value is present in the array, the search is said to be successful and the searching process gives the location of that value in the array. If the value is not present, the searching process displays an appropriate message and the search is said to be unsuccessful.

There are two popular methods for searching the elements of an array: linear search and binary search. The algorithm to use depends entirely on how the values are organized in the array. For example, if the elements are arranged in ascending order, binary search should be used, as it is more efficient for sorted lists in terms of complexity.

Linear search: Linear search, also called sequential search, is a very simple method for searching an array for a particular value. It works by comparing the value to be searched with every element of the array, one by one in sequence, until a match is found. Linear search is mostly used to search an unordered list of elements (an array in which the data elements are not sorted). For example, if an array A[] is declared and initialized as

int A[] = {10, 8, 2, 7, 3, 4, 9, 1, 6, 5};

and the value to be searched is VAL = 7, then searching means finding whether the value 7 is present in the array or not. If yes, the search returns the position of its occurrence. Here, POS = 3 (index starting from 0).
Algorithm: LINEAR_SEARCH
Input: A: list of elements, VAL: search element
Output: POS

In Steps 1 and 2 of the algorithm, we initialize the values of POS and I. In Step 3, a while loop executes as long as I is less than N (the total number of elements in the array). In Step 4, a check is made to see whether the current array element matches VAL. If a match is found, the position of the array element is printed; otherwise, I is incremented to compare the next element with VAL. If all the array elements have been compared with VAL and no match is found, then VAL is not present in the array.

Complexity of Linear Search: Linear search executes in O(n) time, where n is the number of elements in the array. The best case occurs when VAL is equal to the first element of the array; in that case, only one comparison is made. The worst case occurs when VAL is either not present in the array or equal to the last element; in both cases, n comparisons must be made. The performance of searching can, however, be improved by using a sorted array.
the second part of the directory. Again, we open some page in the middle, and the whole process is repeated until we finally find the right name.

Take another analogy. How do we find a word in a dictionary? We first open the dictionary somewhere in the middle. Then we compare the first word on that page with the desired word whose meaning we are looking for. If the desired word comes before the word on the page, we look in the first half of the dictionary; otherwise, we look in the second half. Again, we open a page in the chosen half, compare the first word on that page with the desired word, and repeat the same procedure until we finally find the word. The same mechanism is applied in binary search.

Now, let us consider how this mechanism is applied to search for a value in a sorted array. Consider an array A[] that is declared and initialized as

int A[] = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

and the value to be searched is VAL = 9. The algorithm proceeds in the following manner.

BEG = 0, END = 10, MID = (0 + 10)/2 = 5
Now, VAL = 9 and A[MID] = A[5] = 5
A[5] is less than VAL; therefore, we now search for the value in the second half of the array and change the values of BEG and MID.

BEG = MID + 1 = 6, END = 10, MID = (6 + 10)/2 = 16/2 = 8
VAL = 9 and A[MID] = A[8] = 8
A[8] is less than VAL; therefore, we now search for the value in the second half of the segment.
So, again we change the values of BEG and MID.

BEG = MID + 1 = 9, END = 10, MID = (9 + 10)/2 = 9
Now, VAL = 9 and A[MID] = 9, so the search is successful.

In this algorithm, BEG and END are the beginning and ending positions of the segment of the array that we are currently searching, and MID is calculated as (BEG + END)/2. Initially, BEG = lower_bound and END = upper_bound. The algorithm terminates when A[MID] = VAL; when it ends, we set POS = MID, the position at which the value is present in the array. However, if VAL is not equal to A[MID], the values of BEG, END, and MID are changed depending on whether VAL is smaller or greater than A[MID]:
(a) If VAL < A[MID], then VAL will be present in the left segment of the array, so END is changed to END = MID – 1.
(b) If VAL > A[MID], then VAL will be present in the right segment of the array, so BEG is changed to BEG = MID + 1.
Finally, if VAL is not present in the array, END will eventually become less than BEG. When this happens, the algorithm terminates and the search is unsuccessful.

Algorithm: BINARY_SEARCH
Input: A: list of elements, VAL: search element
Output: POS
Write a program to search an element in an array using binary search.

#include <stdio.h>
#define size 10 // makes changing the size of the array easier

int smallest(int arr[], int k, int n);   // used to sort the array
void selection_sort(int arr[], int n);   // used to sort the array

int main(int argc, char *argv[])
{
    int arr[size], num, i, n, beg, end, mid, found = 0;
    printf("\n Enter the number of elements in the array: ");
    scanf("%d", &n);                     // note: n should not exceed size
    printf("\n Enter the elements: ");
    for(i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    selection_sort(arr, n);              // sort the array first
    printf("\n The sorted array is: \n");
    for(i = 0; i < n; i++)
        printf(" %d\t", arr[i]);
    printf("\n\n Enter the number that has to be searched: ");
    scanf("%d", &num);
    beg = 0, end = n - 1;
    while(beg <= end)
    {
        mid = (beg + end) / 2;
        if(arr[mid] == num)
        {
            printf("\n %d is present in the array at position %d", num, mid + 1);
            found = 1;
            break;
        }
        else if(arr[mid] > num)
            end = mid - 1;
        else
            beg = mid + 1;
    }
    if(beg > end && found == 0)
        printf("\n %d does not exist in the array", num);
    return 0;
}

int smallest(int arr[], int k, int n)
{
    int pos = k, small = arr[k], i;
    for(i = k + 1; i < n; i++)
    {
        if(arr[i] < small)
        {
            small = arr[i];
            pos = i;
        }
    }
    return pos;
}

void selection_sort(int arr[], int n)
{
    int k, pos, temp;
    for(k = 0; k < n; k++)
    {
        pos = smallest(arr, k, n);
        temp = arr[k];
        arr[k] = arr[pos];
        arr[pos] = temp;
    }
}

2. Hashing – Definition, hash functions, Collision:

The elements are not stored according to the value of the key. So in this case, we need a way to convert a five-digit key number to a two-digit array index; we need a function that will do the transformation. We will use the term hash table for the array, and the function that carries out the transformation is called a hash function.

HASH TABLES: A hash table is a data structure in which keys are mapped to array positions by a hash function. In the example discussed here, we use a hash function that extracts the last two digits of the key, thereby mapping the keys to array locations or array indices. A value stored in a hash table can be searched in O(1) time by using a hash function which generates an address from the key (by producing the index of the array where the value is stored). Figure 15.3 shows a direct correspondence between the keys and the indices of the array. This concept is useful when the total universe of keys is small and when most of the keys are actually used from the whole set of keys. This is equivalent to our first example, where there are 100 keys
HASH FUNCTIONS: A hash function is a mathematical formula which, when applied to a key, produces an integer which can be used as an index for the key in the hash table. The main aim of a hash function is to distribute the elements relatively randomly and uniformly. It should produce a unique set of integers within some suitable range in order to reduce the number of collisions. In practice, no hash function eliminates collisions completely; a good hash function can only minimize the number of collisions by spreading the elements uniformly throughout the array. In this section, we will discuss popular hash functions which help to minimize collisions. But first, let us look at the properties of a good hash function.

Properties of a Good Hash Function
Low cost: The cost of executing a hash function must be small, so that using the hashing technique is preferable over other approaches. For example, if a binary search can locate an element in a sorted table of n items with log2 n key comparisons, then the hash function must cost less than performing log2 n key comparisons.
Determinism: A hash procedure must be deterministic; that is, the same hash value must always be generated for a given input value. This criterion excludes hash functions that depend on external variable parameters (such as the time of day) or on the memory address of the object being hashed (because the address of the object may change during processing).
Uniformity: A good hash function must map the keys as evenly as possible over its output range; that is, the probability of generating every hash value in the output range should be roughly the same. Uniformity also minimizes the number of collisions.

DIFFERENT HASH FUNCTIONS: In this section, we discuss hash functions which use numeric keys. However, real-world applications may have alphanumeric keys rather than simple numeric keys.
In such cases, the ASCII values of the characters can be used to transform the key into its equivalent numeric form. Once this transformation is done, any of the hash functions given below can be applied to generate the hash value.

1. Division Method: This is the simplest method of hashing an integer x. It divides x by M and uses the remainder obtained. The hash function can be given as

h(x) = x mod M

The division method is quite good for just about any value of M, and since it requires only a single division operation, it works very fast. However, extra care should be taken to select a suitable value for M. For example, suppose M is an even number; then h(x) is even if x is even and odd if x is odd. If all possible keys are equi-probable, this is not a problem, but if even keys are more likely than odd keys, the division method will not spread the hashed values uniformly. It is generally best to choose M to be a prime number, because this increases the likelihood that the keys are mapped uniformly over the output range of values. M should also not be too close to an exact power of 2: if h(x) = x mod 2^k, then the function simply extracts the lowest k bits of the binary representation of x. The division method is extremely simple to implement, as the following code segment illustrates:

int const M = 97; // a prime number
int h(int x)
{
    return (x % M);
}

A potential drawback of the division method is that consecutive keys map to consecutive hash values. On one hand, this is good because it ensures that consecutive keys do not collide; on the other hand, it means that consecutive array locations will be occupied, which may lead to degradation in performance.

2. Multiplication Method: The steps involved in the multiplication method are as follows:
Step 1: Choose a constant A such that 0 < A < 1.
Step 2: Multiply the key k by A.
Step 3: Extract the fractional part of kA.
Step 4: Multiply the result of Step 3 by the size of the hash table (m).
4. Folding Method: The folding method works in the following two steps:
Step 1: Divide the key value into a number of parts; that is, divide k into parts k1, k2, ..., kn, where each part has the same number of digits except the last part, which may have fewer digits than the other parts.
Step 2: Add the individual parts; that is, obtain the sum k1 + k2 + ... + kn. The hash value is produced by ignoring the last carry, if any.

Note that the number of digits in each part of the key depends on the size of the hash table. For example, if the hash table has a size of 1000, then there are 1000 locations in the hash table. To address these 1000 locations, we need at least three digits; therefore, each part of the key must have three digits except the last part, which may have fewer digits.

COLLISIONS: As discussed earlier in this chapter, collisions occur when the hash function maps two different keys to the same location. Obviously, two records cannot be stored in the same location. Therefore, a method to solve the problem of collisions, also called a collision resolution technique, is applied. The two most popular methods of resolving collisions are:
The presence of a sentinel value indicates that the location contains no data value at present but can be used to hold a value. When a key is mapped to a particular memory location, the value it holds is checked. If it contains the sentinel value, the location is free and the data value can be stored in it. However, if the location already has some data value stored in it, other slots are examined systematically in the forward direction to find a free slot. If not even a single free location is found, we have an OVERFLOW condition. The process of examining memory locations in the hash table is called probing. The open addressing technique can be implemented using linear probing, quadratic probing, double hashing, and rehashing.

Collision Resolution by Chaining: In chaining, each location in the hash table stores a pointer to a linked list that contains all the key values that were hashed to that location. That is, location l in the hash table points to the head of the linked list of all the key values that hashed to l; if no key value hashes to l, then location l contains NULL. Figure 15.5 shows how the key values are mapped to a location in the hash table and stored in a linked list that corresponds to that location.

APPLICATIONS OF HASHING: Hash tables are widely used in situations where enormous amounts of data have to be accessed to quickly search and retrieve information. A few typical examples where hashing is used are given here. Hashing is used for database indexing. Some database management systems store a separate file known as the index file. When data has to be retrieved from a file, the key information is first searched in the appropriate index file, which references the exact record location of the data in the database file. This key information in the index file is often stored as a hashed value.
(c) In Pass 3, A[0] and A[1] are compared, then A[1] is compared with A[2], A[2] with A[3], and so on. Finally, A[N–4] is compared with A[N–3]. Pass 3 involves n–3 comparisons and places the third biggest element at the third highest index of the array.
(d) In Pass n–1, A[0] and A[1] are compared so that A[0] < A[1]. After this step, all the elements of the array are arranged in ascending order.

Example: To discuss bubble sort in detail, let us consider an array A[] that has the following elements:
A[] = {30, 52, 29, 87, 63, 27, 19, 54}
Pass 1:
(a) Compare 30 and 52. Since 30 < 52, no swapping is done.
(b) Compare 52 and 29. Since 52 > 29, swapping is done: 30, 29, 52, 87, 63, 27, 19, 54
(c) Compare 52 and 87. Since 52 < 87, no swapping is done.
(d) Compare 87 and 63. Since 87 > 63, swapping is done: 30, 29, 52, 63, 87, 27, 19, 54
(e) Compare 87 and 27. Since 87 > 27, swapping is done: 30, 29, 52, 63, 27, 87, 19, 54
(f) Compare 87 and 19. Since 87 > 19, swapping is done: 30, 29, 52, 63, 27, 19, 87, 54
(g) Compare 87 and 54. Since 87 > 54, swapping is done: 30, 29, 52, 63, 27, 19, 54, 87
Observe that after the end of the first pass, the largest element is placed at the highest index of the array. All the other elements are still unsorted.
Pass 2:
(a) Compare 30 and 29. Since 30 > 29, swapping is done: 29, 30, 52, 63, 27, 19, 54, 87
(b) Compare 30 and 52. Since 30 < 52, no swapping is done.
(c) Compare 52 and 63. Since 52 < 63, no swapping is done.
(d) Compare 63 and 27. Since 63 > 27, swapping is done: 29, 30, 52, 27, 63, 19, 54, 87
(e) Compare 63 and 19. Since 63 > 19, swapping is done: 29, 30, 52, 27, 19, 63, 54, 87
(f) Compare 63 and 54. Since 63 > 54, swapping is done: 29, 30, 52, 27, 19, 54, 63, 87
Observe that after the end of the second pass, the second largest element is placed at the second highest index of the array. All the other elements are still unsorted.
Pass 3:
(a) Compare 29 and 30. Since 29 < 30, no swapping is done.
(b) Compare 30 and 52. Since 30 < 52, no swapping is done.
(c) Compare 52 and 27. Since 52 > 27, swapping is done: 29, 30, 27, 52, 19, 54, 63, 87
(d) Compare 52 and 19. Since 52 > 19, swapping is done: 29, 30, 27, 19, 52, 54, 63, 87
(e) Compare 52 and 54. Since 52 < 54, no swapping is done.
Observe that after the end of the third pass, the third largest element is placed at the third highest index of the array. All the other elements are still unsorted.
Pass 4:
(a) Compare 29 and 30. Since 29 < 30, no swapping is done.
(b) Compare 30 and 27. Since 30 > 27, swapping is done: 29, 27, 30, 19, 52, 54, 63, 87
(c) Compare 30 and 19. Since 30 > 19, swapping is done: 29, 27, 19, 30, 52, 54, 63, 87
(d) Compare 30 and 52. Since 30 < 52, no swapping is done.
Observe that after the end of the fourth pass, the fourth largest element is placed at the fourth highest index of the array. All the other elements are still unsorted.
Pass 5:
f(n) = (n – 1) + (n – 2) + (n – 3) + ... + 3 + 2 + 1
f(n) = n(n – 1)/2
f(n) = n²/2 + O(n) = O(n²)
Therefore, the complexity of the bubble sort algorithm is O(n²). This means the time required to execute bubble sort is proportional to n², where n is the total number of elements in the array.

Programming Example: Write a program to enter n numbers in an array and redisplay the array with the elements sorted in ascending order.

#include <stdio.h>
int main()
{
    int i, n, temp, j, arr[10];
    printf("\n Enter the number of elements in the array : ");
    scanf("%d", &n);
    printf("\n Enter the elements: ");
    for(i = 0; i < n; i++)
        scanf("%d", &arr[i]);
    for(i = 0; i < n; i++)
    {
        for(j = 0; j < n-i-1; j++)
        {
            if(arr[j] > arr[j+1])
            {
                // swap the adjacent elements that are out of order
                temp = arr[j];
                arr[j] = arr[j+1];
                arr[j+1] = temp;
            }
        }
    }
    printf("\n The array sorted in ascending order is :\n");
    for(i = 0; i < n; i++)
        printf("%d\t", arr[i]);
    return 0;
}
Output:
Enter the number of elements in the array : 10
Enter the elements : 8 9 6 7 5 4 2 3 1 10
The array sorted in ascending order is :
1 2 3 4 5 6 7 8 9 10

4. Selection sort: Selection sort is a sorting algorithm with a quadratic running time of O(n²), thereby making it inefficient on large lists. Although selection sort performs worse than the insertion sort algorithm, it is noted for its simplicity and also has performance advantages over more complicated algorithms in certain situations. Selection sort is generally used for sorting files with very large objects (records) and small keys.

Technique: Consider an array ARR with N elements. Selection sort works as follows: first find the smallest value in the array and place it in the first position. Then find the second smallest value in the array and place it in the second position. Repeat this procedure until the entire array is sorted. Therefore,
In Pass 1, find the position POS of the smallest value in the array and then swap ARR[POS] and ARR[0]. Thus, ARR[0] is sorted.
In Pass 2, find the position POS of the smallest value in the sub-array of the remaining N–1 elements and swap ARR[POS] with ARR[1]. Now, ARR[0] and ARR[1] are sorted.
In Pass N–1, find the position POS of the smaller of the elements ARR[N–2] and ARR[N–1] and swap ARR[POS] and ARR[N–2], so that ARR[0], ARR[1], ..., ARR[N–1] are sorted.