Data Structures and Algorithms: Understanding Linear and Binary Searches, Study notes of Data Structures and Algorithms

An introduction to data structures, focusing on linear and binary searches. Data structures are essential for organizing and storing data efficiently. A linear search checks each element of a list in turn until the desired element is found, while a binary search applies a divide-and-conquer approach to a sorted list. The notes also cover the advantages and disadvantages of both search algorithms, as well as their time and space complexities.

Typology: Study notes

2023/2024

Available from 04/01/2024

saloni-sharma-11

Data Structure

Introduction

A data structure is a group of data elements organised so that the data can be stored and used efficiently on a computer. Examples of data structures include arrays, linked lists, stacks and queues. Data structures are used in almost every area of computer science, e.g. operating systems, compiler design, artificial intelligence and graphics. They are central to many computer science algorithms because they enable programmers to handle data efficiently.

Basic Terminology

Data structures are the building blocks of any program or software. Choosing the appropriate data structure for a program is one of the most difficult tasks for a programmer. The following terminology is used in connection with data structures.

Data: An elementary value or a collection of values; for example, a student's name and ID are data about the student.
Group Items: Data items that have subordinate data items are called group items; for example, the name of a student can consist of a first name and a last name.
Record: A collection of various data items; for example, for the student entity, the name, address, course and marks can be grouped together to form the record for the student.
File: A collection of records of one type of entity; for example, if there are 60 employees in an organisation, the related file holds 60 records, each containing the data about one employee.
Attribute and Entity: An entity represents a class of certain objects. It contains various attributes, and each attribute represents a particular property of that entity.
Field: A single elementary unit of information representing an attribute of an entity.

Need of Data Structures

As applications become more complex and the amount of data grows day by day, the following problems may arise:

Processor speed: Handling a very large amount of data requires high-speed processing, but as the data grows to billions of files per entity, the processor may fail to deal with that much data.
Data Search: Consider an inventory of 10^6 items in a store. If our application needs to search for a particular item, it has to traverse all 10^6 items every time, which slows down the search process.
Multiple requests: If thousands of users are searching the data simultaneously on a web server, even a very large server may fail during that process.

Data structures are used to solve the above problems. Data is organised into a data structure in such a way that not all items have to be examined and the required data can be found almost instantly.

Advantages of Data Structures

Efficiency: The efficiency of a program depends on the choice of data structures. For example, suppose we have some data and need to search for a particular record. If we organise the data in an unsorted array, we have to search sequentially, element by element, so an array may not be very efficient here. Other data structures, such as an ordered array, a binary search tree or a hash table, make the search process more efficient.
Reusability: Data structures are reusable, i.e. once we have implemented a particular data structure, we can use it anywhere else. Implementations of data structures can be compiled into libraries that can be used by different clients.
Abstraction: A data structure is specified by an abstract data type (ADT), which provides a level of abstraction. The client program uses the data structure through its interface only, without getting into the implementation details.

A queue is an abstract data structure, similar to a stack. A queue is open at both ends: items are inserted at one end and removed from the other, so it follows the First-In-First-Out (FIFO) methodology for storing data items.

Non Linear Data Structures: A non-linear data structure does not form a sequence, i.e. each item or element may be connected with two or more other items in a non-linear arrangement. The data elements are not arranged in a sequential structure. The types of non-linear data structures are given below:

Trees: Trees are multilevel data structures with a hierarchical relationship among their elements, known as nodes. The bottommost nodes in the hierarchy are called leaf nodes, while the topmost node is called the root node. Each node contains pointers to its adjacent nodes. The tree data structure is based on the parent-child relationship among the nodes. Every node except a leaf has one or more children, and every node except the root has exactly one parent. Trees can be classified into many categories, which will be discussed later in this tutorial.
Graphs: A graph can be defined as a pictorial representation of a set of elements (represented by vertices) connected by links known as edges. A graph differs from a tree in that a graph can contain cycles, whereas a tree cannot.
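As a rough illustration of the parent-child relationship described above, the following sketch models a binary tree, which is only one category of tree (each node has at most two children). The names `tree_node`, `is_leaf` and `count_leaves` are hypothetical and do not come from the notes.

```c
#include <stddef.h>

/* Hypothetical binary-tree node: each node points to at most two children. */
struct tree_node {
    int value;
    struct tree_node *left;   /* NULL when there is no left child */
    struct tree_node *right;  /* NULL when there is no right child */
};

/* A leaf is a node that has no children. */
int is_leaf(const struct tree_node *node)
{
    return node != NULL && node->left == NULL && node->right == NULL;
}

/* Count the leaves by visiting every node once, starting from the root. */
int count_leaves(const struct tree_node *root)
{
    if (root == NULL)
        return 0;
    if (is_leaf(root))
        return 1;
    return count_leaves(root->left) + count_leaves(root->right);
}
```

A tree whose root has two children, one of which has two children of its own, contains three leaves; calling `count_leaves` on its root would report that.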

Operations on data structure

  1. Traversing: Every data structure contains a set of data elements. Traversing the data structure means visiting each element in order to perform some specific operation, such as searching or sorting. Example: to calculate the average of the marks obtained by a student in 6 different subjects, we traverse the complete array of marks, calculate the total sum, and then divide that sum by the number of subjects, i.e. 6.
  2. Insertion: Insertion is the process of adding an element to the data structure at any location. If the data structure has a fixed capacity of n elements, it can hold at most n elements; trying to insert into a full data structure causes overflow.
  3. Deletion: The process of removing an element from the data structure is called deletion. We can delete an element from the data structure at any location. If we try to delete an element from an empty data structure, underflow occurs.
  4. Searching: The process of finding the location of an element within the data structure is called searching. Two common searching algorithms are Linear Search and Binary Search. We will discuss each of them later in this tutorial.
  5. Sorting: The process of arranging the elements of the data structure in a specific order is known as sorting. There are many algorithms that can be used to perform sorting, for example insertion sort, selection sort and bubble sort.
  6. Merging: When two lists, List A of size M and List B of size N, containing elements of a similar type, are joined to produce a third list, List C of size (M+N), the process is called merging.
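The traversal example from the list above (averaging the marks in 6 subjects) can be sketched as follows. The function name `average_marks` and the choice to return 0 for an empty array are assumptions made for illustration only.

```c
#include <stddef.h>

/* Traverse the array once, accumulate the total, then divide by the count. */
double average_marks(const int marks[], size_t count)
{
    if (count == 0)
        return 0.0;            /* assumption: an empty input averages to 0 */

    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += marks[i];     /* visit each element exactly once */

    return (double)total / (double)count;
}
```

For the six marks 70, 80, 90, 60, 50 and 100, the total is 450 and the function returns 75.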

Algorithm

What is an Algorithm?

An algorithm is a process or a set of rules to be followed in calculations or other problem-solving operations, especially by a computer. Formally, an algorithm is a finite set of instructions carried out in a specific order to perform a specific task. It is not a complete program or code; it is just the solution (logic) to a problem, which can be represented as an informal description, a flowchart or pseudocode.

Characteristics of an Algorithm

The following are the characteristics of an algorithm:
o Input: An algorithm takes zero or more input values.
o Output: An algorithm produces one or more outputs when it finishes.
o Unambiguity: An algorithm should be unambiguous, which means that its instructions should be clear and simple.
o Finiteness: An algorithm should be finite. Here, finiteness means that the algorithm contains a limited, countable number of instructions and terminates after a finite number of steps.
o Effectiveness: An algorithm should be effective, as each instruction in an algorithm affects the overall process.
o Language independent: An algorithm must be language-independent, so that its instructions can be implemented in any programming language with the same output.

The following are the steps required to add two numbers entered by the user:
Step 1: Start
Step 2: Declare three variables a, b, and sum.
Step 3: Enter the values of a and b.

In the above code, the time complexity of the loop statement is proportional to n, so as the value of n increases, the running time also increases. The complexity of the statement return sum, on the other hand, is constant: it does not depend on the value of n and produces its result in one step. We generally consider the worst-case time complexity, as it is the maximum time taken for any input of a given size.

Space complexity: An algorithm's space complexity is the amount of space required to solve a problem and produce an output. Like time complexity, space complexity is expressed in big O notation. For an algorithm, space is required for the following purposes:

  1. To store program instructions
  2. To store constant values
  3. To store variable values
  4. To track function calls, jump statements, etc.

Auxiliary space: The extra space required by the algorithm, excluding the space taken by the input, is known as auxiliary space. Space complexity considers both, i.e. auxiliary space and the space used by the input. So, Space complexity = Auxiliary space + Input size.

Types of Algorithm Analysis:
o Best case: The input for which the algorithm takes the minimum time. Best-case analysis gives a lower bound on the running time. Example: in linear search, the best case occurs when the search key is at the first location of a large data set.
o Worst case: The input for which the algorithm takes the maximum time. Worst-case analysis gives an upper bound on the running time. Example: in linear search, the worst case occurs when the search key is not present at all.
o Average case: Take all possible random inputs, calculate the computation time for each of them, and divide the total by the number of inputs. Average case = total time over all random cases / total number of cases
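To make the loop analysis and the auxiliary-space definition from this section concrete, here is a minimal sketch of the kind of function that analysis describes. The code the notes refer to is not reproduced here, so the function `sum_array` is a hypothetical stand-in, not the original example.

```c
#include <stddef.h>

/* Hypothetical example: sum the first n elements of an array. */
long sum_array(const int values[], size_t n)
{
    long sum = 0;                   /* one extra variable: O(1) auxiliary space */
    for (size_t i = 0; i < n; i++)  /* loop body runs n times: O(n) time */
        sum += values[i];
    return sum;                     /* a single step: O(1) time */
}
```

The input array itself occupies space proportional to n, so under the formula above the space complexity is O(n) overall even though the auxiliary space is constant.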

Asymptotic notation

Asymptotic notation is a way to describe the running time or space complexity of an algorithm based on the input size. It is commonly used in complexity analysis to describe how an algorithm performs as the size of the input grows. The three most commonly used notations are Big O, Omega, and Theta.

Complexity Analysis of Algorithms

1. Theta Notation (Θ-Notation):

Theta notation bounds the running time of an algorithm from both above and below, so it gives a tight bound on how the running time grows. These notes use it for average-case analysis, where you add the running times for each possible input and take the average; strictly speaking, a Θ bound can be stated for the best, worst or average case alike.

Mathematical representation of Theta notation:
Θ(g(n)) = { f(n) : there exist positive constants c1, c2 and n0 such that 0 ≤ c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n ≥ n0 }
Note: Θ(g) is a set.

The above expression says that if f(n) is theta of g(n), then the value of f(n) always lies between c1 * g(n) and c2 * g(n) for large values of n (n ≥ n0). The definition of theta also requires that f(n) be non-negative for values of n greater than n0.

Consider the expression 3n^3 + 6n^2 + 6000 = Θ(n^3). Dropping the lower-order terms is always fine, because there is always a value of n beyond which n^3 exceeds n^2 multiplied by any constant.

For a given function g(n), Θ(g(n)) denotes the following set of functions. Examples:
{ 100, log(2000), 10^4 } belongs to Θ(1)
{ (n/4), (2n+3), (n/100 + log(n)) } belongs to Θ(n)
{ (n^2+n), (2n^2), (n^2+log(n)) } belongs to Θ(n^2)
Note: Θ provides exact bounds.
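As a worked check of the definition, the claim that 3n^3 + 6n^2 + 6000 is Θ(n^3) can be verified with explicit constants. The constants chosen here (c1 = 3, c2 = 15, n0 = 10) are one convenient choice among many, not values given in the notes.

```latex
\begin{align*}
3n^3 &\le 3n^3 + 6n^2 + 6000 && \text{for all } n \ge 1,\\
3n^3 + 6n^2 + 6000 &\le 3n^3 + 6n^3 + 6n^3 = 15n^3 && \text{for all } n \ge 10.
\end{align*}
```

The second line uses 6n^2 ≤ 6n^3 for n ≥ 1 and 6000 ≤ 6n^3 once n^3 ≥ 1000, i.e. n ≥ 10. Both inequalities together match the form c1 * g(n) ≤ f(n) ≤ c2 * g(n) with g(n) = n^3.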

2. Big-O Notation (O-notation):

Big-O notation represents an upper bound on the running time of an algorithm, which is why it is most often used to state the worst-case complexity. It is the most widely used notation for asymptotic analysis. In other words, it describes the maximum time an algorithm can require for an input of a given size, up to a constant factor.
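For comparison with the Theta definition given earlier, the standard set definition of Big-O keeps only the upper inequality. It is stated here as general background, in the same symbols (c, n0, f, g) used for Theta.

```latex
\[
O(g(n)) = \{\, f(n) : \text{there exist positive constants } c \text{ and } n_0
\text{ such that } 0 \le f(n) \le c \cdot g(n) \text{ for all } n \ge n_0 \,\}
\]
```

For instance, 2n + 3 is O(n), because 2n + 3 ≤ 5n for all n ≥ 1 (taking c = 5 and n0 = 1).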

Let us consider the same Insertion sort example here. The time complexity of Insertion Sort can be written as Ω(n), but this is not very useful information about insertion sort, as we are generally interested in the worst case and sometimes in the average case.

Examples:
{ (n^2+n), (2n^2), (n^2+log(n)) } belongs to Ω(n^2)
U { (n/4), (2n+3), (n/100 + log(n)) } belongs to Ω(n)
U { 100, log(2000), 10^4 } belongs to Ω(1)
Note: Here, U represents union. Each line adds its set to the sets on the lines above it, so the functions listed for Ω(n^2) also belong to Ω(n) and to Ω(1). We can write it in this manner because Ω provides only a lower bound.

Time-Space Trade-Off in Algorithms

A trade-off is a situation where one thing increases while another decreases. Here, it means a problem can be solved:
o either in less time by using more space, or
o in very little space by spending a long time.

The best algorithm is one that solves the problem using less memory and also takes less time to generate the output. In general, however, it is not always possible to achieve both of these conditions at the same time.

Types of Space-Time Trade-off

o Compressed or uncompressed data: A space-time trade-off can be applied to the problem of data storage. If data is stored uncompressed, it takes more space but less time to access. If the data is stored compressed, it takes less space but more time, because the decompression algorithm has to run.
o Smaller code or loop unrolling: Smaller code occupies less space in memory but spends extra computation time on the loop test and on jumping back to the beginning of the loop at the end of each iteration. Loop unrolling can optimise execution speed at the cost of increased binary size: the code occupies more space in memory but requires less computation time.

Program without loop unrolling (takes less space but more time):

    #include <stdio.h>

    int main(void)
    {
        for (int i = 0; i < 5; i++)
            printf("Hello\n");  // print Hello 5 times
        return 0;
    }

Program with loop unrolling (takes more space but less time):

    #include <stdio.h>

    int main(void)
    {
        // unrolled the for loop in program 1
        printf("Hello\n");
        printf("Hello\n");
        printf("Hello\n");
        printf("Hello\n");
        printf("Hello\n");
        return 0;
    }

Linear Search Algorithm

Linear Search is a sequential search algorithm that starts at one end of a list and goes through each element until the desired element is found; otherwise the search continues to the end of the data set.

How Does Linear Search Algorithm Work?

In the Linear Search algorithm:
o Every element is considered as a potential match for the key and checked for the same.
o If any element is found equal to the key, the search is successful and the index of that element is returned.
o If no element is found equal to the key, the search yields "No match found".

For example: Consider the array arr[] = {10, 50, 30, 70, 80, 20, 90, 40} and key = 30.
Step 1: Start from the first element (index 0) and compare key with each element (arr[i]). Comparing key with the first element arr[0]: since they are not equal, the iterator moves to the next element as a potential match.
Step 2: Comparing key with the next element arr[1]: since they are not equal, the iterator moves to the next element as a potential match.
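The procedure described above can be sketched in C as follows. The function name `linear_search` and the use of -1 to signal "No match found" are assumptions for this sketch; the notes do not prescribe a particular return convention.

```c
#include <stddef.h>

/* Return the index of the first element equal to key, or -1 if there is none. */
int linear_search(const int arr[], size_t n, int key)
{
    for (size_t i = 0; i < n; i++) {
        if (arr[i] == key)
            return (int)i;   /* successful search: report the index */
    }
    return -1;               /* reached the end without a match */
}
```

With the example array {10, 50, 30, 70, 80, 20, 90, 40} and key = 30, the comparisons at indices 0 and 1 fail and the search succeeds at index 2.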

The binary search algorithm works by comparing the element to be searched with the middle element of the array and, based on this comparison, following one of three cases:
Case 1: element = middle. The element is found; return the index.
Case 2: element > middle. Search for the element in the sub-array from index middle+1 to the last index.
Case 3: element < middle. Search for the element in the sub-array from index 0 to middle-1.

Let the elements of the array be: [array not shown]
Let the element to search for be K = 56.
We use the formula below to calculate the mid of the array:
mid = (beg + end)/2
So, in the given array: beg = 0, end = 8, mid = (0 + 8)/2 = 4. So, 4 is the mid of the array.
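The three cases can be sketched as an iterative C function. This is a minimal sketch under stated assumptions: the array is sorted in ascending order, the name `binary_search` and the -1 "not found" value are hypothetical, and `mid` is computed as `beg + (end - beg) / 2`, which gives the same value as (beg + end)/2 while avoiding integer overflow for very large indices.

```c
/* Iterative binary search over an ascending sorted array; returns index or -1. */
int binary_search(const int arr[], int n, int key)
{
    int beg = 0;
    int end = n - 1;

    while (beg <= end) {
        int mid = beg + (end - beg) / 2;  /* same value as (beg + end) / 2 */

        if (arr[mid] == key)
            return mid;                   /* Case 1: element found */
        if (key > arr[mid])
            beg = mid + 1;                /* Case 2: continue in the right half */
        else
            end = mid - 1;                /* Case 3: continue in the left half */
    }
    return -1;                            /* search space is empty: not present */
}
```

As an illustration with a made-up sorted array of nine elements (not the array from the notes), {10, 12, 24, 29, 39, 40, 51, 56, 69}, searching for 56 probes index 4, then index 6, then index 7, where the element is found.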

Now, the element to search for is found, so the algorithm returns the index of the matched element.

Binary Search complexity

Now, let's see the time complexity of Binary search in the best case, average case, and worst case. We will also see the space complexity of Binary search.

Time Complexity

o Best Case Complexity - In Binary search, the best case occurs when the element to search for is found in the first comparison, i.e., when the first middle element is itself the element being searched for. The best-case time complexity of Binary search is O(1).
o Average Case Complexity - The average-case time complexity of Binary search is O(log n).
o Worst Case Complexity - In Binary search, the worst case occurs when we have to keep reducing the search space until it has only one element. The worst-case time complexity of Binary search is O(log n).
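The O(log n) worst case follows from the halving step. As a short derivation, assume an array of n elements and let k be the number of unsuccessful comparisons so far (the symbol k is introduced here for illustration); each of them leaves at most half of the previous search space, so at most n / 2^k elements remain.

```latex
\[
\frac{n}{2^{k}} \le 1
\iff 2^{k} \ge n
\iff k \ge \log_2 n
\]
```

The search space therefore shrinks to a single element after about log2 n halvings, so the worst case needs at most ⌊log2 n⌋ + 1 comparisons, which is O(log n).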

Space Complexity

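o The space complexity of binary search is O(1) for the iterative version, which needs only a few index variables. A recursive version additionally uses O(log n) space for the call stack.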