Data Organization and Algorithm Efficiency: A Basic Introduction, Study notes of Data Structures and Algorithms

A fundamental introduction to data organization and algorithm efficiency. It covers basic terminology related to data, including data types, variables, records, and files. It also explores the concept of algorithms, their efficiency, and the use of asymptotic notations (big-o, big-omega, big-theta) to analyze their performance. The document emphasizes the importance of choosing appropriate data structures and algorithms for efficient programming.

Typology: Study notes

2020/2021

Uploaded on 11/07/2024

sai-charan-21
UNIT-1
1.1 BASIC TERMINOLOGY: ELEMENTARY DATA ORGANIZATION
1.1.1 Data and Data Item
Data are simply a collection of facts and figures; they are values or sets of values. A data item
refers to a single unit of values. Data items that are divided into sub-items are called group items;
those that are not are called elementary items. For example, a student's name may be divided into
three sub-items (first name, middle name and last name), but the ID of a student would normally be
treated as a single item.
In a student record with the items ID, Age, Gender, Name (First, Middle, Last) and Address (Street,
Area), the items ID, Age, Gender, First, Middle, Last, Street and Area are elementary data items,
whereas Name and Address are group data items.
1.1.2 Data Type
A data type is a classification identifying one of various types of data, such as floating-point,
integer, or Boolean. It determines the possible values for that type, the operations that can be
done on values of that type, and the way values of that type can be stored. Data types are of two
kinds: primitive and non-primitive. A primitive data type is a basic data type that is provided
by the programming language with built-in support. It is native to the language and is supported
directly by the machine, while a non-primitive data type is derived from primitive data types.
Arrays and structures are examples of non-primitive data types.
1.1.3 Variable
A variable is a symbolic name given to some known or unknown quantity or information, for the
purpose of allowing the name to be used independently of the information it represents. A variable
name in computer source code is usually associated with a data storage location and thus also with
its contents, and these may change during the course of program execution.
1.1.4 Record
A collection of related data items is known as a record. The elements of a record are usually called
fields or members. Records are distinguished from arrays by the fact that their number of fields
is typically fixed, each field has a name, and each field may have a different type.
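As a minimal sketch, the student example can be written as a C++ record. The type names `Name`, `Address` and `Student`, the field types, and the helper `full_name` are assumptions made for illustration; the notes only name the items themselves.

```cpp
#include <cassert>
#include <string>

// Group item: a name divided into three elementary sub-items.
struct Name {
    std::string first;
    std::string middle;
    std::string last;
};

// Group item: an address divided into two elementary sub-items.
struct Address {
    std::string street;
    std::string area;
};

// A record: a fixed number of named fields, each with its own type.
struct Student {
    int id;           // elementary item
    int age;          // elementary item
    char gender;      // elementary item
    Name name;        // group item
    Address address;  // group item
};

// Hypothetical helper: joins the sub-items of the Name group item.
std::string full_name(const Name& n) {
    assert(!n.first.empty());  // assume at least a first name is present
    return n.first + " " + n.middle + " " + n.last;
}
```

A `Student` value can be created with `Student s{1, 20, 'F', {"Ada", "M", "Lovelace"}, {"Main St", "North"}};`, after which `s.id` reads an elementary item and `s.name` reads a group item.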

1.1.5 Program

A sequence of instructions that a computer can interpret and execute is termed a program.

1.1.6 Entity

An entity is something that has certain attributes or properties which may be assigned values. The values themselves may be either numeric or non-numeric. Example: a student is an entity with attributes such as ID, age and gender.

1.1.7 Entity Set

An entity set is a group or set of similar entities, for example the employees of an organization or the students of a class. Each attribute of an entity set has a range of values, the set of all possible values that could be assigned to that attribute. The term “information” is sometimes used for data with given attributes or, in other words, meaningful or processed data.

1.1.8 Field

A field is a single elementary unit of information representing an attribute of an entity. A record is the collection of field values of a given entity, and a file is the collection of records of the entities in a given entity set.

1.1.9 File

A file is a collection of records of the entities in a given entity set, for example a file containing the records of the students of a particular class.

1.1.10 Key

A key is one or more fields in a record that take unique values and can be used to distinguish one record from the others.

1.2 ALGORITHM

An algorithm is a well-defined computational procedure that takes some value, or a set of values, as input and produces some value, or a set of values, as output. It can also be defined as a sequence of computational steps that transform the input into the output.

An algorithm can be expressed in three ways: (i) in a natural language such as English (pseudocode), (ii) in a programming language, or (iii) in the form of a flowchart.

1.3 EFFICIENCY OF AN ALGORITHM

In computer science, algorithmic efficiency is the set of properties of an algorithm which relate to the amount of resources used by the algorithm. An algorithm must be analyzed to determine its

numbers to the set of real numbers.

We say that f and g are asymptotic, and write f(x) ≈ g(x), if the ratio

f(x) / g(x) tends to a constant c as x → ∞

1.5.2 Asymptotic Notations

1.5.2.1 Big-Oh Notation (O)

It provides a possibly asymptotically tight upper bound for f(n). It does not give the best-case complexity, but it can give the worst-case complexity.

Let f be a nonnegative function. Then we define the three most common asymptotic bounds as follows.

We say that f(n) is Big-O of g(n), written f(n) = O(g(n)), iff there are positive constants c and n0 such that

0 ≤ f(n) ≤ c g(n) for all n ≥ n0

If f(n) = O(g(n)), we say that g(n) is an upper bound on f(n).

Example: n^2 + 50n = O(n^2)

0 ≤ f(n) ≤ c g(n)

0 ≤ n^2 + 50n ≤ c n^2

0/n^2 ≤ n^2/n^2 + 50n/n^2 ≤ c n^2/n^2 (divide by n^2)

0 ≤ 1 + 50/n ≤ c (note that 50/n → 0 as n → ∞)

Pick n = 50:

0 ≤ 1 + 50/50 = 2 ≤ c, so c = 2

Find n0 with c = 2:

0 ≤ 1 + 50/n0 ≤ 2

-1 ≤ 50/n0 ≤ 1

-n0 ≤ 50 ≤ n0, so n0 = 50

0 ≤ n^2 + 50n ≤ 2n^2 ∀ n ≥ n0 = 50, c = 2
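The constants in this example can be sanity-checked numerically. The sketch below is only a finite check over a range of n, not a proof, and the function names `f`, `g` and `bound_holds` are chosen here for illustration.

```cpp
#include <cassert>

// f(n) = n^2 + 50n, the function being bounded.
long long f(long long n) { return n * n + 50 * n; }

// g(n) = n^2, the bounding function.
long long g(long long n) { return n * n; }

// Returns true if 0 <= f(n) <= c * g(n) for every n in [n0, limit].
bool bound_holds(long long c, long long n0, long long limit) {
    assert(n0 >= 1 && limit >= n0);
    for (long long n = n0; n <= limit; ++n) {
        if (f(n) < 0 || f(n) > c * g(n)) {
            return false;  // the inequality fails at this n
        }
    }
    return true;
}
```

Calling `bound_holds(2, 50, 100000)` checks the bound with c = 2 and n0 = 50, while starting one step earlier, at n0 = 49, already violates it because 49^2 + 50·49 = 4851 exceeds 2·49^2 = 4802.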

1.5.2.2 Big-Omega Notation (Ω)

It provides a possibly asymptotically tight lower bound for f(n). It does not give the worst-case complexity, but it can give the best-case complexity.

f(n) is said to be Big-Omega of g(n), written f(n) = Ω(g(n)), iff there are positive constants c and n0 such that

0 ≤ c g(n) ≤ f(n) for all n ≥ n0

If f(n) = Ω(g(n)), we say that g(n) is a lower bound on f(n).

Example: n^3 = Ω(n^2) with c = 1 and n0 = 1

0 ≤ c g(n) ≤ f(n)

0 ≤ 1*1^2 = 1 ≤ 1 = 1^3

0 ≤ c g(n) ≤ h(n)

Example: n^2/2 - 2n = Θ(n^2). Dividing 0 < c1 n^2 ≤ n^2/2 - 2n ≤ c2 n^2 by n^2 gives c1 ≤ ½ - 2/n ≤ c2.

O: determine c2 = ½

½ - 2/n ≤ c2 = ½ (½ - 2/n never exceeds ½)

Ω: determine c1 = 1/10

0 < c1 ≤ ½ - 2/n (minimum when n = 5)

0 < c1 ≤ ½ - 2/5

0 < c1 ≤ 5/10 - 4/10 = 1/10

n0: determine n0 = 5

c1 ≤ ½ - 2/n0 ≤ c2

1/10 ≤ ½ - 2/n0 ≤ ½

1/10 - ½ ≤ -2/n0 ≤ 0 (subtract ½)

-4/10 ≤ -2/n0 ≤ 0

-4/10 n0 ≤ -2 ≤ 0 (multiply by n0)

-n0 ≤ -2*10/4 ≤ 0 (multiply by 10/4)

n0 ≥ 2*10/4 ≥ 0 (multiply by -1)

n0 ≥ 5 ≥ 0

n0 ≥ 5, so n0 = 5 satisfies the inequality

Θ: 0 < c1 n^2 ≤ n^2/2 - 2n ≤ c2 n^2 ∀ n ≥ n0 with c1 = 1/10, c2 = ½ and n0 = 5
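The two-sided bound can be checked in the same spirit. In this sketch every term is multiplied by 10 so that the comparison n^2 ≤ 5n^2 - 20n ≤ 5n^2 uses exact integer arithmetic; the function names are hypothetical and the loop is a finite check, not a proof.

```cpp
#include <cassert>

// Checks 0 < (1/10) n^2 <= n^2/2 - 2n <= (1/2) n^2 for a single n,
// with all three terms scaled by 10 to stay in integers.
bool theta_bounds_hold_at(long long n) {
    assert(n >= 1);
    long long lower  = n * n;               // 10 * c1 * n^2 with c1 = 1/10
    long long middle = 5 * n * n - 20 * n;  // 10 * (n^2/2 - 2n)
    long long upper  = 5 * n * n;           // 10 * c2 * n^2 with c2 = 1/2
    return lower > 0 && lower <= middle && middle <= upper;
}

// Returns true if the bounds hold for every n in [n0, limit].
bool theta_bounds_hold(long long n0, long long limit) {
    for (long long n = n0; n <= limit; ++n) {
        if (!theta_bounds_hold_at(n)) {
            return false;
        }
    }
    return true;
}
```

At n = 5 the lower bound is met with equality (25 ≤ 25 after scaling), and at n = 4 the middle term is 0, so the lower bound fails; this matches the choice n0 = 5.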

1.5.3 Time Space Trade-off

The best algorithm to solve a given problem is one that requires less memory space and less time to run to completion. In practice, it is not always possible to achieve both of these objectives. One algorithm may require less memory space but take more time to complete its execution, while another may require more memory space but take less time to run to completion. Thus, we have to sacrifice one for the sake of the other. In other words, there is a space-time trade-off between algorithms.

If we need an algorithm that requires less memory space, we choose the first algorithm at the cost of more execution time. If, on the other hand, we need an algorithm that requires less execution time, we choose the second algorithm at the cost of more memory space.
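One way to picture the trade-off is to solve the same small problem twice. The example below, counting the 1-bits of a 32-bit number, is chosen here for illustration and does not come from the notes: the first function uses no extra storage but loops over the bits, while the second spends 256 bytes on a precomputed table and then needs only four lookups.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Less memory, more time: examine the bits one at a time.
int count_bits_loop(std::uint32_t x) {
    int count = 0;
    while (x != 0) {
        count += static_cast<int>(x & 1u);  // add the lowest bit
        x >>= 1;                            // move to the next bit
    }
    return count;
}

// Builds a table holding the bit count of every byte value 0..255.
std::array<std::uint8_t, 256> build_table() {
    std::array<std::uint8_t, 256> t{};  // t[0] = 0
    for (int i = 1; i < 256; ++i) {
        t[i] = static_cast<std::uint8_t>(t[i / 2] + (i & 1));
    }
    return t;
}

// More memory, less time: one table lookup per byte of the input.
int count_bits_table(std::uint32_t x) {
    static const std::array<std::uint8_t, 256> table = build_table();
    return table[x & 0xFF] + table[(x >> 8) & 0xFF] +
           table[(x >> 16) & 0xFF] + table[(x >> 24) & 0xFF];
}
```

Both functions return the same answers; which one is preferable depends on whether memory or running time is the scarcer resource.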

1.6 ABSTRACT DATA TYPE

An abstract data type (ADT) can be defined as a collection of data items together with the operations on the data. The word “abstract” refers to the fact that the data and the basic operations defined on it are studied independently of how they are implemented. An ADT involves what can be done with the data, not how it has to be done. For example, in the figure below, the user is concerned with what can be done with the collected data, not with how it has to be done.

An implementation of an ADT consists of storage structures to store the data items and algorithms for the basic operations. Data structures such as the array, linked list, stack and queue are examples of ADTs.
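A minimal sketch of this separation in C++, assuming a hypothetical class `IntStack`: the public member functions say what can be done, while the storage structure (a `std::vector` here, chosen only for illustration) stays private and could be replaced without affecting callers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The ADT is the public interface: push, top, pop, empty, size.
class IntStack {
public:
    void push(int value) { items_.push_back(value); }

    int top() const {
        assert(!items_.empty());  // top of an empty stack is undefined
        return items_.back();
    }

    void pop() {
        assert(!items_.empty());
        items_.pop_back();
    }

    bool empty() const { return items_.empty(); }
    std::size_t size() const { return items_.size(); }

private:
    std::vector<int> items_;  // storage structure: an implementation detail
};
```

Code that uses `IntStack` calls only `push`, `top`, `pop`, `empty` and `size`, so it would keep working if the private storage were changed to a linked list.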

1.7 DATA STRUCTURE

In computer science, a data structure is a particular way of storing and organizing data in a computer’s memory so that it can be used efficiently. Data may be organized in many different ways; the logical or mathematical model of a particular organization of data is called a data structure. The choice of a particular data model depends on two considerations. First, it must be rich enough in structure to mirror the actual relationships of the data in the real world. Second, the structure should be simple enough that one can effectively process the data whenever necessary.

1.7.1 Need of data structure

• It gives different levels of organization of data.
• It tells how data can be stored and accessed at its elementary level.
• It provides operations on groups of data, such as adding an item or looking up the highest-priority item.
• It provides a means to manage huge amounts of data efficiently.

b) The other way is to have the linear relationship between the elements represented by means of pointers or links. These linear structures are called linked lists. The common examples of linear data structure are arrays, queues, stacks and linked lists.

1.7.3.4 Non-linear Data Structure This structure is mainly used to represent data containing a hierarchical relationship between elements. E.g. graphs, family trees and table of contents.

1.8 A BRIEF DESCRIPTION OF DATA STRUCTURES

1.8.1 Array

The simplest type of data structure is a linear (or one-dimensional) array: a list of a finite number n of similar data elements referenced respectively by a set of n consecutive numbers, usually 1, 2, 3, ..., n. If we choose the name A for the array, then the elements of A are denoted by the subscript notation A1, A2, A3, ..., An, by the parenthesis notation A(1), A(2), A(3), ..., A(n), or by the bracket notation A[1], A[2], A[3], ..., A[n]. Example: A linear array A[8] consisting of numbers is pictured in the following figure.

1.8.2 Linked List

A linked list, or one-way list, is a linear collection of data elements, called nodes, where the linear order is given by means of pointers. Each node is divided into two parts:

• The first part contains the information of the element/node.
• The second part contains the address of the next node in the list (the link or next pointer field).

A special pointer, Start (or List), contains the address of the first node in the list. If this special pointer contains null, the list is empty. Example:
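A node and the special start pointer can be sketched in C++ as follows. The names `Node`, `push_front`, `length`, `first_info` and `destroy` are assumptions made for illustration, and the information part is taken to be a single `int`.

```cpp
#include <cassert>

struct Node {
    int info;    // first part: the information of the element
    Node* next;  // second part: address of the next node, nullptr at the end
};

// Inserts a new node at the front and returns the new start pointer.
Node* push_front(Node* start, int value) {
    return new Node{value, start};
}

// Follows the next pointers from start and counts the nodes visited.
int length(const Node* start) {
    int count = 0;
    for (const Node* p = start; p != nullptr; p = p->next) {
        ++count;
    }
    return count;
}

// Returns the information part of the first node of a non-empty list.
int first_info(const Node* start) {
    assert(start != nullptr);  // a null start pointer means the list is empty
    return start->info;
}

// Releases every node of the list.
void destroy(Node* start) {
    while (start != nullptr) {
        Node* next = start->next;
        delete start;
        start = next;
    }
}
```

Starting from `Node* start = nullptr;` (the empty list), repeated calls such as `start = push_front(start, 3);` build the list one node at a time.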

1.8.3 Tree

Data frequently contain a hierarchical relationship between various elements. The data structure which reflects this relationship is called a rooted tree graph or, simply, a tree.

1.8.4 Graph

Data sometimes contain a relationship between pairs of elements which is not necessarily hierarchical in nature, e.g. an airline that flies only between the cities connected by lines. This data structure is called a graph.

1.8.5 Queue

A queue, also called a FIFO (First In, First Out) system, is a linear list in which deletions can take place only at one end of the list, the Front, and insertions can take place only at the other end, the Rear.

1.8.6 Stack

A stack is an ordered group of homogeneous elements. Elements are added to and removed from the top of the stack (the most recently added items are at the top of the stack). The last element to be added is the first to be removed (LIFO: Last In, First Out).
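The difference between FIFO and LIFO order can be seen by passing the same values through both structures. This sketch uses the standard `std::queue` and `std::stack` adapters; the helper names `through_queue` and `through_stack` are made up for illustration.

```cpp
#include <cassert>
#include <queue>
#include <stack>
#include <vector>

// Inserts every value at the rear of a queue, then deletes from the front.
std::vector<int> through_queue(const std::vector<int>& values) {
    std::queue<int> q;
    for (int v : values) {
        q.push(v);  // insertion at the rear
    }
    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.front());  // deletion from the front
        q.pop();
    }
    return out;
}

// Pushes every value onto a stack, then pops from the top.
std::vector<int> through_stack(const std::vector<int>& values) {
    std::stack<int> s;
    for (int v : values) {
        s.push(v);  // added at the top
    }
    std::vector<int> out;
    while (!s.empty()) {
        out.push_back(s.top());  // removed from the top
        s.pop();
    }
    return out;
}
```

For the input 1, 2, 3 the queue returns the values in the order they arrived, and the stack returns them in reverse.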

1.9 DATA STRUCTURES OPERATIONS

The data appearing in our data structures are processed by means of certain operations. In fact, the particular data structure that one chooses for a given situation depends largely on the frequency with which specific operations are performed. The following four operations play a major role in this text:

• Traversing: Accessing each record/node exactly once so that certain items in the record may be processed. (This accessing and processing is sometimes called “visiting” the record.)
• Searching: Finding the location of the desired node with a given key value, or finding the locations of all such nodes which satisfy one or more conditions.
• Inserting: Adding a new node/record to the structure.
• Deleting: Removing a node/record from the structure.

1.10 ARRAYS: DEFINITION

So the address of the fourth element is 503, because the first element is at 500.

When a program refers to an element of an array in an instruction such as write(X[I]) or read(X[I]), the compiler relies on the foregoing relation to compute the required address.

1.10.2 Two-Dimensional Arrays

The simplest form of the multidimensional array is the two-dimensional array. A two-dimensional array is, in essence, a list of one-dimensional arrays. To declare a two-dimensional array of size x by y, you would write something as follows:

type arrayName [ x ][ y ];

Here type can be any valid C data type and arrayName must be a valid C identifier. A two-dimensional array can be thought of as a table with x rows and y columns. A two-dimensional array a that contains three rows and four columns can be shown as below:

Thus, every element in array a is identified by an element name of the form a[ i ][ j ], where a is the name of the array, and i and j are the subscripts that uniquely identify each element in a.
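A small sketch of the three-row, four-column case. The constants `ROWS` and `COLS` and the helper `row_sum` are names assumed for illustration; note that subscripts in C and C++ start at 0, so the rows are 0 to 2 and the columns are 0 to 3.

```cpp
#include <cassert>

const int ROWS = 3;  // x: number of rows
const int COLS = 4;  // y: number of columns

// Adds up the elements of row i; a[i][j] is the element in row i, column j.
int row_sum(const int a[ROWS][COLS], int i) {
    assert(i >= 0 && i < ROWS);  // subscripts are 0-based
    int sum = 0;
    for (int j = 0; j < COLS; ++j) {
        sum += a[i][j];
    }
    return sum;
}
```

With the declaration `const int a[ROWS][COLS] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};`, the element `a[1][2]` is 6 and `row_sum(a, 2)` adds the last row.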

1.10.2.1 Representation of two-dimensional arrays in memory

A two-dimensional m x n array A is a collection of m x n elements. A programming language stores the two-dimensional array in one-dimensional memory in either of two ways:

• Row Major Order: The first row of the array occupies the first set of memory locations reserved for the array, the second row occupies the next set, and so forth.

To determine the address of element A[i,j], where N is the number of columns and each element occupies one memory location:

Location (A[i,j]) = Base Address + (N x (i - 1)) + (j - 1)

For example:

Given an array T[1…5,1…7] of integers, calculate the address of element T[4,6], where BA = 900.

Sol) i = 4, j = 6

M = 5, N = 7

Location (T[4,6]) = BA + (7 x (4-1)) + (6-1)

= 900 + (7 x 3) + 5

= 900 + 21 + 5

= 926

• Column Major Order: The elements of the first column are stored linearly, followed by the elements of the next column, and so forth.

To determine the address of element A[i,j], where M is the number of rows and each element occupies one memory location:

Location (A[i,j]) = Base Address + (M x (j - 1)) + (i - 1)

For example:

Given an array T[1…6,1…8] of integers, calculate the address of element T[5,7], where BA = 300.

Sol) i = 5, j = 7

M = 6, N = 8

Location (T[5,7]) = BA + (6 x (7-1)) + (5-1)

= 300 + (6 x 6) + 4

= 300 + 36 + 4

= 340
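Both address formulas translate directly into code. The sketch below keeps the conventions of the notes (subscripts start at 1, one memory location per element); the function names are assumptions made for illustration.

```cpp
#include <cassert>

// Row major: skip (i - 1) full rows of `cols` elements, then (j - 1) elements.
int row_major_address(int base, int i, int j, int rows, int cols) {
    assert(1 <= i && i <= rows && 1 <= j && j <= cols);
    return base + cols * (i - 1) + (j - 1);
}

// Column major: skip (j - 1) full columns of `rows` elements, then (i - 1) elements.
int column_major_address(int base, int i, int j, int rows, int cols) {
    assert(1 <= i && i <= rows && 1 <= j && j <= cols);
    return base + rows * (j - 1) + (i - 1);
}
```

`row_major_address(900, 4, 6, 5, 7)` evaluates 900 + 7·3 + 5 = 926, and `column_major_address(300, 5, 7, 6, 8)` evaluates 300 + 6·6 + 4 = 340.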

Column-wise:

Location (A[i,j,k]) = BA + MN(k-1) + M(j-1) + (i-1)

Location (A[5,3,6]) = 900 + 8x5(6-1) + 8(3-1) + (5-1)

= 900 + 40 x 5 + 8 x 2 + 4

= 900 + 200 + 16 + 4

= 1120
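The three-dimensional column-wise formula can be sketched the same way, again with subscripts starting at 1 and one memory location per element. The function name and the parameter names `m` and `n` (the sizes of the first two dimensions) are assumptions for illustration.

```cpp
#include <cassert>

// Column-wise order for a 3-D array: the first subscript varies fastest.
// Skip (k - 1) planes of m * n elements, (j - 1) columns of m elements,
// and then (i - 1) single elements.
int column_major_address_3d(int base, int i, int j, int k, int m, int n) {
    assert(1 <= i && i <= m && 1 <= j && j <= n && k >= 1);
    return base + m * n * (k - 1) + m * (j - 1) + (i - 1);
}
```

With BA = 900, M = 8 and N = 5, `column_major_address_3d(900, 5, 3, 6, 8, 5)` evaluates 900 + 200 + 16 + 4 = 1120.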

1.10.4 Operations on array

a) Traversing: visiting all the elements of the array.
b) Insertion: putting values into an array.
c) Deletion / Remove: deleting a value from an array.
d) Sorting: re-arranging the values of an array in a specific order (ascending or descending).
e) Searching: finding the location of a particular element in an array.

a) Traversing in Linear Array: Traversing means processing or visiting each element in the array exactly once. Let A be an array stored in the computer’s memory. If we want to display the contents of A, it has to be traversed, i.e. each element of A has to be accessed and processed exactly once.

The alternate algorithm for traversing (using for loop) is :

This program traverses each element of the following array of integers to calculate the sum, and then calculates and prints the average: (4, 3, 7, -1, 7, 2, 0, 4, 2, 13)

#include <iostream>

#define SIZE 10   // another way: const int SIZE = 10;

int main()
{
    int x[SIZE] = {4, 3, 7, -1, 7, 2, 0, 4, 2, 13};
    int i, sum = 0, LB = 0, UB = SIZE;   // LB, UB: lower and upper bounds of the loop
    float av;

    for (i = LB; i < UB; i++)            // visit each element exactly once
        sum = sum + x[i];

    av = (float)sum / SIZE;
    std::cout << "The average of the numbers = " << av << std::endl;
    return 0;
}

b) Sorting in Linear Array: Sorting an array means ordering the array elements in ascending (increasing from min to max) or descending (decreasing from max to min) order.

Bubble Sort: The technique is called “Bubble Sort” because the bigger values gradually bubble their way up to the top of the array like air bubbles rising in water, while the small values sink to the bottom of the array. The technique makes several passes through the array. On each pass, successive pairs of elements are compared. If a pair is in increasing order (or the values are identical), we leave the values as they are. If a pair is in decreasing order, their values are swapped in the array.
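The passes described above can be sketched as follows for ascending order. This is a plain version without early termination; the function name `bubble_sort` and the use of `std::vector` are choices made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Sorts the vector into ascending order with repeated passes.
void bubble_sort(std::vector<int>& a) {
    const std::size_t n = a.size();
    // After each pass the largest remaining value has reached its final
    // place at the end, so each pass can stop one position earlier.
    for (std::size_t pass = 0; pass + 1 < n; ++pass) {
        for (std::size_t i = 0; i + 1 < n - pass; ++i) {
            if (a[i] > a[i + 1]) {           // pair is in decreasing order
                std::swap(a[i], a[i + 1]);   // swap the two values
            }
            // pairs in increasing order, or equal, are left as they are
        }
    }
}
```

Applied to the array used in the traversal program, (4, 3, 7, -1, 7, 2, 0, 4, 2, 13), the function leaves the values in the order -1, 0, 2, 2, 3, 4, 4, 7, 7, 13.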

Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular tables. Many databases, small and large, consist of (or include) one-dimensional arrays whose elements are records.

Arrays are used to implement other data structures, such as heaps, hash tables, deques, queues, stacks, strings, and VLists.

One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation. Historically, this has sometimes been the only way to allocate "dynamic memory" portably.

Arrays can be used to determine partial or complete control flow in programs, as a compact alternative to (otherwise repetitive) multiple IF statements. They are known in this context as control tables and are used in conjunction with a purpose-built interpreter whose control flow is altered according to values contained in the array.

1.10.6 Sparse matrix

A matrix in which most of the entries are zero is termed a sparse matrix. Common forms are:

• Lower triangular matrix: It has non-zero entries only on or below the diagonal.

• Upper triangular matrix: It has non-zero entries only on or above the diagonal.

• Tri-diagonal matrix: It has non-zero entries only on the diagonal and at the places immediately above or below the diagonal.
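The three forms can be told apart by checking where non-zero entries are allowed. In this sketch the alias `Matrix` and the function names are assumptions, and rows and columns are numbered from 0 as usual in C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Lower triangular: every entry above the diagonal (j > i) is zero.
bool is_lower_triangular(const Matrix& a) {
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = i + 1; j < a[i].size(); ++j)
            if (a[i][j] != 0) return false;
    return true;
}

// Upper triangular: every entry below the diagonal (j < i) is zero.
bool is_upper_triangular(const Matrix& a) {
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < i && j < a[i].size(); ++j)
            if (a[i][j] != 0) return false;
    return true;
}

// Tri-diagonal: every entry more than one place from the diagonal is zero.
bool is_tridiagonal(const Matrix& a) {
    for (std::size_t i = 0; i < a.size(); ++i)
        for (std::size_t j = 0; j < a[i].size(); ++j)
            if ((j > i + 1 || i > j + 1) && a[i][j] != 0) return false;
    return true;
}
```

For example, the matrix with rows (1, 0, 0), (2, 3, 0), (4, 5, 6) is lower triangular but not tri-diagonal, because the entry 4 lies two places below the diagonal.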

1.11 STATIC AND DYNAMIC MEMORY ALLOCATION

In many programming environments, memory allocation to variables can be of two types: static memory allocation and dynamic memory allocation. The two differ in the time at which memory is allocated. In static memory allocation, memory is allocated to a variable at compile time, whereas in dynamic memory allocation, memory is allocated at the time of execution. Other differences between both memory allocation techniques are summarized below.
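The compile-time versus run-time distinction alone can be sketched in C++ as follows. The names `STATIC_SIZE`, `static_table` and `sum_of_squares` are assumptions made for illustration; in everyday C++ a `std::vector` would usually take the place of the raw `new[]` and `delete[]` shown here.

```cpp
#include <cassert>

// Static allocation: the size is a compile-time constant, and the storage
// is reserved for the whole run of the program.
const int STATIC_SIZE = 5;
int static_table[STATIC_SIZE];

// Dynamic allocation: n is known only at run time, so the storage is
// requested during execution and released when no longer needed.
long long sum_of_squares(int n) {
    assert(n >= 0);
    int* squares = new int[n];  // allocated at the time of execution
    long long sum = 0;
    for (int i = 0; i < n; ++i) {
        squares[i] = i * i;
        sum += squares[i];
    }
    delete[] squares;           // returned to the system explicitly
    return sum;
}
```

`static_table` always has exactly five elements, while `sum_of_squares(4)` and `sum_of_squares(1000)` allocate arrays of different sizes in the same program.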

1.12 LINKED LIST

A linked list, or one-way list, is a linear collection of data elements, called nodes, where the linear order is given by means of pointers. Each node is divided into two parts.

• The first part contains the information of the element.
• The second part, called the link field, contains the address of the next node in the list.

To see this more clearly, let's look at an example:

The Head is a special pointer variable which contains the address of the first node of the list. If there is no node available in the list then Head contains NULL value that means, List is empty. The left part of the each node represents the information part of the node, which may contain an