Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

The Notes for Better understanding of Database, Lecture notes of Database Management Systems (DBMS)

Handwritten notes Of Database management system

Typology: Lecture notes

2023/2024

Uploaded on 05/22/2025

mugdhesh-lad
mugdhesh-lad 🇮🇳

1 document

1 / 121

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
DATABASE ENGINEERING
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d
pf1e
pf1f
pf20
pf21
pf22
pf23
pf24
pf25
pf26
pf27
pf28
pf29
pf2a
pf2b
pf2c
pf2d
pf2e
pf2f
pf30
pf31
pf32
pf33
pf34
pf35
pf36
pf37
pf38
pf39
pf3a
pf3b
pf3c
pf3d
pf3e
pf3f
pf40
pf41
pf42
pf43
pf44
pf45
pf46
pf47
pf48
pf49
pf4a
pf4b
pf4c
pf4d
pf4e
pf4f
pf50
pf51
pf52
pf53
pf54
pf55
pf56
pf57
pf58
pf59
pf5a
pf5b
pf5c
pf5d
pf5e
pf5f
pf60
pf61
pf62
pf63
pf64

Partial preview of the text

Download The Notes for Better understanding of Database and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!

DATABASE ENGINEERING

DATABASE ENGINEERING

LECTURE NOTES

Prepared by

Dr. Subasish Mohapatra

Department of Computer Science and Application

College of Engineering and Technology, Bhubaneswar

Biju Patnaik University of Technology, Odisha

CONTENTS

LECTURE-1: Introduction to Data

LECTURE-2: DBMS

LECTURE-3: 3 level Architecture of DBMS

LECTURE-4: Elements of DBMS

LECTURE-5: ER-MODEL

LECTURE-6: ER-DIAGRAM:

LECTURE-7: Advanced ER-Diagram:

LECTURE-8: Conversion of ER-Diagram to Relational Database

LECTURE-9: Record Based Logical Model

LECTURE-10: RELATIONAL MODEL

LECTURE-11: CONSTRAINTS

LECTURE-12: FILE ORGANISATION

LECTURE-13: INDEX

LECTURE-14: Clustering Index

LECTURE-15: B+ Tree Index

LECTURE-16: Hash File Organization

LECTURE-17: Query Processing

LECTURE-18: Evaluation of Expressions

LECTURE- 19 Relational Algebra

LECTURE-20 Additional Operations

LECTURE- 21 Tuple Relational Calculus

LECTURE- 22 Structured Query Language (SQL)

LECTURE- 23 Nested Sub queries

LECTURE- 24 Integrity Constraints

LECTURE- 25 Query by Example (QBE)

LECTURE- 26 Relational Database Design

LECTURE- 27 Closure of a set of Functional Dependencies

LECTURE- 28 Loss less Decomposition

LECTURE-29 Normalization

LECTURE-30 Boyce-Code Normal Form (BCNF)

LECTURE- 31 Query Processing

LECTURE- 32 Query Optimization

LECTURE- 33 Transaction

LECTURE- 34 Problems due to locking

LECTURE- 35 Multiversion Technique Based on Timestamp Ordering

LECTURE- 36 Serializability

LECTURE- 37 Object Oriented Databases

LECTURE- 38 Parallel Database

Module-1:

LECTURE- 1 : Introduction to Data

Introduction:

In computerized information system data are the basic resource of the organization. So, proper

organization and management for data is required for organization to run smoothly. Database

management system deals the knowledge of how data stored and managed on a computerized

information system. In any organization, it requires accurate and reliable data for better decision

making, ensuring privacy of data and controlling data efficiently.

The examples include deposit and/or withdrawal from a bank, hotel, airline or railway reservation,

purchase items from supermarkets in all cases, a database is accessed.

What is data?

Data are the known facts or figures that have implicit meaning. It can also be defined as it is the

representation of facts, concepts or instructions in a formal manner, which is suitable for

understanding and processing. Data can be represented in alphabets (A-Z, a-z), digits (0-9) and

using special characters (+,-.#,$, etc)

e.g: 25, “ajit” etc.

Information:

Information is the processed data on which decisions and actions are based. Information can be

defined as the organized and classified data to provide meaningful values.

Eg: “The age of Ravi is 25”

File:

File is a collection of related data stored in secondary memory.

File Oriented Approach:

The traditional file oriented approach to information processing each application has a separate

master file and its own set of personal file. In file oriented approach the program dependent on the

files and files dependent upon the programs.

Disadvantages of file oriented approach:

  1. Data redundancy and inconsistency:

The same information may be written in several files. This redundancy leads to higher

storage and access cost. It may lead data inconsistency that is the various copies of the same

data may present at multiple places for example a changed customer address may be

reflected in single file but not else where in the system.

  1. Difficulty in accessing data :

The conventional file processing system do not allow data to be retrieved in a convenient

and efficient manner according to user choice.

  1. Data isolation :

Because data are scattered in various files and files may be in different formats with new

application programs to retrieve the appropriate data is difficult.

  1. Integrity Problems:

LECTURE- 2 : DBMS

Database:

A database is organized collection of related data of an organization stored in formatted way which

is shared by multiple users.

The main feature of data in a database are:

  1. It must be well organized
  2. It is related
  3. It is accessible in a logical order without any difficulty
  4. It is stored only once

For example consider the roll no, name, address of a student stored in a student file. It is collection

of related data with an implicit meaning. Data in the database may be persistent, integrated and

shared.

Persistent:

If data is removed from database due to some explicit request from user to remove.

Integrated:

A database can be a collection of data from different files and when any redundancy among those

files are removed from database is said to be integrated data.

Sharing Data:

The data stored in the database can be shared by multiple users simultaneously without affecting the

correctness of data.

Why Database:

In order to overcome the limitation of a file system, a new approach was required. Hence a database

approach emerged. A database is a persistent collection of logically related data. The initial attempts

were to provide a centralized collection of data. A database has a self describing nature. It contains

not only the data sharing and integration of data of an organization in a single database.

A small database can be handled manually but for a large database and having multiple users it is

difficult to maintain it. In that case a computerized database is useful.

The advantages of database system over traditional, paper based methods of record keeping are:

Compactness : No need for large amount of paper files

Speed : The machine can retrieve and modify the data more faster way then human being

Less drudgery : Much of the maintenance of files by hand is eliminated

Accuracy: Accurate, up-to-date information is fetched as per requirement of the user at any

time.

Database Management System (DBMS):

A database management system consists of collection of related data and refers to a set of programs

for defining, creation, maintenance and manipulation of a database.

Function of DBMS:

1. Defining database schema : it must give facility for defining the database structure also

specifies access rights to authorized users.

2. Manipulation of the database: The dbms must have functions like insertion of record into

database, updation of data, deletion of data, retrieval of data

3. Sharing of database: The DBMS must share data items for multiple users by maintaining

consistency of data.

4. Protection of database: It must protect the database against unauthorized users. 5. Database recovery: If for any reason the system fails DBMS must facilitate data base

recovery.

Advantages of DBMS:

Reduction of redundancies:

Centralized control of data by the DBA avoids unnecessary duplication of data and effectively

reduces the total amount of data storage required avoiding duplication in the elimination of the

inconsistencies that tend to be present in redundant data files.

Sharing of Data:

A database allows the sharing of data under its control by any number of application programs or

users.

Data Integrity:

Data integrity means that the data contained in the database is both accurate and consistent.

Therefore data values being entered for storage could be checked to ensure that they fall with in a

specified range and are of the correct format.

Data Security:

The DBA who has the ultimate responsibility for the data in the dbms can ensure that proper access

procedures are followed including proper authentication to access to the DataBase System and

additional check before permitting access to sensitive data.

Conflict Resolution:

DBA resolve the conflict on requirements of various user and applications. The DBA chooses the

best file structure and access method to get optional performance for the application.

Data Independence:

Data independence is usually considered from two points of views; physically data independence

and logical data independence.

Physical Data Independence allows changes in the physical storage devices or organization of the

files to be made without requiring changes in the conceptual view or any of the external views and

hence in the application programs using the data base.

Logical Data Independence indicates that the conceptual schema can be changed without affecting

the existing external schema or any application program.

LECTURE- 3 : 3 level Architecture of DBMS

Database Basics:

Data Item:

The data item is also called as field in data processing and is the smallest unit of data that has

meaning to its users.

Eg: “e101”, ”sumit”

Entities and attributes:

An entity is a thing or object in the real world that is distinguishable from all other objects

Eg: Bank, employee, student

Attributes are properties are properties of an entity.

Eg: Empcode, ename, rolno, name

Logical data and physical data :

Logical data are the data for the table created by user in primary memory.

Physical data refers to the data stored in the secondary memory.

Schema and sub-schema :

A schema is a logical data base description and is drawn as a chart of the types of data that are used.

It gives the names of the entities and attributes and specify the relationships between them.

A database schema includes such information as :

 Characteristics of data items such as entities and attributes.

 Logical structures and relationships among these data items.

 Format for storage representation.

 Integrity parameters such as physical authorization and back up policies.

A subschema is derived schema derived from existing schema as per the user requirement. There

may be more then one subschema create for a single conceptual schema.

Three Level Architecture of DBMS :

Internal level

External level

Conceptual

level

View

user

View

User

View

User n

Mapping supplied by DBMS

Conceptual view

Mapping supplied by DBMS/OS

Internal level

A database management system that provides three level of data is said to follow three-level

architecture.

 External level

 Conceptual level

 Internal level

External Level :

The external level is at the highest level of database abstraction. At this level, there will be many

views define for different users requirement. A view will describe only a subset of the database. Any

number of user views may exist for a given global schema(coneptual schema).

For example, each student has different view of the time table. the view of a student of BTech

(CSE) is different from the view of the student of Btech (ECE). Thus this level of abstraction is

concerned with different categories of users.

Each external view is described by means of a schema called sub schema.

Conceptual Level :

At this level of database abstraction all the database entities and the relationships among them are

included. One conceptual view represents the entire database. This conceptual view is defined by

the conceptual schema.

The conceptual schema hides the details of physical storage structures and concentrate on

describing entities, data types, relationships, user operations and constraints.

It describes all the records and relationships included in the conceptual view. There is only one

conceptual schema per database. It includes feature that specify the checks to relation data

consistency and integrity.

Internal level :

It is the lowest level of abstraction closest to the physical storage method used. It indicates how the

data will be stored and describes the data structures and access methods to be used by the database.

The internal view is expressed by internal schema.

The following aspects are considered at this level:

  1. Storage allocation e.g: B-tree, hashing
  2. Access paths eg. specification of primary and secondary keys, indexes etc
  3. Miscellaneous eg. Data compression and encryption techniques, optimization of the internal

structures.

Database Users :

Naive Users :

Users who need not be aware of the presence of the database system or any other system supporting

their usage are considered naïve users. A user of an automatic teller machine falls on this category.

LECTURE- 4 : Elements of DBMS

Elements of DBMS:

DML Pre-Compiler:

It converts DML statements embedded in an application program to normal procedure calls in the

host language. The pre-complier must interact with the query processor in order to generate the

appropriate code.

DDL Compiler:

The DDL compiler converts the data definition statements into a set of tables. These tables contains

information concerning the database and are in a form that can be used by other components of the

dbms.

File Manager:

File manager manages the allocation of space on disk storage and the data structure used to

represent information stored on disk.

Database Manager:

A database manager is a program module which provides the interface between the low level data

stored in the database and the application programs and queries submitted to the system.

The responsibilities of database manager are:

1. Interaction with File Manager : The data is stored on the disk using the file system which

is provided by operating system. The database manager translate the different DML

statements into low-level file system commands so the database manager is responsible for

the actual storing, retrieving and updating of data in the database.

2. Integrity Enforcement: The data values stored in the database must satisfy certain

constraints (eg: the age of a person can't be less then zero). These constraints are specified

by DBA. Data manager checks the constraints and if it satisfies then it stores the data in the

database.

3. Security Enforcement: Data manager checks the security measures for database from

unauthorized users.

4. Backup and Recovery: Database manager detects the failures occur due to different

causes (like disk failure, power failure, deadlock, software error) and restores the database

to original state of the database.

5. Concurrency Control: When several users access the same database file simultaneously,

there may be possibilities of data inconsistency. It is responsible of database manager to

control the problems occur for concurrent transactions.

Query Processor:

The query processor used to interpret to online user’s query and convert it into an efficient series of

operations in a form capable of being sent to the data manager for execution. The query processor

uses the data dictionary to find the details of data file and using this information it create query

plan/access plan to execute the query.

Data Dictionary:

Data dictionary is the table which contains the information about database objects. It contains

information like

  1. external, conceptual and internal database description
  2. description of entities, attributes as well as meaning of data elements
  3. synonyms, authorization and security codes
  4. database authorization

The data stored in the data dictionary is called meta data.

DBMS STRUCTURE:

Que: List four significant differences between a File-Processing System and a DBMS.

Ans: Some major differences between a database management system and a file-processing system

are:

  • Both systems contain a collection of data and a set of programs which access that data. A

database management system coordinates both the physical and the logical access to the

data, whereas a file-processing system coordinates only the physical access.

Naïve user Application

programers

On line user DBA

Application

programs

System calls Ddl compiler

Application prog

obj code

Dml precomplier Query processor Ddl compiler

Database manager

File manager

Data file

Data dictionary

DBMS

database may be able to access parts of the database for which they lack authority. For

example, a high school student could get access to national defense secret codes, or

employees could find out what their supervisors earn.

d. Data could be lost permanently, rather than at least being available in a consistent state that

existed prior to a failure.

e. Consistency constraints may be violated when intgrity constraints failed in a transaction. For

example, incorrect bank balances might be reflected due to simultaneous withdrawals and

deposits, and so on.

Que. What are five main functions of a database administrator?

Ans: Five main functions of a database administrator are:

 To create the scheme definition

 To define the storage structure and access methods

 To modify the scheme and/or physical organization when necessary

 To grant authorization for data access

 To specify integrity constraints

Que: List six major steps that you would take in setting up a database for a particular enterprise.

Ans: Six major steps in setting up a database for a particular enterprise are:

 Define the high level requirements of the enterprise (this step generates a document known

as the system requirements specification.)

 Define a model containing all appropriate types of data and data relationships.

 Define the integrity constraints on the data.

 Define the physical level.

 For each known problem to be solved on a regular basis (e.g., tasks to be carried out by

clerks or Web users) define a user interface to carry out the task, and write the necessary

application programs to implement the user interface.

 Create/initialize the database.

EXERCISE:

  1. What is database management system?
  2. What are the disadvantage of file processing system?
  3. State advantage and disadvantage of database management system.
  4. What are different types of database users?
  5. What is data dictionary and what are its contents?
  6. What are the functions of DBA?
  7. What are the different database languages? Explain with example.
  8. Explain the three layer architecture of DBMS.
  9. Differentiate between physical data independence and logical data independence.
  10. Explain the functions of database manager.
  11. Explain meta data.

LECTURE- 5 : ER-MODEL

Data Model:

The data model describes the structure of a database. It is a collection of conceptual tools for

describing data, data relationships and consistency constraints and various types of data models

such as

  1. Object based logical model
  2. Record based logical model
  3. Physical model

Types of data model:

  1. Object based logical model

a. ER-model

b. Functional model

c. Object oriented model

d. Semantic model

  1. Record based logical model

a. Hierarchical database model

b. Network model

c. Relational model

  1. Physical model

Entity Relationship Model (ER Model)

The entity-relationship data model perceives the real world as consisting of basic objects, called

entities and relationships among these objects. It was developed to facilitate database design by

allowing specification of an enterprise schema which represents the overall logical structure of a

data base.

Main Features of ER-MODEL:

 Entity relationship model is a high level conceptual model

 It allows us to describe the data involved in a real world enterprise in terms of objects and

their relationships.

 It is widely used to develop an initial design of a database

 It provides a set of useful concepts that make it convenient for a developer to move from a

basic set of information to a detailed and description of information that can be easily

implemented in a database system

 It describes data as a collection of entities, relationships and attributes.

Basic Concepts:

The E-R data model employs three basic notions : entity sets, relationship sets and attributes.

Entity Sets:

An entity is a “thing” or “object” in the real world that is distinguishable from all other objects. For

example, each person in an enterprise is an entity. An entity has a set properties and the values for

some set of properties may uniquely identify an entity. BOOK is entity and its properties (called as

attributes) bookcode, booktitle, price etc.

An entity set is a set of entities of the same type that share the same properties, or attributes. The set

cardinalities must be one of the following:

1. One to One:

An entity in A is associated with at most one entity in B, and an entity in B is associated with at

most one entity in A.

Eg: relationship between college and principal

2. One to Many:

An entity in A is associated with any number of entities in B. An entity in B is associated with at the

most one entity in A.

Eg: Relationship between department and faculty

3. Many to One:

An entity in A is associated with at most one entity in B. An entity in B is associated with any

number in A.

4. Many to Many:

Entities in A and B are associated with any number of entities from each other.

More about Entities and Relationship:

Recursive Relationships:

When the same entity type participates more than once in a relationship type in different roles, the

relationship types are called recursive relationships.

Participation Constraints:

The participation constraints specify whether the existence of any entity depends on its being

related to another entity via the relationship. There are two types of participation constraints

college principal has

Department (^) Works Faculty

in

1 M

M

emp Department

Works

M

customer account

deposits

N

a) Total : When all the entities from an entity set participate in a relationship type, is called total

participation. For example, the participation of the entity set student on the relationship set must

‘opts’ is said to be total because every student enrolled must opt for a course.

b) Partial: When it is not necessary for all the entities from an entity set to particapte in a

relationship type, it is called partial participation. For example, the participation of the entity set

student in ‘represents’ is partial, since not every student in a class is a class representative.

Weak Entity:

Entity types that do not contain any key attribute, and hence can not be identified independently are

called weak entity types. A weak entity can be identified by uniquely only by considering some of

its attributes in conjunction with the primary key attribute of another entity, which is called the

identifying owner entity.

Generally a partial key is attached to a weak entity type that is used for unique identification of

weak entities related to a particular owner type. The following restrictions must hold:

 The owner entity set and the weak entity set must participate in one to may relationship set.

This relationship set is called the identifying relationship set of the weak entity set.

 The weak entity set must have total participation in the identifying relationship.

Example:

Consider the entity type Dependent related to Employee entity, which is used to keep track of the

dependents of each employee. The attributes of Dependents are: name, birthdate, sex and

relationship. Each employee entity set is said to its own the dependent entities that are related to it.

However, not that the ‘Dependent’ entity does not exist of its own, it is dependent on the Employee

entity.

Keys:

Super Key:

A super key is a set of one or more attributes that taken collectively, allow us to identify uniquely an

entity in the entity set. For example , customer-id, (cname, customer-id), (cname, telno)

Candidate Key:

In a relation R, a candidate key for R is a subset of the set of attributes of R, which have the

following properties:

1. Uniqueness: N o two distinct tuples in R have the same values for the candidate key

  1. Irreducible: No proper subset of the candidate key has the uniqueness property that is

the candidate key. Eg: (cname,telno)

Primary Key:

The primary key is the candidate key that is chosen by the database designer as the principal means

of identifying entities within an entity set. The remaining candidate keys if any, are called Alternate

Key.