Docsity
Docsity

Prepare for your exams
Prepare for your exams

Study with the several resources on Docsity


Earn points to download
Earn points to download

Earn points by helping other students or get them with a premium plan


Guidelines and tips
Guidelines and tips

Concurrency Control in Databases: Isolation, Serializability, and Recovery - Prof. Babu, Lecture notes of Database Management Systems (DBMS)

This document about database management system it is clear notes given by my professor. you can easily understand this notes trust me.

Typology: Lecture notes

2022/2023

Available from 03/03/2023

Saiteja_000
Saiteja_000 🇮🇳

4 documents

1 / 29

Toggle sidebar

This page cannot be seen from the preview

Don't miss anything!

bg1
Prepared by Ravindar.M, Assoc.Prof, CSE. Dept, JITS-KNR 1 | P a g e
UNIT IV
Transaction Management: Transaction Concept, Transaction State, Implementation of Atomicity and
Durability, Concurrent Executions, Serializability, Recoverability, Implementation of Isolation, Testing for
serializability, Lock Based Protocols, Timestamp Based Protocols, Validation- Based Protocols, Multiple
Granularity, Recovery and Atomicity, LogBased Recovery, Recovery with Concurrent Transactions.
1. TRANSACTION
Definition: A transaction is a single logical unit consisting of one or more database access
operation.
Example: Withdrawing 1000 rupees from ATM.
Every transaction is executed as a single unit.
If the database operations do not update the database but only retrieve data, this type of
transaction is called a read-only transaction.
A successful transaction can change the database from one consistent state to another
consistent state.
DBMS transactions must satisfy ACID properties (atomic, consistent, isolated and durable).
2. ACID PROPERTIES
ACID properties are used for maintaining the integrity of database during transaction processing.
ACID stands for Atomicity, Consistency, Isolation, and Durability.
Atomicity: This property ensure that either all of the tasks of a transaction are performed or
none of them. In simple words it is referred as all or nothing rule.
Each transaction is said to be atomic if when one part of the transaction fails, the entire
transaction fails. When all parts of the transaction completed successfully, then the transaction
said to be success. (“all or nothing rule” )
Example: Transferring $100 from account A to account B.
(Assume initially, account A balance = $400 and account B balance = 700$.)
The following set of operations are performed to withdraw 1000 rupees from database
i. Read current balance from Database (Let say 5000 rupees)
ii. Deduct 1000 from current balance ( 5000 1000 = 4000) one Transaction
iii. Update current balance in Database (4000 rupees)
pf3
pf4
pf5
pf8
pf9
pfa
pfd
pfe
pff
pf12
pf13
pf14
pf15
pf16
pf17
pf18
pf19
pf1a
pf1b
pf1c
pf1d

Partial preview of the text

Download Concurrency Control in Databases: Isolation, Serializability, and Recovery - Prof. Babu and more Lecture notes Database Management Systems (DBMS) in PDF only on Docsity!

UNIT – IV

Transaction Management: Transaction Concept, Transaction State, Implementation of Atomicity and Durability, Concurrent Executions, Serializability, Recoverability, Implementation of Isolation, Testing for serializability, Lock Based Protocols, Timestamp Based Protocols, Validation- Based Protocols, Multiple Granularity, Recovery and Atomicity, Log–Based Recovery, Recovery with Concurrent Transactions.

1. TRANSACTION

Definition: A transaction is a single logical unit consisting of one or more database access operation. Example: Withdrawing 1000 rupees from ATM.

 Every transaction is executed as a single unit.  If the database operations do not update the database but only retrieve data, this type of transaction is called a read-only transaction.  A successful transaction can change the database from one consistent state to another consistent state.  DBMS transactions must satisfy ACID properties (atomic, consistent, isolated and durable).

2. ACID PROPERTIES

ACID properties are used for maintaining the integrity of database during transaction processing. ACID stands for A tomicity, C onsistency, I solation, and D urability.

 Atomicity : This property ensure that either all of the tasks of a transaction are performed or

none of them. In simple words it is referred as “ all or nothing rule ”. Each transaction is said to be atomic if when one part of the transaction fails, the entire transaction fails. When all parts of the transaction completed successfully, then the transaction said to be success. (“ all or nothing rule ” ) Example: Transferring $100 from account A to account B. (Assume initially, account A balance = $400 and account B balance = 700$.)

The following set of operations are performed to withdraw 1000 rupees from database i. Read current balance from Database (Let say 5000 rupees) ii. Deduct 1000 from current balance ( 5000 – 1000 = 4000) one Transaction iii. Update current balance in Database (4000 rupees)

Transferring $100 from account A to account B has two operations a) Debiting 100$ from A’s balance ($400 - $100 = $300) b) Crediting 100$ to B’s balance ($700+$100 = $800) Let’s say first operation (a) passed successfully while second (b) failed, in this case A’s balance would be 300$ while B would be having 700$ instead of 800$. This is unacceptable in a banking system. Either the transaction should fail without executing any of the operation or it should process both the operations. The Atomicity property ensures that.

ii. Consistency : The consistency property ensures that the database must be in consistent state

before and after the transaction. There must not be any possibility that some data is incorrectly affected by the execution of a transaction. For example, transfering funds from one account to another, the consistency property ensures that the total values of funds in both the accounts is the same before and end of the transaction. i.e., Assume initially, A balance = $400 and B balance = 700$. The total balance of A + B = 1100$ (Before transferring 100$ from A to B) The total balance of A + B = 1100$ (After transferring 100$ from A to B)

iii. Isolation : For every pair of transactions, one of the transactions should not start execution

before the other transaction execution completed, if they use some common data variable. That is, if the transaction T1 is executing and using the data item X, then transaction T2 should not start until the transaction T1 ends, if T2 also use same data item X. For example, Transaction T1 : Transfer 100$ from account A to account B Transaction T2 : Transfer 150$ from account B to account C Assume initially, A balance = B balance = C balance = $ Transaction T1 Transaction T 10:00 AM (^) Read A’s balance ($1000) Read B’s balance ($1000) 10:01 AM (^) A balance = A Balance – 100$ (1000-100 = 900$) B balance = B Balance – 150$ (1000-150 = 850$) 10:02 AM (^) Read B’s balance ($1000) Read C’s balance ($1000) 10:03 AM (^) B balance = B Balance + 100$ (1000+100 = 1100$) C balance = C Balance + 150$ (1000+150 = 1150$) 10:04 AM (^) Write A’s balance (900$) Write B’s balance ( 85 0$) 10:05AM (^) Write B’s balance (1100$) Write C’s balance (11 5 0$) 10:06 AM (^) COMMIT COMMIT

Partially Committed State:

 When the transaction executes its last statement, then the transaction is said to be in partially committed state.  Still, all the changes made by the transaction are stored in the buffer in main memory, but they are not updated in the database.  From partially committed state, a transaction can go into one of two states, a committed state or a failed state.

Committed State:

 After all the changes made by the transaction have been successfully updated in the database, it enters into a committed state and the transaction is considered to be fully committed.  After a transaction has entered the committed state, it is not possible to roll back (undo) the transaction. This is because the system is updated into a new consistent state and the changes are made permanent.  The only way to undo the changes is by carrying out another transaction called as compensating transaction that performs the reverse operations.

Failed State:

 When a transaction is getting executed in the active state or partially committed state and some failure occurs due to which it becomes impossible to continue the execution, it enters into a failed state.

Aborted State:

 After the transaction has failed and entered into a failed state, all the changes made by it have to be undone.  To undo the changes made by the transaction, it becomes necessary to roll back the transaction.  After the transaction has rolled back completely, it enters into an aborted state.

Terminated State:

 This is the last state in the life cycle of a transaction.  After entering the committed state or aborted state, the transaction finally enters into a terminated state where its life cycle finally comes to an end.

4. TYPES OF SCHEDULES – SERIALIZABILITY In DBMS, schedules may be classified as

i. Serial Schedules:

 All the transactions execute serially one after the other.  When one transaction executes, no other transaction is allowed to execute. Examples: Schedule- 1 Schedule- 2 T1 T2 T1 T Read(A) Read(A) A=A- 100 A=A+ Write(A) Write(A) Read(B) COMMIT B=B+100 Read(A) Write(B) A=A- 100 COMMIT Write(A) Read(A) Read(B) A=A+500 B=B+ Write(A) Write(B) COMMIT COMMIT In schedule 1, after T1 completes its execution, transaction T2 executes. So, schedule-1 is a Serial Schedule. Similarly, in schedule-2, after T2 completes its execution, transaction T executes. So, schedule -2 is also an example of a Serial Schedule.

ii. Non-Serial Schedules:

 In non-serial schedules, multiple transactions execute concurrently.  Operations of all/some of the transactions are inter-leaved or mixed with each other.  Some non-serial schedules may lead to inconsistency of the database and may produce wrong results.

Conflict Serializability: If a given non-serial schedule can be converted into a serial

schedule by swapping its non-conflicting operations, then it is called as a conflict serializable schedule.

Two operations are called as conflicting operations if all the following conditions hold true (1) Both the operations belong to different transactions (2) Both the operations are on the same data item (3) At least one of the two operations is a write operation Schedule – 1 Schedule – 2 Schedule - 3 Schedule - 4 T1 T2 T1 T2 T1 T2 T1 T Read(A) Read(A) Write(B) Write(B) Read(A) Write(A) Read(A) Write(B)

In Schedule -1, only rule (1) & (2) are true, but rule (3) is not holding. So, the operations are not conflict. In Schedule -2, rule (1), (2) & (3) are true. So, the operations are conflict. In Schedule -3, only rule (1) & (3) are true, but rule (2) is not holding. So, the operations are not conflict. In Schedule -4, rule (1), (2) & (3) are true. So, the operations are conflict.

Testing of Conflict Serializability: Precedence Graph is used to test the Conflict

Serializability of a schedule. The algorithm to draw precedence graph is (1) Draw a node for each transaction in Schedule S. (2) If Ta reads X value written by Tb, then draw arrow from Tb → Ta. (3) If Tb writes X value after it has been read by Ta, then draw arrow from Ta → Tb. (4) If Ta writes X after Tb writes X, then draw arrow from Tb → Ta. If the precedence graph has no cycle, then Schedule S is known as conflict serializable. If a precedence graph contains a cycle, then S is not conflict serializable.

Problem-01: Check whether the given schedule S is conflict serializable or not. S : R 1 (A) , R 2 (A) , R 1 (B) , R 2 (B) , R 3 (B) , W 1 (A) , W 2 (B) Solution: Given that S : R 1 (A) , R 2 (A) , R 1 (B) , R 2 (B) , R 3 (B) , W 1 (A) , W 2 (B). The schedule for the above operations is Schedule- 1

T1 T2 T

Read(A) Read(A) Read(B) Read(B) Read(B) Write(A) Write(B)

List all the conflicting operations and determine the dependency between the transactions

(Thumb rule to find conflict operations: For each Write(X) in Ta, make a pair with each Read(X) and Write(X) in Tb. The order is important in each pair i.e., for example, Read after Write on X or write after read on X in the given schedule. )  R 2 (A) , W 1 (A) (T 2 → T 1 )  R 1 (B) , W 2 (B) (T 1 → T 2 )  R 3 (B) , W 2 (B) (T 3 → T 2 ) Draw the precedence graph:

There exists a cycle in the above graph. Therefore, the schedule S is not conflict serializable.

Problem-02: Check whether the given schedule S is conflict serializable schedule.

Schedule – S

T1 T2 T3 T Read(X) Write(X) COMMIT Write(X) COMMIT Write(Y) Read(Z) COMMIT Read(X) Read(Y) COMMIT Solution: List all the conflicting operations to determine the dependency between transactions.

R 2 (X) , W 3 (X) (T 2 → T 3 ) W 3 (X) , W 1 (X) (T 3 → T 1 ) W 3 (X) , R 4 (X) (T 3 → T 4 ) R 2 (X) , W 1 (X) (T 2 → T 1 ) W 1 (X) , R 4 (X) (T 1 → T 4 ) W 2 (Y) , R 4 (Y) (T 2 → T 4 ) Draw the precedence graph:

There exists no cycle in the precedence graph. Therefore, the schedule S is conflict serializable.

Now let us check whether the three rules of view-equivalent satisfy or not.

Schedule- 1 (S1) Schedule- 2 (S2)

T1 T2 T1 T

Read(A) Read(A) Write(A) 2 Write(A) Read(A) Read(B) Write(A) Write(B) Read(B) 2 Read(A) Write(B) Write(A) (^2) Read(B) 2 Read(B) Write(B) Write(B) Rule 1: Initial Read First Read(A) is by T1 in S1 and in S2 also the first Read(A) is by T1 only. First Read(B) is by T1 in S1 and in S2 also the first Read(B) is by T1 only. Rule 2: Updated Read Write(A) of T1 is read by T2 in S1 and in S2 also Write(A) of T1 is read by T Write(A) of T1 is read by T2 in S1 and in S2 also Write(A) of T1 is read by T Rule 3: Final Write The final Write(A) is by T2 in S1 and in S2 also the final Write(A) is by T2 only The final Write(B) is by T2 in S1 and in S2 also the final Write(B) is by T2 only Conclusion: Hence, all the three rules are satisfied in this example, which means Schedule S and S2 are view equivalent. Also, it is proved that schedule S2 is the serial schedule of S1. Thus we can say that the S1 schedule is a view serializable schedule. Note: Other way of solving it is, if we are able to prove that S1 is conflict serializable, then S1 is also view serializable. (Refer conflict serializable problems. Every conflict serializable schedule is also view serializable but not vice-versa.)

5. IMPLEMENTATION OF ATOMICITY AND DURABILITY

The recovery-management component of a DBMS supports atomicity and durability by a variety of schemes. The simplest scheme to implement it is Shadow copy. Shadow copy: In shadow-copy scheme,  A transaction that wants to update the database first creates a complete copy of the database.  All updates are done on the new database copy, leaving the original copy, untouched.  If at any point the transaction has to be aborted, the system simply deletes the new copy. The old copy of the database has not been affected.

(^1 )

1

1

3

3 3

3

 If the transaction complete successfully, then the database system updates the pointer db- pointer to point to the new copy of the database; the new copy then becomes the original copy of the database. The old copy of the database is then deleted. Figure below depicts the scheme, showing the database state before and after the update.

Figure: Shadow copy technique for atomicity and durability

6. RECOVERABILITY

During execution, if any of the transaction in a schedule is aborted, then this may leads the database into inconsistence state. If anything goes wrong, then the completed operations in the schedule needs to be undone. Sometimes, these undone operations may not possible. The recoverability of schedule depends on undone operations.

Irrecoverable Schedule: In a schedule, if a transaction Ta performs a dirty read operation

from other transaction Tb and Ta commits before Tb then such a schedule is known as an Irrecoverable Schedule. Example: Consider the following schedule T1 T Read(A) Write(A) | | Read(A) //Dirty Read | Write(A) | COMMIT | ROLLBACK Here,  T2 performs a dirty read operation.

If a transaction reads a data value that is updated by an uncommitted transaction, then this type of read is called as a dirty read.

7. IMPLEMENTATION OF ISOLATION

Isolation determines how transactions integrity is visible to other users and systems. It means that a transaction should take place in a system in such a way that it is the only one transaction that is accessing the resources in a database system.

Isolation level defines the degree to which a transaction must be isolated from the data modifications made by any other transactions in the database system. The phenomena’s used to define levels of isolation are:

a) Dirty Read b) Non-repeatable Read c) Phantom Read

Dirty Read: If a transaction reads a data value updated by an uncommitted transaction, then

this type of read is called as dirty read.

T1 T Read(A) Write(A) | | Read(A) //Dirty Read | Write(A) | COMMIT | ROLLBACK

As T1 aborted, the results produced by T2 become wrong. This is because T2 read A (Dirty Read) which is updated by T1.

Non-Repeatable Read: Non repeatable read occurs when a transaction read same data value

twice and get a different value each time. It happens when a transaction reads once before and once after committed UPDATES from another transaction. Table in Database Read A= Write A= Read A=

First, T1 reads data item A and get A= Next, T2 writes data item A as A = 20 Last, T1 reads data item A and get A=

T1 T Read(A) Write(A) Read(A)

A

Other example for Non-repeatable read: Table: STUDENT_DATA before T2 Table: STUDENT_DATA after T A B C 100 5 10 101 5 20 102 6 30

T1: SELECT SUM(C ) FROM STUDENT_DATA WHERE B=5; Answer is (10+20) = 30 T2: UPDATE STUDENT_DATA SET C = 15 WHERE A=100; Answer, in First row C changes to 15 T1: SELECT SUM(C ) FROM STUDENT_DATA WHERE B=5; Answer is (15+20) = 35

Phantom reads : Phantom reads occurs when a transaction read same data value twice and get

a different value each time. It happens when a transaction reads once before and once after committed INSERTS and/or DELETES from another transaction. Non-repeatable read Phantom read When T1 perform second read, there is no change in no of rows in the given table

When T1 perform second read, the no of rows either increase or decrease. T2 perform UPDATE operation on the given table

T2 perform INSERT and/or DELETE operation on the given table

Example for Phantom read: Table: STUDENT_DATA before T2 Table: STUDENT_DATA after T A B C 100 5 10 101 5 20 102 6 30

T1: SELECT SUM(C ) FROM STUDENT_DATA WHERE B=5; Answer is (10+20) = 30 T2: INSERT INTO STUDENT_DATA VALUES(103, 5, 25); Answer, in First row C changes to 15 T1: SELECT SUM(C ) FROM STUDENT_DATA WHERE B=5; Answer is (10+20+25) = 55

Based on these three phenomena, SQL define four isolation levels. They are:

(1) Read uncommitted: This is the lowest level of isolation. In this level, one transaction

may read the data item modified by other transaction which is not committed. It mean dirty read is allowed. In this level, transactions are not isolated from each other.

A B C 100 5 15 101 5 20 102 6 30

A B C 100 5 10 101 5 20 102 6 30 103 5 25

9. LOCK-BASED PROTOCOL  Lock assures that one transaction should not retrieve or update a record which another transaction is updating.  For example, traffic at junction, there are signals which indicate stop and go. When one side signal is green (vehicles allowed passing), then other side signals are red (locked. Vehicles not allowed passing). Similarly, in database transaction when one transaction operations are under execution, the other transactions are locked.  If at a junction, green signal is given to more than one side, then there may be chances of accidents. Similarly, in database transactions, if the locking is not done properly, then it will display the inconsistent and corrupt data. There are two lock modes: (1). Shared Lock (2). Exclusive Lock Shared Locks are represented by S. If a transaction Ti apply shared lock on data item A , then Ti can only read A but not write into A. Shared lock is requested using lock-S instruction. Exclusive Locks are represented by X. If a transaction Ti apply exclusive lock on data item A , then Ti can read as well as write data item A. Exclusive lock is requested using lock-X instruction. Lock Compatibility Matrix:  Lock Compatibility Matrix controls whether multiple transactions can acquire locks on the same resource at the same time. Transaction Ti applied Shared Exclusive Transaction Tj request for

Shared (^) √ X Exclusive (^) X X  If a transaction Ti applied shared lock on data item A , then Tj can also be allowed to apply shared lock on A.  If a transaction Ti applied shared lock on data item A , then Tj is not allowed to apply exclusive lock on A.  If a transaction Ti applied exclusive lock on data item A , then Tj is not allowed to apply shared lock on A.  If a transaction Ti applied exclusive lock on data item A , then Tj is not allowed to apply exclusive lock on it.  Any number of transactions can hold shared locks on a data item, but if any transaction holds an exclusive lock on a data item, then other transactions are not allowed to hold any lock on that data item.

 Whenever a transaction wants to read a data item, it should apply shared lock and when a transaction wants to write it should apply exclusive lock. If the lock is not applied, then the transaction is not allowed to perform the operation.

There are four types of lock protocols available. They are:

(1) Simplistic lock protocol

 It is the simplest locking protocol.  It considers each read/write operation of a transaction as individual.  It allows transactions to perform write/read operation on a data item only after obtaining a lock on that data item.  Transactions unlock the data item immediately after completing the write/read operation.  When a transaction needs to perform many read and write operations, for each operation lock is applied before performing it and release the lock immediately after completion of the operation.

(2) Pre-claiming Lock Protocol

 In pre-claiming Lock Protocol, for each transaction a list is prepared consisting of the data items and type of lock required on each of the data item.  Before initiating an execution of the transaction, it requests DBMS to issue all the required locks as per the list.  If all the locks are granted then this protocol allows the transaction to begin. When the transaction is completed then it releases all the lock.  If all the locks are not granted then this protocol allows the transaction to rolls back and waits until all the locks are granted.

(3) Two-phase locking (2PL) protocol

 Every transaction execution starts by acquiring few locks or zero locks. During execution it acquire all other required locks one after the other.

No of locks Transaction^ Begin of^ End^ of Transaction

(4) Strict Two-phase locking (Strict-2PL) protocol

 The first phase of Strict-2PL is similar to 2PL. In the first phase, after acquiring all the locks, the transaction continues to execute normally.  The only difference between 2PL and strict 2PL is that Strict-2PL does not release a lock after using it.  Strict-2PL waits until the whole transaction to commit, and then it releases all the locks at a time.  Strict-2PL protocol does not have shrinking phase of lock release.

Strict-2PL does not have cascading abort as 2PL does.

10. TIMESTAMP BASED PROTOCOL

 A timestamp is issued to each transaction when it enters into the system.  It uses either system time or logical counter as a timestamp.  It is most commonly used concurrency protocol.  The timestamp of transaction T is denoted as TS(T).  The system order the transactions based on their arrival time. For example, let the arrival times of transactions T1, T2 and T3 be 9:00AM, 9:01AM and 9:02AM respectively. Then TS(T1) < TS(T2) < TS(T3). ( 9:00AM < 9:01AM < 9:02AM )  By using timestamp, the system prepares the serializability order. i.e., T1→T2→T  The read timestamp of data item X is denoted by R–timestamp(X).  R–timestamp(X): It is the time stamp of the youngest transaction that performed read operation on X.

X

T1: Read(X) T2: Read(X) T3: Read(X)

R-Timestamp(X) = TS(T3)

No of locks

Transactio^ Begin ofn End^ of Transaction

Growing Phase

 The write timestamp of data item X is denoted by W–timestamp(X).  W–timestamp(X): It is the time stamp of the youngest transaction that performed write operation on X.

There are mainly two Timestamp Ordering Algorithms in DBMS. They are:  Basic Timestamp Ordering  Thomas Write rule

(1). Basic Timestamp Ordering

 Check the following condition whenever a transaction Ti issues a Read (X) operation: o If W_ timestamp(X) >TS(Ti) then the operation is rejected. o If W_ timestamp(X) <= TS(Ti) then the operation is executed. (Read is not allowed by Ti, if any younger transactions than Ti write X)  Check the following condition whenever a transaction Ti issues a Write(X) operation: o If TS(Ti) < R_ timestamp(X) then the operation is rejected. (Write is not allowed by Ti, if any younger transactions than Ti read X) o If TS(Ti) < W_ timestamp(X) then the operation is rejected and Ti is rolled back otherwise the operation is executed. (Write is not allowed by Ti, if any younger transactions than Ti write X and also Ti should be rolled back and restarted later)

(2) Thomas's Write Rule

Thomas Write Rule is a timestamp-based concurrency control protocol which ignores outdated writes. It follows the following steps: (i). If R_TS(X) > TS(Ta) , then abort and rollback Ta and reject the operation.

Transaction: T

Arrival = 9:00 AMTS(T1) = 9:00 AM

Transaction: T

Arrival = 9:02 AMTS(T1) = 9:0 2 AM

Variable A

Initial A= | | | Read(A) (A=100) (^) A = 100 ( R_TS(A) = 9:02AM ) | | : Write(A) (A=200) | (A=100) A = 200 100 Reject and Rollback T

X

T1: Write(X)

T2: Write(X)

R-Timestamp(X) = TS(T2)