































Chapter 12 Distributed Database Management Systems Comprehensive Exam Study Guide Latest Updated 2024/2025
Discuss the possible data request scenarios in a distributed database environment.
1. Single request accessing a single remote database. (See Figure D12.1.)
The most primitive and least effective of the distributed database scenarios is one in which a single SQL statement (a "request" or "unit of work") is directed to a single remote DBMS. (Such a request is known as a remote request.) We suggest that you remind the student of the distinction between a request and a transaction: A request uses a single SQL statement to request data. A transaction is a collection of one or more SQL statements.
2. Multiple requests accessing a single remote database. (See Figure D12.2.)
A unit of work now consists of multiple SQL statements directed to a single remote DBMS. The local user defines the start/stop sequence of the units of work, using COMMIT, but the remote DBMS manages the unit of work's processing.
3. Multiple requests accessing multiple remote databases. (See Figure D12.3.)
A unit of work now may be composed of multiple SQL statements directed to multiple remote DBMSes. However, any one SQL statement may access only one of the remote DBMSes. As was true in the second scenario, the local user defines the start/stop sequence of the units of work, using COMMIT, but the remote DBMS to which the SQL statement was directed manages the unit of work's processing. In this scenario, a two-phase COMMIT must be used to coordinate COMMIT processing for the multiple locations.
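The coordination step mentioned above can be sketched in a few lines of Python. This is a minimal illustration, not the protocol as any particular DDBMS implements it: the `Participant` class, its `can_commit` flag, and the `two_phase_commit` function are hypothetical names, and real implementations also rely on write-ahead logging, timeouts, and recovery, all of which are omitted here.

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical stand-in for one remote DBMS taking part in a unit of work."""
    name: str
    can_commit: bool = True   # assumption: whether this site's local work succeeded
    state: str = "ACTIVE"
    log: list = field(default_factory=list)

    def prepare(self) -> bool:
        # Phase 1: the site votes. A yes vote is a promise that it can commit later.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        self.log.append(self.state)
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"
        self.log.append(self.state)

    def rollback(self) -> None:
        self.state = "ABORTED"
        self.log.append(self.state)


def two_phase_commit(participants) -> bool:
    """Return True if every site committed, False if the unit of work was aborted."""
    # Phase 1: ask every site to prepare and collect the votes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2, unanimous yes: every site makes its changes permanent.
        for p in participants:
            p.commit()
        return True
    # Phase 2, at least one no: every site undoes its changes.
    for p in participants:
        p.rollback()
    return False
```

Calling `two_phase_commit([Participant("B"), Participant("C", can_commit=False)])` leaves both sites in the `ABORTED` state, which is the point of the protocol: either all locations commit or none do.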
fragmentation? Why and how must data replication be addressed in a distributed database environment? What replication strategies are available, and how do they work? Since the current literature abounds with references to file servers and client-server architectures, what do these terms mean? How are file servers different from client/server architectures? Why would you want to know?
Many questions raised in this section are more specific -- and certainly more technical -- than the questions raised in the previous chapters. Since the chapter covers the answers to these questions in great detail, we have elected to give you section references to avoid needless duplication.
We have answered these questions in detail in the Answers to Review Questions section of this chapter. Note particularly the answers to questions 5, 6, 11, and 15-17.
1. Describe the evolution from centralized DBMSs to distributed DBMSs.
This question is answered in detail in section 12-1.
2. List and discuss some of the factors that influenced the evolution of the DDBMS.
These factors are listed and discussed in section 12-1.
3. What are the advantages of the DDBMS?
See section 12-2 and Table 12.1.
4. What are the disadvantages of the DDBMS?
See section 12-2 and Table 12.1.
5. Explain the difference between distributed database and distributed processing.
See section 12-3.
6. What is a fully distributed database management system?
See section 12-4.
7. What are the components of a DDBMS?
See section 12-5.
8. List and explain the transparency features of a DDBMS.
The figure references in the discussions refer to the figures found in the text.
9. Define and explain the different types of distribution transparency.
See section 12-8.
10. Describe the different types of database requests and transactions.
A database transaction is formed by one or more database requests. Each database request is the equivalent of a single SQL statement. The basic difference between a local transaction and a distributed transaction is that the latter can update or request data from several remote sites on a network. In a DDBMS, a database request and a database transaction can be of two types: remote or distributed.
Note: The figure references in the discussions refer to the figures found in the text. The figures are not reproduced in this manual.
A remote request accesses data located at a single remote database processor (or DP site). In other words, an SQL statement (or request) can reference data at only one remote DP site. Use Figure 12. to illustrate the remote request.
A remote transaction, composed of several requests, accesses data at only a single remote DP site. Use Figure 12.10 to illustrate the remote transaction.
As you discuss Figure 12.10, note that both tables are located at a remote DP (site B) and that the complete transaction can reference only one remote DP. Each SQL statement (or request) can reference only one (the same) remote DP at a time; the entire transaction can reference only one remote DP; and it is executed at only one remote DP.
A distributed transaction allows a transaction to reference several different local or remote DP sites. Although each single request can reference only one local or remote DP site, the complete transaction can reference multiple DP sites because each request can reference a different site. Use Figure 12.11 to illustrate the distributed transaction.
A distributed request lets us reference data from several different DP sites. Since each request can
The location and partition of the data should be transparent to the end user. Use Figure 12.12 to illustrate the distributed request.
As you discuss Figure 12.12, note that the transaction uses a single SELECT statement to reference two tables, CUSTOMER and INVOICE. The two tables are located at two different remote DP sites, B and C.
The distributed request feature also allows a single request to reference a physically partitioned table. For example, suppose that a CUSTOMER table is divided into two fragments C1 and C2, located at sites B and C respectively. The end user wants to obtain a list of all customers whose balance exceeds $250.00. Use Figure 12.13 to illustrate this distributed request.
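The fragmented CUSTOMER example can be mimicked with two in-memory SQLite databases standing in for sites B and C. This is only a sketch of the idea: the column names (`cus_num`, `cus_name`, `cus_balance`), the sample rows, and the helper functions are assumptions made for illustration, and a real DDBMS would resolve the fragments itself so that the end user writes a single query.

```python
import sqlite3


def make_site(rows):
    """Create an in-memory database holding one fragment of CUSTOMER."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer ("
        "cus_num INTEGER PRIMARY KEY, cus_name TEXT, cus_balance REAL)"
    )
    conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn


def customers_over(sites, threshold):
    """Run the same selection on every fragment and merge the partial results."""
    query = (
        "SELECT cus_num, cus_name, cus_balance "
        "FROM customer WHERE cus_balance > ?"
    )
    result = []
    for conn in sites:
        result.extend(conn.execute(query, (threshold,)).fetchall())
    return sorted(result)  # sorted by cus_num, the first column


# Fragment C1 at site B and fragment C2 at site C (sample data is invented).
site_b = make_site([(1, "Ramas", 300.0), (2, "Dunne", 100.0)])
site_c = make_site([(3, "Smith", 250.0), (4, "Olowski", 900.0)])

rows = customers_over([site_b, site_c], 250.00)
print(rows)
```

The end user sees one combined list; which site held which customer is hidden behind `customers_over`, mirroring the transparency the DDBMS is expected to provide.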
Note that full fragmentation support is provided only by a DDBMS that supports distributed requests.
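The four categories described above differ only in how many requests there are and how many DP sites each one references, so the distinction can be summarized as a small classifier. The function name `classify` and the representation of a request as a set of site names are assumptions made for this sketch; it treats every referenced site as remote and does not model purely local work.

```python
def classify(requests):
    """Classify a unit of work.

    requests: list of sets; each set holds the DP sites that one
    SQL statement (request) references.
    """
    if not requests or any(not sites for sites in requests):
        raise ValueError("each request must reference at least one DP site")

    # A single statement that touches more than one site is a distributed request.
    if any(len(sites) > 1 for sites in requests):
        return "distributed request"

    # Every statement touches one site; check whether they all touch the same one.
    all_sites = set().union(*requests)
    if len(all_sites) > 1:
        return "distributed transaction"

    return "remote request" if len(requests) == 1 else "remote transaction"
```

For example, `classify([{"B"}, {"C"}])` yields `"distributed transaction"` because each request stays at one site but the sites differ, while `classify([{"B", "C"}])` yields `"distributed request"` because a single statement spans both sites.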
11. Explain the need for the two-phase commit protocol. Then describe the two phases.
See section 12-9c.
12. What is the objective of the query optimization functions?
The objective of query optimization functions is to minimize the total costs associated with the execution of a database request. The costs associated with a request are a function of: the access time (I/O) cost involved in accessing the physical data stored on disk; the communication cost associated with the transmission of data among nodes in distributed database systems; and the CPU time cost.
It is difficult to separate communication and processing costs. Query-optimization algorithms use different parameters, and the algorithms assign different weight to each parameter. For example, some algorithms minimize total time, others minimize the communication time, and still others do not factor in the CPU time, considering it insignificant relative to the other costs. Query optimization must provide distribution and replica transparency in distributed database systems.
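The effect of weighting the three cost components differently can be shown with a toy cost model. Everything in this sketch is hypothetical: the `PlanCost` fields, the plan names, the numbers, and the simple weighted sum are illustrative assumptions, not the formula used by any particular optimizer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanCost:
    """Estimated costs of one candidate execution plan (arbitrary units)."""
    name: str
    io: float     # access time (I/O) cost
    comm: float   # communication cost among nodes
    cpu: float    # CPU time cost


def total_cost(plan, w_io=1.0, w_comm=1.0, w_cpu=1.0):
    """Weighted sum of the three cost components."""
    return w_io * plan.io + w_comm * plan.comm + w_cpu * plan.cpu


def cheapest(plans, **weights):
    """Pick the plan with the lowest weighted cost."""
    return min(plans, key=lambda p: total_cost(p, **weights))


plans = [
    PlanCost("ship_invoice_rows_to_site_B", io=20.0, comm=60.0, cpu=10.0),
    PlanCost("ship_customer_rows_to_site_C", io=30.0, comm=40.0, cpu=40.0),
]

print(cheapest(plans).name)                        # all three costs weighted equally
print(cheapest(plans, w_cpu=0.0).name)             # CPU time treated as insignificant
print(cheapest(plans, w_io=0.0, w_cpu=0.0).name)   # communication cost only
```

With equal weights the first plan wins (90 versus 110), but once CPU time is ignored or only communication is counted the second plan becomes cheaper, which is why two optimizers can choose different plans for the same request.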
13. To which transparency feature are the query optimization functions related?
Query-optimization functions are associated with the performance transparency features of a DDBMS. In a DDBMS the query-optimization routines are more complicated because the DDBMS must decide which fragment of the database to access and at which site. Data fragments are stored at several sites, and the data fragments are replicated at several sites.
14. What issues should be considered when resolving data requests in a distributed environment?
a centralized node. However, in a distributed database system data are located in multiple geographically dispersed sites connected via a network. In such cases, network latency and network partitioning impose a new level of complexity. In most highly distributed systems, designers tend to emphasize availability over data consistency and partition tolerance. This trade-off has given rise to a new type of database system in which data are basically available, soft state, and eventually consistent (BASE).
For more information about BASE systems see section 12-12.