Download BCNF Decomposition and Normalization in Database Design and more Slides Database Management Systems (DBMS) in PDF only on Docsity!
Rasmus Ejlers Møgelberg
BCNF decomposition
Introduction to Database Design 2012, Lecture 9
Rasmus Ejlers Møgelberg
Overview
• BCNF exercise example
• Decomposition to BCNF
- algorithm for lossless decomposition
• 4NF
• Normalisation example
• Start of indexing?
Rasmus Ejlers Møgelberg
Last time
- To reason about BCNF or 3NF we need to know all logically implied FDs, not just the given set
- One way to enumerate all FDs is to compute all attribute closures
- Attribute closure:
- Let α be set of attributes
- Attribute closure α+^ is also a set of attributes
- X∈ α+^ if α → X
Rasmus Ejlers Møgelberg
Example
- Consider schema^ R ( A , B), S(B , C , D,E ) with dependencies
- Compute all candidate keys
- Is^ R^ in BCNF?
- Is it in 3NF?
- (this is a typical exam exercise) B → DE B → A A → CD CE → B
Rasmus Ejlers Møgelberg
Example
- Decompose the relation
- With the functional dependencies
cd_shop(cd_id, artist, title, order_id, order_date, quantity, customer_id, name, address)
cd_id → artist, title customer_id → name, address order_id → order_date, customer_id order_id, cd_id → quantity
Rasmus Ejlers Møgelberg
Non determinancy
- Much depends on the choice of BCNF violation
- Try e.g. decomposing first using
- There is no guarantee that decomposition is dependency preserving
- (even if there is a dependency preserving decomposition)
- One heuristic is to maximise right hand sides of BCNF violations order_id → order_date, customer_id
Rasmus Ejlers Møgelberg
Correctness
- Correctness:
- Tables become smaller for every decomposition
- Every 2-attribute table is BCNF
- So in the end, the schema must be BCNF
- Every decomposition is lossless
- In fact if^ α→β^ then decomposition of R(αβγ) into (αβ) and (αγ) is always lossless (book page 346)
Rasmus Ejlers Møgelberg
Discussion
- BCNF algorithm suggests a new strategy to DB design:
- Put everything in one table
- Write up all FDs
- Apply BCNF decomposition
- This is^ not^ a good strategy
- This can lead to terrible designs
- Without ER diagram, it is also harder to extend an existing DB design
Rasmus Ejlers Møgelberg
Problem in a nutshell
- Attributes^ address^ and^ movie^ are independent and not determined by other attributes
- For every pair of tuples
- There are also tuples
- This is called a^ multivalued dependency
| name | address | movie |
| Actor name | Address 1 | Movie 1 |
| Actor name | Address 2 | Movie 2 |
| name | address | movie |
| Actor name | Address 1 | Movie 2 |
| Actor name | Address 2 | Movie 1 |
Rasmus Ejlers Møgelberg
Multivalued dependencies
- Consider a table R(αβγ)
- Definition.^ There is a multivalued dependency α↠ β if for all tuples t,u in all legal instances
- if t[α] = u[α]
- then there exists tuple s such that
- In example^ name^ ↠^ address s[α] = t[α] s[β] = t[β] s[γ] = u[γ]
Rasmus Ejlers Møgelberg
Rules for multivalued dependencies
- In R(αβγ) if^ α↠^ β^ then also^ α↠^ γ
- If^ α→β^ then also^ α↠^ β^ (can take s = u)
- Consequences
- if β⊆α then α↠ β
- (^) if α superkey then α↠ β
- It is^ not^ the case that if^ α↠^ βγ^ then^ α↠^ β
Rasmus Ejlers Møgelberg
4NF
- Definition.^ A table r(R) is in^ 4NF^ if for all multivalued dependencies α↠ β either
- β⊆α (α→β is trivial)
- or α is a superkey
- Definition.^ A schema is in^ 4NF^ if all tables are in 4NF - Theorem.^ A schema in 4NF is also BCNF
Rasmus Ejlers Møgelberg
An example
Rasmus Ejlers Møgelberg
An example
create table Cargo (ID integer PRIMARY KEY, RD varchar (3), -- Region Departure RA varchar (3), -- Region Arrival CD varchar (3),! -- Country Departure CA varchar (3),! -- Country Arrival AL varchar (3), -- Airline FNR varchar (5),! -- Flight Nummer SNR integer, -- Header Segment Number DEP varchar (3),! -- City Departure ARR varchar (3), -- City Arrival STD varchar (4),! -- Schedule Time of Departure DDC integer, -- Date Variation of Departure STA varchar (4), -- Schedule Time of Arrival ADC integer, -- Date Variation of Arrival MO integer (1),! -- Monday TU integer (1),! -- Tuesday WE integer (1),! -- Wednesday TH integer (1),! -- Thursday FR integer (1),! -- Friday SA integer (1),! -- Saturday SU integer (1),! -- Sunday ACTYPE varchar(3), -- Aircraft Type Code ACTYPEFULLNAME varchar(30), -- Aircraft Name AG varchar (1),! -- Aircraft Group Code AGFULLNAME varchar(30), -- Aircraft Group Name START_OP_1 varchar(8),!! -- Start Date of Flight Operations END_OP_1 varchar(8) -- End Date of Flight Operations );
Rasmus Ejlers Møgelberg
An example
We are told that the following functional dependencies hold: ACTYPE -> ACTYPEFULLNAME AG -> AGFULLNAME ACTYPE -> AG Task 1: Check that the cargo database respects these three functional dependencies: mysql> select actype from cargo group by actype having count(distinct actypefullname)>1; Empty set (0.03 sec) mysql> select ag from cargo group by ag having count(distinct agfullname)>1; Empty set (0.03 sec) mysql> select actype from cargo group by actype having count(distinct ag)>1; Empty set (0.03 sec)
Rasmus Ejlers Møgelberg
An example
- (Full example on blog) Taks 2: Check if ACTYPE and AG are candidate keys: Inspect the data, or look at the number of occurrences of each actype value: mysql> select actype,count() from cargo group by actype having count()>1; +--------+----------+ | actype | count() | +--------+----------+ | 100 | 781 | ... | SHP | 11 | +--------+----------+ 44 rows in set (0.03 sec) mysql> select ag,count() from cargo group by ag having count()>1; +------+----------+ | ag | count() | +------+----------+ | F | 1572 | | O | 14889 | | T | 2409 | | W | 6472 | +------+----------+ 4 rows in set (0.03 sec) Conclusion: None of them are keys, should split into separate tables to achieve BCNF. Task 3. Compute tables according to the FDs ACTYPE! ACTYPEFULLNAME, AG AG! AGFULLNAME mysql> create table aircraft as (SELECT distinct actype,actypefullname,ag from cargo); Query OK, 46 rows affected (0.15 sec) Records: 46 Duplicates: 0 Warnings: 0 create table acgroup as (select distinct ag, agfullname from cargo);