8 September 2019

Database(8) -- Design Theory

by Jerry Zhang

Closing Sets of Functional Dependencies

If we are given a set of FD’s S, then any set of FD’s equivalent to S is said to be a basis for S.

A minimal basis:

All the FD’s in B have singleton right sides.
If any FD is removed from B, the result is no longer a basis.
If for any FD in B we remove one or more attributes from the left side of F, the result is no longer a basis.

Design of Relational Database Schemas

Anomalies

Problems such as redundancy that occur when we try to cram too much into a single relation are called anomalies.

Redundancy
Update Anomalies
Deletion Anomalies

Decomposing Relations

Given a relation R(A1,A2,…,An), we may decompose R into two relations S(B1,B2,…,Bm) and T(C1,C2,…,Ck) such that:

{A1,A2,…,An} = {B1,B2,…,Bm} U {C1,C2,…,Ck}
S = π_B1,B2,…,Bm(R)
T = π_C1,C2,…,Ck(R)

Boyce-Codd Normal Form (BCNF)

A relation R is in BCNF if and only if: whenever there is a nontrivial FD A1,A2,…,An -> B1B2…Bm for R, it is the case that {A1, A2,…, An} is a superkey for R.

That is, the left side of every nontrivial FD must be a superkey.

Any two-attribute relation is in BCNF.

Decomposition into BCNF

Look for a nontrivial FD A1A2…An -> B1B2…Bm that violates BCNF; i.e., {A1A2…An} is not a superkey.

Algorithm 3.20: BCNF Decomposition Algorithm

INPUT: A relation R0 with a set of functional dependecies S0.

OUTPUT: A decomposition of R0 into a collection of relations, all of which are in BCNF.

METHOD: The following steps can be applied recursively to any relation R and set of FD’s S. Initially, apply them with R = R0 and S = S0.

Check whether R is in BCNF. If so, nothing more needs to be done. Return {R} as the answer.
If there are BCNF violations, let one be X -> Y. Use Algorithm 3.7(closure of a Set of Attributes) to compute X+. Choose R1 = X+ as one relation schema and let R2 have attributes X and those attributes of R that are not in X+. 如果某一组属性不是superkey，但它们决定了另外一组属性，即构成了不应该存在的依赖关系，则先算出这组属性的closure，作为第一张表；再把这组属性和其他所有非closure属性放在一起作为第二张表。
Use Algorithm 3.12 to compute the sets of FD’s for R1 and R2; let these be S1 and S2, respectively.
Recursively decompose R1 and R2 using this algorithm. Return the union of the results of these decompositions.

Algorithm 3.12: Projecting a Set of Functional Dependencies.

INPUT: A relation R and a second relation R1 computed by the projection R1 = π_L(R). Also, a set of FD’s S that hold in R.

OUTPUT: The set of FD’s that hold in R1.

METHOD:

Let T be the eventual output set of FD’s. Initially, T is empty.
For each set of attributes X that is a subset of the attributes of R1, compute X+. This computation is performed with respect to the set of FD’s S, and may involve attributes that are in the schema of R but not R1. Add to T all nontrivial FD’s X -> A such that A is both in X+ and an attribute of R1.
Now, T is a basis for the FD’s that hold in R1, but may not be a minimal basis. We may construct a minimal basis by modifying T as follows:

If there is an FD F in T that follows from the other FD’s in T, remove F from T.
Let Y -> B be an FD F in T, with at least two attributes in Y, and let Z be Y with one of its attributes removed. If Z -> B follows from the FD’s in T (including Y -> B), then replace Y -> B by Z -> B.
Repeat the above steps in all possible ways until no more changes to T can be made.

tags: Database

Jerry's Blog

Recording what I learned everyday