# Database system concept notes-Chapter 8: Relational database design

### Article Directory

- Chapter 8: Relational Database Design
- 8.1 Features of good relationship design
- 8.1.1 Design choices: a larger model
- 8.1.2 Design choices: smaller models
- 8.2 Atomic domains and first normal form
- 8.3 Decomposition using functional dependencies
- 8.3.1 Code and functional dependencies
- 8.3.2 Boyce-Codd Paradigm
- 8.3.3 BCNF and stay dependent
- 8.3.4 Third Normal Form
- 8.3.5 Higher paradigm
- 8.4 Functional dependency theory
- 8.4.1 Closure of functional dependency set
- 8.4.2 Closure of attribute set
- 8.4.3 Regular coverage
- 8.4.4 Non-destructive decomposition
- 8.4.5 Stay Dependent
- 8.5 Decomposition algorithm
- 8.5.1 BCNF decomposition
- 8.5.2 3NF decomposition
- 8.5.4 Comparison of BCNF and 3NF
- 8.6 Decomposition using multi-valued dependencies
- Two, PPT supplementary content
- 1. Candidate code solution theory
- 2. Judgment of lossless connection-table method
- 3. Second Normal Form (2NF)

## Chapter 8: Relational Database Design

This chapter is more difficult, I did not learn very well, some content needs to be added at the end of the term

### 8.1 Features of good relationship design

#### 8.1.1 Design choices: a larger model

In the textbook example, we consider combining the instructor table and the department table into a larger table, named inst_dept.

After synthesizing this new table, it seems to be beneficial. For example, some queries can be expressed using fewer joins.

However, the synthesized table also has drawbacks. For example, for teachers in each department, the department information (building, budget) must be repeated.

If the original two tables are used, then each budget amount will only be stored once.

This shows that the merged inst_dept table is not good, because it repeatedly stores the budget and has to assume that some users may update the budget amount of one tuple instead of all tuples, which will cause data inconsistency

There are other problems with this merged relationship, such as when a department without a teacher appears, null information will be inserted, etc.

#### 8.1.2 Design choices: smaller models

Assuming from the previous large model inst_dept table, how do we specify that it should be divided into the instructor table and the department table?

We found that in the merged large model, budget will be duplicated, and there will be data redundancy.

In other words, there is a rule. If there is a relationship between dept_name->budget functions all the way, then there is a problem with this table, due to dept_name Not the main code, so the budget will be duplicated, and duplication is redundant

Therefore, due to the existence of this functional dependency, this table needs to be split into two tables, instructor and department

**For many databases, it is much more difficult to find the correct decomposition due to a large number of attributes and multiple functional dependencies. Therefore, in order to deal with this situation, the following standard method is required**

**Note that not all decompositions are beneficial**

For example, if employee is decomposed into the following two patterns.

After decomposing into these two tables, if the employee has the same name, then the result of the natural connection between the two tables will be different from the original data, and the problem of inconsistent information will occur. , So this decomposition is undesirable. The

above decomposition is called **lossy decomposition** . Conversely, if the data after decomposition is still consistent, it is called **lossless decomposition**

### 8.2 Atomic domains and first normal form

If a domain is considered an indivisible unit, then the domain is **atomic**

If all attributes of a relational pattern R are **atomic** , then this relational pattern R belongs to the **first normal form (1NF)**

Pay attention to the correct understanding of this **atomicity,** there are specific examples in the textbook

### 8.3 Decomposition using functional dependencies

First of all, what is the purpose of learning and using functional dependencies?

Using functional dependencies, existing relationships can be decomposed. Therefore, the essential function of functional dependencies is to decompose existing redundant relationship patterns.

**About the introduction of subsequent related symbols**

#### 8.3.1 Code and functional dependencies

In the real world, data usually has various constraints, and instances that satisfy all such constraints are called **legal instances**

Constraints in the real world can be converted into **codes** in the database **(super codes, candidate codes, master codes)**

**Reviewing the definition of hypercode. The definition of **

**functional dependence.**

From another perspective, the dependency reflected by hypercode is a special case of functional dependence.

That is, functional dependence can be used to express constraints that cannot be expressed by hypercode.

Example: for the relational model inst_dept

```
inst_dept(ID,name,salary,dept_name,building,budget)
```

In this mode, the function depends on dept_name->budget to be established. The

attribute pair (ID, dept_name) constitutes a supercode of inst_dept

**Two ways to use functional dependencies**

- Determine whether the instance of the relationship satisfies the given functional dependency set F
- Explain the constraints on the set of legal relationships. If we want to consider only the relationship satisfying the functional dependency set F on R, we say that F is established on r®

**Trivial function depends on **

**closure of F set**

#### 8.3.2 Boyce-Codd Paradigm

Boyce-Codd paradigm (BCNF) can eliminate redundancy based on functional dependency discovery

The condition for judging that the relational pattern R with functional dependency set F belongs to BCNF is that one of the

two conditions is satisfied.

**Examples **

**of general rules for decomposition of relational schemas that do not belong to the BCNF schema**

#### 8.3.3 BCNF and stay dependent

In the ER diagram of the university database in Chapter 7, a student can only have one mentor. If we make a small change to the relationship model, so that a student can have multiple tutors, but a department can only have one at most

**The modified ER diagram is as follows.**

For the new ER diagram, the previous relationship mode remains unchanged, but the exported dept_advisor mode becomes

```
dept_advisor(s_ID,i_ID,dept_name)
```

Then the following two function dependencies are established on dept_advisor

- i_ID->dept_name
- s_ID,dept_name->i_ID

Note that dept_advisor does not belong to the BCNF mode. If you decompose according to the BCNF decomposition rules, you will get the following two tables

```
(s_ID,i_ID)
(i_ID,dept_name)
```

But after decomposing in this way, although the new model satisfies BCNF, it does not meet the dependencies we require

The above decomposition is not to **maintain dependence** , because we often want to maintain dependence, we need to consider a weaker paradigm than BCNF, called 3NF

#### 8.3.4 Third Normal Form

**The definition of the third paradigm **

**Note that any relational model that satisfies BCNF satisfies 3NF, so BCNF is a stricter paradigm than 3NF**

#### 8.3.5 Higher paradigm

Note that BCNF decomposition is not necessarily perfect, and there may be problems, such as the following example.

Therefore, solving these problems requires multi-value dependence and a higher paradigm.

### 8.4 Functional dependency theory

#### 8.4.1 Closure of functional dependency set

Given the functional dependency F on the pattern, we can prove that some other functional dependencies also hold on the pattern. We call these functional dependencies "logically implied" by F.

When testing the paradigm, you can’t just consider the given set of functional dependencies, you also need to consider all the functional dependencies that are established in the model.

**Definition of logical implication**

Given a relational pattern r (R ), if every instance of r (R) that satisfies F also satisfies f, then the functional dependency f on R is **logically implied** by the functional dependency set F on r

**Example: **

**Armstrong's axiom **

**calculation algorithm for F+**

#### 8.4.2 Closure of attribute set

**Conceptual example of attribute set closure**

#### 8.4.3 Regular coverage

For the functional dependency set F of a relational model, whenever a user performs an update on the relation, the database system must ensure that this update does not destroy any functional dependencies

**Irrelevant attributes**

If removing an attribute in the functional dependency does not change the closure of the attribute dependency set, the attribute is said to be **irrelevant**

Formal definition of irrelevant attributes. Definition of

**regular**

coverage. Algorithm

for calculating regular coverage. Examples of finding regular coverage.

Note that in some cases, there may be more than one regular coverage.

#### 8.4.4 Non-destructive decomposition

**Definition of Lossless Decomposition**

More generally speaking, lossless decomposition means that the result of the natural connection between the two after the decomposition is the same as before the decomposition.

```
select * from
r1 natural join r2
```

versus

```
select * from
r
```

The same result

**For binary decomposition, the method of judging lossless connection**

#### 8.4.5 Stay Dependent

Remaining dependent after decomposition is called maintaining dependent

**Keep the specific definition of dependency**

### 8.5 Decomposition algorithm

#### 8.5.1 BCNF decomposition

**1. Determine whether a relational pattern belongs to the simplified way of BCNF 2. BCNF **

**decomposition algorithm**

If a relational model does not belong to BCNF and needs to be converted to BCNF, it can be converted using the BCNF algorithm. The specific steps of the algorithm are as follows:

**BCNF decomposition example**

For a relational model

, the final decomposition result is obtained after continuous iteration of the algorithm

#### 8.5.2 3NF decomposition

Convert the model to a 3NF lossless decomposition algorithm and maintain dependence

#### 8.5.4 Comparison of BCNF and 3NF

Regarding the two paradigms of database design: 3NF and BCNF, one of the advantages of 3NF is that we can always get 3NF design under the premise of satisfying losslessness and maintaining dependence. But 3NF also has shortcomings. Sometimes it is necessary to use a null value to indicate some potentially meaningful connections between data items.

The design goal for the database on which the application function depends is

- BCNF
- Lossless
- Stay dependent

Since these three requirements cannot always be met, we have to choose between BCNF and 3NF

### 8.6 Decomposition using multi-valued dependencies

The part of multi-value dependence is not important, and it is not involved in the exam, so it is omitted

## Two, PPT supplementary content

### 1. Candidate code solution theory

Solving the candidate code of a relation is an NP problem, but there is a relatively simple method

For a given relational pattern R(U,F), its attributes can be divided into 4 categories

- Type L: Functions that only appear in F depend on the attributes on the left
- Type R: Functions that only appear in F depend on the attributes on the right
- Type N: Attributes that do not appear on either side of the functional dependency of F
- LR class: the function of F depends on attributes that appear on both sides

Examples **of sufficient conditions for quickly solving candidate codes** : Set the relational model R(U,F), U = {A, B, C, D}, F = {D->B, B->D, AD->B, AC ->D}, find the candidate code of R

A and C are attributes of type L, AC must be a member of R's candidate code, (AC)F+ = {ACDB}, so AC is the only candidate code of R

**Inference **

**Theorem 2 **

**Theorem 3 **

**Example Questions **

**Inference**

### 2. Judgment of lossless connection-table method

Use the following methods to determine whether it is a lossless connection.

Specific examples are available on the PPT.

### 3. Second Normal Form (2NF)

What kind of relationship belongs to the second normal form?