Entry
Database: Relational: Normalization: Simple: How to do normalization?
Mar 13th, 2005 10:42
Knud van Eeden,
----------------------------------------------------------------------
--- Knud van Eeden --- 25 December 2003 - 00:48 am -------------------
Database: Relational: Normalization: Simple: How to do normalization?
---
Normalization (according to a procedure given by Edgar Codd in the
original 1970 article, which showed for the first time the theory of
relational databases),
'A relational model of data for large shared data banks',
which appeared in the ACM (=Association for Computing Machinery)
magazine, Vol. 13, No. 6, pp. 377-387
http://www.acm.org/classics/nov95/toc.html
proceeds as follows.:
---
Steps: Overview: General
1. -if the current node is an endbranch (=no children anymore), stop
2. -Starting with the relation at the top of the tree,
3. -take its primary key
4. -and expand each of the immediately subordinate
relations by inserting this primary key domain or domain
combination.
5. -The primary key of each expanded relation consists of the primary
key before expansion augmented by the primary key copied down from
the parent relation.
6. -Now, strike out from the parent relation all nonsimple domains,
7. -remove the top node of the tree,
8. -repeat the same sequence of operations on each remaining
subtree.
---
---
Steps: Overview: Specific
---
employee
|
+-------------------+----------+
| |
jobhistory offspring
|
salaryhistory
---
Given: Input:
---
employee (man#, name, birthdate, jobhistory, children)
jobhistory (jobdate#, title, salaryhistory)
salaryhistory (salarydate#, salary)
offspring (offspringname#, birthyear)
---
Desired: Output:
---
employee' (man#, name, birthdate)
jobhistory' (man#, jobdate#, title)
salaryhistory' (man#, jobdate#, salarydate, salary)
offspring' (man#, offspringname#, birthyear)
---
[Internet: source: http://www.acm.org/classics/nov95/s1p4.html]
---
---
Steps: Worked out: Specific
---
Follow the following recursive steps:
---
1. -if the current node is an endbranch (=no children anymore), stop
2. -Starting with the relation at the top of the tree,
employee (man#, name, birthdate, jobhistory, children)
3. -take its primary key
man#
4. -and expand each of the immediately subordinate
relations by inserting this primary key domain or domain
combination.
1. jobhistory (man#, jobdate#, title, salaryhistory)
2. offspring (man#, offspringname#, birthyear)
5. -The primary key of each expanded relation consists of the primary
key before expansion augmented by the primary key copied down from
the parent relation.
6. -Now, strike out from the parent relation all nonsimple domains,
employee (man#, name, birthdate)
7. -remove the top node of the tree,
8. -repeat the same sequence of operations on each remaining
subtree.
---
1. -if the current node is an endbranch (=no children anymore),
stop
jobhistory has 1 child, so continue
2. -Starting with the relation at the top of the tree,
jobhistory (man#, jobdate#, title, salaryhistory)
3. -take its primary key
man#, jobdate#
4. -and expand each of the immediately subordinate
relations by inserting this primary key domain or domain
combination.
1. salaryhistory (man#, jobdate#, salarydate, salary)
5. -The primary key of each expanded relation consists of the
primary
key before expansion augmented by the primary key copied down
from
the parent relation.
6. -Now, strike out from the parent relation all nonsimple
domains,
jobhistory (man#, jobdate#, title)
7. -remove the top node of the tree,
8. -repeat the same sequence of operations on each remaining
subtree.
---
1. -if the current node is an endbranch (=no children
anymore), stop
salaryhistory (man#, jobdate#, salarydate, salary)
has no children, so stop
---
1. -if the current node is an endbranch (=no children anymore),
stop
offspring (man#, offspringname#, birthyear)
has no children, so stop
---
---
Collecting all the resulting terms gives:
employee (man#, name, birthdate)
jobhistory (man#, jobdate#, title)
salaryhistory (man#, jobdate#, salarydate, salary)
offspring (man#, offspringname#, birthyear)
which is the wanted output
Note:
The order of the rows here is not important, because each row has all
its connection information contained in it independent of its position
---
---
Algorithm
1. Start with the current node equal to the root
---
1. For the current node
---
1. If the current node has no children anymore:
1. Store that result in the output
2. Return
---
1. For first to last child of that node
1. Add the primary key of the current node to that child
2. Next child
---
2. Remove the childs of the current node
---
3. Store the result in the output
---
4. Repeat the above steps for all the childs of the current node
---
---
So a possible simplest implementation of this algorithm might be:
employee
|
+-------------------+----------+
| |
jobhistory offspring
|
salaryhistory
---
---
Steps: Overview:
1. -Store the given tree
2. -Create an output
2. -Run the algorithm
---
---
Steps: Worked out:
1. -Store the given tree in an array
INPUT
--------------------------------------------------------
1. employee# | jobhistory | offspring | -1 |
--------------------------------------------------------
2. jobhistory# | salaryhistory | -1 | |
--------------------------------------------------------
3. salaryhistory# | -1 | | |
--------------------------------------------------------
4. offspring# | -1 | | |
--------------------------------------------------------
This is one of the easiest ways to store the tree.
---
Each row contains a node,
with on the right of it all its children,
---
Followed by -1
(to indicate that there are no more childs for that node)
---
You indicate that that node has a key, by putting a '#' at the end
of its name.
---
2. -Create some output structure in an array
OUTPUT
--------------------------------------------------------
1.
--------------------------------------------------------
2.
--------------------------------------------------------
3.
--------------------------------------------------------
4.
--------------------------------------------------------
3. -Store initially all keys in the output
OUTPUT
--------------------------------------------------------
1. [employee#]
--------------------------------------------------------
2. [jobhistory#]
--------------------------------------------------------
3. [salaryhistory#]
--------------------------------------------------------
4. [offspring#]
--------------------------------------------------------
2. -Run the algorithm: in general
1. Start with the current node equal to the root
employee#
---
1. For the current node
employee#
---
1. If the current node has no children anymore:
1. Return
---
1. For first to last child of that node
1. Add all the keys of the current node to that child
(e.g. put it in front)
OUTPUT
--------------------------------------------------------
1. [employee#]
--------------------------------------------------------
2. employee# [jobhistory#]
--------------------------------------------------------
3. [salaryhistory#]
--------------------------------------------------------
4. employee# [offspring#]
--------------------------------------------------------
2. Next child
---
3. Repeat the above steps for all the childs of the current node
---
---
1. For the current node
jobhistory
---
1. If the current node has no children anymore:
1. Return
---
1. For first to last child of that node
1. Add all the keys of the current node to that child
(e.g. put it in front)
OUTPUT
--------------------------------------------------------
1. [employee#]
--------------------------------------------------------
2. employee# [jobhistory#]
--------------------------------------------------------
3. employee# jobhistory# [salaryhistory#]
--------------------------------------------------------
4. employee# [offspring#]
--------------------------------------------------------
2. Next child
---
3. Repeat the above steps for all the childs of the current node
---
---
1. For the current node
offspring
---
1. If the current node has no children anymore:
1. Return
---
---
1. For the current node
salaryhistory
---
1. If the current node has no children anymore:
1. Return
---
---
So the final result is:
OUTPUT
--------------------------------------------------------
1. [employee#]
--------------------------------------------------------
2. employee# [jobhistory#]
--------------------------------------------------------
3. employee# jobhistory# [salaryhistory#]
--------------------------------------------------------
4. employee# [offspring#]
--------------------------------------------------------
---
---
Internet: see also:
---
Database: Language: SQL: Overview: Can you give an overview of links
about SQL?
http://www.faqts.com/knowledge_base/view.phtml/aid/32811/fid/54
----------------------------------------------------------------------