faqts : Computers : Databases : Microsoft SQL Server

+ Search
Add Entry AlertManage Folder Edit Entry Add page to http://del.icio.us/
Did You Find This Entry Useful?

3 of 18 people (17%) answered Yes
Recently 1 of 10 people (10%) answered Yes

Entry

Database: Relational: Normalization: Simple: How to do normalization?

Mar 13th, 2005 10:42
Knud van Eeden,


----------------------------------------------------------------------
--- Knud van Eeden --- 25 December 2003 - 00:48 am -------------------
Database: Relational: Normalization: Simple: How to do normalization?
---
Normalization (according to a procedure given by Edgar Codd in the
original 1970 article, which showed for the first time the theory of
relational databases),
'A relational model of data for large shared data banks',
which appeared in the ACM (=Association for Computing Machinery)
magazine, Vol. 13, No. 6, pp. 377-387
http://www.acm.org/classics/nov95/toc.html
proceeds as follows.:
---
Steps: Overview: General
 1. -if the current node is an endbranch (=no children anymore), stop
 2. -Starting with the relation at the top of the tree,
 3. -take its primary key
 4. -and expand each of the immediately subordinate
     relations by inserting this primary key domain or domain
     combination.
 5. -The primary key of each expanded relation consists of the primary
     key before expansion augmented by the primary key copied down from
     the parent relation.
 6. -Now, strike out from the parent relation all nonsimple domains,
 7. -remove the top node of the tree,
 8. -repeat the same sequence of operations on each remaining
     subtree.
---
---
Steps: Overview: Specific
---
                            employee
                                |
            +-------------------+----------+
            |                              |
        jobhistory                       offspring
            |
       salaryhistory
---
Given: Input:
---
employee (man#, name, birthdate, jobhistory, children)
jobhistory (jobdate#, title, salaryhistory)
salaryhistory (salarydate#, salary)
offspring (offspringname#, birthyear)
---
Desired: Output:
---
employee' (man#, name, birthdate)
jobhistory' (man#, jobdate#, title)
salaryhistory' (man#, jobdate#, salarydate, salary)
offspring' (man#, offspringname#, birthyear)
---
[Internet: source: http://www.acm.org/classics/nov95/s1p4.html]
---
---
Steps: Worked out: Specific
---
Follow the following recursive steps:
---
 1. -if the current node is an endbranch (=no children anymore), stop
 2. -Starting with the relation at the top of the tree,
      employee (man#, name, birthdate, jobhistory, children)
 3. -take its primary key
      man#
 4. -and expand each of the immediately subordinate
     relations by inserting this primary key domain or domain
     combination.
      1. jobhistory (man#, jobdate#, title, salaryhistory)
      2. offspring (man#, offspringname#, birthyear)
 5. -The primary key of each expanded relation consists of the primary
     key before expansion augmented by the primary key copied down from
     the parent relation.
 6. -Now, strike out from the parent relation all nonsimple domains,
      employee (man#, name, birthdate)
 7. -remove the top node of the tree,
 8. -repeat the same sequence of operations on each remaining
     subtree.
     ---
     1. -if the current node is an endbranch (=no children anymore), 
stop
          jobhistory has 1 child, so continue
     2. -Starting with the relation at the top of the tree,
          jobhistory (man#, jobdate#, title, salaryhistory)
     3. -take its primary key
          man#, jobdate#
     4. -and expand each of the immediately subordinate
         relations by inserting this primary key domain or domain
         combination.
          1. salaryhistory (man#, jobdate#, salarydate, salary)
     5. -The primary key of each expanded relation consists of the 
primary
         key before expansion augmented by the primary key copied down 
from
         the parent relation.
     6. -Now, strike out from the parent relation all nonsimple 
domains,
          jobhistory (man#, jobdate#, title)
     7. -remove the top node of the tree,
     8. -repeat the same sequence of operations on each remaining
         subtree.
         ---
         1. -if the current node is an endbranch (=no children 
anymore), stop
              salaryhistory (man#, jobdate#, salarydate, salary)
              has no children, so stop
     ---
     1. -if the current node is an endbranch (=no children anymore), 
stop
          offspring (man#, offspringname#, birthyear)
          has no children, so stop
---
---
Collecting all the resulting terms gives:
      employee (man#, name, birthdate)
      jobhistory (man#, jobdate#, title)
      salaryhistory (man#, jobdate#, salarydate, salary)
      offspring (man#, offspringname#, birthyear)
which is the wanted output
Note:
The order of the rows here is not important, because each row has all
its connection information contained in it independent of its position
---
---
Algorithm
 1. Start with the current node equal to the root
  ---
  1. For the current node
   ---
   1. If the current node has no children anymore:
      1. Store that result in the output
      2. Return
   ---
   1. For first to last child of that node
    1. Add the primary key of the current node to that child
   2. Next child
   ---
  2. Remove the childs of the current node
  ---
  3. Store the result in the output
  ---
  4. Repeat the above steps for all the childs of the current node
---
---
So a possible simplest implementation of this algorithm might be:
                            employee
                                |
            +-------------------+----------+
            |                              |
        jobhistory                       offspring
            |
       salaryhistory
---
---
Steps: Overview:
 1. -Store the given tree
 2. -Create an output
 2. -Run the algorithm
---
---
Steps: Worked out:
 1. -Store the given tree in an array
    INPUT
    --------------------------------------------------------
    1. employee#      | jobhistory    | offspring | -1     |
    --------------------------------------------------------
    2. jobhistory#    | salaryhistory | -1        |        |
    --------------------------------------------------------
    3. salaryhistory# | -1            |           |        |
    --------------------------------------------------------
    4. offspring#     | -1            |           |        |
    --------------------------------------------------------
     This is one of the easiest ways to store the tree.
     ---
     Each row contains a node,
     with on the right of it all its children,
     ---
     Followed by -1
     (to indicate that there are no more childs for that node)
     ---
     You indicate that that node has a key, by putting a '#' at the end
     of its name.
     ---
  2. -Create some output structure in an array
    OUTPUT
    --------------------------------------------------------
    1.
    --------------------------------------------------------
    2.
    --------------------------------------------------------
    3.
    --------------------------------------------------------
    4.
    --------------------------------------------------------
 3. -Store initially all keys in the output
    OUTPUT
    --------------------------------------------------------
    1. [employee#]
    --------------------------------------------------------
    2. [jobhistory#]
    --------------------------------------------------------
    3. [salaryhistory#]
    --------------------------------------------------------
    4. [offspring#]
    --------------------------------------------------------
 2. -Run the algorithm: in general
      1. Start with the current node equal to the root
          employee#
       ---
       1. For the current node
          employee#
        ---
        1. If the current node has no children anymore:
           1. Return
        ---
        1. For first to last child of that node
         1. Add all the keys of the current node to that child
            (e.g. put it in front)
             OUTPUT
             --------------------------------------------------------
             1. [employee#]
             --------------------------------------------------------
             2. employee# [jobhistory#]
             --------------------------------------------------------
             3. [salaryhistory#]
             --------------------------------------------------------
             4. employee# [offspring#]
             --------------------------------------------------------
        2. Next child
        ---
       3. Repeat the above steps for all the childs of the current node
       ---
       ---
       1. For the current node
          jobhistory
        ---
        1. If the current node has no children anymore:
           1. Return
        ---
        1. For first to last child of that node
         1. Add all the keys of the current node to that child
            (e.g. put it in front)
             OUTPUT
             --------------------------------------------------------
             1. [employee#]
             --------------------------------------------------------
             2. employee# [jobhistory#]
             --------------------------------------------------------
             3. employee# jobhistory# [salaryhistory#]
             --------------------------------------------------------
             4. employee# [offspring#]
             --------------------------------------------------------
        2. Next child
        ---
       3. Repeat the above steps for all the childs of the current node
       ---
       ---
       1. For the current node
           offspring
        ---
        1. If the current node has no children anymore:
           1. Return
        ---
        ---
       1. For the current node
          salaryhistory
        ---
        1. If the current node has no children anymore:
           1. Return
---
---
So the final result is:
 OUTPUT
 --------------------------------------------------------
 1. [employee#]
 --------------------------------------------------------
 2. employee# [jobhistory#]
 --------------------------------------------------------
 3. employee# jobhistory# [salaryhistory#]
 --------------------------------------------------------
 4. employee# [offspring#]
 --------------------------------------------------------
---
---
Internet: see also:
---
Database: Language: SQL: Overview: Can you give an overview of links 
about SQL?
http://www.faqts.com/knowledge_base/view.phtml/aid/32811/fid/54
----------------------------------------------------------------------