TechView with Kamlesh khaliya

Resources for OCP, Oracle Interviews(kamleshkhaliya@gmail.com)

Wednesday, 9 May 2012

Database Normal Forms


What is Normalization?

Normalization is the process removing redundant data from database by splitting tables in to smaller and more manageable tables, based on the dependencies. Prime objective behind Normalization is atomicity, data should be stored on one place ( at atomic level). So during the life time of the database, we do not lose information or introduce inconsistencies.

There is no fool-proof algorithmic method of identifying dependency. We have to use our commonsense and judgment of specify
dependencies. Let X and Y be two attributes of a relation. Given the value of X, if there is only one value of Y corresponding to it, then Y is said to be functionally dependent on X. This is indicated by the notation:
X → Y
For example, given the value of item code, there is only one value of item name for i t Thus item name is functionally dependent on item code. This is shows as:
item code → item name
Functional dependency may also be based on a composite attribute. For example, if we write
X,Z → Y
it means that there is only one value of Y corresponding to given values of X, Z. In other words, Y is functionally dependent on the composite X, Z.
Note : In database rows are known as tupple and columns are known as attributes.

First Normal Form An entity is in First Normal Form (1NF) when it contains no repeating groups of data.

So to achieve the 1NF we
- Eliminate duplication columns from the same table.
- Create separate tables for each group of related data, define the primary key for unique identification of each record and maintain the relationship between these new tables and their predecessors through foreign key.
Second Normal Form An entity is in Second Normal Form (2NF) when it meet all the requirements of the first normal form and all of its non-key attributes must fully dependent on its primary key(there should not be any Partial Dependency ).
    • Non-key attribute: any attribute that is not part of the primary key.
    • Non-key attribute is fully dependent on the primary key known as Functional Dependency.
    • Partial Dependency : A type of functional dependency where an attribute is functionally dependent on only part of the primary key (primary key must be a composite key)
So to achieve the 2NF we identify the partial dependent columns and create separate table for those and define the primary key for unique identification of each record and maintain the relationship between these new tables and their predecessors through foreign key.
Third Normal Form An entity is in Third Normal Form (3NF) when it meets the requirement of being in Second Normal Form (2NF) and there must not be any transitive dependencies means all of its attributes must only dependent on the primary key (there should not be any dependency between non key attributes)..
Transitive dependency : A transitive dependency is an indirect functional dependency, one in which X→Z only by virtue of X→Y and Y→Z. (If two non-key attributes are functionally dependent )
To achieve 3NF we create separate table for columns which are in transit dependency, define the primary key for unique identification of each record and maintain the relationship between these new tables and their predecessors through foreign key.
Boyce-Codd Normal Form Boyce Codd Normal Form (BCNF) is a refinement of 3NF, in the case with two or more candidate keys which are composite and overlapping (that is, they have at least one field in common). If these conditions are not fulfilled, 3NF and BCNF are equivalent.
If the conditions fulfilled and If an attribute of a composite key is dependent on an attribute of the other composite key, a normalization called BCNF is needed.

Fourth Normal Form An entity is in Fourth Normal Form (4NF) when it meets the requirement of being in Third Normal Form (3NF) and additionally: if not contain two or more 1:n (one-to-many) or n:m (many-to-many) relationships that are not directly related.
Fifth Normal Form There is one more form of normalization which is sometimes applied, but it is indeed very esoteric and is in most cases probably not required to get the most functionality out of your data structure or application. It's tenet suggests:

The original table must be reconstructed from the tables into which it has been broken down.


The benefit of applying this rule ensures you have not created any extraneous columns in your tables, and that all of the table structures you have created are only as large as they need to be. It's good practice to apply this rule, but unless you're dealing with a very large data schema you probably won't need it.
I hope you have found this article useful, and are able to begin applying these rules of normalization to all of your database projects. And in case you're wondering where all of this came from, the first three rules of normalization were outlined by Dr. E.F. Codd in his 1972 paper, "Further Normalization of the Data Base Relational Model". Other rules have since been theorized by later Set Theory and Relational Algebra mathematicians.


Note : second, third, and Boyce–Codd normal forms are concerned with functional dependencies, 4NF is concerned with a more general type of dependency known as a multivalued dependency. 

0 Comments:

Post a Comment

Subscribe to Post Comments [Atom]

<< Home