Subscribe by Email

Monday, September 7, 2009

Database Normalization

Normalization is the process of efficiently organizing data in a database. There are two goals of the normalization process: eliminating redundant data (for example, storing the same data in more than one table) and ensuring data dependencies make sense (only storing related data in a table).
Database normalization can essentially be defined as the practice of optimizing table structures. Optimization is accomplished as a result of a thorough investigation of the various pieces of data that will be stored within the database, in particular concentrating upon how this data is interrelated. An analysis of this data and its corresponding relationships is advantageous because it can result both in a substantial improvement in the speed in which the tables are queried, and in decreasing the chance that the database integrity could be compromised due to tedious maintenance procedures.

Why Normalize?
- To reduce redundancy : One obvious drawback of data repetition is that it consumes more space and resources than is necessary. Redundancy introduces the possibility for error.
- Unforeseen Scalability Issues : as a database grows in size, initial design decisions will continue to play a greater role in the speed of and resources allocated to this database.

The process towards database normalization progressing through a series of steps, typically known as Normal Forms.
- First Normal Form (1NF): A relation is in 1NF if and only if all underlying domains contain scalar values only. 1NF is often refered to the atomic rule. In a database, this means that each column should only be designed to hold one and only one piece of information.
- Second Normal Form (2NF): It further addresses the concept of removing duplicative data:
* Meet all the requirements of the first normal form.
* Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
* Create relationships between these new tables and their predecessors through the use of foreign keys.
- Third Normal Form (3NF): It goes one large step further:
* Meet all the requirements of the second normal form.
* Remove columns that are not dependent upon the primary key.
- Fourth Normal Form (4NF): It has one additional requirement:
* Meet all the requirements of the third normal form.
* A relation is in 4NF if it has no multi-valued dependencies.

No comments:

Facebook activity