Normalization And Its Types
Normalization is a process used in database design to organize data efficiently. The goal is to eliminate data redundancy (repeated data) and improve data integrity by dividing large tables into smaller, related ones. This technique ensures that the data is logically stored and can be accessed or updated with minimal errors or inconsistencies.
Why is Normalization Important?
- Reduces Data Redundancy: Prevents repetition of data, saving space and minimizing the risk of errors.
- Improves Data Integrity: Ensures data remains consistent, accurate, and up-to-date.
- Faster Queries: With organized data, database queries become faster and more efficient.
- Easier Updates: It simplifies updates, deletions, and insertions, as there’s only one place for each piece of data.
Types of Normalization
Normalization is generally done in stages, each stage is called a Normal Form (NF).

Let’s explore the most common types:
1. First Normal Form (1NF)
A table is in 1NF if:
-
Each column contains atomic values (indivisible values).
-
Each column contains only one value for each row.
-
All entries in a column are of the same type.
Example:
-
If a “Phone Numbers” column has multiple phone numbers for one person, that table is not in 1NF. It should be split so that each phone number has its own row.
2. Second Normal Form (2NF)
A table is in 2NF if:
-
It is in 1NF.
-
All non-key attributes (columns) are fully dependent on the primary key.
Example:
-
Suppose a table contains information about students and their courses, but the student’s address is stored along with the course details. This violates 2NF. To correct it, we create a separate table for student addresses.
3. Third Normal Form (3NF)
A table is in 3NF if:
-
It is in 2NF.
-
No transitive dependency exists, meaning non-key attributes do not depend on other non-key attributes.
Example:
-
If a table includes the employee’s department and the department’s manager, the manager’s information depends on the department. This creates a transitive dependency. We remove this by creating a new table for departments and managers.
4. Boyce-Codd Normal Form (BCNF)
A table is in BCNF if:
-
It is in 3NF.
-
Every determinant is a candidate key (a unique identifier for each record).
Example:
-
If a table contains course codes and the name of the professor, and each professor can teach multiple courses but not all courses are taught by the same professor, then this setup might violate BCNF. You would need to adjust the design so that every determinant is unique.
5. Fourth Normal Form (4NF)
A table is in 4NF if:
-
It is in BCNF.
-
It has no multi-valued dependencies, meaning no column contains multiple independent values.
Example:
-
If a student is enrolled in both multiple sports and multiple clubs, we would separate these into different tables to avoid multi-valued dependency.
6. Fifth Normal Form (5NF)
A table is in 5NF if:
-
It is in 4NF.
-
It cannot be further decomposed without losing data integrity.
Example:
-
If a table contains a relationship between students, courses, and instructors, we may need to split the data into separate tables to ensure all possible combinations are stored without redundancy.
When to Stop Normalizing?
While normalization is crucial, it is not always necessary to go all the way to 5NF. In practice, most databases are normalized up to 3NF or BCNF because higher normal forms can lead to performance issues due to increased complexity and the need for more joins in queries.
Normal Forms Summary Table
| Normal Form | Description |
| 1NF | A relation will be in 1NF if it contains an atomic value. |
| 2NF | A relation will be in 2NF if it follows the following
• It is in 1NF |
| 3NF | A relation will be in 3NF if it follows the following
• It is in 2NF |
| BCNF | A relation will be in BCNF if it follows the following
|
| 4NF | A relation will be in 4NF if it follows the following • It is in Boyce Codd’s normal form • It has no multi-valued dependency. |
| 5NF | A relation is in 5NF if it follows the following • It is in 4NF • It does not contain any join dependency, and joining should be lossless. |
Conclusion
Normalization is a powerful technique to ensure data is organized and stored efficiently in a database. By following the normal forms, database designers can reduce redundancy, improve data integrity, and ensure faster query performance. However, it’s essential to find the right balance based on your needs and database performance considerations.