4NF in DBMS
Fourth Normal Form (4NF) is a higher level of database normalization that aims to eliminate multivalued dependencies. A table is said to be in 4NF if:
-
It is in Boyce-Codd Normal Form (BCNF).
-
It has no multivalued dependencies.
Notation of Multi-Valued Dependency
What is a Multivalued Dependency?
A multivalued dependency occurs when one attribute (or a set of attributes) in a table determines two or more independent attributes, and these independent attributes are not dependent on each other. In simpler terms, if an attribute X determines two or more attributes Y and Z, but Y and Z are independent of each other, then there is a multivalued dependency between X and Y/Z.
Multivalued dependencies can result in data redundancy, as the same information may be repeated for each combination of values.
Conditions for 4NF
For a table to be in 4NF, it must satisfy the following two conditions:
-
It must be in BCNF: The table must already satisfy the conditions of BCNF, i.e., every functional dependency must have a superkey on the left-hand side.
-
No Multivalued Dependencies: There should be no multivalued dependencies. In other words, if an attribute set X determines two or more independent attributes, the table must be decomposed to remove the multivalued dependencies.
Multivalued Dependency Example
Consider a Student-Courses-Books table, where students can enroll in multiple courses, and each course can have multiple books associated with it.
StudentID | CourseID | BookTitle |
---|---|---|
101 | C001 | Math Basics |
101 | C001 | Advanced Math |
101 | C002 | Physics 101 |
102 | C001 | Math Basics |
102 | C002 | Physics 101 |
102 | C002 | Chemistry Intro |
Functional Dependencies:
-
StudentID → CourseID (A student is enrolled in a course).
-
CourseID → BookTitle (A course has books associated with it).
This table violates 4NF because:
-
StudentID determines both CourseID and BookTitle. However, BookTitle and CourseID are independent of each other. A student might be enrolled in a course, and that course can have multiple books, but the books are not related to each other.
-
There is a multivalued dependency: StudentID → CourseID and StudentID → BookTitle, but CourseID and BookTitle are independent of each other.
Converting to 4NF
To convert the above table to 4NF, we must eliminate the multivalued dependency by decomposing the table into two separate tables:
-
Student-Courses Table: This table stores the relationship between students and the courses they are enrolled in.
StudentID CourseID 101 C001 101 C002 102 C001 102 C002 -
Student-Books Table: This table stores the relationship between students and the books associated with their courses.
StudentID BookTitle 101 Math Basics 101 Advanced Math 101 Physics 101 102 Math Basics 102 Physics 101 102 Chemistry Intro
In this decomposition:
-
Each table now stores only the relevant information, eliminating the multivalued dependency between CourseID and BookTitle.
-
The data is now stored in 4NF because both tables no longer contain multivalued dependencies.
Benefits of 4NF
-
Reduces Redundancy: By eliminating multivalued dependencies, 4NF reduces unnecessary repetition of data, which improves storage efficiency and data integrity.
-
Improves Data Integrity: With no multivalued dependencies, data anomalies (such as update anomalies) are less likely to occur.
-
Easier Data Management: Having data organized in 4NF makes it easier to manage and query, as each piece of information is stored independently.
When is a Table Not in 4NF?
A table is not in 4NF if:
-
It has multivalued dependencies (i.e., an attribute set determines two or more independent attributes).
-
The table is not in BCNF, or it violates the condition that the left-hand side of every functional dependency must be a superkey.
Conclusion
Fourth Normal Form (4NF) ensures that a table is free from multivalued dependencies, which eliminates unnecessary redundancy and improves the integrity and efficiency of the database design. To achieve 4NF:
-
The table must first be in BCNF.
-
It must eliminate any multivalued dependencies by decomposing the table into smaller relations.
This process helps create a more efficient, consistent, and scalable database schema.