Closure In DBMS

In Database Management Systems (DBMS), closure refers to the process of determining all the functional dependencies that can be derived from a given set of functional dependencies (FDs). Specifically, attribute closure is used to find all attributes that can be functionally determined by a given set of attributes in a relational database.

Closure is an important concept when working with normalization and is particularly useful for finding candidate keys and understanding how different attributes relate to each other.

What is Attribute Closure?

Attribute closure is the set of all attributes that can be functionally determined by a particular attribute or set of attributes in a relational schema. The attribute closure helps us understand how one or more attributes are linked to others through functional dependencies.

Notation:
The closure of a set of attributes X is denoted as X⁺.

For example, if you have an attribute set X = {A, B}, then X⁺ represents all attributes that can be derived from A and B based on the functional dependencies in the database.

How to Find Attribute Closure?

To find the closure of a set of attributes X, follow these steps:

  1. Start with the set of attributes: Begin with the given set of attributes X.

  2. Apply functional dependencies: Use the functional dependencies (FDs) in the schema to determine other attributes that can be functionally determined by X.

  3. Repeat until no more attributes can be added: Continue applying the functional dependencies until no more attributes can be derived.

  4. Final closure: The result is the set of attributes that are functionally determined by X.

Example 1: Closure of Attributes in a Relation

Let’s consider a relation R with the following attributes: A, B, C, D, E.

Functional Dependencies (FDs):

  1. A → B

  2. B → C

  3. A → D

  4. D → E

Now, let’s find the closure of {A} (A⁺).

Steps:

  • Start with A⁺ = {A} (the closure initially contains just A).

  • Apply A → B (from FD 1): Add B to the closure.

    • A⁺ = {A, B}

  • Apply B → C (from FD 2): Add C to the closure.

    • A⁺ = {A, B, C}

  • Apply A → D (from FD 3): Add D to the closure.

    • A⁺ = {A, B, C, D}

  • Apply D → E (from FD 4): Add E to the closure.

    • A⁺ = {A, B, C, D, E}

The closure of A is {A, B, C, D, E}. This means that A functionally determines all attributes in the relation.

Example 2: Closure of Attributes in a Relation

Let’s take a relation R with attributes A, B, C, D, and the following functional dependencies:

  • R = {A, B, C, D}

  • FD = {A → B, B → C, C → D}

We need to find the closure of all the attributes that appear on the left-hand side of the FDs.

Closure of Attribute “A”

  • A determines A itself (recursive rule).

  • A → B (direct functional dependency).

  • Since B is determined, B → C (transitive dependency).

  • Since C is determined, C → D (transitive dependency).

Closure of A = A⁺ = {A, B, C, D}

Thus, the closure of A contains all the attributes of the relation, making A the candidate key.

Closure of Attribute “B”

  • B determines B itself (recursive rule).

  • B → C (direct functional dependency).

  • Since C is determined, C → D (transitive dependency).

  • B cannot determine A.

Closure of B = B⁺ = {B, C, D}

Since B does not determine A, it cannot be a candidate key.

Closure of Attribute “C”

  • C determines C itself (recursive rule).

  • C → D (direct functional dependency).

  • C cannot determine A or B.

Closure of C = C⁺ = {C, D}

Since C does not determine all the attributes, it is not a candidate key.

Closure of Attribute “D”

  • D determines D itself (recursive rule).

  • D cannot determine A, B, or C.

Closure of D = D⁺ = {D}

Since D does not determine all the attributes, it is not a candidate key.

Conclusion

The closure of A determines all attributes, making A the candidate key. The other attributes B, C, and D are not candidate keys.

Example 3: Another Example of Attribute Closure

Let’s consider a different relation R with attributes A, B, C, D, and the following functional dependencies:

  • R = {A, B, C, D}

  • FD = {A → B, B → C, C → D, D → A}

Now we will find the closure of all the attributes.

Closure of Attribute “A”

  • A determines A itself (recursive rule).

  • A → B (direct functional dependency).

  • B → C (transitive dependency).

  • C → D (transitive dependency).

  • D → A (transitive dependency), which brings us back to A.

Closure of A = A⁺ = {A, B, C, D}

The closure of A determines all attributes, so A is a candidate key.

Closure of Attribute “B”

  • B determines B itself (recursive rule).

  • B → C (direct functional dependency).

  • C → D (transitive dependency).

  • D → A (transitive dependency).

  • A → B (transitive dependency), bringing us back to B.

Closure of B = B⁺ = {A, B, C, D}

The closure of B determines all attributes, so B is also a candidate key.

Closure of Attribute “C”

  • C determines C itself (recursive rule).

  • C → D (direct functional dependency).

  • D → A (transitive dependency).

  • A → B (transitive dependency).

Closure of C = C⁺ = {A, B, C, D}

The closure of C determines all attributes, so C is a candidate key.

Closure of Attribute “D”

  • D determines D itself (recursive rule).

  • D → A (direct functional dependency).

  • A → B (transitive dependency).

  • B → C (transitive dependency).

Closure of D = D⁺ = {A, B, C, D}

The closure of D determines all attributes, so D is a candidate key.

Conclusion

In this case, all attributes (A, B, C, D) can be candidate keys because the closure of each attribute set determines all attributes in the relation.

Example 4: More Complex Example

Consider a relation R = {A, B, C, D, E, F, G} with the following functional dependencies:

  • A → BC

  • BC → DE

  • D → F

  • CF → G

Now we will find the closure of several attributes and attribute sets.

Closure of Attribute “A”

  • A determines A itself.

  • A → BC: Therefore, B and C are determined.

  • Using BC → DE, D and E are also determined.

  • Using D → F, F is determined.

  • Using CF → G, G is determined.

Closure of A = A⁺ = {A, B, C, D, E, F, G}

Thus, the closure of A determines all attributes in the relation, making A the candidate key.

Closure of Attribute “D”

  • D determines D itself.

  • Using D → F, F is determined.

  • D cannot determine A, B, C, E, or G.

Closure of D = D⁺ = {D, F}

Since D does not determine all attributes, it is not a candidate key.

Closure of Attribute Set {B, C}

  • B and C determine B and C themselves.

  • Using BC → DE, D and E are determined.

  • Using D → F, F is determined.

  • Using CF → G, G is determined.

Closure of {B, C} = {B, C, D, E, F, G}

Since {B, C} cannot determine A, it is not a candidate key.

Conclusion

The only candidate key in this example is A, as its closure determines all attributes.

Closure and Candidate Keys

Candidate Key is a minimal set of attributes that can functionally determine all attributes in a relation. To find candidate keys using closure:

  1. Check closures of different combinations of attributes: The set of attributes that determines all other attributes is a candidate key.

  2. Minimize the set: Ensure that no attribute in the set can be removed while still determining all attributes.

For example, if we want to find a candidate key for the relation R with attributes A, B, C, D, E, we can check the closure of different attribute sets, like {A}, {A, B}, etc., until we find the minimal set that determines all attributes.

Applications of Closure in DBMS

  1. Normalization: Closure is used to determine candidate keys and remove redundant data during the normalization process. By finding attribute closures, you can simplify the schema and avoid anomalies.

  2. Finding Candidate Keys: By checking which sets of attributes determine all other attributes, you can find the candidate keys for a relation.

  3. Checking Data Consistency: Attribute closure helps in ensuring data consistency by showing the relationships between attributes and how they can be derived.

  4. Query Optimization: In query optimization, closures can help identify which attributes need to be considered when forming query plans and reduce unnecessary data fetching.

Example of Finding Candidate Key Using Closure

Let’s consider a relation R with attributes A, B, C, D, E, and the following functional dependencies:

  1. A → B

  2. B → C

  3. A → D

  4. D → E

To find a candidate key:

  • Check the closure of A:
    A⁺ = {A, B, C, D, E}. This closure includes all attributes in the relation, so A is a candidate key.

  • Check the closure of other combinations, like {B}, {C}, {D}, etc., to ensure there is no smaller set that can determine all attributes.

In this case, A is the only minimal set that determines all attributes, making it the candidate key.

Short-cut to find the candidate’s key

The smart trick is to find all possible candidate keys in the relation.  Suppose the following relation

R(A,B,C,D,E)

FD= {A→B, BC→D, E→C, D→A}

1.    If any attributes do not exist on the right side of FD, then they must be part of the candidate key with some other attribute as well.

Check in the above-given relation that every attribute is determined from some attributes, but attribute E is not determined by any attribute. So, attribute “E” must be used in the candidate Key with some other Key to determine E itself.

2.  As Closure of EA+ can determines the all attributes of Relation, EA is the first candidate key. To make the rest of the candidate keys, follow the following

I.  Check either E or A attributes present on the right-hand side of any FD. If E or A is found on the Right side of the FD, then replace it with the Left side of that FD.

For the second candidate, key A is replaced with D, and E does not exist on the right side of any FD, so it remains the same. Therefore, the second candidate key will be ED.

II. In the same way, now D will check in all FD’s. If D is found on the Right-hand side of any FD, then it can also be replaced. Keep in mind the concept of decomposition if there are two attributes that determine the single attribute. 

In this case, D will be replaced either with B or C. Here, we check the closer of EB and EC and select one of them. So EB is the valid candidate key but EC cannot determines the entire attributes so

Candidate keys will be  (EA, ED, BE}

Conclusion

Closure in DBMS is a fundamental concept that helps in determining the relationships between attributes, finding candidate keys, and simplifying the schema. By using attribute closure and functional dependency closure, DBMS can optimize queries, ensure data integrity, and maintain consistency in the relational database model. Understanding how to compute closure is essential for tasks like normalization and designing efficient database schemas.