Candidate Key in DBMS
In Database Management Systems (DBMS), a candidate key is a set of one or more attributes (columns) that can uniquely identify a record (row) in a table. A table can have multiple candidate keys, but only one of them is chosen as the primary key.
Key Points about Candidate Key:
-
Uniqueness: Every candidate key must be able to uniquely identify each row in the table.
-
No Redundancy: A candidate key is a minimal superkey with only essential attributes for uniqueness.
-
Multiple Candidate Keys: A table can have more than one candidate key. The user chooses one of them to be the primary key. The others are called alternate keys.
| How to Find the Candidate Key in DBMS?
Answer: We can find all possible candidate keys by using the closure method, which is applied on the given Relational table and functional dependencies. |
An algorithm to find Candidate Keys in DBMS
Here is an algorithm that will calculate all possible candidate keys from the given relation
Step 01: Find the initial super key
Take all attributes of relation (R) and name it the first super key as (SK_1) because it can determine itself through the reflexive property.
Example:
Step 02: Minimize SK_1
Now check all Fd’s, and discard all those attributes (one by one) from SK_1, which are directly or indirectly dependent on some other attributes. Continue until no more attributes can be removed. We get a minimal super key named (SK_2)
Example:
Step 03: Verification of SK_2 for candidate key
First, take the proper subset of SK_2, and then take the closure of every proper subset
- If no proper subset determines all the attributes of relation (R), then SK_2 is a first candidate key (CK_1)
- Otherwise, the subset that determines all attributes of relation (R) becomes the first candidate key (CK_1).
Example
Step 04: Check Prime attributes in CK_1
Note down all the attributes of CK_1, which will be the prime attributes because they are participating in a candidate key.
Example
Step 05: Check for More Candidate Keys Possibilities
Check all functional dependencies (FD’s) again. If a prime attribute appears on the RHS of an FD, replace that prime attribute in the candidate key with the attributes on the LHS of that FD. In this way, a new set of attribute/attributes is generated.
- If the closure of this new set determines the relation (R), then it will be a new super key (SK_3); otherwise, discard it
- Now apply step 3 to SK_3 to generate more candidate keys, i.e., CK_2.
Now, repeat step-4, step 05 (includes step 3) again and again until no new set for the super key are remains.
Example:
Step 7: Repeat Until No New Keys Are Found
Repeat Steps 4–6 for every newly discovered candidate key.
Continue generating, testing, and minimizing new super keys until no additional candidate keys can be produced.
Examples of Candidate Keys in DBMS
Consider a Student table with the following columns:
| Student_ID | Phone | Name | |
|---|---|---|---|
| 1 | john@email.com | 1234567890 | John |
| 2 | jane@email.com | 9876543210 | Jane |
-
Student_ID: Can uniquely identify each student.
-
Email: Can also uniquely identify each student.
-
Phone: Similarly, it can uniquely identify each student.
In this case, Student_ID, Email, and Phone are all candidate keys because each of them can uniquely identify a student. However, one of them would be chosen as the primary key.
Difference Between Candidate Key and Primary Key
-
Candidate Key: A set of attributes that can uniquely identify a record in a table. There can be multiple candidate keys.
-
Primary Key: A candidate key chosen to uniquely identify records in a table. It cannot contain null values.
The primary key is selected from one of the candidate keys, and it’s used to maintain uniqueness within a table. While a table can have multiple candidate keys, only one is designated as the primary key.