DBMS - Super Keys and Candidate Keys

Quiz

Keys in a database are used to uniquely identifying records and maintaining data integrity. Read this chapter to get a good understanding of Super Keys and Candidate Keys and how they work.

A Quick Recap of Keys in DBMS

A key in a database is an attribute or a group of attributes that uniquely identifies every record in a table. For instance, imagine a table with the following columns −

CID (College ID)
UID (University ID)
Name
Marks

Here, each row represents a student. We use keys to uniquely identify a specific student's record. Without a key, we can get several issues like retrieving duplicate records or failing to pinpoint a specific entry etc.

Super Keys: The Superset of Keys

As its name suggests, a Super Key is the super set of keys. A Super Key is any combination of attributes in a table that can uniquely identify a record. It does not matter if the combination includes unnecessary attributes. As long as the group can uniquely identify a row, it is a Super Key.

Following are the important characteristics of a super key −

It can include one or more attributes.
The group of attributes may include unnecessary ones.
Every combination must ensure unique identification.

For example, in our Student table, CID, UID, and Name can all be super keys individually because they are capable of identifying records. The combinations like (CID + UID), (UID + Marks), or even (CID + UID + Name) are also super keys because they still uniquely identify rows.

The Number of Super Keys

The total number of super keys in a table depends on the attributes. For a table with four attributes as given in our example, there can be multiple super keys. These can include all four attributes together, subsets of three, subsets of two, or even individual attributes. They are provided to uniquely identify rows.

Importance of Super Keys

Super keys are important because they form the base for identifying records. However, not all super keys are practical or can be usable. Many super keys include redundant attributes. They do not add value to the identification process. This brings us to the next type of key, known as Candidate Keys.

Candidate Keys: The Minimal Super Keys

A Candidate Key is a minimal subset of a Super Key. In simple terms, it is the smallest possible group of attributes that can uniquely identify rows. If we remove even one attribute from a candidate key, it can no longer serve as a unique identifier.

Following are the important characteristics of a candidate key −

It is a super key, but without redundant attributes.
Every table can have one or more candidate keys.
Candidate keys are potential candidates for becoming a Primary Key.

Example of Candidate Keys

Let us see the Student table again. We have multiple Super Keys −

CID
UID
CID + UID
UID + Name

Out of these, CID and UID are the Candidate Keys because −

They can uniquely identify rows.
They don't include any unnecessary attributes.

On the other hand, combinations like (CID + UID) are not Candidate Keys because the extra attribute does not add any value to the identification process.

Why the Term "Candidate" is used?

The term "candidate" is used to signify the fact that these keys are "candidates" for being selected as the primary key of the table.

How Do We Decide a Candidate Key?

To determine Candidate Keys, we must follow a set of steps as mentioned below −

List All Super Keys − Start by identifying all combinations of attributes that can uniquely identify rows.
Eliminate Redundancy − Remove combinations where unnecessary attributes are included.
Verify Minimality − Ensure that each candidate key is minimal. So by removing even one attribute should make it invalid.

Example − In our Student table, CID and UID are standalone attributes that can uniquely identify rows. These are Candidate Keys. Now, the combinations like (CID + UID) or (UID + Marks) are Super Keys but not Candidate Keys because they include unnecessary attributes.

Candidate Keys and Primary Keys

Once we have identified the Candidate Keys, we need to choose one to serve as the Primary Key. This decision is typically based on the scope and practicality of the keys.

How to Choose a Primary Key?

Let us suppose both the CID and UID are Candidate Keys in our Student table. So, the question arises, which one should be chosen as the Primary Key? It depends on the context −

CID is limited to a single college. It can only uniquely identify students within one institution.
UID, however, is universal. It can identify students across multiple colleges.

Given this, UID is a better choice for a primary key because its scope is broader and more meaningful.

Example − A Super Key is like a shopping list that includes everything you might need. Some items are essential, while others are extra. A Candidate Key, on the other hand, is the refined list; it contains only the essential items that we absolutely need to get the job done. The Primary Key is the one item that we decide is the most critical to buy first.

Differences between Super Keys and Candidate Keys

To summarize, here is a quick comparison between Super Keys and Candidate Keys −

Feature	Super Key	Candidate Key
Definition	Any attribute or set of attributes that uniquely identifies a row	Minimal subset of super keys
Redundancy	Can include unnecessary attributes	No unnecessary attributes
Count	Generally higher	Fewer than the number of super keys
Example	CID + UID, UID + Name	CID, UID

Conclusion

In this chapter, we explained in detail the concepts of Super Keys and Candidate Keys in databases. Starting with a brief recap of how keys are important for uniquely identifying rows, we covered Super Keys, learning how they can include any combination of attributes, even redundant ones.

After Super Keys, we moved on to Candidate Keys which are the minimal and practical subsets of Super Keys. We also covered how to identify Candidate Keys, compared them with Super Keys, and saw how Candidate Keys can be used to decide the Primary Key of a table.

Print Page