
- DBMS - Home
- DBMS - Overview
- DBMS - Architecture
- DBMS - Data Models
- DBMS - Data Schemas
- DBMS - Data Independence
- DBMS - System Environment
- Centralized and Client/Server Architecture
- DBMS - Classification
- Relational Model
- DBMS - Codd's Rules
- DBMS - Relational Data Model
- DBMS - Relational Model Constraints
- DBMS - Relational Database Schemas
- DBMS - Handling Constraint Violations
- Entity Relationship Model
- DBMS - ER Model Basic Concepts
- DBMS - ER Diagram Representation
- Relationship Types and Relationship Sets
- DBMS - Weak Entity Types
- DBMS - Generalization, Aggregation
- DBMS - Drawing an ER Diagram
- DBMS - Enhanced ER Model
- Subclass, Superclass and Inheritance in EER
- Specialization and Generalization in Extended ER Model
- Data Abstraction and Knowledge Representation
- Relational Algebra
- DBMS - Relational Algebra
- Unary Relational Operation
- Set Theory Operations
- DBMS - Database Joins
- DBMS - Division Operation
- DBMS - ER to Relational Model
- Examples of Query in Relational Algebra
- Relational Calculus
- Tuple Relational Calculus
- Domain Relational Calculus
- Relational Database Design
- DBMS - Functional Dependency
- DBMS - Inference Rules
- DBMS - Minimal Cover
- Equivalence of Functional Dependency
- Finding Attribute Closure and Candidate Keys
- Relational Database Design
- DBMS - Keys
- Super keys and candidate keys
- DBMS - Foreign Key
- Finding Candidate Keys
- Normalization in Database Designing
- Database Normalization
- First Normal Form
- Second Normal Form
- Third Normal Form
- Boyce Codd Normal Form
- Difference Between 4NF and 5NF
- Structured Query Language
- Types of Languages in SQL
- Querying in SQL
- CRUD Operations in SQL
- Aggregation Function in SQL
- Join and Subquery in SQL
- Views in SQL
- Trigger and Schema Modification
- Storage and File Structure
- DBMS - Storage System
- DBMS - File Structure
- DBMS - Secondary Storage Devices
- DBMS - Buffer and Disk Blocks
- DBMS - Placing File Records on Disk
- DBMS - Ordered and Unordered Records
- Indexing and Hashing
- DBMS - Indexing
- DBMS - Single-Level Ordered Indexing
- DBMS - Multi-level Indexing
- Dynamic B- Tree and B+ Tree
- DBMS - Hashing
- Transaction and Concurrency
- DBMS - Transaction
- DBMS - Concurrency Control
- DBMS - Deadlock
- Backup and Recovery
- DBMS - Data Backup
- DBMS - Data Recovery
- DBMS Useful Resources
- DBMS - Quick Guide
- DBMS - Useful Resources
- DBMS - Discussion
DBMS - Super Keys and Candidate Keys
Keys in a database are used to uniquely identifying records and maintaining data integrity. Read this chapter to get a good understanding of Super Keys and Candidate Keys and how they work.
A Quick Recap of Keys in DBMS
A key in a database is an attribute or a group of attributes that uniquely identifies every record in a table. For instance, imagine a table with the following columns −
- CID (College ID)
- UID (University ID)
- Name
- Marks
Here, each row represents a student. We use keys to uniquely identify a specific student's record. Without a key, we can get several issues like retrieving duplicate records or failing to pinpoint a specific entry etc.
Super Keys: The Superset of Keys
As its name suggests, a Super Key is the super set of keys. A Super Key is any combination of attributes in a table that can uniquely identify a record. It does not matter if the combination includes unnecessary attributes. As long as the group can uniquely identify a row, it is a Super Key.
Following are the important characteristics of a super key −
- It can include one or more attributes.
- The group of attributes may include unnecessary ones.
- Every combination must ensure unique identification.
For example, in our Student table, CID, UID, and Name can all be super keys individually because they are capable of identifying records. The combinations like (CID + UID), (UID + Marks), or even (CID + UID + Name) are also super keys because they still uniquely identify rows.
The Number of Super Keys
The total number of super keys in a table depends on the attributes. For a table with four attributes as given in our example, there can be multiple super keys. These can include all four attributes together, subsets of three, subsets of two, or even individual attributes. They are provided to uniquely identify rows.
Importance of Super Keys
Super keys are important because they form the base for identifying records. However, not all super keys are practical or can be usable. Many super keys include redundant attributes. They do not add value to the identification process. This brings us to the next type of key, known as Candidate Keys.
Candidate Keys: The Minimal Super Keys
A Candidate Key is a minimal subset of a Super Key. In simple terms, it is the smallest possible group of attributes that can uniquely identify rows. If we remove even one attribute from a candidate key, it can no longer serve as a unique identifier.
Following are the important characteristics of a candidate key −
- It is a super key, but without redundant attributes.
- Every table can have one or more candidate keys.
- Candidate keys are potential candidates for becoming a Primary Key.
Example of Candidate Keys
Let us see the Student table again. We have multiple Super Keys −
- CID
- UID
- CID + UID
- UID + Name
Out of these, CID and UID are the Candidate Keys because −
- They can uniquely identify rows.
- They don't include any unnecessary attributes.
On the other hand, combinations like (CID + UID) are not Candidate Keys because the extra attribute does not add any value to the identification process.
Why the Term "Candidate" is used?
The term "candidate" is used to signify the fact that these keys are "candidates" for being selected as the primary key of the table.
How Do We Decide a Candidate Key?
To determine Candidate Keys, we must follow a set of steps as mentioned below −
- List All Super Keys − Start by identifying all combinations of attributes that can uniquely identify rows.
- Eliminate Redundancy − Remove combinations where unnecessary attributes are included.
- Verify Minimality − Ensure that each candidate key is minimal. So by removing even one attribute should make it invalid.
Example − In our Student table, CID and UID are standalone attributes that can uniquely identify rows. These are Candidate Keys. Now, the combinations like (CID + UID) or (UID + Marks) are Super Keys but not Candidate Keys because they include unnecessary attributes.
Candidate Keys and Primary Keys
Once we have identified the Candidate Keys, we need to choose one to serve as the Primary Key. This decision is typically based on the scope and practicality of the keys.
How to Choose a Primary Key?
Let us suppose both the CID and UID are Candidate Keys in our Student table. So, the question arises, which one should be chosen as the Primary Key? It depends on the context −
- CID is limited to a single college. It can only uniquely identify students within one institution.
- UID, however, is universal. It can identify students across multiple colleges.
Given this, UID is a better choice for a primary key because its scope is broader and more meaningful.
Example − A Super Key is like a shopping list that includes everything you might need. Some items are essential, while others are extra. A Candidate Key, on the other hand, is the refined list; it contains only the essential items that we absolutely need to get the job done. The Primary Key is the one item that we decide is the most critical to buy first.
Differences between Super Keys and Candidate Keys
To summarize, here is a quick comparison between Super Keys and Candidate Keys −
Feature | Super Key | Candidate Key |
---|---|---|
Definition | Any attribute or set of attributes that uniquely identifies a row | Minimal subset of super keys |
Redundancy | Can include unnecessary attributes | No unnecessary attributes |
Count | Generally higher | Fewer than the number of super keys |
Example | CID + UID, UID + Name | CID, UID |
Conclusion
In this chapter, we explained in detail the concepts of Super Keys and Candidate Keys in databases. Starting with a brief recap of how keys are important for uniquely identifying rows, we covered Super Keys, learning how they can include any combination of attributes, even redundant ones.
After Super Keys, we moved on to Candidate Keys which are the minimal and practical subsets of Super Keys. We also covered how to identify Candidate Keys, compared them with Super Keys, and saw how Candidate Keys can be used to decide the Primary Key of a table.