 
 Data Structure Data Structure
 Networking Networking
 RDBMS RDBMS
 Operating System Operating System
 Java Java
 MS Excel MS Excel
 iOS iOS
 HTML HTML
 CSS CSS
 Android Android
 Python Python
 C Programming C Programming
 C++ C++
 C# C#
 MongoDB MongoDB
 MySQL MySQL
 Javascript Javascript
 PHP PHP
- Selected Reading
- UPSC IAS Exams Notes
- Developer's Best Practices
- Questions and Answers
- Effective Resume Writing
- HR Interview Questions
- Computer Glossary
- Who is Who
Normal Forms in DBMS
Normalization is a process of organizing data in a database to reduce redundancy and improve data consistency. Primary keys are really important in organizing information in a database. They help to make sure that every row in a table has a unique identification so that nothing gets mixed up or lost.
In this article, we will discuss the different normal forms of database normalization.
Normalization
Normalization is a process of organizing the data in database to avoid data redundancy, insertion anomaly, update anomaly & deletion anomaly.
 
Normalization is the process of organizing data in a database to minimize redundancy and dependency. In database design, there are different normal forms based on the primary keys of a table. These include ?
 
First Normal Form (1NF)
1NF requires that each column in a table contains atomic values and that each row is uniquely identified. This means that a table cannot have repeating groups or arrays as columns, and each row must have a unique primary key.
Example
A table is in 1NF if each column contains atomic values and each row is uniquely identified. For example, a table that lists customers and their phone numbers ?
| Customer ID | Name | Phone Numbers | 
|---|---|---|
| 1 | John | 555-1234, 555-5678 | 
| 2 | Jane | 555-9876 | 
| 3 | Michael | 555-5555 | 
This violates 1NF because the Phone Numbers column contains repeating groups.
To normalize this table to 1NF, we can split the Phone Numbers column into separate rows and add a separate primary key column ?
| Customer ID | Name | Phone Number | 
|---|---|---|
| 1 | John | 555-1234 | 
| 1 | John | 555-5678 | 
| 2 | Jane | 555-9876 | 
| 3 | Michael | 555-5555 | 
Second Normal Form (2NF)
2NF builds on 1NF by requiring that each non-primary key column in a table is fully functionally dependent on the primary key. This means that a table should not have partial dependencies, where a non-primary key column depends on only part of the primary key.
Example
A table is in 2NF if each non-primary key column is fully functionally dependent on the primary key. For example, a table that lists orders and their line items:
| Order ID | Customer ID | Customer Name | Item ID | Item Name | Quantity | 
|---|---|---|---|---|---|
| 1 | 1 | John | 1 | Shirt | 2 | 
| 1 | 1 | John | 2 | Pants | 1 | 
| 2 | 2 | Jane | 1 | Shirt | 1 | 
| 2 | 2 | Jane | 3 | Hat | 3 | 
This violates 2NF because the Customer Name column depends on only part of the primary key (Customer ID). To normalize this table to 2NF, we can split it into two tables ?
| Order ID | Customer ID | Item ID | Quantity | 
|---|---|---|---|
| 1 | 1 | 1 | 2 | 
| 1 | 1 | 2 | 1 | 
| 2 | 2 | 1 | 1 | 
| 2 | 2 | 3 | 3 | 
| Customer ID | Customer Name | 
|---|---|
| 1 | John | 
| 2 | Jane | 
Third Normal Form (3NF)
3NF builds on 2NF by requiring that each non-primary key column in a table is not transitively dependent on the primary key. This means that a table should not have transitive dependencies, where a non-primary key column depends on another non-primary key column.
Example
To explain 3NF further, let's consider an example of a table that lists customer orders ?
| Order ID | Customer ID | Customer Name | Customer City | Order Date | Order Total | 
|---|---|---|---|---|---|
| 1 | 100 | John Smith | New York | 2022-01-01 | 100 | 
| 2 | 101 | Jane Doe | Los Angeles | 2022-01-02 | 200 | 
| 3 | 102 | Bob Johnson | San Francisco | 2022-01-03 | 300 | 
In this example, the non-primary key column "Customer City" is transitively dependent on the primary key. That is, it depends on "Customer ID", which is not part of the primary key, instead of depending directly on the primary key "Order ID". To bring this table to 3NF, we can split it into two tables ?
Table 1: Customers
| Customer ID | Customer Name | Customer City | 
|---|---|---|
| 100 | John Smith | New York | 
| 101 | Jane Doe | Los Angeles | 
| 102 | Bob Johnson | San Francisco | 
Table 2: Orders
| Order ID | Customer ID | Order Date | Order Total | 
|---|---|---|---|
| 1 | 100 | 2022-01-01 | 100 | 
| 2 | 101 | 2022-01-02 | 200 | 
| 3 | 102 | 2022-01-03 | 300 | 
Now, the "Customer City" column is no longer transitively dependent on the primary key and is instead in a separate table that has a direct relationship with the primary key. This makes the table 3NF-compliant.
Boyce-Codd Normal Form (BCNF)
BCNF is a stricter form of 3NF that applies to tables with more than one candidate key. BCNF requires that each non-trivial dependency in a table is a dependency on a candidate key. This means that a table should not have non-trivial dependencies, where a non-primary key column depends on another non-primary key column. BCNF ensures that each table in a database is a separate entity and eliminates redundancies.
Example
A table is in BCNF if each determinant is a candidate key. In other words, every non-trivial functional dependency in the table must be on a candidate key. For example, consider a table that lists information about books and their authors ?
Table: Books
| Book ID | Title | Author ID | Author Name | Author Nationality | 
|---|---|---|---|---|
| 1 | Crime and Punishment | 100 | Fyodor Dostoevsky | Russian | 
| 2 | The Great Gatsby | 101 | F. Scott Fitzgerald | American | 
| 3 | Pride and Prejudice | 102 | Jane Austen | British | 
In this example, the functional dependency between "Author ID" and "Author Name" violates BCNF because it is not on a candidate key. To bring this table to BCNF, we can split it into two tables ?
Table 1: Authors
| Author ID | Author Name | Author Nationality | 
|---|---|---|
| 101 | Fyodor Dostoevsky | Russian | 
| 101 | F. Scott Fitzgerald | American | 
| 102 | Jane Austen | British | 
Table 2: Books
| Book ID | Title | Author ID | 
|---|---|---|
| 1 | Crime and Punishment | 100 | 
| 2 | The Great Gatsby | 101 | 
| 3 | Pride and Prejudice | 102 | 
Now, the "Author Name" and "Author Nationality" columns are not transitively dependent on the primary key, and the table is in BCNF.
Fourth Normal Form (4NF)
4NF builds on BCNF by requiring that a table should not have multi-valued dependencies. A multi-valued dependency occurs when a non-primary key column depends on a combination of other non-primary key columns. For example, a table that lists customer orders with a primary key of order ID and non-primary key columns for customer ID and order items violates 4NF because order items depend on both order ID and customer ID.
For example, a table that lists orders and their products, with columns for order ID, product ID, and product details, violates 4NF because the product details depend on the combination of order ID and product ID.
Example
Consider the following table of orders and products
| Order ID | Product ID | Product Name | Product Description | 
|---|---|---|---|
| 1 | 100 | Widget | Red Widget | 
| 1 | 200 | Widget | Blue Widget | 
| 2 | 100 | Widget | Red Widget | 
| 2 | 300 | Thing | Green Thing | 
| 3 | 200 | Widget | Blue Widget | 
| 3 | 300 | Thing | Green Thing | 
In this table, the product name and description depend on both the order ID and product ID, creating a multi-valued dependency. To bring the table into 4NF, we can split it into three tables ?
| Order ID | Product ID | 
|---|---|
| 1 | 100 | 
| 1 | 200 | 
| 2 | 100 | 
| 2 | 300 | 
| 3 | 200 | 
| 3 | 300 | 
| Product ID | Product Name | 
|---|---|
| 100 | Widget | 
| 200 | Widget | 
| 300 | Thing | 
| Product ID | Product Description | 
|---|---|
| 100 | Red Widget | 
| 200 | Blue Widget | 
| 300 | Green Thing | 
Advantages and Disadvantages of Normalization
Advantages of Normalization
- Reduced Data Redundancy 
- Improved Data Consistency 
- Simplified Database Maintenance 
- Improved Query Performance 
Disadvantages of Normalization
- Increased Complexity 
- Decreased Read Performance 
- Increased Write Performance 
- Increased Storage Space 
- Over-Normalization 
Conclusion
So, we have explained the concept of normalization in database management systems (DBMS) and its importance in data management. We covered different normal forms of database normalization, including 1NF, 2NF, 3NF, BCNF, and 4NF. We explained how normalization helps eliminate data redundancy, insertion, update, and deletion anomalies. We have also provided examples of tables in different normal forms and how to normalize them to the desired normal form. We have written about the importance of normalization in ensuring data consistency and accuracy in DBMS.
