First Normal Form (1NF) in DBMS



Normalization is one of the key principles in designing relational databases. First Normal Form (1NF) is the foundational step in the normalization process. It is like cleaning up the data to make sure everything is neat and tidy. Read this chapter to understand what 1NF means, why it is important, and how to apply it.

What is First Normal Form?

The first normal form is a set of rule that ensures a table's structure remains simple and organized. According to E. F. Codd, the father of relational databases, a table is in 1NF if it does not contain any multi-valued attributes. In simpler terms, we can say each cell in a table should hold a single value, not a list or set of values.

Imagine we have a table where a single column contains more than one value for the same person. Now, that is a problem. For example, in a Student table, if a single row for a student lists multiple courses, the table is not in 1NF.

Example of a Non-1NF Table

Let us consider the following table −

Roll Number Name Course
1 Sayan C, C++
2 Harsh Java
3 Ankur C, DBMS

Here, we can see the problem: the Course column has multiple values for some students. For example, Sayan is enrolled in both C and C++, while Ankur is taking C and DBMS. It violates the rule of 1NF, which states that every column must have only one value for each row.

Importance of 1NF

Tables that are not in 1NF may cause confusion. For example, querying such a table to find all students taking "C++". It could miss some rows. Also, updating the data becomes messy. Imagine trying to add a new course for a student. How would we manage it in a single cell?

By converting a table to 1NF, we can eliminate these issues and make the data easier to manage and query.

Methods to Convert to First Normal Form

There are three different ways to fix a table and bring it into 1NF −

  • Split Rows with Repeated Values
  • Add Separate Columns for Multi-Valued Attributes
  • Divide the Table into Two

Let us go through each of these methods in detail using the above example.

Method 1: Split Rows with Repeated Values

The simplest way to convert a table into 1NF is to break down the rows with multiple values into separate rows. For the Student table, it would look like this −

Roll Number Name Course
1 Sayan C
1 Sayan C++
2 Harsh Java
3 Ankur C
3 Ankur DBMS

After the modification, each row contains just one value for the Course column. It fixes the 1NF violation. But, there is an interesting point to note: if we are asked to identify the primary key, we will realize that Roll Number alone is not enough because it repeats for some students. Instead, a composite primary key is needed. This will be a combination of Roll Number and Course. This is needed to uniquely identify each row.

Note − Splitting the rows is simple and straightforward but may lead to redundant data.

Method 2: Add Separate Columns for Multi-Valued Attributes

Another approach is to create additional columns to represent multiple values. For example −

Roll Number Name Course 1 Course 2
1 Sayan C C++
2 Harsh Java
3 Ankur C DBMS

Here, each course gets its own column. This structure ensures each cell contains only one value, satisfying the 1NF condition. However, this method has some limitations

  • If a student is enrolled in more than two courses, we would need to add more columns.
  • Some cells might remain empty (NULL), this will make it inefficient use of space.

In this case, the primary key would be Roll Number, since it uniquely identifies each row. But adding too many columns for every possible value is not such a practical for large datasets.

Note − Adding new columns works for small, fixed datasets but is not scalable for variable-length attributes.

Method 3: Divide the Table into Two

The third method is often the best − it will split the table into two separate tables. This way, we normalize the data without redundancy. Here is how it works −

Base Table (stores student details) −

Roll Number Name
1 Sayan
2 Harsh
3 Ankur

Referencing Table (stores course enrolments) −

Roll Number Course
1 C
1 C++
2 Java
3 C
3 DBMS

In this design −

  • The Base Table uses Roll Number as its primary key.
  • The Referencing Table uses a combination of Roll Number and Course as its primary key.
  • The Roll Number in the Referencing Table is also a foreign key, pointing to the Base Table.

This method is more flexible. If a student is enrolled in multiple courses, we can simply add more rows in the Referencing Table without modifying the Base Table.

Note − Dividing the Tables is a clean and scalable method, suitable for complex datasets.

Practical Benefits of First Normal Form

By bringing a table into 1NF, we ensure the following −

  • Consistency − Data is stored in a uniform format.
  • Simplicity − Queries are easier to write and understand.
  • Scalability − The structure can handle growth without major redesigns.

Achieving 1NF is the first step towards higher levels of normalization.

Conclusion

In this chapter, we covered the concept of the First Normal Form (1NF) in database design. We started by defining 1NF is and why it is important.

Through an example, we demonstrated how multi-valued attributes violate 1NF and explored three ways to fix the issue: splitting the rows with repeated values, adding separate columns for multi-valued attributes, and dividing the table into base and referencing tables. Each method has its pros and cons, but dividing the table into two is often the most practical solution.

Advertisements