DBMS - Unary Relational Operations



The relational model plays a key role in organizing and managing data. At its core lies relational algebra, a set of operations that enables users to efficiently manipulate and query data. Among these are unary operations such as SELECT and PROJECT, which are essential tools for filtering and organizing data.

In this chapter, we will explore these two unary relational operations (SELECT and PROJECT), understand how they work, and provide examples to make it easier for you to understand.

In all the examples throughout this chapter, we will use the following Sample Data Table called EMPLOYEE:

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
John B Smith 123456789 1965-01-09 731 Fondren, Houston, TX M 30000 333445555 5
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Alicia J Zelaya 999887777 1968-01-19 3321 Castle, Spring, TX F 25000 987654321 4
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5
Joyce A English 453453453 1972-07-31 5631 Rice, Houston, TX F 25000 333445555 5
Ahmad V Jabbar 987987987 1969-03-29 980 Dallas, Houston, TX M 25000 987654321 4
James E Borg 888665555 1937-11-10 450 Stone, Houston, TX M 55000 NULL 1

Understanding the SELECT Operation

The SELECT operation allows us to retrieve a subset of rows (or tuples) from a table (or relation) that satisfy a specific condition. It acts like a filter. It is keeping only the rows that meet the specified criteria while discarding the rest. The resulting table keeps all the original attributes of the filtered rows.

Syntax and Representation

In relational algebra, the representation of SELECT operation is very simple:

$$\mathrm{\sigma_{condition} (R)}$$

Here:

  • σ (sigma): Symbolizes the SELECT operation
  • condition: Specifies the criteria to filter the rows
  • R: Name of the relation or table being operated upon

Example: Filtering by Department Number

Suppose we have a table named EMPLOYEE with attributes such as department number (Dno) and salary.

To retrieve the rows for employees in department 4, we write:

$$\mathrm{σ_{Dno} \:=\: 4\: (EMPLOYEE)}$$

It will fetch the following records –

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Alicia J Zelaya 999887777 1968-01-19 3321 Castle, Spring, TX F 25000 987654321 4
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ahmad V Jabbar 987987987 1969-03-29 980 Dallas, Houston, TX M 25000 987654321 4

Example: Filtering by Salary

To retrieve employees earning more than $30,000, we can write the query as follows:

$$\mathrm{\sigma_{Salary\: \gt\: 30000}\: (EMPLOYEE)}$$

It will fetch the following records –

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5
James E Borg 888665555 1937-11-10 450 Stone, Houston, TX M 55000 NULL 1

Combining Multiple Conditions

The SELECT operation supports combining multiple conditions using Boolean operators like AND, OR, and NOT.

Example: Complex Filtering

Let us see another example where we merge multiple conditions. To find employees working in department 4 with a salary above $25,000 or in department 5 with a salary above $30,000, we can write the following complex query:

$$\mathrm{\sigma_{(Dno = 4 \wedge Salary > 25000) \:\vee\: (Dno = 5 \wedge Salary > 30000)}(EMPLOYEE)}$$

It will fetch the following records –

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5

This operation results in a filtered table with only those rows that satisfy at least one of the conditions.

From the corresponding examples, one thing we understood that the SELECT operation is commutative, which means,

$$\mathrm{\sigma_{cond1} (\sigma_{cond2}(R)) \:=\: \sigma_{cond2} (\sigma_{cond1}(R))}$$

SELECT does not alter the structure of the table; only the number of rows changes.

Understanding the PROJECT Operation

The PROJECT operation targets columns. It creates a new table with only the specified attributes, discarding all others. This operation is useful when we need to focus on specific fields from a dataset.

Syntax and Representation

In relational algebra, the PROJECT operation can be denoted as:

$$\mathrm{\pi_{\text{attribute_list}} (R)}$$

Here:

  • π (pi): symbolizes the PROJECT operation.
  • attribute_list: specifies the columns to include in the result.
  • R is the relation being projected.

Example: Retrieving Specific Attributes

Let us see some examples from simpler to complex. If we want to list only the first name, last name, and salary of employees from the EMPLOYEE table, we write the query in the following format.

$$\mathrm{\pi_{Fname,\: Lname,\: Salary} (EMPLOYEE)}$$

The resulting table will contain only the selected columns –

Fname Lname Salary
John Smith 30000
Franklin Wong 40000
Alicia Zelaya 25000
Jennifer Wallace 43000
Ramesh Narayan 38000
Joyce English 25000
Ahmad Jabbar 25000
James Borg 55000

Duplicate Elimination in PROJECT

A critical aspect of PROJECT is that it eliminates duplicate rows. If two rows in the original table have the same values for the projected columns, only one will appear in the result.

Example: Duplicate Elimination

To list unique combinations of gender (Sex) and salary, use:

$$\mathrm{\pi_{Sex,\: Salary} (EMPLOYEE)}$$

This gives that any duplicate rows in the Sex and Salary columns are removed in the output.

Sex Salary
M 30000
M 40000
F 25000
F 43000
M 38000
M 25000
M 55000

"F 25000" is removed since it is duplicate.

Sequences and Naming in Operations

For more complex queries, we often combine SELECT and PROJECT operations. These sequences can either be written as nested expressions or broken down into steps with intermediate results.

Example: Combining SELECT and PROJECT

To find the first name, last name, and salary of employees in department 5, we can write:

First, apply the SELECT operation:

$$\mathrm{\text{TEMP}\:\longleftarrow\: \sigma_{Dno}\: =\: 5 (EMPLOYEE)}$$

It will fetch the following records –

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
John B Smith 123456789 1965-01-09 731 Fondren, Houston, TX M 30000 333445555 5
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5
Joyce A English 453453453 1972-07-31 5631 Rice, Houston, TX F 25000 333445555 5

Next, apply the PROJECT operation:

$$\mathrm{\text{RESULT}\: \longleftarrow \:π_{Fname,\: Lname,\: Salary} (TEMP)}$$

It will fetch the following records –

Fname Lname Salary
John Smith 30000
Franklin Wong 40000
Ramesh Narayan 38000
Joyce English 25000

Alternatively, this can be combined into a single inline expression:

$$\mathrm{\pi_{Fname,\: Lname,\:Salary} (\sigma_{Dno} \:=\: 5(EMPLOYEE))}$$

Renaming the Attributes

Sometimes, the resulting table from a sequence of operations needs attribute names to be changed for clarity. The RENAME operation, denoted by "ρ" (rho), allows us to rename the attributes or the table itself.

For example:

$$\mathrm{\rho (\text{First_Name, Last_Name, Salary}) \:\longleftarrow\: \pi_{Fname, Lname, Salary} (EMPLOYEE)}$$

Applications of SELECT and PROJECT Operations in SQL

In SQL, the SELECT and PROJECT operations are mirrored in the SELECT clause and the WHERE clause of a query. For example:

The relational algebra expression "$\mathrm{\sigma_{Dno \:=\:4\: \wedge\: Salary\:>\: 25000}\: (EMPLOYEE)}$" corresponds to:

SELECT * 
FROM EMPLOYEE 
WHERE Dno = 4 AND Salary > 25000;

The PROJECT operation "$\mathrm{\pi_{Sex,\: Salary} (EMPLOYEE)}$" translates to:

SELECT DISTINCT Sex, Salary 
FROM EMPLOYEE;

Conclusion

Unary relational operations include SELECT and PROJECT. These two operations form the foundation of data manipulation in relational databases. We explored how SELECT filters rows based on specific conditions, while PROJECT focuses on extracting particular columns. With the help of a few detailed examples, including combining operations and renaming attributes, we highlighted their practical applications.

Advertisements