DBMS - Set Theory Operations



In relational algebra we use several set theory operations. Since relational model is based on set theory it borrows several concepts from set theory as well in respect to the operations. These include UNION, INTERSECTION, MINUS (also called SET DIFFERENCE), and also the CARTESIAN PRODUCT. In this article, we will see down each operator and work through examples for a better understanding.

Let us see the following two tables that we will focus in the next examples −

In relational algebra, several operations are derived from set theory. Since the relational model is based on set theory, it adopts key concepts such as UNION, INTERSECTION, MINUS (also called SET DIFFERENCE), and CARTESIAN PRODUCT.

In this chapter, we will examine each of these operators with practical examples for better understanding. Let's begin by reviewing two tables that will be used in the upcoming examples −

EMPLOYEE Table

Fname Minit Lname Ssn Bdate Address Sex Salary Super_ssn Dno
John B Smith 123456789 1965-01-09 731 Fondren, Houston, TX M 30000 333445555 5
Franklin T Wong 333445555 1955-12-08 638 Voss, Houston, TX M 40000 888665555 5
Alicia J Zelaya 999887777 1968-01-19 3321 Castle, Spring, TX F 25000 987654321 4
Jennifer S Wallace 987654321 1941-06-20 291 Berry, Bellaire, TX F 43000 888665555 4
Ramesh K Narayan 666884444 1962-09-15 975 Fire Oak, Humble, TX M 38000 333445555 5
Joyce A English 453453453 1972-07-31 5631 Rice, Houston, TX F 25000 333445555 5
Ahmad V Jabbar 987987987 1969-03-29 980 Dallas, Houston, TX M 25000 987654321 4
James E Borg 888665555 1937-11-10 450 Stone, Houston, TX M 55000 NULL 1

DEPENDENT Table

Essn Dependent_name Sex Bdate Relationship
333445555 Alice F 1986-04-05 Daughter
333445555 Theodore M 1983-10-25 Son
333445555 Joy F 1958-05-03 Spouse
987654321 Abner M 1942-02-28 Spouse
123456789 Michael M 1988-01-04 Son
123456789 Alice F 1988-12-30 Daughter
123456789 Elizabeth F 1967-05-05 Spouse

Set Theory Basics in Relational Algebra

In set theory, each element in a set is unique, and operations like UNION and INTERSECTION combine or compare these elements systematically. Relational algebra applies similar principles by treating each row in a table as a set element.

However, relational tables have more structure than simple sets, so additional rules apply −

  • Union Compatibility − For operations like UNION and INTERSECTION, the tables must have the same number of attributes, and each attribute must have the same data type.
  • Duplicate Elimination − These operations inherently remove duplicates, as relational algebra operates on sets, not multisets.

The UNION Operator

The UNION operator combines rows from two relations, producing a result with all unique rows present in either relation-similar to merging two lists while removing duplicates.

Syntax and Representation

In relational algebra, UNION is represented as −

R ∪ S

Here, R and S are two relations (tables) that must be union compatible.

Combining Employee and Manager IDs

Let us consider we want a find of all employees who either work in department 5 or supervise someone in department 5. We use UNION to merge these groups −

Retrieve Social Security numbers (SSN) of employees in department 5 −

RESULT1 ← πSsnDno = 5(EMPLOYEE))
Ssn
123456789
333445555
666884444
453453453

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← πSuper_ssnDno = 5(EMPLOYEE))
Super_ssn
333445555
888665555
333445555
333445555

Combine both lists −

RESULT ← RESULT1 ∪ RESULT2
Ssn
123456789
333445555
666884444
453453453
333445555
888665555
333445555
333445555

The final table will include SSNs of all employees who meet either condition. For instance, if RESULT1 contains {123, 456} and RESULT2 contains {456, 789}, the UNION result is {123, 456, 789}.

The INTERSECTION Operator

The INTERSECTION operator identifies rows present in both relations. It is like finding the overlap between two groups.

Syntax and Representation

INTERSECTION is denoted as −

R ∩ S

Finding SSN which are common in manager and employee

Consider the table of SSN of employees who work for department 1 and 5 −

RESULT1 ← πSsnDno = 5 ^ Dno = 1(EMPLOYEE))
Ssn
123456789
333445555
666884444
453453453
888665555

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← πSuper_ssnDno = 5(EMPLOYEE))
Super_ssn
333445555
888665555
333445555
333445555

The common SSN values are −

RESULT ← RESULT1 ∩ RESULT2
Ssn
333445555
888665555

The MINUS (SET DIFFERENCE) Operator

The MINUS operator, also called SET DIFFERENCE, is used to remove rows in one relation that appear in another. It essentially answers the question, "What is in one table but not the other?"

Syntax and Representation

In relational algebra, MINUS is represented as −

R − S

SSN of employees who are not manager

Consider the table of SSN of employees who work for department 1 and 5 −

RESULT1 ← πSsnDno = 5 ^ Dno = 1(EMPLOYEE))
Ssn
123456789
333445555
666884444
453453453
888665555

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← πSuper_ssnDno = 5(EMPLOYEE))
Super_ssn
333445555
888665555
333445555
333445555

The SSN of non-managers are −

RESULT ← RESULT1 − RESULT2
Ssn
123456789
666884444
453453453

The CARTESIAN PRODUCT (CROSS PRODUCT) Operator

The CARTESIAN PRODUCT operator combines every row of one table with every row of another. This operation produces all possible combinations of rows between the two tables. This can result in a large and often unwieldy table.

Syntax and Representation

In relational algebra, CARTESIAN PRODUCT is represented as −

R × S

Combining Employees with Dependents

Suppose we have a table EMPLOYEE and another table DEPENDENT. To pair every employee with every dependent −

EMP_DEPENDENTS ← EMPLOYEE × DEPENDENT

The resulting table will include every possible combination of rows from EMPLOYEE and DEPENDENT. However, this raw result is rarely useful on its own. Usually, we filter the combined rows using a SELECT operation to create meaningful connections.

Fname Lname Ssn Essn Dependent_name Sex Bdate ...
Alicia Zelaya 999887777 333445555 Alice F 1986-04-05 ...
Alicia Zelaya 999887777 333445555 Theodore M 1983-10-25 ...
Alicia Zelaya 999887777 333445555 Joy F 1958-05-03 ...
Alicia Zelaya 999887777 987654321 Abner M 1942-02-28 ...
Alicia Zelaya 999887777 123456789 Michael M 1988-01-04 ...
Alicia Zelaya 999887777 123456789 Alice F 1988-12-30 ...
Alicia Zelaya 999887777 123456789 Elizabeth F 1967-05-05 ...
Jennifer Wallace 987654321 333445555 Alice F 1986-04-05 ...
Jennifer Wallace 987654321 333445555 Theodore M 1983-10-25 ...
Jennifer Wallace 987654321 333445555 Joy F 1958-05-03 ...
Jennifer Wallace 987654321 987654321 Abner M 1942-02-28 ...
Jennifer Wallace 987654321 123456789 Michael M 1988-01-04 ...
Jennifer Wallace 987654321 123456789 Alice F 1988-12-30 ...
Jennifer Wallace 987654321 123456789 Elizabeth F 1967-05-05 ...
Joyce English 453453453 333445555 Alice F 1986-04-05 ...
Joyce English 453453453 333445555 Theodore M 1983-10-25 ...
Joyce English 453453453 333445555 Joy F 1958-05-03 ...
Joyce English 453453453 987654321 Abner M 1942-02-28 ...
Joyce English 453453453 123456789 Michael M 1988-01-04 ...
Joyce English 453453453 123456789 Alice F 1988-12-30 ...
Joyce English 453453453 123456789 Elizabeth F 1967-05-05 ...

The table is truncated for space otherwise it will be much larger.

Sequences and Combined Operations

Relational algebra becomes useful when we combine operators. Sometimes, the result of one operation feeds into another. This is allowing us to answer complex queries.

Female Employees and Their Dependents

To find dependents of female employees −

Filter female employees −

FEMALE_EMPS ← σSex=′F(EMPLOYEE)

Retrieve names and SSNs of female employees −

EMP_NAMES ← πFname,Lname,Ssn(FEMALE_EMPS)

Combine with the DEPENDENT table −

EMP_DEPENDENTS ← EMP_NAMES × DEPENDENT

Match dependents to employees by SSN −

ACTUAL_DEPENDENTS ← σSsn=Essn(EMP_DEPENDENTS)

Extract names of employees and their dependents −

πFname,Lname,Dependent_name(ACTUAL_DEPENDENTS)

This sequence demonstrates how CARTESIAN PRODUCT and SELECT work together to filter meaningful relationships from raw combinations.

SQL Equivalents of Set Theory Operators

These relational operators have direct counterparts in SQL −

  • UNION − Combines results from two queries, removing duplicates unless UNION ALL is used.
  • INTERSECT − Retrieves rows common to both queries.
  • EXCEPT (or MINUS) − Retrieves rows in one query but not the other.
  • CROSS JOIN − Implements CARTESIAN PRODUCT in SQL.

Example in SQL

For the UNION operation −

SELECT Ssn
FROM EMPLOYEE
WHERE Dno = 5
UNION
SELECT Super_ssn
FROM EMPLOYEE
WHERE Dno = 5;

Conclusion

In this chapter, we explored relational algebra operations derived from set theory — UNION, INTERSECTION, MINUS, and CARTESIAN PRODUCT. We demonstrated how these operators work using practical database examples and also discussed their equivalents in SQL. Understanding these foundational concepts is essential for querying and managing relational databases effectively.

Advertisements