DBMS - Set Theory Operations

Quiz

In relational algebra we use several set theory operations. Since relational model is based on set theory it borrows several concepts from set theory as well in respect to the operations. These include UNION, INTERSECTION, MINUS (also called SET DIFFERENCE), and also the CARTESIAN PRODUCT. In this article, we will see down each operator and work through examples for a better understanding.

Let us see the following two tables that we will focus in the next examples −

In relational algebra, several operations are derived from set theory. Since the relational model is based on set theory, it adopts key concepts such as UNION, INTERSECTION, MINUS (also called SET DIFFERENCE), and CARTESIAN PRODUCT.

In this chapter, we will examine each of these operators with practical examples for better understanding. Let's begin by reviewing two tables that will be used in the upcoming examples −

EMPLOYEE Table

Fname	Minit	Lname	Ssn	Bdate	Address	Sex	Salary	Super_ssn	Dno
John	B	Smith	123456789	1965-01-09	731 Fondren, Houston, TX	M	30000	333445555	5
Franklin	T	Wong	333445555	1955-12-08	638 Voss, Houston, TX	M	40000	888665555	5
Alicia	J	Zelaya	999887777	1968-01-19	3321 Castle, Spring, TX	F	25000	987654321	4
Jennifer	S	Wallace	987654321	1941-06-20	291 Berry, Bellaire, TX	F	43000	888665555	4
Ramesh	K	Narayan	666884444	1962-09-15	975 Fire Oak, Humble, TX	M	38000	333445555	5
Joyce	A	English	453453453	1972-07-31	5631 Rice, Houston, TX	F	25000	333445555	5
Ahmad	V	Jabbar	987987987	1969-03-29	980 Dallas, Houston, TX	M	25000	987654321	4
James	E	Borg	888665555	1937-11-10	450 Stone, Houston, TX	M	55000	NULL	1

DEPENDENT Table

Essn	Dependent_name	Sex	Bdate	Relationship
333445555	Alice	F	1986-04-05	Daughter
333445555	Theodore	M	1983-10-25	Son
333445555	Joy	F	1958-05-03	Spouse
987654321	Abner	M	1942-02-28	Spouse
123456789	Michael	M	1988-01-04	Son
123456789	Alice	F	1988-12-30	Daughter
123456789	Elizabeth	F	1967-05-05	Spouse

Set Theory Basics in Relational Algebra

In set theory, each element in a set is unique, and operations like UNION and INTERSECTION combine or compare these elements systematically. Relational algebra applies similar principles by treating each row in a table as a set element.

However, relational tables have more structure than simple sets, so additional rules apply −

Union Compatibility − For operations like UNION and INTERSECTION, the tables must have the same number of attributes, and each attribute must have the same data type.
Duplicate Elimination − These operations inherently remove duplicates, as relational algebra operates on sets, not multisets.

The UNION Operator

The UNION operator combines rows from two relations, producing a result with all unique rows present in either relation-similar to merging two lists while removing duplicates.

Syntax and Representation

In relational algebra, UNION is represented as −

R ∪ S

Here, R and S are two relations (tables) that must be union compatible.

Combining Employee and Manager IDs

Let us consider we want a find of all employees who either work in department 5 or supervise someone in department 5. We use UNION to merge these groups −

Retrieve Social Security numbers (SSN) of employees in department 5 −

RESULT1 ← π_Ssn(σ_{Dno = 5}(EMPLOYEE))

Ssn
123456789
333445555
666884444
453453453

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← π_{Super_ssn}(σ_{Dno = 5}(EMPLOYEE))

Super_ssn
333445555
888665555
333445555
333445555

Combine both lists −

RESULT ← RESULT1 ∪ RESULT2

Ssn
123456789
333445555
666884444
453453453
333445555
888665555
333445555
333445555

The final table will include SSNs of all employees who meet either condition. For instance, if RESULT1 contains {123, 456} and RESULT2 contains {456, 789}, the UNION result is {123, 456, 789}.

The INTERSECTION Operator

The INTERSECTION operator identifies rows present in both relations. It is like finding the overlap between two groups.

Syntax and Representation

INTERSECTION is denoted as −

R ∩ S

Finding SSN which are common in manager and employee

Consider the table of SSN of employees who work for department 1 and 5 −

RESULT1 ← π_Ssn(σ_{Dno = 5 ^ Dno = 1}(EMPLOYEE))

Ssn
123456789
333445555
666884444
453453453
888665555

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← π_{Super_ssn}(σ_{Dno = 5}(EMPLOYEE))

Super_ssn
333445555
888665555
333445555
333445555

The common SSN values are −

RESULT ← RESULT1 ∩ RESULT2

Ssn
333445555
888665555

The MINUS (SET DIFFERENCE) Operator

The MINUS operator, also called SET DIFFERENCE, is used to remove rows in one relation that appear in another. It essentially answers the question, "What is in one table but not the other?"

Syntax and Representation

In relational algebra, MINUS is represented as −

R − S

SSN of employees who are not manager

Consider the table of SSN of employees who work for department 1 and 5 −

RESULT1 ← π_Ssn(σ_{Dno = 5 ^ Dno = 1}(EMPLOYEE))

Ssn
123456789
333445555
666884444
453453453
888665555

Retrieve SSNs of managers supervising department 5 employees −

RESULT2 ← π_{Super_ssn}(σ_{Dno = 5}(EMPLOYEE))

Super_ssn
333445555
888665555
333445555
333445555

The SSN of non-managers are −

RESULT ← RESULT1 − RESULT2

Ssn
123456789
666884444
453453453

The CARTESIAN PRODUCT (CROSS PRODUCT) Operator

The CARTESIAN PRODUCT operator combines every row of one table with every row of another. This operation produces all possible combinations of rows between the two tables. This can result in a large and often unwieldy table.

Syntax and Representation

In relational algebra, CARTESIAN PRODUCT is represented as −

R × S

Combining Employees with Dependents

Suppose we have a table EMPLOYEE and another table DEPENDENT. To pair every employee with every dependent −

EMP_DEPENDENTS ← EMPLOYEE × DEPENDENT

The resulting table will include every possible combination of rows from EMPLOYEE and DEPENDENT. However, this raw result is rarely useful on its own. Usually, we filter the combined rows using a SELECT operation to create meaningful connections.

Fname	Lname	Ssn	Essn	Dependent_name	Sex	Bdate	...
Alicia	Zelaya	999887777	333445555	Alice	F	1986-04-05	...
Alicia	Zelaya	999887777	333445555	Theodore	M	1983-10-25	...
Alicia	Zelaya	999887777	333445555	Joy	F	1958-05-03	...
Alicia	Zelaya	999887777	987654321	Abner	M	1942-02-28	...
Alicia	Zelaya	999887777	123456789	Michael	M	1988-01-04	...
Alicia	Zelaya	999887777	123456789	Alice	F	1988-12-30	...
Alicia	Zelaya	999887777	123456789	Elizabeth	F	1967-05-05	...
Jennifer	Wallace	987654321	333445555	Alice	F	1986-04-05	...
Jennifer	Wallace	987654321	333445555	Theodore	M	1983-10-25	...
Jennifer	Wallace	987654321	333445555	Joy	F	1958-05-03	...
Jennifer	Wallace	987654321	987654321	Abner	M	1942-02-28	...
Jennifer	Wallace	987654321	123456789	Michael	M	1988-01-04	...
Jennifer	Wallace	987654321	123456789	Alice	F	1988-12-30	...
Jennifer	Wallace	987654321	123456789	Elizabeth	F	1967-05-05	...
Joyce	English	453453453	333445555	Alice	F	1986-04-05	...
Joyce	English	453453453	333445555	Theodore	M	1983-10-25	...
Joyce	English	453453453	333445555	Joy	F	1958-05-03	...
Joyce	English	453453453	987654321	Abner	M	1942-02-28	...
Joyce	English	453453453	123456789	Michael	M	1988-01-04	...
Joyce	English	453453453	123456789	Alice	F	1988-12-30	...
Joyce	English	453453453	123456789	Elizabeth	F	1967-05-05	...

The table is truncated for space otherwise it will be much larger.

Sequences and Combined Operations

Relational algebra becomes useful when we combine operators. Sometimes, the result of one operation feeds into another. This is allowing us to answer complex queries.

Female Employees and Their Dependents

To find dependents of female employees −

Filter female employees −

FEMALE_EMPS ← σ_Sex=_′F_′(EMPLOYEE)

Retrieve names and SSNs of female employees −

EMP_NAMES ← π_{Fname,Lname,Ssn}(FEMALE_EMPS)

Combine with the DEPENDENT table −

EMP_DEPENDENTS ← EMP_NAMES × DEPENDENT

Match dependents to employees by SSN −

ACTUAL_DEPENDENTS ← σ_Ssn=Essn(EMP_DEPENDENTS)

Extract names of employees and their dependents −

π_{Fname,Lname,Dependent_name}(ACTUAL_DEPENDENTS)

This sequence demonstrates how CARTESIAN PRODUCT and SELECT work together to filter meaningful relationships from raw combinations.

SQL Equivalents of Set Theory Operators

These relational operators have direct counterparts in SQL −

UNION − Combines results from two queries, removing duplicates unless UNION ALL is used.
INTERSECT − Retrieves rows common to both queries.
EXCEPT (or MINUS) − Retrieves rows in one query but not the other.
CROSS JOIN − Implements CARTESIAN PRODUCT in SQL.

Example in SQL

For the UNION operation −

SELECT Ssn
FROM EMPLOYEE
WHERE Dno = 5
UNION
SELECT Super_ssn
FROM EMPLOYEE
WHERE Dno = 5;

Conclusion

In this chapter, we explored relational algebra operations derived from set theory — UNION, INTERSECTION, MINUS, and CARTESIAN PRODUCT. We demonstrated how these operators work using practical database examples and also discussed their equivalents in SQL. Understanding these foundational concepts is essential for querying and managing relational databases effectively.

Print Page