SQL - Self Join



Self Join, as its name suggests, is a type of join that combines the records of a table with itself.

Suppose an organization, while organizing a Christmas party, is choosing a Secret Santa among its employees based on some colors. It is designed to be done by assigning one color to each of its employees and having them pick a color from the pool of various colors. In the end, they will become the Secret Santa of an employee this color is assigned to.

As we can see in the figure below, the information regarding the colors assigned and a color each employee picked is entered into a table. The table is joined to itself using self join over the color columns to match employees with their Secret Santa.

Self Join

The SQL Self Join

The SQL Self Join is used to join a table to itself as if the table were two tables. To carry this out, alias of the tables should be used at least once.

Self Join is a type of inner join, which is performed in cases where the comparison between two columns of a same table is required; probably to establish a relationship between them. In other words, a table is joined with itself when it contains both Foreign Key and Primary Key in it.

Unlike queries of other joins, we use WHERE clause to specify the condition for the table to combine with itself; instead of the ON clause.

Syntax

Following is the basic syntax of SQL Self Join −

SELECT column_name(s)
FROM table1 a, table1 b
WHERE a.common_field = b.common_field;

Here, the WHERE clause could be any given expression based on your requirement.

Example

Self Join only requires one table, so, let us create a CUSTOMERS table containing the customer details like their names, age, address and the salary they earn.

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),       
   PRIMARY KEY (ID)
);

Now, insert values into this table using the INSERT statement as follows −

INSERT INTO CUSTOMERS VALUES
(1, 'Ramesh', 32, 'Ahmedabad', 2000.00 ),
(2, 'Khilan', 25, 'Delhi', 1500.00 ),
(3, 'Kaushik', 23, 'Kota', 2000.00 ),
(4, 'Chaitali', 25, 'Mumbai', 6500.00 ),
(5, 'Hardik', 27, 'Bhopal', 8500.00 ),
(6, 'Komal', 22, 'Hyderabad', 4500.00 ),
(7, 'Muffy', 24, 'Indore', 10000.00 );

The table will be created as −

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00
2 Khilan 25 Delhi 1500.00
3 Kaushik 23 Kota 2000.00
4 Chaitali 25 Mumbai 6500.00
5 Hardik 27 Bhopal 8500.00
6 Komal 22 Hyderabad 4500.00
7 Muffy 24 Indore 10000.00

Now, let us join this table using the following Self Join query. Our aim is to establish a relationship among the said Customers on the basis of their earnings. We are doing this with the help of the WHERE clause.

SELECT a.ID, b.NAME as EARNS_HIGHER, a.NAME 
as EARNS_LESS, a.SALARY as LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY;

Output

The resultant table displayed will list out all the customers that earn lesser than other customers −

ID EARNS_HIGHER EARNS_LESS LOWER_SALARY
2 Ramesh Khilan 1500.00
2 Kaushik Khilan 1500.00
6 Chaitali Komal 4500.00
3 Chaitali Kaushik 2000.00
2 Chaitali Khilan 1500.00
1 Chaitali Ramesh 2000.00
6 Hardik Komal 4500.00
4 Hardik Chaitali 6500.00
3 Hardik Kaushik 2000.00
2 Hardik Khilan 1500.00
1 Hardik Ramesh 2000.00
3 Komal Kaushik 2000.00
2 Komal Khilan 1500.00
1 Komal Ramesh 2000.00
6 Muffy Komal 4500.00
5 Muffy Hardik 8500.00
4 Muffy Chaitali 6500.00
3 Muffy Kaushik 2000.00
2 Muffy Khilan 1500.00
1 Muffy Ramesh 2000.00

Self Join with ORDER BY Clause

After joining a table with itself using self join, the records in the combined table can also be sorted in an order, using the ORDER BY clause.

Syntax

Following is the syntax for it −

SELECT column_name(s)
FROM table1 a, table1 b
WHERE a.common_field = b.common_field
ORDER BY column_name;

Example

Let us join the CUSTOMERS table with itself using self join on a WHERE clause; then, arrange the records in an ascending order using the ORDER BY clause with respect to a specified column, as shown in the following query.

SELECT  a.ID, b.NAME as EARNS_HIGHER, a.NAME 
as EARNS_LESS, a.SALARY as LOWER_SALARY
FROM CUSTOMERS a, CUSTOMERS b
WHERE a.SALARY < b.SALARY
ORDER BY a.SALARY;

Output

The resultant table is displayed as follows −

ID EARNS_HIGHER EARNS_LESS LOWER_SALARY
2 Ramesh Khilan 1500.00
2 Kaushik Khilan 1500.00
2 Chaitali Khilan 1500.00
2 Hardik Khilan 1500.00
2 Komal Khilan 1500.00
2 Muffy Khilan 1500.00
3 Chaitali Kaushik 2000.00
1 Chaitali Ramesh 2000.00
3 Hardik Kaushik 2000.00
1 Hardik Ramesh 2000.00
3 Komal Kaushik 2000.00
1 Komal Ramesh 2000.00
3 Muffy Kaushik 2000.00
1 Muffy Ramesh 2000.00
6 Chaitali Komal 4500.00
6 Hardik Komal 4500.00
6 Muffy Komal 4500.00
4 Hardik Chaitali 6500.00
4 Muffy Chaitali 6500.00
5 Muffy Hardik 8500.00

Not just the salary column, the records can be sorted based on the alphabetical order of names, numerical order of Customer IDs etc.

Advertisements