SQLAlchemy Core - Using Set Operations


In the last chapter, we have learnt about various functions such as max(), min(), count(), etc., here, we will learn about set operations and their uses.

Set operations such as UNION and INTERSECT are supported by standard SQL and most of its dialect. SQLAlchemy implements them with the help of following functions −

union()

While combining results of two or more SELECT statements, UNION eliminates duplicates from the resultset. The number of columns and datatype must be same in both the tables.

The union() function returns a CompoundSelect object from multiple tables. Following example demonstrates its use −

from sqlalchemy import create_engine, MetaData, Table, Column, Integer, String, union
engine = create_engine('sqlite:///college.db', echo = True)

meta = MetaData()
conn = engine.connect()
addresses = Table(
   'addresses', meta, 
   Column('id', Integer, primary_key = True), 
   Column('st_id', Integer), 
   Column('postal_add', String), 
   Column('email_add', String)
)

u = union(addresses.select().where(addresses.c.email_add.like('%@gmail.com addresses.select().where(addresses.c.email_add.like('%@yahoo.com'))))

result = conn.execute(u)
result.fetchall()

The union construct translates to following SQL expression −

SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ? UNION SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ?

From our addresses table, following rows represent the union operation −

[
   (1, 1, 'Shivajinagar Pune', 'ravi@gmail.com'),
   (2, 1, 'ChurchGate Mumbai', 'kapoor@gmail.com'),
   (3, 3, 'Jubilee Hills Hyderabad', 'komal@gmail.com'),
   (4, 5, 'MG Road Bangaluru', 'as@yahoo.com')
]

union_all()

UNION ALL operation cannot remove the duplicates and cannot sort the data in the resultset. For example, in above query, UNION is replaced by UNION ALL to see the effect.

u = union_all(addresses.select().where(addresses.c.email_add.like('%@gmail.com')), addresses.select().where(addresses.c.email_add.like('%@yahoo.com')))

The corresponding SQL expression is as follows −

SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ? UNION ALL SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ?

except_()

The SQL EXCEPT clause/operator is used to combine two SELECT statements and return rows from the first SELECT statement that are not returned by the second SELECT statement. The except_() function generates a SELECT expression with EXCEPT clause.

In the following example, the except_() function returns only those records from addresses table that have ‘gmail.com’ in email_add field but excludes those which have ‘Pune’ as part of postal_add field.

u = except_(addresses.select().where(addresses.c.email_add.like('%@gmail.com')), addresses.select().where(addresses.c.postal_add.like('%Pune')))

Result of the above code is the following SQL expression −

SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ? EXCEPT SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.postal_add LIKE ?

Assuming that addresses table contains data used in earlier examples, it will display following output −

[(2, 1, 'ChurchGate Mumbai', 'kapoor@gmail.com'),
   (3, 3, 'Jubilee Hills Hyderabad', 'komal@gmail.com')]

intersect()

Using INTERSECT operator, SQL displays common rows from both the SELECT statements. The intersect() function implements this behaviour.

In following examples, two SELECT constructs are parameters to intersect() function. One returns rows containing ‘gmail.com’ as part of email_add column, and other returns rows having ‘Pune’ as part of postal_add column. The result will be common rows from both resultsets.

u = intersect(addresses.select().where(addresses.c.email_add.like('%@gmail.com')), addresses.select().where(addresses.c.postal_add.like('%Pune')))

In effect, this is equivalent to following SQL statement −

SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.email_add LIKE ? INTERSECT SELECT addresses.id, 
   addresses.st_id, 
   addresses.postal_add, 
   addresses.email_add
FROM addresses
WHERE addresses.postal_add LIKE ?

The two bound parameters ‘%gmail.com’ and ‘%Pune’ generate a single row from original data in addresses table as shown below −

[(1, 1, 'Shivajinagar Pune', 'ravi@gmail.com')]
Advertisements