Duplicate Emails - Problem
Imagine you're working as a database administrator for a large company, and you need to identify duplicate email addresses in the employee database to clean up data inconsistencies.
You have a Person table with the following structure:
| Column Name | Type |
|---|---|
| id | int |
| varchar |
Where:
idis the primary key (unique identifier)emailcontains the email addresses (all lowercase, never NULL)
Your task: Write a SQL query to find all email addresses that appear more than once in the table.
Example:
Input table:
| id | |
|---|---|
| 1 | a@b.com |
| 2 | c@d.com |
| 3 | a@b.com |
Expected output:
| a@b.com |
Input & Output
example_1.sql โ Basic Duplicates
$
Input:
Person table:
| id | email |
| 1 | a@b.com |
| 2 | c@d.com |
| 3 | a@b.com |
โบ
Output:
| Email |
| a@b.com |
๐ก Note:
a@b.com appears twice (id 1 and 3), so it's a duplicate. c@d.com appears only once, so it's not included.
example_2.sql โ Multiple Duplicates
$
Input:
Person table:
| id | email |
| 1 | x@y.com |
| 2 | a@b.com |
| 3 | x@y.com |
| 4 | c@d.com |
| 5 | a@b.com |
| 6 | x@y.com |
โบ
Output:
| Email |
| x@y.com |
| a@b.com |
๐ก Note:
x@y.com appears 3 times and a@b.com appears 2 times, both are duplicates. c@d.com appears only once.
example_3.sql โ No Duplicates
$
Input:
Person table:
| id | email |
| 1 | john@x.com|
| 2 | jane@y.com|
| 3 | bob@z.com |
โบ
Output:
(empty result)
๐ก Note:
Each email appears exactly once, so there are no duplicates to report.
Constraints
- 1 โค table size โค 1000
- Email field is never NULL
- All emails are in lowercase
- id is the primary key (always unique)
- Return result in any order
Visualization
Tap to expand
Understanding the Visualization
1
Collect all emails
Gather all email addresses from the Person table
2
Group identical emails
Use GROUP BY to automatically sort emails into groups
3
Count group sizes
COUNT(*) tells us how many emails in each group
4
Filter large groups
HAVING COUNT(*) > 1 keeps only groups with duplicates
5
Extract email addresses
SELECT email returns the duplicate email addresses
Key Takeaway
๐ฏ Key Insight: GROUP BY automatically handles the grouping and counting logic, while HAVING filters the results based on aggregate conditions - perfect for finding duplicates efficiently!
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code