Delete Duplicate Emails - Problem
Email Database Cleanup Challenge
You're working as a database administrator for a company that has been collecting user emails for years. Over time, the same email addresses have been registered multiple times with different IDs, creating duplicates in your
The Problem: Clean up the database by removing duplicate emails, but keep the record with the smallest ID (the oldest registration).
Table Structure:
Your Task: Write a
Note: This is a database manipulation problem - you're actually modifying the table structure, not just querying it!
You're working as a database administrator for a company that has been collecting user emails for years. Over time, the same email addresses have been registered multiple times with different IDs, creating duplicates in your
Person table.The Problem: Clean up the database by removing duplicate emails, but keep the record with the smallest ID (the oldest registration).
Table Structure:
Person +-------------+---------+ | Column Name | Type | +-------------+---------+ | id | int | | email | varchar | +-------------+---------+
Your Task: Write a
DELETE statement (not SELECT) that removes all duplicate emails, preserving only the entry with the smallest id for each unique email.Note: This is a database manipulation problem - you're actually modifying the table structure, not just querying it!
Input & Output
example_1.sql โ Basic Duplicate Removal
$
Input:
Person table:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | bob@example.com |
| 3 | john@example.com |
+----+------------------+
โบ
Output:
Person table after deletion:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | bob@example.com |
+----+------------------+
๐ก Note:
The row with id=3 is deleted because it's a duplicate of john@example.com with a higher ID than the existing row (id=1).
example_2.sql โ Multiple Duplicates
$
Input:
Person table:
+----+------------------+
| id | email |
+----+------------------+
| 1 | alice@example.com|
| 2 | bob@example.com |
| 3 | alice@example.com|
| 4 | charlie@ex.com |
| 5 | alice@example.com|
+----+------------------+
โบ
Output:
Person table after deletion:
+----+------------------+
| id | email |
+----+------------------+
| 1 | alice@example.com|
| 2 | bob@example.com |
| 4 | charlie@ex.com |
+----+------------------+
๐ก Note:
Rows with id=3 and id=5 are deleted because they are duplicates of alice@example.com. Only the record with the smallest ID (id=1) is kept.
example_3.sql โ No Duplicates
$
Input:
Person table:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | jane@example.com |
| 3 | bob@example.com |
+----+------------------+
โบ
Output:
Person table after deletion:
+----+------------------+
| id | email |
+----+------------------+
| 1 | john@example.com |
| 2 | jane@example.com |
| 3 | bob@example.com |
+----+------------------+
๐ก Note:
No rows are deleted because all email addresses are unique. The table remains unchanged.
Visualization
Tap to expand
Understanding the Visualization
1
Identify Duplicates
Find all email addresses that appear multiple times in the database
2
Compare IDs
For each duplicate email, compare the ID numbers to find the smallest (oldest)
3
Mark for Deletion
Mark all duplicate entries with larger IDs for removal
4
Execute Deletion
Remove the marked records, leaving only unique emails with smallest IDs
Key Takeaway
๐ฏ Key Insight: Using a self-join with DELETE allows us to compare each row with all other rows having the same email and remove duplicates with higher IDs in a single, efficient database operation.
Time & Space Complexity
Time Complexity
O(nยฒ)
Self-join operation compares every row with every other row
โ Quadratic Growth
Space Complexity
O(n)
Temporary space needed for join operation and result set
โก Linearithmic Space
Constraints
- 1 โค Person table rows โค 104
- All emails are in lowercase format
- id is unique and serves as the primary key
- Email length: 1 โค email.length โค 100
๐ก
Explanation
AI Ready
๐ก Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code