Delete Duplicate Emails - Problem

You are given a table Person that contains email addresses with their corresponding IDs.

Your task: Write a DELETE statement to remove all duplicate emails, keeping only the one with the smallest ID.

Important: You must write a DELETE statement, not a SELECT statement. The solution should modify the table in place.

  • Each email may appear multiple times with different IDs
  • Keep the row with the minimum ID for each email
  • Delete all other duplicate rows

Table Schema

Person
Column Name Type Description
id PK int Primary key, unique identifier
email varchar Email address (no uppercase letters)
Primary Key: id
Note: Each row represents a person with their email address

Input & Output

Example 1 — Basic Duplicate Removal
Input Table:
id email
1 john@example.com
2 bob@example.com
3 john@example.com
Output:
id email
1 john@example.com
2 bob@example.com
💡 Note:

The email john@example.com appears twice with IDs 1 and 3. We keep the row with the smaller ID (1) and delete the duplicate with ID 3.

Example 2 — Multiple Duplicates
Input Table:
id email
1 alice@example.com
2 bob@example.com
3 alice@example.com
4 charlie@example.com
5 bob@example.com
Output:
id email
1 alice@example.com
2 bob@example.com
4 charlie@example.com
💡 Note:

Multiple emails have duplicates: alice@example.com (IDs 1,3) and bob@example.com (IDs 2,5). We keep the rows with smaller IDs (1,2) and delete the duplicates (3,5).

Constraints

  • 1 ≤ id ≤ 1000
  • email contains no uppercase letters
  • email follows valid email format

Visualization

Tap to expand
Delete Duplicate Emails INPUT: Person Table ID Email 1 john@ex.com 2 bob@ex.com 3 bob@ex.com Red = Duplicate to DELETE Table Structure: Person ( Id INT PRIMARY KEY, Email VARCHAR ) ALGORITHM STEPS 1 Self-Join Table Join Person p1 with p2 2 Match Emails WHERE p1.Email = p2.Email 3 Find Larger IDs AND p1.Id > p2.Id 4 Delete Duplicates DELETE p1 (larger Id rows) SQL Query: DELETE p1 FROM Person p1, Person p2 WHERE p1.Email=p2.Email AND p1.Id > p2.Id; FINAL RESULT ID Email 1 john@ex.com 3 bob@ex.com Row with Id=2 DELETED (duplicate bob@ex.com) OK - Duplicates Removed Kept smallest ID per email Summary: Rows before: 3 Rows deleted: 1 Rows after: 2 Key Insight: Self-join identifies duplicates by matching emails where one row has a larger ID than another. By deleting p1 where p1.Id > p2.Id, we keep only the row with the smallest ID for each email. TutorialsPoint - Delete Duplicate Emails | Optimal Solution
Asked in
Google 28 Amazon 22 Microsoft 18
125.0K Views
High Frequency
~12 min Avg. Time
892 Likes
Ln 1, Col 1
Smart Actions
💡 Explanation
AI Ready
💡 Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen