Drop Duplicate Rows - Problem
You have a DataFrame called customers with the following structure:
| Column Name | Type |
|---|---|
| customer_id | int |
| name | object |
| object |
There are some duplicate rows in the DataFrame based on the email column.
Write a solution to remove these duplicate rows and keep only the first occurrence.
Return the cleaned DataFrame in the same format.
Input & Output
Example 1 — Basic Duplicate Removal
$
Input:
customers = [{"customer_id": 1, "name": "Ella", "email": "emily@example.com"}, {"customer_id": 2, "name": "David", "email": "michael@example.com"}, {"customer_id": 3, "name": "Zachary", "email": "sarah@example.com"}, {"customer_id": 4, "name": "Alice", "email": "emily@example.com"}]
›
Output:
[{"customer_id": 1, "name": "Ella", "email": "emily@example.com"}, {"customer_id": 2, "name": "David", "email": "michael@example.com"}, {"customer_id": 3, "name": "Zachary", "email": "sarah@example.com"}]
💡 Note:
Row with customer_id=4 (Alice) is removed because emily@example.com already exists in row with customer_id=1 (Ella). We keep the first occurrence.
Example 2 — Multiple Duplicates
$
Input:
customers = [{"customer_id": 1, "name": "John", "email": "john@email.com"}, {"customer_id": 2, "name": "Bob", "email": "bob@email.com"}, {"customer_id": 3, "name": "Johnny", "email": "john@email.com"}, {"customer_id": 4, "name": "Robert", "email": "bob@email.com"}]
›
Output:
[{"customer_id": 1, "name": "John", "email": "john@email.com"}, {"customer_id": 2, "name": "Bob", "email": "bob@email.com"}]
💡 Note:
Both john@email.com and bob@email.com have duplicates. We keep only the first occurrence of each email: John (ID=1) and Bob (ID=2).
Example 3 — No Duplicates
$
Input:
customers = [{"customer_id": 1, "name": "Alice", "email": "alice@email.com"}, {"customer_id": 2, "name": "Bob", "email": "bob@email.com"}]
›
Output:
[{"customer_id": 1, "name": "Alice", "email": "alice@email.com"}, {"customer_id": 2, "name": "Bob", "email": "bob@email.com"}]
💡 Note:
All emails are unique, so no rows are removed. The DataFrame remains unchanged.
Constraints
- 1 ≤ customers.length ≤ 104
- customer_id, name, and email are non-empty
- All customer_id values are unique
Visualization
Tap to expand
💡
Explanation
AI Ready
💡 Suggestion
Tab
to accept
Esc
to dismiss
// Output will appear here after running code