Python Pandas - Return Index with duplicate values removed keeping the last occurrence
To return an Index with duplicate values removed while keeping the last occurrence of each value, call the Index.drop_duplicates() method with the keep parameter set to 'last'.
Syntax
Index.drop_duplicates(keep='first')
Parameters
The keep parameter accepts the following values:
- 'first': Keep the first occurrence of each value (default)
- 'last': Keep the last occurrence of each value
- False: Drop every value that appears more than once, keeping none of its occurrences
Creating an Index with Duplicates
First, let's create a Pandas Index containing duplicate values:
import pandas as pd
# Creating the index with some duplicates
index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
# Display the index
print("Pandas Index with duplicates...")
print(index)
Pandas Index with duplicates...
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')
Removing Duplicates - Keep Last Occurrence
Use drop_duplicates(keep='last') to keep the last occurrence of each duplicate:
import pandas as pd
# Creating the index with some duplicates
index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
# Remove duplicates keeping the last occurrence
result = index.drop_duplicates(keep='last')
print("Index with duplicate values removed (keeping the last occurrence)...")
print(result)
Index with duplicate values removed (keeping the last occurrence)...
Index(['Car', 'Bike', 'Ship', 'Airplane'], dtype='object')
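Note that drop_duplicates() returns a new Index and leaves the original untouched. The short sketch below illustrates this with the same sample data; the has_duplicates and is_unique checks are used here only to make the difference visible.

```python
import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
result = index.drop_duplicates(keep='last')

# The original Index still holds both 'Airplane' entries
print(index.has_duplicates)  # True
print(len(index))            # 5

# The returned Index is a separate object containing only unique values
print(result.is_unique)      # True
print(len(result))           # 4
```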
Complete Example
Here's a complete example showing the original index and the result after removing duplicates:
import pandas as pd
# Creating the index with some duplicates
index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
# Display the original index
print("Original Index:")
print(index)
# Display index properties
print("\nIndex dtype:", index.dtype)
print("Index dimensions:", index.ndim)
# Remove duplicates keeping the last occurrence
result = index.drop_duplicates(keep='last')
print("\nAfter removing duplicates (keep='last'):")
print(result)
Original Index:
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')

Index dtype: object
Index dimensions: 1

After removing duplicates (keep='last'):
Index(['Car', 'Bike', 'Ship', 'Airplane'], dtype='object')
Comparison of Different Keep Options
| Keep Parameter | Description | Result for 'Airplane' |
|---|---|---|
| 'first' | Keep first occurrence | Position 2 (first 'Airplane') |
| 'last' | Keep last occurrence | Position 4 (last 'Airplane') |
| False | Remove all duplicates | No 'Airplane' in result |
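The comparison above can be checked by running all three options against the same sample index. This is a minimal sketch; the results dictionary is just a helper introduced here for illustration.

```python
import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])

# Collect the outcome of each keep option as a plain list
results = {}
for keep in ('first', 'last', False):
    results[keep] = index.drop_duplicates(keep=keep).tolist()
    print(f"keep={keep!r}: {results[keep]}")

# keep='first': ['Car', 'Bike', 'Airplane', 'Ship']
# keep='last': ['Car', 'Bike', 'Ship', 'Airplane']
# keep=False: ['Car', 'Bike', 'Ship']
```

With keep='first' the surviving 'Airplane' stays ahead of 'Ship', with keep='last' it moves behind it, and with keep=False it disappears entirely.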
Conclusion
Calling drop_duplicates(keep='last') removes duplicate values from a Pandas Index and retains the last occurrence of each one. The remaining elements keep their relative order, which is useful when later entries should take precedence over earlier ones.
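As a possible follow-up, the same "last occurrence wins" idea can be applied to data attached to an Index. The sketch below uses the related Index.duplicated() method rather than drop_duplicates(), and the readings Series with its values is hypothetical sample data, not something from the example above.

```python
import pandas as pd

# Hypothetical readings; the label 'Airplane' appears twice
readings = pd.Series(
    [10, 20, 30, 40, 50],
    index=['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'],
)

# duplicated(keep='last') flags every occurrence except the last one
mask = readings.index.duplicated(keep='last')

# Inverting the mask keeps only the last entry for each label
latest = readings[~mask]
print(latest)
```

Here the earlier 'Airplane' entry (30) is dropped and the later one (50) is kept.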
