Python Pandas - Return Index with duplicate values removed except the first occurrence

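To return a Pandas Index with duplicate values removed except the first occurrence, use the index.drop_duplicates() method with the keep parameter set to 'first'. Because 'first' is the default value of keep, calling the method with no arguments gives the same result.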

Basic Syntax

The drop_duplicates() method syntax is:

index.drop_duplicates(keep='first')
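
As a quick check of the syntax above, the following minimal sketch compares the explicit keep='first' call with a call that relies on the default. The variable names (explicit, implicit, same) are illustrative only.

```python
import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])

# Explicitly request the first occurrence of each value
explicit = index.drop_duplicates(keep='first')

# Rely on the default value of keep, which is 'first'
implicit = index.drop_duplicates()

# Index.equals() compares the elements of the two Index objects
same = explicit.equals(implicit)
print(same)  # True
```

Spelling out keep='first' is optional, but it can make the intent clearer to readers of the code.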

Creating an Index with Duplicates

Let's create a Pandas Index containing duplicate values:

import pandas as pd

# Creating the index with some duplicates
index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
print("Original Index with duplicates:")
print(index)

Output:

Original Index with duplicates:
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')

Removing Duplicates (Keep First)

Use drop_duplicates(keep='first') to keep only the first occurrence of each duplicated value:

import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
print("Original Index:")
print(index)

# Remove duplicates keeping first occurrence
result = index.drop_duplicates(keep='first')
print("\nIndex with duplicates removed:")
print(result)

Output:

Original Index:
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')

Index with duplicates removed:
Index(['Car', 'Bike', 'Airplane', 'Ship'], dtype='object')
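
To see which positions are dropped, one option is to look at the boolean mask returned by Index.duplicated(), which accepts the same keep parameter. The sketch below is an illustration rather than a description of how drop_duplicates() works internally; the names mask and filtered are assumptions made for this example.

```python
import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])

# True marks an entry that repeats an earlier value
mask = index.duplicated(keep='first')
print(mask.tolist())  # [False, False, False, False, True]

# Selecting the entries that are NOT flagged gives the deduplicated Index
filtered = index[~mask]
print(filtered.tolist())  # ['Car', 'Bike', 'Airplane', 'Ship']

# The original Index is left untouched
print(index.tolist())  # ['Car', 'Bike', 'Airplane', 'Ship', 'Airplane']
```

Only the second 'Airplane' is flagged, so it is the only entry removed.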

Different Keep Options

The keep parameter accepts three values, 'first', 'last', and False, which control which occurrences are kept:

import pandas as pd

index = pd.Index(['A', 'B', 'C', 'B', 'A', 'D'])
print("Original Index:")
print(index)

# Keep first occurrence
first = index.drop_duplicates(keep='first')
print("\nKeep first:")
print(first)

# Keep last occurrence
last = index.drop_duplicates(keep='last')
print("\nKeep last:")
print(last)

# Drop every value that occurs more than once
none = index.drop_duplicates(keep=False)
print("\nKeep none (remove all duplicates):")
print(none)

Output:

Original Index:
Index(['A', 'B', 'C', 'B', 'A', 'D'], dtype='object')

Keep first:
Index(['A', 'B', 'C', 'D'], dtype='object')

Keep last:
Index(['C', 'B', 'A', 'D'], dtype='object')

Keep none (remove all duplicates):
Index(['C', 'D'], dtype='object')

Comparison Table

keep Parameter | Behavior | Use Case
'first' | Keep the first occurrence of each value | Default behavior; the earliest entry wins
'last' | Keep the last occurrence of each value | When the latest entry is more relevant
False | Drop every value that occurs more than once | Keep only values that appear exactly once
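
The keep=False row is easy to confuse with "one copy of each value". The short sketch below contrasts it with Index.unique(), which keeps one copy of every distinct value; the variable names distinct and singletons are illustrative.

```python
import pandas as pd

index = pd.Index(['A', 'B', 'C', 'B', 'A', 'D'])

# One copy of every distinct value, in order of first appearance
distinct = index.unique()
print(distinct.tolist())  # ['A', 'B', 'C', 'D']

# Only the values that occur exactly once in the original Index
singletons = index.drop_duplicates(keep=False)
print(singletons.tolist())  # ['C', 'D']
```

Here 'A' and 'B' each occur twice, so keep=False drops them entirely, while unique() keeps one copy of each.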

Conclusion

Use index.drop_duplicates(keep='first') to remove duplicate values while preserving the first occurrence of each. The method returns a new Index and leaves the original unchanged; the result keeps the original data type and the relative order of the remaining elements.
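
The points about data type and order can be checked with a numeric Index. This is a minimal sketch using made-up values; the names numbers and deduped are assumptions for illustration.

```python
import pandas as pd

numbers = pd.Index([3, 1, 3, 2, 1])
deduped = numbers.drop_duplicates(keep='first')

# Remaining elements stay in their original relative order
print(deduped.tolist())  # [3, 1, 2]

# The data type of the result matches the original
print(deduped.dtype == numbers.dtype)  # True

# The original Index still contains its duplicates
print(numbers.tolist())  # [3, 1, 3, 2, 1]
```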

Updated on: 2026-03-26T16:17:12+05:30
