Python Pandas - Return Index with duplicate values removed except the first occurrence

To return a Pandas Index with duplicate values removed except for the first occurrence, use the index.drop_duplicates() method with the keep parameter set to 'first', which is also the default.

Basic Syntax

The drop_duplicates() method syntax is:

index.drop_duplicates(keep='first')
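
Because 'first' is the default value of keep, calling the method with no arguments should give the same result. A minimal sketch to check this (the variable names are only illustrative):

```python
import pandas as pd

idx = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])

# Call once with the default and once with keep='first' spelled out
default_result = idx.drop_duplicates()
explicit_result = idx.drop_duplicates(keep='first')

# Index.equals() compares the two results element by element
print(default_result.equals(explicit_result))  # True
```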

Creating an Index with Duplicates

Let's create a Pandas Index containing duplicate values:

import pandas as pd

# Creating the index with some duplicates
index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
print("Original Index with duplicates:")
print(index)

Original Index with duplicates:
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')

Removing Duplicates (Keep First)

Use drop_duplicates(keep='first') to keep only the first occurrence of each duplicated value:

import pandas as pd

index = pd.Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'])
print("Original Index:")
print(index)

# Remove duplicates keeping first occurrence
result = index.drop_duplicates(keep='first')
print("\nIndex with duplicates removed:")
print(result)

Original Index:
Index(['Car', 'Bike', 'Airplane', 'Ship', 'Airplane'], dtype='object')

Index with duplicates removed:
Index(['Car', 'Bike', 'Airplane', 'Ship'], dtype='object')

Different Keep Options

The keep parameter accepts different values for handling duplicates:

import pandas as pd

index = pd.Index(['A', 'B', 'C', 'B', 'A', 'D'])
print("Original Index:")
print(index)

# Keep first occurrence
first = index.drop_duplicates(keep='first')
print("\nKeep first:")
print(first)

# Keep last occurrence
last = index.drop_duplicates(keep='last')
print("\nKeep last:")
print(last)

# Drop every value that occurs more than once
unique_only = index.drop_duplicates(keep=False)
print("\nKeep none (remove all duplicates):")
print(unique_only)

Original Index:
Index(['A', 'B', 'C', 'B', 'A', 'D'], dtype='object')

Keep first:
Index(['A', 'B', 'C', 'D'], dtype='object')

Keep last:
Index(['C', 'B', 'A', 'D'], dtype='object')

Keep none (remove all duplicates):
Index(['C', 'D'], dtype='object')
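
To see which positions each option treats as duplicates, the related index.duplicated() method returns a boolean mask that accepts the same keep values. The sketch below uses it only for illustration; the masks dictionary is an assumed helper name, not part of pandas:

```python
import pandas as pd

index = pd.Index(['A', 'B', 'C', 'B', 'A', 'D'])

# True marks a position that drop_duplicates() would remove
masks = {}
for keep in ('first', 'last', False):
    mask = index.duplicated(keep=keep)
    masks[keep] = mask.tolist()
    print(keep, masks[keep])

    # Filtering with the inverted mask matches drop_duplicates()
    assert index[~mask].equals(index.drop_duplicates(keep=keep))
```

With keep=False, every position holding 'A' or 'B' is marked, which is why only 'C' and 'D' remain.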

Comparison Table

keep Parameter	Behavior	Use Case
`'first'`	Keep first occurrence	Default behavior, maintains original order
`'last'`	Keep last occurrence	When latest value is more relevant
`False`	Drop every occurrence of a duplicated value	Keep only values that appear exactly once
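
As one possible illustration of the 'last' use case, the sketch below keeps only the most recent entry per label in a Series. The readings data and its labels are made up for this example, and the filtering relies on index.duplicated() rather than drop_duplicates(), because the values have to be filtered together with the index:

```python
import pandas as pd

# Hypothetical sensor readings; label 's1' appears twice
readings = pd.Series([1.0, 2.0, 3.0, 4.0],
                     index=pd.Index(['s1', 's2', 's1', 's3']))

# Keep the last entry for each label and drop the earlier ones
latest = readings[~readings.index.duplicated(keep='last')]
print(latest)
```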

Conclusion

Use index.drop_duplicates(keep='first') to remove duplicate values while preserving the first occurrence. The method returns a new Index, leaves the original unchanged, and preserves the data type and relative order of the remaining elements.
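
A short check of these properties on an integer Index, as a sketch with illustrative values:

```python
import pandas as pd

numbers = pd.Index([30, 10, 30, 20, 10])
deduped = numbers.drop_duplicates(keep='first')

# Remaining values stay in first-seen order, not sorted
print(list(deduped))                   # [30, 10, 20]

# The data type carries over to the result
print(deduped.dtype == numbers.dtype)  # True

# The original Index still holds all five values
print(len(numbers))                    # 5
```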

AmitDiwan

Updated on: 2026-03-26T16:17:12+05:30
