Python Pandas - Compute indexer and mask for new index even for non-uniquely valued objects

To compute indexer and mask for new index even for non-uniquely valued objects, use the Index.get_indexer_non_unique() method. It works when the index contains duplicate values and returns two arrays: the positions in the index of every match, and the positions in the target of the values that were not found.

Syntax

The syntax for the get_indexer_non_unique() method is:

index.get_indexer_non_unique(target)

Parameters

  • target − An array-like object containing the values to search for in the index

Return Value

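The method returns a tuple containing:

  • indexer − Array of integer positions in the index, one entry per match (-1 for a target value with no match). A target value that matches several index entries contributes one entry per match, so this array can be longer than the target
  • missing − Array of integer positions in the target whose values were not found in the index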
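
As a minimal sketch of the return shape (the data and variable names below are illustrative only), the following shows that both elements of the tuple are NumPy arrays and that the indexer can be longer than the target when the index contains duplicates:

```python
import numpy as np
import pandas as pd

# Small index with one duplicated value (illustrative data)
idx = pd.Index([1, 2, 2, 3])
target = [2, 9]

indexer, missing = idx.get_indexer_non_unique(target)

# Both results are NumPy integer arrays
print(type(indexer).__name__, type(missing).__name__)

# 2 matches two index positions and 9 contributes a single -1,
# so a two-element target yields a three-element indexer
print(len(target), len(indexer))
print(indexer.tolist(), missing.tolist())
```

Here the second array holds a position in the target, not in the index: 9 is the element at target position 1.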

Example

Let's create a Pandas index with duplicate values and compute the indexer and missing arrays:

import pandas as pd

# Creating Pandas index with some non-unique values
index = pd.Index([10, 20, 30, 40, 40, 50, 60, 60, 60, 70])

# Display the Pandas index
print("Pandas Index...\n", index)

# Return the number of elements in the index
print("\nNumber of elements in the index...\n", index.size)

# Compute indexer and mask for target values
# Returns (-1) for values not found in index
# Handles non-unique values by returning all matching positions
target_values = [30, 40, 90, 100, 50, 60]
indexer, missing = index.get_indexer_non_unique(target_values)

print("\nTarget values:", target_values)
print("Indexer array:", indexer)
print("Missing indices:", missing)

Output

Pandas Index...
 Int64Index([10, 20, 30, 40, 40, 50, 60, 60, 60, 70], dtype='int64')

Number of elements in the index...
10

Target values: [30, 40, 90, 100, 50, 60]
Indexer array: [ 2  3  4 -1 -1  5  6  7  8]
Missing indices: [2 3]

How It Works

The method processes each target value as follows:

  • Value 30 − Found at index position 2
  • Value 40 − Found at positions 3 and 4 (both occurrences returned)
  • Value 90 − Not found, marked as -1
  • Value 100 − Not found, marked as -1
  • Value 50 − Found at index position 5
  • Value 60 − Found at positions 6, 7, and 8 (all occurrences returned)

The missing array [2, 3] holds zero-based positions in the target, not in the index: the elements at target positions 2 and 3 (the values 90 and 100) were not found.
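
The walk-through above can be sketched as a plain Python loop. The helper reference_indexer_non_unique below is a hypothetical illustration written for this article, not how pandas implements the method internally; it is only meant to mirror the observable result on this example. The sketch also derives a boolean mask over the target from the missing array:

```python
import numpy as np
import pandas as pd

def reference_indexer_non_unique(index_values, target_values):
    """Hypothetical pure-Python illustration, not pandas internals."""
    indexer, missing = [], []
    for target_pos, value in enumerate(target_values):
        # Collect every position in the index that holds this value
        matches = [pos for pos, v in enumerate(index_values) if v == value]
        if matches:
            indexer.extend(matches)
        else:
            # No match: one -1 placeholder, and remember the target position
            indexer.append(-1)
            missing.append(target_pos)
    return np.array(indexer, dtype=np.intp), np.array(missing, dtype=np.intp)

index = pd.Index([10, 20, 30, 40, 40, 50, 60, 60, 60, 70])
target_values = [30, 40, 90, 100, 50, 60]

ref_indexer, ref_missing = reference_indexer_non_unique(list(index), target_values)
pd_indexer, pd_missing = index.get_indexer_non_unique(target_values)

print(ref_indexer.tolist())  # [2, 3, 4, -1, -1, 5, 6, 7, 8]
print(ref_missing.tolist())  # [2, 3]
print(np.array_equal(ref_indexer, pd_indexer))

# Boolean mask over the target: True where the value exists in the index
found_mask = np.ones(len(target_values), dtype=bool)
found_mask[pd_missing] = False
print(found_mask.tolist())  # [True, True, False, False, True, True]
```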

Key Points

  • Unlike get_indexer(), which requires a uniquely valued index, this method works when the index contains duplicates
  • Returns every matching position for a target value that occurs more than once in the index
  • Target values with no match are marked with -1 in the indexer array
  • The missing array contains the positions in the target of the values that weren't found
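
A short sketch of the contrast with get_indexer(), assuming a pandas version that exposes InvalidIndexError in pandas.errors (the index values here are illustrative):

```python
import pandas as pd
from pandas.errors import InvalidIndexError

unique_index = pd.Index([10, 20, 30])
dup_index = pd.Index([10, 20, 20, 30])

# On a unique index, get_indexer() returns one position per target value
print(unique_index.get_indexer([30, 99]).tolist())  # [2, -1]

# On an index with duplicates, get_indexer() refuses to run
try:
    dup_index.get_indexer([20])
    raised = False
except InvalidIndexError:
    raised = True
print("get_indexer raised:", raised)

# get_indexer_non_unique() accepts the same index and reports both matches
indexer, missing = dup_index.get_indexer_non_unique([20])
print(indexer.tolist(), missing.tolist())  # [1, 2] []
```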

Conclusion

The get_indexer_non_unique() method is the tool to use when looking up values in a Pandas index that contains duplicates. It returns the positions of all matches and identifies which target values are missing, which makes it useful for data alignment operations with duplicate entries.

Updated on: 2026-03-26T16:32:19+05:30
