Find Duplicates in an Array Using a Bit Array in Python

Suppose we have an array whose values range from 1 to n, where n is at most 32,000. The array may contain duplicate entries, and we do not know the value of n. If only 4 kilobytes of memory are available, how would we display all the duplicates in the array?

So, if the input is [2, 6, 2, 11, 13, 11], then the output will be [2, 11], as 2 and 11 appear more than once in the given array.

How Bit Array Works

A bit array uses individual bits to track whether a number has been seen before. Each bit position represents a specific number, making it memory-efficient for tracking duplicates.
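
As a minimal sketch of the idea, a single Python integer can act as a tiny bit array for small values. The names `seen`, `mark` and `is_marked` are illustrative only and are not part of the implementation shown later.

```python
# A single int used as a tiny bit array (illustrative only).
seen = 0

def mark(seen, num):
    """Return seen with the bit for num switched on."""
    return seen | (1 << num)

def is_marked(seen, num):
    """Return True if the bit for num is already on."""
    return ((seen >> num) & 1) == 1

seen = mark(seen, 2)
seen = mark(seen, 6)

print(bin(seen))           # 0b1000100 (bits 2 and 6 are set)
print(is_marked(seen, 2))  # True
print(is_marked(seen, 3))  # False
```

Each number costs one bit here, which is the property the full bit array relies on.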

Algorithm Steps

To solve this, we will follow these steps:

  • Create a bit array data structure with methods to get and set bit values

  • For each number in the input array, check if its bit is already set

  • If the bit is set, the number is a duplicate

  • If not set, mark the bit to indicate we've seen this number

Implementation

Let us see the following implementation to get a better understanding:

class BitArray:
    def __init__(self, n):
        # Create a list of integers to hold the bits
        # Each element is treated as a 32-bit word, so we need (n >> 5) + 1 of them
        self.arr = [0] * ((n >> 5) + 1)
    
    def get_val(self, pos):
        # Find which integer contains our bit
        index = pos >> 5  # Divide by 32
        bit_no = pos & 31  # Get remainder (pos % 32)
        # Check if bit is set
        return (self.arr[index] & (1 << bit_no)) != 0
    
    def set_val(self, pos):
        # Find which integer contains our bit
        index = pos >> 5
        bit_no = pos & 31
        # Set the bit using OR operation
        self.arr[index] |= (1 << bit_no)

def find_duplicates(input_arr):
    # Create bit array for numbers up to 32000
    bit_arr = BitArray(32000)
    duplicates = []
    
    for num in input_arr:
        if bit_arr.get_val(num):
            # Number already seen, it's a duplicate
            duplicates.append(num)
        else:
            # First time seeing this number, mark it
            bit_arr.set_val(num)
    
    return duplicates

# Test the function
arr = [2, 6, 2, 11, 13, 11]
result = find_duplicates(arr)
print("Duplicates:", result)

Output:

Duplicates: [2, 11]

How It Works

The bit array uses bitwise operations for efficient memory usage:

  • pos >> 5 is equivalent to pos // 32 to find the array index

  • pos & 31 is equivalent to pos % 32 to find the bit position

  • 1 << bit_no creates a mask with only the target bit set

  • arr[index] |= mask sets the bit using OR operation

  • arr[index] & mask checks if the bit is set
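
To make the arithmetic concrete, the sketch below locates the bit for one sample value. The value 77 and the helper name `locate` are arbitrary choices for illustration.

```python
def locate(pos):
    """Return (word index, bit number, mask) for a position, using 32-bit words."""
    index = pos >> 5       # same as pos // 32
    bit_no = pos & 31      # same as pos % 32
    mask = 1 << bit_no     # only the target bit is set
    return index, bit_no, mask

index, bit_no, mask = locate(77)
print(index, bit_no, mask)   # 2 13 8192

# The shift/mask forms agree with division and modulo for non-negative values.
assert all((p >> 5, p & 31) == divmod(p, 32) for p in range(32001))
```

So the number 77 lives in word 2, at bit 13 of that word.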

Memory Efficiency

The bit array needs only 32,000 bits ÷ 8 bits per byte = 4,000 bytes to track every possible number, which fits within the 4 KB (4,096-byte) memory constraint.
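
Note that in CPython a list of int objects carries per-element overhead, so the 4,000-byte figure describes the bit payload rather than the real footprint of a list. As a hedged sketch, a `bytearray` can hold the same bits in exactly 4,000 bytes of payload. The class name `ByteBitArray`, the function name `find_duplicates_bytes`, the assumption that values lie in 1..32000, and the mapping of value v to bit v - 1 are all illustrative choices, not part of the original implementation.

```python
class ByteBitArray:
    """Bit array backed by a bytearray, one bit per value in 1..size."""

    def __init__(self, size):
        self.size = size
        # 8 bits per byte, rounded up
        self.data = bytearray((size + 7) // 8)

    def _locate(self, value):
        # Assumption: valid values are 1..size; anything else is rejected
        if not 1 <= value <= self.size:
            raise ValueError(f"value {value} outside 1..{self.size}")
        pos = value - 1
        return pos >> 3, 1 << (pos & 7)   # byte index, mask within that byte

    def get_val(self, value):
        index, mask = self._locate(value)
        return (self.data[index] & mask) != 0

    def set_val(self, value):
        index, mask = self._locate(value)
        self.data[index] |= mask


def find_duplicates_bytes(values, size=32000):
    bits = ByteBitArray(size)
    duplicates = []
    for value in values:
        if bits.get_val(value):
            duplicates.append(value)
        else:
            bits.set_val(value)
    return duplicates


print(len(ByteBitArray(32000).data))                 # 4000
print(find_duplicates_bytes([2, 6, 2, 11, 13, 11]))  # [2, 11]
```

The range check also guards against negative or oversized values, which would otherwise index the wrong bit or raise an IndexError.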

Example with Step-by-Step Process

def find_duplicates_verbose(input_arr):
    bit_arr = BitArray(32000)
    duplicates = []
    
    for i, num in enumerate(input_arr):
        print(f"Processing element {i}: {num}")
        
        if bit_arr.get_val(num):
            print(f"  -> {num} already seen, adding to duplicates")
            duplicates.append(num)
        else:
            print(f"  -> {num} not seen before, marking as seen")
            bit_arr.set_val(num)
    
    return duplicates

# Test with verbose output
arr = [2, 6, 2, 11, 13, 11]
result = find_duplicates_verbose(arr)
print(f"\nFinal duplicates: {result}")

Output:

Processing element 0: 2
  -> 2 not seen before, marking as seen
Processing element 1: 6
  -> 6 not seen before, marking as seen
Processing element 2: 2
  -> 2 already seen, adding to duplicates
Processing element 3: 11
  -> 11 not seen before, marking as seen
Processing element 4: 13
  -> 13 not seen before, marking as seen
Processing element 5: 11
  -> 11 already seen, adding to duplicates

Final duplicates: [2, 11]

Conclusion

The bit array approach finds duplicates with minimal memory by representing each possible number as a single bit. This technique works well when memory is tight and the values fall within a known, bounded range.

Updated on: 2026-03-25T09:37:06+05:30
