Find same contacts in a list of contacts in Python

Finding duplicate contacts in a list is a common problem where we need to group contacts that belong to the same person. Two contacts are considered the same if they share any common field: username, email, or phone number.

This problem can be solved using graph theory with Depth First Search (DFS). We treat each contact as a node and store the graph as an adjacency matrix, with an edge between any two contacts that share a field.

Problem Statement

Given a list of contacts with three fields each (username, email, phone), the task is as follows:

  • A contact can store username, email and phone fields in any order

  • Two contacts are the same if they share a username, email, or phone number

  • Return groups of contact indices that belong to the same person

Solution Approach

We'll use a graph-based approach with these steps:

  1. Create adjacency matrix: Build a graph where contacts are connected if they share any field

  2. Apply DFS: Use depth-first search to find connected components

  3. Group results: Each connected component represents contacts of the same person
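Because fields can appear in any order, the "share any field" test in step 1 amounts to asking whether two contacts have a value in common. A minimal sketch of that check, assuming each contact is represented as a plain tuple of three strings (the tuple representation and the name `shares_field` are illustrative, not part of the implementation in this article):

```python
def shares_field(a, b):
    """Return True if contacts a and b have at least one field value in common.

    Each contact is assumed to be a tuple of three strings (a hypothetical
    representation for this sketch); field order does not matter.
    """
    return not set(a).isdisjoint(b)


amal = ("Amal", "amal@gmail.com", "+915264")
amal123 = ("Amal123", "+915264", "amal_new@gmail.com")
bimal = ("Bimal", "bimal321@yahoo.com", "+1234567")

print(shares_field(amal, amal123))  # True: both contain "+915264"
print(shares_field(amal, bimal))    # False: no value in common
```

This is equivalent to comparing all nine slot pairs, just written as a set operation.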

Implementation

class Contact:
    def __init__(self, slot1, slot2, slot3):
        self.slot1 = slot1
        self.slot2 = slot2
        self.slot3 = slot3

def generate_graph(contacts, n, matrix):
    # Initialize matrix with zeros
    for i in range(n):
        for j in range(n):
            matrix[i][j] = 0
    
    # Check each pair of contacts
    for i in range(n):
        for j in range(i + 1, n):
            # Check if any field matches
            if (contacts[i].slot1 == contacts[j].slot1 or 
                contacts[i].slot1 == contacts[j].slot2 or 
                contacts[i].slot1 == contacts[j].slot3 or
                contacts[i].slot2 == contacts[j].slot1 or 
                contacts[i].slot2 == contacts[j].slot2 or 
                contacts[i].slot2 == contacts[j].slot3 or
                contacts[i].slot3 == contacts[j].slot1 or 
                contacts[i].slot3 == contacts[j].slot2 or 
                contacts[i].slot3 == contacts[j].slot3):
                
                # Mark the edge in both directions. Do not break here:
                # contact i may also match later contacts.
                matrix[i][j] = 1
                matrix[j][i] = 1

def visit_using_dfs(i, matrix, visited, group, n):
    visited[i] = True
    group.append(i)
    
    for j in range(n):
        if matrix[i][j] and not visited[j]:
            visit_using_dfs(j, matrix, visited, group, n)

def find_similar_contacts(contacts):
    n = len(contacts)
    matrix = [[0] * n for i in range(n)]
    visited = [False] * n
    result = []
    
    # Generate adjacency matrix
    generate_graph(contacts, n, matrix)
    
    # Find connected components using DFS
    for i in range(n):
        if not visited[i]:
            group = []
            visit_using_dfs(i, matrix, visited, group, n)
            result.append(group)
    
    return result

# Example usage
contacts = [
    Contact("Amal", "amal@gmail.com", "+915264"),
    Contact("Bimal", "bimal321@yahoo.com", "+1234567"),
    Contact("Amal123", "+915264", "amal_new@gmail.com"),
    Contact("AmalAnother", "+962547", "amal_new@gmail.com")
]

groups = find_similar_contacts(contacts)
for group in groups:
    print(group)

Output

[0, 2, 3]
[1]

How It Works

The algorithm works in three phases:

  1. Graph Construction: Create an adjacency matrix where matrix[i][j] = 1 if contacts i and j share any field

  2. DFS Traversal: For each unvisited contact, perform DFS to find all connected contacts

  3. Grouping: Each DFS traversal gives us one group of related contacts
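To make the graph construction phase concrete, the sketch below builds the adjacency matrix for the four example contacts and prints it. It assumes contacts are plain tuples and uses a hypothetical helper `build_matrix`; it illustrates the idea and is not a replacement for `generate_graph`.

```python
contacts = [
    ("Amal", "amal@gmail.com", "+915264"),
    ("Bimal", "bimal321@yahoo.com", "+1234567"),
    ("Amal123", "+915264", "amal_new@gmail.com"),
    ("AmalAnother", "+962547", "amal_new@gmail.com"),
]


def build_matrix(contacts):
    """Return an n x n matrix with 1 where two distinct contacts share a value."""
    n = len(contacts)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # A non-empty intersection means at least one common field value
            if set(contacts[i]) & set(contacts[j]):
                matrix[i][j] = matrix[j][i] = 1
    return matrix


matrix = build_matrix(contacts)
for row in matrix:
    print(row)
# [0, 0, 1, 0]
# [0, 0, 0, 0]
# [1, 0, 0, 1]
# [0, 0, 1, 0]
```

Row 1 is all zeros because "Bimal" shares nothing with the others. Contacts 0 and 3 have no direct edge, but both connect to contact 2, which is why DFS places them in the same group.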

Example Explanation

In our example:

  • Contact 0 ("Amal") and Contact 2 ("Amal123") share phone "+915264"

  • Contact 2 and Contact 3 share email "amal_new@gmail.com"

  • Contact 1 ("Bimal") shares no fields with others

This creates two groups: [0, 2, 3] and [1].

Time Complexity

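The time complexity is O(n²), where n is the number of contacts. Graph construction compares every pair of contacts, with a constant nine field comparisons per pair, and DFS over an adjacency matrix scans one full row for each visited contact. The matrix itself also takes O(n²) space.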
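One practical caveat for the traversal: the recursive `visit_using_dfs` goes one call deeper for every contact in a chain, so a very long chain of linked contacts could exceed Python's default recursion limit. A stack-based DFS keeps the same O(n²) cost without recursion. The sketch below is one possible variant under that assumption; the function name `connected_components` is hypothetical, and the matrix is the example graph written out by hand.

```python
def connected_components(matrix):
    """Group node indices into connected components using an explicit stack."""
    n = len(matrix)
    visited = [False] * n
    groups = []
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack = [start]
        group = []
        while stack:
            i = stack.pop()
            group.append(i)
            # Scan row i of the matrix for unvisited neighbours
            for j in range(n):
                if matrix[i][j] and not visited[j]:
                    visited[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


# Adjacency matrix of the example: edges 0-2 and 2-3, contact 1 isolated
example_matrix = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
print(connected_components(example_matrix))  # [[0, 2, 3], [1]]
```

Marking a node as visited when it is pushed, not when it is popped, ensures each index enters the stack at most once.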

Conclusion

This graph-based approach efficiently groups contacts that belong to the same person by finding connected components. The DFS algorithm ensures all related contacts are grouped together, even when connections are indirect.

Updated on: 2026-03-25T09:23:36+05:30
