Find the Duplicate Number in Python
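Given an array of n + 1 integers in which every value lies between 1 and n (inclusive), at least one value must appear more than once, by the pigeonhole principle. This problem can be solved efficiently using Floyd's Cycle Detection Algorithm (also known as the tortoise and hare algorithm).
The key insight is to treat the array as a linked list in which each index i points to index nums[i]. Every value is at least 1, so nothing points back to index 0, and a walk that starts there must eventually revisit an index. The index where the walk enters the cycle is reached from two different indices, which therefore hold the same value, so the start of the cycle is a duplicate number.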
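To make the linked-list view concrete, here is a minimal sketch that groups indices by the index they point to. The helper name `incoming_pointers` is hypothetical and exists only for illustration; it uses O(n) extra space, so it is a way to inspect the structure rather than part of the algorithm itself.

```python
from collections import defaultdict


def incoming_pointers(nums):
    """Map each target index to the list of indices that point to it.

    Index i "points to" index nums[i]. Hypothetical helper for illustration.
    """
    sources = defaultdict(list)
    for index, value in enumerate(nums):
        sources[value].append(index)
    return dict(sources)


pointers = incoming_pointers([1, 3, 4, 2, 2])
for target, origins in sorted(pointers.items()):
    print(f"index {target} is pointed to by {origins}")
# index 1 is pointed to by [0]
# index 2 is pointed to by [3, 4]
# index 3 is pointed to by [1]
# index 4 is pointed to by [2]
```

Index 2 has two incoming pointers, which is what creates the cycle, and index 0 has none because every value is at least 1.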
Algorithm Steps
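The solution works in two phases:
- Phase 1: Move a slow pointer one step and a fast pointer two steps at a time until they meet at some index inside the cycle
- Phase 2: Restart one pointer from the beginning and advance both pointers one step at a time; they meet at the start of the cycle (which is our duplicate number)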
Implementation
def find_duplicate(nums):
    # Phase 1: Detect cycle using Floyd's algorithm
    slow = nums[0]
    fast = nums[0]

    # Move slow one step, fast two steps
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break

    # Phase 2: Find the start of cycle
    ptr = nums[0]
    while ptr != slow:
        ptr = nums[ptr]
        slow = nums[slow]

    return ptr
# Test with examples
test_arrays = [
    [1, 3, 4, 2, 2],
    [3, 1, 3, 4, 2],
    [1, 1, 2],
    [2, 2, 2, 2, 2]
]

for arr in test_arrays:
    duplicate = find_duplicate(arr)
    print(f"Array: {arr} → Duplicate: {duplicate}")

Output:
Array: [1, 3, 4, 2, 2] → Duplicate: 2
Array: [3, 1, 3, 4, 2] → Duplicate: 3
Array: [1, 1, 2] → Duplicate: 1
Array: [2, 2, 2, 2, 2] → Duplicate: 2
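To see why the second phase is needed, the sketch below is a variant of `find_duplicate` that also returns the index where the two pointers first meet. The name `floyd_phases` and the sample array are assumptions made for this illustration; the pointer logic is otherwise the same as in the function above.

```python
def floyd_phases(nums):
    """Return (meeting_point, duplicate) so both phases can be inspected.

    Hypothetical variant of find_duplicate that also exposes the
    phase-1 meeting point.
    """
    # Phase 1: slow moves one step, fast moves two, until they meet
    slow = fast = nums[0]
    while True:
        slow = nums[slow]
        fast = nums[nums[fast]]
        if slow == fast:
            break
    meeting_point = slow

    # Phase 2: restart one pointer and advance both one step at a time
    ptr = nums[0]
    while ptr != slow:
        ptr = nums[ptr]
        slow = nums[slow]

    return meeting_point, ptr


meeting_point, duplicate = floyd_phases([1, 2, 3, 4, 2])
print(f"Meeting point: {meeting_point}, duplicate: {duplicate}")
# Meeting point: 4, duplicate: 2
```

Here the pointers meet at index 4, which lies inside the cycle but is not the duplicate; only the second phase walks back to the cycle start at 2.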
How It Works
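Consider the array [1, 3, 4, 2, 2] as a linked list:
- Index 0 → value 1 → go to index 1
- Index 1 → value 3 → go to index 3
- Index 3 → value 2 → go to index 2
- Index 2 → value 4 → go to index 4
- Index 4 → value 2 → go to index 2 (cycle detected!)
Indices 3 and 4 both hold the value 2, so both point to index 2. That index is where the cycle begins, and 2 is the duplicate.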
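The walk above can be reproduced with a short sketch. The helper `trace_walk` is hypothetical and records visited indices in a set, so it uses O(n) extra space; it is meant for inspecting the path, not as a replacement for the constant-space algorithm.

```python
def trace_walk(nums):
    """Follow index -> nums[index] from index 0 until an index repeats.

    Returns (path, first_repeated_index). Hypothetical helper that uses
    a set of visited indices, so it needs O(n) extra space.
    """
    path = [0]
    seen = {0}
    current = nums[0]
    while current not in seen:
        seen.add(current)
        path.append(current)
        current = nums[current]
    path.append(current)  # the first index visited twice
    return path, current


path, cycle_start = trace_walk([1, 3, 4, 2, 2])
print(" -> ".join(map(str, path)))
# 0 -> 1 -> 3 -> 2 -> 4 -> 2
print(f"Cycle starts at index {cycle_start}")
# Cycle starts at index 2
```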
Class-Based Solution
class DuplicateFinder:
    def find_duplicate(self, nums):
        """
        Find duplicate number using Floyd's Cycle Detection
        Time Complexity: O(n), Space Complexity: O(1)
        """
        # Phase 1: Detect cycle
        tortoise = nums[0]
        hare = nums[0]

        while True:
            tortoise = nums[tortoise]
            hare = nums[nums[hare]]
            if tortoise == hare:
                break

        # Phase 2: Find cycle start
        ptr = nums[0]
        while ptr != tortoise:
            ptr = nums[ptr]
            tortoise = nums[tortoise]

        return ptr

# Example usage
finder = DuplicateFinder()
result = finder.find_duplicate([3, 1, 3, 4, 2])
print(f"Duplicate number: {result}")

Output:
Duplicate number: 3
Comparison with Other Methods
| Method | Time Complexity | Space Complexity | Modifies Array? |
|---|---|---|---|
| Floyd's Algorithm | O(n) | O(1) | No |
| Hash Set | O(n) | O(n) | No |
| In-place Sorting | O(n log n) | O(1) to O(n), depending on the sort algorithm | Yes |
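For comparison, the two alternative methods in the table could be sketched as follows. The function names are hypothetical, and both return `None` when no duplicate exists, which is a choice made for this example. The sorting version calls `list.sort()`, which reorders the caller's list and may allocate temporary memory internally.

```python
def find_duplicate_hash_set(nums):
    """Return the first value seen twice, using O(n) extra space."""
    seen = set()
    for value in nums:
        if value in seen:
            return value
        seen.add(value)
    return None  # no duplicate found


def find_duplicate_sorting(nums):
    """Sort in place, then look for two equal neighbours."""
    nums.sort()  # modifies the caller's list
    for i in range(1, len(nums)):
        if nums[i] == nums[i - 1]:
            return nums[i]
    return None  # no duplicate found


data = [3, 1, 3, 4, 2]
print(find_duplicate_hash_set(data))  # 3
print(find_duplicate_sorting(data))   # 3
print(data)                           # [1, 2, 3, 3, 4]
```

The last line shows the trade-off listed in the table: after the sorting approach runs, the original order of `data` is lost.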
Conclusion
Floyd's Cycle Detection Algorithm finds the duplicate in O(n) time and O(1) extra space without modifying the array. It works by treating the array as a linked list and locating the start of the cycle, which makes it both memory-efficient and elegant.
