Find the Longest Duplicate Substring

You're given a string s and need to find the longest substring that appears at least twice in the string. The duplicate occurrences can overlap with each other.

Goal: Return any duplicate substring with the maximum possible length. If no duplicates exist, return an empty string "".

Example: In the string "banana", both "an" and "ana" appear twice, but "ana" is longer, so we return "ana".

This problem tests your understanding of string processing, hashing techniques, and binary search optimization.

Input & Output

example_1.py — Basic Duplicate
$ Input: s = "banana"
Output: "ana"
💡 Note: Both "an" and "ana" appear twice in "banana", but "ana" is longer so we return it. The duplicates are at positions (1-3) and (3-5).
example_2.py — No Duplicates
$ Input: s = "abcd"
Output: ""
💡 Note: All characters are unique, so no substring appears more than once. Return empty string.
example_3.py — Overlapping Duplicates
$ Input: s = "aa"
Output: "a"
💡 Note: The character "a" appears twice (overlapping is allowed). This is the longest possible duplicate.

Visualization

Tap to expand
Find longest duplicate in: "banana"Binary Search ProcessL=0M=2R=5Test length 3...Rolling Hash for Length 3banh₁=123pos 0anah₂=456pos 1nanh₃=789pos 2anah₄=456pos 3Match!h₂ = h₄ → Found duplicate "ana" at positions 1 and 3Result: "ana" (length 3)
Understanding the Visualization
1
Set Search Range
Binary search between minimum (0) and maximum possible duplicate length (n-1)
2
Test Middle Length
For the middle length, use rolling hash to check all substrings of that length
3
Rolling Hash Magic
Calculate hash in O(1) time for each new substring by removing leftmost and adding rightmost character
4
Duplicate Detection
If same hash appears twice, verify it's a real match (not just hash collision)
5
Adjust Search
If duplicates found, try longer lengths; otherwise try shorter lengths
Key Takeaway
🎯 Key Insight: Binary search narrows down the optimal length efficiently, while rolling hash avoids expensive string comparisons by computing hashes in constant time.

Time & Space Complexity

Time Complexity
⏱️
O(n² log n)

Binary search O(log n) times, each taking O(n²) for rolling hash

n
2n
Quadratic Growth
Space Complexity
O(n)

Hash set to store rolling hash values

n
2n
Linearithmic Space

Constraints

  • 2 ≤ s.length ≤ 3 × 104
  • s consists of lowercase English letters only
  • Time limit: Solution must run efficiently for maximum constraints
Asked in
Google 25 Amazon 18 Microsoft 15 Meta 12
67.0K Views
Medium Frequency
~35 min Avg. Time
1.9K Likes
Ln 1, Col 1
Smart Actions
💡 Explanation
AI Ready
💡 Suggestion Tab to accept Esc to dismiss
// Output will appear here after running code
Code Editor Closed
Click the red button to reopen