Find the Shortest Superstring - Problem
Find the Shortest Superstring is a string optimization problem: combine several strings into one superstring that is as short as possible.

Given an array of strings words, your task is to construct the shortest possible string that contains each string in words as a substring. Think of it as creating a master string that encompasses all input strings with maximum overlap.

Key Points:
• If multiple valid strings exist with the same minimum length, return any of them
• No string in the input is a substring of another string
• The goal is to maximize overlaps between strings to minimize total length
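The overlap idea in the key points can be made concrete with a minimal sketch. The helper names overlap and merge are illustrative assumptions, not part of the problem statement.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest proper suffix of a that is also a prefix of b."""
    # Try the longest candidate first. A full-length overlap would mean one
    # string contains the other, which the problem rules out.
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def merge(a: str, b: str) -> str:
    """Append b to a, writing the shared characters only once."""
    return a + b[overlap(a, b):]


print(overlap("gcta", "ctaagt"))  # 3, the shared part is "cta"
print(merge("gcta", "ctaagt"))    # gctaagt
```

Every character of overlap between neighbouring strings shortens the superstring by one, which is why maximizing total overlap and minimizing total length are the same goal.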

Example: For ["catg", "ctaagt", "gcta"], the shortest superstring is "catgctaagt" (length 10). It chains catg → gcta → ctaagt, sharing 'g' and then 'cta'.

Input & Output

example_1.py — Basic Case
$ Input: ["catg", "ctaagt", "gcta"]
Output: "catgctaagt"
💡 Note: The strings can be arranged as catg → gcta → ctaagt. catg and gcta overlap by 'g' (1 char); gcta and ctaagt overlap by 'cta' (3 chars). Result: catg + cta + agt = catgctaagt, with length 14 - 4 = 10.
example_2.py — Simple Chain
$ Input: ["ab", "bc", "cd"]
Output: "abcd"
💡 Note: Perfect chain with single character overlaps: ab → bc → cd becomes ab + c + d = abcd
example_3.py — No Overlaps
$ Input: ["abc", "def", "ghi"]
Output: "abcdefghi"
💡 Note: When no overlaps exist between any strings, we simply concatenate them in any order. Total length equals sum of all string lengths.
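For inputs as small as these examples, the expected outputs can be checked by trying every ordering. The following is a brute-force sketch only: the function names are hypothetical, and the factorial running time makes it unsuitable for the full input range.

```python
from itertools import permutations


def _merge(a, b):
    # Glue b onto a using the longest suffix of a that is a prefix of b.
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return a + b[k:]
    return a + b


def shortest_superstring_bruteforce(words):
    """Try every ordering of words and keep the shortest merged string."""
    best = None
    for order in permutations(words):
        s = order[0]
        for w in order[1:]:
            s = _merge(s, w)
        if best is None or len(s) < len(best):
            best = s
    return best


print(shortest_superstring_bruteforce(["catg", "ctaagt", "gcta"]))  # catgctaagt
print(shortest_superstring_bruteforce(["ab", "bc", "cd"]))          # abcd
```

When several orderings tie, as in the third example, this sketch returns the first one it encounters, which is acceptable because any shortest answer is valid.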

Constraints

  • 1 ≤ words.length ≤ 12
  • 1 ≤ words[i].length ≤ 20
  • words[i] consists of lowercase English letters
  • No string is a substring of another string
  • All strings in words are unique
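The input-side constraints can be checked mechanically before running a solver. This is a sketch under the limits listed above; the function name satisfies_constraints is an assumption made for illustration.

```python
def satisfies_constraints(words):
    """Return True if words meets the input limits listed above."""
    if not 1 <= len(words) <= 12:
        return False
    for w in words:
        if not 1 <= len(w) <= 20:
            return False
        if not all("a" <= ch <= "z" for ch in w):
            return False
    # No word may occur inside a different entry. Comparing by index means
    # duplicate words are rejected too, since each contains the other.
    return not any(
        a in b
        for i, a in enumerate(words)
        for j, b in enumerate(words)
        if i != j
    )


print(satisfies_constraints(["catg", "ctaagt", "gcta"]))  # True
print(satisfies_constraints(["ab", "abc"]))               # False
```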

Visualization

Figure: Find the Shortest Superstring, DP approach (TutorialsPoint). Panels: input, overlap graph, algorithm steps, final result.

Input: words[] = ["catg", "ctaagt", "gcta"] at indices 0, 1, 2.

Algorithm steps (DP):
1. Build the overlap matrix. overlap[i][j] is the length of the longest suffix of words[i] that is also a prefix of words[j].

   i\j  0  1  2
   0    0  0  1
   1    0  0  0
   2    0  3  0

2. DP with bitmask. dp[mask][i] = minimum length of a superstring covering the strings in mask and ending at string i.
3. Find the optimal order by tracking parent pointers. Order: catg → gcta → ctaagt.
4. Build the superstring by concatenating with overlaps: "catg" + "cta" (overlap 1) + "agt" (overlap 3) = "catgctaagt".

Final result: "catgctaagt", 10 characters. Substring verification: "catg" at positions 0-3, "gcta" at positions 3-6, "ctaagt" at positions 4-9.

Key Insight: This is an NP-hard problem similar to the Traveling Salesman Problem. The bitmask DP approach has O(n^2 * 2^n) complexity. Strings are treated as nodes in a graph whose edge weights are overlaps, and maximizing total overlap minimizes superstring length. The bitmask tracks which strings are already included in the current partial solution.
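The bitmask DP described in the Key Insight can be sketched as follows. This is one possible implementation, not necessarily the one behind the figure: it maximizes total overlap rather than minimizing length (the two are equivalent, as the Key Insight notes), and all names are assumptions.

```python
def shortest_superstring(words):
    n = len(words)

    # overlap[i][j]: longest proper suffix of words[i] that is a prefix of words[j].
    overlap = [[0] * n for _ in range(n)]
    for i, a in enumerate(words):
        for j, b in enumerate(words):
            if i == j:
                continue
            for k in range(min(len(a), len(b)) - 1, 0, -1):
                if a.endswith(b[:k]):
                    overlap[i][j] = k
                    break

    # dp[mask][i]: best total overlap using exactly the words in mask, ending
    # with word i (-1 marks an unreachable state). parent stores the word
    # placed just before i so the order can be rebuilt.
    dp = [[-1] * n for _ in range(1 << n)]
    parent = [[-1] * n for _ in range(1 << n)]
    for i in range(n):
        dp[1 << i][i] = 0

    for mask in range(1, 1 << n):
        for last in range(n):
            if dp[mask][last] < 0:
                continue
            for nxt in range(n):
                if mask & (1 << nxt):
                    continue  # nxt is already part of this partial solution
                new_mask = mask | (1 << nxt)
                cand = dp[mask][last] + overlap[last][nxt]
                if cand > dp[new_mask][nxt]:
                    dp[new_mask][nxt] = cand
                    parent[new_mask][nxt] = last

    # Walk the parent pointers back from the best final state.
    mask = (1 << n) - 1
    last = max(range(n), key=lambda i: dp[mask][i])
    order = []
    while last != -1:
        order.append(last)
        prev = parent[mask][last]
        mask ^= 1 << last
        last = prev
    order.reverse()

    # Concatenate in that order, skipping the overlapping prefix of each word.
    result = words[order[0]]
    for prev, cur in zip(order, order[1:]):
        result += words[cur][overlap[prev][cur]:]
    return result


print(shortest_superstring(["catg", "ctaagt", "gcta"]))  # catgctaagt
```

With at most 12 words there are 2^12 = 4096 masks, so the n^2 * 2^n transition count stays under 600,000.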