How to find the longest common substring from more than two strings in Python?
A substring is a contiguous sequence of characters within a string. When working with multiple strings, finding the longest common substring identifies the longest pattern that all of them share.
There are various ways to find the longest common substring from more than two strings. One of the most common approaches is using dynamic programming. Let's explore this approach in detail.
Using Dynamic Programming
In this approach, we use dynamic programming (DP) to compare strings pairwise. We start with the first two strings and find their longest common substring using a DP table.
The algorithm works as follows:
- Create a 2D DP table where dp[i][j] represents the length of the common substring ending at position i in the first string and position j in the second
- If the characters at positions (i, j) match, extend the current substring: dp[i][j] = dp[i-1][j-1] + 1; otherwise dp[i][j] stays 0
- Track the maximum length and ending position to extract the actual substring
- Repeat this process with the result and the next string until all strings are processed
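To make the table concrete, the sketch below builds it for two short strings and prints each row. The helper name build_lcs_table and the sample strings are illustrative assumptions and do not come from the examples that follow.

```python
def build_lcs_table(str1, str2):
    """Return the DP table of common-suffix lengths for str1 and str2."""
    m, n = len(str1), len(str2)
    # Row 0 and column 0 stay 0 and act as the "empty prefix" boundary.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                # Matching characters extend the run ending at (i-1, j-1).
                dp[i][j] = dp[i - 1][j - 1] + 1
    return dp


table = build_lcs_table("abab", "bab")
for row in table:
    print(row)
# [0, 0, 0, 0]
# [0, 0, 1, 0]
# [0, 1, 0, 2]
# [0, 0, 2, 0]
# [0, 1, 0, 3]
```

The largest entry is 3 at row 4, so the longest common substring ends at position 4 of "abab" and has length 3, which gives "bab".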
Example 1: Finding Common Substring
Let's look at a basic example where the strings have a common substring:
def find_longest_common_substring(strings):
    def lcs_two_strings(str1, str2):
        m, n = len(str1), len(str2)
        # dp[i][j] = length of the common substring ending at str1[i-1] and str2[j-1]
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        max_length, ending_pos = 0, 0

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if str1[i - 1] == str2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1] + 1
                    # Remember where the longest match seen so far ends in str1
                    if dp[i][j] > max_length:
                        max_length = dp[i][j]
                        ending_pos = i

        return str1[ending_pos - max_length: ending_pos]

    # Start with first string, then compare with others
    common_substring = strings[0]
    for i in range(1, len(strings)):
        common_substring = lcs_two_strings(common_substring, strings[i])
        if not common_substring:  # No common substring found
            break

    return common_substring

# Test with strings having common substring
strings = ["1123212", "2311232", "2112312"]
result = find_longest_common_substring(strings)
print(f"Longest common substring: '{result}'")
Longest common substring: '1123'
Example 2: No Common Substring
Let's see what happens when the strings have no common substring:
def find_longest_common_substring(strings):
    def lcs_two_strings(str1, str2):
        m, n = len(str1), len(str2)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        max_length, ending_pos = 0, 0

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if str1[i - 1] == str2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1] + 1
                    if dp[i][j] > max_length:
                        max_length = dp[i][j]
                        ending_pos = i

        return str1[ending_pos - max_length: ending_pos]

    common_substring = strings[0]
    for i in range(1, len(strings)):
        common_substring = lcs_two_strings(common_substring, strings[i])
        if not common_substring:
            break

    return common_substring

# Test with strings having no common substring
strings = ["TP", "TutorialsPoint", "Welcome"]
result = find_longest_common_substring(strings)
print(f"Longest common substring: '{result}'")
Longest common substring: ''
Example 3: Multiple Common Substrings
Here's an example with multiple strings containing various common substrings:
def find_longest_common_substring(strings):
    def lcs_two_strings(str1, str2):
        m, n = len(str1), len(str2)
        dp = [[0] * (n + 1) for _ in range(m + 1)]
        max_length, ending_pos = 0, 0

        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if str1[i - 1] == str2[j - 1]:
                    dp[i][j] = dp[i - 1][j - 1] + 1
                    if dp[i][j] > max_length:
                        max_length = dp[i][j]
                        ending_pos = i

        return str1[ending_pos - max_length: ending_pos]

    common_substring = strings[0]
    for i in range(1, len(strings)):
        common_substring = lcs_two_strings(common_substring, strings[i])
        if not common_substring:
            break

    return common_substring

# Test with strings having longer common patterns
strings = ["programming", "programmer", "program"]
result = find_longest_common_substring(strings)
print(f"Longest common substring: '{result}'")
print(f"Length: {len(result)}")
Longest common substring: 'program'
Length: 7
How It Works
The dynamic programming approach has O(n*m) time complexity for each pair comparison, where n and m are the lengths of the strings being compared. The space complexity is also O(n*m) for the DP table.
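Because each cell depends only on the previous row, the table does not have to be stored in full. The following is a minimal sketch of that idea, assuming a hypothetical helper named lcs_two_rows; it keeps two rows at a time, which lowers the extra space to O(min(n, m)) while the time stays O(n*m).

```python
def lcs_two_rows(str1, str2):
    """Longest common substring of two strings using only two DP rows."""
    # Keep the shorter string on the column axis so each row is as small as possible.
    # Note: when several matches tie for the longest, the swap can change which one
    # is returned compared with the full-table version.
    if len(str2) > len(str1):
        str1, str2 = str2, str1

    prev = [0] * (len(str2) + 1)
    max_length, ending_pos = 0, 0

    for i in range(1, len(str1) + 1):
        curr = [0] * (len(str2) + 1)
        for j in range(1, len(str2) + 1):
            if str1[i - 1] == str2[j - 1]:
                # Same recurrence as the full table: extend the diagonal run.
                curr[j] = prev[j - 1] + 1
                if curr[j] > max_length:
                    max_length = curr[j]
                    ending_pos = i
        prev = curr

    return str1[ending_pos - max_length:ending_pos]


print(lcs_two_rows("programming", "programmer"))  # programm
```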
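After each step, the intermediate result is a substring common to all strings processed so far, so the final result is always common to every input string. It is not guaranteed to be the longest one, however, because each pairwise step keeps only a single longest match and may discard a shorter substring that is the only one the remaining strings share. For example, with ["abcdXef", "abcdYef", "ef"], the first two strings reduce to "abcd", which shares nothing with "ef", so the function returns an empty string even though "ef" occurs in all three. The three examples above are not affected by this.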
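When the result needs to be verified, a brute-force cross-check that does not rely on pairwise reduction can help. The sketch below is an assumption-laden illustration, not the article's method: the function name longest_common_substring_exact is hypothetical, and the approach simply tries every substring of the shortest input, longest first, and tests it against all strings. It is slow for long inputs but easy to reason about.

```python
def longest_common_substring_exact(strings):
    """Return a longest substring present in every input string (brute force)."""
    if not strings:
        return ""

    # Any common substring must fit inside the shortest input.
    shortest = min(strings, key=len)

    # Try candidate lengths from longest to shortest; the first hit is optimal.
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in strings):
                return candidate

    return ""


print(longest_common_substring_exact(["abcdXef", "abcdYef", "ef"]))        # ef
print(longest_common_substring_exact(["1123212", "2311232", "2112312"]))  # 1123
```

Comparing this output with the pairwise result on the same inputs shows whether the pairwise reduction dropped a shorter shared substring along the way.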
Conclusion
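Dynamic programming finds the longest common substring of two strings in O(n*m) time. Applying it pairwise extends the idea to more than two strings and always returns a substring common to all inputs, but because only one candidate survives each step, the result can be shorter than the true longest common substring. When an exact answer is required, check candidate substrings against every input string.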
