Program to find the indexes where the substrings of a string match another string fully or differ at one position in Python
When working with string matching problems, we often need to find substrings that either match a pattern exactly or differ from it by only one character. This article demonstrates how to find every such starting index in Python.
Given two strings, where the first is longer than the second, we need to identify all starting positions in the first string where a substring of the same length as the second string either matches it exactly or differs from it at exactly one position.
Problem Example
If we have string1 = 'tpoint' and string2 = 'pi', the output should be 1 2.
The substrings starting at index 1 and 2 are:
- Index 1: 'po' (differs from 'pi' at one position)
- Index 2: 'oi' (differs from 'pi' at one position)
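To make the problem statement concrete, here is a minimal brute-force sketch that checks every window directly. It is not the method this article uses (it takes time proportional to the product of the two lengths), and the function name naive_near_matches is hypothetical, chosen only for illustration.

```python
def naive_near_matches(text, pattern):
    """Return the start indexes where the window of text differs
    from pattern in at most one position (brute force)."""
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        # Count positions where the window and the pattern disagree
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= 1:
            hits.append(i)
    return hits


print(naive_near_matches('tpoint', 'pi'))  # [1, 2]
```

A slow reference like this is mainly useful for checking the results of a faster implementation on small inputs.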
Algorithm Overview
The solution uses the Z-algorithm for efficient string matching. The approach involves:
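- Building a Z-array over the pattern concatenated with the text, so that each text position records how many characters match the start of the pattern
- Repeating the search on the reversed strings to count how many characters match the end of the pattern
- Combining both counts to identify valid starting positions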
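As a rough illustration of what a Z-array holds, the sketch below computes it straight from the definition: entry i is the length of the longest common prefix between the string and its suffix starting at i. This version is deliberately quadratic and is not the linear-time algorithm itself; the name z_array_naive is an assumption made for this example.

```python
def z_array_naive(s):
    """Z-array computed directly from its definition (quadratic time)."""
    n = len(s)
    z = [0] * n
    if n:
        z[0] = n  # the whole string trivially matches itself
    for i in range(1, n):
        k = 0
        # Extend the match between s[0:] and s[i:] as far as possible
        while i + k < n and s[k] == s[i + k]:
            k += 1
        z[i] = k
    return z


# Pattern 'pi' followed by text 'tpoint'
print(z_array_naive('pitpoint'))  # [8, 0, 0, 1, 0, 0, 0, 0]
```

The entries from index 2 onward describe the text 'tpoint': the value 1 at index 3 says that the text's 'p' matches the first character of the pattern and the next character does not.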
Implementation
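def search(string1, string2):
    # Z-array of string1 + string2, capped at len(string1).
    # Returns only the entries that correspond to positions in string2.
    str_cat = string1 + string2
    z_list = [0] * len(str_cat)
    z_list[0] = len(str_cat)
    # [left, right] is the rightmost window known to match a prefix of str_cat
    right = 0
    left = 0
    for i in range(1, len(str_cat)):
        if i > right:
            # Outside the window: compare character by character
            j = 0
            while j + i < len(str_cat) and str_cat[j] == str_cat[j+i]:
                j += 1
            z_list[i] = j
            if j > 0:
                left = i
                right = i + j - 1
        else:
            # Inside the window: reuse the value computed for the mirrored position
            k = i - left
            r_len = right - i + 1
            if z_list[k] < r_len:
                z_list[i] = z_list[k]
            else:
                # The match may extend past the window, so keep comparing
                m = right + 1
                while m < len(str_cat) and str_cat[m] == str_cat[m - i]:
                    m += 1
                z_list[i] = m - i
                left = i
                right = m - 1
        # A match can never be longer than the pattern itself
        z_list[i] = min(len(string1), z_list[i])
    return z_list[len(string1):]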
def solve(str1, str2):
    # Forward search: fwd[i] = characters of str2 matched starting at str1[i]
    fwd = search(str2, str1)
    # Backward search with reversed strings:
    # after reversing, bwrd[j] = characters of str2 matched ending at str1[j]
    bwrd = search(str2[::-1], str1[::-1])
    bwrd.reverse()
    # Find valid starting indexes
    idx = []
    for i in range(len(str1) - len(str2) + 1):
        if fwd[i] + bwrd[i + len(str2) - 1] >= len(str2) - 1:
            idx.append(str(i))
    if len(idx) == 0:
        return False
    else:
        return " ".join(idx)

# Test the function
print(solve('tpoint', 'pi'))

Output:
1 2
How It Works
The algorithm works in three main steps:
- Forward Search: Uses Z-algorithm to find how many characters match from the beginning of each substring
- Backward Search: Performs the same operation on reversed strings to find matches from the end
- Combination: If forward matches + backward matches ≥ (pattern length - 1), then the substring either matches exactly or differs by one character
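The combination test can be traced with a small sketch that computes the forward and backward match counts naively for each window. For a pattern of length m, a single mismatch at position p gives a forward count of p and a backward count of m - 1 - p, which sum to exactly m - 1. Two or more mismatches leave a gap between the two counts, so the sum falls below m - 1. The helper names prefix_match_len and explain are hypothetical, and the counts here come from direct comparison rather than from the Z-array.

```python
def prefix_match_len(a, b):
    """Number of leading positions where a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def explain(text, pattern):
    """For each window, return (index, window, fwd, bwd, accepted)."""
    m = len(pattern)
    rows = []
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        fwd = prefix_match_len(window, pattern)
        # Reversing both strings turns a suffix match into a prefix match
        bwd = prefix_match_len(window[::-1], pattern[::-1])
        rows.append((i, window, fwd, bwd, fwd + bwd >= m - 1))
    return rows


for row in explain('tpoint', 'pi'):
    print(row)
```

For 'tpoint' and 'pi', only the rows for index 1 ('po', counts 1 and 0) and index 2 ('oi', counts 0 and 1) are accepted, which agrees with the result 1 2.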
Additional Example
# Test with different strings
print("Example 1:", solve('hello', 'el'))
print("Example 2:", solve('programming', 'ram'))
print("Example 3:", solve('python', 'xyz'))
Output:
Example 1: 1 2
Example 2: 4
Example 3: False
Conclusion
This approach uses the Z-algorithm to find every starting position where a substring matches the pattern exactly or differs from it at one position. Combining the forward and backward match lengths detects both cases in time linear in the combined length of the two strings.
