Program to perform prefix compression from two strings in Python
Suppose we have two strings s and t, both consisting of lowercase English letters. We need to return a list of three pairs, where each pair has the form (l, k): k is a string and l is its length. In order, the three pairs describe the longest common prefix of s and t, the part of s that remains after that prefix, and the part of t that remains after that prefix.
So, if the input is like s = "science" and t = "school", then the output will be [(2, 'sc'), (5, 'ience'), (4, 'hool')]
Algorithm
To solve this, we will follow these steps −
- Initialize lcp as an empty string
- For each index i from 0 up to (but not including) the minimum of the lengths of s and t
  - If s[i] is the same as t[i], append s[i] to lcp
  - Otherwise, break out of the loop
- Set s_rem to the part of s from index (length of lcp) to the end
- Set t_rem to the part of t from index (length of lcp) to the end
- Return a list of three pairs: [(length of lcp, lcp), (length of s_rem, s_rem), (length of t_rem, t_rem)]
Example
Let us see the following implementation to get a better understanding −
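def solve(s, t):
    lcp = ''
    # Compare characters position by position until the first mismatch
    for i in range(min(len(s), len(t))):
        if s[i] == t[i]:
            lcp += s[i]
        else:
            break
    # Whatever follows the common prefix is the remainder of each string
    s_rem = s[len(lcp):]
    t_rem = t[len(lcp):]
    return [(len(lcp), lcp), (len(s_rem), s_rem), (len(t_rem), t_rem)]

s = "science"
t = "school"
print(solve(s, t))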
The output of the above code is −
[(2, 'sc'), (5, 'ience'), (4, 'hool')]
How It Works
The function compares characters at the same positions in both strings. When it finds the first mismatch, it stops and considers everything before that position as the longest common prefix. Then it extracts the remaining parts of both strings after removing the common prefix.
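The same idea can be sketched without building the prefix one character at a time. The variant below is only an illustration, not part of the original solution: the name `prefix_compress` is hypothetical, and it counts matching positions with `zip` before slicing once.

```python
def prefix_compress(s, t):
    # Hypothetical helper name, used here for illustration only.
    # Count how many leading positions hold the same character in both strings.
    n = 0
    for a, b in zip(s, t):  # zip stops at the end of the shorter string
        if a != b:
            break
        n += 1
    # Slice once: common prefix, then what is left of each string.
    lcp, s_rem, t_rem = s[:n], s[n:], t[n:]
    return [(len(lcp), lcp), (len(s_rem), s_rem), (len(t_rem), t_rem)]

print(prefix_compress("science", "school"))
```

For the sample input this prints the same three pairs as `solve`. Counting an index first avoids repeated string concatenation inside the loop.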
Another Example
Let's test with different strings to see how the algorithm works −
def solve(s, t):
    lcp = ''
    # Compare characters position by position until the first mismatch
    for i in range(min(len(s), len(t))):
        if s[i] == t[i]:
            lcp += s[i]
        else:
            break
    # Whatever follows the common prefix is the remainder of each string
    s_rem = s[len(lcp):]
    t_rem = t[len(lcp):]
    return [(len(lcp), lcp), (len(s_rem), s_rem), (len(t_rem), t_rem)]

# Test with strings having no common prefix
s1 = "hello"
t1 = "world"
print("No common prefix:", solve(s1, t1))

# Test with one string being a prefix of the other
s2 = "test"
t2 = "testing"
print("One is prefix:", solve(s2, t2))
The output of the above code is −
No common prefix: [(0, ''), (5, 'hello'), (5, 'world')]
One is prefix: [(4, 'test'), (0, ''), (3, 'ing')]
Conclusion
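This prefix compression algorithm finds the longest common prefix of two strings and returns the result as three (length, string) pairs. The comparison loop runs at most min(m, n) iterations, where m and n are the lengths of the input strings. Slicing out the two remainders also copies the leftover characters, so the work done by the whole function grows with m + n rather than with min(m, n) alone.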
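As a cross-check, the three pairs keep enough information to rebuild both inputs. The sketch below is an assumption-laden illustration rather than the article's method: `compress` and `restore` are hypothetical names, and the prefix is found with the standard library's `os.path.commonprefix`, which compares strings character by character.

```python
import os.path

def compress(s, t):
    # Hypothetical name. os.path.commonprefix works character by character,
    # not by path component, so it is usable on ordinary strings.
    lcp = os.path.commonprefix([s, t])
    s_rem, t_rem = s[len(lcp):], t[len(lcp):]
    return [(len(lcp), lcp), (len(s_rem), s_rem), (len(t_rem), t_rem)]

def restore(pairs):
    # Hypothetical name. Rebuild both original strings from the three pairs.
    (_, lcp), (_, s_rem), (_, t_rem) = pairs
    return lcp + s_rem, lcp + t_rem

pairs = compress("science", "school")
print(pairs)           # [(2, 'sc'), (5, 'ience'), (4, 'hool')]
print(restore(pairs))  # ('science', 'school')
```

Joining the prefix with each remainder returns the original strings, which shows that the representation loses no information.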
