Program to find total similarities of a string and its substrings in Python


Suppose we have a string s. We have to find the sum of the similarities of s with each of its suffixes. Here, the similarity between two strings is the length of the longest prefix common to both strings.
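To make the definition concrete, here is a minimal sketch of the similarity of two strings. The function name similarity is only an illustrative choice and is not part of the solution given later.

```python
def similarity(a, b):
    """Return the length of the longest prefix common to a and b."""
    count = 0
    # zip stops at the shorter string, so no index can run out of range
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


print(similarity("pqpqpp", "pqpp"))   # 3, the common prefix is "pqp"
print(similarity("pqpqpp", "qpqpp"))  # 0, the first characters already differ
```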

So, if the input is s = "pqpqpp", then the output will be 11. The suffixes of the string are "pqpqpp", "qpqpp", "pqpp", "qpp", "pp" and "p", and their similarities with the string "pqpqpp" are 6, 0, 3, 0, 1, and 1. So the sum is 6 + 0 + 3 + 0 + 1 + 1 = 11.
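The worked example can be checked with a direct brute-force sketch that compares every suffix with the full string. This is not the method used in the solution; it takes quadratic time in the worst case and is only meant to confirm the numbers. The function name total_similarity_bruteforce is a hypothetical helper, and os.path.commonprefix from the standard library is used here simply because it compares strings character by character.

```python
from os.path import commonprefix


def total_similarity_bruteforce(s):
    """Sum the similarity of s with each of its suffixes, the slow way."""
    total = 0
    for i in range(len(s)):
        suffix = s[i:]
        # length of the longest prefix shared by s and this suffix
        sim = len(commonprefix([s, suffix]))
        print(repr(suffix), sim)
        total += sim
    return total


print(total_similarity_bruteforce("pqpqpp"))  # 11
```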

To solve this, we will follow these steps:

  • length := size of s
  • total := length
  • z := a list containing 0 initially
  • l := 0, r := 0
  • for k in range 1 to length - 1, do
    • if k > r, then
      • match := 0
      • index := k
      • while index < length, do
        • if s[index] is same as s[match], then
          • match := match + 1
          • index := index + 1
        • otherwise,
          • come out from loop
      • insert match at the end of z
      • if match > 0, then
        • total := total + match
        • l := k
        • r := index-1
    • otherwise,
      • if z[k-l] < (r-k)+1, then
        • insert z[k-l] at the end of z
        • total := total + z[k-l]
      • otherwise,
        • match := r-k
        • index := r
        • while index < length, do
          • if s[index] is same as s[match], then
            • match := match + 1
            • index := index + 1
          • otherwise,
            • come out from loop
        • insert match at the end of z
        • total := total + match
        • l := k
        • r := index-1
  • return total
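The list z built by these steps is what is usually called the Z-array of s: its entry at position k is the similarity of s with the suffix starting at k. As a rough sketch, the same idea can be written in a more compact and commonly seen form. Note the assumptions: the function name z_array is illustrative, the right bound r is exclusive here (the steps above use an inclusive r), and z[0] is set to the full length instead of 0, so the answer is simply the sum of the array.

```python
def z_array(s):
    """Return z where z[k] is the longest common prefix of s and s[k:]."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n  # the whole string matches itself
    l = r = 0  # s[l:r] is the rightmost known match with a prefix (r exclusive)
    for k in range(1, n):
        if k < r:
            # reuse the value computed for the mirrored position k - l,
            # capped at the part of the window that is still known to match
            z[k] = min(r - k, z[k - l])
        # extend the match by direct comparison
        while k + z[k] < n and s[z[k]] == s[k + z[k]]:
            z[k] += 1
        # move the window if this match reaches further to the right
        if k + z[k] > r:
            l, r = k, k + z[k]
    return z


z = z_array("pqpqpp")
print(z)       # [6, 0, 3, 0, 1, 1]
print(sum(z))  # 11
```

Because each character comparison that succeeds moves the right bound forward, the total work stays linear in the length of s.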

Example

Let us see the following implementation to get a better understanding:

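def solve(s):
    length = len(s)
    # The whole string is a suffix of itself and contributes its full length.
    total = length

    # z[k] holds the similarity of s with the suffix starting at index k.
    # z[0] is only a placeholder; the full-length match is already in total.
    z = [0]
    # [l, r] is the rightmost segment found so far that matches a prefix of s.
    l = 0
    r = 0

    for k in range(1, length):
        if k > r:
            # k lies outside the known segment: compare from scratch.
            match = 0
            index = k
            while index < length:
                if s[index] == s[match]:
                    match += 1
                    index += 1
                else:
                    break
            z.append(match)
            if match > 0:
                total += match
                l = k
                r = index - 1
        else:
            if z[k - l] < (r - k) + 1:
                # The mirrored value fits inside the segment, so reuse it.
                z.append(z[k - l])
                total += z[k - l]
            else:
                # The match reaches the end of the segment: resume comparing
                # at position r (s[r] is checked again and always matches).
                match = r - k
                index = r
                while index < length:
                    if s[index] == s[match]:
                        match += 1
                        index += 1
                    else:
                        break
                z.append(match)
                total += match
                l = k
                r = index - 1
    return total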

s = "pqpqpp"
print(solve(s))

Input

"pqpqpp"

Output

11
raja
Published on 07-Oct-2021 12:55:37