Program to find total similarities of a string and its substrings in Python


Suppose we have a string s. We have to find the sum of the similarities of s with each of its suffixes. Here, the similarity between two strings is the length of the longest prefix common to both strings.

So, if the input is like s = "pqpqpp", then the output will be 11, because the suffixes of the string are "pqpqpp", "qpqpp", "pqpp", "qpp", "pp" and "p". The similarities of these suffixes with the string "pqpqpp" are 6, 0, 3, 0, 1 and 1, so the sum is 6 + 0 + 3 + 0 + 1 + 1 = 11.
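The definition can be checked directly with a brute-force sketch that compares s with every suffix in turn. The function names `similarity` and `total_similarity_brute_force` are illustrative and not part of the original program; the approach takes quadratic time in the worst case, so it is only meant as a reference for small inputs.

```python
def similarity(a, b):
    """Length of the longest prefix common to a and b."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


def total_similarity_brute_force(s):
    """Sum the similarity of s with each of its suffixes (O(n^2) worst case)."""
    return sum(similarity(s, s[i:]) for i in range(len(s)))


s = "pqpqpp"
print([similarity(s, s[i:]) for i in range(len(s))])  # [6, 0, 3, 0, 1, 1]
print(total_similarity_brute_force(s))  # 11
```

Because each suffix is compared from scratch, a string such as "aaaa...a" forces about n²/2 character comparisons, which is the cost a linear-time method avoids.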

To solve this, we will follow these steps −

  • length := size of s
  • total := length
  • z := a list containing 0 initially
  • l := 0, r := 0
  • for k in range 1 to length - 1, do
    • if k > r, then
      • match := 0
      • index := k
      • while index < length, do
        • if s[index] is same as s[match], then
          • match := match + 1
          • index := index + 1
        • otherwise,
          • come out from loop
      • insert match at the end of z
      • if match > 0, then
        • total := total + match
        • l := k
        • r := index-1
    • otherwise,
      • if z[k-l] < (r-k)+1, then
        • insert z[k-l] at the end of z
        • total := total + z[k-l]
      • otherwise,
        • match := r-k
        • index := r
        • while index < length, do
          • if s[index] is same as s[match], then
            • match := match + 1
            • index := index + 1
          • otherwise,
            • come out from loop
        • insert match at the end of z
        • total := total + match
        • l := k
        • r := index-1
  • return total

Example

Let us see the following implementation to get a better understanding −

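def solve(s):
    length = len(s)
    # The full string is a suffix of itself, so it contributes its whole length.
    total = length

    # z[k] holds the similarity of s with the suffix starting at k.
    # z[0] is a placeholder: the full string is already counted in total.
    z = [0]
    # s[l..r] is the rightmost stretch found so far that matches a prefix of s.
    l = 0
    r = 0

    for k in range(1, length):
        if k > r:
            # k lies outside the known window: compare from scratch.
            match = 0
            index = k
            while index < length:
                if s[index] == s[match]:
                    match += 1
                    index += 1
                else:
                    break
            z.append(match)
            if match > 0:
                total += match
                l = k
                r = index - 1
        else:
            if z[k - l] < (r - k) + 1:
                # The mirrored value ends inside the window: reuse it as-is.
                z.append(z[k - l])
                total += z[k - l]
            else:
                # The match reaches the window's edge: resume comparing at r.
                match = r - k
                index = r
                while index < length:
                    if s[index] == s[match]:
                        match += 1
                        index += 1
                    else:
                        break
                z.append(match)
                total += match
                l = k
                r = index - 1
    return total

s = "pqpqpp"
print(solve(s))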

Input

"pqpqpp"

Output

11
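To inspect the individual similarities rather than only their sum, the same idea can be written so that it returns the whole list. The following is a minimal sketch of a common compact formulation, not the program above: the name `z_array` is illustrative, the window is kept as a half-open range s[l:r], and z[0] is set to the full length instead of 0 so that the list sums directly to the answer.

```python
def z_array(s):
    """Return z, where z[k] is the length of the longest common prefix of s and s[k:]."""
    n = len(s)
    z = [0] * n
    if n == 0:
        return z
    z[0] = n  # the whole string matches itself (assumed convention for this sketch)
    l = r = 0  # s[l:r] is the rightmost window known to match a prefix of s
    for k in range(1, n):
        if k < r:
            # Reuse the value at the mirrored position k - l,
            # capped at the part of the window that lies to the right of k.
            z[k] = min(r - k, z[k - l])
        # Extend the match by direct comparison.
        while k + z[k] < n and s[z[k]] == s[k + z[k]]:
            z[k] += 1
        # Keep the window that reaches furthest to the right.
        if k + z[k] > r:
            l, r = k, k + z[k]
    return z


values = z_array("pqpqpp")
print(values)       # [6, 0, 3, 0, 1, 1]
print(sum(values))  # 11
```

The printed list matches the per-suffix similarities given in the problem statement, and its sum matches the output above.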

Updated on: 07-Oct-2021
