Program to find number of different substrings of a string for different queries in Python


Suppose we have a string s whose length is n. We also have a list of queries Q, where Q[i] contains a pair (l, r). For each query we have to count number of different substrings of s in the inclusive range between l and r.

So, if the input is like s = "ppqpp" Q = [(1,1),(1,4),(1,1),(0,2)], then the output will be [1,8,1,5] because

  • For query (1, 1) the only substring is 'p' so output is 1

  • For query (1, 4) the substrings are 'p', 'q', 'pq', 'qp', 'pp', 'pqp', 'qpp' and 'pqpp', so output is 8

  • Again for query (1, 1) the only substring is 'p' so output is 1

  • For query (0, 2) the substrings are 'p', 'q', 'pp', 'pq', 'ppq', so the output is 8.

To solve this, we will follow these steps −

  • Define a function kasai() . This will take s, suff, n
  • lcp := an array of size n and fill with 0
  • inv := an array of size n and fill with 0
  • for i in range 0 to n - 1, do
    • inv[suff [i]] := i
  • k := 0
  • for i in range 0 to n - 1, do
    • if inv [i] is same as n-1, then
      • k := 0
      • go for next iteration
    • j := suff[inv [i] + 1]
    • while i + k < n and j + k < n and s[i + k] is same as s[j + k], do
      • k := k + 1
    • lcp[inv [i]] := k
    • if k > 0, then
      • k := k - 1
  • return lcp
  • From the main method, do the following −
  • res := a new list
  • for i in range 0 to size of Q - 1, do
    • (left, right) := Q[i]
    • sub := substring of s from index left to right
    • length := right-left + 1
    • suffix := a list of pairs (i, substring of sub from index i to end) for each i in range 0 to length - 1
    • sort suffix based on the second item of the pair substring
  • (suff, suffix) = the pairs of indices and corresponding substrings from suffix
    • lcp := kasai(sub, suff, length)
    • count := size of suffix[0]
    • for i in range 0 to length-2, do
      • count := count + size of suffix[i + 1] - lcp[i]
    • insert count at the end of res
  • return res

Example

Let us see the following implementation to get better understanding −

def kasai (s, suff, n):
   lcp = [0] * n
   inv = [0] * n
   for i in range (n):
      inv [suff [i]] = i
   k = 0
   for i in range (n):
      if inv [i] == n-1:
         k = 0
         continue
      j = suff [inv [i] + 1]
      while i + k <n and j + k <n and s [i + k] == s [j + k]:
         k += 1
      lcp [inv [i]] = k
      if k> 0:
         k -= 1
   return lcp

def solve(s, Q):
   res = []
   for i in range (len(Q)):
      left, right = Q[i]
      sub = s [left: right + 1]
      length = right-left + 1

      suffix = [[i, sub [i:]] for i in range (length)]

      suffix.sort (key = lambda x: x [1])
      suff, suffix = [list (t) for t in zip (* suffix)]

      lcp = kasai (sub, suff, length)
      count = len (suffix [0])
      for i in range (length-1):
         count += len (suffix [i + 1]) - lcp [i]

      res.append(count)
   return res

s = "pptpp"
Q = [(1,1),(1,4),(1,1),(0,2)]
print(solve(s, Q))

Input

"pptpp", [(1,1),(1,4),(1,1),(0,2)]

Output

[1, 8, 1, 5]

Updated on: 11-Oct-2021

192 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements