Program to find out the substrings of given strings at given positions in a set of all possible substrings in python


Suppose we are provided n number of strings; str1, str2, str3,.....,strn. Now, let's suppose that substri is a set that contains all the substrings of stri. The union of all the substr sets is substr_union. We now are given q number of queries, and we have to find the q-th element of the set substr_union. The set substr_union is lexicographically sorted and the indexes start from 1.

So, if the input is like list of strings are = ['pqr', 'pqt'], queries are = [4, 7, 9], then the output will be ['pqt', 'qt', 't']

The substrings from the first string are subs_str_1 = {p, pq, pqr, q, qr, r }, sub_str_2 = {p, pq, pqt, q, qt, t}.

The union of these two sets, or substr_union is {p, pq, pqr, pqt, q, qr, qt, r, t}.

So the items at index 4, 7, and 9 are 'pqt', qt', and 't' respectively.

To solve this, we will follow these steps −

  • Define a function lng_i() . This will take suff, lng, i
    • d := a new tuple containing (suff, lng)
    • lo := 0
    • hi := 0
    • for each tuple (suf, lng) in d, do
      • if lng is similar as null, then
        • lng := 0
      • hi := hi + size of suf - lng
      • if hi - 1 is same as i, then
        • return suf
      • otherwise when hi - 1 > i, then
        • for index p and item q in list of values from lng to size of suf, do
          • if lo + p is same as i, then
            • return suf[from index 0 to j+1]
      • lo := hi
    • return False
  • Define a function hlp_ii() . This will take str1,str2
    • ub := minimum of size of str1 , size of str2
    • cnt := 0
    • for i in range 0 to ub, do
      • if str1[i] is same as str2[i], then
        • cnt := cnt + 1
      • otherwise,
        • return cnt
      • return cnt
  • t_dict := a new map
  • suff := a new list
  • lng := a new list
  • for each str in strings, do
    • for i in range 0 to size of str, do

      • value := str[from index i to end]
      • if value is not present in t_dict, then
        • t_dict[value] := 1
        • insert value at the end of suff
  • sort the list suff
  • suff_len := size of suff
  • for i in range 0 to size of suff_len, do
    • if i is same as 0, then
      • insert null at the end of lng
    • otherwise,
      • insert hlp_ii(suff[i-1], suff[i]) at the end of lng
  • res := a new list
  • for each q in q_list, do
    • insert (lng_i(suff, lng, q-1)) at the end of res
  • return res

Example

Let us see the following implementation to get better understanding −

def lng_i(suff, lng, i):
   d = zip(suff,lng)
   lo = hi = 0
   for suf, lng in d:
      if lng is None:
         lng = 0
      hi += len(suf) - lng
      if hi - 1 == i:
         return suf
      elif hi - 1 > i:
         for p, q in enumerate(list(range(lng, len(suf)))):
            if lo + p == i:
               return suf[:q+1]
      lo = hi
   return False

def hlp_ii(str1,str2):
   ub = min(len(str1), len(str2))
   cnt = 0
   for i in range(ub):
      if str1[i] == str2[i]:
         cnt += 1
      else:
         return cnt
   return cnt

def solve(strings,q_list):
   t_dict = {}
   suff = []
   lng = []
   for str in strings:
      for i in range(len(str)):
         value = str[i:]
         if value not in t_dict:
            t_dict[value] = 1
            suff.append(value)
   suff.sort()
   suff_len = len(suff)
   for i in range(suff_len):
      if i == 0:
         lng.append(None)
      else:
         lng.append(hlp_ii(suff[i-1], suff[i]))
   res = []
   for q in q_list:
      (res.append(lng_i(suff, lng, q-1)))
   return res

print(solve(['pqr', 'pqt'], [4, 7, 9]))

Input

['pqr', 'pqt'], [4, 7, 9]

Output

['pqt', 'qt', 't']

Updated on: 11-Oct-2021

103 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements