Program to find maximum number of non-overlapping substrings in Python

Suppose we have a string s with only lowercase letters, we have to find find the maximum number of non-empty substrings of s that meet the following rules

  • Substrings are non-overlapping

  • A substring that contains a specific character ch must also contain all occurrences of ch.

We have to find the maximum number of substrings that meet these two conditions. If there are more than one such solutions with the same number of substrings, then return that with minimum total length.

So, if the input is like s = "pqstpqqprrr", then the output will be ["s","t","rrr"] because all of the possible substrings that meet the conditions are [ "pqstpqqprrr", "pqstpqqp", "st", "s", "t", "rrr"]

To solve this, we will follow these steps −

  • right := sort the list of index position of all individual character ch from right of s

  • left := a list of indices of the characters s[i] for all i in right

  • has := an empty list, gen := an empty list

  • for i in range 0 to size of right - 1, do

    • insert a new set of characters from s[right[i]] at the end of gen

    • insert a new set of characters from substring of s[from index (left[i] + 1 to right[i]-1] - last item of gen) at the end of has

    • for j in range size of has - 2 to 0, decrease by 1, do

      • if (last item of has AND gen[j]) and (has[j] AND last item of gen) is non-zero, then

        • last item of gen := last item of gen OR gen[j]

        • last item of has :=(last item of has OR has[j]) - last item of gen

        • remove has[j], gen[j]

  • res := a new list, p_right := -1

  • for ind in range 0 to size of has - 1 , do

    • l := minimum of the list of elements i for all i in left if s[i] is present in gen[ind]

    • r := maximum of the list of elements i for all i in right if s[i] in gen[ind]]

    • if p_right < l, then

      • insert substring of s[from index l to r] at the end of res

      • p_right := r

  • return res


Let us see the following implementation to get better understanding

def solve(s):
   right = sorted([s.rindex(ch) for ch in set(s)])
   left = [s.index(s[i]) for i in right]
   has, gen = [], []
   for i in range(len(right)):
      has.append(set(s[left[i] + 1:right[i]]) - gen[-1])

   for j in range(len(has) - 2, -1, -1):
      if (has[-1] & gen[j]) and (has[j] & gen[-1]):
         gen[-1] = gen[-1] | gen[j]
         has[-1] = (has[-1] | has[j]) - gen[-1]
         del has[j], gen[j]

   res, p_right = [], -1
   for ind in range(len(has)):
      l = min([i for i in left if s[i] in gen[ind]])
      r = max([i for i in right if s[i] in gen[ind]])
      if p_right < l:
         res.append(s[l : r + 1])
         p_right = r

   return res

s = "pqstpqqprrr"




['s', 't', 'rrr']