Find the smallest window in a string containing all characters of another string in Python

Suppose we have two strings s1 and s2. We have to find the smallest substring of s1 that contains all the characters of s2, where a character repeated in s2 must appear at least that many times in the substring.

So, if the input is like s1 = "I am a student", s2 = "mdn", then the output will be "m a studen", the shortest stretch of s1 that includes 'm', 'd' and 'n'.
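To make the requirement concrete, here is a minimal brute-force sketch that simply tries every substring. It is only meant as a reference for what "smallest window" means, not as an efficient solution; the function name smallest_window_bruteforce is a hypothetical one chosen for this illustration.

```python
from collections import Counter

def smallest_window_bruteforce(s1, s2):
    """Return the shortest substring of s1 covering every character of s2
    (with multiplicity), or None if no such substring exists."""
    need = Counter(s2)
    best = None
    for i in range(len(s1)):
        for j in range(i + 1, len(s1) + 1):
            window = s1[i:j]
            # Counter subtraction keeps only positive counts, so an empty
            # result means the window covers every required character.
            if not (need - Counter(window)):
                if best is None or len(window) < len(best):
                    best = window
                break  # extending this start further only makes it longer
    return best

print(smallest_window_bruteforce("I am a student", "mdn"))  # m a studen
```

This checks a quadratic number of substrings and recounts each one, so it is useful mainly for cross-checking a faster solution on small inputs.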

To solve this, we will follow these steps, where main_str is s1 and pattern is s2 −

  • N := 256 (one counter for every possible 8-bit character code)

  • str_len := size of main_str, patt_len := size of pattern

  • if str_len < patt_len, then

    • return None

  • hash_pat := an array of size N and fill with 0

  • hash_str := an array of size N and fill with 0

  • for i in range 0 to patt_len - 1, do

    • hash_pat[ASCII of(pattern[i]) ] := hash_pat[ASCII of(pattern[i]) ] + 1

  • start := 0, start_index := -1, min_len := inf

  • count := 0

  • for j in range 0 to str_len - 1, do

    • hash_str[ASCII of(main_str[j]) ] := hash_str[ASCII of(main_str[j]) ] + 1

    • if hash_pat[ASCII of(main_str[j]) ] is not same as 0 and hash_str[ASCII of(main_str[j]) ] <= hash_pat[ASCII of(main_str[j]) ], then

      • count := count + 1

    • if count is same as patt_len, then

      • while hash_str[ASCII of(main_str[start]) ] > hash_pat[ASCII of(main_str[start]) ] or hash_pat[ASCII of(main_str[start]) ] is same as 0, do

        • if hash_str[ASCII of(main_str[start])] > hash_pat[ASCII of(main_str[start])], then

          • hash_str[ASCII of(main_str[start]) ] := hash_str[ASCII of(main_str[start]) ] - 1

        • start := start + 1

      • len_window := j - start + 1

      • if min_len > len_window, then

        • min_len := len_window

        • start_index := start

  • if start_index is same as -1, then

    • return None

  • return substring of main_str[from index start_index to start_index + min_len]

Example

Let us see the following implementation to get better understanding −

N = 256  # size of the count tables; assumes every character code is below 256

def get_pattern(main_str, pattern):
   str_len = len(main_str)
   patt_len = len(pattern)
   if str_len < patt_len:
      return None
   hash_pat = [0] * N  # how often each character occurs in pattern
   hash_str = [0] * N  # how often each character occurs in the current window
   for i in range(0, patt_len):
      hash_pat[ord(pattern[i])] += 1
   start, start_index, min_len = 0, -1, float('inf')
   count = 0  # number of pattern characters matched so far
   for j in range(0, str_len):
      hash_str[ord(main_str[j])] += 1
      # count this character only if the pattern still needs it
      if (hash_pat[ord(main_str[j])] != 0 and hash_str[ord(main_str[j])] <= hash_pat[ord(main_str[j])]):
         count += 1
      if count == patt_len:
         # shrink the window from the left while it still covers the pattern
         while (hash_str[ord(main_str[start])] > hash_pat[ord(main_str[start])] or hash_pat[ord(main_str[start])] == 0):
            if (hash_str[ord(main_str[start])] > hash_pat[ord(main_str[start])]):
               hash_str[ord(main_str[start])] -= 1
            start += 1
         # record the window if it is the smallest seen so far
         len_window = j - start + 1
         if min_len > len_window:
            min_len = len_window
            start_index = start
   if start_index == -1:
      return None
   return main_str[start_index : start_index + min_len]

main_str = "I am a student"
pattern = "mdn"
print(get_pattern(main_str, pattern))

Input

"I am a student", "mdn"

Output

m a studen
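
The implementation above indexes fixed tables of 256 entries by character code, so a character with a code of 256 or more would fall outside them. As a hedged alternative, the same sliding-window idea can be sketched with dictionaries instead of fixed arrays. The name min_window is hypothetical, and returning None for an empty pattern is an assumption made for this sketch, not something the original specifies.

```python
from collections import Counter

def min_window(main_str, pattern):
    """Dictionary-based sliding window; works for any character."""
    # Assumption: an empty pattern has no meaningful window, so return None.
    if not pattern or len(main_str) < len(pattern):
        return None
    need = Counter(pattern)   # required count per character
    have = Counter()          # counts inside the current window
    matched = 0               # pattern characters currently covered
    start = 0
    best_start, best_len = -1, float('inf')
    for end, ch in enumerate(main_str):
        have[ch] += 1
        if ch in need and have[ch] <= need[ch]:
            matched += 1
        if matched == len(pattern):
            # Drop characters from the left while the window stays valid.
            # need[...] is 0 for characters that are not in the pattern.
            while have[main_str[start]] > need[main_str[start]]:
                have[main_str[start]] -= 1
                start += 1
            if end - start + 1 < best_len:
                best_start, best_len = start, end - start + 1
    if best_start == -1:
        return None
    return main_str[best_start:best_start + best_len]

print(min_window("I am a student", "mdn"))  # m a studen
```

Each character enters and leaves the window at most once, so the running time stays linear in the length of main_str, as in the array-based version.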
raja
Published on 27-Aug-2020 15:37:14