Find the smallest window in a string containing all characters of another string in Python


Suppose we have two strings s1 and s2. We have to find the smallest substring of s1 that contains every character of s2, counting repeated characters as many times as they occur in s2.

So, if the input is like s1 = "I am a student", s2 = "mdn", then the output will be "m a studen"
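To make "contains every character of s2" concrete, here is a minimal sketch of that check using `collections.Counter`. The helper name `covers` is hypothetical and is not part of the solution developed in this article; it only illustrates the condition a valid window must satisfy.

```python
from collections import Counter

def covers(window, s2):
    # Hypothetical helper: True when window holds every character of s2
    # at least as many times as that character occurs in s2.
    need = Counter(s2)
    have = Counter(window)
    return all(have[ch] >= n for ch, n in need.items())

print(covers("m a studen", "mdn"))  # True
print(covers("a studen", "mdn"))    # False, 'm' is missing
```

The smallest window is then the shortest substring of s1 for which this check holds.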

To solve this, we will follow these steps, where main_str is s1 and pattern is s2 −

  • N := 256 (one counter for each possible 8-bit character code)

  • str_len := size of main_str, patt_len := size of pattern

  • if str_len < patt_len, then

    • return None

  • hash_pat := an array of size N and fill with 0

  • hash_str := an array of size N and fill with 0

  • for i in range 0 to patt_len, do

    • hash_pat[ASCII of(pattern[i]) ] := hash_pat[ASCII of(pattern[i]) ] + 1

  • start := 0, start_index := -1, min_len := inf

  • count := 0

  • for j in range 0 to str_len, do

    • hash_str[ASCII of(main_str[j]) ] := hash_str[ASCII of(main_str[j]) ] + 1

    • if hash_pat[ASCII of(main_str[j]) ] is not same as 0 and hash_str[ASCII of(main_str[j]) ] <= hash_pat[ASCII of(main_str[j]) ], then

      • count := count + 1

    • if count is same as patt_len, then

      • while hash_str[ASCII of(main_str[start]) ] > hash_pat[ASCII of(main_str[start]) ] or hash_pat[ASCII of(main_str[start]) ] is same as 0, do

        • if hash_str[ASCII of(main_str[start])] > hash_pat[ASCII of(main_str[start])], then

          • hash_str[ASCII of(main_str[start]) ] := hash_str[ASCII of(main_str[start]) ] - 1

        • start := start + 1

      • len_window := j - start + 1

      • if min_len > len_window, then

        • min_len := len_window

        • start_index := start

  • if start_index is same as -1, then

    • return None

  • return substring of main_str[from index start_index to start_index + min_len]
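Before the full implementation, it may help to see how the two count tables in the steps above are indexed by character code. The following is a small sketch under the assumption that every character has a code below 256; the helper name `count_table` is hypothetical.

```python
N = 256  # assumed table size, one slot per 8-bit character code

def count_table(text):
    # Hypothetical helper: table[ord(c)] is how often c occurs in text.
    table = [0] * N
    for ch in text:
        table[ord(ch)] += 1
    return table

hash_pat = count_table("mdn")
print(hash_pat[ord("m")], hash_pat[ord("d")], hash_pat[ord("x")])  # 1 1 0
print(sum(hash_pat))  # 3, the pattern length
```

A character whose entry in hash_pat is 0 is not required by the pattern, which is why the shrink step can always discard it from the left edge of the window.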

Example

Let us see the following implementation to get a better understanding −

N = 256
def get_pattern(main_str, pattern):
   str_len = len(main_str)
   patt_len = len(pattern)
   if str_len < patt_len:
      return None
   hash_pat = [0] * N
   hash_str = [0] * N
   for i in range(0, patt_len):
      hash_pat[ord(pattern[i])] += 1
   start, start_index, min_len = 0, -1, float('inf')
   count = 0
   for j in range(0, str_len):
      hash_str[ord(main_str[j])] += 1

      if (hash_pat[ord(main_str[j])] != 0 and hash_str[ord(main_str[j])] <= hash_pat[ord(main_str[j])]):
         count += 1
      if count == patt_len:
         # Shrink from the left while the leftmost character is surplus
         while (hash_str[ord(main_str[start])] > hash_pat[ord(main_str[start])] or hash_pat[ord(main_str[start])] == 0):
            if (hash_str[ord(main_str[start])] > hash_pat[ord(main_str[start])]):
               hash_str[ord(main_str[start])] -= 1
            start += 1
         # Record this window if it is the shortest one seen so far
         len_window = j - start + 1
         if min_len > len_window:
            min_len = len_window
            start_index = start
   if start_index == -1:
      return None
   return main_str[start_index : start_index + min_len]
main_str = "I am a student"
pattern = "mdn"
print(get_pattern(main_str, pattern))

Input

"I am a student", "mdn"

Output

m a studen
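The fixed-size tables above assume every character code is below 256, so a string containing other Unicode characters would raise an IndexError. As one possible alternative, the sketch below applies the same sliding-window idea with `collections.Counter` instead of arrays. The function name `smallest_window` and the choice to return None for an empty pattern are assumptions made for illustration, not part of the original solution.

```python
from collections import Counter

def smallest_window(main_str, pattern):
    # Hypothetical dictionary-based variant of get_pattern.
    if not pattern or len(main_str) < len(pattern):
        return None
    need = Counter(pattern)   # required count per character
    have = Counter()          # counts inside the current window
    matched = 0               # pattern characters matched so far
    start = 0
    best = None               # (start, length) of the shortest window
    for j, ch in enumerate(main_str):
        have[ch] += 1
        if ch in need and have[ch] <= need[ch]:
            matched += 1
        if matched == len(pattern):
            # Drop surplus characters from the left edge
            while have[main_str[start]] > need[main_str[start]]:
                have[main_str[start]] -= 1
                start += 1
            length = j - start + 1
            if best is None or length < best[1]:
                best = (start, length)
    if best is None:
        return None
    return main_str[best[0]:best[0] + best[1]]

print(smallest_window("I am a student", "mdn"))  # m a studen
print(smallest_window("abc", "xyz"))             # None
```

Reading a missing key from a Counter returns 0 without inserting it, so characters outside the pattern behave exactly like the zero entries of hash_pat.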

Updated on: 27-Aug-2020
