Length of the smallest substring which contains all vowels


A common string manipulation task is to find the shortest substring that contains every vowel at least once. Similar "smallest window" problems appear in text processing areas such as data analytics, bioinformatics, and natural language processing. The goal here is to find the length of the shortest contiguous section of a given string that contains each of the five vowels (a, e, i, o, u) at least once. The problem can be approached in several ways, including a sliding window, two pointers, and a frequency array of vowel counts.
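Before turning to the faster methods, a brute-force reference can make the problem statement concrete. The sketch below tries every start position and extends to the right until all five vowels have been seen. The function name bruteForceSmallest and the convention of returning 0 when no substring qualifies are assumptions made for this illustration. It takes quadratic time in the worst case, so it is meant for checking results on small inputs rather than for production use.

```cpp
#include <string>

// Brute-force reference (hypothetical helper, not one of the three methods):
// for every start index, extend to the right until all five vowels are seen.
// Returns 0 when no substring contains all vowels (an assumed convention).
int bruteForceSmallest(const std::string& s) {
   const std::string vowels = "aeiou";
   const int n = static_cast<int>(s.size());
   int best = 0;  // 0 means "nothing found yet"

   for (int start = 0; start < n; ++start) {
      unsigned seen = 0;  // bit k is set once vowels[k] has appeared
      for (int end = start; end < n; ++end) {
         std::string::size_type pos = vowels.find(s[end]);
         if (pos != std::string::npos) {
            seen |= 1u << pos;
         }
         if (seen == 0x1Fu) {  // all five bits set
            int len = end - start + 1;
            if (best == 0 || len < best) {
               best = len;
            }
            break;  // extending further from this start only gets longer
         }
      }
   }
   return best;
}
```

Because it checks every candidate directly, a function like this is a convenient way to cross-check the output of the linear-time methods on short test strings.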

Methods

There are various methods to find the length of the smallest substring which contains all vowels.

Method 1. Sliding Window Approach

Method 2. Two Pointer Approach

Method 3. Frequency Array Approach

Method 1: Sliding Window Approach

To quickly determine the length of the shortest substring that contains every vowel in a given string, use the sliding window approach. The method uses two pointers, commonly called "left" and "right", to mark the ends of a window that moves across the string.

Syntax

Here is a Python-style outline of the Sliding Window approach to finding the length of the smallest substring which contains all vowels −

def find_smallest_substring(string):
   vowels = {'a', 'e', 'i', 'o', 'u'}
   counts = {}  # occurrences of each vowel inside the window
   start = 0
   min_length = float('inf')

   for end, ch in enumerate(string):
      # Expand the window
      if ch in vowels:
         counts[ch] = counts.get(ch, 0) + 1

      # Contract the window while it still holds every vowel
      while len(counts) == len(vowels):
         min_length = min(min_length, end - start + 1)
         left_ch = string[start]
         if left_ch in vowels:
            counts[left_ch] -= 1
            if counts[left_ch] == 0:
               del counts[left_ch]
         start += 1

   # float('inf') means no substring contains all vowels
   return min_length

Algorithm

Step 1 − Start with an empty window by setting both the left and right pointers to the beginning of the string.

Step 2 − Move the right pointer one character at a time. If the new character is a vowel, increment its count in a hash table that records how many times each vowel occurs in the window.

Step 3 − While the window contains all five vowels, update the minimum length found so far, then remove the leftmost character (decrementing its count if it is a vowel) and advance the left pointer.

Step 4 − Repeat until the right pointer reaches the end of the string. If no window ever contained all five vowels, no such substring exists.

Example 1

This implementation defines a helper function, isVowel, to test whether a character is a vowel, and uses two pointers, left and right, to mark the ends of the sliding window. The map window stores how many times each vowel occurs inside the current window. A plain set is not enough here: removing the leftmost character from a set would discard a vowel that still occurs elsewhere in the window.

Inside the while loop, we first expand the window: if the current character is a vowel, its count is incremented. We then check whether window holds 5 distinct vowels (i.e., all vowels are present). While it does, we update the answer and shrink the window from the left, decrementing the count of the leftmost character and erasing its entry when the count reaches zero.

When the loop finishes, the function returns the length of the smallest substring containing all vowels, or 0 if no such substring exists. For the input below, the shortest such substring is "uioae" (positions 10 to 14), of length 5.

#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

bool isVowel(char c) {
   return c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u';
}
int smallestSubstring(const string& s) {
   unordered_map<char, int> window;  // vowel -> occurrences inside the window
   int n = s.length(), left = 0, right = 0, ans = n + 1;
    
   while (right < n) {
      // Expand the window by adding the current character
      char c = s[right];
      if (isVowel(c)) {
         window[c]++;
      } 
      right++;
        
      // Shrink the window from the left while it still holds all five vowels
      while (window.size() == 5) {
         ans = min(ans, right - left);
         char d = s[left];
         if (isVowel(d) && --window[d] == 0) {
            window.erase(d);
         }
         left++;
      }
   }
   return ans <= n ? ans : 0;
}

int main() {
   string s = "aeeioubcdfuioaei";
   int len = smallestSubstring(s);
   cout << "Length of smallest substring containing all vowels: " << len << endl;
   return 0;
}

Output

Length of smallest substring containing all vowels: 5

Method 2: Two Pointer Approach

The Two Pointer approach is a widely used technique for string manipulation problems. Here, one pointer marks the start of a candidate substring and the other marks its end, which makes the approach closely related to the sliding window of Method 1.

Syntax

Here is a pseudocode outline of the Two Pointer approach to finding the length of the smallest substring that contains all vowels −

function findSmallestSubstring(str):
   vowels = {'a', 'e', 'i', 'o', 'u'}
   freq = map from each vowel to 0
   distinct = 0
   left = 0
   minLength = infinity

   for right in range(length of str):
      if str[right] is a vowel:
         freq[str[right]] += 1
         if freq[str[right]] is 1:
            distinct += 1

      while distinct is same as the total number of vowels:
         minLength = minimum(minLength, right - left + 1)

         if str[left] is a vowel:
            freq[str[left]] -= 1
            if freq[str[left]] is 0:
               distinct -= 1

         left += 1

   return minLength

Algorithm

Step 1 − Set up the start and end pointers, both pointing to the start of the string.

Step 2 − Move the end pointer to the right until the substring between the two pointers contains all five vowels.

Step 3 − Once such a substring is found, update the shortest length found so far and move the start pointer to the right. Repeat this while the substring still contains all vowels.

Step 4 − Continue moving the end pointer to the right until a new substring containing all vowels is found, then shrink it from the start again as in Step 3.

Step 5 − When the end pointer reaches the end of the string, report the shortest length recorded.

Example 2

To represent the sliding window in this example, we keep two pointers, left and right. We iterate through the string str from left to right, checking each time whether the current character is a vowel. If it is, we increment its count in the map seen, which tracks how often each vowel occurs between the two pointers.

Once seen contains all of the vowels, we record the length of the current substring and move the left pointer to the right to shrink it, decrementing the count of each vowel that leaves the substring. This procedure continues until the right pointer reaches the end of the string.

The length of the shortest substring that contains all vowels is then returned. If no such substring exists, we return 0. For the input below, the shortest such substring is the final "aeiou", of length 5.

#include <iostream>
#include <string>
#include <unordered_map>
#include <unordered_set>
using namespace std;

int smallestSubstringLength(const string& str) {
   int n = str.length();
   unordered_set<char> vowels = {'a', 'e', 'i', 'o', 'u'};

   unordered_map<char, int> seen;  // vowel -> occurrences between left and right
   int left = 0, right = 0;
   int smallestLength = n + 1;

   while (right < n) {
      if (vowels.count(str[right])) {
         seen[str[right]]++;
      }

      // Shrink from the left while every vowel is still present
      while (seen.size() == vowels.size()) {
         if (right - left + 1 < smallestLength) {
            smallestLength = right - left + 1;
         }

         if (vowels.count(str[left]) && --seen[str[left]] == 0) {
            seen.erase(str[left]);
         }

         left++;
      }
      right++;
   }
   return (smallestLength == n + 1) ? 0 : smallestLength;
}

int main() {
   string str = "aeeiiouuobcdaeiou";
   int length = smallestSubstringLength(str);
   cout << "Length of the smallest substring containing all vowels: " << length << endl;
   return 0;
}

Output

Length of the smallest substring containing all vowels: 5

Method 3: Frequency Array Approach

The Frequency Array approach keeps a small array that counts how many times each vowel occurs in the current substring. As two pointers move through the string, the array is updated, so checking whether all vowels are present only requires looking at five counters.
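As a minimal sketch of the frequency array on its own, the snippet below counts the vowels of a whole string and reports whether each one occurs at least once. The helper names vowelIndex and containsAllVowels are hypothetical and chosen only for this illustration; lowercase input is assumed.

```cpp
#include <string>

// Hypothetical helper: slot of c in the frequency array (a=0, e=1, i=2, o=3, u=4),
// or -1 for any character that is not a lowercase vowel.
int vowelIndex(char c) {
   switch (c) {
      case 'a': return 0;
      case 'e': return 1;
      case 'i': return 2;
      case 'o': return 3;
      case 'u': return 4;
      default:  return -1;
   }
}

// Returns true when text contains each of the five vowels at least once.
bool containsAllVowels(const std::string& text) {
   int freq[5] = {0};  // freq[k] counts occurrences of the k-th vowel

   for (char c : text) {
      int k = vowelIndex(c);
      if (k >= 0) {
         freq[k]++;
      }
   }
   for (int k = 0; k < 5; ++k) {
      if (freq[k] == 0) {
         return false;  // the k-th vowel never appeared
      }
   }
   return true;
}
```

In the windowed version of the approach, the same array is incremented when a character enters the substring and decremented when one leaves, instead of being rebuilt from scratch.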

Syntax

The core step of the approach, written as a Python-style outline of the loop body, is as follows −

# Count the character entering the substring at 'right'
if input_string[right] in vowels:
   freq[input_string[right]] += 1

# Shrink while all vowels are present in the current substring
while all(freq[vowel] > 0 for vowel in vowels):
   # Update the minimum length if needed
   min_length = min(min_length, right - left + 1)

   # Move the left pointer to find a potentially smaller substring
   if input_string[left] in vowels:
      freq[input_string[left]] -= 1
   left += 1

# Move the right pointer to expand the current substring
right += 1

Algorithm

Step 1 − To record the number of occurrences of each vowel (a, e, i, o, u), start with a frequency array of size 5, initialised to zero.

Step 2 − Create start and end pointers, both pointing to the beginning of the string.

Step 3 − Move the end pointer to the right, updating the frequency array, until every vowel has been seen at least once.

Step 4 − Update the minimum length found so far, then move the start pointer to the right, decrementing the frequency of each vowel that leaves the substring, until the substring no longer contains all vowels.

Step 5 − Move the end pointer to the right again until a new substring that contains all vowels is found, and repeat Step 4.

Step 6 − Keep the frequency array updated at every move of either pointer, so that it always reflects the current substring.

Example 3

In this example, the function minLengthSubstring takes a string as input and calculates the length of the smallest substring that contains all five vowels (a, e, i, o, u).

The function counts each vowel in the substring using a frequency array called vowelCount. It keeps track of the number of distinct vowels in the substring with the counter distinctVowels.

Using two pointers, start and end, the function loops through the string, incrementing the vowelCount entry for each vowel it encounters. Once all five distinct vowels are present, the substring is shrunk from the start until one of the vowels is missing. Whenever a shorter substring is found, the minimum length is updated.

The main function calls minLengthSubstring on a sample string and prints the length of the shortest substring that contains all vowels, or a message if no such substring exists.

#include <iostream>
#include <climits>
#include <string>
using namespace std;

int minLengthSubstring(const string& str) {
   const string vowels = "aeiou";
   int vowelCount[5] = {0};  // Frequency array for vowels
   int distinctVowels = 0;  // Count of distinct vowels in the substring

   // Initialize the minimum length to maximum integer value
   int minLength = INT_MAX;

   int start = 0, end = 0;
   while (end < str.length()) {
      // Increment frequency for vowel at 'end' position
      for (int i = 0; i < 5; i++) {
         if (str[end] == vowels[i]) {
            if (vowelCount[i] == 0) {
               distinctVowels++;
            }
            vowelCount[i]++;
            break;
         }
      }

      // If all distinct vowels are found
      if (distinctVowels == 5) {

         while (start < end) {
            // Update minimum length if a shorter substring is found
            if (minLength > end - start + 1) {
               minLength = end - start + 1;
            }

            // Decrement frequency for vowel at 'start' position
            for (int i = 0; i < 5; i++) {
               if (str[start] == vowels[i]) {
                  vowelCount[i]--;
                  if (vowelCount[i] == 0) {
                     distinctVowels--;
                  }
                  break;
               }
            }
            start++;

            // Break if any distinct vowel is missing in the substring
            if (distinctVowels < 5) {
               break;
            }
         }
      }

      end++;
   }

   return minLength == INT_MAX ? -1 : minLength;
}

int main() {
   string str = "aeeioubcdofu";
   int length = minLengthSubstring(str);

   if (length == -1) {
      cout << "No substring containing all vowels found." << endl;
   } else {
      cout << "Length of the smallest substring containing all vowels: " << length << endl;
   }
   return 0;
}

Output

Length of the smallest substring containing all vowels: 6

Conclusion

Finding the length of the smallest substring that contains all vowels can be solved efficiently with a sliding window, two pointers, or a frequency array of vowel counts. In each case the right pointer expands the substring and the left pointer shrinks it while all five vowels are still present. Because each pointer moves across the string at most once and the vowel set has a fixed size, the running time is linear in the length of the string and the extra space is constant, which makes these methods suitable for large inputs. It remains important to handle edge cases, such as strings that are empty or lack one of the vowels, and to decide how uppercase letters should be treated.
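To illustrate the edge cases mentioned above, here is a hedged sketch of a window-based function that also accepts uppercase letters. Treating uppercase vowels as vowels and returning 0 when no substring qualifies are assumptions made for this sketch, and the names vowelSlot and smallestVowelWindowIgnoreCase are hypothetical.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: index of c in "aeiou" ignoring case, or -1 otherwise.
int vowelSlot(char c) {
   static const std::string vowels = "aeiou";
   char lower = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
   std::string::size_type pos = vowels.find(lower);
   return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

// Sliding window with per-vowel counts; case-insensitive by assumption.
// Returns 0 for empty input or when some vowel never occurs.
int smallestVowelWindowIgnoreCase(const std::string& s) {
   int count[5] = {0};  // occurrences of each vowel inside the window
   int distinct = 0;    // how many of the five counts are non-zero
   const int n = static_cast<int>(s.size());
   int best = n + 1;
   int left = 0;

   for (int right = 0; right < n; ++right) {
      int k = vowelSlot(s[right]);
      if (k >= 0 && count[k]++ == 0) {
         ++distinct;
      }
      // Shrink from the left while all five vowels are still present
      while (distinct == 5) {
         if (right - left + 1 < best) {
            best = right - left + 1;
         }
         int j = vowelSlot(s[left]);
         if (j >= 0 && --count[j] == 0) {
            --distinct;
         }
         ++left;
      }
   }
   return best <= n ? best : 0;
}
```

With these conventions, an empty string and a string without vowels both yield 0, while "AEIOU" yields 5. Whether uppercase letters should count is a requirement to confirm for the task at hand rather than a fixed rule.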

Updated on: 31-Jul-2023
