Count Substrings with at least one occurrence of each of the first K letters


In this problem, we need to count the substrings that contain at least one occurrence of each of the first K letters of the alphabet.

Here, we will use two different approaches to solve the problem. The first approach generates every substring of the given string, checks whether each one contains all K characters, and counts those that do.

The second approach uses the sliding window technique to solve the problem.

Problem statement - We are given a string alpha of N characters and an integer K. The string contains only the first K letters of the alphabet, each of which may occur multiple times. We need to count the substrings that contain every one of those K characters at least once.

Sample examples

Input

alpha = "abcdda", K = 4

Output

4

Explanation - The substrings containing all 4 characters are 'abcd', 'abcdd', 'abcdda', and 'bcdda'.

Input

alpha = "abc", K = 5

Output

0

Explanation - There is no substring of the given string containing all 5 characters.

Input

alpha = "ccbba", K = 3

Output

2

Explanation - The substrings 'ccbba' and 'cbba' contain all 3 characters.

Approach 1

In this approach, we will traverse the string to get all substrings and store them in a list. After that, we will count how many strings in the list contain all K characters at least once.

Algorithm

Step 1 - Initialize the 'subStr' list to store all substrings.

Step 2 - Start traversing the string with index p. In a nested loop, iterate q from 1 to string size - p.

Step 3 - Use the substr() method to get the substring that starts at index p and has length q, and push it to the 'subStr' list.

Step 4 - Initialize the 'result' with 0 to store the count of valid substrings.

Step 5 - Start traversing the list of substrings, and define the 'freq' map to store the frequency of characters in the current string. Also, initialize the 'chars' with 0 to count distinct characters in the string.

Step 6 - Start traversing the current string. If the frequency of the character is 0 in the map, update the frequency, and increment the 'chars' value by 1.

Step 7 - If the value of chars is equal to K, increment the 'result' by 1.

Step 8 - Return the result value.

Example

#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;

int numberOfSubstrings(string alpha, int K) {
   // Finding all substrings of the alpha
   vector<string> subStr;
   for (int p = 0; p < alpha.size(); p++) {
      for (int q = 1; q <= alpha.size() - p; q++) {
         // Get the substring that starts at index p and has length q
         subStr.push_back(alpha.substr(p, q));
      }
   }
   // Counting the number of substrings containing all K characters
   int result = 0;
   for (int p = 0; p < subStr.size(); p++) {
      // To store the frequency of characters in the current substring
      map<char, int> freq;
      // To count the distinct characters in the current substring
      int chars = 0;
      // Traverse substring
      for (int q = 0; q < subStr[p].size(); q++) {
         // If a character does not exist in the map, increment chars
         if (freq[subStr[p][q]] == 0) {
            freq[subStr[p][q]]++;
            chars++;
         }
      }
      // If the number of distinct chars equals K, the substring is valid.
      if (chars == K) {
         result++;
      }
   }
   return result;
}
int main() {
   string alpha = "abcdda";
   int K = 4;
   cout << "The number of substrings containing all K characters at least once is " << numberOfSubstrings(alpha, K);
   return 0;
}

Output

The number of substrings containing all K characters at least once is 4

Time complexity - O(N*N*M), where N*N is the number of substrings and M is the length of each substring traversed. As M can be up to N, this is O(N^3) in the worst case.

Space complexity - O(N*N*M) to store all substrings, as the list holds O(N*N) substrings of length up to M. This is O(N^3) in the worst case.
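The substring list is what drives the memory cost of this approach. As a minimal sketch of one possible variation (not part of the approach above), the same count can be computed by extending each start index one character at a time. The names 'countWithoutStoring', 'seen', and 'distinct' are illustrative, and the sketch assumes that alpha contains only lowercase letters and that K is between 1 and 26.

```cpp
#include <string>
#include <vector>

// Counts substrings that contain each of the first K lowercase letters at
// least once, without building any substring.
// Assumption: alpha holds only 'a'..'z' and 1 <= K <= 26.
int countWithoutStoring(const std::string& alpha, int K) {
    if (K < 1 || K > 26) return 0;
    const int n = static_cast<int>(alpha.size());
    int result = 0;
    for (int start = 0; start < n; ++start) {
        std::vector<bool> seen(K, false);  // seen[c] is true once letter c appears
        int distinct = 0;                  // how many of the first K letters were seen
        for (int end = start; end < n; ++end) {
            const int c = alpha[end] - 'a';
            if (c >= 0 && c < K && !seen[c]) {
                seen[c] = true;
                ++distinct;
            }
            if (distinct == K) {
                // Every longer substring from the same start is also valid.
                result += n - end;
                break;
            }
        }
    }
    return result;
}
```

Calling countWithoutStoring("abcdda", 4) returns 4, matching the first sample. Under the stated assumptions, this sketch needs O(N*N) time and O(K) extra space.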

Approach 2

In this approach, we will use the sliding window technique to count the number of substrings containing all K characters at least once.

Algorithm

Step 1 - Initialize the 'freq' map to store character frequencies, the 'left' and 'right' window pointers with 0, 'len' with the string length, and 'cnt' with 0 to store the count of valid substrings.

Step 2 - Make iterations until 'right' is less than the string length.

Step 3 - Increment the character frequency for the character which is at the 'right' index in the map.

Step 4 - If the size of the map is equal to K, the current window contains all K characters. So, follow the below steps.

Step 4.1 - Repeat the steps below while the size of the map is equal to K.

Step 4.2 - Add 'len - right' to 'cnt', since every substring that starts at 'left' and ends at 'right' or later is valid.

Step 4.3 - To remove the left character from the current window, decrease its frequency in the map. If its frequency becomes 0, remove the character from the map.

Step 4.4 - Increment the 'left' pointer.

Step 5 - Increment the 'right' pointer, whether or not the window was valid.

Step 6 - Return the 'cnt' value.

Example

Let's understand how the sliding window technique solves the problem with a sample input.

Input - 'abcdaabc', K = 4

  • The first window will be 'abcd', which contains all 4 characters. So, the substrings (0, 3), (0, 4), (0, 5), (0, 6), and (0, 7) all contain all K characters.

  • After that, the next window from 1 to 4 contains all characters. So, the substrings (1, 4), (1, 5), (1, 6), and (1, 7) all contain all K characters at least once.

  • This way, we can count the number of substrings for each valid window.
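The walkthrough above can be reproduced with a small tracing helper. This is only a sketch: 'WindowStep' and 'traceWindows' are hypothetical names introduced for illustration. It runs the same sliding window loop but records each valid window and the number of substrings that window contributes.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// One record per valid window found by the sliding window.
struct WindowStep {
    int left;   // start index of the valid window
    int right;  // end index of the valid window
    int added;  // substrings counted for this window: len - right
};

// Hypothetical tracing helper: same loop as the sliding window approach,
// but it records each valid window instead of only summing the counts.
std::vector<WindowStep> traceWindows(const std::string& alpha, int K) {
    std::unordered_map<char, int> freq;
    std::vector<WindowStep> steps;
    const int len = static_cast<int>(alpha.size());
    int left = 0;
    for (int right = 0; right < len; ++right) {
        ++freq[alpha[right]];
        // Shrink from the left while the window still holds all K characters.
        while (static_cast<int>(freq.size()) == K) {
            steps.push_back({left, right, len - right});
            if (--freq[alpha[left]] == 0) freq.erase(alpha[left]);
            ++left;
        }
    }
    return steps;
}
```

For 'abcdaabc' and K = 4, the recorded windows are (0, 3), (1, 4), (2, 6), and (3, 7), contributing 5, 4, 2, and 1 substrings, for a total of 12.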

#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

int numberOfSubstrings(string alpha, int K) {
   // For storing the frequency of characters
   unordered_map<char, int> freq;
   int left = 0, right = 0, len = alpha.size(), cnt = 0;
   // Traverse the string
   while (right < len) {
     // Update character frequency
     freq[alpha[right]]++;
     // Shrink the window while it contains all K characters
     while ((int)freq.size() == K) {
       // Add all valid substrings.
       // If (left, right) contains all K characters, (left, right + 1), (left, right + 2), ... also contain them.
       cnt += len - right;
       // Update character frequency
       freq[alpha[left]]--;
       // Remove the character if its frequency is 0.
       if (freq[alpha[left]] == 0)
          freq.erase(alpha[left]);
       // Move to the next character from left
       left++;
     }
     // Increment the right pointer.
     right++;
   }
   // Return the value of cnt
   return cnt;
}
int main() {
   string alpha = "abcdda";
   int K = 4;
   cout << "The number of substrings containing all K characters at least once is " << numberOfSubstrings(alpha, K);
   return 0;
}

Output

The number of substrings containing all K characters at least once is 4

Time complexity - O(N) on average for the sliding window, as each character enters and leaves the window at most once.

Space complexity - O(K), to store the frequency in the map.

Programmers can also use an array instead of the map to store character frequencies, as array access is faster than map access. For more practice, programmers can try to solve the problem in which we need to count the number of substrings containing each of the K characters exactly once.
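The array-based idea can be sketched as follows. This is one possible version, not the only one: the name 'countWithArray' and the 'distinct' counter are illustrative, and the sketch assumes that alpha contains only lowercase letters and that K is between 1 and 26. Since an array has no size that shrinks when a count reaches 0, 'distinct' tracks how many letters currently have a non-zero count.

```cpp
#include <array>
#include <string>

// Sliding window with a fixed-size frequency array instead of a map.
// Assumption: alpha holds only 'a'..'z' (other characters would index
// outside the array) and 1 <= K <= 26.
int countWithArray(const std::string& alpha, int K) {
    if (K < 1 || K > 26) return 0;
    std::array<int, 26> freq{};  // freq[i] counts letter 'a' + i in the window
    const int len = static_cast<int>(alpha.size());
    int distinct = 0;  // letters with a non-zero count in the window
    int left = 0;
    int cnt = 0;
    for (int right = 0; right < len; ++right) {
        // A letter entering with count 0 is a new distinct letter.
        if (freq[alpha[right] - 'a']++ == 0) ++distinct;
        while (distinct == K) {
            // (left, right) is valid, so every longer end index is valid too.
            cnt += len - right;
            if (--freq[alpha[left] - 'a'] == 0) --distinct;
            ++left;
        }
    }
    return cnt;
}
```

Calling countWithArray("abcdda", 4) returns 4 and countWithArray("ccbba", 3) returns 2, matching the sample examples.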

Updated on: 29-Aug-2023
