Count Substrings with even frequency of each character and one exception


In this problem, we will count the substrings of the given string in which every character occurs an even number of times, or in which exactly one character occurs an odd number of times and all other characters occur an even number of times.

We will use the bitmasking technique to solve the problem. In the bitmask, each bit of an integer corresponds to one character and records whether that character has appeared an odd (1) or even (0) number of times.
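As a small illustration of the idea, the sketch below builds such a parity mask for a whole string. The function name parity_mask is only illustrative and does not appear in the programs of this article.

```python
def parity_mask(s):
    """Return an integer whose bit i is 1 when the character
    chr(ord('a') + i) occurs an odd number of times in s."""
    mask = 0
    for ch in s:
        # XOR flips the parity bit that belongs to ch
        mask ^= 1 << (ord(ch) - ord('a'))
    return mask

print(bin(parity_mask("abca")))  # 0b110: 'b' and 'c' are odd, 'a' is even
print(parity_mask("qq"))         # 0: every character has an even frequency
```

A string is valid for this problem when its mask is 0 or has exactly one bit set.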

Problem Statement

We are given a string alpha of length N, where 'a' <= alpha[i] <= 't' for every index i. We need to count the substrings in which all characters have an even frequency, or in which exactly one character has an odd frequency and every other character has an even frequency.

Sample Example

Input

alpha = "pqq";

Output

5

Explanation − The valid substrings are pqq, p, q, q, and qq.

Input

alpha = "mnbn";

Output

5

Explanation − The valid substrings are nbn, m, n, b, and n.

Approach 1

This approach uses bitmasking to count the number of substrings containing all characters with even frequency or only one character with odd frequency.

Logic − The XOR of two equal integers is 0.

So, we traverse the string and XOR the bit of each character into a running bitmask, which starts at 0. After each step, the bitmask holds the parity of every character in the prefix read so far. If the same bitmask occurred at an earlier position, the substring between the two positions contains every character an even number of times, as the XOR of the two masks is 0. Also, we flip each of the 20 bits of the current bitmask in turn and look up the result; an earlier prefix with that mask marks a substring in which exactly one character has an odd frequency.
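The relation between prefix masks and substrings can be checked with a short sketch. The helper names prefix_masks and is_valid are assumptions made for this illustration; they are not part of the programs given later.

```python
def prefix_masks(s):
    """masks[k] is the parity mask of the first k characters of s.
    masks[0] belongs to the empty prefix."""
    masks = [0]
    for ch in s:
        masks.append(masks[-1] ^ (1 << (ord(ch) - ord('a'))))
    return masks

def is_valid(s, i, j):
    """Check the substring s[i:j] using only two prefix masks."""
    masks = prefix_masks(s)
    diff = masks[i] ^ masks[j]       # parity mask of s[i:j]
    return diff & (diff - 1) == 0    # true when diff is 0 or has one bit set

s = "mnbn"
print(prefix_masks(s))    # [0, 4096, 12288, 12290, 4098]
print(is_valid(s, 1, 4))  # True, the substring is "nbn"
print(is_valid(s, 0, 2))  # False, the substring is "mn"
```

Here the masks before and after "nbn" differ only in the bit of 'b', so the substring has exactly one character with an odd frequency.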

Algorithm

  • Step 1 − Initialize the 'matrix' list of size 2^20 (1 << 20) with zeros. It has one slot for each possible mask over the 20 characters 'a' to 't'.

  • Step 2 − Initialize 'bitMask' with 0, representing the mask of the empty prefix, and 'cnt' with 0 to store the number of valid substrings.

  • Step 3 − Set matrix[0] to 1, as the empty prefix has mask 0.

  • Step 4 − Start traversing the string.

  • Step 5 − Left shift 1 by (character - 'a') and XOR it with 'bitMask'. This flips the parity bit of the current character, which adds the character to the mask.

  • Step 6 − Add the number of earlier occurrences of the same 'bitMask' in the matrix list to 'cnt'. For example, in the 'acbbe' string, the bitmask after index 1 and after index 3 (0-based) is the same, so the substring 'bb' between them has only even frequencies.

  • Step 7 − Run a loop for p from 0 to 19. In each iteration, left shift 1 by p and XOR it with 'bitMask' to flip the parity of one character. Add the number of earlier occurrences of the resulting mask to 'cnt'; each of them marks a substring in which only that character has an odd frequency.

  • Step 8 − Increment the count stored for the current 'bitMask' in the list.

  • Step 9 − Return the 'cnt' value.
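To see how steps 6 and 7 contribute separately, the following sketch follows the same steps but keeps two counters instead of one. The function name count_with_breakdown and the two counter names are hypothetical and used only for this illustration.

```python
def count_with_breakdown(alpha):
    """Follow the steps above, but report the substrings found in
    step 6 (all even) and step 7 (one odd character) separately."""
    matrix = [0] * (1 << 20)   # step 1
    bit_mask = 0               # step 2
    all_even = one_odd = 0
    matrix[0] = 1              # step 3
    for ch in alpha:           # step 4
        bit_mask ^= 1 << (ord(ch) - ord('a'))       # step 5
        all_even += matrix[bit_mask]                # step 6
        for p in range(20):                         # step 7
            one_odd += matrix[bit_mask ^ (1 << p)]
        matrix[bit_mask] += 1                       # step 8
    return all_even, one_odd

print(count_with_breakdown("pqq"))   # (1, 4): "qq", then "p", "q", "q", "pqq"
print(count_with_breakdown("mnbn"))  # (0, 5)
```

The sum of the two numbers is the 'cnt' value returned in step 9.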

Example

Following are the programs for the above algorithm in C, C++, Java, and Python −

#include <stdio.h>
#include <stdlib.h>

// One slot for each possible parity mask over the 20 characters 'a' to 't'
#define MAX_SIZE (1 << 20)

long long matrix[MAX_SIZE];

long long getSubStrings(char *alpha) {
   long long bitMask = 0, cnt = 0;
   // One possible way for empty mask
   matrix[0] = 1;
   for (int i = 0; alpha[i]; i++) {
      char c = alpha[i];
      // Change the bitmask at char - 'a' value
      bitMask ^= 1 << (c - 'a');
      // Get valid substrings with the same mask
      cnt += matrix[bitMask];
      // Flip each of the 20 bits in turn
      for (int p = 0; p < 20; p++) {
         // Count earlier masks that differ from the current one only in bit p
         cnt += matrix[bitMask ^ (1 << p)];
      }
      // Update frequency of mask
      matrix[bitMask]++;
   }
   return cnt;
}

int main() {
   char alpha[] = "pqq";
   printf("The total number of substrings according to the problem statement is %lld\n", getSubStrings(alpha));
   return 0;
}

Output

The total number of substrings according to the problem statement is 5

#include <bits/stdc++.h>
using namespace std;

vector<int> matrix;
long long getSubStrings(string &alpha) {
   // assign() also clears counts left over from an earlier call
   matrix.assign(1 << 20, 0);
   long long bitMask = 0, cnt = 0;
   // One possible way for empty mask
   matrix[0] = 1;
   for (auto c : alpha) {
      // Change the bitmask at char - 'a' value
      bitMask ^= 1 << (c - 'a');
      // Get valid substrings with the same mask
      cnt += matrix[bitMask];
      // Flip each of the 20 bits in turn
      for (int p = 0; p < 20; p++) {
         // Count earlier masks that differ from the current one only in bit p
         cnt += matrix[bitMask ^ (1 << p)];
      }
      // Update frequency of mask
      matrix[bitMask]++;
   }
   return cnt;
}
int main() {
   string alpha = "pqq";
   cout << "The total number of substrings according to the problem statement is " << getSubStrings(alpha);
   return 0;
}

Output

The total number of substrings according to the problem statement is 5

public class Main {

   public static void main(String[] args) {
      String alpha = "pqq";
      System.out.println("The total number of substrings according to the problem statement is " + getSubStrings(alpha));
   }

   public static long getSubStrings(String alpha) {
      long[] matrix = new long[1 << 20];
      long bitMask = 0;
      long cnt = 0;
      // One possible way for empty mask
      matrix[0] = 1;
      for (char c : alpha.toCharArray()) {
         // Change the bitmask at char - 'a' value
         bitMask ^= 1 << (c - 'a');
         // Get valid substrings with the same mask
         cnt += matrix[(int) bitMask];
         // Flip each of the 20 bits in turn
         for (int p = 0; p < 20; p++) {
            // Count earlier masks that differ from the current one only in bit p
            cnt += matrix[(int) (bitMask ^ (1 << p))];
         }
         // Update frequency of mask
         matrix[(int) bitMask]++;
      }
      return cnt;
   }
}

Output

The total number of substrings according to the problem statement is 5

def getSubStrings(alpha):
   matrix = [0] * (1 << 20)
   bitMask = 0
   cnt = 0
   # One possible way for empty mask
   matrix[0] = 1
   for c in alpha:
      # Change the bitmask at char - 'a' value
      bitMask ^= 1 << (ord(c) - ord('a'))
      # Get valid substrings with the same mask
      cnt += matrix[bitMask]
      # Flip each of the 20 bits in turn
      for p in range(20):
         # Count earlier masks that differ from the current one only in bit p
         cnt += matrix[bitMask ^ (1 << p)]
      # Update frequency of mask
      matrix[bitMask] += 1
   return cnt

alpha = "pqq"
print("The total number of substrings according to the problem statement is", getSubStrings(alpha))

Output

The total number of substrings according to the problem statement is 5

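Time complexity − O(N × M), where M = 20 is the number of possible characters, as we look up the current mask and its M one-bit variants for each of the N characters. Since M is a constant, this is O(N).

Space complexity − O(2^M), where M = 20 is the number of possible characters, as the list holds one counter for each possible mask.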
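If the memory for the full list is a concern, one possible variation (not used in the programs above) is to store only the masks that actually occur in a dictionary. A string of length N produces at most N + 1 distinct prefix masks. The sketch below assumes the same input range 'a' to 't'; the function name count_substrings_sparse and the alphabet_size parameter are hypothetical.

```python
from collections import defaultdict

def count_substrings_sparse(alpha, alphabet_size=20):
    """Same counting idea, but with a dictionary of seen masks
    instead of a list with 2^20 entries."""
    seen = defaultdict(int)
    seen[0] = 1          # mask of the empty prefix
    mask = 0
    cnt = 0
    for ch in alpha:
        mask ^= 1 << (ord(ch) - ord('a'))
        # get() reads a count without creating a new dictionary entry
        cnt += seen.get(mask, 0)
        for p in range(alphabet_size):
            cnt += seen.get(mask ^ (1 << p), 0)
        seen[mask] += 1
    return cnt

print(count_substrings_sparse("pqq"))   # 5
print(count_substrings_sparse("mnbn"))  # 5
```

The running time stays the same; only the storage changes from one counter per possible mask to one counter per mask that was seen.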

Programmers can also solve the problem by taking each substring and checking whether it is valid according to the problem statement, but that needs at least O(N^2) time. The bitmasking approach with prefix parity masks is much more efficient for this kind of problem.
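For small inputs, a direct check of every substring is a handy way to test the bitmask solution. The following is a minimal O(N^2) sketch under that assumption; the function name count_substrings_brute is illustrative only.

```python
from collections import Counter

def count_substrings_brute(alpha):
    """Count valid substrings by extending every start index i
    one character at a time and tracking how many characters
    currently have an odd frequency."""
    n = len(alpha)
    cnt = 0
    for i in range(n):
        freq = Counter()
        odd = 0
        for j in range(i, n):
            freq[alpha[j]] += 1
            # The frequency of alpha[j] just changed parity
            odd += 1 if freq[alpha[j]] % 2 else -1
            if odd <= 1:
                cnt += 1
    return cnt

print(count_substrings_brute("pqq"))   # 5
print(count_substrings_brute("mnbn"))  # 5
```

Both sample inputs give the same answers as the bitmask programs above.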

Updated on: 16-Oct-2023
