Queries to find Kth Greatest Character In A Range [L,R] From A String With Updates


A Fenwick tree, also called a binary indexed tree (BIT), is a data structure that supports point updates and prefix-sum (and therefore range-sum) queries in O(log n) time.

The fundamental idea is to keep one frequency array per letter: for a letter c, index i holds 1 if the i-th character of the string is c and 0 otherwise. A Fenwick tree built over each of these arrays can then count how often c occurs in a range [L, R], and absorb a single-character update, in O(log n) time.
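The idea can be sketched as follows. This is a minimal illustration, not a reference implementation: the struct name LetterFenwick and its member names are hypothetical, positions are 0-based, and the string is assumed to contain only lowercase letters 'a' to 'z'.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: one Fenwick tree (BIT) per lowercase letter.
struct LetterFenwick {
    std::string text;                      // current contents of the string
    std::vector<std::array<int, 26>> bit;  // bit[i][c]: Fenwick node i for letter c (1-indexed)

    explicit LetterFenwick(const std::string& s)
        : text(s), bit(s.size() + 1, std::array<int, 26>{}) {
        for (int i = 0; i < static_cast<int>(s.size()); ++i) add(i, s[i] - 'a', +1);
    }

    // Point update: add delta to the count of letter c at 0-based position pos.
    void add(int pos, int c, int delta) {
        for (int i = pos + 1; i < static_cast<int>(bit.size()); i += i & -i) bit[i][c] += delta;
    }

    // Number of occurrences of letter c among the first len characters.
    int prefix(int len, int c) const {
        int sum = 0;
        for (int i = len; i > 0; i -= i & -i) sum += bit[i][c];
        return sum;
    }

    // Replace the character at 0-based position pos with ch.
    void set(int pos, char ch) {
        add(pos, text[pos] - 'a', -1);
        text[pos] = ch;
        add(pos, ch - 'a', +1);
    }

    // K-th greatest character in the inclusive range [l, r];
    // returns '\0' when k is not between 1 and the range length.
    char kthGreatest(int l, int r, int k) const {
        if (k < 1) return '\0';
        int seen = 0;
        for (int c = 25; c >= 0; --c) {  // scan from 'z' down to 'a'
            seen += prefix(r + 1, c) - prefix(l, c);
            if (seen >= k) return static_cast<char>('a' + c);
        }
        return '\0';
    }
};
```

Under these assumptions, constructing LetterFenwick f("abacaba"), calling f.set(4, 'd') and then f.kthGreatest(1, 5, 3) examines the characters b, a, c, d, b and yields 'b'. Each query performs 26 pairs of prefix lookups, so it costs O(26 log n).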

Problem Approach

The following operations let you find the Kth greatest character in a range [L, R] of a string that is being updated −

  • Build segment tree − First build a segment tree over the string. Each node represents a range of indices and stores a frequency array of 26 counters, one per letter, for the characters in that range.

  • Update − To change a character, update the matching leaf by decreasing the count of the old character and increasing the count of the new one, then recompute the counts of every node on the path back to the root.

  • Kth greatest character search − Starting at the root, descend to the O(log n) nodes that together cover [L, R] and add up their frequency arrays. Then scan the letters from 'z' down to 'a' while accumulating counts; the first letter at which the running total reaches K is the answer.

  • Time Complexity − Each update or query visits O(log n) nodes and does up to 26 operations at each, so it costs O(26 log n), which is O(log n) for a fixed alphabet; here n is the length of the string. Building the tree takes O(26 n) time, and the tree occupies O(26 n) space, which is O(n).
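The three operations can also be sketched with a compact bottom-up segment tree. This is an illustrative variant rather than the recursive layout used in Approach-1: the struct name CountSegTree and its members are hypothetical, positions are 0-based, and only lowercase letters are assumed.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

using Counts = std::array<int, 26>;

// Hypothetical bottom-up segment tree: leaves live at node[n .. 2n-1].
struct CountSegTree {
    int n;
    std::string text;          // current contents of the string
    std::vector<Counts> node;  // node[i] = letter counts of the range covered by i

    explicit CountSegTree(const std::string& s)
        : n(static_cast<int>(s.size())), text(s), node(2 * s.size(), Counts{}) {
        for (int i = 0; i < n; ++i) node[n + i][s[i] - 'a'] = 1;
        for (int i = n - 1; i >= 1; --i)
            for (int c = 0; c < 26; ++c) node[i][c] = node[2 * i][c] + node[2 * i + 1][c];
    }

    // Update: only two counters change in each node on the leaf-to-root path.
    void set(int pos, char ch) {
        int oldC = text[pos] - 'a', newC = ch - 'a';
        text[pos] = ch;
        for (int i = pos + n; i >= 1; i /= 2) {
            node[i][oldC]--;
            node[i][newC]++;
        }
    }

    // Query: merge the O(log n) nodes covering [l, r], then scan 'z' to 'a'.
    // Returns '\0' when k is not between 1 and the range length.
    char kthGreatest(int l, int r, int k) const {
        if (k < 1) return '\0';
        Counts freq{};
        for (int lo = l + n, hi = r + n + 1; lo < hi; lo /= 2, hi /= 2) {
            if (lo & 1) { for (int c = 0; c < 26; ++c) freq[c] += node[lo][c]; ++lo; }
            if (hi & 1) { --hi; for (int c = 0; c < 26; ++c) freq[c] += node[hi][c]; }
        }
        int seen = 0;
        for (int c = 25; c >= 0; --c) {
            seen += freq[c];
            if (seen >= k) return static_cast<char>('a' + c);
        }
        return '\0';
    }
};
```

The query merges at most about 2 log n nodes with 26 additions each, which is where the O(26 log n) bound comes from.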

Syntax

Assuming the string is given initially and can be updated, and each query asks for the Kth greatest character in an interval of the string, the following syntax can be used −

1. To initialize string −

string str = "initial_string";

2. To update string at index −

str[index] = new_character;

3. To find the Kth greatest character in the interval [P, T] −

// initialize frequency array of size 26 
int freq[26] = {0};

// count the frequency of each character in the range
for (int i = P; i <= T; i++) {
   freq[str[i] - 'a']++;
}

// find k th greatest character
int cont = 0;
for (int i = 25; i >= 0; i--) {
   cont += freq[i];
   if (cont >= k) {
      return (char) (i + 'a');
   }
}

// if k is larger than the number of characters in the interval,
// return a special character or throw an exception
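Because the fragment above uses return, it has to live inside a function. One possible wrapper is sketched here; the name kthGreatestNaive is hypothetical, and std::optional (C++17) is just one way to signal that k is out of range.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical wrapper around the counting fragment. Returns std::nullopt
// when k is not between 1 and the number of characters in [P, T].
std::optional<char> kthGreatestNaive(const std::string& str, int P, int T, int k) {
    if (k < 1) return std::nullopt;

    // count the frequency of each character in the range
    int freq[26] = {0};
    for (int i = P; i <= T; i++) {
        freq[str[i] - 'a']++;
    }

    // scan from 'z' down to 'a' until k characters have been seen
    int cont = 0;
    for (int i = 25; i >= 0; i--) {
        cont += freq[i];
        if (cont >= k) {
            return static_cast<char>(i + 'a');
        }
    }
    return std::nullopt;
}
```

This version costs O(T - P + 26) per query and needs no extra structure, so it is a reasonable baseline when ranges are short.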

Algorithm

Algorithm to find the Kth greatest character in an interval [L, R] of a given string with updates −

  • Step 1 − Initialize a table A with 26 counters per prefix, where A[i][c] is the number of occurrences of the c-th letter (0-indexed) among the first i characters of the string.

  • Step 2 − Traverse the string S from left to right, filling each row of A from the previous row plus the current character.

  • Step 3 − To handle updates, maintain a separate table B with 26 counters per position, initialized to zero.

  • Step 4 − Whenever an update is performed at a position, subtract 1 from the old character's counter and add 1 to the new character's counter in B at that position.

  • Step 5 − To answer a query on [L, R], take the row of A at R minus the row of A at L-1, then add the entries of B for positions L to R. This gives the count of each character in the range [L, R] after applying the updates.

  • Step 6 − Scan the letters from 'z' down to 'a', keeping a running total of their counts.

  • Step 7 − Return the letter at which the running total first reaches K.
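One way to read these steps is sketched here, under stated assumptions: A holds prefix counts of the original string, B holds per-position corrections, positions are 0-based, and the names PrefixWithDeltas, update and kthGreatest are hypothetical. B is summed directly over the range for clarity; keeping B in a Fenwick tree would make that part logarithmic.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

using Counts = std::array<int, 26>;

struct PrefixWithDeltas {
    std::string cur;        // current string after updates
    std::vector<Counts> A;  // A[i][c]: count of letter c in the first i original characters
    std::vector<Counts> B;  // B[i][c]: correction for letter c at position i

    explicit PrefixWithDeltas(const std::string& s)
        : cur(s), A(s.size() + 1, Counts{}), B(s.size(), Counts{}) {
        for (std::size_t i = 0; i < s.size(); ++i) {
            A[i + 1] = A[i];          // copy the previous prefix row
            A[i + 1][s[i] - 'a']++;   // then count the current character
        }
    }

    // Record the change at pos: one fewer old character, one more new character.
    void update(int pos, char ch) {
        B[pos][cur[pos] - 'a']--;
        B[pos][ch - 'a']++;
        cur[pos] = ch;
    }

    // K-th greatest character in the inclusive range [l, r];
    // returns '\0' when k is not between 1 and the range length.
    char kthGreatest(int l, int r, int k) const {
        if (k < 1) return '\0';
        Counts freq{};
        for (int c = 0; c < 26; ++c) freq[c] = A[r + 1][c] - A[l][c];
        for (int i = l; i <= r; ++i)
            for (int c = 0; c < 26; ++c) freq[c] += B[i][c];
        int seen = 0;
        for (int c = 25; c >= 0; --c) {  // scan from 'z' down to 'a'
            seen += freq[c];
            if (seen >= k) return static_cast<char>('a' + c);
        }
        return '\0';
    }
};
```

Updates are O(1) in this sketch, while a query costs O(26 (R - L + 1)) because of the direct summation over B.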

Approaches to Follow

Approach-1

In this example, the string "abacaba" is used as the initial string. The build function initializes the segment tree by counting the occurrences of each character in every node's range. The update function changes a character in the string and adjusts the tree by decrementing the count of the old character and incrementing the count of the new one. The collect function adds up the character counts of the nodes covering [L, R], and the QUERY function scans those counts from 'z' down to 'a' to return the Kth greatest character in [L, R].

Example-1

#include <bits/stdc++.h>
using namespace std;

const int N = 1e5 + 5;

// Each node stores how often every letter occurs in its index range.
struct NODE {
   int cnt[26];
} tree[4 * N];

string W;

// Recompute the counts of node X from its two children.
void pull(int X) {
   for (int i = 0; i < 26; i++) {
      tree[X].cnt[i] = tree[2 * X].cnt[i] + tree[2 * X + 1].cnt[i];
   }
}

// Build the node X that covers the indices [E, F].
void build(int X, int E, int F) {
   if (E == F) {
      tree[X].cnt[W[E] - 'a']++;
      return;
   }
   int mid = (E + F) / 2;
   build(2 * X, E, mid);
   build(2 * X + 1, mid + 1, F);
   pull(X);
}

// Replace the character at index idx with ch.
void update(int X, int E, int F, int idx, char ch) {
   if (E == F) {
      tree[X].cnt[W[E] - 'a']--;
      W[E] = ch;
      tree[X].cnt[W[E] - 'a']++;
      return;
   }
   int mid = (E + F) / 2;
   if (idx <= mid) {
      update(2 * X, E, mid, idx, ch);
   } else {
      update(2 * X + 1, mid + 1, F, idx, ch);
   }
   pull(X);
}

// Add the letter counts of the indices in [L, R] to freq.
void collect(int X, int E, int F, int L, int R, int freq[]) {
   if (R < E || F < L) {
      return;   // no overlap with [L, R]
   }
   if (L <= E && F <= R) {
      for (int i = 0; i < 26; i++) {
         freq[i] += tree[X].cnt[i];
      }
      return;
   }
   int mid = (E + F) / 2;
   collect(2 * X, E, mid, L, R, freq);
   collect(2 * X + 1, mid + 1, F, L, R, freq);
}

// Return the k-th greatest character in [L, R] (0-indexed, inclusive),
// or '?' if the range holds fewer than k characters.
char QUERY(int n, int L, int R, int k) {
   int freq[26] = {0};
   collect(1, 0, n - 1, L, R, freq);
   int seen = 0;
   for (int i = 25; i >= 0; i--) {
      seen += freq[i];
      if (seen >= k) {
         return char('a' + i);
      }
   }
   return '?';
}

int main() {
   W = "abacaba";
   int n = W.length();
   build(1, 0, n - 1);

   cout << W << endl;

   update(1, 0, n - 1, 4, 'd');

   cout << W << endl;

   cout << QUERY(n, 0, 6, 1) << endl;   // greatest character of the whole string
   cout << QUERY(n, 1, 5, 3) << endl;   // 3rd greatest character of "bacdb"
   return 0;
}

Output

abacaba
abacdba
d
b

Approach-2

This program first builds a 2D array Freq of size N x 26, where Freq[i][j] is the frequency of the j-th letter in the prefix made of the first i characters of the string Y. Each row is obtained by copying the previous row and incrementing the count of the current character. The string is assumed to contain only lowercase letters.

After building the Freq array, we perform two queries. In each query, we obtain the count of every character in the range [l, r] (1-indexed) by subtracting the counts at index l-1 from the counts at index r. We then iterate through the letters from 25 down to 0, that is from 'z' to 'a', keeping a running total of the characters seen so far. When the total reaches k, we store the letter's index and break out of the loop. Finally, we print the character corresponding to the stored index.

In between the queries, we update the string by changing the character at index 4 (0-indexed) to 'a'. Every prefix that contains that position loses one occurrence of the old character and gains one of the new character, so the corresponding rows of Freq are adjusted. An update therefore costs O(n), which makes this approach suitable only when updates are rare.

Example-2

#include <bits/stdc++.h>
using namespace std;

const int N = 1e5+5;
int Freq[N][26];   // Freq[i][j] = occurrences of letter j in the first i characters

int main() {
   ios_base::sync_with_stdio(false);
   cin.tie(nullptr);

   string Y = "programmingcode";   // lowercase letters only
   int U = Y.size();

   // Build the prefix table: row i+1 is row i plus the character Y[i].
   for (int i = 0; i < U; i++) {
      for (int j = 0; j < 26; j++) {
         Freq[i+1][j] = Freq[i][j];
      }
      Freq[i+1][Y[i]-'a']++;
   }

   int Q = 2;
   while (Q--) {
      int l = 2, r = 9, k = 3;   // 1-indexed, inclusive range
      int cont = 0, ans = -1;
      // Scan from 'z' down to 'a' so that larger characters are counted first.
      for (int i = 25; i >= 0; i--) {
         cont += Freq[r][i] - Freq[l-1][i];
         if (cont >= k) {
            ans = i;
            break;
         }
      }
      cout << "The " << k << "rd greatest character in range [" << l << "," << r << "] is " << char(ans+'a') << "\n";

      // Update: replace the character at 0-based index 4 with 'a'.
      int pos = 4;
      char oldCh = Y[pos], newCh = 'a';
      Y[pos] = newCh;
      // Every prefix containing pos loses one oldCh and gains one newCh.
      for (int i = pos; i < U; i++) {
         Freq[i+1][oldCh-'a']--;
         Freq[i+1][newCh-'a']++;
      }
   }

   return 0;
}

Output

The 3rd greatest character in range [2,9] is o
The 3rd greatest character in range [2,9] is m

Conclusion

Queries for the Kth greatest character in an interval [L, R] of a string with updates can be answered efficiently with a segment tree that stores letter frequencies. The tree keeps track of how often each character occurs in a range, and a scan of the 26 counts from 'z' down to 'a' then locates the Kth greatest character in that range.

Updated on: 10-May-2023
