Maximum number of Strings with Common Prefix of length K


In this problem, we need to find the maximum number of strings that share a common prefix of length K. We can take the prefix of length K from every string and count how often each prefix occurs using the map data structure. Alternatively, we can use the Trie data structure to solve the problem.

Problem statement - We are given an array strs[] containing multiple strings. We need to find the maximum number of strings that share a common prefix of length K.

Sample Example

Input

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 3;

Output

4

Explanation - Four strings ('tutorialspoint', 'tut', 'tutorial', and 'tuto') share the prefix 'tut' of length 3.

Input

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 8;

Output

2

Explanation - Only 2 strings ('tutorialspoint' and 'tutorial') share the same prefix of length 8, which is 'tutorial'.

Input

strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
 K = 9;

Output

1

Explanation - Only 'tutorialspoint' has at least 9 characters, so no two strings share a prefix of length 9. The prefix 'tutorials' occurs once, so we print 1.
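To make the problem statement concrete, here is a minimal brute-force sketch that can be used to check small inputs by hand. The function name bruteForceMaxStrs is hypothetical and not part of the approaches in this article. It assumes a case-sensitive comparison and treats strings shorter than K as having no prefix of length K.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical reference checker, O(N^2 * K): for every string with at
// least K characters, count how many strings start with the same K characters.
int bruteForceMaxStrs(const std::vector<std::string> &strs, std::size_t K) {
   int best = 0;
   for (const std::string &a : strs) {
      if (a.size() < K) continue; // 'a' has no prefix of length K
      int count = 0;
      for (const std::string &b : strs) {
         // compare() returns 0 when the first K characters of both strings match
         if (b.size() >= K && b.compare(0, K, a, 0, K) == 0) count++;
      }
      best = std::max(best, count);
   }
   return best;
}
```

For the sample array, calling bruteForceMaxStrs(strs, 3) gives 4 and bruteForceMaxStrs(strs, 8) gives 2, matching the first two examples. It is only meant for verifying results, since it compares every pair of strings.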

Approach 1

In this approach, we will use the map data structure to count the frequency of the prefix of length K of each string. At the end, the maximum frequency of any prefix is the answer.

Algorithm

Step 1 - Initialize the 'ans' with 0 to store the count of maximum strings having the common prefix. Also, define the 'pref' map to store the frequency of the prefixes of the string.

Step 2 - Start traversing the array of strings.

Step 3 - Take the substring of length K starting from index 0 of the current string, and store it in the 'temp' string.

Step 4 - Increment the frequency of 'temp' by 1 in the map.

Step 5 - Update 'ans' with the maximum of 'ans' and the frequency of the 'temp' string.

Step 6 - At last, return the 'ans' value.

Example

#include <bits/stdc++.h>
using namespace std;

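int getMaxStrs(const vector<string> &strs, int K) {
   int ans = 0;
   // Map from each prefix of length K to the number of strings starting with it
   map<string, int> pref;
   // Traverse the array of strings
   for (size_t p = 0; p < strs.size(); p++) {
      // A string shorter than K has no prefix of length K, so skip it
      if ((int)strs[p].size() < K) continue;
      // Take the prefix of length K
      string temp = strs[p].substr(0, K);
      // Increment the frequency of this prefix
      pref[temp]++;
      // Keep the maximum frequency seen so far
      ans = max(ans, pref[temp]);
   }
   return ans;
}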
int main() {
   vector<string> strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
   int K = 3;
   cout << "The number of strings having a common prefix of length K is " << getMaxStrs(strs, K);
   return 0;
}

Output

The number of strings having a common prefix of length K is 4

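Time complexity - O(N*K*log N). For each of the N strings, extracting the prefix takes O(K), and each map operation compares keys of length K, which costs O(K*log N).

Space complexity - O(N*K) to store up to N prefixes of length K in the map.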
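An ordered std::map compares string keys on every lookup, which costs O(K*log N) per operation. As a sketch of one possible alternative, the variant below swaps in std::unordered_map, which hashes each prefix once and brings the average cost down to O(N*K). The name getMaxStrsHashed is hypothetical, and the function assumes the same case-sensitive matching as the map version; it also skips strings shorter than K.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical hash-map variant of the map approach (average O(N * K)).
int getMaxStrsHashed(const std::vector<std::string> &strs, std::size_t K) {
   int ans = 0;
   // Prefix of length K -> number of strings that start with it
   std::unordered_map<std::string, int> pref;
   for (const std::string &s : strs) {
      if (s.size() < K) continue; // no prefix of length K exists
      // Increment the count for this prefix and keep the running maximum
      ans = std::max(ans, ++pref[s.substr(0, K)]);
   }
   return ans;
}
```

The worst-case cost of a hash map can still degrade when many keys collide, so this is a trade-off rather than a strict improvement.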

Approach 2

In this approach, we will use the trie data structure to find the maximum number of strings having a common prefix. We will insert the prefix of length K of every string into the trie and track the largest number of strings that reach the same node at depth K.

Algorithm

Step 1 - Define the Node structure for the trie. It contains an array of 26 pointers, one for each alphabetical character reachable from the current node, and a 'cnt' variable initialized with 0, representing the number of strings that pass through the node.

Step 2 - Start traversing each string of the array, and execute the insertNode() function to insert the prefix of length K of the string into the Trie. Also, pass the 'ans' variable by reference to store the maximum number of strings having a common prefix.

Step 3 - In the insertNode() function, initialize the 'temp' node with the 'head' node and traverse the string to insert its prefix in the Trie.

Step 4 - Use the tolower() function to convert the character into lowercase. As a result, this approach compares prefixes case-insensitively.

Step 5 - If the (ch - 'a') index of the temp node's 'chars' array is null, assign a new node to it.

Step 6 - Increment the 'cnt' of the temp->chars[ch - 'a'] node.

Step 7 - If p + 1 is equal to K, we inserted the prefix of length K to the Trie. So, update the 'ans' with maximum value from 'ans' and temp->chars[ch - 'a']->cnt, and break the loop.

Step 8 - Move the temp node to the next node.

Step 9 - At last, print the 'ans' value.

Example

#include <bits/stdc++.h>
using namespace std;

struct Node {
   // One child pointer per lowercase letter, all initially null
   Node *chars[26] = {};
   // Number of inserted strings that pass through this node
   int cnt = 0;
};
Node* head;
// Note: this function assumes every string contains only English letters
void insertNode(const string &str, int K, int &ans) {
   // Temporary node, starting from the root of the Trie
   Node *temp = head;
   // Traverse string characters
   for (int p = 0; p < (int)str.size(); p++) {
      // Change character to lowercase
      char ch = tolower((unsigned char)str[p]);
      // If the node does not exist for the current character, initialize it
      if (temp->chars[ch - 'a'] == NULL) {
         temp->chars[ch - 'a'] = new Node();
      }
      // Count one more string passing through the child node
      temp->chars[ch - 'a']->cnt++;
      // If p + 1 is equal to K, then get the maximum result and break the loop.
      if (p + 1 == K) {
         ans = max(ans, temp->chars[ch - 'a']->cnt);
         break;
      }
      // Go to the next pointer
      temp = temp->chars[ch - 'a'];
   }
}
int main() {
   vector<string> strs = {"tutorialspoint", "tut", "abcd", "tumn", "tutorial", "PQR", "ttus", "tuto"};
   int K = 3;
   int ans = 0;
   // Node initialization
   head = new Node(); 
   // Insert all the strings into Trie
   for (auto str : strs) {
      insertNode(str, K, ans);
   }
   cout << "The number of strings having a common prefix of length K is " << ans;
   return 0;
}

Output

The number of strings having a common prefix of length K is 4

Time complexity - O(N*K), where N is the array length and K is the prefix length.

Space complexity - O(N*K) in the worst case, as at most the first K characters of each string are stored in the Trie.
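The Trie idea can also be written without a global root and without manual new calls. The following is a minimal sketch under a few assumptions: the names TrieNode and getMaxStrsTrie are hypothetical, strings shorter than K or containing a non-letter among their first K characters are skipped, and the count is stored only on the node at depth K instead of on every node along the path.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <memory>
#include <string>
#include <vector>

// Hypothetical RAII trie node: children are owned by unique_ptr, so the
// whole trie is released when the root goes out of scope.
struct TrieNode {
   std::array<std::unique_ptr<TrieNode>, 26> next;
   int cnt = 0; // number of strings whose first K letters end at this node
};

int getMaxStrsTrie(const std::vector<std::string> &strs, std::size_t K) {
   TrieNode root;
   int ans = 0;
   for (const std::string &s : strs) {
      if (s.size() < K) continue; // no prefix of length K exists
      TrieNode *cur = &root;
      bool valid = true;
      for (std::size_t p = 0; p < K; p++) {
         // Map the character to 0..25, ignoring case
         int idx = std::tolower(static_cast<unsigned char>(s[p])) - 'a';
         if (idx < 0 || idx >= 26) { valid = false; break; } // not a letter
         if (!cur->next[idx]) cur->next[idx] = std::make_unique<TrieNode>();
         cur = cur->next[idx].get();
      }
      // Only a fully inserted prefix counts towards the answer
      if (valid) ans = std::max(ans, ++cur->cnt);
   }
   return ans;
}
```

For the sample array, getMaxStrsTrie(strs, 3) returns 4. Because the index is computed after tolower(), "Tut" and "tUT" land on the same node, which matches the case-insensitive behaviour of the Trie approach.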

The first approach is shorter and easier for beginner programmers to understand. The second approach is more complex, but it is a good way to learn the concept of the Trie data structure.

Updated on: 31-Aug-2023
