Minimum K such that every substring of length at least K contains a character c


We are given a string and have to find the minimum length K such that every substring of length at least K contains some common character c. We will look at three approaches to this problem: a naive approach that examines all the substrings, a binary search over the answer, and an efficient linear-time approach that tracks the gaps between consecutive occurrences of each character.

Sample Example

string str = "efabc"
Output: 3

Explanation

For substrings of length 1 or 2, no single character appears in all of them; for example, the substrings 'ef' and 'bc' share no character. Every substring of length 3 or more, however, contains 'a'.
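The explanation can be checked mechanically. Below is a minimal sketch, assuming the input consists of lowercase English letters; the helper name `commonChars` is hypothetical and not part of the article's solutions. It returns the letters that occur in every window of exactly length k. Checking windows of exactly length k is enough, because any longer substring contains such a window.

```cpp
#include <string>

// Hypothetical helper: returns, in alphabetical order, the lowercase letters
// that occur in every window of length k of s (empty if there are none).
std::string commonChars(const std::string& s, int k) {
    std::string result;
    const int n = static_cast<int>(s.size());
    if (k < 1 || k > n) return result;
    for (char c = 'a'; c <= 'z'; ++c) {
        bool inEvery = true;
        for (int i = 0; i + k <= n && inEvery; ++i) {
            // find returns npos when c is absent from the window
            inEvery = s.substr(i, k).find(c) != std::string::npos;
        }
        if (inEvery) result += c;
    }
    return result;
}
```

For "efabc", `commonChars(str, 2)` is empty while `commonChars(str, 3)` is "a", which matches the sample output of 3.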

Naive Approach

In this approach, we generate all the substrings of each length and check whether some character is present in every one of them. The method is simple, but it is slow for long strings.

Example

#include <bits/stdc++.h>
using namespace std; 
// function to find the minimum length of the substring 
int minLen(string str){
   int n = str.length(); // length of the given string
   
   // for each length, count how many substrings contain each letter 
   for(int len = 1; len <= n; len++){
      int freq[26] = {0}; // freq[c] = number of substrings of this length containing c
      for(int i=0; i<=n-len; i++){
         int j = i+len; // one past the end of the current substring 
         int cur[26] = {0}; // letters already counted for this substring
         for(int k = i; k<j; k++){
            if(cur[str[k]-'a']){
               continue;
            }
            
            // counting each letter at most once per substring 
            freq[str[k]-'a']++;  
            cur[str[k]-'a']++; 
         }
      }        
      
      // total number of substrings with this length 
      int scount = n-len+1;
      
      // if some letter occurs in every substring, the current length is the answer 
      for(int i=0; i<26; i++){
         if(freq[i] == scount){
            return len;
         }
      }
   }
   return n;
}

// main function 
int main(){
   string str = "efabc"; // given string    
   
   // calling the function 
   cout<<"The minimum length of the substrings that contains at least a same character is "<<minLen(str)<<endl;
   return 0;
}

Output

The minimum length of the substrings that contains at least a same character is 3

Time and Space Complexity

The time complexity of the above code is O(N^3), where N is the size of the given string, because there are O(N^2) substrings and each one is scanned in O(N) time. The space complexity of the above code is O(1), as we use only fixed-size arrays of 26 entries.
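The cubic bound can be made concrete by counting the steps of the innermost loop. For each length there are N - len + 1 substrings, and each one is scanned in len steps. Since the function returns as soon as a length works, the sum below is an upper bound rather than an exact count:

```latex
\sum_{\mathrm{len}=1}^{N} (N - \mathrm{len} + 1)\cdot \mathrm{len}
  \;=\; \frac{N(N+1)(N+2)}{6}
  \;=\; O(N^3)
```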

Binary Search

In this approach, we binary search over the answer. A length that is valid stays valid when it is increased, so the smallest valid length can be found by bisection. For a candidate length, we traverse the string once for each character and check that no gap between consecutive occurrences of that character, counting the start and the end of the string, is larger than the candidate length.
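Below is a minimal sketch, assuming lowercase input, that tabulates the feasibility test for every length so that the false-then-true pattern the binary search depends on can be seen directly. The names `feasible` and `feasibilityTable` are hypothetical, chosen for illustration only.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true if some lowercase letter occurs in every
// window of length k of s.
bool feasible(const std::string& s, int k) {
    const int n = static_cast<int>(s.size());
    for (char c = 'a'; c <= 'z'; ++c) {
        int prev = -1;   // index of the previous occurrence of c
        bool ok = true;
        for (int i = 0; i < n && ok; ++i) {
            if (s[i] == c) {
                ok = (i - prev <= k);  // gap between consecutive occurrences
                prev = i;
            }
        }
        // the gap from the last occurrence to the end must also fit
        if (ok && n - prev <= k) return true;
    }
    return false;
}

// Entry k-1 of the result holds feasible(s, k) for k = 1..n.
std::vector<bool> feasibilityTable(const std::string& s) {
    std::vector<bool> table;
    for (int k = 1; k <= static_cast<int>(s.size()); ++k) {
        table.push_back(feasible(s, k));
    }
    return table;
}
```

For "aaabaaa" the table is false for length 1 and true for lengths 2 to 7, so the first true entry is the answer.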

Example

#include <bits/stdc++.h>
using namespace std; 
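// function to check whether some character occurs in every substring of length len 
bool check(int len, string str){
   int n = str.length();    
   for(int i=0; i< 26; i++){
      int last = -1; // index of the previous occurrence (-1 = before the string)
      int cur = 'a'+i;
      int j;
      for(j=0; j < n; j++){
         if(str[j] == cur){
            // a gap wider than len fits a substring of length len without cur
            if(j-len > last){
               break;
            }
            else{
               last = j;
            }
         }
      }
      // reject cur if the loop broke early or the gap to the end is too wide
      if(j != n || j-len > last){
         continue;
      }
      else{
         return true;
      }
   }
   return false;
}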

// function to find the minimum length of the substring 
int minLen(string str){
   int n = str.length(); // length of the given string    
   int l = 1, h = n;
   int ans = n; // the full length always works, so ans is never left uninitialized
   while (l <= h){
      int len = (l + h) / 2;
      if (check(len, str)) {
         ans = len;
         h = len - 1;
      }
      else
         l = len + 1;
   }
   return ans;
}

// main function 
int main(){
   string str = "aaabaaa"; // given string    
   // calling the function 
   cout<<"The minimum length of the substrings that contains at least a same character is "<<minLen(str)<<endl;
   return 0;
}

Output

The minimum length of the substrings that contains at least a same character is 2

Time and Space Complexity

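The time complexity of the above code is O(N*log(N)), where N is the size of the given string: each call to check scans the string once for each of the 26 letters, and the binary search makes O(log(N)) calls. The space complexity of the above code is O(1), as we use only a constant number of variables.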

Efficient Approach

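In this approach, we traverse the string once and, for each lowercase English letter, maintain its last occurrence and the largest gap between two consecutive occurrences, treating the start and the end of the string as boundaries. The largest gap of a letter is the smallest length that guarantees that letter, so the answer is the minimum of these values over all letters.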
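To make the per-letter values visible, here is a minimal sketch, assuming lowercase input, that returns the largest gap for all 26 letters instead of only the final minimum. The function name `maxGaps` is hypothetical, and a letter that never occurs gets n + 1, which can never be the minimum for a non-empty string.

```cpp
#include <algorithm>
#include <array>
#include <string>

// Hypothetical helper: gap[c] is the largest distance between consecutive
// occurrences of letter c, with sentinel positions -1 and n at the two ends.
// Assumes s contains only 'a'..'z'.
std::array<int, 26> maxGaps(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::array<int, 26> last;
    last.fill(-1);   // -1 stands for "just before the string"
    std::array<int, 26> gap;
    gap.fill(0);
    for (int i = 0; i < n; ++i) {
        const int c = s[i] - 'a';
        gap[c] = std::max(gap[c], i - last[c]);
        last[c] = i;
    }
    for (int c = 0; c < 26; ++c) {
        // distance from the last occurrence to the end of the string
        gap[c] = std::max(gap[c], n - last[c]);
    }
    return gap;
}
```

For "efabc" the values are a = 3, b = 4, c = 5, e = 5 and f = 4, and the minimum, 3, is the answer.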

Example

#include <bits/stdc++.h>
using namespace std; 
// function to find the minimum length of the substring 
int minLen(string str){
   int last[26]; // array to store the last occurrence of each character
   memset(last,-1,sizeof(last)); // -1 means the character has not been seen yet    
   int gap[26]; // gap[c] = minimum substring length that guarantees character c
   
   // -1 marks a character that has not occurred yet
   int n = str.length();    
   for(int i=0; i<26; i++){
      gap[i] =  -1;
   }   
   
   // traversing over the string to get the answer 
   int ans = n;
   for(int i=0; i<n; i++){
      if(gap[str[i]-'a'] == -1){
         gap[str[i]-'a'] = i+1; // distance from the start of the string
      }
      else{
         gap[str[i]-'a'] = max(gap[str[i]-'a'], i-last[str[i]-'a']);  
      }
      last[str[i]-'a'] = i;
   }
   for(int i=0; i<26; i++){
      // distance to the end of the string; absent characters get n+1 and never win
      gap[i] = max(gap[i], n-last[i]);
      ans = min(ans, gap[i]);
   }
   return ans;
}

// main function 
int main(){
   string str = "efabc"; // given string     
   // calling the function 
   cout<<"The minimum length of the substrings that contains at least a same character is "<<minLen(str)<<endl;
   return 0;
}

Output

The minimum length of the substrings that contains at least a same character is 3

Time and Space Complexity

The time complexity of the above code is O(N), as the string is traversed only once. The space complexity of the above code is O(1), as we use only fixed-size arrays of 26 entries.

Conclusion

In this tutorial, we have implemented code to find the minimum length K such that every substring of length at least K contains a common character. We have implemented three approaches to find the solution. The first is the naive approach, with cubic time complexity. The second is the binary search approach, with O(N*log(N)) time complexity, and the last one is the efficient approach with linear time complexity.

Updated on: 25-Jul-2023
