Minimize hamming distance in Binary String by setting only one K size substring bits


The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters differ. Consider the following example:

S = "ramanisgoing"
T = "dishaisgoing"

Here, the Hamming distance between S and T is 5: the strings differ in each of their first five positions ("raman" versus "disha") and agree everywhere else.
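The definition translates directly into a few lines of code. The following is a minimal sketch in Python; the function name `hamming_distance` is an illustrative choice and does not come from the article.

```python
def hamming_distance(s, t):
    """Count the positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("Hamming distance requires strings of equal length")
    # zip pairs up the characters position by position
    return sum(1 for a, b in zip(s, t) if a != b)


print(hamming_distance("ramanisgoing", "dishaisgoing"))  # prints 5
```

The length check is an added safeguard: the distance is only defined for strings of equal length, so the sketch raises an error instead of silently comparing a prefix.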

Problem Statement

In this problem, both strings are binary. We are given a binary string S and an integer K. A second string T has the same length as S and initially contains only '0' bits. We may choose one substring of T of length K, at any position, and set all of its bits to '1'. The task is to find the minimum Hamming distance between S and T that can be reached this way. As the examples below show, leaving T as all zeros also counts as a candidate.

Let’s try to understand this problem with the help of some examples.

Input

S = "100111"   K = 5

Output

3

Explanation

Initially T = "000000". With K = 5, the substring of ones can be placed at two positions, which turns T into "111110" or "011111".

The Hamming distance between 100111 and 000000 is 4. Between 100111 and 111110 it is 3, and between 100111 and 011111 it is also 3.

The minimum of these values is 3, so the answer is 3.

Input

S = "100101"   K = 5

Output

3

Explanation

Initially T = "000000". With K = 5, the substring of ones can be placed at two positions, which turns T into "111110" or "011111".

The Hamming distance between 100101 and 000000 is 3. Between 100101 and 111110 it is 4, and between 100101 and 011111 it is also 4.

Here the smallest value, 3, comes from leaving T as all zeros, so the answer is 3.
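The candidate strings in both examples can be listed mechanically. The sketch below is in Python and assumes, as the examples do, that the all-zero T counts as a candidate alongside one string per window position. The function name `candidate_distances` is hypothetical.

```python
def candidate_distances(s, k):
    """Return every candidate T together with its Hamming distance to s."""
    n = len(s)
    # T left as all zeros, followed by one candidate per window position
    candidates = ["0" * n]
    for start in range(n - k + 1):
        candidates.append("0" * start + "1" * k + "0" * (n - start - k))
    return [(t, sum(a != b for a, b in zip(s, t))) for t in candidates]


for s in ("100111", "100101"):
    print(s, candidate_distances(s, 5))
```

For "100111" the listed distances are 4, 3, and 3, and for "100101" they are 3, 4, and 4, which matches the two explanations above.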

Problem Explanation

Let’s try to understand the problem and find its solution.

Approach-1: Brute Force Solution

We start with the Hamming distance between S and the all-zero string T. Then, for every possible starting position of the K-sized substring, we build the corresponding T, compute its Hamming distance to S, and keep the smallest distance found.

Example

The following are implementations of the brute force solution in C, C++, Java, and Python, in that order −

C

#include <stdio.h>
#include <string.h>
// Function to get minimum hamming distance through iteration
int helper(char* S, int k) {
   // n is the size of the string
   int n = strlen(S);
   // Take another string T and initiate it with zero bits size equal to that of S
   char T[n+1];
   memset(T, '0', sizeof(T));
   T[n] = '\0';
   // Take another string v to initiate it same as T
   char v[n+1];
   memset(v, '0', sizeof(v));
   v[n] = '\0';
   // Define mini as the hamming distance between T and S
   int mini = 0;
   int l = 0;
   while (l < n) {
      if (S[l] != T[l])
         mini++;
      l++;
   }
   for (int i = 0; i < n - k + 1; i++) {
      int j = 0, a = 0, l = 0;
      // Alter string v by changing bits of size k
      while (j < k) {
         v[j + i] = '1';
         j++;
      }
      // Calculate hamming distance
      while (l < n) {
         if (S[l] != v[l])
            a++;
         l++;
      }
      // Check if the previous hamming distance is greater than the current hamming distance, if yes, then replace that distance element
      if (mini > a) {
         mini = a;
      }
      // Again assign v as the T string
      strcpy(v, T);
   }
   // Return the minimum hamming distance found through the above iterations
   return mini;
}
int main() {
   // Give input string S
   char S[] = "100101";
   // Give the value of k that is the substring size
   int K = 5;
   // Call the helper function
   printf("The minimum hamming distance is: %d\n", helper(S, K));
   return 0;
}

Output

The minimum hamming distance is: 3

C++

#include <bits/stdc++.h>
using namespace std;
// Make a function to get minimum hamming distance through iteration
int helper(string S,int k){
   // n is the size of the string
   int n=S.size();
   // Take another string T and initiate it with zero bits size equal to that of S
   string T;
   for(int i=0; i<n; i++) {
      T+="0";
   }
   // Take another string v to initiate it same as T
   string v=T;
   // Define mini as the hamming distance between T and S
   int mini=0;
   int l=0;
   while(l<n) {
      if(S[l]!=T[l])mini++;
      l++;
   }
   for(int i=0; i<n-k+1; i++) {
      int j=0,a=0,l=0;
      // alter string v by changing bits of size k
      while(j<k) {
         v[j+i]='1';
         j++;
      }
      // calculate hamming distance
      while(l<n) {
         if(S[l]!=v[l])a++;
         l++;
      }
      // Check if the previous hamming distance is greater than the current hamming distance, if yes then replace that distance element
      if(mini>a) {
         mini=a;
      }
      // Again assign v as the T string
      v=T;
   }
   // return the minimum hamming distance found through the above iterations
   return mini;
}
int main(){
   // Give input string S
   string S = "100101";
   // Give the value of k that is the substring size
   int K = 5;
   // Call the helper function
   cout << "The minimum hamming distance is: "<< helper(S,K);
   return 0;
}

Output

The minimum hamming distance is: 3

Java

public class Main {
   // Function to get minimum hamming distance through iteration
   static int helper(String S, int k) {
      // n is the size of the string
      int n = S.length();
      // Take another string T and initiate it with zero bits size equal to that of S
      StringBuilder T = new StringBuilder();
      for (int i = 0; i < n; i++) {
         T.append('0');
      }
      // Take another string v to initiate it same as T
      StringBuilder v = new StringBuilder(T);
      // Define mini as the hamming distance between T and S
      int mini = 0;
      int l = 0;
      while (l < n) {
         if (S.charAt(l) != T.charAt(l))
            mini++;
         l++;
      }
      for (int i = 0; i < n - k + 1; i++) {
         int j = 0, a = 0;
         // Alter string v by changing bits of size k
         while (j < k) {
            v.setCharAt(j + i, '1');
            j++;
         }
         // Calculate hamming distance
         l = 0; // Reinitialize l
         while (l < n) {
            if (S.charAt(l) != v.charAt(l))
               a++;
            l++;
         }
         // Check if the previous hamming distance is greater than the current hamming distance, if yes, then replace that distance element
         if (mini > a) {
            mini = a;
         }
         // Again assign v as the T string
         v = new StringBuilder(T);
      }
      // Return the minimum hamming distance found through the above iterations
      return mini;
   }

   public static void main(String[] args) {
      // Give input string S
      String S = "100101";
      // Give the value of k that is the substring size
      int K = 5;
      // Call the helper function
      System.out.println("The minimum hamming distance is: " + helper(S, K));
   }
}

Output

The minimum hamming distance is: 3

Python

def helper(S, k):
   n = len(S)
   # Take another string T and initiate it with zero bits size equal to that of S
   T = '0' * n
   # Take another string v to initiate it same as T
   v = T
   # Define mini as the hamming distance between T and S
   mini = 0
   l = 0
   while l < n:
      if S[l] != T[l]:
         mini += 1
      l += 1
   for i in range(n - k + 1):
      j, a, l = 0, 0, 0
      # Alter string v by changing bits of size k
      while j < k:
         v = v[:j + i] + '1' + v[j + i + 1:]
         j += 1
      # Calculate hamming distance
      while l < n:
         if S[l] != v[l]:
            a += 1
         l += 1
      # Check if the previous hamming distance is greater than the current hamming distance, if yes, then replace that distance element
      if mini > a:
         mini = a
      # Again assign v as the T string
      v = T
   # Return the minimum hamming distance found through the above iterations
   return mini

# Give input string S
S = "100101"
# Give the value of k that is the substring size
K = 5
# Call the helper function
print("The minimum hamming distance is:", helper(S, K))

Output

The minimum hamming distance is: 3

Approach-2: Optimized Solution

Algorithm

  • Build a prefix sum array arr, where arr[i] is the number of 1's in S[0..i]. The total number of 1's, cnt = arr[n-1], is the Hamming distance to the all-zero T and serves as the initial minimum.

  • Traverse every starting index i of a K-sized window in S, that is, i from 0 to n-K.

  • Let v be the number of 1's inside the window: if i-1 < 0, take v as arr[i+K-1]; otherwise, take v as (arr[i+K-1] - arr[i-1]).

  • The Hamming distance for this window is (cnt - v) + (K - v). Here (cnt - v) counts the 1's of S outside the window, where T holds 0, and (K - v) counts the 0's of S inside the window, where T holds 1.

  • Update the minimum with the smaller of the previous minimum and the current distance.

  • Finally, return the overall minimum distance.
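To make the formula concrete, the steps above can be traced for a single input. The following Python sketch is illustrative only: the function name `window_trace` and its tuple layout are assumptions, and it expects a non-empty string S with 1 <= K <= n.

```python
def window_trace(s, k):
    """Return (start, ones_in_window, distance) for each window position."""
    n = len(s)
    # prefix[i] holds the number of '1' characters in s[0..i]
    prefix = []
    running = 0
    for ch in s:
        running += ch == "1"
        prefix.append(running)
    cnt = prefix[-1]  # total number of ones in s
    rows = []
    for i in range(n - k + 1):
        v = prefix[i + k - 1] - (prefix[i - 1] if i > 0 else 0)
        # (cnt - v): ones of s outside the window, facing zeros in T
        # (k - v):   zeros of s inside the window, facing ones in T
        rows.append((i, v, (cnt - v) + (k - v)))
    return rows


print(window_trace("100111", 5))  # prints [(0, 3, 3), (1, 3, 3)]
```

For S = "100111" the prefix array is [1, 1, 1, 2, 3, 4], so cnt = 4. Both windows contain three 1's, and each gives (4 - 3) + (5 - 3) = 3, which agrees with the first example.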

Example

The following are implementations of the optimized solution in C, C++, Java, and Python, in that order −

C

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
// Make a helper function to get minimum hamming distance through iteration
int helper(char *S, int K){
   // n is the size of the string
   int n = strlen(S);
   // initialize an array of size 'n'
   int arr[n];
   if(S[0]=='0')arr[0]=0;
   else arr[0]=1;
   // Count the number of 1's in the string S
   for (int i = 1; i < n; i++) {
      if(S[i]=='0')arr[i]=arr[i-1];
      else arr[i]=arr[i-1]+1;
   }
   int cnt = arr[n - 1];
   // Define mini as the hamming distance between T and S
   int mini = cnt;
   // Traverse every window start i = 0 .. n-K (n-K+1 windows in total)
   for (int i = 0; i <= n - K; i++) {
      int v;
      if(i-1==-1)v=arr[i+K-1];
      else v= arr[i+K-1]-arr[i-1];
      // Store the minimum
      if (cnt - v + (K - v) < mini)
         mini = cnt - v + (K - v);
   }
   // Return the minimum hamming distance
   return mini;
}
int main(){
   // Give input string S
   char *S = "100101";
   // Give the value of k that is the substring size
   int K = 5;
   // Call the helper function
   printf("The minimum hamming distance is: %d", helper(S,K));
   return 0;
}

Output

The minimum hamming distance is: 3

C++

#include <bits/stdc++.h>
using namespace std;
// Make a helper function to get minimum hamming distance through iteration
int helper(string S, int K){
   // n is the size of the string
   int n = S.size();
   // Prefix-count array of size n (std::vector, since variable-length arrays are not standard C++)
   vector<int> arr(n);
   if(S[0]=='0')arr[0]=0;
   else arr[0]=1;
   // Count the number of 1's in the string S
   for (int i = 1; i < n; i++) {
      if(S[i]=='0')arr[i]=arr[i-1];
      else arr[i]=arr[i-1]+1;
   }
   int cnt = arr[n - 1];
   // Define mini as the hamming distance between T and S
   int mini = cnt;
   // Traverse every window start i = 0 .. n-K (n-K+1 windows in total)
   for (int i = 0; i <= n - K; i++) {
      int v;
      if(i-1==-1)v=arr[i+K-1];
      else v= arr[i+K-1]-arr[i-1];
      // Store the minimum
      mini = min(mini, cnt - v + (K - v));
   }
   // Return the minimum hamming distance
   return mini;
}
int main(){
   // Give input string S
   string S = "100101";
   // Give the value of k that is the substring size
   int K = 5;
   // Call the helper function
   cout << "The minimum hamming distance is: "<< helper(S,K);
   return 0;
}

Output

The minimum hamming distance is: 3

Java

public class Main {
   // Make a helper function to get the minimum hamming distance through iteration
   static int helper(String S, int K) {
      // n is the size of the string
      int n = S.length();
      
      // Initialize an array of size 'n'
      int[] arr = new int[n];
      if (S.charAt(0) == '0') {
         arr[0] = 0;
      } else {
         arr[0] = 1;
      }
      // Count the number of 1's in the string S
      for (int i = 1; i < n; i++) {
         if (S.charAt(i) == '0') {
            arr[i] = arr[i - 1];
         } else {
            arr[i] = arr[i - 1] + 1;
         }
      }
      int cnt = arr[n - 1];
      // Define mini as the hamming distance between T and S
      int mini = cnt;
      
      // Traverse every window start i = 0 .. n-K (n-K+1 windows in total)
      for (int i = 0; i <= n - K; i++) {
         int v;
         if (i - 1 == -1) {
            v = arr[i + K - 1];
         } else {
            v = arr[i + K - 1] - arr[i - 1];
         }

         // Store the minimum
         mini = Math.min(mini, cnt - v + (K - v));
      }
      // Return the minimum hamming distance
      return mini;
   }
   public static void main(String[] args) {
      // Give input string S
      String S = "100101";
      // Give the value of K, which is the substring size
      int K = 5;
      // Call the helper function
      System.out.println("The minimum hamming distance is: " + helper(S, K));
   }
}

Output

The minimum hamming distance is: 3

Python

def helper(S, K):
   # n is the size of the string
   n = len(S)

   # Initialize an array of size 'n'
   arr = [0] * n
   # Initialize the first element of the array based on the first character of S
   if S[0] == '0':
      arr[0] = 0
   else:
      arr[0] = 1
   # Count the number of 1's in the string S
   for i in range(1, n):
      if S[i] == '0':
         arr[i] = arr[i - 1]
      else:
         arr[i] = arr[i - 1] + 1

   # Calculate the total count of 1's in the string
   cnt = arr[n - 1]

   # Initialize mini as the total count
   mini = cnt
   # Traverse every window start i = 0 .. n-K (n-K+1 windows in total)
   for i in range(n - K + 1):
      v = 0
      # Calculate the difference in counts for the sliding window
      if i - 1 == -1:
         v = arr[i + K - 1]
      else:
         v = arr[i + K - 1] - arr[i - 1]
      # Update mini with the minimum hamming distance
      mini = min(mini, cnt - v + (K - v))

   # Return the minimum hamming distance
   return mini
# Input string S
S = "100101"
# Value of K, which is the substring size
K = 5
# Call the helper function and display the result
print("The minimum hamming distance is:", helper(S, K))

Output

The minimum hamming distance is: 3
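One way to gain confidence that the two approaches are equivalent is to compare them on random inputs. The sketch below is a hedged illustration in Python, not the article's code: `brute_force` and `prefix_sum` are compact, hypothetical re-implementations, both assume that all n - K + 1 window positions are considered, and `prefix_sum` uses a prefix array of length n + 1 (shifted by one) to avoid the special case for i = 0.

```python
import random


def brute_force(s, k):
    """Try T unchanged and every window position, comparing character by character."""
    n = len(s)
    best = s.count("1")  # distance when T stays all zeros
    for start in range(n - k + 1):
        t = "0" * start + "1" * k + "0" * (n - start - k)
        best = min(best, sum(a != b for a, b in zip(s, t)))
    return best


def prefix_sum(s, k):
    """Same answer via prefix sums of the ones in s."""
    n = len(s)
    prefix = [0] * (n + 1)  # prefix[i] = number of ones in s[:i]
    for i, ch in enumerate(s):
        prefix[i + 1] = prefix[i] + (ch == "1")
    cnt = prefix[n]
    best = cnt
    for start in range(n - k + 1):
        v = prefix[start + k] - prefix[start]  # ones inside the window
        best = min(best, (cnt - v) + (k - v))
    return best


random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    n = random.randint(1, 12)
    k = random.randint(1, n)
    s = "".join(random.choice("01") for _ in range(n))
    assert brute_force(s, k) == prefix_sum(s, k), (s, k)
print("both approaches agree on 1000 random inputs")
```

A check like this is useful for catching off-by-one errors in the window loop. For example, with S = "0111" and K = 3 the best window is the last one, so a loop that stops one position early would report 2 instead of 0.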

Conclusion

In this article, we first solved the problem with a naive approach that rebuilds T and recomputes the Hamming distance for every window position. We then improved the time complexity with a prefix sum array, which gives the number of 1's in any window in constant time, so the answer is found in a single linear pass over the string instead of a nested loop.

Updated on: 22-Jan-2024
