Count of pairs of strings which differ in exactly one position


Introduction

A string is composed of alphanumeric characters, each of which is associated with a definite position. The positions of the characters range from 0 to the string length. The characters differing exactly in one position are said to be adjacent.

In this article, we are going to develop a code that takes as input an array of strings which differ in exactly in one position. Let us look at the following example to understand the topic better −

Sample Example

Example 1 − str − {“abc”, “cba”, “dbc” , “acc”}

Output − 2

For instance, in the following example two pairs can be generated {“abc”,”dbc”} and {“abc”,acc”}. These strings differ in only one character position respectively.

In this article, we will develop a code that makes usage of the maps in order to store the similar strings as well as the patterns to fetch the total count of the pairs of strings. C++ maps make usage of the key value pairs in order to store and retrieve the data in constant time complexity.

Syntax

substr()

The substr() method is used to access a substring of the bigger strings from the start to the end-1 position. All the indexes to be accessed should be consecutive and in order.

Parameters −

st − The starting position

end − The ending position to terminate the access of the substring

Algorithm

  • A vector of strings,strings is accepted

  • Initially a counter is maintained to store the count of total pairs satisfying the condition.

  • Two maps are maintained to store the same strings as well as the strings satisfying the pattern keeping a wildcard character. Let us assume this map to be m1.

  • The other map is maintained to store the similar string. Let us assume this map to be m2.

  • An iteration over the input array is performed.

  • Each time the similar type of string is observed, its corresponding count is incremented in the m2 map

  • Substrings are created using substitutions of the respective characters of the string, with wild card character

  • Each time the similar type of pattern is observed, its corresponding count is incremented in the m1 map

  • The summation of strings of pairs observed in m1 and m2 respectively is computed.

  • The count is incremented using these sum values.

Example

The following C++ code snippet is used to take as input an array of strings and compute the total number of pairs differing only in one position −

//including the required libraries
#include <bits/stdc++.h>
using namespace std;
 
// Function to return the count of same pairs
int countPairs(map<string, int> &d) {
   //maintaining the count 
   int sum = 0;
   for (auto i : d)
       sum += (i.second * (i.second - 1)) / 2;
 
   return sum;
}
//called method to calculate strings differing by one character
void chardiff(vector<string> &array, int len , int n ) {
   //count to store pairs
    int cnt = 0;
   //storing strings with wildcard characters
    map<string, int> pattern;
   //storing equivalent strings
    map<string, int> similar;
   //iterating over the array
   for (auto str : array) {
      //if two strings are same , increment the count
       similar[str]+= 1;
 
      // Iterating on a single string
       for (int i = 0; i < len ; i++) {
      // Adding special symbol to the string
       string first = str.substr(0, i);
       string second = str.substr(i + 1);
       string temp =  first + "//" + second ;
 
      //incrementing if similar pattern 
        pattern[temp]+=1;
      }
   }
 
   // Return counted pairs - equal pairs
   int chnged = countPairs(pattern);
   int sim = countPairs(similar);
   cnt =  chnged - sim * len;
   cout << "The count of pairs satisfying the condition are : " << cnt; 
}
 
//calling the main method
int main() {
   int n = 4, len = 3;
   //getting a vector of strings to process
   vector<string> strings = {"get","set","bet","bat"};
   cout << "Vector of strings : " << "\n" ;
   for(auto s : strings){
       cout << s <<"\n";
   }
   //one character different
   chardiff(strings, len , n );
 
   return 0;
}

Output

Vector of strings − 
get
set
bet
bat
The count of pairs satisfying the condition are − 4

Conclusion

Maps simulate the process of insertion and updation of the records in O(1) time complexity. The substring method in C++ can be used to access the characters of the string in order between the specified indexes. The total summation of the pairs uptil any number, n is obtained by the product of n and n-1 divided by 2.

Updated on: 31-Jul-2023

140 Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements