Python Program to Check for Almost Similar Strings


Strings in Python are sequences of characters enclosed in quotes. Checking whether two strings are almost similar means measuring how closely they match, which is useful for tasks such as spell checking and approximate string matching. General-purpose techniques include Levenshtein distance and fuzzy matching. This article uses a simpler, frequency-based definition: two strings are almost similar if, for every character, the number of occurrences in the two strings differs by at most k.

In this article, we will learn a Python program to check for almost similar strings.

Demonstration

Assume we have taken two input strings and a value k.

Input

Input string 1:  aazmdaa
Input string 2:  aqqaccd
k: 2

Output

Checking whether both strings are similar:  True

In this example, 'a' occurs 4 times in string 1 and 2 times in string 2, a difference of 4 - 2 = 2, which is within k. The frequency difference of every other character is also within k, so the result is True.
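The per-character comparison described above can be reproduced with a short sketch. It uses str.count() rather than either method covered later in this article, and the names s1, s2, and diffs are chosen here for illustration only.

```python
# Illustrative sketch: per-character frequency differences for the demo input
s1, s2, k = "aazmdaa", "aqqaccd", 2

# absolute difference in occurrences for every character seen in either string
diffs = {ch: abs(s1.count(ch) - s2.count(ch)) for ch in sorted(set(s1) | set(s2))}
print(diffs)  # {'a': 2, 'c': 2, 'd': 0, 'm': 1, 'q': 2, 'z': 1}

# the strings are almost similar when no difference exceeds k
print(all(d <= k for d in diffs.values()))  # True
```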

Methods Used

The following are the various methods to accomplish this task:

  • Using for loop, ascii_lowercase, dictionary comprehension, and abs() functions

  • Using Counter() and max() functions

Using for loop, ascii_lowercase, dictionary comprehension, and abs() functions

In this method, we use a simple for loop, ascii_lowercase, dictionary comprehension, and the abs() function to check whether two strings are almost similar.

Dictionary Comprehension Syntax

{key_expression: value_expression for item in iterable}

Dictionary comprehension is a compact and concise method in Python to create dictionaries by iterating over an iterable and defining key-value pairs based on expressions, allowing for efficient and readable code.
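As a minimal illustration of the syntax, the sketch below builds the zero-initialized letter dictionary that this method relies on, followed by a comprehension whose value depends on the item. The variable names word and counts are illustrative only.

```python
from string import ascii_lowercase

# one key per lowercase letter, every value starting at 0
frequency = {c: 0 for c in ascii_lowercase}
print(len(frequency))   # 26
print(frequency["a"])   # 0

# the value expression can depend on the item: character -> occurrences
word = "banana"
counts = {c: word.count(c) for c in set(word)}
print(counts["a"], counts["n"], counts["b"])  # 3 2 1
```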

abs() Function Syntax

abs(number)

The abs() function in Python returns the absolute value of a number, which is the numerical value without considering its sign. It is useful for obtaining the magnitude or distance from zero of a given number.
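For this task, abs() matters because the frequency difference can be negative depending on which string is subtracted from which. A small sketch using the count of 'a' in the demonstration strings (the variable names are illustrative):

```python
# occurrences of 'a' in each demonstration string
count_1 = "aazmdaa".count("a")  # 4
count_2 = "aqqaccd".count("a")  # 2

# the order of subtraction does not matter once abs() is applied
print(abs(count_1 - count_2))  # 2
print(abs(count_2 - count_1))  # 2
```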

Algorithm (Steps)

The following steps perform the desired task:

  • Use the import keyword to import ascii_lowercase from the string module.

  • Create a function findFrequency() that returns the frequency of characters of string by accepting input string as an argument

  • Take a dictionary and fill it with all lowercase alphabets as keys and values as 0.

  • Use the for loop to traverse through the input string.

  • Increment the frequency of the current character by 1.

  • Return the frequency of characters.

  • Create a variable to store the input string 1.

  • Create another variable to store the input string 2.

  • Print both the input strings.

  • Create another variable to store the input k value

  • Call the findFrequency() function, passing input string 1 as an argument, to get the frequency of its characters.

  • Similarly, get the frequency of characters of input string 2.

  • Initialize the result as True.

  • Use the for loop to traverse through the lowercase alphabets.

  • Use the if conditional statement with the abs() function (returns the absolute value of a number) to check whether the absolute difference between the frequencies of the current character in the two strings is greater than k.

  • Update the result to False if the condition is true.

  • Break the loop, since one failing character is enough to decide the result.

  • Print the result.

Example

The following program checks whether the given strings are almost similar using a for loop, ascii_lowercase, dictionary comprehension, and the abs() function.

# importing ascii_lowercase from the string module
from string import ascii_lowercase
# creating a function that returns the frequency of characters
# of a string by accepting the input string as an argument
# (assumes the string contains only lowercase ASCII letters;
# any other character raises a KeyError)
def findFrequency(inputString):
    # Creating a dictionary with all lowercase letters as keys
    # and 0 as every value
    frequency = {c: 0 for c in ascii_lowercase}
    # Traversing the given string
    for c in inputString:
        # Incrementing the character frequency by 1
        frequency[c] += 1
    # returning the frequency of characters
    return frequency

# input string 1
inputString_1 = 'aazmdaa'
# input string 2
inputString_2 = "aqqaccd"
# printing the input strings
print("Input string 1: ", inputString_1)
print("Input string 2: ", inputString_2)
# input K value
K = 2
# getting the frequency of characters of input string 1
# by calling the above findFrequency() function
stringFrequency1 = findFrequency(inputString_1)
# getting the frequency of characters of input string 2
stringFrequency2 = findFrequency(inputString_2)
# Initializing the result as True
result = True
# traversing through all the lowercase characters
for c in ascii_lowercase:
    # checking whether the absolute difference of the frequency
    # of the current character in both strings is greater than K
    if abs(stringFrequency1[c] - stringFrequency2[c]) > K:
        # updating False to the result if the condition is true
        result = False
        # break the loop
        break
# printing the result
print("Checking whether both strings are similar: ", result)

Output

On executing, the above program will generate the following output

Input string 1:  aazmdaa
Input string 2:  aqqaccd
Checking whether both strings are similar:  True
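The same logic can also be packaged as a reusable function. The sketch below is one possible variant and not part of the original program: the name are_almost_similar is hypothetical, and it counts with dict.get() so that characters outside a-z do not raise a KeyError.

```python
def are_almost_similar(string_1, string_2, k):
    """Return True if no character's frequency differs by more than k."""
    def find_frequency(text):
        frequency = {}
        for c in text:
            # get() supplies 0 for characters not seen yet
            frequency[c] = frequency.get(c, 0) + 1
        return frequency

    freq_1 = find_frequency(string_1)
    freq_2 = find_frequency(string_2)
    # compare every character that appears in either string
    for c in set(freq_1) | set(freq_2):
        if abs(freq_1.get(c, 0) - freq_2.get(c, 0)) > k:
            return False
    return True

print(are_almost_similar("aazmdaa", "aqqaccd", 2))  # True
print(are_almost_similar("aaaa", "b", 2))           # False ('a' differs by 4)
```

Unlike the script above, this variant is case-sensitive and treats every character, including spaces and punctuation, as something to count.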

Using Counter() and max() functions

In this method, we use a combination of Counter() and max() to check whether the two strings are almost similar.

Counter() function: Counter is a dict subclass from the collections module that counts hashable objects. When called with an iterable, it builds a mapping from each distinct element to the number of times it occurs.

counter_object = Counter(iterable)
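Two Counter behaviours matter for this method: a missing key reads as 0, and subtracting one Counter from another keeps only the positive differences. A short sketch with the demonstration strings (the names counts_1 and counts_2 are illustrative):

```python
from collections import Counter

counts_1 = Counter("aazmdaa")
counts_2 = Counter("aqqaccd")
print(counts_1["a"])  # 4
print(counts_1["q"])  # 0, missing keys count as zero instead of raising KeyError

# subtraction keeps only the characters with a positive surplus
print(counts_1 - counts_2)  # Counter({'a': 2, 'z': 1, 'm': 1})
print(counts_2 - counts_1)  # Counter({'q': 2, 'c': 2})
```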

Algorithm (Steps)

The following steps perform the desired task:

  • Use the import keyword to import the Counter class from the collections module.

  • Create variables to store input string 1, input string 2, and the k value, and print both strings.

  • Use the lower() function (converts all uppercase characters in a string to lowercase) to convert input string 1 into lowercase, then use Counter() to get the frequency of its characters.

  • In the same way, get the frequency of characters of input string 2 by converting it into lowercase first.

  • Initialize the result as True.

  • Subtract one Counter from the other in both directions. Each difference keeps only the characters that occur more often in the first operand, together with their surplus counts.

  • Use the max() function (returns the largest item in an iterable) to get the largest surplus in each direction, and use the if conditional statement to check whether either one is greater than k.

  • Update the result to False if the condition is true.

  • Print the result.

Example

The following program checks whether the given strings are almost similar using the Counter() and max() functions.

# importing Counter from the collections module
from collections import Counter
# input string 1
inputString_1 = 'aazmdaa'
# input string 2
inputString_2 = "aqqaccd"
# printing the input strings
print("Input string 1: ", inputString_1)
print("Input string 2: ", inputString_2)
# input K value
K = 2
# converting the input string 1 into lowercase and then
# getting the frequency of characters of input string 1
strFrequency_1 = Counter(inputString_1.lower())
# converting the input string 2 into lowercase and then
# getting the frequency of characters of input string 2
strFrequency_2 = Counter(inputString_2.lower())
# Initializing the result as True
result = True
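# Checking whether the strings are similar or not.
# Counter subtraction keeps only positive counts, so a difference can be
# empty; default=0 stops max() from raising ValueError in that case
if (max((strFrequency_1 - strFrequency_2).values(), default=0) > K
        or max((strFrequency_2 - strFrequency_1).values(), default=0) > K):
    # updating False to the result if the condition is true
    result = False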
# printing the result
print("Checking whether both strings are similar: ", result)

Output

On executing, the above program will generate the following output

Input string 1:  aazmdaa
Input string 2:  aqqaccd
Checking whether both strings are similar:  True
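One edge case is worth noting: Counter subtraction discards zero and negative counts, so a difference can be empty (for example, when both strings are identical), and max() raises a ValueError on an empty iterable unless a default is supplied. The sketch below is one way to guard against that. The function name almost_similar_counter is hypothetical, and it adds the two differences into a single Counter rather than calling max() twice.

```python
from collections import Counter

def almost_similar_counter(string_1, string_2, k):
    freq_1 = Counter(string_1.lower())
    freq_2 = Counter(string_2.lower())
    # surplus counts in each direction, added into a single Counter
    difference = (freq_1 - freq_2) + (freq_2 - freq_1)
    # default=0 covers the case where the difference is empty
    return max(difference.values(), default=0) <= k

print(almost_similar_counter("aazmdaa", "aqqaccd", 2))  # True
print(almost_similar_counter("abc", "abc", 0))          # True, empty difference
print(almost_similar_counter("aaaa", "b", 2))           # False
```

Because a character can have a surplus in only one direction, the combined Counter holds the absolute frequency difference for each character.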

Conclusion

In this article, we learned two different methods to check for almost similar strings. The first iterates through the lowercase alphabet and compares character frequencies stored in a dictionary. The second uses Counter() to count the characters of each string and max() to find the largest frequency difference.

Updated on: 17-Aug-2023
