FuzzyWuzzy Python library


In this tutorial, we are going to learn about the FuzzyWuzzy Python library. The FuzzyWuzzy library was developed to compare two strings. Python has other modules for comparing strings, such as regex and difflib, but FuzzyWuzzy is unique in its own way: instead of returning true, false, or a string, its methods return a score out of 100 that indicates how closely the strings match.

To work with the FuzzyWuzzy library, we have to install fuzzywuzzy and python-Levenshtein. Run the following command to install fuzzywuzzy.

pip install fuzzywuzzy

If you run the above command, you will see the following success message.

Collecting fuzzywuzzy
Downloading https://files.pythonhosted.org/packages/d8/f1/5a267addb30ab7eaa1beab2b9323073815da4551076554ecc890a3595ec9/fuzzywuzzy-0.17.0-py2.py3-none-any.whl
Installing collected packages: fuzzywuzzy
Successfully installed fuzzywuzzy-0.17.0

Run the following command on Linux to install python-Levenshtein.

pip install python-Levenshtein

Run the following command on Windows.

easy_install python-Levenshtein
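
Before moving on, it can help to confirm that both packages are importable. The following is a minimal sketch that uses only the standard library. The helper name is_installed is hypothetical, and the sketch assumes that python-Levenshtein is imported under the module name Levenshtein.

```python
from importlib.util import find_spec


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


# Assumption: python-Levenshtein is importable as "Levenshtein".
for name in ("fuzzywuzzy", "Levenshtein"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either package is reported as missing, rerun the matching install command above.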

fuzz

Now, we will learn about the fuzz module. fuzz is used to compare two strings at a time. It has different methods that return a score out of 100. Let's see some methods of the fuzz module.

fuzz.ratio()

Let's start with ratio, the first method of the fuzz module. It compares two strings and returns a score out of 100. See the examples below to get a clear idea.

Example

# importing the module from the fuzzywuzzy library
from fuzzywuzzy import fuzz

# 100 for identical strings
print(f"Equal Strings:- {fuzz.ratio('tutorialspoint', 'tutorialspoint')}")
# lower scores for slight changes in the strings
print(f"Slight Changed Strings:- {fuzz.ratio('tutorialspoint', 'TutorialsPoint')}")
print(f"Slight Changed Strings:- {fuzz.ratio('tutorialspoint', 'Tutorials Point')}")
# completely different strings
print(f"Different Strings:- {fuzz.ratio('abcd', 'efgh')}")

Output

Equal Strings:- 100
Slight Changed Strings:- 86
Slight Changed Strings:- 83
Different Strings:- 0

Experiment with ratio as much as possible for a better understanding.
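
To see roughly where these scores come from, the sketch below reproduces the idea with difflib from the standard library. It assumes the score is the similarity ratio 2*M/T (M is the number of matched characters, T is the total number of characters in both strings), scaled to 100 and rounded. The helper name simple_ratio is hypothetical, and this is an illustration rather than the library's own code.

```python
from difflib import SequenceMatcher


def simple_ratio(a, b):
    """Return a 0-100 similarity score based on difflib's ratio (2*M/T)."""
    matcher = SequenceMatcher(None, a, b)
    return int(round(100 * matcher.ratio()))


# Identical strings match completely.
print(simple_ratio("tutorialspoint", "tutorialspoint"))
# Two characters differ in case, so the score drops.
print(simple_ratio("tutorialspoint", "TutorialsPoint"))
# No characters in common.
print(simple_ratio("abcd", "efgh"))
```

For 'tutorialspoint' and 'TutorialsPoint', 12 of the 14 characters in each string match, so the ratio is 2*12/28, which rounds to 86.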

fuzz.WRatio()

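fuzz.WRatio() ignores differences between upper and lower case and some extra characters such as punctuation, and it weighs several comparison methods to produce its score. Let's see some examples.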

Example

# importing the module from the fuzzywuzzy library
from fuzzywuzzy import fuzz

# 100 even if one string has extra punctuation characters
print(f"Max Score:- {fuzz.WRatio('tutorialspoint', 'tutorialspoint!!!')}")
# case differences do not lower the score
print(f"Slight Changed Strings:- {fuzz.WRatio('tutorialspoint', 'TutorialsPoint')}")
print(f"Slight Changed Strings:- {fuzz.WRatio('tutorialspoint', 'TUTORIALSPOINT')}")
# completely different strings
print(f"Different Strings:- {fuzz.WRatio('abcd', 'efgh')}")

Output

Max Score:- 100
Slight Changed Strings:- 100
Slight Changed Strings:- 100
Different Strings:- 0

As we can see, WRatio ignores case and some extra characters. Using WRatio instead of the simple ratio scores strings that differ only in case or punctuation as closer matches.
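
The effect of ignoring case and extra characters can be sketched with the standard library alone. The example below assumes a cleanup step that lowercases the text and replaces non-alphanumeric characters with spaces before comparing. The helper names normalize and normalized_score are hypothetical, and the sketch leaves out the weighting of several comparison methods, so it is an illustration and not the library's implementation.

```python
import re
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace runs of non-alphanumeric characters with a space, and trim."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return cleaned.strip()


def normalized_score(a, b):
    """Return a 0-100 similarity score computed on the normalized strings."""
    matcher = SequenceMatcher(None, normalize(a), normalize(b))
    return int(round(100 * matcher.ratio()))


# Trailing punctuation disappears after normalization.
print(normalize("tutorialspoint!!!"))
# Case differences no longer affect the score.
print(normalized_score("tutorialspoint", "TutorialsPoint"))
```

Once both strings are reduced to the same normalized form, any ratio-style comparison scores them as identical, which is why the examples above reach 100.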

Conclusion

If you have any doubts regarding the tutorial, mention them in the comment section.

raja
Published on 23-Oct-2019 11:31:52