Get similar words suggestion using Enchant in Python
Misspelled words are a common problem whenever we write text. In Python, the PyEnchant module helps us detect and correct them.
This module checks the spelling of words and suggests corrections for misspelled ones. It wraps the Enchant library, which can use several popular spell-checking backends, including ispell, aspell, and MySpell. It is also flexible in handling multiple dictionaries and multiple languages.
For example, if the input is 'prfomnc', the suggestions returned might include 'prominence', 'performance', 'preform', 'provence', 'preferment', and 'proforma'. The exact list depends on the installed dictionary and backend.
Installing PyEnchant
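PyEnchant is installed with pip. On Windows, the pre-built binary packages already include the Enchant C library; on Linux and macOS, the Enchant library usually has to be installed separately through the system package manager −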
pip install pyenchant
Basic Word Checking
The Dict object, which represents a dictionary, is the most important object in the PyEnchant module. Dict objects are used to check the spelling of words and to get suggestions for misspelled words.
Let's see how the d.check() method works −
import enchant
d = enchant.Dict("en_US")
print(d.check("Hello"))
True
Since the word "Hello" is valid, it returns True. Now let's try providing a misspelled word −
import enchant
d = enchant.Dict("en_US")
print(d.check("Helo"))
False
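The same check() call extends from a single word to a whole sentence. The following is a minimal sketch, not part of the PyEnchant API: find_misspelled() is a hypothetical helper that accepts any object with a check() method (such as an enchant.Dict), and the regular expression used to split the text into words is a simplifying assumption −

```python
import re


def find_misspelled(text, checker):
    """Return the words in text that checker.check() rejects, without repeats."""
    misspelled = []
    # Simplified tokenizer (an assumption): runs of ASCII letters,
    # optionally followed by one apostrophe part, as in "don't".
    for word in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text):
        if not checker.check(word) and word not in misspelled:
            misspelled.append(word)
    return misspelled


if __name__ == "__main__":
    try:
        import enchant
    except ImportError:
        print("PyEnchant is not installed; skipping the demo")
    else:
        # Only run the demo when an American English dictionary is available.
        if enchant.dict_exists("en_US"):
            d = enchant.Dict("en_US")
            print(find_misspelled("Helo world, this is a tset", d))
```

For longer documents, the enchant.checker.SpellChecker class that ships with PyEnchant may be a better fit than a hand-written loop like this one.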
Getting Similar Word Suggestions
Calling enchant.Dict() creates a spelling dictionary object. It accepts a language code, in this case "en_US" (American English), and returns a Dict object for spell checking.
The d.check() method tells us whether a word is in the dictionary, and d.suggest() returns a list of possible corrections −
import enchant
d = enchant.Dict("en_US")
word = "prfomnc"
print("Is valid word:", d.check(word))
print("Suggestions:", d.suggest(word))
Is valid word: False
Suggestions: ['performance', 'prominence', 'preform', 'profane', 'profound', 'pro forma']
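One way to put check() and suggest() together is a small correction helper. In the sketch below, best_guess() is a hypothetical name, and the code assumes that the backend lists its most likely correction first, which is typical but not guaranteed −

```python
def best_guess(word, checker):
    """Return word if it is valid, else the first suggestion, else word unchanged."""
    if checker.check(word):
        return word
    suggestions = checker.suggest(word)
    # Assumption: the first suggestion is the backend's preferred correction.
    return suggestions[0] if suggestions else word


if __name__ == "__main__":
    try:
        import enchant
    except ImportError:
        print("PyEnchant is not installed; skipping the demo")
    else:
        # Only run the demo when an American English dictionary is available.
        if enchant.dict_exists("en_US"):
            d = enchant.Dict("en_US")
            for word in ["Hello", "Helo", "prfomnc"]:
                print(word, "->", best_guess(word, d))
```

Returning the original word when there are no suggestions keeps the helper safe to apply to names and technical terms that the dictionary does not know.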
Finding the Most Similar Word
To find the most suitable suggestion, we can iterate through the suggestions and score each one against the misspelled word.
The difflib.SequenceMatcher class compares two sequences, and its ratio() method returns a similarity score between 0 and 1. The suggestion with the highest score is our best match −
import enchant
import difflib
# Initialize English dictionary
d = enchant.Dict("en_US")
# Misspelled word
my_word = "prfomnc"
# Get suggestions from the dictionary
suggestions = set(d.suggest(my_word))
# Track best match
best_match = ""
max_ratio = 0
# Compare similarity of each suggestion
for suggestion in suggestions:
    ratio = difflib.SequenceMatcher(None, my_word, suggestion).ratio()
    if ratio > max_ratio:
        max_ratio = ratio
        best_match = suggestion
print("Best match:", best_match)
print("Similarity ratio:", round(max_ratio, 3))
Best match: performance
Similarity ratio: 0.778
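Instead of keeping only the single best match, it can be useful to see how every suggestion scores. The sketch below uses only the standard library; rank_suggestions() is a hypothetical helper, and the candidate list is copied from the suggestion output shown earlier rather than produced by a live d.suggest() call −

```python
import difflib


def rank_suggestions(word, suggestions):
    """Return (suggestion, ratio) pairs, most similar first."""
    scored = {
        s: difflib.SequenceMatcher(None, word, s).ratio()
        for s in suggestions  # building a dict also removes duplicates
    }
    # Sort by ratio (descending), then alphabetically so ties are deterministic.
    return sorted(scored.items(), key=lambda item: (-item[1], item[0]))


# Suggestion list copied from the earlier example output; a real run
# would pass d.suggest(word) instead.
candidates = ["performance", "prominence", "preform",
              "profane", "profound", "pro forma"]

for suggestion, ratio in rank_suggestions("prfomnc", candidates):
    print(f"{suggestion}: {ratio:.3f}")

# difflib.get_close_matches() is a standard-library shortcut for the top match.
print(difflib.get_close_matches("prfomnc", candidates, n=1, cutoff=0.0))
```

Sorting with an explicit tie-breaker avoids the arbitrary ordering that iterating over a set can produce when two suggestions have the same ratio.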
Additional Enchant Functions
The enchant module provides additional functions for working with dictionaries −
- dict_exists() − Checks whether a dictionary is available for a given language
- request_dict() − Constructs and returns a new Dict object
- list_languages() − Returns the list of languages for which dictionaries are available
import enchant
# Check available languages
print("Available languages:", enchant.list_languages()[:5]) # Show first 5
# Check if a dictionary exists
print("English dictionary exists:", enchant.dict_exists("en_US"))
print("French dictionary exists:", enchant.dict_exists("fr_FR"))
Available languages: ['en', 'en_AU', 'en_CA', 'en_GB', 'en_US']
English dictionary exists: True
French dictionary exists: True
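Because the available dictionaries differ from one machine to another, the results above will vary, and it is safer to test for a dictionary before requesting it. The following sketch picks the first installed dictionary from a list of preferred languages; first_available() is a hypothetical helper, and the language tags in the demo are only examples −

```python
def first_available(tags, dict_exists):
    """Return the first language tag for which dict_exists(tag) is true, else None."""
    for tag in tags:
        if dict_exists(tag):
            return tag
    return None


if __name__ == "__main__":
    try:
        import enchant
    except ImportError:
        print("PyEnchant is not installed; skipping the demo")
    else:
        # Preferred languages in order; the tags are illustrative.
        tag = first_available(["fr_FR", "en_GB", "en_US"], enchant.dict_exists)
        if tag is None:
            print("None of the preferred dictionaries is installed")
        else:
            d = enchant.request_dict(tag)
            print("Using dictionary:", d.tag)
```

Passing enchant.dict_exists in as an argument keeps the selection logic independent of PyEnchant, so it can be exercised with any function that maps a language tag to True or False.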
Conclusion
PyEnchant provides an easy way to check spelling and get word suggestions in Python. Use d.check() to validate a word and d.suggest() to get candidate corrections, and combine the suggestions with difflib.SequenceMatcher() to pick the most similar word automatically.
