How to get synonyms/antonyms from NLTK WordNet in Python
WordNet is a large lexical database of English nouns, verbs, adjectives, and adverbs that is available as a corpus for Python's Natural Language Toolkit (NLTK). Its words are grouped into sets of cognitive synonyms called synsets.
To use WordNet, first install the NLTK module, then download the WordNet corpus.
$ pip3 install nltk
$ python3
>>> import nltk
>>> nltk.download('wordnet')
WordNet groups together words that share the same meaning. Let's explore how to extract synonyms, antonyms, and word details using NLTK WordNet.
Getting Word Information
The first example shows how WordNet returns the definition and other details of a word. When usage examples are available, it returns those as well:
from nltk.corpus import wordnet #Import wordnet from the NLTK
synset = wordnet.synsets("Travel")
print('Word and Type : ' + synset[0].name())
print('Synonym of Travel is: ' + synset[0].lemmas()[0].name())
print('The meaning of the word : ' + synset[0].definition())
print('Example of Travel : ' + str(synset[0].examples()))
Word and Type : travel.n.01
Synonym of Travel is: travel
The meaning of the word : the act of going from one place to another
Example of Travel : ['he enjoyed selling but he hated the travel']
Finding Synonyms and Antonyms
The previous example retrieved detailed information about a single word. Here we will see how WordNet can return the synonyms and antonyms of a given word:
import nltk
from nltk.corpus import wordnet #Import wordnet from the NLTK
synonyms = []
antonyms = []
for synset in wordnet.synsets("bad"):
    for lemma in synset.lemmas():
        synonyms.append(lemma.name()) #add the synonyms
        if lemma.antonyms(): #When antonyms are available, add them into the list
            antonyms.append(lemma.antonyms()[0].name())
print('Synonyms: ' + str(set(synonyms))) # Remove duplicates with set()
print('Antonyms: ' + str(set(antonyms)))
Synonyms: {'uncollectible', 'tough', 'high-risk', 'spoilt', 'bad', 'speculative', 'risky', 'defective', 'regretful', 'big', 'forged', 'sorry', 'spoiled', 'unfit', 'unsound'}
Antonyms: {'good', 'unregretful'}
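The loop above keeps only the first antonym of each lemma and leaves multiword lemma names joined by underscores. The following is a minimal sketch of one possible variation, not part of the original example: it gathers every antonym and replaces underscores with spaces. The helper names `collect_synonyms_antonyms` and `lookup` are hypothetical; `lookup` assumes NLTK and the WordNet corpus are installed and uses the optional `pos` argument of `wordnet.synsets()` to restrict the search to one part of speech.

```python
def collect_synonyms_antonyms(synsets):
    """Gather synonyms and antonyms from an iterable of WordNet synsets.

    Hypothetical helper: it only relies on synset.lemmas(),
    lemma.name() and lemma.antonyms().
    """
    synonyms, antonyms = set(), set()
    for synset in synsets:
        for lemma in synset.lemmas():
            # Multiword lemma names use underscores; show them with spaces.
            synonyms.add(lemma.name().replace("_", " "))
            # Keep every antonym, not just the first one.
            for antonym in lemma.antonyms():
                antonyms.add(antonym.name().replace("_", " "))
    return sorted(synonyms), sorted(antonyms)


def lookup(word, pos=None):
    """Hypothetical convenience wrapper around wordnet.synsets()."""
    # Imported here so the helper above stays usable without NLTK.
    from nltk.corpus import wordnet
    return collect_synonyms_antonyms(wordnet.synsets(word, pos=pos))
```

With the corpus downloaded, a call such as `lookup("bad", pos="a")` would limit the result to adjective senses; the sorted lists also make the output order stable, unlike printing a `set`.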
Measuring Word Similarity
NLTK's WordNet interface can also measure how close two word senses are in meaning. The wup_similarity() method returns the Wu-Palmer similarity score for a pair of synsets; higher values mean the senses are more closely related:
import nltk
from nltk.corpus import wordnet #Import wordnet from the NLTK
first_word = wordnet.synset("travel.v.01")
second_word = wordnet.synset("walk.v.01")
print('Similarity: ' + str(first_word.wup_similarity(second_word)))
first_word = wordnet.synset("good.n.01")
second_word = wordnet.synset("zebra.n.01")
print('Similarity: ' + str(first_word.wup_similarity(second_word)))
Similarity: 0.6666666666666666
Similarity: 0.09090909090909091
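To give a feel for where such scores come from, here is a small sketch of the Wu-Palmer idea on a hand-made taxonomy. The score is twice the depth of the deepest common ancestor divided by the sum of the depths of the two nodes. The `PARENT` table and the function names are invented for illustration and are not WordNet data; NLTK's own implementation works on WordNet's hypernym graph, where a synset can have several paths to the root, so treat this as an approximation of the concept only.

```python
# Toy taxonomy (child -> parent). Hand-made for illustration, not WordNet data.
PARENT = {
    "animal": None,
    "mammal": "animal",
    "bird": "animal",
    "dog": "mammal",
    "cat": "mammal",
    "sparrow": "bird",
}


def path_to_root(node):
    """Return the nodes from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))


def toy_wup_similarity(a, b):
    """Wu-Palmer score: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_of_b = set(path_to_root(b))
    # The first shared node on the way up from `a` is the deepest common ancestor.
    lcs = next(n for n in path_to_root(a) if n in ancestors_of_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))


print(toy_wup_similarity("dog", "cat"))      # siblings under "mammal"
print(toy_wup_similarity("dog", "sparrow"))  # share only the root "animal"
```

In this toy tree the siblings score higher than the pair that meets only at the root, which mirrors why travel/walk scores well above good/zebra in the output above.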
Understanding Synsets
A synset is a set of synonyms that share a common meaning. Each synset has a unique name in the format word.pos.nn where:
- word − the word itself
- pos − part of speech (n=noun, v=verb, a=adjective, s=satellite adjective, r=adverb)
- nn − sense number
from nltk.corpus import wordnet
# Get all synsets for a word
synsets = wordnet.synsets("bank")
for synset in synsets:
    print(f"{synset.name()}: {synset.definition()}")
bank.n.01: sloping land (especially the slope beside a body of water)
bank.n.02: a financial institution that accepts deposits and channels the money into lending activities
bank.v.01: tip laterally
bank.v.02: enclose with a bank
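Because the names printed above follow the word.pos.nn pattern, they can be taken apart with ordinary string handling. The sketch below is a hypothetical helper, not an NLTK function; it splits from the right so that a word containing dots stays intact, and it also maps the tag `s`, which WordNet uses for satellite adjectives.

```python
# Hypothetical lookup table for WordNet's one-letter part-of-speech tags.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "satellite adjective",
    "r": "adverb",
}


def parse_synset_name(name):
    """Split a synset name such as 'bank.n.01' into (word, pos label, sense number)."""
    # rsplit keeps any dots inside the word itself untouched.
    word, pos, sense = name.rsplit(".", 2)
    return word, POS_LABELS.get(pos, pos), int(sense)


for name in ["bank.n.01", "bank.v.02", "travel.n.01"]:
    print(parse_synset_name(name))
```

This works on the name strings alone, so it needs neither NLTK nor the corpus; to go from a name back to the synset object, the article's own wordnet.synset() call is the way to do it.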
Conclusion
NLTK WordNet provides powerful functionality for finding synonyms, antonyms, and word relationships. Use synsets() to look up the senses of a word, lemmas() to list the synonyms within each sense, antonyms() on a lemma to find its opposites, and wup_similarity() to measure the semantic similarity between two synsets.
