 
- Natural Language Toolkit - Home
- Natural Language Toolkit - Introduction
- Natural Language Toolkit - Getting Started
- Natural Language Toolkit - Tokenizing Text
- Training Tokenizer & Filtering Stopwords
- Looking up words in Wordnet
- Stemming & Lemmatization
- Natural Language Toolkit - Word Replacement
- Synonym & Antonym Replacement
- Corpus Readers and Custom Corpora
- Basics of Part-of-Speech (POS) Tagging
- Natural Language Toolkit - Unigram Tagger
- Natural Language Toolkit - Combining Taggers
- Natural Language Toolkit - More NLTK Taggers
- Natural Language Toolkit - Parsing
- Chunking & Information Extraction
- Natural Language Toolkit - Transforming Chunks
- Natural Language Toolkit - Transforming Trees
- Natural Language Toolkit - Text Classification
- Natural Language Toolkit Resources
- Natural Language Toolkit - Quick Guide
- Natural Language Toolkit - Useful Resources
- Natural Language Toolkit - Discussion
Synonym & Antonym Replacement
Replacing words with common synonyms
While working with NLP, especially in the case of frequency analysis and text indexing, it is always beneficial to compress the vocabulary without losing meaning because it saves lots of memory. To achieve this, we must have to define mapping of a word to its synonyms. In the example below, we will be creating a class named word_syn_replacer which can be used for replacing the words with their common synonyms.
Example
First, import the necessary package re to work with regular expressions.
import re from nltk.corpus import wordnet
Next, create the class that takes a word replacement mapping −
class word_syn_replacer(object): def __init__(self, word_map): self.word_map = word_map def replace(self, word): return self.word_map.get(word, word)
Save this python program (say replacesyn.py) and run it from python command prompt. After running it, import word_syn_replacer class when you want to replace words with common synonyms. Let us see how.
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer ({bday: birthday)
rep_syn.replace(bday)
Output
'birthday'
Complete implementation example
import re from nltk.corpus import wordnet class word_syn_replacer(object): def __init__(self, word_map): self.word_map = word_map def replace(self, word): return self.word_map.get(word, word)
Now once you saved the above program and run it, you can import the class and use it as follows −
from replacesyn import word_syn_replacer
rep_syn = word_syn_replacer ({bday: birthday)
rep_syn.replace(bday)
Output
'birthday'
The disadvantage of the above method is that we should have to hardcode the synonyms in a Python dictionary. We have two better alternatives in the form of CSV and YAML file. We can save our synonym vocabulary in any of the above-mentioned files and can construct word_map dictionary from them. Let us understand the concept with the help of examples.
Using CSV file
In order to use CSV file for this purpose, the file should have two columns, first column consist of word and the second column consists of the synonyms meant to replace it. Let us save this file as syn.csv. In the example below, we will be creating a class named CSVword_syn_replacer which will extends word_syn_replacer in replacesyn.py file and will be used to construct the word_map dictionary from syn.csv file.
Example
First, import the necessary packages.
import csv
Next, create the class that takes a word replacement mapping −
class CSVword_syn_replacer(word_syn_replacer):
   def __init__(self, fname):
      word_map = {}
      for line in csv.reader(open(fname)):
         word, syn = line
         word_map[word] = syn
      super(Csvword_syn_replacer, self).__init__(word_map)
After running it, import CSVword_syn_replacer class when you want to replace words with common synonyms. Let us see how?
from replacesyn import CSVword_syn_replacer rep_syn = CSVword_syn_replacer (syn.csv) rep_syn.replace(bday)
Output
'birthday'
Complete implementation example
import csv
class CSVword_syn_replacer(word_syn_replacer):
def __init__(self, fname):
word_map = {}
for line in csv.reader(open(fname)):
   word, syn = line
   word_map[word] = syn
super(Csvword_syn_replacer, self).__init__(word_map)
Now once you saved the above program and run it, you can import the class and use it as follows −
from replacesyn import CSVword_syn_replacer rep_syn = CSVword_syn_replacer (syn.csv) rep_syn.replace(bday)
Output
'birthday'
Using YAML file
As we have used CSV file, we can also use YAML file to for this purpose (we must have PyYAML installed). Let us save the file as syn.yaml. In the example below, we will be creating a class named YAMLword_syn_replacer which will extends word_syn_replacer in replacesyn.py file and will be used to construct the word_map dictionary from syn.yaml file.
Example
First, import the necessary packages.
import yaml
Next, create the class that takes a word replacement mapping −
class YAMLword_syn_replacer(word_syn_replacer): def __init__(self, fname): word_map = yaml.load(open(fname)) super(YamlWordReplacer, self).__init__(word_map)
After running it, import YAMLword_syn_replacer class when you want to replace words with common synonyms. Let us see how?
from replacesyn import YAMLword_syn_replacer rep_syn = YAMLword_syn_replacer (syn.yaml) rep_syn.replace(bday)
Output
'birthday'
Complete implementation example
import yaml class YAMLword_syn_replacer(word_syn_replacer): def __init__(self, fname): word_map = yaml.load(open(fname)) super(YamlWordReplacer, self).__init__(word_map)
Now once you saved the above program and run it, you can import the class and use it as follows −
from replacesyn import YAMLword_syn_replacer rep_syn = YAMLword_syn_replacer (syn.yaml) rep_syn.replace(bday)
Output
'birthday'
Antonym replacement
As we know that an antonym is a word having opposite meaning of another word, and the opposite of synonym replacement is called antonym replacement. In this section, we will be dealing with antonym replacement, i.e., replacing words with unambiguous antonyms by using WordNet. In the example below, we will be creating a class named word_antonym_replacer which have two methods, one for replacing the word and other for removing the negations.
Example
First, import the necessary packages.
from nltk.corpus import wordnet
Next, create the class named word_antonym_replacer −
class word_antonym_replacer(object):
   def replace(self, word, pos=None):
      antonyms = set()
      for syn in wordnet.synsets(word, pos=pos):
         for lemma in syn.lemmas():
            for antonym in lemma.antonyms():
               antonyms.add(antonym.name())
      if len(antonyms) == 1:
         return antonyms.pop()
      else:
         return None
   def replace_negations(self, sent):
      i, l = 0, len(sent)
      words = []
      while i < l:
         word = sent[i]
         if word == 'not' and i+1 < l:
            ant = self.replace(sent[i+1])
            if ant:
               words.append(ant)
               i += 2
               continue
         words.append(word)
         i += 1
      return words
Save this python program (say replaceantonym.py) and run it from python command prompt. After running it, import word_antonym_replacer class when you want to replace words with their unambiguous antonyms. Let us see how.
from replacerantonym import word_antonym_replacer rep_antonym = word_antonym_replacer () rep_antonym.replace(uglify)
Output
['beautify''] sentence = ["Let us", 'not', 'uglify', 'our', 'country'] rep_antonym.replace _negations(sentence)
Output
["Let us", 'beautify', 'our', 'country']
Complete implementation example
nltk.corpus import wordnet
class word_antonym_replacer(object):
def replace(self, word, pos=None):
   antonyms = set()
   for syn in wordnet.synsets(word, pos=pos):
      for lemma in syn.lemmas():
      for antonym in lemma.antonyms():
         antonyms.add(antonym.name())
   if len(antonyms) == 1:
      return antonyms.pop()
   else:
      return None
def replace_negations(self, sent):
   i, l = 0, len(sent)
   words = []
   while i < l:
      word = sent[i]
      if word == 'not' and i+1 < l:
         ant = self.replace(sent[i+1])
         if ant:
            words.append(ant)
            i += 2
            continue
      words.append(word)
      i += 1
   return words
Now once you saved the above program and run it, you can import the class and use it as follows −
from replacerantonym import word_antonym_replacer rep_antonym = word_antonym_replacer () rep_antonym.replace(uglify) sentence = ["Let us", 'not', 'uglify', 'our', 'country'] rep_antonym.replace _negations(sentence)
Output
["Let us", 'beautify', 'our', 'country']