Natural Language Toolkit - Getting Started



To install NLTK, you first need Python on your computer. Download the latest version for your operating system (Windows, macOS, or Linux/Unix) from www.python.org/downloads. For a basic Python tutorial, see www.tutorialspoint.com/python3/index.htm.

Install Natural Language Toolkit

Now, once you have Python installed on your computer system, let us understand how we can install NLTK.

Installing NLTK

We can install NLTK on different operating systems as follows −

On Windows

To install NLTK on Windows, follow these steps −

  • First, open the Windows command prompt. If pip is not on your PATH, navigate to the folder that contains it (typically the Scripts folder of your Python installation).

  • Next, enter the following command to install NLTK −

pip3 install nltk

Now, open the Python shell from the Windows Start menu and type the following command to verify the NLTK installation −

import nltk

If no error appears, NLTK has been installed successfully for Python 3 on Windows.
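
The same check can also be run as a small script that reports the installed version. This is only a minimal sketch: the helper name nltk_version is made up for illustration, and it relies on the standard importlib module and on NLTK's __version__ attribute.

```python
import importlib
import importlib.util


def nltk_version():
    """Return the installed NLTK version string, or None if NLTK is missing."""
    # find_spec looks the package up without importing it
    if importlib.util.find_spec("nltk") is None:
        return None
    nltk = importlib.import_module("nltk")
    return nltk.__version__


version = nltk_version()
if version is None:
    print("NLTK is not installed for this Python interpreter.")
else:
    print("NLTK version:", version)
```

If the script reports that NLTK is missing although pip finished without errors, pip may have installed the package for a different Python interpreter than the one running the script.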

On Mac/Linux

To install NLTK on Mac/Linux, run the following command in a terminal −

sudo pip install -U nltk

If you don’t have pip installed on your computer, install it first. The commands below use apt, so they apply to Debian-based Linux distributions such as Ubuntu.

First, update the package index using the following command −

sudo apt update

Now, type the following command to install pip for Python 3 −

sudo apt install python3-pip

Through Anaconda

To install NLTK through Anaconda, follow these steps −

First, to install Anaconda, go to https://www.anaconda.com/download and choose the installer that matches the Python version you need.

Once you have Anaconda on your computer, open its command prompt and run the following command −

conda install -c anaconda nltk

Review the output and confirm when prompted (enter ‘y’ or ‘yes’). NLTK will then be downloaded and installed into your Anaconda environment.

Downloading NLTK’s Dataset and Packages

NLTK is now installed, but many of its features depend on datasets (corpora) and other packages that are downloaded separately. Some of the important datasets are stopwords, gutenberg, framenet_v15 and so on.

The following commands open the NLTK Downloader, from which we can download the NLTK datasets −

import nltk
nltk.download()

The NLTK Downloader window will open.

Now, click on the download button to download the datasets.
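
When the graphical downloader is not convenient, for example on a server without a display, individual packages can also be fetched from code. The sketch below is one possible approach, not the only one: the helper names is_installed and ensure_resource are hypothetical, while nltk.data.find and nltk.download are NLTK functions. The download itself needs network access.

```python
import nltk


def is_installed(resource_path):
    """Return True if the resource is already present in an nltk_data directory."""
    try:
        nltk.data.find(resource_path)
        return True
    except LookupError:
        # nltk.data.find raises LookupError when the resource is missing
        return False


def ensure_resource(resource_path, package_id):
    """Download package_id only if resource_path is not available locally."""
    if is_installed(resource_path):
        return True
    # quiet=True suppresses the progress messages
    return nltk.download(package_id, quiet=True)
```

For example, calling ensure_resource('corpora/stopwords', 'stopwords') at the start of a script should download the stopwords corpus on the first run and skip the download on later runs.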

How to run an NLTK script?

The following example implements the Porter stemming algorithm using NLTK’s PorterStemmer class. It shows how to run an NLTK script.

First, we need to import the Natural Language Toolkit (nltk).

import nltk

Now, import the PorterStemmer class to implement the Porter Stemmer algorithm.

from nltk.stem import PorterStemmer

Next, create an instance of the PorterStemmer class as follows −

word_stemmer = PorterStemmer()

Now, pass the word you want to stem to the stem() method −

word_stemmer.stem('writing')

Output

'write'

word_stemmer.stem('eating')

Output

'eat'
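
The interactive steps above can also be saved in a file and run as a script. The sketch below assumes a file named stem_demo.py (a hypothetical name) and a helper function stem_words that is introduced here only for illustration. It can be run from the command line with python stem_demo.py.

```python
from nltk.stem import PorterStemmer


def stem_words(words):
    """Return the Porter stem of each word in the list."""
    word_stemmer = PorterStemmer()
    return [word_stemmer.stem(word) for word in words]


if __name__ == "__main__":
    # The two words used in the interactive example
    words = ["writing", "eating"]
    for word, stem in zip(words, stem_words(words)):
        print(word, "->", stem)
```

The script prints one line per word, with the same stems as in the interactive session ('write' and 'eat'). The PorterStemmer class works without any downloaded datasets, so this script runs on a fresh NLTK installation.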