Create Word Cloud using Python


In this problem, we have a text file and a mask image. We need to create a word cloud from the text, shaped by the mask image, and save the result as a PNG file.

To implement this, we need a few Python libraries: matplotlib, wordcloud, numpy, tkinter and PIL.

To install these libraries, run the following commands (numpy and PIL, via the Pillow package, are pulled in as dependencies of wordcloud):

Set Up the Libraries

$ sudo pip3 install matplotlib
$ sudo pip3 install wordcloud
$ sudo apt-get install python3-tk
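
As a quick sanity check after installation, a short script along these lines can report which modules are still missing. This is only a sketch: the helper name missing_modules and the REQUIRED_MODULES list are illustrative assumptions, not part of any of the libraries above.

```python
import importlib.util

# Import names used in this article (note: Pillow is imported as "PIL")
REQUIRED_MODULES = ["matplotlib", "wordcloud", "numpy", "tkinter", "PIL"]

def missing_modules(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are available.")
```

If the script lists any module, re-run the matching install command before continuing.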

After installing these libraries, we can write the Python code to perform the task.

Algorithm

Step 1: Read the data from the file and store it into ‘dataset’. 
Step 2: Create pixel array from the mask image. 
Step 3: Create the word cloud from the dataset. Set the background color, mask, and stop-words. 
Step 4: Store the final image into the disk. 
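
To make Step 3 more concrete, the sketch below counts word frequencies after dropping stop-words, using only the standard library. It illustrates the general idea of why frequent, meaningful words end up larger in a word cloud; it is not the wordcloud library's internal implementation. The function count_words and the tiny SAMPLE_STOPWORDS set are hypothetical names introduced for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word set for illustration;
# the wordcloud library ships a much longer list as STOPWORDS.
SAMPLE_STOPWORDS = {"is", "a", "and", "to", "be", "the", "it"}

def count_words(text, stopwords):
    """Return a Counter of lowercase words in text, excluding stopwords."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

text = "Python is a high-level language. Python is designed to be highly readable."
frequencies = count_words(text, SAMPLE_STOPWORDS)

# Show the three most frequent remaining words
print(frequencies.most_common(3))
```

Without the stop-word filter, common words such as "is" would compete with "python" for the largest font size.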

Input: sampleWords.txt file

Python is a high-level, interpreted, interactive and object-oriented scripting language. Python is designed to be highly readable. It uses English keywords frequently where as other languages use punctuation, and it has fewer syntactical constructions than other languages.

Python was developed by Guido van Rossum in the late eighties and early nineties at the National Research Institute for Mathematics and Computer Science in the Netherlands.

Python is derived from many other languages, including ABC, Modula-3, C, C++, Algol-68, SmallTalk, and Unix shell and other scripting languages.

Python is copyrighted. Like Perl, Python source code is now available under the GNU General Public License (GPL).

Python is now maintained by a core development team at the institute, although Guido van Rossum still holds a vital role in directing its progress.

The other input is the mask image (cloud.png). The words are drawn inside the shape defined by this mask.

[Image: cloud mask (cloud.png)]
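
If no mask image is at hand, a mask can also be built directly as a pixel array. The sketch below assumes the wordcloud convention that pure white entries (255) are masked out and all other entries are free to draw on; the function name make_circle_mask and the chosen radius are illustrative assumptions.

```python
import numpy as npy

def make_circle_mask(size=300):
    """Return a size x size uint8 array: 0 inside a centered circle, 255 outside."""
    y, x = npy.ogrid[:size, :size]
    center = (size - 1) / 2
    radius = size * 0.45
    # True for pixels outside the circle, which should be masked out
    outside = (x - center) ** 2 + (y - center) ** 2 > radius ** 2
    return (255 * outside).astype(npy.uint8)

mask = make_circle_mask(300)
print(mask.shape, mask.dtype)
```

An array like this could be passed as the mask argument of WordCloud in place of the array loaded from cloud.png, which would produce a circular word cloud.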

Example Code

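import matplotlib.pyplot as pPlot  # only needed if you also want to display the image on screen
from wordcloud import WordCloud, STOPWORDS
import numpy as npy
from PIL import Image

def create_word_cloud(string):
   # Step 2: convert the mask image into a pixel array
   maskArray = npy.array(Image.open("cloud.png"))
   # Step 3: configure the word cloud and generate it from the text
   cloud = WordCloud(background_color = "white", max_words = 200, mask = maskArray, stopwords = set(STOPWORDS))
   cloud.generate(string)
   # Step 4: save the final image to disk
   cloud.to_file("wordCloud.png")

# Step 1: read the text from the file and normalize it to lowercase
with open("sampleWords.txt", "r") as textFile:
   dataset = textFile.read()
dataset = dataset.lower()
create_word_cloud(dataset)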

Output

[Image: generated word cloud (wordCloud.png)]

karthikeya Boyini

Updated on: 30-Jul-2019
