Python Program to Get word frequency in percentage


In this article, we will learn how to compute the frequency of each word as a percentage in Python.

Assume we have an input list of strings. We will find what percentage of all the words in the list each word accounts for.

Formula

(Occurrence of X word / Total words) * 100
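
As a quick worked instance of the formula, the sketch below computes the percentage for a single word. The helper name percentage and the sample counts are illustrative assumptions, not part of the programs in this article.

```python
def percentage(occurrences, total_words):
    """Return (occurrences / total_words) * 100."""
    if total_words == 0:
        raise ValueError("total_words must be greater than zero")
    return occurrences / total_words * 100

# A word that appears 3 times among 12 words makes up a quarter of them
share = percentage(3, 12)
print(share)  # 25.0
```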

Methods Used

  • Using sum(), Counter(), join() and split() functions

  • Using join(), split() and count() functions

  • Using countOf() function from operator module.

Method 1: Using sum(), Counter(), join() and split() functions

join() is a string method in Python that concatenates the elements of an iterable of strings, placing the separator string it is called on between consecutive elements, and returns the resulting string.

Counter is a dict subclass from the collections module that counts hashable objects. When given an iterable, it stores each distinct element as a key and the number of times it occurs as the corresponding value.
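
To make the two building blocks concrete, here is a minimal sketch of join() and Counter working together. The sample list phrases is made up for illustration and is not the input list used in the programs below.

```python
from collections import Counter

# Illustrative sample data (an assumption, not the article's input list)
phrases = ["red green", "green blue green"]

# join() places the separator between the elements of the iterable
joined = " ".join(phrases)
print(joined)  # red green green blue green

# split() with no argument splits on any run of whitespace
words = joined.split()

# Counter maps each distinct word to the number of times it occurs
counts = Counter(words)
print(counts["green"])       # 3
print(sum(counts.values()))  # 5
```

Looking up a word that never occurs, such as counts["yellow"], returns 0 instead of raising a KeyError.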

Algorithm (Steps)

The steps to perform the task are as follows:

  • Use the import keyword to import the Counter class from the collections module.

  • Create a variable to store the input list of strings and print the list.

  • Use the join() function to join all the string elements of the input list into a single space-separated string.

  • Use the split() function to split the joined string into a list of words (split() accepts a separator; by default it splits on any whitespace), and pass that list to Counter() to get the frequency of each word as key-value pairs.

  • Use the values() function to get all the counts from the Counter, and add them up with the sum() function (which returns the sum of all items in an iterable) to get the total number of words.

  • Use the items() function (which returns a view of the dictionary's key-value pairs as tuples) to go through each word and its count, and compute each word's percentage from its count and the total.

  • Print the percentage of each word from the input list.

Example

The following program returns the percentage of each word in the given input list of strings using sum(), Counter(), join(), and split() functions –

# importing the Counter class from the collections module
from collections import Counter

# input list of strings
inputList = ["hello tutorialspoint", "python codes", "tutorialspoint for python", "see python codes tutorialspoint"]
print("Input list:\n", inputList)

# joining all the string elements of the list using the join() function
join_string = " ".join(inputList)

# splitting the joined string into a list of words and getting the
# frequency of words as key-value pairs using Counter()
counter_words = Counter(join_string.split())

# getting all the values (frequencies/counts) from the counter and
# finding the total sum of them
total_sum = sum(counter_words.values())

# getting the percentage of each word by applying the formula
# (count / total) * 100, rounded to two decimal places
res_percentage = {key: round(value / total_sum * 100, 2)
                  for key, value in counter_words.items()}

# printing the percentage of each word from the input list
print("Percentage of each word from the input list:\n", res_percentage)

Output

On execution, the above program will generate the following output –

Input list:
 ['hello tutorialspoint', 'python codes', 'tutorialspoint for python', 'see python codes tutorialspoint']
Percentage of each word from the input list:
 {'hello': 9.09, 'tutorialspoint': 27.27, 'python': 27.27, 'codes': 18.18, 'for': 9.09, 'see': 9.09}

Method 2: Using join(), split() and count() functions

Algorithm (Steps)

The steps to perform the task are as follows:

  • Join the input strings with the join() function and split the result into a list of words with the split() function.

  • Create an empty dictionary for storing the resultant percentages/word frequencies.

  • Use a for loop to traverse the list of words.

  • Use an if conditional statement to check whether the current word is not already a key of the dictionary.

  • If the above condition is true, get the count of this word in the list using the count() function.

  • Divide this count by the total number of words to get the frequency of the current word, and store it in the dictionary as the value for that word.

  • Print the percentage of each word from the input list.

Example

The following program returns the percentage of each word in the given input list of strings using join(), split() and count() functions –

# input list of strings
inputList = ["hello tutorialspoint", "python codes", "tutorialspoint for python", "see python codes tutorialspoint"]

# joining all the elements of the list using join()
join_string = " ".join(inputList)

# splitting the joined string into a list of words
listOfWords = join_string.split()

# Creating an empty dictionary for storing the resultant percentages
resDict = dict()

# traversing through the list of words
for item in listOfWords:

   # checking whether the current word is not already a key of the dictionary
   if item not in resDict:

      # getting the percentage of the current word: (count / total) * 100,
      # rounded to two decimal places
      resDict[item] = round(listOfWords.count(item) / len(listOfWords) * 100, 2)

# printing the percentage of each word from the input list
print("Percentage of each word from the input list:\n", resDict)

Output

On execution, the above program will generate the following output –

Percentage of each word from the input list:
 {'hello': 9.09, 'tutorialspoint': 27.27, 'python': 27.27, 'codes': 18.18, 'for': 9.09, 'see': 9.09}

Method 3: Using countOf() function from operator module
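
The countOf(a, b) function from the operator module returns the number of occurrences of b in a, so for a list it gives the same result as the list's own count() method. A minimal sketch, using made-up sample data rather than the article's input list:

```python
import operator as op

# Illustrative sample data (an assumption, not the article's input list)
words = ["red", "green", "green", "blue", "green"]

# countOf(a, b) returns how many items in a are equal to b
green_count = op.countOf(words, "green")
print(green_count)  # 3

# For a list this matches the list's own count() method
print(green_count == words.count("green"))  # True
```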

Example

The following program returns the percentage of each word in the given input list of strings using countOf() function –

import operator as op
# input list of strings
inputList = ["hello tutorialspoint", "python codes", "tutorialspoint for python", "see python codes tutorialspoint"]

# joining all the elements of list using join()
join_string = " ".join(inputList)

# splitting the joined string into list of words
listOfWords = join_string.split()

# creating an empty dictionary for storing the resultant percentages
resDict = dict()

# traversing through the list of words
for item in listOfWords:

   # checking whether the current word is not already a key of the dictionary
   if item not in resDict:

      # counting the current word with countOf() and converting the count
      # to a percentage, rounded to two decimal places
      resDict[item] = round(op.countOf(listOfWords, item) / len(listOfWords) * 100, 2)

# printing the percentage of each word from the input list
print("Percentage of each word from the input list:\n", resDict)

Output

On execution, the above program will generate the following output –

Percentage of each word from the input list:
 {'hello': 9.09, 'tutorialspoint': 27.27, 'python': 27.27, 'codes': 18.18, 'for': 9.09, 'see': 9.09}
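
As a sanity check, the three approaches can be wrapped in functions and compared on the same word list. This is only a sketch: the function names with_counter, with_count and with_countof are hypothetical, and each one applies the formula (count / total) * 100 without rounding.

```python
from collections import Counter
import operator as op

def with_counter(words):
    # One pass over the list to count every word
    counts = Counter(words)
    total = sum(counts.values())
    return {word: count / total * 100 for word, count in counts.items()}

def with_count(words):
    # dict.fromkeys() keeps each distinct word once, in first-seen order
    return {word: words.count(word) / len(words) * 100
            for word in dict.fromkeys(words)}

def with_countof(words):
    return {word: op.countOf(words, word) / len(words) * 100
            for word in dict.fromkeys(words)}

inputList = ["hello tutorialspoint", "python codes",
             "tutorialspoint for python", "see python codes tutorialspoint"]
listOfWords = " ".join(inputList).split()

a = with_counter(listOfWords)
b = with_count(listOfWords)
c = with_countof(listOfWords)
print(a == b == c)  # True
```

The count() and countOf() versions rescan the whole list once per distinct word, while Counter tallies every word in a single pass, which tends to matter more as the input grows.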

Conclusion

In this article, we learned three different Python methods for calculating word frequency in percentage. We also learned how to count the occurrences of list elements using the countOf() function from the operator module.

Updated on: 27-Jan-2023
