If there is a need to find 10 most frequent words in a data set, python can help us find it using the collections module. The collections module has a counter class which gives the count of the words after we supply a list of words to it. We also use the most_common method to find out the number of such words as needed by the program input.
In the below example we take a paragraph, and then first create a list of words applying split(). We will then apply the counter() to find the count of all the words. Finally the most_common function will give us the appropriate result of how many such words with highest frequency we want.
from collections import Counter word_set = " This is a series of strings to count " \ "many words . They sometime hurt and words sometime inspire "\ "Also sometime fewer words convey more meaning than a bag of words "\ "Be careful what you speak or what you write or even what you think of. "\ # Create list of all the words in the string word_list = word_set.split() # Get the count of each word. word_count = Counter(word_list) # Use most_common() method from Counter subclass print(word_count.most_common(3))
Running the above code gives us the following result −
[('words', 4), ('sometime', 3), ('what', 3)]