Find the k most frequent words from a data set in Python
To find the k most frequent words in a data set, for example the 10 most frequent, Python's collections module can help. The module provides a Counter class, which returns the count of each word when given a list of words. Its most_common method then returns as many of the highest-count words as the program asks for.
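As a minimal sketch of these two pieces, the snippet below counts a small hand-made list and asks for its two most frequent entries. The list contents are illustrative only and are not part of the data set used in this article.

```python
from collections import Counter

# Illustrative input, chosen only for this sketch.
words = ["red", "blue", "red", "green", "blue", "red"]

counts = Counter(words)          # maps each word to its number of occurrences
top_two = counts.most_common(2)  # list of (word, count) pairs, highest count first

print(counts["red"])  # 3
print(top_two)        # [('red', 3), ('blue', 2)]
```

Each entry returned by most_common is a (word, count) tuple, which is the shape of the result in the full example.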
Examples
In the example below we take a paragraph and first create a list of words by applying split(). We then pass the list to Counter to get the count of every word. Finally, most_common(k) returns the k words with the highest frequency; here k is 3.
from collections import Counter

word_set = " This is a series of strings to count " \
    "many words . They sometime hurt and words sometime inspire " \
    "Also sometime fewer words convey more meaning than a bag of words " \
    "Be careful what you speak or what you write or even what you think of. "

# Create a list of all the words in the string
word_list = word_set.split()

# Get the count of each word
word_count = Counter(word_list)

# Use the most_common() method of Counter to get the 3 most frequent words
print(word_count.most_common(3))
Output
Running the above code gives us the following result −
[('words', 4), ('sometime', 3), ('what', 3)]
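Note that split() separates on whitespace only, so in the example above "of." and "of" are counted as different words, and capitalized words are counted separately from lowercase ones. The following is a hedged sketch of one way to normalize the text before counting. The helper name top_k_words, the regular expression, and the sample sentence are assumptions made for illustration and are not part of the original example.

```python
import re
from collections import Counter


def top_k_words(text, k):
    """Return the k most frequent words in text as (word, count) pairs.

    Hypothetical helper for illustration: words are lowercased and
    punctuation is dropped before counting.
    """
    # [a-z']+ keeps runs of letters and apostrophes, so "don't" stays one word.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(k)


# Illustrative sample text, not taken from the article's data set.
sample = "Of mice and men. The tale of mice, of men, and of mice."
print(top_k_words(sample, 2))  # [('of', 4), ('mice', 3)]
```

Whether to lowercase or strip punctuation depends on the data set; for case-sensitive text such as source code identifiers, the plain split() approach may be the better fit.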