Python Program to Count Words in Text File


Text processing and analysis tasks often require counting the words in a text file, that is, determining the total number of words it contains. Python's built-in string methods, together with standard library modules such as collections and re, make this straightforward.

In this article, we will explore different approaches to obtaining the total word count from a text file using Python programming.

Approach

Following are the steps to count words in a text file −

  • Open the text file − Use the open() function to open the text file in read mode. Specify the file path as the argument.

  • Read the file content − Use the read() method to read the entire content of the file into a string variable.

  • Split the content into words − Turn the content string into a list of words. We can either call the split() method or extract the words with a regular expression pattern such as \b\w+\b.

  • Count the words − Determine the number of words in the list. We can use the len() function to get the length of the list.

  • And finally, return the word count.

Throughout this article, we will use a text file named Example_text_file.txt as our input.

Using the split() method

split() is a Python string method that splits a string into a list of substrings around a delimiter. When called without arguments, it splits on runs of whitespace, which makes it a simple way to break text into words.
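To illustrate the default behavior, here is a minimal sketch using a made-up sample string (an assumption for illustration, not the article's input file). It contrasts split() with no arguments against split(' ') with an explicit single-space delimiter.

```python
# Hypothetical sample text, used only for illustration
text = "Python  makes\ttext processing\neasy"

# No argument: any run of whitespace (spaces, tabs, newlines) is one separator
words = text.split()
print(words)       # ['Python', 'makes', 'text', 'processing', 'easy']
print(len(words))  # 5

# Explicit single-space delimiter: the double space yields an empty string,
# and tabs and newlines are not treated as separators
pieces = text.split(' ')
print(pieces)      # ['Python', '', 'makes\ttext', 'processing\neasy']
```

For word counting, the no-argument form is usually the safer choice because it never produces empty strings.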

Example

Here's an example that counts the number of words in a text file.

def count_words(filename):
    try:
        with open(filename, 'r') as file:
            content = file.read()
            words = content.split()
            word_count = len(words)
            return word_count
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return 0

# Provide the path of the text file
file_path = 'Example_text_file.txt'

# Call the function to count words
total_words = count_words(file_path)

print("Total number of words in the file: {}".format(total_words))

Output

File 'Example_text_file.txt' not found.
Total number of words in the file: 0
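The output above shows the fallback path: the file was not present when the script ran, so the function printed a message and returned 0. To try the function on real content, a sample file can be created first. The following is a minimal sketch that assumes made-up sample text and writes it to a temporary directory; the sample content and the use of tempfile are illustrative choices, not part of the original example.

```python
import os
import tempfile

def count_words(filename):
    # Same logic as the example above: read everything, split on whitespace
    try:
        with open(filename, 'r') as file:
            return len(file.read().split())
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return 0

# Hypothetical sample content, not the article's input file
sample_text = "Python is easy to learn.\nIt is widely used for text processing.\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = os.path.join(tmp_dir, 'Example_text_file.txt')
    with open(file_path, 'w') as file:
        file.write(sample_text)
    total_words = count_words(file_path)

print(f"Total number of words in the file: {total_words}")
# Total number of words in the file: 12
```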

Using the collections module

In this approach, we use the Counter class from the collections module to count the occurrences of each word in the file.

The Counter object is a dictionary-like structure in which each word is a key and its value is the number of times that word occurs in the text. We then add up all the values with the sum() function to obtain the total word count.
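As a quick illustration of that structure, here is a minimal sketch on a made-up sample string (an assumption for illustration only). It shows per-word counts, the number of distinct words, and the total obtained by summing the values.

```python
from collections import Counter

# Hypothetical sample text, used only for illustration
text = "the cat and the dog and the bird"

word_count = Counter(text.split())

print(word_count['the'])          # 3
print(len(word_count))            # 5 distinct words
print(sum(word_count.values()))   # 8 words in total
print(word_count.most_common(2))  # [('the', 3), ('and', 2)]
```

The sum of the values equals the length of the list returned by split(), so Counter is mainly worth using when the per-word frequencies are needed as well as the total.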

Example

In this example, we use the collections.Counter class to count the number of words present in a text file.

import collections

def count_words(filename):
    try:
        with open(filename, 'r') as file:
            word_count = collections.Counter(file.read().split())
            return sum(word_count.values())
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return 0

# Provide the path of the text file
file_path = 'Example_text_file.txt'

# Call the function to count words
total_words = count_words(file_path)

print("Total number of words in the file: {}".format(total_words))

Output

File 'Example_text_file.txt' not found.
Total number of words in the file: 0

Using regular expressions

Here we use the re.findall() function from the re module to extract all the words from the file content with a regular expression pattern. The pattern \b\w+\b matches any sequence of one or more word characters (letters, digits, or underscores) delimited by word boundaries, so punctuation is not counted as part of a word.

The findall() function returns a list of all matches found in the content. We then determine the length of the list to get the total word count.
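Because the pattern treats punctuation as a boundary, the regular expression approach can give a different count than split() for the same text. The sketch below uses a made-up sample sentence (an assumption for illustration) to show the difference.

```python
import re

# Hypothetical sample text, used only for illustration
text = "Well-known tools don't always agree."

by_whitespace = text.split()
by_regex = re.findall(r'\b\w+\b', text)

print(by_whitespace)       # ['Well-known', 'tools', "don't", 'always', 'agree.']
print(len(by_whitespace))  # 5
print(by_regex)            # ['Well', 'known', 'tools', 'don', 't', 'always', 'agree']
print(len(by_regex))       # 7
```

Hyphenated words and contractions are split into separate matches by \b\w+\b, so which approach is appropriate depends on how a "word" should be defined for the task at hand.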

Example

Here's another approach to counting the words in a text file using regular expressions in Python.

import re

def count_words(filename):
    try:
        with open(filename, 'r') as file:
            content = file.read()
            words = re.findall(r'\b\w+\b', content)
            word_count = len(words)
            return word_count
    except FileNotFoundError:
        print(f"File '{filename}' not found.")
        return 0

# Provide the path of the text file
file_path = 'Example_text_file.txt'

# Call the function to count words
total_words = count_words(file_path)

print(f"Total number of words in the file: {total_words}")

Output

File 'Example_text_file.txt' not found.
Total number of words in the file: 0

These are the different approaches to counting the words in a text file using Python programming.

Updated on: 29-Aug-2023
