How can I iterate over files in a given directory in Python?


Iterating over the files in a given directory is useful for tasks such as finding files that match a certain criterion or counting the number of files in a directory. Python provides the following five ways to walk through the existing files in a directory:

  • os.listdir() method

  • os.walk() method

  • os.scandir() method

  • Using pathlib module

  • glob.iglob() method

Let us look at these methods in detail.

Using os.listdir() Method

The os.listdir() method lists all the entries in a directory, both files and subdirectories. It accepts the path of the directory as an argument and returns the entries as a list, excluding the special entries “.” and “..”.

Following is the syntax of this method –

os.listdir(path)

Example

In the following example, we are trying to list all the files present in the current directory using a for loop.

import os

path = "."
# os.listdir() returns the entry names as a list of strings
entries = os.listdir(path)

for entry in entries:
   print(entry)

Output

The output is displayed as follows

main.py
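Because os.listdir() returns subdirectories as well as files, a common follow-up is to keep only the regular files. The following is a minimal sketch of one way to do that; the helper name list_files is hypothetical and chosen only for illustration.

```python
import os

def list_files(directory):
    """Return the sorted names of regular files (not subdirectories) in directory."""
    names = []
    for name in os.listdir(directory):
        # os.listdir() returns bare names, so join each one with the
        # directory before asking whether it is a regular file.
        if os.path.isfile(os.path.join(directory, name)):
            names.append(name)
    return sorted(names)

if __name__ == "__main__":
    for name in list_files("."):
        print(name)
```

The names are sorted here because os.listdir() returns entries in arbitrary order.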

Using os.walk() Method

The os.walk() function generates the file names in a directory tree by walking it top-down or bottom-up. For each directory in the tree rooted at the directory top, it yields a three-tuple: (path, names, filenames).

The path is a string that represents the path to the directory. The names variable contains a list of the names of the subdirectories in path, excluding the special entries '.' and '..'. The filenames variable contains a list of the names of the non-directory files in path.

Example

In the following example, let us use the os.walk() method within a loop statement to display all the files and subdirectories in the current directory and everything below it.

import os
path = "."

for root, d_names, f_names in os.walk(path):
   print(root, d_names, f_names)

Output

Running the program above produces the following result −

. [] ['main.py']
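The walking direction mentioned above is controlled by the topdown argument of os.walk(). The sketch below records the order in which directories are visited; the helper name walk_order is hypothetical.

```python
import os

def walk_order(top, topdown=True):
    """Return the directory paths in the order os.walk() visits them."""
    visited = []
    for dirpath, dirnames, filenames in os.walk(top, topdown=topdown):
        visited.append(dirpath)
    return visited

if __name__ == "__main__":
    print(walk_order("."))                 # parent directories come first
    print(walk_order(".", topdown=False))  # subdirectories come first
```

With topdown=True (the default), a directory is reported before its subdirectories; with topdown=False, it is reported after them.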

Example

We can also build a full path for each file. For that, we use the os.path.join() method, which joins the directory path and the file name into a single path. Each full path can then be collected into a list using the append() method, as shown below. If the ./TEST directory does not exist or contains no files, the resulting list is empty.

import os
path = "./TEST"

fname = []
for root,d_names,f_names in os.walk(path):
   for f in f_names:
      fname.append(os.path.join(root, f))

print("fname = %s" %fname)

Output

fname = []

Example

Using the os.walk() method, we can also choose which element of the returned tuple to print. Let us look at an example program below.

import os

for dirpath, dirs, files in os.walk("."):
   print(dirpath) # prints the path of each directory visited, starting with "."

for dirpath, dirs, files in os.walk("."):
   print(dirs) # prints the subdirectory names found in each directory visited

for dirpath, dirs, files in os.walk("."):
   print(files) # prints the file names found in each directory visited

Output

.
[]
['main.py']
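To tie os.walk() back to the tasks mentioned at the start (finding files that match a criterion, or counting them), here is a hedged sketch that collects every file with a given extension. The function name find_by_extension and the ".py" extension are assumptions made for illustration.

```python
import os

def find_by_extension(top, extension):
    """Return the sorted full paths of files under top that end with extension."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(top):
        for filename in filenames:
            if filename.endswith(extension):
                # Join the directory being visited with the bare file name.
                matches.append(os.path.join(dirpath, filename))
    return sorted(matches)

if __name__ == "__main__":
    python_files = find_by_extension(".", ".py")
    print("Found %d Python file(s)" % len(python_files))
    for path in python_files:
        print(path)
```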

Using os.listdir() Method

The os.listdir(my_path) method returns the name of everything present in the my_path directory, including both files and subdirectories. It does not descend into subdirectories. You can call this method without a loop statement to get the whole list at once. However, to process the entries one at a time, you must use a loop statement.

Example

In the following example, we will try to use the os.listdir() method in a loop statement to iterate through all the files present in a directory.

import os
path = "."

for file_name in os.listdir(path):
   print(file_name)

Output

main.py
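The os.scandir() function named in the list at the top of this article works much like os.listdir(), but it yields os.DirEntry objects instead of plain names, so each entry can report whether it is a file or a directory. The following is a minimal sketch; the helper name list_entries and the "dir"/"file" labels are hypothetical.

```python
import os

def list_entries(directory):
    """Return sorted (name, kind) pairs for the entries in directory."""
    entries = []
    # The with statement closes the scandir iterator when the loop finishes.
    with os.scandir(directory) as it:
        for entry in it:
            kind = "dir" if entry.is_dir() else "file"
            entries.append((entry.name, kind))
    return sorted(entries)

if __name__ == "__main__":
    for name, kind in list_entries("."):
        print(kind, name)
```

Each os.DirEntry also exposes a path attribute holding the entry name joined with the directory passed to os.scandir().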

Using pathlib Module

The pathlib module provides classes that represent filesystem paths. It is similar to the os.path module, but os.path works with plain strings, while pathlib represents each path as an object. In this module, we use the glob() method of a Path object to list the files and subdirectories present in a directory.

The glob() method accepts a pattern as a parameter and yields every path in the directory that matches it. If you want all the entries in a directory, pass an asterisk (*) as the pattern.

Example

Let us print the names of all the files and subdirectories present in the current directory using the glob() method. The example is shown below.

from pathlib import Path

root_directory = Path('.')
for f in root_directory.glob("*"):
   print(f)

Output

main.py
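A pattern other than "*" narrows the match, and Path objects can also search subdirectories. The sketch below uses Path.rglob(), the recursive counterpart of glob(), to collect Python source files; the helper name python_files and the "*.py" pattern are assumptions for illustration.

```python
from pathlib import Path

def python_files(directory):
    """Return the sorted .py files found in directory and its subdirectories."""
    # rglob("*.py") applies the pattern in directory and every level below it;
    # is_file() skips any directory whose name happens to end in ".py".
    return sorted(p for p in Path(directory).rglob("*.py") if p.is_file())

if __name__ == "__main__":
    for path in python_files("."):
        print(path)
```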

Using glob Module

The glob module in Python is used to search for files in a directory. It takes a pattern and matches it against the entries present in the directory. The entries that match the pattern are listed.

This module provides the iglob() method, which returns an iterator instead of building the whole list in memory. Like the glob() method, it accepts a pattern as a parameter and yields the paths that match it. If you want all the entries in a directory, pass an asterisk (*) as the pattern. The search is recursive only when the pattern contains "**" and recursive=True is passed.

Example

In this example, we list all the entries in the current directory. Since we want every file and subdirectory, we pass an asterisk (*) as the pattern.

import glob

pattern = "*"
for f in glob.iglob(pattern):
   print(f)

Output

The result is produced as follows −

main.py
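For a recursive search, iglob() accepts the "**" pattern together with recursive=True. The following is a hedged sketch; the helper name iter_text_files and the "*.txt" pattern are hypothetical choices for illustration.

```python
import glob
import os

def iter_text_files(directory):
    """Return an iterator over .txt files in directory and its subdirectories."""
    # "**" matches zero or more directory levels, but only when
    # recursive=True is passed to iglob().
    pattern = os.path.join(directory, "**", "*.txt")
    return glob.iglob(pattern, recursive=True)

if __name__ == "__main__":
    for path in iter_text_files("."):
        print(path)
```

Because iglob() is lazy, paths are produced one at a time as the loop asks for them, which helps when a directory tree is large.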

Updated on: 19-Apr-2023
