Unix style pathname pattern expansion in Python (glob)


A program often needs to iterate over files in the file system whose names match a pattern. The glob module is useful for building a list of files in a specific directory that have a certain extension or contain a certain string in their names.

The pattern matching used by the glob module's functions follows Unix shell path expansion rules. Unlike the shell, however, the module does not expand tilde (~) or shell variables.
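
Because glob leaves ~ and shell variables untouched, one common workaround is to expand them with os.path.expanduser() and os.path.expandvars() before matching. The sketch below is only an illustration: the helper expand_and_glob() and the variable name GLOB_DEMO_DIR are made up for this example and are not part of the glob module.

```python
import glob
import os
import tempfile


def expand_and_glob(pattern):
    """Expand ~ and environment variables, then pass the result to glob.glob()."""
    return glob.glob(os.path.expandvars(os.path.expanduser(pattern)))


with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "a.txt"), "w").close()
    # GLOB_DEMO_DIR is a hypothetical variable name used only for this demo.
    os.environ["GLOB_DEMO_DIR"] = tmp

    # glob treats "$GLOB_DEMO_DIR" as a literal directory name.
    raw = glob.glob("$GLOB_DEMO_DIR/*.txt")
    # After expansion the pattern points at the temporary directory.
    expanded = expand_and_glob("$GLOB_DEMO_DIR/*.txt")
    expanded_names = [os.path.basename(p) for p in expanded]

print(raw)
print(expanded_names)
```

A pattern such as "~/*.py" can be passed to the same helper, since os.path.expanduser() replaces the leading tilde with the home directory.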

The glob module has three main functions.

glob()

This function returns a list of path names that match the pattern given in the pathname parameter. The pathname can be absolute or relative. It can also include wildcards such as * and ?, as well as character ranges written with [].

The recursive parameter of this function is False by default. If True, the pattern ** matches any files and zero or more directories and subdirectories, so the search descends recursively from that point in the path. Without ** in the pattern, the parameter has no effect.

The following code prints all files in the current directory with the ‘.py’ extension.

>>> import glob
>>> for file in glob.glob("*.py"):
...     print(file)

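In the following code, recursive=True combined with the ** pattern causes files with the ‘.py’ extension in subdirectories to be printed as well. The ** is required: with a plain "*.py" pattern, recursive=True changes nothing.

>>> for file in glob.glob("**/*.py", recursive=True):
...     print(file)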
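
The effect of recursive depends on ** appearing in the pattern. The following sketch builds a throwaway directory tree (the file and directory names are invented for illustration) and compares three calls, to show when subdirectories are actually searched.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: one script at the top level, one in a subdirectory.
    os.mkdir(os.path.join(root, "sub"))
    for rel in ("top.py", os.path.join("sub", "nested.py")):
        open(os.path.join(root, rel), "w").close()

    def names(pattern, recursive):
        """Return sorted matches for pattern, relative to the temporary root."""
        hits = glob.glob(os.path.join(root, pattern), recursive=recursive)
        return sorted(os.path.relpath(h, root) for h in hits)

    flat = names("*.py", recursive=True)       # no "**": the flag has no effect
    star = names("**/*.py", recursive=False)   # "**" behaves like a single "*"
    deep = names("**/*.py", recursive=True)    # "**" spans zero or more directories

print(flat)
print(star)
print(deep)
```

Only the last call should report both top.py and the nested file; the first sees just the top level, and the second sees just one level down.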

The pattern can include the wildcard character ?, which matches exactly one character. The following statement prints files whose names are three characters long and begin with ‘pp’.

>>> for file in glob.glob("pp?.py"):
...     print(file)

The following code prints files whose names end with a digit.

>>> for file in glob.glob('*[0-9].py'):
...     print(file)
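
The three wildcard forms used so far (*, ? and [0-9]) can be compared side by side. This is a minimal sketch that assumes a temporary directory with a few made-up file names; match_in_dir() is a helper written for this example, not a glob function.

```python
import glob
import os
import tempfile

PATTERNS = ("*.py", "pp?.py", "*[0-9].py")


def match_in_dir(directory, pattern):
    """Return the sorted base names in directory that match pattern."""
    hits = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(h) for h in hits)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files chosen so that each pattern selects a different subset.
    for name in ("pp.py", "pp1.py", "ppx.py", "report2.py", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    results = {p: match_in_dir(tmp, p) for p in PATTERNS}

for pattern, found in results.items():
    print(pattern, "->", found)
```

The results are sorted explicitly because glob() makes no promise about the order in which matches are returned.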

The following call returns every ‘.py’ file in the current directory and all of its subdirectories as a list.

>>> glob.glob('**/*.py', recursive=True)

To list directories recursively under the tcl subdirectory of the current directory, end the pattern with a path separator so that only directories match:

>>> glob.glob('tcl/**/', recursive=True)
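
A trailing separator after ** restricts the matches to directories. The sketch below recreates a small tcl tree in a temporary directory (the subdirectory and file names are assumptions made for the demo) to show that regular files are left out.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical tree: tcl/tk/images plus one regular file inside tcl.
    os.makedirs(os.path.join(root, "tcl", "tk", "images"))
    open(os.path.join(root, "tcl", "init.tcl"), "w").close()

    # Joining with "" leaves a trailing separator, so only directories match.
    pattern = os.path.join(root, "tcl", "**", "")
    found = glob.glob(pattern, recursive=True)
    dirs = sorted(os.path.relpath(d, root) for d in found)

print(dirs)
```

Since ** also matches zero directories, tcl itself appears in the result alongside its subdirectories, while init.tcl does not.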

iglob()

This function returns an iterator that yields the same path names as glob(), without storing them all in a list at once. Using the next() function, successive file names can be printed as below.

>>> it=glob.iglob('*.py')
>>> type(it)
<class 'generator'>
>>> while True:
...     try:
...         file = next(it)
...         print(file)
...     except StopIteration:
...         break
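
In everyday code the same iterator is usually consumed with a for loop, which handles StopIteration on its own, or cut short with itertools.islice(). The following is a minimal sketch under that assumption; first_matches() and the script file names are invented for the example.

```python
import glob
import itertools
import os
import tempfile


def first_matches(pattern, limit):
    """Return at most `limit` matches, stopping the iterator early."""
    return list(itertools.islice(glob.iglob(pattern), limit))


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical files: script0.py ... script4.py
    for i in range(5):
        open(os.path.join(tmp, f"script{i}.py"), "w").close()

    pattern = os.path.join(tmp, "*.py")

    # A for loop consumes the iterator and ends when it is exhausted.
    all_names = sorted(os.path.basename(p) for p in glob.iglob(pattern))

    # islice() stops after two items instead of collecting every match.
    sample = first_matches(pattern, 2)

print(all_names)
print(len(sample))
```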

escape()

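This function escapes all special characters (‘?’, ‘*’ and ‘[’) in the given path name. This is useful when a literal string that may contain these characters has to be matched as part of a file name. The following example finds files whose names end, just before the ‘.py’ extension, with one of the characters in the chars string.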

>>> chars='[]()#'
>>> for char in chars:
...     esc = '*' + glob.escape(char) + '.py'
...     for file in glob.glob(esc):
...         print(file)
...
xyz[.py
pp[].py
pp(.py
pp#.py
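
The need for escape() is easiest to see with a file name that contains brackets. In this sketch, which uses two made-up file names in a temporary directory, the unescaped name is read as a pattern and finds a different file than the one intended.

```python
import glob
import os
import tempfile

literal = "data[1].py"   # hypothetical file name containing special characters

with tempfile.TemporaryDirectory() as tmp:
    for name in (literal, "data1.py"):
        open(os.path.join(tmp, name), "w").close()

    # Unescaped, "[1]" is a character class that matches the single character "1".
    unescaped = [os.path.basename(p)
                 for p in glob.glob(os.path.join(tmp, literal))]

    # Escaped, the brackets are matched literally.
    escaped_pattern = os.path.join(glob.escape(tmp), glob.escape(literal))
    escaped = [os.path.basename(p) for p in glob.glob(escaped_pattern)]

print(glob.escape(literal))
print(unescaped)
print(escaped)
```

Escaping the directory part as well guards against special characters that might appear in the path leading to the file.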

Updated on: 25-Jun-2020
