How to write a Python regular expression to match a file extension?

Using Python, we can easily search for specific types of files from a mixed list of file names using regular expressions. For this, we need to import Python's built-in module called re using the import keyword.

A regular expression (regex) is a sequence of characters, including metacharacters such as \, *, and ^, that defines a search pattern for a string or a set of strings. It can detect the presence or absence of text that fits the pattern, and it can also split a string into one or more substrings.
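As a minimal sketch of these two uses (detecting a pattern and splitting a string), the snippet below works on a made-up string of file names. The variable names and the sample text are illustrative assumptions, not part of the original article.

```python
import re

# Sample text invented for illustration only.
text = "report.txt, photo.jpg; notes.md"

# Detect whether ".jpg" appears anywhere in the text.
has_jpg = re.search(r"\.jpg", text) is not None

# Split on a comma or semicolon followed by optional whitespace.
parts = re.split(r"[,;]\s*", text)

print(has_jpg)  # True
print(parts)    # ['report.txt', 'photo.jpg', 'notes.md']
```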

Importing the Regex Module

The following statement imports the re module:

import re

Usage of 're.search()' Function

The re.search() function scans a string and returns the first location where the pattern matches. If the pattern is not found in the string, it returns None. If the pattern is found, it returns a match object whose printed form shows the following information:

  • Match object: This indicates that the search() function has found a match. The match object stores details about the match.
  • span: This is the start index and the (exclusive) end index of the matched characters. It is also available through the span() method.
  • match: This is the actual substring that was matched in the search. It is also available through the group() method.
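The details stored in a match object can be read through its methods. The sketch below assumes the same sample string used later in this article and shows span(), start(), end(), and group(); the variable names are illustrative.

```python
import re

res = re.search(r"o", "Welcome to Tutorialspoint")

# re.search() returns None when nothing matches, so check before using it.
if res is not None:
    start, end = res.span()     # start index and exclusive end index
    matched_text = res.group()  # the substring that matched
    print(start, end)           # 4 5
    print(matched_text)         # o
    print(res.start(), res.end())  # 4 5
```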

Syntax

re.search(pattern, string, flags=0)

Parameters

  • pattern: This is the regular expression to be matched.
  • string: This is the string to be searched; the pattern may match at any position in it.
  • flags: This is an optional parameter that modifies matching behavior (for example, re.IGNORECASE); multiple flags can be combined using bitwise OR (|).
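To illustrate the flags parameter, the following sketch uses re.IGNORECASE so that upper-case extensions also match, and then combines it with re.MULTILINE using bitwise OR. The file names and variable names are assumptions made up for this example.

```python
import re

# Hypothetical file names with mixed-case extensions.
filenames = ["REPORT.TXT", "notes.txt", "photo.JPG"]

# re.IGNORECASE lets ".txt" match ".TXT" as well.
txt_files = [
    name for name in filenames
    if re.search(r"\.txt$", name, flags=re.IGNORECASE)
]
print(txt_files)  # ['REPORT.TXT', 'notes.txt']

# Flags are combined with |. With re.MULTILINE, $ matches at the end of each line.
listing = "a.TXT\nb.txt\nc.jpg"
found = re.findall(r"\.txt$", listing, flags=re.IGNORECASE | re.MULTILINE)
print(found)  # ['.TXT', '.txt']
```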

Example

Following is a basic example of the re.search() function:

import re

s = "Welcome to Tutorialspoint"
res = re.search(r"o", s)
print(res)

Output

<re.Match object; span=(4, 5), match='o'>

Regular Expression to Match a File Extension

To match file extensions, we use the dollar symbol ($), a metacharacter that matches the end of a string. Placing it after the extension checks whether the string ends with the specified characters. The dot before the extension must be escaped as \. because an unescaped dot matches any character.
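The short sketch below shows why the dot is usually escaped in such a pattern, and how $ treats a trailing newline. The sample strings are invented for illustration; \Z is Python's anchor for the absolute end of the string.

```python
import re

# Unescaped dot: "." matches any character, so "_txt" is accepted by mistake.
loose = re.search(r".txt$", "report_txt")
# Escaped dot: only a literal "." before "txt" is accepted.
strict = re.search(r"\.txt$", "report_txt")

print(loose is not None)   # True
print(strict is not None)  # False

# "$" also matches just before a trailing newline; "\Z" does not.
dollar_hit = re.search(r"\.txt$", "notes.txt\n") is not None
z_hit = re.search(r"\.txt\Z", "notes.txt\n") is not None
print(dollar_hit)  # True
print(z_hit)       # False
```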

Matching Single File Extension

In the following example, we have a list of file names with various extensions. Using the re module, we check each name and print the ones that end with the .txt extension:

# import the regex module
import re

# list of file names with different extensions
filenames = ["tp.html", "tutorial.xml", "tutorialspoint.txt", "tutorials_point.jpg"]

for file in filenames:
    # search for the .txt extension at the end of the file name
    match = re.search(r"\.txt$", file)
    # if a match is found
    if match:
        print("The file ending with .txt is -", file)

Output

The file ending with .txt is - tutorialspoint.txt
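When the same pattern is applied to many names, one possible variation is to compile it once with re.compile() and collect the matches in a list. This is a sketch rather than the only way to do it; txt_pattern and txt_files are hypothetical names.

```python
import re

# Compile the pattern once and reuse it for every file name.
txt_pattern = re.compile(r"\.txt$")

filenames = ["tp.html", "tutorial.xml", "tutorialspoint.txt", "tutorials_point.jpg"]

# Keep only the names where the pattern matches.
txt_files = [name for name in filenames if txt_pattern.search(name)]
print(txt_files)  # ['tutorialspoint.txt']
```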

Matching Multiple File Extensions

You can also match multiple file extensions by grouping them with the alternation (OR) operator (|) in regex:

import re

filenames = ["document.pdf", "image.jpg", "script.py", "data.csv", "page.html"]

# Match files ending with .py, .csv, or .pdf
for file in filenames:
    match = re.search(r"\.(py|csv|pdf)$", file)
    if match:
        print(f"Found {match.group(1).upper()} file: {file}")

Output

Found PDF file: document.pdf
Found PY file: script.py
Found CSV file: data.csv
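If the extensions come from a list rather than being typed into the pattern, the alternation can be built programmatically. The sketch below assumes a hypothetical extensions list and uses re.escape() so that any special characters in an extension are treated literally.

```python
import re

# Hypothetical list of extensions to look for (without the leading dot).
extensions = ["py", "csv", "pdf"]

# Join the escaped extensions with | to form the alternation group.
alternation = "|".join(re.escape(ext) for ext in extensions)
pattern = re.compile(rf"\.({alternation})$")

filenames = ["document.pdf", "image.jpg", "script.py", "data.csv", "page.html"]
matched = [name for name in filenames if pattern.search(name)]

print(pattern.pattern)  # \.(py|csv|pdf)$
print(matched)          # ['document.pdf', 'script.py', 'data.csv']
```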

Using re.findall() for All Matches

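To extract all file extensions from a list, you can join the names into a single string and use re.findall(). Because the pattern is not anchored with $, it captures every run of letters and digits that follows a dot: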

import re

filenames = ["report.docx", "image.png", "script.py", "data.xlsx"]
filename_string = " ".join(filenames)

# Find all file extensions
extensions = re.findall(r"\.([a-zA-Z0-9]+)", filename_string)
print("File extensions found:", extensions)

Output

File extensions found: ['docx', 'png', 'py', 'xlsx']
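A name with more than one dot, such as archive.tar.gz, would contribute two entries to the findall() result above. If only the final extension of each file is wanted, one option is to anchor the pattern with $ and search each name separately. The sketch below assumes made-up file names; last_ext and result are hypothetical names.

```python
import re

# Hypothetical names, including one with two dots and one with no extension.
filenames = ["report.docx", "archive.tar.gz", "README", "script.py"]

# Unanchored findall() returns every dot-prefixed segment.
all_parts = re.findall(r"\.([a-zA-Z0-9]+)", "archive.tar.gz")
print(all_parts)  # ['tar', 'gz']

# Anchored search returns only the last extension, or None if there is none.
last_ext = re.compile(r"\.([A-Za-z0-9]+)$")
result = {}
for name in filenames:
    m = last_ext.search(name)
    result[name] = m.group(1) if m else None

print(result)
# {'report.docx': 'docx', 'archive.tar.gz': 'gz', 'README': None, 'script.py': 'py'}
```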

Conclusion

Use the $ metacharacter with re.search() to match file extensions at the end of strings. The pattern r"\.ext$" matches files ending with ".ext", while the OR operator | allows matching multiple extensions in a single pattern.

Updated on: 2026-03-24T19:15:53+05:30