How to extract file extension using Python?

Sometimes we need to extract the extension of a file to perform operations based on its type, such as validating image formats or filtering document files. Python provides several ways to do this using the os and pathlib modules, as well as plain string methods. In this article, we'll explore how to get a file's extension with each approach.

Using os.path.splitext()

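The os.path.splitext() function of the os.path module splits a path into a root and an extension. It returns a tuple containing the path without the extension and the extension itself (including the dot). Only the last dot-separated part counts as the extension, and leading dots in the file name are ignored, so a name such as .bashrc is treated as having no extension.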

Example

In this example, we use the os.path.splitext() function to get the extension of the given file:

import os

filename = "report.pdf"
name, extension = os.path.splitext(filename)

print("Filename:", name)
print("Extension:", extension)

The output of the above code is:

Filename: report
Extension: .pdf
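
To see how os.path.splitext() treats less tidy names, the following sketch runs it over a few file names. The names are hypothetical and chosen only to illustrate multiple dots, a missing extension, a leading dot, and an uppercase extension.

```python
import os

# Hypothetical file names chosen to illustrate edge cases
samples = ["report.pdf", "archive.tar.gz", "README", ".bashrc", "photo.JPG"]

# Map each file name to the (root, extension) tuple returned by splitext()
results = {filename: os.path.splitext(filename) for filename in samples}

for filename, (name, extension) in results.items():
    # repr() makes an empty extension visible as ''
    print(filename, "->", repr(name), repr(extension))
```

Only the last suffix is returned for archive.tar.gz, and both README and .bashrc yield an empty extension. The case of the extension is preserved, so a comparison against a lowercase list needs an explicit lower() call.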

Using pathlib.Path.suffix

The pathlib module provides a more object-oriented way to work with filesystem paths. The Path.suffix attribute returns the last extension of the file, including the dot, and Path.stem returns the final path component without that extension.

Example

Here is an example using pathlib.Path.suffix to extract the extension of a file:

from pathlib import Path

file_path = Path("image.jpeg")
extension = file_path.suffix
filename = file_path.stem

print("Filename:", filename)
print("Extension:", extension)

Following is the output of the above program:

Filename: image
Extension: .jpeg
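
For names with more than one extension, pathlib also offers the Path.suffixes attribute, which returns every suffix as a list. The sketch below is a minimal illustration; the path backups/archive.tar.gz is a made-up example, and joining all suffixes is only sensible when every dot in the name really separates an extension.

```python
from pathlib import Path

# Hypothetical path used only for illustration
file_path = Path("backups/archive.tar.gz")

last_suffix = file_path.suffix            # only the final suffix
all_suffixes = file_path.suffixes         # every suffix, in order
full_extension = "".join(all_suffixes)    # combined multi-part extension

# Strip the combined extension from the file name (guard against no suffix)
if full_extension:
    base_name = file_path.name[: -len(full_extension)]
else:
    base_name = file_path.name

print("Last suffix:", last_suffix)
print("All suffixes:", all_suffixes)
print("Full extension:", full_extension)
print("Base name:", base_name)
```

Note that Path.suffixes splits on every dot, so a name like my.report.v2.pdf would give ['.report', '.v2', '.pdf']. Use it only when the file naming scheme is known.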

Using the String split() Method

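We can also use the split() method of Python strings to extract the extension manually. This approach is simple but does not handle all edge cases. For a name without any dot, such as README, split(".") returns a single-element list, so parts[-1] is the whole file name rather than an extension. Hidden files such as .bashrc and directory names that contain dots are also misinterpreted.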

Example

Below is an example where we extract the extension using the split() method:

filename = "archive.tar.gz"
parts = filename.split(".")
extension = parts[-1]

print("Filename:", ".".join(parts[:-1]))
print("Extension:", extension)

Here is the output of the above program:

Filename: archive.tar
Extension: gz

This method only gives the part after the last dot and does not include the dot in the extension.
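
If a string-only approach is still preferred, one way to guard against names without a dot is str.rpartition(), which splits on the last dot and reports whether a dot was found. The helper name split_extension below is hypothetical, and the sketch assumes it receives a bare file name rather than a full path.

```python
def split_extension(filename):
    """Return (name, extension) with no dot; extension is '' if absent.

    Hypothetical helper: expects a bare file name, not a directory path.
    """
    name, dot, extension = filename.rpartition(".")
    # No dot found, or the only dot is the leading one (e.g. ".bashrc"):
    # treat the whole string as the name with no extension.
    if not dot or not name:
        return filename, ""
    return name, extension


for sample in ["archive.tar.gz", "README", ".bashrc"]:
    print(sample, "->", split_extension(sample))
```

This mirrors the behavior of os.path.splitext() for these cases, apart from omitting the dot, but it remains a manual workaround rather than a replacement for the standard library functions.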

Extracting Extension Without the Dot

If you want to extract just the extension suffix without the dot (such as py, txt, docx), you can use string slicing to remove the first character.

Example

The following program demonstrates how to extract extensions without the dot using both methods:

import os
from pathlib import Path

# Using os.path.splitext()
filename = "document.txt"
name, extension = os.path.splitext(filename)
print("Using os.path.splitext():", extension[1:])

# Using pathlib.Path.suffix
file_path = Path("document.txt")
print("Using pathlib.Path.suffix:", file_path.suffix[1:])

The output of the above code is:

Using os.path.splitext(): txt
Using pathlib.Path.suffix: txt
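
For a task such as validating image formats, it often helps to normalize the extension before comparing it. The following is a minimal sketch under stated assumptions: the helper names and the ALLOWED_IMAGE_EXTENSIONS set are hypothetical, and it uses lstrip(".") in place of slicing to remove the dot.

```python
from pathlib import Path

# Hypothetical whitelist, for illustration only
ALLOWED_IMAGE_EXTENSIONS = {"jpg", "jpeg", "png", "gif"}


def normalized_extension(filename):
    """Return the last extension in lowercase without the dot ('' if none)."""
    return Path(filename).suffix.lower().lstrip(".")


def is_allowed_image(filename):
    """Check the normalized extension against the whitelist."""
    return normalized_extension(filename) in ALLOWED_IMAGE_EXTENSIONS


print(normalized_extension("photo.JPG"))
print(is_allowed_image("photo.JPG"))
print(is_allowed_image("document.txt"))
print(is_allowed_image("README"))
```

A check like this looks only at the file name. It says nothing about the actual contents of the file.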

Comparison

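| Method | Includes Dot? | Best For | Handles Complex Extensions? |
|---|---|---|---|
| os.path.splitext() | Yes | Traditional file handling | Returns the last suffix only (.gz for archive.tar.gz); handles dotfiles and names without an extension |
| pathlib.Path.suffix | Yes | Modern object-oriented approach | Returns the last suffix only; Path.suffixes lists all suffixes |
| split() | No | Simple cases | Limited; fails on names without a dot |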

Conclusion

Use pathlib.Path.suffix for modern Python code, as it is more readable and object-oriented. Use os.path.splitext() for compatibility with older code. Both return only the last suffix, so a name like archive.tar.gz yields .gz. The split() method works for simple cases but gives wrong results for file names without a dot and for hidden files.

Updated on: 2026-03-24T18:33:38+05:30
