How to extract file extension using Python?


A file extension is a suffix added to the name of a computer file, such as .py or .txt. Operating systems like Microsoft Windows treat it as a kind of metadata: it helps the operating system identify the file's type and, to some extent, its intended use.

We may need to extract file extensions in Python. This can be done in several ways.

os.path module

The Python module os.path makes it simple to manipulate file paths. It provides functions for splitting, joining, and inspecting paths. We will use this module to obtain the file extension in Python.

The function splitext() in os.path allows you to separate the root and extension of a specified file path. A tuple made up of the root string and the extension string is the function's output.

Example Using the splitext() method

The function os.path.splitext() returns a tuple with two items: the path including the file name (the root) and the file extension, in that order. Following is an example that extracts the file extension using the os.path module:

# importing the module
import os

# Providing the path (raw string so the backslash is kept literally)
path = r'D:\Work TP.py'

# declaring the variable to get the result
result = os.path.splitext(path)
print('Path:', result[0])
print('Extension:', result[1])

Output

os.path.splitext() has split the path into its root and its extension. Following is the output of the above code:

Path: D:\Work TP
Extension: .py
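
A few edge cases of splitext() are worth knowing. The short sketch below uses made-up file names (illustrative assumptions, not files from this article) to show that only the last extension is split off and that a leading dot is not treated as an extension.

```python
import os

# Hypothetical file names chosen only to illustrate edge cases
multi_dot = os.path.splitext('archive.tar.gz')  # only the last suffix is split off
hidden = os.path.splitext('.bashrc')            # a leading dot belongs to the root
no_ext = os.path.splitext('README')             # no dot, so the extension is empty

print(multi_dot)  # ('archive.tar', '.gz')
print(hidden)     # ('.bashrc', '')
print(no_ext)     # ('README', '')
```

When a file has no extension, the second item is an empty string, so it is safe to test it with a plain if statement.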

pathlib module

pathlib is a Python module that provides classes representing file system paths, with useful methods and attributes for working with them.

Using a path string as a parameter, pathlib.Path() creates a new Path object.

The suffix attribute of a pathlib.Path object returns the file extension, including the leading dot.

In addition to the extension, the parent and name attributes of the Path object give the parent directory and the file name of the provided path.

Example

Following is an example that extracts the file extension using the pathlib module:

import pathlib

# raw string so the backslash is kept literally
path = pathlib.Path(r'D:\Work TP.py')
print('Parent:', path.parent)
print('NameOfFile:', path.name)
print('Extension:', path.suffix)

Output

Following is the output of the above code when run on Windows:

Parent: D:\
NameOfFile: Work TP.py
Extension: .py
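
Path objects also expose the related attributes stem and suffixes, which help when a file name contains more than one dot. The sketch below is only an illustration: the path is hypothetical, and PureWindowsPath is used instead of Path so that the Windows-style path is parsed the same way on any operating system.

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system
# without touching the file system; the file name here is hypothetical.
path = PureWindowsPath(r'D:\backups\archive.tar.gz')

print('Stem:', path.stem)          # archive.tar
print('Suffix:', path.suffix)      # .gz
print('Suffixes:', path.suffixes)  # ['.tar', '.gz']
```

As with splitext(), suffix reports only the last extension; suffixes returns all of them as a list.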

Extracting just the extension suffix (without dot)

If you want to remove the dot and extract just the extension suffix, such as py, txt, or docx, add “[1:]” after result[1] when working with the splitext() function:

print('Extension:', result[1][1:])

Similarly, when working with pathlib.Path(), add “[1:]” after path.suffix:

print('Extension:', path.suffix[1:])

Example

The following program demonstrates how to print just the suffixes using both of the methods discussed above:

# importing the modules
import os
import pathlib

path = 'D:/test.txt'
result = os.path.splitext(path)
print('Extension:', result[1][1:])
print('Extension:', pathlib.Path('D:/test.txt').suffix[1:])

Output

Extension: txt
Extension: txt
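
If the dot-free extension is needed in several places, the slicing can be wrapped in a small function. The following is a minimal sketch: get_extension is a hypothetical helper name, and lowercasing the result is an added assumption that makes comparisons such as "is this a docx file?" case-insensitive.

```python
import os

def get_extension(path):
    """Return the extension of path without the leading dot, in lowercase.

    get_extension is a hypothetical helper name used for illustration.
    """
    extension = os.path.splitext(path)[1]
    # Slicing an empty string is safe, so files without an extension yield ''
    return extension[1:].lower()

print(get_extension('D:/test.txt'))     # txt
print(get_extension('D:/Report.DOCX'))  # docx
print(get_extension('D:/README'))       # prints an empty line
```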

Updated on: 17-Aug-2022
