Finding the largest file in a directory using Python


Finding the largest file is useful in several situations, such as identifying which files to delete or move to free up disk space, or examining the size distribution of files in a directory. This article shows how to find the largest file in a directory with a short Python script.

Algorithm

  • Import the os module.

  • Define a function called find_largest_file that takes a directory as input.

  • Initialize a variable called largest_file to None and a variable called largest_size to 0.

  • Use os.walk to traverse the directory tree recursively, starting from the root directory.

  • For each file encountered, get the file size using os.path.getsize and compare it to the current largest size.

  • If the file size is larger than the current largest size, update the largest size and largest file variables.

  • Return the path of the largest file.

Example 1: Print the Size of all Files in a Directory

import os
directory = "./test"
for root, dirs, files in os.walk(directory):
   for file in files:
      file_path = os.path.join(root, file)
      file_size = os.path.getsize(file_path)
      print(f"{file_path}: {file_size} bytes")

Output

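To run this example, create a folder named "test" and put some files or subfolders in it. The output depends on the files stored in that folder. The sample output below was produced on Windows, which is why the joined paths contain a backslash.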

./test\Krz_Earthwork_Clean.jpg: 291048 bytes
./test\Krz_Earthwork_Folded.jpg: 3081472 bytes
./test\Krz_Earthwork_Xerox.jpg: 5871915 bytes
./test\Krz_EquusOils.jpg: 1374387 bytes

This example uses os.walk to traverse the directory tree recursively, starting from the root directory. For each file encountered, it gets the file size using os.path.getsize and prints the file path and size in bytes.

Example 2: Find the Largest File in a Directory Using max with a Key Function

import os

directory = "./test"
largest_file = max(
   (os.path.join(root, file) for root, dirs, files in os.walk(directory) for file in files),
   key=os.path.getsize
)
print(largest_file)

Output

./test\Krz_Earthwork_Xerox.jpg

This example uses the built-in max function with os.path.getsize as the key function to find the largest file in a directory. No lambda is needed, because os.path.getsize already takes a path and returns its size. os.walk traverses the directory tree recursively, starting from the root directory, and a generator expression yields the full path of each file. That generator is passed to max, which calls the key function on every path and returns the path with the largest size. It does not sort the files; it makes a single pass and keeps the maximum.
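One edge case is worth noting: if the directory contains no files, the generator is empty and max raises a ValueError. The following is a minimal sketch of one way to handle that, using the default argument of max. The function name largest_file_or_none is an illustrative choice, not something from the examples above.

```python
import os

def largest_file_or_none(directory):
    """Return the path of the largest file under directory, or None if it has no files."""
    paths = (
        os.path.join(root, name)
        for root, _dirs, files in os.walk(directory)
        for name in files
    )
    # default=None makes max return None instead of raising ValueError
    # when the generator yields nothing.
    return max(paths, key=os.path.getsize, default=None)

print(largest_file_or_none("./test"))
```

With this version the caller can test the result against None rather than wrapping the call in a try/except block.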

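Example 3: Find the Largest File in a Directory Using a Function

import os

def find_largest_file(directory):
   largest_file = None
   largest_size = 0
   # Walk the directory tree, visiting every file in every subdirectory
   for root, dirs, files in os.walk(directory):
      for file in files:
         file_path = os.path.join(root, file)
         file_size = os.path.getsize(file_path)
         # Update both the size and the path together when a larger file is found
         if file_size > largest_size:
            largest_size = file_size
            largest_file = file_path
   return largest_file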

directory = "./test"
largest_file = find_largest_file(directory)
if largest_file is not None:
   print(f"The largest file in {directory} is: {largest_file}")
else:
   print(f"No files found in {directory}")

Output

The largest file in ./test is: ./test\Krz_Earthwork_Xerox.jpg

This example defines a function called find_largest_file that takes a directory as input. It initializes a variable called largest_file to None and a variable called largest_size to 0. It uses os.walk to traverse the directory tree recursively, starting from the root directory. For each file encountered, it gets the file size using os.path.getsize and compares it to the current largest size. If the file size is larger than the current largest size, it updates the largest size and largest file variables. Finally, it returns the path of the largest file.

The example then calls the find_largest_file function with a directory as input and prints the path of the largest file, if found. If no files are found in the directory, it prints a message indicating that.
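On a real disk, os.path.getsize can raise OSError, for example on a broken symbolic link, a file the user cannot read, or a file that is deleted while the scan is running. The sketch below is one possible variant that skips such files instead of stopping. The name find_largest_file_safe and the choice to start largest_size at -1 (so that a directory containing only empty files still returns a path) are assumptions made for illustration.

```python
import os

def find_largest_file_safe(directory):
    """Like find_largest_file, but skips files whose size cannot be read."""
    largest_file = None
    largest_size = -1  # -1 so that a 0-byte file can still be selected
    for root, _dirs, files in os.walk(directory):
        for name in files:
            file_path = os.path.join(root, name)
            try:
                file_size = os.path.getsize(file_path)
            except OSError:
                # Broken symlink, permission problem, or file removed mid-scan.
                continue
            if file_size > largest_size:
                largest_size = file_size
                largest_file = file_path
    return largest_file

print(find_largest_file_safe("./test"))
```

Whether to skip unreadable files silently or report them depends on the use case; logging the skipped path inside the except block is a reasonable alternative.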

Applications

  • Identifying the largest files on a hard drive to free up space

  • Analyzing the size distribution of files in a directory

  • Automating file management tasks, such as deleting large files or moving them to a different location
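For tasks like these, the single largest file is often less useful than the few largest ones. As a rough sketch, heapq.nlargest from the standard library can return the top n files by size in one pass. The function name largest_files and the default n=5 are hypothetical choices for this illustration.

```python
import heapq
import os

def largest_files(directory, n=5):
    """Return up to n (size_in_bytes, path) tuples, largest first."""
    sizes = (
        (os.path.getsize(path), path)
        for root, _dirs, files in os.walk(directory)
        for path in (os.path.join(root, name) for name in files)
    )
    # Tuples compare by their first element (the size) first.
    return heapq.nlargest(n, sizes)

for size, path in largest_files("./test", n=3):
    print(f"{size:>12} bytes  {path}")
```

The resulting list could then feed a cleanup step, such as reviewing the files before deleting or moving them.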

Conclusion

This article covered how to use Python to find the largest file in a directory. The examples ranged from printing the size of every file, to a one-expression solution using the built-in max function with os.path.getsize as the key, to a complete function built on os.walk and os.path.getsize. It also discussed applications such as finding large files to free up disk space and examining the size distribution of files in a directory.

Updated on: 21-Aug-2023
