How to calculate a directory size using Python?
A directory is a collection of files and subdirectories; it may contain both, only one of the two, or nothing at all. In a path, each level of the directory hierarchy is separated by a “/” character (or “\” on Windows).
A directory hierarchy is constructed by organizing all the files and subdirectories within a main directory, also known as “root” directory. When the size of a directory is to be calculated, we will consider it as a root directory and calculate the individual sizes of all the files and subdirectories (if any) present in it.
Hence, to get the size of a directory we must walk through the hierarchy to get the sizes of all the files in it. Python provides several ways to do it.
Using os.path.getsize() method
Using os.stat().st_size property
Using du command in *NIX OSes
Let us discuss each of these methods in detail.
Using os.path.getsize() Method
The os.path.getsize() method retrieves the size of a single file. To get the total size of a directory, we add up the sizes of all the files in it, using the os.walk() method to walk through the directory hierarchy.
The os.path.getsize() method accepts a file path as its argument and returns the size of that file in bytes. It raises OSError if the file does not exist or is inaccessible.
Example
Let us see an example to calculate the size of a local directory. Here, using loop statements we are walking through a directory hierarchy with the help of the os.walk() method. Then, the path of each file within this directory is retrieved using the os.path.join() method, which is then passed as an argument to the os.path.getsize() method. The sizes of all the files are then added and displayed.
import os

total_size = 0
start_path = '.'  # To get size of current directory

# Walk the whole hierarchy and add up the size of every file
for path, dirs, files in os.walk(start_path):
    for f in files:
        fp = os.path.join(path, f)
        total_size += os.path.getsize(fp)

print("Directory size: " + str(total_size))
Output
If we execute the program above, the output is produced as follows. Note that the output varies from directory to directory.
Directory size: 260
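The same walk can be wrapped in a reusable function. The following is only a sketch: the function name directory_size is hypothetical, and skipping files that raise OSError (for example a broken symbolic link or a file deleted during the walk) is an assumption about the desired behaviour, not something the example above does.

```python
import os

def directory_size(start_path="."):
    """Return the total size in bytes of the files under start_path.

    directory_size is a hypothetical helper name used for illustration.
    """
    total_size = 0
    for path, dirs, files in os.walk(start_path):
        for name in files:
            fp = os.path.join(path, name)
            try:
                total_size += os.path.getsize(fp)
            except OSError:
                # Assumption: unreadable or vanished files are skipped
                continue
    return total_size

print("Directory size:", directory_size("."))
```

Returning the total instead of printing it lets the caller reuse the value, for instance to compare two directories.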
Instead of the os.walk() method, we can also use the os.scandir() method or the os.listdir() method to list the files and retrieve their individual sizes. Unlike os.walk(), both of them list only the entries of a single directory.
Let us look at some examples below −
Example
In this example, we are using the os.scandir() method to scan the current directory and get the sizes of the files directly inside it. The sizes are added together to retrieve the total. Files inside subdirectories are not included, because the directory is not scanned recursively here.
import os

total_size = 0
start_path = '.'  # To get size of current directory

with os.scandir(start_path) as d:
    for f in d:
        if f.is_file():
            # f.path is the entry's path, already joined with start_path
            total_size += os.path.getsize(f.path)

print("Directory size: " + str(total_size))
Output
The output for the program above is as follows −
Directory size: 278
Example
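Here, let us use the os.listdir() method instead of the os.scandir() method. Since os.listdir() returns names only, each name is joined with the directory path before its size is read.
import os

total_size = 0
start_path = '.'  # To get size of current directory

for f in os.listdir(start_path):
    f = os.path.join(start_path, f)
    # Count regular files only; skip subdirectory entries
    if os.path.isfile(f):
        total_size += os.path.getsize(f)

print("Directory size: " + str(total_size))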
Output
Let us run the program above to produce the output as follows −
Directory size: 226
Using os.stat().st_size Property
Another way to retrieve the size of a file is the st_size attribute of the result returned by os.stat(). The os.stat() method returns several pieces of file-related information, including the size in bytes. Since we only need the size of a file, we use the st_size attribute only.
Example
In the example below, we import the pathlib module and use the glob() method to list all the entries in the current directory. For every entry that is a file, the size is read from the st_size attribute returned by the stat() method of the Path object and added to the total. Since the pattern "*" matches only one level, files in subdirectories are not counted.
from pathlib import Path

root_directory = Path('.')
size = 0

# glob("*") lists the entries directly inside root_directory
for f in root_directory.glob("*"):
    if f.is_file():
        sm = f.stat().st_size
        size = sm + size

print("Size of current directory:", size)
Output
If we execute the program above, the result is produced as follows −
Size of current directory: 209
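To include files in subdirectories with pathlib, the rglob() method can be used in place of glob(). This is a minimal sketch, not part of the example above; the function name tree_size is hypothetical, and skipping symbolic links is an assumption about what should be counted.

```python
from pathlib import Path

def tree_size(root):
    """Sum st_size for every regular file below root (hypothetical helper)."""
    total = 0
    # rglob("*") yields entries at every depth below root
    for p in Path(root).rglob("*"):
        # Assumption: symbolic links are not counted
        if p.is_file() and not p.is_symlink():
            total += p.stat().st_size
    return total

print("Size of directory tree:", tree_size("."))
```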
Example
We can also use the os.scandir() method to list all the files in the directory instead of the glob() method. In the example below, the function calls itself for every subdirectory, so the sizes of nested files are included in the total.
import os

def get_dir_size(path):
    total = 0
    with os.scandir(path) as d:
        for f in d:
            if f.is_file():
                total += f.stat().st_size
            elif f.is_dir():
                # Recurse into the subdirectory
                total += get_dir_size(f.path)
    return total

print("The size of current directory", get_dir_size('.'))
Output
The output for the given program above is displayed as follows −
The size of current directory 303
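By default, the is_file(), is_dir() and stat() methods of a directory entry follow symbolic links, so a link that points back to a parent directory could make a recursive function loop. The variant below is a sketch that passes follow_symlinks=False to each call; the function name get_dir_size_no_links is hypothetical, and ignoring links entirely is an assumption about the desired result.

```python
import os

def get_dir_size_no_links(path):
    """Recursive size in bytes, ignoring symbolic links (hypothetical helper)."""
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
            elif entry.is_dir(follow_symlinks=False):
                # Only real directories are entered, never linked ones
                total += get_dir_size_no_links(entry.path)
    return total

print("The size of current directory", get_dir_size_no_links('.'))
```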
Using du Command in *NIX OSes
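If you are on a *NIX OS, you can simply call the du command through the subprocess module instead of walking the directory yourself. Note that du reports the disk space the files occupy rather than the sum of their sizes, so its result can differ from the values computed by the methods above.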
Example
The size of the current directory in a *NIX OS can be calculated as shown in the example below. The -s option prints a single total for the directory, and -h prints it in human-readable units.
import subprocess

path = '.'

# The first field of du's output is the size; the second is the path
size = subprocess.check_output(['du', '-sh', path]).split()[0].decode('utf-8')

print("Directory size: " + size)
Output
The size of the current directory will be returned as follows. However, the output differs for different directories.
Directory size: 8.0K
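When the result is needed as a number rather than a string such as 8.0K, the -k option can be used instead of -h, so that du reports the total in 1024-byte units. The following is a sketch under that assumption; the function name du_kilobytes is hypothetical, and the check with shutil.which() is added only to fail clearly on systems without du.

```python
import shutil
import subprocess

def du_kilobytes(path="."):
    """Return the disk usage of path in 1024-byte units (hypothetical helper)."""
    if shutil.which("du") is None:
        raise RuntimeError("du is not available on this system")
    # -s prints one total for path, -k reports it in kilobytes
    output = subprocess.check_output(["du", "-sk", path], text=True)
    return int(output.split()[0])

print("Directory size (KB):", du_kilobytes("."))
```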