Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
How to Create a list of files, folders, and subfolders in Excel using Python?
Python is a great programming language widely used for various data manipulation tasks. When working with files and folders, it can be useful to generate a list of all the files, folders, and subfolders within a directory. Excel, on the other hand, is a popular spreadsheet application that allows users to organize and analyze data. In this detailed article, we will explore step?by?step how to use Python to create a comprehensive list of files, folders, and subfolders in Excel, providing a convenient way to manage and analyze file structures.
Prerequisites
To follow along with this tutorial, you will need to have Python installed on your computer, as well as the pandas library, which is commonly used for data manipulation tasks in Python. Additionally, a basic understanding of Python syntax and file operations will be helpful.
Install pandas using:
# Install pandas if not already installed # pip install pandas openpyxl
Complete Solution
Here's a complete script that creates a comprehensive list of files, folders, and subfolders in Excel:
import os
import pandas as pd
def create_file_list_excel(directory_path, output_filename="file_list.xlsx"):
"""
Create an Excel file with list of files, folders, and subfolders
"""
data_list = []
# Walk through directory structure
for root, dirs, files in os.walk(directory_path):
# Add directories
for dir_name in dirs:
dir_path = os.path.join(root, dir_name)
data_list.append({
'Type': 'Folder',
'Name': dir_name,
'Full Path': dir_path,
'Size (bytes)': 'N/A'
})
# Add files
for file_name in files:
file_path = os.path.join(root, file_name)
try:
file_size = os.path.getsize(file_path)
except (OSError, IOError):
file_size = 'N/A'
data_list.append({
'Type': 'File',
'Name': file_name,
'Full Path': file_path,
'Size (bytes)': file_size
})
# Create DataFrame and save to Excel
df = pd.DataFrame(data_list)
df.to_excel(output_filename, index=False)
print(f"Excel file '{output_filename}' created successfully!")
print(f"Total items: {len(data_list)}")
return df
# Example usage - using current directory
current_dir = "."
result_df = create_file_list_excel(current_dir, "directory_listing.xlsx")
# Display first 5 rows
print("\nFirst 5 entries:")
print(result_df.head())
Excel file 'directory_listing.xlsx' created successfully! Total items: 8
Step-by-Step Breakdown
Step 1: Importing Required Libraries
We import os for file system operations and pandas for Excel file creation:
import os import pandas as pd
Step 2: Walking Through Directory Structure
The os.walk() function traverses the directory tree and returns root directory, subdirectories, and files:
# Simple example showing os.walk() structure
directory_path = "."
for root, dirs, files in os.walk(directory_path):
print(f"Directory: {root}")
print(f"Subdirectories: {dirs}")
print(f"Files: {files}")
print("-" * 40)
break # Show only first iteration
Step 3: Enhanced Version with File Metadata
Here's an enhanced version that includes file size, modification date, and better organization:
import os
import pandas as pd
from datetime import datetime
def create_detailed_file_list(directory_path, output_file="detailed_file_list.xlsx"):
"""Create detailed Excel file with file metadata"""
entries = []
for root, dirs, files in os.walk(directory_path):
# Process directories
for dir_name in dirs:
dir_path = os.path.join(root, dir_name)
try:
mod_time = datetime.fromtimestamp(os.path.getmtime(dir_path))
entries.append({
'Type': 'Folder',
'Name': dir_name,
'Full Path': dir_path,
'Size (bytes)': 'N/A',
'Modified Date': mod_time.strftime('%Y-%m-%d %H:%M:%S'),
'Extension': 'N/A'
})
except (OSError, IOError):
continue
# Process files
for file_name in files:
file_path = os.path.join(root, file_name)
try:
file_size = os.path.getsize(file_path)
mod_time = datetime.fromtimestamp(os.path.getmtime(file_path))
_, extension = os.path.splitext(file_name)
entries.append({
'Type': 'File',
'Name': file_name,
'Full Path': file_path,
'Size (bytes)': file_size,
'Modified Date': mod_time.strftime('%Y-%m-%d %H:%M:%S'),
'Extension': extension.lower() if extension else 'No Extension'
})
except (OSError, IOError):
continue
# Create DataFrame and save
df = pd.DataFrame(entries)
df = df.sort_values(['Type', 'Name']) # Sort by type, then name
df.to_excel(output_file, index=False)
print(f"Created '{output_file}' with {len(entries)} entries")
return df
# Create detailed listing
detailed_df = create_detailed_file_list(".", "detailed_directory.xlsx")
print(f"\nColumns: {list(detailed_df.columns)}")
Filtering and Analysis
You can filter and analyze the data before saving to Excel:
import os
import pandas as pd
# Create file listing
entries = []
for root, dirs, files in os.walk("."):
for file_name in files:
file_path = os.path.join(root, file_name)
try:
file_size = os.path.getsize(file_path)
_, extension = os.path.splitext(file_name)
entries.append({
'Name': file_name,
'Path': file_path,
'Size': file_size,
'Extension': extension.lower()
})
except:
continue
df = pd.DataFrame(entries)
# Filter only Python files
python_files = df[df['Extension'] == '.py']
print(f"Found {len(python_files)} Python files")
# Save filtered results
python_files.to_excel("python_files_only.xlsx", index=False)
print("Saved Python files to 'python_files_only.xlsx'")
Key Features
| Feature | Description | Benefit |
|---|---|---|
os.walk() |
Recursively traverse directories | Finds all nested files |
pandas.to_excel() |
Export DataFrame to Excel | Easy to view and share |
| File metadata | Size, date, extension info | Comprehensive file details |
| Error handling | Skip inaccessible files | Prevents script crashes |
Common Use Cases
File Auditing: Generate reports of all files in a project or system directory.
Backup Planning: Identify large files and folder structures before backup operations.
Data Migration: Create inventories when moving files between systems or organizing data.
Conclusion
Using Python's os.walk() and pandas libraries, you can efficiently create comprehensive Excel reports of file and folder structures. This approach provides valuable insights for file management, system audits, and data organization tasks.
