How to Create a list of files, folders, and subfolders in Excel using Python?

Python is a great programming language widely used for various data manipulation tasks. When working with files and folders, it can be useful to generate a list of all the files, folders, and subfolders within a directory. Excel, on the other hand, is a popular spreadsheet application that allows users to organize and analyze data. In this detailed article, we will explore step?by?step how to use Python to create a comprehensive list of files, folders, and subfolders in Excel, providing a convenient way to manage and analyze file structures.

Prerequisites

To follow along with this tutorial, you will need to have Python installed on your computer, as well as the pandas library, which is commonly used for data manipulation tasks in Python. Additionally, a basic understanding of Python syntax and file operations will be helpful.

Install pandas using:

# Install pandas if not already installed
# pip install pandas openpyxl

Complete Solution

Here's a complete script that creates a comprehensive list of files, folders, and subfolders in Excel:

import os
import pandas as pd

def create_file_list_excel(directory_path, output_filename="file_list.xlsx"):
    """
    Create an Excel file with list of files, folders, and subfolders
    """
    data_list = []
    
    # Walk through directory structure
    for root, dirs, files in os.walk(directory_path):
        # Add directories
        for dir_name in dirs:
            dir_path = os.path.join(root, dir_name)
            data_list.append({
                'Type': 'Folder',
                'Name': dir_name,
                'Full Path': dir_path,
                'Size (bytes)': 'N/A'
            })
        
        # Add files
        for file_name in files:
            file_path = os.path.join(root, file_name)
            try:
                file_size = os.path.getsize(file_path)
            except (OSError, IOError):
                file_size = 'N/A'
            
            data_list.append({
                'Type': 'File',
                'Name': file_name,
                'Full Path': file_path,
                'Size (bytes)': file_size
            })
    
    # Create DataFrame and save to Excel
    df = pd.DataFrame(data_list)
    df.to_excel(output_filename, index=False)
    print(f"Excel file '{output_filename}' created successfully!")
    print(f"Total items: {len(data_list)}")
    return df

# Example usage - using current directory
current_dir = "."
result_df = create_file_list_excel(current_dir, "directory_listing.xlsx")

# Display first 5 rows
print("\nFirst 5 entries:")
print(result_df.head())
Excel file 'directory_listing.xlsx' created successfully!
Total items: 8

Step-by-Step Breakdown

Step 1: Importing Required Libraries

We import os for file system operations and pandas for Excel file creation:

import os
import pandas as pd

Step 2: Walking Through Directory Structure

The os.walk() function traverses the directory tree and returns root directory, subdirectories, and files:

# Simple example showing os.walk() structure
directory_path = "."

for root, dirs, files in os.walk(directory_path):
    print(f"Directory: {root}")
    print(f"Subdirectories: {dirs}")
    print(f"Files: {files}")
    print("-" * 40)
    break  # Show only first iteration

Step 3: Enhanced Version with File Metadata

Here's an enhanced version that includes file size, modification date, and better organization:

import os
import pandas as pd
from datetime import datetime

def create_detailed_file_list(directory_path, output_file="detailed_file_list.xlsx"):
    """Create detailed Excel file with file metadata"""
    entries = []
    
    for root, dirs, files in os.walk(directory_path):
        # Process directories
        for dir_name in dirs:
            dir_path = os.path.join(root, dir_name)
            try:
                mod_time = datetime.fromtimestamp(os.path.getmtime(dir_path))
                entries.append({
                    'Type': 'Folder',
                    'Name': dir_name,
                    'Full Path': dir_path,
                    'Size (bytes)': 'N/A',
                    'Modified Date': mod_time.strftime('%Y-%m-%d %H:%M:%S'),
                    'Extension': 'N/A'
                })
            except (OSError, IOError):
                continue
        
        # Process files
        for file_name in files:
            file_path = os.path.join(root, file_name)
            try:
                file_size = os.path.getsize(file_path)
                mod_time = datetime.fromtimestamp(os.path.getmtime(file_path))
                _, extension = os.path.splitext(file_name)
                
                entries.append({
                    'Type': 'File',
                    'Name': file_name,
                    'Full Path': file_path,
                    'Size (bytes)': file_size,
                    'Modified Date': mod_time.strftime('%Y-%m-%d %H:%M:%S'),
                    'Extension': extension.lower() if extension else 'No Extension'
                })
            except (OSError, IOError):
                continue
    
    # Create DataFrame and save
    df = pd.DataFrame(entries)
    df = df.sort_values(['Type', 'Name'])  # Sort by type, then name
    df.to_excel(output_file, index=False)
    
    print(f"Created '{output_file}' with {len(entries)} entries")
    return df

# Create detailed listing
detailed_df = create_detailed_file_list(".", "detailed_directory.xlsx")
print(f"\nColumns: {list(detailed_df.columns)}")

Filtering and Analysis

You can filter and analyze the data before saving to Excel:

import os
import pandas as pd

# Create file listing
entries = []
for root, dirs, files in os.walk("."):
    for file_name in files:
        file_path = os.path.join(root, file_name)
        try:
            file_size = os.path.getsize(file_path)
            _, extension = os.path.splitext(file_name)
            entries.append({
                'Name': file_name,
                'Path': file_path,
                'Size': file_size,
                'Extension': extension.lower()
            })
        except:
            continue

df = pd.DataFrame(entries)

# Filter only Python files
python_files = df[df['Extension'] == '.py']
print(f"Found {len(python_files)} Python files")

# Save filtered results
python_files.to_excel("python_files_only.xlsx", index=False)
print("Saved Python files to 'python_files_only.xlsx'")

Key Features

Feature Description Benefit
os.walk() Recursively traverse directories Finds all nested files
pandas.to_excel() Export DataFrame to Excel Easy to view and share
File metadata Size, date, extension info Comprehensive file details
Error handling Skip inaccessible files Prevents script crashes

Common Use Cases

File Auditing: Generate reports of all files in a project or system directory.

Backup Planning: Identify large files and folder structures before backup operations.

Data Migration: Create inventories when moving files between systems or organizing data.

Conclusion

Using Python's os.walk() and pandas libraries, you can efficiently create comprehensive Excel reports of file and folder structures. This approach provides valuable insights for file management, system audits, and data organization tasks.

Updated on: 2026-03-27T09:53:15+05:30

1K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements