Python - Read all CSV files in a folder in Pandas?

Reading all CSV files from a folder is a common data processing task. Python's glob module combined with Pandas' read_csv() method provides an efficient solution for batch processing multiple CSV files.

Setting Up the File Path

First, we need to specify the directory path containing our CSV files. For this example, we'll use a relative path that works across different systems:

import pandas as pd
import glob
import os

# Set the path to your CSV files directory
path = "data/"  # Using relative path for better portability
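As a minimal sketch of the portability point, the search pattern can be built with os.path.join() instead of string concatenation, so the result does not depend on a trailing slash or on the separator used by the operating system. The variable name pattern is only illustrative, and the data directory is assumed rather than required to exist:

```python
import os

# Same relative directory as above, written without a trailing slash
path = "data"

# os.path.join inserts the separator appropriate for the current OS
pattern = os.path.join(path, "*.csv")

print("Search pattern:", pattern)
# Relative paths are resolved against the current working directory
print("Absolute directory:", os.path.abspath(path))
```

Printing the absolute path is a quick way to check which folder a relative path actually points to when a script is launched from a different working directory.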

Finding CSV Files with Glob

The glob module uses pattern matching to find all files with the .csv extension:

import pandas as pd
import glob

# Create sample CSV files for demonstration
sample_data1 = {'Car': ['Audi', 'Porsche', 'RollsRoyce'], 
                'Place': ['Bangalore', 'Mumbai', 'Pune'], 
                'UnitsSold': [80, 110, 100]}
sample_data2 = {'Car': ['BMW', 'Mercedes', 'Lamborghini'], 
                'Place': ['Delhi', 'Hyderabad', 'Chandigarh'], 
                'UnitsSold': [95, 80, 80]}

# In practice, you would have actual CSV files in a directory
# For this demo, we'll simulate finding files
csv_files = ['sales1.csv', 'sales2.csv']
print('CSV files found:', csv_files)

Output:

CSV files found: ['sales1.csv', 'sales2.csv']
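The demo above hardcodes the file list. A sketch of the same step with a real glob.glob() call might look like the following; it writes the two sample tables to a temporary directory first so that it runs anywhere. The temporary directory and the file_names variable are assumptions made for the demonstration:

```python
import glob
import os
import tempfile

import pandas as pd

sample_data1 = {'Car': ['Audi', 'Porsche', 'RollsRoyce'],
                'Place': ['Bangalore', 'Mumbai', 'Pune'],
                'UnitsSold': [80, 110, 100]}
sample_data2 = {'Car': ['BMW', 'Mercedes', 'Lamborghini'],
                'Place': ['Delhi', 'Hyderabad', 'Chandigarh'],
                'UnitsSold': [95, 80, 80]}

# Write the sample data to a throwaway directory so the example is self-contained
with tempfile.TemporaryDirectory() as tmp_dir:
    pd.DataFrame(sample_data1).to_csv(os.path.join(tmp_dir, 'sales1.csv'), index=False)
    pd.DataFrame(sample_data2).to_csv(os.path.join(tmp_dir, 'sales2.csv'), index=False)

    # glob does not guarantee any particular order, so sort the matches
    csv_files = sorted(glob.glob(os.path.join(tmp_dir, '*.csv')))

    # Keep only the file names; the temporary directory path differs per run
    file_names = [os.path.basename(f) for f in csv_files]

print('CSV files found:', file_names)
```

Sorting the result keeps the processing order stable between runs, which matters once the files are concatenated.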

Reading Multiple CSV Files

Loop through each CSV file and read it using pd.read_csv():

import pandas as pd

# Sample data representing CSV file contents
csv_data = {
    'sales1.csv': pd.DataFrame({
        'Car': ['Audi', 'Porsche', 'RollsRoyce'],
        'Place': ['Bangalore', 'Mumbai', 'Pune'],
        'UnitsSold': [80, 110, 100]
    }),
    'sales2.csv': pd.DataFrame({
        'Car': ['BMW', 'Mercedes', 'Lamborghini'],
        'Place': ['Delhi', 'Hyderabad', 'Chandigarh'],
        'UnitsSold': [95, 80, 80]
    })
}

# Simulate reading CSV files
for filename, data in csv_data.items():
    print(f"\nReading file: {filename}")
    print(data)

Output:

Reading file: sales1.csv
          Car      Place  UnitsSold
0        Audi  Bangalore         80
1     Porsche     Mumbai        110
2  RollsRoyce       Pune        100

Reading file: sales2.csv
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         80
2  Lamborghini  Chandigarh         80
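The loop above iterates over in-memory DataFrames. With files on disk, one possible version reads each match with pd.read_csv() and stores the result in a dictionary keyed by file name. The frames dictionary and the temporary directory are assumptions made for this sketch:

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Create two small CSV files to read back
    pd.DataFrame({'Car': ['Audi', 'Porsche', 'RollsRoyce'],
                  'Place': ['Bangalore', 'Mumbai', 'Pune'],
                  'UnitsSold': [80, 110, 100]}
                 ).to_csv(os.path.join(tmp_dir, 'sales1.csv'), index=False)
    pd.DataFrame({'Car': ['BMW', 'Mercedes', 'Lamborghini'],
                  'Place': ['Delhi', 'Hyderabad', 'Chandigarh'],
                  'UnitsSold': [95, 80, 80]}
                 ).to_csv(os.path.join(tmp_dir, 'sales2.csv'), index=False)

    # Read every CSV file and remember which file each DataFrame came from
    frames = {}
    for file in sorted(glob.glob(os.path.join(tmp_dir, '*.csv'))):
        frames[os.path.basename(file)] = pd.read_csv(file)

for name, df in frames.items():
    print(f"\nReading file: {name}")
    print(df)
```

Keeping the DataFrames in a dictionary makes it easy to look up a single file later, for example frames['sales1.csv'].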

Combining All CSV Files

Often you'll want to combine all CSV files into a single DataFrame for analysis:

import pandas as pd

# Sample DataFrames representing CSV files
df1 = pd.DataFrame({
    'Car': ['Audi', 'Porsche', 'RollsRoyce'],
    'Place': ['Bangalore', 'Mumbai', 'Pune'],
    'UnitsSold': [80, 110, 100]
})

df2 = pd.DataFrame({
    'Car': ['BMW', 'Mercedes', 'Lamborghini'],
    'Place': ['Delhi', 'Hyderabad', 'Chandigarh'],
    'UnitsSold': [95, 80, 80]
})

# Combine all DataFrames
all_dataframes = [df1, df2]
combined_df = pd.concat(all_dataframes, ignore_index=True)

print("Combined DataFrame:")
print(combined_df)

Output:

Combined DataFrame:
           Car       Place  UnitsSold
0         Audi   Bangalore         80
1      Porsche      Mumbai        110
2   RollsRoyce        Pune        100
3          BMW       Delhi         95
4     Mercedes   Hyderabad         80
5  Lamborghini  Chandigarh         80
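Once the rows are concatenated, the information about which file each row came from is lost. One way to keep it, sketched here under the assumption that a SourceFile column name is acceptable, is to tag each DataFrame with DataFrame.assign() before calling pd.concat():

```python
import pandas as pd

# Sample DataFrames keyed by the (hypothetical) file they were read from
frames = {
    'sales1.csv': pd.DataFrame({
        'Car': ['Audi', 'Porsche', 'RollsRoyce'],
        'Place': ['Bangalore', 'Mumbai', 'Pune'],
        'UnitsSold': [80, 110, 100]
    }),
    'sales2.csv': pd.DataFrame({
        'Car': ['BMW', 'Mercedes', 'Lamborghini'],
        'Place': ['Delhi', 'Hyderabad', 'Chandigarh'],
        'UnitsSold': [95, 80, 80]
    }),
}

# Add a column naming the origin of every row, then stack the frames
tagged = [df.assign(SourceFile=name) for name, df in frames.items()]
combined_df = pd.concat(tagged, ignore_index=True)

# The extra column allows per-file aggregation after combining
units_per_file = combined_df.groupby('SourceFile')['UnitsSold'].sum()

print(combined_df)
print(units_per_file)
```

The ignore_index=True argument renumbers the rows from 0, which avoids duplicate index labels in the combined result.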

Complete Working Example

Here's the complete code structure for reading all CSV files in a folder:

import glob
import os

import pandas as pd

# Set path to your CSV files directory
path = "/path/to/your/csv/files/"

# Find all CSV files in the directory (sorted, since glob order is not guaranteed)
csv_files = sorted(glob.glob(os.path.join(path, "*.csv")))
print('CSV files found:', csv_files)

# Read and process each CSV file
dataframes = []
for file in csv_files:
    print(f"\nReading file: {file}")
    df = pd.read_csv(file)
    print(df)
    dataframes.append(df)

# Optional: Combine all DataFrames
if dataframes:
    combined_df = pd.concat(dataframes, ignore_index=True)
    print("\nCombined DataFrame:")
    print(combined_df)
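The same steps can be wrapped in a reusable function. The sketch below uses pathlib instead of glob as an alternative; read_csv_folder is a hypothetical helper name, not part of Pandas, and the sample files exist only to make the example runnable:

```python
import tempfile
from pathlib import Path

import pandas as pd


def read_csv_folder(folder):
    """Hypothetical helper: read every *.csv file in `folder` into one DataFrame."""
    csv_files = sorted(Path(folder).glob("*.csv"))
    if not csv_files:
        # pd.concat raises ValueError on an empty list, so return early
        return pd.DataFrame()
    return pd.concat([pd.read_csv(f) for f in csv_files], ignore_index=True)


with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = Path(tmp_dir)
    pd.DataFrame({'Car': ['Audi'], 'UnitsSold': [80]}).to_csv(tmp_path / 'a.csv', index=False)
    pd.DataFrame({'Car': ['BMW'], 'UnitsSold': [95]}).to_csv(tmp_path / 'b.csv', index=False)
    combined_df = read_csv_folder(tmp_path)

    # A folder without CSV files yields an empty DataFrame instead of an error
    empty_dir = tmp_path / 'empty'
    empty_dir.mkdir()
    empty_df = read_csv_folder(empty_dir)

print(combined_df)
print("Empty folder result is empty:", empty_df.empty)
```

Returning an empty DataFrame for a folder with no matches mirrors the if dataframes: guard in the example above; raising an exception instead would be an equally reasonable design choice.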

Key Points

  • Use glob.glob() with pattern "*.csv" to find CSV files
  • Loop through the file list to process each CSV individually
  • Use pd.concat() to combine multiple DataFrames if needed
  • Use relative paths for better code portability

Conclusion

The combination of glob and pd.read_csv() provides a simple way to read every CSV file in a folder, and pd.concat() merges the results into a single DataFrame. This pattern is a common starting point for batch data processing and analysis workflows.

Updated on: 2026-03-26T13:24:55+05:30
