Python - Write data from multiple files to a master file

When working with multiple data files, you often need to combine them into a single master file. Python provides several approaches to merge multiple files, from basic file operations to using pandas for structured data.

Basic File Operations Approach

This method reads each text file in a directory and appends its content to a master file using standard file operations:

import os

# Create sample data files
os.makedirs('data_files', exist_ok=True)

# Create sample files
with open('data_files/file1.txt', 'w') as f:
    f.write('John,Developer,50000\n')
    f.write('Alice,Designer,45000\n')

with open('data_files/file2.txt', 'w') as f:
    f.write('Bob,Manager,60000\n')
    f.write('Carol,Analyst,48000\n')

# List all files in the directory (sorted, because os.listdir() returns them in arbitrary order)
file_list = sorted(os.listdir('data_files'))
master_file = 'master.txt'

# Create master file with header
with open(master_file, 'w') as output:
    output.write('Name,Position,Salary\n')
    
    # Read each file and append to master
    for filename in file_list:
        if filename.endswith('.txt'):
            file_path = os.path.join('data_files', filename)
            with open(file_path, 'r') as input_file:
                content = input_file.read()
                output.write(content)

print("Files merged successfully!")

Output:

Files merged successfully!
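
One edge case worth guarding against: if a source file does not end with a newline, its last record and the first record of the next file end up on the same line in the master file. The sketch below is one way to handle that, assuming the same comma-separated layout as above. The helper name `merge_text_files` and the temporary directory are illustrative and not part of the original example.

```python
import os
import tempfile


def merge_text_files(source_dir, master_path, header=None):
    """Append every .txt file in source_dir to master_path, one record per line."""
    with open(master_path, 'w') as output:
        if header is not None:
            output.write(header + '\n')
        # sorted() gives a predictable order; os.listdir() alone does not
        for filename in sorted(os.listdir(source_dir)):
            if not filename.endswith('.txt'):
                continue
            with open(os.path.join(source_dir, filename), 'r') as input_file:
                content = input_file.read()
            output.write(content)
            # Add the missing newline so the next file starts on its own line
            if content and not content.endswith('\n'):
                output.write('\n')


# Demonstration in a throwaway directory (hypothetical sample data)
work_dir = tempfile.mkdtemp()
source_dir = os.path.join(work_dir, 'data_files')
os.makedirs(source_dir)

with open(os.path.join(source_dir, 'file1.txt'), 'w') as f:
    f.write('John,Developer,50000')  # note: no trailing newline
with open(os.path.join(source_dir, 'file2.txt'), 'w') as f:
    f.write('Bob,Manager,60000\n')

# The master file lives outside source_dir so it is never read as an input
master_path = os.path.join(work_dir, 'master.txt')
merge_text_files(source_dir, master_path, header='Name,Position,Salary')

with open(master_path) as f:
    merged_lines = f.read().splitlines()
print(merged_lines)
```

Writing the master file outside the source directory is a deliberate choice here: otherwise a second run would pick up the previous master file as one of its inputs.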

Using Pandas for Structured Data

For CSV files or other structured data, pandas offers a more convenient approach, since it parses the header row and column types for you:

import pandas as pd

# Create sample CSV files
data1 = {'Name': ['John', 'Alice'], 
         'Position': ['Developer', 'Designer'],
         'Salary': [50000, 45000]}
df1 = pd.DataFrame(data1)
df1.to_csv('emp_1.csv', index=False)

data2 = {'Name': ['Bob', 'Carol'], 
         'Position': ['Manager', 'Analyst'],
         'Salary': [60000, 48000]}
df2 = pd.DataFrame(data2)
df2.to_csv('emp_2.csv', index=False)

# Read and combine CSV files
dataframes = []
csv_files = ['emp_1.csv', 'emp_2.csv']

for file in csv_files:
    df = pd.read_csv(file)
    dataframes.append(df)

# Concatenate all dataframes
combined_df = pd.concat(dataframes, ignore_index=True)

# Write to master file
combined_df.to_csv('master_employees.csv', index=False)
print("Master file created with pandas!")
print(combined_df)

Output:

Master file created with pandas!
    Name   Position  Salary
0   John  Developer   50000
1  Alice   Designer   45000
2    Bob    Manager   60000
3  Carol    Analyst   48000
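
If pandas is not available, a similar header-aware merge can be sketched with the standard library's `csv` module. This is an alternative technique rather than what the example above does. The function name `merge_csv_files` and the sample file names are made up for illustration, and the sketch assumes every input file is non-empty and shares the same header row.

```python
import csv
import os
import tempfile


def merge_csv_files(csv_paths, master_path):
    """Merge CSV files that share a header into master_path; return the row count."""
    rows_written = 0
    writer = None
    with open(master_path, 'w', newline='') as out_file:
        for path in csv_paths:
            with open(path, newline='') as in_file:
                reader = csv.DictReader(in_file)
                if writer is None:
                    # Take the column names from the first file and write them once
                    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


# Demonstration with two small files in a throwaway directory
work_dir = tempfile.mkdtemp()
samples = [
    [['Name', 'Position', 'Salary'], ['John', 'Developer', '50000']],
    [['Name', 'Position', 'Salary'], ['Bob', 'Manager', '60000']],
]
paths = []
for i, rows in enumerate(samples, start=1):
    path = os.path.join(work_dir, f'emp_{i}.csv')
    with open(path, 'w', newline='') as f:
        csv.writer(f).writerows(rows)
    paths.append(path)

master_path = os.path.join(work_dir, 'master_employees.csv')
row_count = merge_csv_files(paths, master_path)

with open(master_path, newline='') as f:
    merged_rows = list(csv.reader(f))
print(row_count)
print(merged_rows)
```

Unlike pandas, the `csv` module leaves every value as a string, so numeric columns such as Salary are not converted to integers.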

Batch Processing Multiple Files

To process many files automatically, use glob pattern matching to discover them by name:

import glob
import pandas as pd

# Create multiple sample files
for i in range(3):
    data = {'ID': [i*2+1, i*2+2], 
            'Value': [10+i, 20+i]}
    df = pd.DataFrame(data)
    df.to_csv(f'data_{i}.csv', index=False)

# Use glob to find all matching CSV files (sorted, because glob does not guarantee an order)
csv_files = sorted(glob.glob('data_*.csv'))
print(f"Found files: {csv_files}")

# Read and combine all files
all_data = []
for file in csv_files:
    df = pd.read_csv(file)
    all_data.append(df)

# Merge all data
final_df = pd.concat(all_data, ignore_index=True)
final_df.to_csv('combined_data.csv', index=False)

print("Combined data:")
print(final_df)

Output:

Found files: ['data_0.csv', 'data_1.csv', 'data_2.csv']
Combined data:
   ID  Value
0   1     10
1   2     10
2   3     11
3   4     11
4   5     12
5   6     12
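
The order in which `glob.glob` returns files is not guaranteed, and plain alphabetical sorting places data_10.csv before data_2.csv. If row order in the combined file matters, one option is to sort on the number embedded in each file name. The sketch below assumes the data_<number>.csv naming used above; `numeric_suffix` is a hypothetical helper.

```python
import glob
import os
import re
import tempfile


def numeric_suffix(path):
    """Return the integer in a name like data_12.csv, or -1 if there is none."""
    match = re.search(r'_(\d+)\.csv$', os.path.basename(path))
    return int(match.group(1)) if match else -1


# Create a few empty sample files in a throwaway directory
work_dir = tempfile.mkdtemp()
for i in (10, 2, 1):
    with open(os.path.join(work_dir, f'data_{i}.csv'), 'w') as f:
        f.write('ID,Value\n')

found = glob.glob(os.path.join(work_dir, 'data_*.csv'))
alphabetical = [os.path.basename(p) for p in sorted(found)]
numeric = [os.path.basename(p) for p in sorted(found, key=numeric_suffix)]

print(alphabetical)  # ['data_1.csv', 'data_10.csv', 'data_2.csv']
print(numeric)       # ['data_1.csv', 'data_2.csv', 'data_10.csv']
```

The list in `numeric` can then be passed to the read-and-concatenate loop in place of the unsorted result.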

Comparison of Methods

| Method | Best For | Advantages | Limitations |
|---|---|---|---|
| Basic File Operations | Simple text files | No dependencies | Manual handling |
| Pandas | Structured data (CSV) | Built-in data handling | Requires pandas |
| Glob Pattern | Many files | Automatic file discovery | Pattern-based only |

Conclusion

Use pandas for structured data like CSV files as it handles data types and formatting automatically. For simple text files, basic file operations work well. Use glob patterns when processing many files with similar names.

Updated on: 2026-03-25T09:21:45+05:30
