How to Merge Multiple CSV Files into a Single Pandas DataFrame?
To merge multiple CSV files into a single Pandas DataFrame, you can use pd.concat() with pd.read_csv(). Each file is read into its own DataFrame, and pd.concat() then stacks those DataFrames row-wise. This works cleanly when all files share the same columns.
Basic Setup
First, import the Pandas library:
import pandas as pd
Creating Sample CSV Data
Let's create sample CSV data to demonstrate the merging process:
import pandas as pd
import io
# Create sample data for first CSV
csv1_data = """Car,Place,UnitsSold
Audi,Bangalore,80
Porsche,Mumbai,110
RollsRoyce,Pune,100"""
# Create sample data for second CSV
csv2_data = """Car,Place,UnitsSold
BMW,Delhi,95
Mercedes,Hyderabad,80
Lamborghini,Chandigarh,80"""
# Read each CSV string into a DataFrame (io.StringIO stands in for a file on disk)
df1 = pd.read_csv(io.StringIO(csv1_data))
df2 = pd.read_csv(io.StringIO(csv2_data))
print("First CSV data:")
print(df1)
print("\nSecond CSV data:")
print(df2)
First CSV data:
          Car      Place  UnitsSold
0        Audi  Bangalore         80
1     Porsche     Mumbai        110
2  RollsRoyce       Pune        100

Second CSV data:
           Car       Place  UnitsSold
0          BMW       Delhi         95
1     Mercedes   Hyderabad         80
2  Lamborghini  Chandigarh         80
Merging CSV Files Using concat()
Use pd.concat() with ignore_index=True to merge the DataFrames and reset the index:
import pandas as pd
import io
# Sample CSV data
csv1_data = """Car,Place,UnitsSold
Audi,Bangalore,80
Porsche,Mumbai,110
RollsRoyce,Pune,100"""
csv2_data = """Car,Place,UnitsSold
BMW,Delhi,95
Mercedes,Hyderabad,80
Lamborghini,Chandigarh,80"""
# Read CSV data
df1 = pd.read_csv(io.StringIO(csv1_data))
df2 = pd.read_csv(io.StringIO(csv2_data))
# Merge multiple CSV files
merged_df = pd.concat([df1, df2], ignore_index=True)
print("Merged CSV files:")
print(merged_df)
Merged CSV files:
           Car       Place  UnitsSold
0         Audi   Bangalore         80
1      Porsche      Mumbai        110
2   RollsRoyce        Pune        100
3          BMW       Delhi         95
4     Mercedes   Hyderabad         80
5  Lamborghini  Chandigarh         80
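To see what ignore_index=True changes, the short sketch below concatenates two small frames with and without it. The frames are illustrative data built in memory, not read from real files.

```python
import pandas as pd

# Two small illustrative frames, each with its own default index 0..1
df1 = pd.DataFrame({"Car": ["Audi", "Porsche"], "UnitsSold": [80, 110]})
df2 = pd.DataFrame({"Car": ["BMW", "Mercedes"], "UnitsSold": [95, 80]})

# Without ignore_index, every frame keeps its original row labels
kept = pd.concat([df1, df2])
print(kept.index.tolist())   # [0, 1, 0, 1]

# With ignore_index=True, the merged rows are relabelled 0..n-1
reset = pd.concat([df1, df2], ignore_index=True)
print(reset.index.tolist())  # [0, 1, 2, 3]
```

Duplicate labels like those in the first result can make label-based lookups such as kept.loc[0] return several rows, which is usually not what you want after merging files.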
Alternative Method Using List Comprehension
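For more than two files, you can build the DataFrames with a list comprehension and pass the whole list to pd.concat(). The example below uses in-memory strings in place of file paths: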
import pandas as pd
import io
# Simulate multiple CSV files
csv_files = [
"""Car,Place,UnitsSold
Audi,Bangalore,80
Porsche,Mumbai,110""",
"""Car,Place,UnitsSold
BMW,Delhi,95
Mercedes,Hyderabad,80""",
"""Car,Place,UnitsSold
Tesla,Chennai,70
Ferrari,Kolkata,60"""
]
# Convert to DataFrames (simulating reading from files)
dataframes = [pd.read_csv(io.StringIO(csv_data)) for csv_data in csv_files]
# Merge all DataFrames
merged_df = pd.concat(dataframes, ignore_index=True)
print("Merged multiple CSV files:")
print(merged_df)
Merged multiple CSV files:
        Car      Place  UnitsSold
0      Audi  Bangalore         80
1   Porsche     Mumbai        110
2       BMW      Delhi         95
3  Mercedes  Hyderabad         80
4     Tesla    Chennai         70
5   Ferrari    Kolkata         60
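The same list-comprehension pattern can be applied to real files on disk. The sketch below is one possible way to do it, assuming the files sit in one folder and share the same columns. The file names (sales_1.csv, sales_2.csv) and their contents are hypothetical, and they are written to a temporary directory only so the example runs on its own.

```python
import glob
import os
import tempfile

import pandas as pd

# Hypothetical file names and contents, used only for this illustration
samples = {
    "sales_1.csv": "Car,Place,UnitsSold\nAudi,Bangalore,80\nPorsche,Mumbai,110\n",
    "sales_2.csv": "Car,Place,UnitsSold\nBMW,Delhi,95\nMercedes,Hyderabad,80\n",
}

with tempfile.TemporaryDirectory() as folder:
    # Write the sample files so there is something on disk to read
    for name, text in samples.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write(text)

    # Collect every CSV path; sorted() keeps the row order predictable
    paths = sorted(glob.glob(os.path.join(folder, "*.csv")))

    # Read each file and stack the results into one DataFrame
    merged_from_disk = pd.concat(
        [pd.read_csv(path) for path in paths], ignore_index=True
    )

print(merged_from_disk)
```

In your own code, replace the temporary folder with the directory that holds your CSV files and adjust the "*.csv" pattern as needed.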
Key Parameters
Important parameters for merging CSV files:
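- ignore_index=True: discards the original row labels and numbers the merged rows from 0 to n-1
- axis=0: concatenates along rows (the default)
- sort=False: leaves the columns unsorted when the files' columns differ (the default in current Pandas versions)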
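The effect of sort and axis is easiest to see when the frames do not have identical columns. The following sketch uses two tiny made-up frames (the names a and b are illustrative) to compare the options.

```python
import pandas as pd

# Two illustrative frames whose columns only partly overlap
a = pd.DataFrame({"Place": ["Pune"], "Car": ["Audi"]})
b = pd.DataFrame({"Car": ["BMW"], "UnitsSold": [95]})

# axis=0 stacks rows; columns are unioned in order of first appearance,
# and values missing from a frame become NaN
stacked = pd.concat([a, b], ignore_index=True, sort=False)
print(stacked.columns.tolist())      # ['Place', 'Car', 'UnitsSold']

# sort=True orders the unioned columns alphabetically instead
sorted_cols = pd.concat([a, b], ignore_index=True, sort=True)
print(sorted_cols.columns.tolist())  # ['Car', 'Place', 'UnitsSold']

# axis=1 places the frames side by side instead of stacking rows
side = pd.concat([a, b], axis=1)
print(side.shape)                    # (1, 4)
```

When every file has the same columns in the same order, as in the examples in this article, sort has no visible effect.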
Conclusion
To merge multiple CSV files into a single DataFrame, read each file with pd.read_csv(), collect the resulting DataFrames in a list, and pass that list to pd.concat() with ignore_index=True. The result contains the rows of every file under one continuous index.
