How to combine dataframes in Pandas?
Pandas provides several functions for combining DataFrames. Three common approaches are concat() with axis=1 and join='inner' for side-by-side combination, concat() for vertical stacking, and merge() for database-style joins.
Using concat() with Inner Join
With axis=1, the concat() function places DataFrames side by side; join='inner' keeps only the index labels that appear in every DataFrame:
import pandas as pd
# Create sample DataFrames
df1_data = {'Player': ['Jacob', 'Steve', 'David', 'John', 'Kane'],
'Age': [29, 25, 31, 26, 27]}
df2_data = {'Rank': [1, 2, 3, 4, 5],
'Points': [100, 87, 80, 70, 50]}
df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)
print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)
# Combine DataFrames horizontally using inner join
result = pd.concat([df1, df2], axis=1, join='inner')
print("\nCombined DataFrames (Inner Join):")
print(result)
DataFrame1:
  Player  Age
0  Jacob   29
1  Steve   25
2  David   31
3   John   26
4   Kane   27

DataFrame2:
   Rank  Points
0     1     100
1     2      87
2     3      80
3     4      70
4     5      50

Combined DataFrames (Inner Join):
  Player  Age  Rank  Points
0  Jacob   29     1     100
1  Steve   25     2      87
2  David   31     3      80
3   John   26     4      70
4   Kane   27     5      50
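In the example above both DataFrames carry the same default index (0 to 4), so join='inner' has nothing to drop. The following is a minimal sketch of what happens when the indices only partly overlap. The names players and scores and the explicit index values are assumptions made for illustration; they are not part of the original example.

```python
import pandas as pd

# Hypothetical data: the two frames share only the index labels 1 and 2
players = pd.DataFrame({'Player': ['Jacob', 'Steve', 'David']}, index=[0, 1, 2])
scores = pd.DataFrame({'Points': [87, 80, 70]}, index=[1, 2, 3])

# join='inner' keeps only the labels present in both frames
inner = pd.concat([players, scores], axis=1, join='inner')

# join='outer' (the default) keeps every label and fills gaps with NaN
outer = pd.concat([players, scores], axis=1, join='outer')

print("Inner join:")
print(inner)
print("\nOuter join:")
print(outer)
```

The inner result contains only rows 1 and 2, while the outer result contains rows 0 through 3 with NaN wherever one side has no value for a label.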
Using concat() for Vertical Combination
By default (axis=0), the concat() function stacks DataFrames vertically. This works best when they share the same columns; ignore_index=True renumbers the rows of the result:
import pandas as pd
# Create DataFrames with same structure
df1_data = {'Player': ['Steve', 'David'], 'Age': [29, 25]}
df2_data = {'Player': ['John', 'Kane'], 'Age': [31, 27]}
df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)
print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)
# Combine DataFrames vertically
result = pd.concat([df1, df2], ignore_index=True)
print("\nCombined DataFrames (Vertical):")
print(result)
DataFrame1:
  Player  Age
0  Steve   29
1  David   25

DataFrame2:
  Player  Age
0   John   31
1   Kane   27

Combined DataFrames (Vertical):
  Player  Age
0  Steve   29
1  David   25
2   John   31
3   Kane   27
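Two details of vertical stacking are easy to miss: without ignore_index=True the original row labels are kept (and may repeat), and columns that exist in only one DataFrame are filled with NaN. A small sketch follows, assuming two hypothetical frames batch1 and batch2 whose columns only partly match.

```python
import pandas as pd

# Hypothetical frames: 'Player' is shared, 'Age' and 'Points' are not
batch1 = pd.DataFrame({'Player': ['Steve', 'David'], 'Age': [29, 25]})
batch2 = pd.DataFrame({'Player': ['John', 'Kane'], 'Points': [70, 50]})

# Original row labels are preserved, so 0 and 1 each appear twice
kept = pd.concat([batch1, batch2])

# Fresh 0..3 index; missing columns are filled with NaN
union = pd.concat([batch1, batch2], ignore_index=True)

# join='inner' keeps only the columns common to both frames
shared = pd.concat([batch1, batch2], ignore_index=True, join='inner')

print(kept)
print(union)
print(shared)
```

Passing join='inner' here is one way to avoid the NaN values when only the shared columns matter.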
Using merge() for Database-Style Joins
The merge() function performs database-style joins based on common columns:
import pandas as pd
# Create DataFrames with a common column
df1_data = {'Player': ['Steve', 'David', 'John'], 'Age': [29, 25, 31]}
df2_data = {'Player': ['Steve', 'David', 'Kane'], 'Points': [100, 87, 50]}
df1 = pd.DataFrame(df1_data)
df2 = pd.DataFrame(df2_data)
print("DataFrame1:")
print(df1)
print("\nDataFrame2:")
print(df2)
# Merge DataFrames on 'Player' column
result = pd.merge(df1, df2, on='Player', how='inner')
print("\nMerged DataFrames:")
print(result)
DataFrame1:
  Player  Age
0  Steve   29
1  David   25
2   John   31

DataFrame2:
  Player  Points
0  Steve     100
1  David      87
2   Kane      50

Merged DataFrames:
  Player  Age  Points
0  Steve   29     100
1  David   25      87
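The inner merge above drops John and Kane because each appears in only one DataFrame. As a sketch of the alternatives, the same data can be merged with how='left' or how='outer'; the indicator=True option adds a _merge column showing where each row came from. The variable names ages and points are chosen here for illustration.

```python
import pandas as pd

# Same data as the merge example above
ages = pd.DataFrame({'Player': ['Steve', 'David', 'John'], 'Age': [29, 25, 31]})
points = pd.DataFrame({'Player': ['Steve', 'David', 'Kane'], 'Points': [100, 87, 50]})

# Left join: keep every row of `ages`; John gets NaN for Points
left = pd.merge(ages, points, on='Player', how='left')

# Outer join: keep rows from both sides; `_merge` records the source
outer = pd.merge(ages, points, on='Player', how='outer', indicator=True)

print(left)
print(outer)
```

The row order of an outer merge can differ between pandas versions, so it is safer to sort the result explicitly if order matters.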
Comparison of Methods
| Method | Use Case | Key Parameter | Best For |
|---|---|---|---|
| concat() with axis=1 | Side-by-side combination | join='inner'/'outer' | DataFrames with matching index labels |
| concat() with axis=0 | Vertical stacking | ignore_index=True | Same column structure |
| merge() | Database-style joins | on='column', how='inner' | Common key columns |
Conclusion
Use concat() to stack DataFrames vertically or place them side by side by index, and merge() to join on the values of common columns. Choose the method based on how your data is structured and how the rows should be matched.
