Python – Merge two Pandas DataFrames
To merge two Pandas DataFrames, use the merge() function. By default, it performs an inner join on the columns whose names appear in both DataFrames.
Basic Syntax
pd.merge(left_df, right_df, on='column_name', how='inner')
Creating Sample DataFrames
First, let's create two DataFrames with a common column:
import pandas as pd
# Create DataFrame1
dataFrame1 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
"Units": [100, 150, 110, 80, 110, 90]
})
print("DataFrame1:")
print(dataFrame1)
DataFrame1:
Car Units
0 BMW 100
1 Lexus 150
2 Audi 110
3 Mustang 80
4 Bentley 110
5 Jaguar 90
import pandas as pd
# Create DataFrame2
dataFrame2 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Mercedes', 'Jaguar'],
"Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})
print("DataFrame2:")
print(dataFrame2)
DataFrame2:
Car Reg_Price
0 BMW 7000
1 Lexus 1500
2 Audi 5000
3 Mustang 8000
4 Mercedes 9000
5 Jaguar 6000
Merging DataFrames
Now merge both DataFrames using the merge() function. Because no on argument is given, the join uses the shared Car column, and the default inner join keeps only the cars present in both DataFrames:
import pandas as pd
# Create DataFrames
dataFrame1 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
"Units": [100, 150, 110, 80, 110, 90]
})
dataFrame2 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Mercedes', 'Jaguar'],
"Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})
# Merge DataFrames
mergedRes = pd.merge(dataFrame1, dataFrame2)
print("Merged DataFrame:")
print(mergedRes)
Merged DataFrame:
Car Units Reg_Price
0 BMW 100 7000
1 Lexus 150 1500
2 Audi 110 5000
3 Mustang 80 8000
4 Jaguar 90 6000
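The default behaviour above can also be spelled out explicitly. The following is a minimal sketch, reusing the same sample data, that passes on and how by hand and checks that the result matches the implicit call. The variable names df1, df2, implicit and explicit are chosen here for illustration only.

```python
import pandas as pd

# Same sample data as in the example above
df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Mercedes', 'Jaguar'],
    "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})

# Implicit: merge() picks the shared column name ("Car") and uses how='inner'
implicit = pd.merge(df1, df2)

# Explicit: name the key column and the join type
explicit = pd.merge(df1, df2, on="Car", how="inner")

# Both calls should produce the same DataFrame
print(implicit.equals(explicit))
```

Naming the key with on is usually the safer habit: if the two DataFrames later gain another column with the same name, the implicit form would start joining on that column as well.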
Merge Types
You can specify different merge types using the how parameter. The example below performs a left join:
import pandas as pd
dataFrame1 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley'],
"Units": [100, 150, 110, 80, 110]
})
dataFrame2 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Mercedes'],
"Price": [7000, 1500, 5000, 12000, 9000]
})
# Left join - keeps all rows from left DataFrame
left_merge = pd.merge(dataFrame1, dataFrame2, how='left')
print("Left Merge:")
print(left_merge)
Left Merge:
Car Units Price
0 BMW 100 7000.0
1 Lexus 150 1500.0
2 Audi 110 5000.0
3 Mustang 80 NaN
4 Bentley 110 NaN
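In the output above, Mustang and Bentley have no match in the right DataFrame, so their Price is NaN, and the whole Price column is shown as floats because NaN is a floating-point value. As a small sketch on the same data, the unmatched rows can be picked out with isna(); the name unmatched_cars is introduced here for illustration.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley'],
    "Units": [100, 150, 110, 80, 110]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Mercedes'],
    "Price": [7000, 1500, 5000, 12000, 9000]
})

left_merge = pd.merge(dataFrame1, dataFrame2, how='left')

# Price was an integer column in dataFrame2, but holds NaN after the left join
print(left_merge["Price"].dtype)

# Rows from the left DataFrame that found no match on the right
unmatched_cars = left_merge.loc[left_merge["Price"].isna(), "Car"].tolist()
print(unmatched_cars)
```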
Key Points
- Inner join (default): Returns only the rows whose key values appear in both DataFrames
- Left join: Returns all rows from the left DataFrame; columns from the right are NaN where there is no match
- Right join: Returns all rows from the right DataFrame; columns from the left are NaN where there is no match
- Outer join: Returns all rows from both DataFrames, with NaN wherever one side has no match
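The four join types listed above can be compared side by side. This is an illustrative sketch using the data from the Merge Types example; row_counts and outer_merge are names introduced here. It also uses the indicator parameter of merge(), which adds a _merge column recording whether each row came from the left DataFrame, the right DataFrame, or both.

```python
import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley'],
    "Units": [100, 150, 110, 80, 110]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Mercedes'],
    "Price": [7000, 1500, 5000, 12000, 9000]
})

# Count the rows each join type returns for the same pair of DataFrames
row_counts = {}
for how in ["inner", "left", "right", "outer"]:
    merged = pd.merge(dataFrame1, dataFrame2, on="Car", how=how)
    row_counts[how] = len(merged)
print(row_counts)

# indicator=True adds a "_merge" column: 'both', 'left_only' or 'right_only'
outer_merge = pd.merge(dataFrame1, dataFrame2, on="Car", how="outer", indicator=True)
print(outer_merge[["Car", "_merge"]])
```

Three cars (BMW, Lexus, Audi) appear in both DataFrames, two only on the left and two only on the right, which is why the inner join is the smallest result and the outer join the largest.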
Conclusion
The merge() function combines DataFrames based on common columns. Use the how parameter to control the join type and on to specify merge columns explicitly.
