Python Pandas – Merge and create cartesian product from both the DataFrames
To merge two Pandas DataFrames into a cartesian product, call merge() with how="cross" (available since pandas 1.2). A cartesian product pairs every row from the first DataFrame with every row from the second, so no join key is needed.
Syntax
pd.merge(df1, df2, how="cross")
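The same operation is also available as a DataFrame method. The following is a minimal sketch, assuming two small illustrative frames (the column names letter and number are made up for this example) and showing that both call styles give the same result:

```python
import pandas as pd

# Two tiny illustrative frames (hypothetical data, not from the article)
df1 = pd.DataFrame({"letter": ["a", "b"]})
df2 = pd.DataFrame({"number": [1, 2, 3]})

# Function form and method form of the cross merge
via_function = pd.merge(df1, df2, how="cross")
via_method = df1.merge(df2, how="cross")

# Both forms should produce identical frames with 2 * 3 rows
print(via_function.equals(via_method))
print(via_function)
```

Which form to use is a matter of style; the method form tends to read more naturally in a chain of DataFrame operations.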
Creating Sample DataFrames
Let's create two DataFrames to demonstrate the cartesian product:
import pandas as pd
# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 120]
})
print("DataFrame1:")
print(dataFrame1)
# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Tesla', 'Jaguar'],
    "Reg_Price": [7000, 8000, 9000]
})
print("\nDataFrame2:")
print(dataFrame2)
DataFrame1:
       Car  Units
0      BMW    100
1  Mustang    150
2  Bentley    110
3   Jaguar    120

DataFrame2:
      Car  Reg_Price
0     BMW       7000
1   Tesla       8000
2  Jaguar       9000
Creating Cartesian Product
Now merge the DataFrames using how="cross" to create the cartesian product:
import pandas as pd
# Create DataFrames
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 120]
})
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Tesla', 'Jaguar'],
    "Reg_Price": [7000, 8000, 9000]
})
# Merge DataFrames with cartesian product
mergedRes = pd.merge(dataFrame1, dataFrame2, how="cross")
print("Merged DataFrame with cartesian product:")
print(mergedRes)
Merged DataFrame with cartesian product:
      Car_x  Units   Car_y  Reg_Price
0       BMW    100     BMW       7000
1       BMW    100   Tesla       8000
2       BMW    100  Jaguar       9000
3   Mustang    150     BMW       7000
4   Mustang    150   Tesla       8000
5   Mustang    150  Jaguar       9000
6   Bentley    110     BMW       7000
7   Bentley    110   Tesla       8000
8   Bentley    110  Jaguar       9000
9    Jaguar    120     BMW       7000
10   Jaguar    120   Tesla       8000
11   Jaguar    120  Jaguar       9000
How It Works
The cartesian product creates all possible combinations:
- DataFrame1 has 4 rows, DataFrame2 has 3 rows
- Result has 4 × 3 = 12 rows
- Columns with the same name in both DataFrames (here, Car) get the suffixes _x and _y
- Every row from df1 is paired with every row from df2
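The points above can be checked with a short sketch. The variable names left, right, and crossed and the suffix strings _stock and _price are illustrative choices, not part of the original example; suffixes is a standard merge() parameter that replaces the default _x and _y.

```python
import pandas as pd

# Same data as the article's example, under shorter (hypothetical) names
left = pd.DataFrame({
    "Car": ["BMW", "Mustang", "Bentley", "Jaguar"],
    "Units": [100, 150, 110, 120],
})
right = pd.DataFrame({
    "Car": ["BMW", "Tesla", "Jaguar"],
    "Reg_Price": [7000, 8000, 9000],
})

# Cross merge with custom suffixes for the overlapping "Car" column
crossed = pd.merge(left, right, how="cross", suffixes=("_stock", "_price"))

# Row count should equal the product of the input row counts (4 * 3)
print(len(crossed) == len(left) * len(right))

# Only the overlapping column is renamed; unique columns keep their names
print(list(crossed.columns))
```

Descriptive suffixes like these can make the merged columns easier to tell apart than the default _x and _y.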
Conclusion
Use pd.merge(df1, df2, how="cross") to create a cartesian product of two DataFrames. This combines every row from the first DataFrame with every row from the second DataFrame, resulting in n×m rows where n and m are the row counts of the input DataFrames.
