Article Categories
- All Categories
-
Data Structure
-
Networking
-
RDBMS
-
Operating System
-
Java
-
MS Excel
-
iOS
-
HTML
-
CSS
-
Android
-
Python
-
C Programming
-
C++
-
C#
-
MongoDB
-
MySQL
-
Javascript
-
PHP
-
Economics & Finance
Python Pandas - Finding the uncommon rows between two DataFrames
To find the uncommon rows between two DataFrames, you can use concat() combined with drop_duplicates(). This approach concatenates both DataFrames and removes duplicate rows, leaving only the uncommon ones.
Syntax
pd.concat([df1, df2]).drop_duplicates(keep=False)
Where keep=False removes all occurrences of duplicated rows, leaving only the unique rows from each DataFrame.
Example
Let's create two DataFrames with car data and find the uncommon rows ?
import pandas as pd
# Create DataFrame1
dataFrame1 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
"Reg_Price": [1000, 1500, 1100, 800, 1100, 900]
})
print("DataFrame1:")
print(dataFrame1)
# Create DataFrame2
dataFrame2 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi', 'Tesla', 'Bentley', 'Jaguar'],
"Reg_Price": [1000, 1300, 1000, 800, 1100, 800]
})
print("\nDataFrame2:")
print(dataFrame2)
# Finding uncommon rows between two DataFrames
uncommon_rows = pd.concat([dataFrame1, dataFrame2]).drop_duplicates(keep=False)
print("\nUncommon rows between two DataFrames:")
print(uncommon_rows)
DataFrame1:
Car Reg_Price
0 BMW 1000
1 Lexus 1500
2 Audi 1100
3 Tesla 800
4 Bentley 1100
5 Jaguar 900
DataFrame2:
Car Reg_Price
0 BMW 1000
1 Lexus 1300
2 Audi 1000
3 Tesla 800
4 Bentley 1100
5 Jaguar 800
Uncommon rows between two DataFrames:
Car Reg_Price
1 Lexus 1500
2 Audi 1100
5 Jaguar 900
1 Lexus 1300
2 Audi 1000
5 Jaguar 800
How It Works
The process works in two steps:
-
Concatenation:
pd.concat([dataFrame1, dataFrame2])stacks both DataFrames vertically -
Remove duplicates:
drop_duplicates(keep=False)removes all rows that appear more than once
In the result, you can see that identical rows like BMW (1000), Tesla (800), and Bentley (1100) are removed, while different rows for Lexus, Audi, and Jaguar are preserved.
Alternative Method Using merge()
You can also find uncommon rows using an outer merge with indicator ?
import pandas as pd
dataFrame1 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi'],
"Reg_Price": [1000, 1500, 1100]
})
dataFrame2 = pd.DataFrame({
"Car": ['BMW', 'Lexus', 'Audi'],
"Reg_Price": [1000, 1300, 1000]
})
# Using merge with indicator
merged = dataFrame1.merge(dataFrame2, how='outer', indicator=True)
uncommon = merged[merged['_merge'] != 'both']
print(uncommon.drop('_merge', axis=1))
Car Reg_Price
1 Lexus 1500
2 Audi 1100
4 Lexus 1300
5 Audi 1000
Conclusion
Use concat() with drop_duplicates(keep=False) to find uncommon rows between DataFrames. This method efficiently identifies rows that exist in one DataFrame but not the other, making it useful for data comparison and analysis.
