Python Pandas – Find the Difference between two Dataframes
Finding differences between two DataFrames in Pandas involves comparing their structure, values, and content. The equals() method checks for exact equality and returns a single boolean, while element-wise comparison (==) and compare() show where the two DataFrames differ.
Creating Sample DataFrames
Let's create two DataFrames to demonstrate comparison techniques:
import pandas as pd
# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
print("DataFrame1:")
print(dataFrame1)
DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90
# Create DataFrame2 (identical to DataFrame1)
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})
print("DataFrame2:")
print(dataFrame2)
DataFrame2:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90
Checking Column Equality
Compare specific columns between DataFrames using the equals() method:
# Check equality of specific column
units_equal = dataFrame2['Units'].equals(dataFrame1['Units'])
print("Are Units columns equal?", units_equal)
# Check equality of Car column
car_equal = dataFrame2['Car'].equals(dataFrame1['Car'])
print("Are Car columns equal?", car_equal)
Are Units columns equal? True
Are Car columns equal? True
Checking Complete DataFrame Equality
Use equals() to check if entire DataFrames are identical:
# Check complete DataFrame equality
are_equal = dataFrame1.equals(dataFrame2)
print("Are both DataFrames equal?", are_equal)
Are both DataFrames equal? True
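It helps to know how strict "exact equality" is. The sketch below is a minimal illustration, not part of the original example: the frames `ints`, `floats`, `nan_a`, and `nan_b` are made-up names. It shows that equals() returns False when the values match but the column dtypes differ, and that it treats NaN values in the same position as equal, which element-wise == does not.

```python
import pandas as pd

# Hypothetical frames for illustration: same values, different dtypes
ints = pd.DataFrame({"Units": [100, 150, 110]})          # int64 column
floats = pd.DataFrame({"Units": [100.0, 150.0, 110.0]})  # float64 column

# Element-wise, every value matches
values_match = bool((ints == floats).all().all())

# equals() also requires the dtypes to match
strictly_equal = ints.equals(floats)

# NaN handling: equals() treats NaNs in the same location as equal,
# while == follows the usual rule that NaN != NaN
nan_a = pd.DataFrame({"Units": [100.0, float("nan")]})
nan_b = pd.DataFrame({"Units": [100.0, float("nan")]})
nan_equal = nan_a.equals(nan_b)
nan_elementwise = bool((nan_a == nan_b).all().all())

print("Values match element-wise:", values_match)      # True
print("equals() across dtypes:", strictly_equal)       # False
print("equals() with matching NaNs:", nan_equal)       # True
print("== with matching NaNs:", nan_elementwise)       # False
```

If a False result from equals() is unexpected, comparing `ints.dtypes` with `floats.dtypes` is a quick first check.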
Finding Differences with Different DataFrames
Let's create DataFrames with differences to demonstrate comparison methods:
# Create DataFrames with differences
df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})
df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})
print("DataFrame 1:")
print(df1)
print("\nDataFrame 2:")
print(df2)
# Check equality
print("\nAre they equal?", df1.equals(df2))
# Element-wise comparison
print("\nElement-wise comparison:")
print(df1 == df2)
DataFrame 1:
     Car  Units
0    BMW    100
1  Lexus    150
2   Audi    110

DataFrame 2:
     Car  Units
0    BMW    100
1  Lexus    160
2  Honda    110

Are they equal? False

Element-wise comparison:
     Car  Units
0   True   True
1   True  False
2  False   True
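The boolean result of an element-wise comparison can also be used as a mask to pull out only the rows that changed. The following is a minimal sketch that assumes both DataFrames share the same index and column labels, as df1 and df2 do here. The variable names `mismatch`, `changed_rows`, and `changed_cells` are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})
df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

# True wherever a cell differs between the two frames
mismatch = df1 != df2

# A row counts as changed if any of its cells differ
changed_rows = mismatch.any(axis=1)

# Total number of differing cells
changed_cells = int(mismatch.sum().sum())

print("Rows that differ (values from df1):")
print(df1[changed_rows])
print("\nRows that differ (values from df2):")
print(df2[changed_rows])
print("\nNumber of differing cells:", changed_cells)  # 2
```

Here rows 1 and 2 are selected, because row 1 differs in Units and row 2 differs in Car.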
Comparison Methods
| Method | Purpose | Returns |
|---|---|---|
| equals() | Check exact equality | Boolean (True/False) |
| == | Element-wise comparison | DataFrame of booleans |
| compare() | Show differences (Pandas 1.1+) | DataFrame of differences |
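Since compare() is listed above but not demonstrated, here is a short sketch of how it might be used on the same df1 and df2. It assumes Pandas 1.1 or later and identically labeled DataFrames. The result keeps only the rows and columns that contain at least one difference, with the values from the calling DataFrame under "self" and those from the argument under "other". Cells that match are shown as NaN.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})
df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

# Rows and columns without any difference are dropped by default
diff = df1.compare(df2)
print(diff)

# Individual differences can be read through the two-level column labels
print("\nRow 2, Car:", diff.loc[2, ("Car", "self")], "->", diff.loc[2, ("Car", "other")])
print("Row 1, Units:", diff.loc[1, ("Units", "self")], "->", diff.loc[1, ("Units", "other")])
```

Row 0 does not appear in the result because it is the same in both DataFrames.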
Conclusion
Use equals() to check whether two DataFrames are identical, and element-wise comparison (==) to locate the specific cells that differ. In Pandas 1.1 and later, compare() returns only the differing values. These methods are useful for data validation and comparison tasks in data analysis workflows.
