Python Pandas – Find the Difference Between Two DataFrames

Finding the differences between two DataFrames in Pandas means comparing their structure and their values. The equals() method reports whether two objects match exactly, the == operator compares them element by element, and compare() lists only the cells that differ.

Creating Sample DataFrames

Let's create two DataFrames to demonstrate comparison techniques:

import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame1:")
print(dataFrame1)

Output:

DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90

# Create DataFrame2 (identical to DataFrame1)
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame2:")
print(dataFrame2)

Output:

DataFrame2:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90

Checking Column Equality

Compare specific columns between DataFrames using the equals() method:

# Check equality of specific column
units_equal = dataFrame2['Units'].equals(dataFrame1['Units'])
print("Are Units columns equal?", units_equal)

# Check equality of Car column
car_equal = dataFrame2['Car'].equals(dataFrame1['Car'])
print("Are Car columns equal?", car_equal)

Output:

Are Units columns equal? True
Are Car columns equal? True

Checking Complete DataFrame Equality

Use equals() to check whether entire DataFrames are identical:

# Check complete DataFrame equality
are_equal = dataFrame1.equals(dataFrame2)
print("Are both DataFrames equal?", are_equal)

Output:

Are both DataFrames equal? True
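
One detail worth knowing about equals() is that it also compares column dtypes and treats NaN values in the same position as equal, while == looks only at values and never considers NaN equal to NaN. The sketch below illustrates this with small made-up frames (ints, floats, a, and b are illustrative names, not part of the examples above):

```python
import pandas as pd

# Same values, but int64 in one frame and float64 in the other
ints = pd.DataFrame({"Units": [100, 150, 110]})
floats = pd.DataFrame({"Units": [100.0, 150.0, 110.0]})

# equals() requires matching dtypes, so this is False
same_values_diff_dtype = ints.equals(floats)

# == compares values only, so every cell matches
all_cells_match = bool((ints == floats).all().all())

# Two frames with NaN in the same position
a = pd.DataFrame({"Units": [100.0, float("nan")]})
b = pd.DataFrame({"Units": [100.0, float("nan")]})

# equals() treats NaN in the same location as equal
nan_equals = a.equals(b)

# Element-wise, NaN == NaN is False, so not every cell matches
nan_elementwise = bool((a == b).all().all())

print("equals() with different dtypes:", same_values_diff_dtype)
print("== with different dtypes, all cells match:", all_cells_match)
print("equals() with NaN in the same place:", nan_equals)
print("== with NaN in the same place, all cells match:", nan_elementwise)
```

If the dtype difference is not meaningful for your data, converting one side with astype() before calling equals() is one way to compare values only.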

Finding Differences with Different DataFrames

Let's create DataFrames with differences to demonstrate comparison methods:

# Create DataFrames with differences
df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})

df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

print("DataFrame 1:")
print(df1)
print("\nDataFrame 2:")
print(df2)

# Check equality
print("\nAre they equal?", df1.equals(df2))

# Element-wise comparison
print("\nElement-wise comparison:")
print(df1 == df2)

Output:

DataFrame 1:
     Car  Units
0    BMW    100
1  Lexus    150
2   Audi    110

DataFrame 2:
     Car  Units
0    BMW    100
1  Lexus    160
2  Honda    110

Are they equal? False

Element-wise comparison:
     Car  Units
0   True   True
1   True  False
2  False   True
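
The boolean result of an element-wise comparison can also be used as a mask to pull out only the rows that changed. The following is a minimal sketch that rebuilds the two frames from this section. It assumes both frames share the same index and columns, since == and != raise a ValueError for DataFrames that are not identically labeled. The names diff_mask and changed_rows are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})

df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

# True wherever a cell differs between the two frames
diff_mask = df1 != df2

# True for each row that has at least one differing cell
changed_rows = diff_mask.any(axis=1)

print("Rows of df1 that differ from df2:")
print(df1[changed_rows])

print("\nThe same rows in df2:")
print(df2[changed_rows])
```

Here rows 1 and 2 are selected, because Units differs in row 1 and Car differs in row 2.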

Comparison Methods

| Method | Purpose | Returns |
|---|---|---|
| equals() | Check exact equality | Boolean (True/False) |
| == | Element-wise comparison | DataFrame of booleans |
| compare() | Show differences (Pandas 1.1+) | DataFrame of differences |
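
The compare() method listed in the table is not demonstrated above, so here is a short sketch of how it might be used on the same kind of data. It assumes Pandas 1.1 or later and two frames with identical index and column labels. The variable name diff is illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})

df2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

# Keep only the rows and columns that contain a difference.
# For each such column, "self" holds the value from df1 and
# "other" holds the value from df2; cells that match show NaN.
diff = df1.compare(df2)

print(diff)
```

Row 0 is dropped from the result because it is identical in both frames. Passing keep_shape=True or keep_equal=True changes how much of the original data is retained.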

Conclusion

Use equals() to check whether two DataFrames are completely equal, and element-wise comparison (==) or compare() to identify the specific cells that differ. These methods are useful for data validation and comparison tasks in data analysis workflows.
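
For validation in tests or pipelines, one option not covered above is pandas.testing.assert_frame_equal, which raises an AssertionError describing the first mismatch it finds. The sketch below is only an illustration of that approach. The names expected, actual, and frames_match are hypothetical.

```python
import pandas as pd

# Hypothetical reference data and data under validation
expected = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})

actual = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Honda'],
    "Units": [100, 160, 110]
})

try:
    # Raises AssertionError if the frames differ
    pd.testing.assert_frame_equal(expected, actual)
    frames_match = True
except AssertionError as err:
    frames_match = False
    print("Validation failed:")
    print(err)

print("Frames match:", frames_match)
```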

Updated on: 2026-03-26T01:54:28+05:30
